Roadmap #4

Open
3 of 14 tasks
Tokazama opened this issue Jan 23, 2020 · 6 comments

Comments

@Tokazama
Member

Tokazama commented Jan 23, 2020

General Approach

In the absence of a formal write-up (which I will upload shortly before JuliaCon), the strategy for development throughout this org is this: write the code we need to do neuroscience research, and do it well. Any code that ends up being useful to a wider community (e.g., plotting, stats, etc.) may eventually become part of a different ecosystem (e.g., JuliaPlots, JuliaStats, etc.).

From a pure user perspective, this means that if code moves from this org to another, it won't be noticeable. From a developer perspective, it means that code developed here might eventually end up somewhere else if that means it will be maintained by a more appropriately focused team of developers. Therefore, we use the common MIT license to make this sort of thing easy, and we play nice with our peers.

For example, at the time of writing I have a substantial PR providing data type (time, space, observation, etc.) methods. Instead of waiting for the entire Julia community to agree upon a standard for referring to time data, we can move the code to a broader community once it appears stable and useful. This means some packages may simply end up being short scripts that bind a variety of packages together in a useful way (see Makie.jl for an example) or just formal documentation on how to perform analyses, with a few convenience functions for learning (see StatisticalRethinking.jl).

The end result should look the same to users: a Julian approach to neuroinformatics.

Public Release

These are things that need to be taken care of before a wider public release of the package can happen.

  • Basic types
    • AbstractArray subtype
  • Comprehensive testing: This is a core package, and I anticipate very few new types becoming part of it. Therefore, I anticipate code coverage being crucial, and it should be measured accurately by Codecov.
    • Type stability: Where appropriate we should ensure that users don't have to worry about NeuroCore.jl causing type stability issues. (Resolved using FieldProperties.jl)
    • Sensible defaults: I don't anticipate many to be implemented. Right now the default is to error unless it's part of a dictionary look-up for the property.
    • User-level named dimension manipulation: Some important user-level functionality is dedicated to manipulating the named dimensions that matter for neuroscience data.
  • Documentation:
    • Document all user facing methods
    • Formal documentation manual
    • Readme/introduction with intention/purposes/goals of NeuroCore.jl

File Format Support

I plan on getting the following supported (mostly because I regularly use these):

  • NIfTI
    • I will be personally taking care of this one (will be ready very soon after NeuroCore is ready for wide public use)
  • CIFTI
    • Requires geometry and graph types. This will come as an extension on NIfTI support
  • GIFTI
    • Limited support exists in GIFTI.jl.
  • Some DICOM conversion
    • A lot of this could be accomplished by simply loading DICOMs into a format that's compatible with NeuroCore and then mapping various scanner models' data to properties
  • BDF/EDF
    • This exists, but we need to either 1) make some changes to the existing packages so that files load into structures that conform to NeuroCore or 2) rely heavily on FieldProperties to map NeuroCore properties

Here are some other formats that may be worth supporting but would require someone else to take the lead.

  • .vhdr
  • .vmrk
  • .eeg
  • .set
  • .fdt
  • .mef
  • .nwb
  • .fif

Type Interfaces

AbstractArray support

We need an AbstractArray subtype with the following features:

  1. Flexible metadata storage
  2. Named dimensions
  3. Indexing by units/keys

This is largely accomplished by AxisIndices.jl. I've been heavily testing it and hope to get it to the point where we can perform things like PCA and ICA directly on the array types it provides.
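As a rough illustration of the three features listed above, here is a minimal sketch written against Base Julia only. It is not the AxisIndices.jl API: the names `MetaArray`, `dimindex`, and `atkey`, the field layout, and the example channel labels are all hypothetical and chosen just for this example.

```julia
# Hypothetical wrapper type; not taken from AxisIndices.jl or NeuroCore.jl.
struct MetaArray{T,N,A<:AbstractArray{T,N},K<:Tuple} <: AbstractArray{T,N}
    data::A
    dimnames::NTuple{N,Symbol}   # named dimensions
    axiskeys::K                  # one vector of keys/units per dimension
    meta::Dict{Symbol,Any}       # flexible metadata storage
end

# Convenience constructor: metadata is passed as keyword arguments.
function MetaArray(data::AbstractArray{T,N}, dimnames::NTuple{N,Symbol},
                   axiskeys::Tuple; meta...) where {T,N}
    map(length, axiskeys) == size(data) ||
        throw(DimensionMismatch("need one key per index along every dimension"))
    return MetaArray{T,N,typeof(data),typeof(axiskeys)}(
        data, dimnames, axiskeys, Dict{Symbol,Any}(meta))
end

# Minimal AbstractArray interface: size plus integer indexing.
Base.size(x::MetaArray) = size(x.data)
Base.getindex(x::MetaArray{T,N}, I::Vararg{Int,N}) where {T,N} = x.data[I...]

# Position of a named dimension.
function dimindex(x::MetaArray, name::Symbol)
    i = findfirst(==(name), x.dimnames)
    i === nothing && throw(ArgumentError("no dimension named $name"))
    return i
end

# Select along one named dimension by key. For simplicity this returns a
# plain array, so names and metadata are dropped from the result.
function atkey(x::MetaArray{T,N}, name::Symbol, key) where {T,N}
    d = dimindex(x, name)
    i = findfirst(isequal(key), x.axiskeys[d])
    i === nothing && throw(KeyError(key))
    inds = ntuple(n -> n == d ? i : Colon(), N)
    return x.data[inds...]
end

# Usage: 2 channels by 4 time points, with time keys in seconds.
data = reshape(collect(1.0:8.0), 2, 4)
x = MetaArray(data, (:channel, :time), (["Cz", "Pz"], 0.0:0.5:1.5);
              sampling_rate = 2.0)

println(size(x))                    # ordinary array behaviour
println(atkey(x, :time, 0.5))       # all channels at t = 0.5
println(atkey(x, :channel, "Pz"))   # full time course of one channel
println(x.meta[:sampling_rate])     # metadata lookup
```

Because the wrapper subtypes `AbstractArray`, generic array code only needs `size` and `getindex` to work on it, which is the property that would let routines such as PCA or ICA operate on it unchanged.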

Geometry Types

I'm leaning towards using GeometryTypes.jl, which will provide support for plotting with very little effort.

Graph Types

Connectomes and potentially data access patterns.

Plotting and Visualization

The backend for this will likely be all or mostly Makie.jl.

Note that work for this specific objective will start rolling out after March 16th (after VizCon 2). This will ensure that developments in this area are in greater harmony with future directions of Julia's various plotting ecosystems.

@Tokazama Tokazama changed the title from "Initial Release/Development and Integration Plans" to "Roadmap" Feb 10, 2020
@Tokazama Tokazama pinned this issue Feb 10, 2020
@timholy
Member

timholy commented Feb 10, 2020

WRT an AxisArrays replacement, should we evaluate DimensionalData.jl? Both DimensionalData and NamedDims.jl happen to be standing at 227 commits, but the current effort being spent in DD is quite high.

@Tokazama
Member Author

should we evaluate DimensionalData.jl?

tl;dr: I'm open to whatever solution the wider community ends up adopting (particularly JuliaImages), but I currently think that NamedDims has some big advantages.

Although DimensionalData.jl has good performance, I think NamedDims.jl has had thought put into every single method's performance. For example, you'll find this comment peppered throughout the code: 0-Allocations see: `@btime (()->dim((:a, :b), :b))()`. Part of this may just be that more people seem to be contributing to NamedDims.jl in discussion and PRs.
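For readers unfamiliar with that kind of check, the sketch below shows the same idea using only Base's `@allocated` rather than BenchmarkTools' `@btime`. The helper `finddim` is a hypothetical stand-in written for this example, not the NamedDims.jl implementation of `dim`, and the byte count it reports may vary between Julia versions.

```julia
# Hypothetical helper, loosely modelled on the `dim` call in the quoted comment:
# return the position of `name` within a tuple of dimension names.
function finddim(dimnames::Tuple{Vararg{Symbol}}, name::Symbol)
    for i in 1:length(dimnames)
        dimnames[i] === name && return i
    end
    throw(ArgumentError("dimension $name not found in $dimnames"))
end

# Wrap the call in a zero-argument function, as the quoted comment does with a
# closure, so the measurement runs inside compiled code.
lookup() = finddim((:a, :b), :b)

lookup()                      # warm-up call so compilation is not measured
bytes = @allocated lookup()   # bytes allocated by the second call
println("index = ", lookup(), ", allocated bytes = ", bytes)
```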

There are a lot of nitpicky things about DimensionalData's design that I don't like. Most of my issues with it stem from it requiring users to specify dedicated types for each dimension. I think this results in a lot of unfriendly syntax, and I suspect that it could lead to more burdensome maintenance in the future if it's widely adopted (if two different packages define the Time dimension type, then everything breaks when those two packages are loaded).
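The collision concern can be sketched with two toy modules. `PkgA`, `PkgB`, and `describe` are hypothetical names invented for this example, and the code does not use DimensionalData's actual types; it only shows the mechanism, assuming each package defines its own `Time`.

```julia
# Two hypothetical packages that each define their own `Time` dimension type.
module PkgA
    struct Time end
    describe(::Time) = "time axis (PkgA)"
end

module PkgB
    struct Time end
end

# Same name, but two distinct types: PkgA's method does not accept PkgB's type.
same_type = PkgA.Time === PkgB.Time                  # false
accepts_b = applicable(PkgA.describe, PkgB.Time())   # false

# Symbols have no owning module, so identical names are always identical.
same_symbol = :time === :time                        # true

println((same_type, accepts_b, same_symbol))
```

If both modules also exported `Time`, the unqualified name would additionally have to be written as `PkgA.Time` or `PkgB.Time` once both were loaded.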

I admit that a lot of these reasons aren't completely concrete (ugly syntax to one is beautiful Python syntax to another 😉). Therefore, if the consensus ends up being that DimensionalData is the way to go, then I will fully support that.

One last point. I think the problem with indexing could be solved relatively quickly once people agree and get behind a single approach. I really like what @mcabbott and I are circling around at the end of JuliaCollections/AxisArraysFuture#1. I have most of the code implemented for what I've proposed and I'm starting to write some more examples here.

@timholy
Member

timholy commented Feb 10, 2020

Good to know. Until this morning I hadn't realized I wasn't watching NamedDims.jl, so I missed all those discussions. Maybe a good thing though, I've not been ready yet to tackle this issue with the seriousness it deserves, so best if others take the lead. But some of our other cleanups are getting sufficiently complete that this one is rising higher in the priority queue.

@Tokazama
Member Author

Tokazama commented Feb 10, 2020

I just pushed some more examples and benchmarks to the last link I referenced. It's not pretty enough for a formal package, but I've tried to explore a lot of generic array interfaces as far as I can reasonably take them (e.g., how it would look to perform cat, append!, push!, etc.). The trouble I've found with a lot of this is that it's hard to explain why a certain approach might be better without a working demonstration, so hopefully this will prove helpful as people have time to look at it.

I've not been ready yet to tackle this issue with the seriousness it deserves

I appreciate any contributions you're able to make. I understand you are very busy with other things.

As for the rest of the roadmap (or lack thereof), I will finish the broad picture over the next couple of days and create separate issues to explore the more granular details.

@Tokazama
Member Author

Tokazama commented Feb 19, 2020

Registering a package soon that should help with (or even solve) the array indexing issue here: https://github.com/Tokazama/AxisIndices.jl

@behinger

behinger commented Jun 8, 2020

not sure where the best place is to write this, I wrote a very basic eeglab .set/.fdt importer:
https://github.com/unfoldtoolbox/unfold.jl/blob/master/test/debug_readEEGlab.jl

Needs to be adapted and extended to NeuroCore - I will do so happily once a bit more docu is there :)


3 participants