-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dataset rework #96
Comments
This looks quite promising to me but also complex . At this moment I can't foresee all the implications this change will have (except the obvious breaking changes). Do you have a branch were you're working? I'm interested to see how easy/complex it is to port some of the samples. I'd look into a simple one and a complex one. This has the advantage that it uncovers flaws or weaknesses pretty immediately. It also gives you the library user's perspective. Another suggestion I have would be to support the library user as much as possible, but never restrict them. Inevitably we'll restrict the user so it'd be helpful to provide an "escape hatch", by that I mean a way of circumventing our abstractions to allow for unforeseen usages. |
For the end user it shouldn't be complex at all (that was one of my biggest requirements for this rework). They can use the datasets like a normal I don't know if I have time to push it as a branch today but in the meantime you should check out the last code piece in the issue. It's the For the last point, I also think it's important to keep it flexible (that was another big requirement). This solution is very flexible because you can subclass any datset to add properties, you can implement your own datasets with As a final note, I know it's unfortunate to have so many breaking changes but I really do believe they're necessary. Especially the datasets are a mess right now but there are other things that need reworking. I would suggest that we start working on release branches where we batch breaking changes. Only once multiple breaking changes are done, we can rebase onto master to get all the bug fixes and small features that were done there in the meantime and then merge back into master. It takes much more merge effort and requires all the developers to keep track of where we currently stand but it'll make actual releases much more reasonable and allows us to release minor versions when small features and bug fixes are done without conflicting with breaking changes. This would also be a good time to tidy up our GitHub project and possibly start using milestones to mark which features belong in which release. That way we can plan major releases beforehand and work on our big features in an organized manner. |
Also something to consider, regarding the simplicity and extensibility: I think most users use our library for fast and easy access to creating charts from their data. Although we 100% need to make sure our library can be easily extended, it should be our focus to simplify the process of creating a basic chart. Currently using the wrappers and the covariant datasets is much more complicated than it needs to be. A normal end user shouldn't be exposed to these kinds of concepts if they're not needed. Instead, it should feel like they're working with familiar classes like |
Looks very promising. I agree, most of the users of this Package are looking for a easy to use package to present their data, and as such I agree that simplifying the usage of the API is very important. But we shouldn't simplify the API itself, it should still be as powerful as possible, but easier to use. Also, I think the finished API should never throw a Furthermore, I think we should group all breaking changes together and then release a version 2, this way anyone using our package needs to only rewrite their code interfacing with our package once, and not multiple times. And while were at it, we should also help users upgrading their projects to the new version with before and after examples as well as a well documented list of all breaking changes, where they can view what changed, how it changed, and how they have to update their code I'll copy your code snippets you have here to get a further understanding for your mock implementations and probably come back with some feedback. |
Thanks for the feedback. I'm not sure how to understand your first paragraph. Could you elaborate on what you mean by that?
Regarding the second paragraph, I'd have to disagree. The way I want to implement the API on the dataset collections (the datasets themselfs don't throw) is modeled after how C# arrays handle it.
You could argue that it's not necessary to use the dataset collections in such APIs and I'd agree but in what way does it hurt implement the interface explicitly? Totally agree on the third paragraph. I don't know if it makes sense to put all the breaking changes that are currently planned (this, #70, #95, #78) in a single |
Describe the feature request
Currently the datasets are all over the place. Some have base classes, some don't. Mixable charts use the covariant
IMixableDataset<T>
which allows for different types of datasets in one chart but with many downsides. I once was really proud of my implementation of those mixable datasets (when I started learning about covariance) but let's face it, I did horribly. Having to use wrappers for all value types is really bad and extending this thing seems like a nightmare. Now with the recent interop-layer rework (not #70, that one 4 month ago), we need to have an id for each dataset so now we also have anIDataset
interface that ensures that we have an accessable id.The datasets are stored in
config.data.dataset
. This is usually a collection (eitherList
orHashSet
) containing objects of typeIMixableDataset<object>
or some specific class likeBubbleDataset
. For charts that only support one dataset-type with one data structure, using aList
along with a specific class, works fine. However, things get complicated once you're dealing with charts like the line chart. The line chart allows for different types of datasets (it's perfectly legal to have a bar dataset in a line chart) and the line chart also accepts data in different forms (array of numbers, array of number-points, array of time-points). These have to somehow be mixable without breaking typesafety and without allowing you to add other types of datasets.Which charts does this feature request apply to?
All charts
Describe the solution you'd like
I'd like the datasets and the collection of datasets to be enjoyable to use, performant and extendable.
Enjoyable to use
Dataset
ArrayList
orList<object>
or anything of the sorts. If you can modify it, it should be in a typesafe manner.IList<T>
. Until recently (and maybe even now) some datasets don't really allow modification but I'd like to have all the methods ofIList<T>
and as a bonus alsoAddRange
.Wrap()
extension methods are helpful but it's still not great.Dataset collection
LineDataset<string>
to a line chart because it doesn't support string values. This needs to be a compilation error, not aNotSupportedException
!IList<IDataset>
but preventing the user from adding anyIDataset
. Instead theAdd
methods are exposed with only supported overloads. Anything else throws aNotSupportedException
(or better, a compilation error, if possible).Performant
That's actually not that big of a deal considering the bottleneck in this library is likely the usage of reflection and dynamic features but I have not tested it. Still it would be nice if we weren't in boxing hell like we are currently with
IMixableDataset
and the wrappers.Extendable
We should split these datasets into interfaces and base classes so we can extend and modify them if needed. After all we are modeling a JavaScript library so it's definitely of value to be flexible. We don't want to get rid of typesafety (!) but being able to extend the model from outside without heavy reflection usage is great for something like this.
API proposal
I have already implemented the base system and this time I think I can actually be proud of it.
It's entirely typesafe but still flexible, raises compiler errors when you try to add a not supported dataset, has full support for structs, allows for object and collection initializers and implements
IList
as best as possible.There is the base interface
IDataSet
. This interface has no type associated with it. It's more a semantic restriction than anything else but it does contain the most important properties beingId
andType
. It's also important when storing datasets since it's the base of any dataset.Implementing that base interface is the generic version
IDataSet<T>
whereT
can be any type. Noin
orout
modifiers apply. This interface also assures that implementers need to expose a read-only list ofT
.Then the base class for all the datasets is
DataSet<T>
. This is a class that implementsIDataSet<T>
andIList<T>
for modification. It exposes the contents through a read-only property and thereby implementsIDataSet<T>.Data
. It also contains theAddRange
methods we love.Now the dataset collections. In many cases we can just use
List<BubbleDataset>
or something similar so we don't want too much abstraction.For charts that support more than that, we have the handy class
DatasetCollection
. It implementsIList<IDataset>
so you can do modifications as you please with the exception of addingIDataset
. TheAdd
andInsert
methods are implemented explicitly (and therefore can't be called unless casted) and throw aNotSupportedException
if used. However, they expose theprotected
methodAddDataset
which can add any dataset. This is the only way (disregarding reflection) to add datasets to this collection.Now we can derive from that collection and add our own
Add
methods. TheAdd
method is overloaded for every supported dataset. In the case of the line chart, this means every dataset that consists of eitherint
s,long
s,double
s,Point
s orTimePoint
s. For each possibility there is anAdd
method. Not only is the name intuitive, it's also the key to collection initializers. They search for an overload ofAdd
when the base implementsIEnumerable
(whichDatasetCollection
already does). We can add some abstraction to those collections like theNumberDatasetCollection
and theNumberAndPointDatasetCollection
but for now, there aren't interfaces likeINumberDatasetCollection
andIPointDatasetCollection
. You'd still need to implement both so it's not really useful unless we start using composition but then things get even more complex.And here's the code
This is the base implementation for every dataset.
And here is how you can store those datasets.
Now for some examples, shall we?
This is the simplified dataset for a line chart. Fully generic, supports value types and has the type already assigned. If you implement a new line-like chart, you can derive from this class and use the protected constructor to inject your own type.
In order to store all the datasets a line chart supports, we also need a
DatasetCollection
with the appropriateAdd
methods. We could just write those all in one but it makes more sense to get a certain degree of abstraction which helps implementing other charts such as the bar chart.What's missing is the
LineData
class. It just contains the correct dataset collection and the labels. The labels are only serialized when they contain data but they're still get-only.The bar example is similar but a bit more specific.
Another example is for the bubble chart. This one is less specific (both spectrums are well supported).
And here's all the rest I have/you need.
Additional context
Point
andTimeTuple
and many other classes as structs as seen here. They might need some tweaking but are already improved a lot over the current classes.Nullable<T>
(for the defaults, issue incoming). We'll have to see.Final words
Although I already thought this issue through quite a bit, I'm not sure on all the specifics yet. Please tell me anything you see wrong with this, potential pitfalls, missing features, etc.
I'd love to hear feedback on this!
The text was updated successfully, but these errors were encountered: