Feature request: top-level datasets to be more-expressive #4004

Open
ijlyttle opened this issue Jul 11, 2018 · 4 comments

@ijlyttle
Contributor

This may be a follow-on to #3806, and might also affect vega/altair#951.

My suggested implementation may be problematic, but I hope it will be enough to begin a discussion.

Problem

Currently we can use top-level datasets to map names to inline datasets, see spec-datasets. This is convenient for keeping the spec lightweight and comprehensible, especially for layered charts.

If I want to do the same thing but refer to a dataset by URL, I (think I) have to use the format shown in spec-csv, where the URL is repeated in every layer. The issue is that if I want to switch out my dataset, I need to change it in two places.

What I would like to be able to do

I would like to be able to do something like what is shown in spec-proposed, where datasets maps names to objects containing data specifications (not just inline values).

As a further request (which may be completely unreasonable), I wonder if datasets could map names to objects that would contain data and, optionally, transform specifications, as transformed data remains data. For example:

{
  ...,
  "datasets": {
    "data-001": {
      "data": {...},
      "transform": [...]
    }
  },
  ...
}

spec-datasets

{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.4.1.json",
  "config": {"view": {"height": 300, "width": 400}},
  "datasets": {
    "data-001": [
      {"a": "A","b": 28}, {"a": "B","b": 55}, {"a": "C","b": 43},
      {"a": "D","b": 91}, {"a": "E","b": 81}, {"a": "F","b": 53},
      {"a": "G","b": 19}, {"a": "H","b": 87}, {"a": "I","b": 52}
    ]
  },
  "layer": [
    {
      "data": {"name": "data-001"},
      "encoding": {
        "x": {"field": "a", "type": "nominal"},
        "y": {"field": "b","type": "quantitative"},
        "opacity": {"value": 0.5}
      },
      "mark": "bar"
    },
    {
      "data": {"name": "data-001"},
      "encoding": {
        "x": {"field": "a", "type": "nominal"},
        "y": {"field": "b","type": "quantitative"}
      },
      "mark": "line"
    }
  ]
}

spec-csv

{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.4.1.json",
  "config": {"view": {"height": 300, "width": 400}},
  "layer": [
    {
      "data": {
        "url": "https://ijlyttle.github.io/vega-lite-demo/data-raw/data_01.csv",
        "format": {"type": "csv"}
      },
      "encoding": {
        "x": {"field": "category", "type": "nominal"},
        "y": {"field": "number","type": "quantitative"},
        "opacity": {"value": 0.5}
      },
      "mark": "bar"
    },
    {
      "data": {
        "url": "https://ijlyttle.github.io/vega-lite-demo/data-raw/data_01.csv",
        "format": {"type": "csv"}
      },
      "encoding": {
        "x": {"field": "category", "type": "nominal"},
        "y": {"field": "number","type": "quantitative"}
      },
      "mark": "line"
    }
  ]
}

spec-proposed

{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.4.1.json",
  "config": {"view": {"height": 300, "width": 400}},
  "datasets": {
    "data-001": {
      "data": {
        "url": "https://ijlyttle.github.io/vega-lite-demo/data-raw/data_01.csv",
        "format": {"type": "csv"}
      }
    }
  },
  "layer": [
    {
      "data": {"name": "data-001"},
      "encoding": {
        "x": {"field": "category", "type": "nominal"},
        "y": {"field": "number","type": "quantitative"},
        "opacity": {"value": 0.5}
      },
      "mark": "bar"
    },
    {
      "data": {"name": "data-001"},
      "encoding": {
        "x": {"field": "category", "type": "nominal"},
        "y": {"field": "number","type": "quantitative"}
      },
      "mark": "line"
    }
  ]
}
@kanitw
Member

kanitw commented Jul 11, 2018

I can see this being useful. Btw, I think we don't need the additional data nesting in:


    "data-001": {
      "data": {
        "url": "https://ijlyttle.github.io/vega-lite-demo/data-raw/data_01.csv",
        "format": {"type": "csv"}
      }
    }

This should be sufficient:

    "data-001": {
      "url": "https://ijlyttle.github.io/vega-lite-demo/data-raw/data_01.csv",
      "format": {"type": "csv"}
    }
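
Until something like this is supported, the flattened form above could be emulated with a small preprocessing step before the spec reaches Vega-Lite. The following is only a minimal Python sketch under stated assumptions: `resolve_url_datasets` is a hypothetical helper (not part of Vega-Lite or Altair), it treats any `datasets` entry that is an object as a data specification to be copied into every view that references it by name, and it leaves inline arrays in `datasets` untouched.

```python
import copy


def resolve_url_datasets(spec):
    """Copy object-valued "datasets" entries into the views that name them.

    Hypothetical helper for illustration only. Entries that are lists
    (inline values) stay in "datasets"; entries that are dicts, such as
    {"url": ..., "format": ...}, replace each matching {"name": ...}.
    """
    spec = copy.deepcopy(spec)  # leave the caller's spec untouched
    datasets = spec.get("datasets", {})
    by_name = {k: v for k, v in datasets.items() if isinstance(v, dict)}

    def visit(node):
        if isinstance(node, dict):
            data = node.get("data")
            if isinstance(data, dict) and data.get("name") in by_name:
                node["data"] = copy.deepcopy(by_name[data["name"]])
            for key, value in node.items():
                if key != "datasets":  # do not descend into the data itself
                    visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(spec)

    # Keep only the inline datasets, which need no rewriting.
    inline = {k: v for k, v in datasets.items() if not isinstance(v, dict)}
    if inline:
        spec["datasets"] = inline
    else:
        spec.pop("datasets", None)
    return spec


# Usage: the URL is written once and expanded into both layers.
url = "https://ijlyttle.github.io/vega-lite-demo/data-raw/data_01.csv"
proposed = {
    "datasets": {"data-001": {"url": url, "format": {"type": "csv"}}},
    "layer": [
        {"data": {"name": "data-001"}, "mark": "bar"},
        {"data": {"name": "data-001"}, "mark": "line"},
    ],
}
resolved = resolve_url_datasets(proposed)
```

With this approach the dataset is switched out in one place only, and the output has the same shape as spec-csv.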

kanitw added this to the x.x Data & Transforms milestone Jul 11, 2018
@domoritz
Member

I think we could support top level URL data but I'm hesitant about transforms.

@ijlyttle
Contributor Author

For me, being able to use top-level URL data is much more important than the transforms - I just thought to bring it up while the topic might be "open".

If transforms will not be available at the top-level, then there is no need for the data nesting, so I would agree completely with @kanitw's suggestion. (Congrats btw!)

@kanitw
Member

kanitw commented Jul 11, 2018

I think we could revise this later if we want to support transforms.
