Add more window transform examples #3572

Closed
3 tasks done
kanitw opened this issue Mar 26, 2018 · 6 comments

@kanitw
Member

kanitw commented Mar 26, 2018

(Add to window.md)

@kanitw
Member Author

kanitw commented Mar 26, 2018

@AkshatSh

The window_student_rank.vl.json doesn't show anything -- so I'm gonna remove it first.

Please correct it and submit it in a follow-up PR.

I also wonder why the frame for rank and count isn't the same. (Maybe it's correct, but I want to understand why -- given that it's not showing anything I suspect that it's wrong.)

{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
  "description": "A bar graph showing the scores of the top 5 students. This shows an example of the window transform, for how the top X can be filtered, and also how a rank can be computed for each student.",
  "width": 300,
  "height": 50,
  "data": {
    "values": [
      {"student": "A", "score": 100}, {"student": "B", "score": 56},
      {"student": "C", "score": 88}, {"student": "D", "score": 65},
      {"student": "E", "score": 45}, {"student": "F", "score": 23},
      {"student": "G", "score": 66}, {"student": "H", "score": 67},
      {"student": "I", "score": 13}, {"student": "J", "score": 12},
      {"student": "K", "score": 50}, {"student": "L", "score": 78},
      {"student": "M", "score": 66}, {"student": "N", "score": 30},
      {"student": "O", "score": 97}, {"student": "P", "score": 75},
      {"student": "Q", "score": 24}, {"student": "R", "score": 42},
      {"student": "S", "score": 76}, {"student": "T", "score": 78},
      {"student": "U", "score": 21}, {"student": "V", "score": 46}
    ]
  },
  "transform": [{
    "window": [{
      "op": "rank",
      "field": "score",
      "as": "rank"
    }],
    "sort": [{ "field": "score", "order": "ascending" }],
    "groupby": [
      "Student"
    ],
    "frame": [null, 0]
  },
  {
    "window": [{
      "op": "count",
      "field": "score",
      "as": "totalStudents"
    }],
    "sort": [{ "field": "score", "order": "ascending" }],
    "groupby": [
      "Student"
    ],
    "frame": [null, null]
  },
  {
    "filter": "datum.totalStudents - datum.rank > 5"
  }],
  "mark": "bar",
  "encoding": {
    "x": {
        "field": "Score",
        "type": "quantitative",
        "axis": { "title": "Score", "grid": false }
    },
    "y": {
        "field": "student",
        "type": "nominal",
        "scale": { "rangeStep": 12 },
        "axis": { "title": "" }
    },
    "color": {
        "field": "student",
        "type": "nominal"
    }
  }
}

@AkshatSh
Contributor

AkshatSh commented Mar 26, 2018

Rank and Count have different frames because they compute two different things.

The rank window transform computes "how many students am I better than?" (rank), and the count window transform computes "how many students total?" (totalStudents).

To get the top K, I select (totalStudents - rank) < K. The top-scoring student is better than all students, so their rank equals totalStudents and the difference is 0. The second student has a rank of totalStudents - 1, so totalStudents - (totalStudents - 1) = 1, and so forth.

This way the best student gets 0, the second best gets 1, and keeping only the rows where this value is less than K yields the top K students.
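
To make the arithmetic concrete, here is a rough Python sketch of the same computation on the scores from the spec above. It is not Vega-Lite code. It assumes rank is 1-based in ascending score order with ties sharing a rank (and a gap after the tie), and that the count covers all rows; `rank_ascending` and `top_k` are hypothetical helper names used only for illustration.

```python
# Scores copied from the spec posted in this issue.
scores = {
    "A": 100, "B": 56, "C": 88, "D": 65, "E": 45, "F": 23,
    "G": 66, "H": 67, "I": 13, "J": 12, "K": 50, "L": 78,
    "M": 66, "N": 30, "O": 97, "P": 75, "Q": 24, "R": 42,
    "S": 76, "T": 78, "U": 21, "V": 46,
}


def rank_ascending(values):
    """Map each value to its rank (assumed semantics: 1-based, ties share
    the lowest position, the next distinct value skips ahead)."""
    first_position = {}
    for position, value in enumerate(sorted(values), start=1):
        # setdefault keeps the first (lowest) position seen for tied values.
        first_position.setdefault(value, position)
    return first_position


def top_k(scores, k):
    """Keep students where totalStudents - rank < k, best score first."""
    rank_of = rank_ascending(scores.values())
    total_students = len(scores)  # count over every row
    kept = [
        student
        for student, score in scores.items()
        if total_students - rank_of[score] < k
    ]
    return sorted(kept, key=lambda student: -scores[student])


print(top_k(scores, 5))
```

Under these assumptions the best student gets a difference of 0 and the filter with K = 5 keeps A, O, C, L and T. L and T tie at 78, so both get the same difference.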

The fix here was to change Score to score and make the filter "filter": "datum.totalStudents - datum.rank < 5" instead of "filter": "datum.totalStudents - datum.rank > 5".

I have a WIP PR for these changes and other examples: #3573

@kanitw
Member Author

kanitw commented Mar 26, 2018

Your rationale still doesn't explain why rank would use "frame": [null, 0], not [null, null].

@AkshatSh
Contributor

I wasn't sure whether [null, null] or [null, 0] would be appropriate. [null, 0] is the default frame for the window transform, so I assumed rank would work as expected with it. I also thought of rank as the rank up to the current point, hence the [null, 0] frame.

I can change it to [null, null] if that is better.

@kanitw
Member Author

kanitw commented Mar 26, 2018

I'd say test and see what's right.

@AkshatSh
Contributor

Update on this: I checked with the Vega channel, and rank does not rely on the frame. So I moved the example to have count and rank in the same window transform.
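
For anyone following along, the difference can be sketched in plain Python (again, not Vega code). This is a simplified model under stated assumptions: a frame is a pair of row offsets relative to the current row with None standing in for null, peer handling for tied sort values is ignored, and rank is taken purely from the position in the sort order. `window_count` and `window_rank` are hypothetical names.

```python
def window_count(sorted_values, frame):
    """Count the rows inside frame = (lo, hi) around each row.

    Offsets are relative to the current row; None means unbounded.
    Simplification: ties in the sort field get no special treatment.
    """
    n = len(sorted_values)
    lo, hi = frame
    counts = []
    for i in range(n):
        start = 0 if lo is None else max(0, i + lo)
        end = n - 1 if hi is None else min(n - 1, i + hi)
        counts.append(end - start + 1)
    return counts


def window_rank(sorted_values):
    """1-based rank from sort position; ties share a rank. No frame used."""
    ranks = []
    for i, value in enumerate(sorted_values):
        if i > 0 and value == sorted_values[i - 1]:
            ranks.append(ranks[-1])  # same value as previous row: same rank
        else:
            ranks.append(i + 1)
    return ranks


ascending = [12, 13, 21, 23]
print(window_count(ascending, (None, 0)))     # running count
print(window_count(ascending, (None, None)))  # total count on every row
print(window_rank(ascending))                 # identical for any frame
```

In this model a count with frame [null, 0] is a running count that equals the rank on distinct values, so totalStudents - rank would be 0 for every row. Only [null, null] gives the total needed for the top-K filter, while rank comes out the same either way.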
