optionally hit worker even if some sources didn't hold data for the tile #75

Open
tcql opened this issue Jan 26, 2016 · 6 comments

@tcql
Contributor

tcql commented Jan 26, 2016

Per this code:

for (var i = 0; i < results.length; i++) {
  data[sources[i].name] = results[i];
  if (!results[i]) return process.send({reduce: true});
}

the worker bails out and returns a reduce event if any source doesn't have data for the requested tile. This is usually great, but when you want to compare disparate data sources and rely on reduce events to report how much data each source does or doesn't have in a tile, you end up losing information.

For example, say I want to find the length of roads in San Francisco that are matched by GPS data points. I would like to keep a tally of the total length of road in the bbox, as well as how much of it is matchable by GPS points. Right now, if there is no GPS data in a tile, we bail out, so I'm missing some of the total length information.

To maintain compatibility and provide optimization for the usual cases where you want this bail-out behavior, I'm proposing we add a tile-reduce option for this, maybe requireAllSources: false (defaulted true).
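As a rough sketch of what I mean, the loop above could become something like the following. The `collectSources` wrapper and the `requireAllSources` option are hypothetical names for illustration; neither exists in tile-reduce today, and returning `null` here stands in for the `process.send({reduce: true})` bail-out.

```javascript
// Sketch only: `collectSources` and `options.requireAllSources` are
// hypothetical names, not existing tile-reduce API.
function collectSources(sources, results, options) {
  options = options || {};
  // Default to true so the current bail-out behavior is preserved.
  var requireAll = options.requireAllSources !== false;
  var data = {};
  for (var i = 0; i < results.length; i++) {
    data[sources[i].name] = results[i];
    // Bail out on a missing source only when the caller has not opted out.
    // The real worker would call process.send({reduce: true}) at this point.
    if (!results[i] && requireAll) return null;
  }
  return data;
}
```

With `requireAllSources: false`, a tile that has roads but no GPS data would still reach the map function, with the missing source left empty in `data`.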

cc @morganherlocker @aaronlidman @mourner

@mourner
Member

mourner commented Jan 27, 2016

👍 makes sense, let's add an option.

@morganherlocker
Contributor

I'm proposing we add a tile-reduce option for this, maybe requireAllSources: false (defaulted true).

I like this proposal, but might counter with something slightly more fine-grained:

value                 behavior
requireData: 'all'    all sources must have data (current default)
requireData: 'any'    at least 1 source must have data
requireData: 'none'   function is called even on empty tiles

The any option is actually what I thought the current default was, and would cover cases like Tim's road analysis. The none option is one I have wanted to have for certain unusual cases, such as jobs that request data from a non-tiled source (I have done this with satellite data and can think of other cases), and in cases where results might need to be logged out somewhere, even (especially?) in cases of no data.
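A minimal sketch of how the three modes could be checked, assuming `results` is the per-source array from the worker loop quoted in the issue. `shouldProcessTile` is a hypothetical helper name and `requireData` is the proposed option, not something tile-reduce currently supports.

```javascript
// Sketch only: `shouldProcessTile` is a hypothetical helper, and
// `requireData` is the option proposed in this thread.
function shouldProcessTile(results, requireData) {
  var mode = requireData || 'all'; // 'all' mirrors the current behavior
  // Count the sources that returned data for this tile.
  var present = results.filter(Boolean).length;
  if (mode === 'all') return present === results.length;
  if (mode === 'any') return present > 0;
  if (mode === 'none') return true;
  throw new Error('Unknown requireData value: ' + mode);
}
```

The worker would send the reduce event and skip the map function whenever this returns false.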

Thoughts?

@tcql
Contributor Author

tcql commented Apr 27, 2016

Bump. Any further thoughts here? I'm fine with having 3 modes, per @morganherlocker's proposal:

value                 behavior
requireData: 'all'    all sources must have data (current default)
requireData: 'any'    at least 1 source must have data
requireData: 'none'   function is called even on empty tiles

@mourner @aaronlidman

@mourner
Member

mourner commented Apr 27, 2016

I'm fine with the API proposed above too.

@jenningsanderson

👋 Any update on this? I'd love to see this feature supported as well.

@iandees

iandees commented Feb 10, 2022

Hello from the future! I am running into this today and would also love to have this feature. I'm hacking in a workaround for now, but excited for #110.
