invalid regex test #89
Comments
Closing this and moving it to the polylines subrepo, though the hits vs. misses question is still relevant for me.
orangejulius added a commit to pelias/polylines that referenced this issue (Jul 3, 2019):
We have had numerous reports from Pelias users about concerning error messages during builds, caused by the URL regex filter from pelias/model#115. While this filter is good, the resulting error message is alarming.

Looking today at the output of a planet build, it appears that many of these errors come from the polylines file created by Valhalla out of the OSM street network. Looking at the contents of the polyline file and the corresponding record on OSM, it seems that Valhalla puts the contents of the `ref` tag in the polyline file as an alternate name. The [ref tag](https://wiki.openstreetmap.org/wiki/Key:ref?uselang=en-US) will often contain a URL. This means that not only will the error happen frequently, but many records that are actually valid will be filtered out.

An example of this is the [Iowa Women of Achievement bridge](https://www.openstreetmap.org/way/65066830), which is completely valid in terms of name, geometry, and tagging but contains a URL in the `ref` field.

The polylines importer currently selects a single name value from the list of names in the polylines file by choosing the longest. This PR adds an additional filter that first removes any URL-like values from consideration. It should completely eliminate the otherwise concerning errors while ensuring all valid records make it into Elasticsearch.

Fixes pelias/whosonfirst#456
Fixes #216
Fixes pelias/docker#89
Connects pelias/model#116
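For readers following along, the selection logic described in the commit message can be sketched roughly as below. This is a minimal illustration, not the actual pelias/polylines code: the `URL_LIKE` pattern and the `selectName` helper are hypothetical names, the regex is an assumption and may differ from the one in pelias/model, and the sample values are made up.

```javascript
// Hypothetical pattern: treat anything that starts with http(s):// or www. as URL-like.
// The real filter in pelias/model may use a different regex.
const URL_LIKE = /^(https?:\/\/|www\.)/i;

// Hypothetical helper: drop URL-like names first, then choose the longest remaining name.
function selectName(names) {
  const candidates = names
    .map((name) => name.trim())
    .filter((name) => name.length > 0 && !URL_LIKE.test(name));

  // Returns undefined when no usable name is left.
  return candidates.reduce(
    (longest, name) =>
      longest === undefined || name.length > longest.length ? name : longest,
    undefined
  );
}

// Illustrative input: a street name plus a URL-like alternate name taken from a ref tag.
console.log(
  selectName([
    'Iowa Women of Achievement Bridge',
    'https://example.com/some/very/long/reference/url',
  ])
);
// prints "Iowa Women of Achievement Bridge"
```

Without the URL filter, picking the longest value would select the URL here, and the record would later be rejected by the model's regex check. Filtering first keeps the record and avoids the error.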
During `pelias import all` (or maybe the test run) I hit some 'invalid regex test' errors as below. Also, I'm looking for guidance on whether the number of misses vs. hits should look so skewed towards misses.