-
Notifications
You must be signed in to change notification settings - Fork 5
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
1 changed file
with
19 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
--- | ||
layout: post | ||
title: 'Accidentally Collecting 30 Million Land Records' | ||
author: waldoj | ||
--- | ||
|
||
We’ve just finished an interesting, experimental project to open up cadastral data, and the results are too delightful to keep quiet (despite that we have nothing to show for the project right now except a GitHub repository). | ||
|
||
U.S. Open Data is a long-time supporter of the [Open Addresses](https://openaddresses.io/) project, a volunteer-run project that aggregates government-published address datasets to create a global repository of the coordinates of street addresses. Anecdotally, project volunteers had noticed that a fair number of the data sources contained not just the latitude and longitude of an address, but the boundaries of the parcel. That raised the question of how many of the indexed 257 million addresses might include boundary data that was going unused. Could we have accidentally collected millions of cadastral records? | ||
|
||
So we hired [PostLight](https://www.postlight.com/) to figure it out for us. Developer Bryan Bickford spent a little over a week creating [a Python-based tool to find and extract parcel data from OpenAddresses’ records](https://github.com/openaddresses/parcels). | ||
|
||
Bryan’s work gave us a hard number: of the 1,511 data sources ingested by OpenAddresses, 383 include parcel boundaries (or 25%). There are a total of 30,461,769 parcels included. Although OpenAddresses has address data for dozens of countries, it turns out that only 5 commingle addresses with parcel boundaries: Brazil, Canada, Singapore, Ukraine, and the United States. 95% of those included parcels are in the United States (29,033,215 in total). That’s a lot of land parcels to have collected accidentally. | ||
|
||
If the idea of a comprehensive cadastral map sounds familiar, that’s because this is exactly what [Loveland](https://makeloveland.com/) is doing. They’ve [put together a collection of over 100 million parcels](https://makeloveland.com/reports/parcels). We’ll figure out the extent to which their 100 million parcels overlap with our 30 million parcels, to ensure that we can fill in any gaps that we’ve identified in Loveland’s records. The first 10% of what Loveland does is a lot like what OpenAddresses does, but they go so much farther in building atop the data that they collect, instead of just collecting and publishing metadata. | ||
|
||
The OpenAddresses project is working on figuring out what to do with parcel records, now that it can be known that it has so many of them. That might mean doing nothing (beyond sharing this data), and it might mean creating an ongoing offshoot of the project to publish this data. It remains to be seen. | ||
|
||
Our thanks to the [Shuttleworth Foundation](https://www.shuttleworthfoundation.org/) for funding this project. |