GitHub - eurov/uk_property: Seeking the optimal solution for parsing a huge .csv file

Statistical data set

The UK Land Registry publishes open data on real estate transactions on gov.uk. The datasets are published in txt and csv formats, but their size can make them difficult to work with.

So I've got a huge .csv file: pp-complete.csv

You may download it here. Done? Now you can see that each line contains information about one property sale in the following form:

{F887F88E-7D15-4415-804E-52EAC2F10958},"70000","1995-07-07 00:00","MK15 9HP","D","N","F","31","","ALDRICH DRIVE","WILLEN","MILTON KEYNES","MILTON KEYNES","MILTON KEYNES","A","A"

Here is an explanation of the column headers.
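As a rough sketch, the sample line above can be split into named fields with Python's built-in csv module. The field names in COLUMNS are my own labels, assumed from the Land Registry's published column descriptions; they are not part of the file itself, which has no header row in this sample.

```python
import csv

# Field labels are assumptions based on the published column descriptions;
# the file itself carries no header line in the sample shown above.
COLUMNS = [
    "transaction_id", "price", "date", "postcode", "property_type",
    "old_new", "duration", "paon", "saon", "street", "locality",
    "town", "district", "county", "ppd_category", "record_status",
]

SAMPLE = (
    '{F887F88E-7D15-4415-804E-52EAC2F10958},"70000","1995-07-07 00:00",'
    '"MK15 9HP","D","N","F","31","","ALDRICH DRIVE","WILLEN",'
    '"MILTON KEYNES","MILTON KEYNES","MILTON KEYNES","A","A"'
)


def parse_line(line):
    """Parse one raw line into a dict keyed by the assumed column labels."""
    row = next(csv.reader([line]))  # csv.reader handles the quoted fields
    return dict(zip(COLUMNS, row))


record = parse_line(SAMPLE)
print(record["price"], record["postcode"], record["street"])
# prints: 70000 MK15 9HP ALDRICH DRIVE
```

All values come back as strings, so the price and date would still need converting if they are used for arithmetic or sorting.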

Purpose

The purpose is to find all properties that have been sold two or more times, and write the result to a separate file.
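One possible approach, sketched below, streams the file twice: the first pass counts how many times each address appears, and the second pass copies every row whose address was seen at least twice. This is not necessarily what uk_property_.py does. Treating (postcode, PAON, SAON, street) as the identity of a property is an assumption, and so are the column positions and the function name find_repeat_sales.

```python
import csv
from collections import Counter

# Assumed 0-based column positions: postcode, PAON, SAON, street.
# Using these four fields as a property's identity is an assumption.
KEY_FIELDS = (3, 7, 8, 9)


def address_key(row):
    """Build a hashable key that identifies a property by its address."""
    return tuple(row[i] for i in KEY_FIELDS)


def find_repeat_sales(src_path, dst_path):
    """Write rows for properties sold two or more times; return rows written."""
    counts = Counter()

    # Pass 1: count sales per address without keeping whole rows in memory.
    with open(src_path, newline="", encoding="utf-8") as src:
        for row in csv.reader(src):
            if len(row) > max(KEY_FIELDS):  # skip blank or truncated lines
                counts[address_key(row)] += 1

    # Pass 2: copy every row whose address occurred at least twice.
    written = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst, quoting=csv.QUOTE_ALL)
        for row in csv.reader(src):
            if len(row) > max(KEY_FIELDS) and counts[address_key(row)] >= 2:
                writer.writerow(row)
                written += 1
    return written
```

A call might look like find_repeat_sales("pp-complete.csv", "repeat_sales.csv"), where the output file name is hypothetical. The counter still holds one key per distinct address, so memory use grows with the number of distinct addresses rather than with the number of rows.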

