Seeking the optimal solution for parsing a huge .csv file

eurov/uk_property

Statistical data set

The UK Land Registry publishes open data on real estate transactions on gov.uk. The datasets are published in txt and csv formats, but their size can make them difficult to work with.

So I've got one huge .csv file: pp-complete.csv

You may download it here. Done? Now you can see that each line contains information about one property sale in the following form:

{F887F88E-7D15-4415-804E-52EAC2F10958},"70000","1995-07-07 00:00","MK15 9HP","D","N","F","31","","ALDRICH DRIVE","WILLEN","MILTON KEYNES","MILTON KEYNES","MILTON KEYNES","A","A"

Here is an explanation of the column headers.
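To make the line format concrete, here is a minimal sketch that parses the sample line above with Python's standard `csv` module. The file has no header row, so the names in `FIELDS` are my own labels, assumed from the Land Registry column descriptions; `parse_line` is a hypothetical helper, not part of this repository.

```python
import csv
import io

# Assumed labels for the 16 columns (the file itself carries no header row).
FIELDS = [
    "transaction_id", "price", "date_of_transfer", "postcode",
    "property_type", "old_new", "duration", "paon", "saon",
    "street", "locality", "town_city", "district", "county",
    "ppd_category", "record_status",
]

SAMPLE = (
    '{F887F88E-7D15-4415-804E-52EAC2F10958},"70000","1995-07-07 00:00",'
    '"MK15 9HP","D","N","F","31","","ALDRICH DRIVE","WILLEN",'
    '"MILTON KEYNES","MILTON KEYNES","MILTON KEYNES","A","A"'
)


def parse_line(line):
    """Split one CSV line and pair each value with its assumed column label."""
    row = next(csv.reader(io.StringIO(line)))
    return dict(zip(FIELDS, row))


record = parse_line(SAMPLE)
print(record["price"], record["postcode"], record["street"])
```

The `csv` reader handles the mix of unquoted (first column) and quoted fields, and it keeps the empty `saon` value as an empty string rather than dropping it.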

Purpose

The purpose is to find all properties that have been sold two or more times, and to write the result to a separate file.
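One possible approach, sketched under stated assumptions rather than offered as the optimal solution: stream the file twice. The first pass counts sales per property, the second pass writes every row whose property was counted at least twice. What identifies "the same property" is an assumption here (postcode, PAON, SAON and street, i.e. columns 3, 7, 8 and 9 counting from zero); `address_key` and `write_repeat_sales` are hypothetical names.

```python
import csv
from collections import Counter


def address_key(row):
    # Assumption: postcode, PAON, SAON and street together identify a property.
    return (row[3], row[7], row[8], row[9])


def write_repeat_sales(src_path, dst_path):
    """Copy rows for properties sold two or more times; return rows written."""
    counts = Counter()

    # Pass 1: count sales per property without holding full rows in memory.
    with open(src_path, newline="", encoding="utf-8") as src:
        for row in csv.reader(src):
            if len(row) >= 10:  # skip blank or truncated lines
                counts[address_key(row)] += 1

    # Pass 2: re-read the file and keep only rows of repeat-sale properties.
    written = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            if len(row) >= 10 and counts[address_key(row)] >= 2:
                writer.writerow(row)
                written += 1
    return written
```

Calling `write_repeat_sales("pp-complete.csv", "repeat-sales.csv")` would run it on the full file (the output name is just an example). The trade-off is reading the file twice in exchange for keeping only the address counts in memory; for a file this size even that counter may be large, so treat this as a baseline to measure against.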
