rent house in hk
All scripts should be executed from the repository root (/), the directory that contains the package renthse.
In the table House, the column source_type (INTEGER) identifies which source each row came from.
Source type
- 1: 591.com
- 2: hk.centanet.com
- 3: 28hse.com
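As a rough sketch, the mapping above can be used to break the crawled records down by source. It assumes only what this README states: a table house with an integer column source_type. The names SOURCE_NAMES and count_by_source are made up for illustration and are not part of the renthse package.

```python
import sqlite3

# Mapping copied from the "Source type" list in this README.
SOURCE_NAMES = {
    1: "591.com",
    2: "hk.centanet.com",
    3: "28hse.com",
}


def count_by_source(conn):
    """Return {source name: row count} for the house table.

    Hypothetical helper, not part of renthse.
    """
    rows = conn.execute(
        "SELECT source_type, COUNT(*) FROM house GROUP BY source_type"
    ).fetchall()
    # Fall back to the raw number for any source_type not listed above.
    return {SOURCE_NAMES.get(st, str(st)): n for st, n in rows}
```

To try it against a crawl, open the database with sqlite3.connect("test.db") and pass the connection to count_by_source.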
- Have Python 2.7 installed (not tested on Python 3) to run the crawling scripts
- Have sqlite3 installed to examine db
- git clone git@github.com:warenix/renthse.git
Usage: python renthse/provider.py -p <provider>
Example: python renthse/provider.py -p 591
Output: all entries are inserted into, or updated in, a sqlite3 database, test.db
Available providers:
- 591
- centanet
- hse28
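To crawl every provider in one go, a small wrapper along these lines might help. It only rebuilds the example command shown above for each provider name in the list. The function names build_command and crawl_all are hypothetical, and the sketch assumes it is run from the repository root with python on the PATH.

```python
import subprocess

# Provider names copied from the "Available providers" list.
PROVIDERS = ["591", "centanet", "hse28"]


def build_command(provider, python="python"):
    """Build the argv for one crawl, mirroring the example command."""
    if provider not in PROVIDERS:
        raise ValueError("unknown provider: %s" % provider)
    return [python, "renthse/provider.py", "-p", provider]


def crawl_all():
    """Run each provider in turn; raises on the first non-zero exit."""
    for provider in PROVIDERS:
        subprocess.check_call(build_command(provider))
```

Calling crawl_all() runs the three crawls sequentially. check_call is used so that a failing provider stops the loop instead of being silently skipped.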
- Get a free api key at opencagedata
- Modify /renthse/extapi/OpenCage.py
- Find the line __key = None and replace it with __key = '< MY KEY >'
Command: python renthse/worker.py
Command: sqlite3 test.db 'select count(*) from house;'
Output: total number of records crawled
Sample Output:
4158
Command: sqlite3 test.db 'select * from house order by price desc limit 10'
Output: the 10 most expensive houses
Sample Output:
430000|山頂/南區|The Mount Austin
400000|貝沙灣|貝沙灣
400000|山頂/南區|山頂道
398000|貝沙灣|貝沙灣 5期 洋房
380000|山頂/南區|甘道
350000|山頂/南區|Overbays
348000|貝沙灣|貝沙灣 5期 洋房
348000|貝沙灣|貝沙灣 5期 洋房
338000|貝沙灣|貝沙灣
330000|貝沙灣|貝沙灣 5期 洋房
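The same query can be run from Python with the standard sqlite3 module. This is a minimal sketch: it relies only on the house table and price column used in the command above, and the helper name most_expensive is an assumption for illustration.

```python
import sqlite3


def most_expensive(conn, limit=10):
    """Return the `limit` highest-priced rows of house as dicts.

    Hypothetical helper, not part of renthse.
    """
    # sqlite3.Row lets each row be read by column name.
    conn.row_factory = sqlite3.Row
    cur = conn.execute(
        "SELECT * FROM house ORDER BY price DESC LIMIT ?", (limit,)
    )
    return [dict(row) for row in cur.fetchall()]
```

Pass it a connection opened with sqlite3.connect("test.db"). Each returned dict is keyed by whatever column names the table actually has, so no column other than price needs to be known in advance.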