Retrieve available business leads from Google Knowledge Panels with Python 3.
gkps is heavily inspired by knowledge-panel-scraper, a CLI scraper for Google's Knowledge Panels.
- scrape with fewer false negatives
- segment results
- fancy prompt
Use git to clone the repository, then install required libraries with the package manager pip.
requirements.txt was generated with pipreqs.
git clone https://github.com/RobyRemzy/google-knowledge-scraper.git
cd google-knowledge-scraper
pip install -r requirements.txt
python gkps.py inputfile.csv
inputfile.csv should be a plain-text CSV file in which each row contains the data used to generate a search query for a specific business.
For example:
"Bobcat of Monroe,Monroe,NC",1711 MORGAN MILL ROAD,MONROE,NC,28110,(704) 289-2200
"Kelly's Garage,Perry,NY",2868 STATE ROUTE 246,PERRY,NY,14530,(585) 237-2504
"Hoxie Implement Co,Hoxie,KS",933 OAK AVENUE,HOXIE,KS,67740-0587,(785) 675-3201
"Duhon Machinery,St. Rose,LA",10460 WEST AIRLINE HIGHWAY,ST. ROSE,LA,70087,(504) 466-5495
The script tries to fetch data from the Google knowledge panel for each row. If the request fails, it tries once more (it may succeed the second time!). If it fails a second time, it moves on to the next row.
- Green => data has been saved
- Cyan => data has been re-fetched
- Red => data has been re-fetched, but not successfully
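The retry-then-skip behaviour and the colour legend above can be sketched roughly as follows. This is not the code from gkps.py: the fetch callable is a hypothetical stand-in for the real scraping step, the ANSI codes are an assumption, and the sketch assumes cyan marks a retry that succeeded.

```python
# ANSI colour codes for the three statuses (the exact codes gkps uses are an assumption).
GREEN, CYAN, RED, RESET = "\033[32m", "\033[36m", "\033[31m", "\033[0m"


def fetch_with_retry(query, fetch, attempts=2):
    """Call fetch(query) up to `attempts` times.

    `fetch` is a hypothetical callable that returns a dict, or None on failure.
    Returns (data, colour): GREEN on first-try success, CYAN when the retry
    succeeded, RED (with data set to None) when every attempt failed.
    """
    for attempt in range(attempts):
        data = fetch(query)
        if data:
            return data, (GREEN if attempt == 0 else CYAN)
    return None, RED


def process(queries, fetch):
    """Run every query, print a coloured status line, and keep going after failures."""
    results = []
    for query in queries:
        data, colour = fetch_with_retry(query, fetch)
        print(f"{colour}{query}{RESET}")
        results.append((query, data))
    return results
```

A failed row is recorded with None instead of raising, so one bad query never stops the run.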
When finished, it will prompt you to tweak the failed queries by hand in your default editor.
If gkps.py finishes with a successful response, the files are copied into a timestamped folder:
- results.csv contains all existing results
- results_true.csv contains only successful responses
- results_false.csv contains only failed responses
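One way to produce this three-file split, plus a dated copy, is sketched below. The function names, the (row, ok) input shape, and the folder name format are all assumptions for illustration; the sketch also copies unconditionally, whereas gkps.py only does so after a successful run.

```python
import csv
import shutil
import time
from pathlib import Path


def write_rows(path, rows):
    """Write a list of rows (lists of fields) to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)


def save_results(results, out_dir="."):
    """Split results into the three CSV files and copy them to a dated folder.

    `results` is a list of (row, ok) pairs, where row is a list of CSV fields
    and ok says whether the knowledge panel lookup succeeded.
    """
    out = Path(out_dir)
    files = {
        "results.csv": [row for row, _ in results],
        "results_true.csv": [row for row, ok in results if ok],
        "results_false.csv": [row for row, ok in results if not ok],
    }
    for name, rows in files.items():
        write_rows(out / name, rows)

    # Folder name format is an assumption, e.g. 20240131-142501.
    stamped = out / time.strftime("%Y%m%d-%H%M%S")
    stamped.mkdir(exist_ok=True)
    for name in files:
        shutil.copy(out / name, stamped / name)
    return stamped
```

Keeping failed rows in the same column layout as the input is what makes it possible to feed results_false.csv straight back into the script.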
Generated files from the last command are also in the root directory and will be overwritten on the next run.
After some tweaks (or not), you can relaunch the party with this command until no more good data can be retrieved:
python gkps.py results_false.csv
Pull requests are welcome. Let's do this in Rust?