Scrapes SSRN abstracts in two steps: first collect the abstract URLs (by JEL code), then download the contents of those URLs
!!! Restricted to the first 10 pages of each JEL Code (500 papers per code) !!!
url_collector.py
- scrape abstract URLs for a given JEL code, crawling through only the first 10 pages
url_downloader.py
- download the content of each abstract URL
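
As a rough illustration of the collection step, the sketch below builds the per-page results URL and applies the 10-page cap. The URL pattern is copied from the sample log in this README; the names `build_page_url`, `pages_to_crawl`, and `MAX_PAGES` are hypothetical and are not taken from url_collector.py.

```python
from urllib.parse import urlencode

# Results endpoint as it appears in the sample log.
BASE_URL = "https://papers.ssrn.com/sol3/JELJOUR_Results.cfm"
MAX_PAGES = 10  # hypothetical constant mirroring the 10-page restriction


def build_page_url(jel_code, page):
    """Return the results URL for one page of a JEL code."""
    query = urlencode(
        {"npage": page, "form_name": "Jel", "code": jel_code, "lim": "false"}
    )
    return f"{BASE_URL}?{query}"


def pages_to_crawl(total_pages, max_pages=MAX_PAGES):
    """Return the 1-based page numbers to visit, capped at max_pages."""
    return range(1, min(total_pages, max_pages) + 1)


if __name__ == "__main__":
    # G00 reports 50 total pages in the sample log; only 10 are visited.
    for page in pages_to_crawl(50):
        print(build_page_url("G00", page))
```

With 50 papers per page, the cap of 10 pages gives the 500-paper limit per code noted above.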
https://papers.ssrn.com/sol3/displayjel.cfm
processing G00
-------------------------
requested https://papers.ssrn.com/sol3/JELJOUR_Results.cfm?npage=1&form_name=Jel&code=G00&lim=false
post-request-sleeping for 6 seconds ...
jel_code=G00 | 50 total pages
jel_code=G00 | page=1 | 50 papers
saved json jel=G00__page=1__date=20220301
requested https://papers.ssrn.com/sol3/JELJOUR_Results.cfm?npage=2&form_name=Jel&code=G00&lim=false
post-request-sleeping for 2 seconds ...
jel_code=G00 | page=2 | 50 papers
saved json jel=G00__page=2__date=20220301
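
The "saved json" names in the log above follow a `jel=<code>__page=<n>__date=<YYYYMMDD>` pattern. A minimal sketch of how one page of results could be written under such a name is shown below. The functions `page_stem` and `save_page`, the `.json` extension, the output directory, and the record layout are all assumptions made for illustration, not the actual code of url_collector.py.

```python
import json
from datetime import date
from pathlib import Path


def page_stem(jel_code, page, day):
    """Build a name such as 'jel=G00__page=1__date=20220301'."""
    return f"jel={jel_code}__page={page}__date={day:%Y%m%d}"


def save_page(records, jel_code, page, out_dir, day):
    """Write one page of scraped records to a JSON file and return its path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # The '.json' suffix is an assumption; the log shows the name without one.
    path = out_dir / (page_stem(jel_code, page, day) + ".json")
    path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical record layout: a list of abstract URLs for the page.
    example = ["https://papers.ssrn.com/sol3/papers.cfm?abstract_id=0000000"]
    print(save_page(example, "G00", 1, "output", date(2022, 3, 1)))
```

Keeping the JEL code, page number, and date in the file name makes it possible to see which pages were already collected without opening the files.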