SSRN Abstract Scraper

Scrapes SSRN abstracts in two steps: first by searching for abstract URLs (by JEL code), then by downloading the contents of those URLs

!!! Restricted to the first 10 pages of each JEL Code (500 papers per code) !!!
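As a rough illustration of the limit above, the arithmetic can be sketched as follows. The helper names are hypothetical and not taken from the repository; the 50-papers-per-page figure is inferred from the sample log and from the stated 500-paper cap, and the last page of a code may hold fewer papers, so the result is an upper bound.

```python
# Hypothetical helpers; not the repository's actual code.
MAX_PAGES = 10        # cap stated in this README
PAPERS_PER_PAGE = 50  # assumption inferred from the sample log ("50 papers")


def pages_to_crawl(total_pages: int, max_pages: int = MAX_PAGES) -> int:
    """Number of result pages visited for one JEL code under the cap."""
    return max(0, min(total_pages, max_pages))


def max_papers(total_pages: int) -> int:
    """Upper bound on the number of papers collected for one JEL code."""
    return pages_to_crawl(total_pages) * PAPERS_PER_PAGE


# G00 reports 50 total pages in the sample log, so only 10 are crawled.
print(pages_to_crawl(50))  # 10
print(max_papers(50))      # 500
```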

Process Summary

url_collector.py - scrape abstract URLs for a given JEL code, crawling through only the first 10 pages
url_downloader.py - download the content of each abstract URL
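The results-page URLs that the collector step requests could be built along these lines. This is a minimal sketch, not the code in url_collector.py: the function names are hypothetical, and the query parameters are copied from the URLs shown in the sample log. It only constructs URLs and makes no network requests.

```python
from urllib.parse import urlencode

# Base URL and query parameters as they appear in the sample log.
BASE_URL = "https://papers.ssrn.com/sol3/JELJOUR_Results.cfm"


def page_url(jel_code: str, page: int) -> str:
    """Build the results-page URL for one JEL code and page number."""
    query = urlencode(
        {"npage": page, "form_name": "Jel", "code": jel_code, "lim": "false"}
    )
    return f"{BASE_URL}?{query}"


def page_urls(jel_code: str, n_pages: int) -> list[str]:
    """URLs for pages 1..n_pages of a JEL code (hypothetical helper)."""
    return [page_url(jel_code, page) for page in range(1, n_pages + 1)]


for url in page_urls("G00", 2):
    print(url)
```

Keeping URL construction separate from the request logic makes it easy to check the page range before any traffic is sent to SSRN.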

SSRN JEL Codes

https://papers.ssrn.com/sol3/displayjel.cfm

Example of Log (Printed to Screen)

processing G00
-------------------------
requested https://papers.ssrn.com/sol3/JELJOUR_Results.cfm?npage=1&form_name=Jel&code=G00&lim=false
post-request-sleeping for 6 seconds ... 
jel_code=G00 | 50 total pages
jel_code=G00 | page=1 | 50 papers
saved json jel=G00__page=1__date=20220301

requested https://papers.ssrn.com/sol3/JELJOUR_Results.cfm?npage=2&form_name=Jel&code=G00&lim=false
post-request-sleeping for 2 seconds ... 
jel_code=G00 | page=2 | 50 papers
saved json jel=G00__page=2__date=20220301
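The saved-file names in the log follow a `jel=<code>__page=<n>__date=<YYYYMMDD>` pattern. A small sketch of building and parsing such names is shown below; the function names are hypothetical, and the repository may construct these names differently.

```python
from datetime import date, datetime


def json_name(jel_code: str, page: int, day: date) -> str:
    """Format a name like the ones shown in the log (hypothetical helper)."""
    return f"jel={jel_code}__page={page}__date={day.strftime('%Y%m%d')}"


def parse_json_name(name: str) -> dict:
    """Split a saved-file name back into its JEL code, page, and date."""
    fields = dict(part.split("=", 1) for part in name.split("__"))
    return {
        "jel": fields["jel"],
        "page": int(fields["page"]),
        "date": datetime.strptime(fields["date"], "%Y%m%d").date(),
    }


name = json_name("G00", 1, date(2022, 3, 1))
print(name)                   # jel=G00__page=1__date=20220301
print(parse_json_name(name))
```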

talsan/ssrn
