Scrape MLB IDs from Baseball-Reference #222

Merged (3 commits) on Jul 27, 2021

Conversation

marek-slipski
Contributor

Context

Baseball-Reference player stat pages (like this) have player MLB IDs embedded in links. Scraping these IDs would be useful because batting_stats_range and pitching_stats_range return a DataFrame in which the Baseball-Reference player name is the only identifier, and names are difficult to join on. This PR grabs the IDs from the links and adds an mlbID column to the batting and pitching tables.

Example output of pitching_stats_range: [image]
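
A minimal sketch of the idea, not the code in this PR: it pulls an ID out of the first link in each table row using only the standard library. The sample markup, the `mlb_ID` query parameter, and the names `RowIdParser` and `extract_mlb_ids` are all assumptions made for illustration; the real page layout and link format may differ.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs

# Hypothetical markup: the real page structure and link format are assumptions.
SAMPLE_HTML = """
<table>
  <tr><td><a href="/redirect.fcgi?player=1&amp;mlb_ID=123456">Player One</a></td><td>3.10</td></tr>
  <tr><td><a href="/redirect.fcgi?player=1&amp;mlb_ID=654321">Player Two</a></td><td>4.25</td></tr>
  <tr><td>No Link</td><td>5.00</td></tr>
</table>
"""


class RowIdParser(HTMLParser):
    """Collect one (name, mlb_id) pair per table row.

    mlb_id is None when no link in the row carries an mlb_ID parameter.
    """

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cells = None    # cell texts of the row being parsed
        self._mlb_id = None   # ID found in the current row, if any
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells, self._mlb_id = [], None
        elif tag == "td" and self._cells is not None:
            self._in_cell = True
            self._cells.append("")
        elif tag == "a" and self._in_cell and self._mlb_id is None:
            href = dict(attrs).get("href") or ""
            ids = parse_qs(urlparse(href).query).get("mlb_ID")
            if ids:
                self._mlb_id = int(ids[0])

    def handle_data(self, data):
        if self._in_cell:
            self._cells[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._cells:
            # The first cell is treated as the player name.
            self.rows.append((self._cells[0].strip(), self._mlb_id))
            self._cells = None


def extract_mlb_ids(html):
    """Return a list of (player_name, mlb_id_or_None) tuples."""
    parser = RowIdParser()
    parser.feed(html)
    return parser.rows
```

Rows without a link get `None` rather than being dropped, so the result stays aligned with the stat table and can be attached as an mlbID column.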

@schorrm
Collaborator

schorrm commented Jul 15, 2021

We can do this if you update the unit tests to match.

@marek-slipski
Contributor Author

I think that should do it, but let me know if I'm missing something (I'm new at this).

@schorrm
Collaborator

schorrm commented Jul 25, 2021

Whoops, missed this. Can you merge the upstream testing changes back in, push, and see if it passes CI?

@schorrm schorrm merged commit 7b766c1 into jldbc:master Jul 27, 2021