A simple bash script that harvests email addresses from commit metadata. Git records the author's email address, taken from the user's git configuration (user.email), in every commit, so the addresses remain visible after the commits are pushed to GitHub. Running git log against a repository lists its commits, from which the email addresses can be parsed. The githump script enumerates all repositories for a target organization or user, then extracts email addresses from the commit logs of each repository. Finally, the unique addresses are extracted from the intermediate results and saved in the results directory.
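The per-repository extraction step described above can be sketched roughly as follows. This is a minimal illustration, not the code from githump.sh: the function name extract_emails is hypothetical, and it assumes the repository has already been cloned locally.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from githump.sh): print the unique
# author and committer email addresses recorded in a local clone.
extract_emails() {
    local repo_dir="$1"
    # %ae = author email, %ce = committer email, one address per line.
    # --all walks every ref, not just the checked-out branch.
    git -C "$repo_dir" log --all --format='%ae%n%ce' | sort -u
}
```

Calling extract_emails path/to/clone prints one address per line, so the output of several repositories can be concatenated and passed through sort -u again to get the final unique list.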
Usage is easy: ./githump.sh <target>, where <target> is the GitHub account username. For example, ./githump.sh SalesforceEng targets everything at https://github.com/SalesforceEng.
Future improvements to be enumerated here.
This is more of a documentation and invocation issue than a code change. Write up instructions for running githump as a scheduled task and collecting email addresses over time. This could be done by committing the results after each run and pushing them to Bitbucket or another hosted version control repository.
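One possible shape for the "commit the results after each run" idea is sketched below. Everything here is an assumption for illustration: the function name archive_results, the commit message format, the remote name origin, and the cron schedule and path in the comments are all hypothetical, and the sketch assumes the results directory is already its own git repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: commit the current contents of a results directory
# so that each run leaves a dated snapshot in version control history.
# A crontab entry along these lines could drive it daily (path is made up):
#   0 3 * * * cd /path/to/githump && ./githump.sh SalesforceEng
archive_results() {
    local results_dir="$1"
    git -C "$results_dir" add -A
    # Commit only when something is staged, so unchanged runs add no noise.
    if ! git -C "$results_dir" diff --cached --quiet; then
        git -C "$results_dir" commit -q -m "githump results $(date -u +%Y-%m-%d)"
    fi
    # Pushing is left commented out; it assumes a remote named origin exists.
    # git -C "$results_dir" push origin HEAD
}
```

With this arrangement, git log on the results repository shows when each address first appeared, which gives the historical view without any extra bookkeeping.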
The JSON results from the GitHub API include an updated_at key-value pair. If githump is run daily and the results are aggregated, a check should be added so that only repositories updated since the previous run are cloned and searched.
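A rough sketch of such a check is shown below. It assumes jq is installed and that the repository list arrives on stdin as the JSON array returned by the GitHub API, with clone_url and updated_at fields on each entry; the function name repos_updated_since and the idea of passing the previous run's timestamp as an argument are hypothetical, not part of githump.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a JSON array of repositories on stdin and
# print the clone_url of each one whose updated_at is later than the
# timestamp given as the first argument.
# Timestamps are expected as ISO 8601 UTC strings such as
# 2024-01-31T00:00:00Z, which sort chronologically as plain strings.
repos_updated_since() {
    local since="$1"
    jq -r --arg since "$since" \
        '.[] | select(.updated_at > $since) | .clone_url'
}
```

The timestamp of the previous run would need to be stored somewhere between runs, for example in a small file alongside the results; only the URLs printed by this filter would then be cloned and searched.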