This repository provides a toolchain for gathering and analyzing merge scenarios found in git repositories. The tool stores the collected data in a normalized MySQL database. It supports extracting various features of the merge scenarios as well as executing the merge using different merge tools, detecting merge conflicts, and finding compilation and test problems with the merge resolution.
The toolchain has been tested with Ubuntu 20.04.
- Clone this repository:
git clone https://github.com/ualberta-smr/merganser
- Install the dependencies.
pip3 install -r requirements.txt
- Set the config.py file: The pre-defined paths, database information, constants, and access keys are stored in the config.py file. The full description of these parameters is in the wiki page. The only parameters that the user must set before using Merganser are the GitHub access keys and database parameters.
- Add the list of repositories: The input of the main program is a list of repositories to analyze. There are different ways to create such a list:
  - Add the repository list manually: If you already have the list of repositories to analyze, write them in a *.txt file (one repository per line) and copy the text file to ./working_dir/repository_list (this path is REPOSITORY_LIST_PATH, which is set in config.py).
  - Automatic searching: If you do not have specific repositories in mind but instead want to analyze repositories with a specific range of stars, watches, forks, or size, or that are in a specific application domain, you can search for repositories using search_repository.py. Read the wiki page to find out the parameters of this module.
- There are two ways to run the tool, depending on the final goal. The results are stored in CSV files.
  - Execute the tool to extract all available data:
    bash ./run_predict.sh <list_of_repositories>
  - Execute the tool for conflict prediction data:
    bash ./run_all.sh <list_of_repositories>
- The next step is storing the CSV files in a SQL database:
python3 ./data_conversion.py
- For conflict prediction, first create the data:
python3 ./data_prediction.py
The wiki page describes all possible parameters.
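
If you prefer to create the manual repository list from a script rather than by hand, the step described above can be sketched roughly as follows. This is only an illustration, not part of Merganser: the helper name write_repository_list and the list name are hypothetical, the default directory simply mirrors the ./working_dir/repository_list path mentioned above (check REPOSITORY_LIST_PATH in config.py for the real value), and the form each entry should take is not specified here, so consult the wiki page before relying on it.

```python
from pathlib import Path

# Assumption: mirrors the ./working_dir/repository_list path mentioned above.
# The authoritative value is REPOSITORY_LIST_PATH in config.py.
DEFAULT_LIST_DIR = Path("./working_dir/repository_list")


def write_repository_list(repositories, list_name, list_dir=DEFAULT_LIST_DIR):
    """Write one repository per line to <list_dir>/<list_name>.txt.

    Hypothetical helper for illustration only. Returns the path of the
    file that was written.
    """
    # Strip whitespace, skip blank entries, and drop duplicates
    # while preserving the original order.
    cleaned = list(dict.fromkeys(r.strip() for r in repositories if r.strip()))

    list_dir = Path(list_dir)
    list_dir.mkdir(parents=True, exist_ok=True)

    target = list_dir / f"{list_name}.txt"
    target.write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return target


# Example call (the entries below are placeholders, not a required format):
# write_repository_list(["owner-a/project-a", "owner-b/project-b"], "my_repositories")
```

Deduplicating and skipping blank lines keeps the file to exactly one repository per line, which is the only formatting requirement stated above.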
Merganser is released under the MIT License.
Feel free to report any issue about Merganser here. You can direct questions about installing and running the tool to the creators, Moein Owhadi Kareshk and Sarah Nadi.
You are very welcome to submit a pull request if you have a change, bug fix, or other improvement in mind.