feat: adding doc/CONTRIBUTING.md (#86)
* feat: add contributing guidance

* refactor: optimize readme

* hotfix
ganler authored Feb 21, 2023
1 parent 3483f08 commit 2367d33
Showing 2 changed files with 132 additions and 73 deletions.
85 changes: 12 additions & 73 deletions README.md
@@ -1,18 +1,22 @@
<p align="center">
<img src="https://github.com/ganler/nnsmith-logo/raw/master/nnsmith-logo.svg", width="500">
</p>
<div align="center">
<img src="https://github.com/ganler/nnsmith-logo/raw/master/nnsmith-logo.svg" align="right" height="240"/>
</div>

# NNSmith

<p align="center">
<a href="https://github.com/ise-uiuc/nnsmith/actions/workflows/ci.yaml"><img src="https://github.com/ise-uiuc/nnsmith/actions/workflows/ci.yaml/badge.svg">
<a href="https://pypi.org/project/nnsmith/"><img src="https://img.shields.io/pypi/v/nnsmith?color=g">
<a href="https://github.com/ise-uiuc/nnsmith/blob/main/LICENSE"><img src="https://img.shields.io/pypi/l/nnsmith"></a>
</p>

## Support table
NNSmith is a random DNN generator and a fuzzing infrastructure, primarily designed for automatically validating deep-learning frameworks and compilers.

## Support Table

<center>

| Front-/back-end | [`tvm`](https://github.com/apache/tvm) | [`onnxruntime`](https://github.com/microsoft/onnxruntime) | [`tensorrt`](https://github.com/NVIDIA/TensorRT) | [`tflite`](https://www.tensorflow.org/lite) | [`xla`](https://www.tensorflow.org/xla) | [`torchjit`](https://pytorch.org/docs/stable/jit.html) |
| Models | [`tvm`](https://github.com/apache/tvm) | [`onnxruntime`](https://github.com/microsoft/onnxruntime) | [`tensorrt`](https://github.com/NVIDIA/TensorRT) | [`tflite`](https://www.tensorflow.org/lite) | [`xla`](https://www.tensorflow.org/xla) | [`torchjit`](https://pytorch.org/docs/stable/jit.html) |
| ------------ | ------------------------------------ | ----------------------------------------------- | ---------------------------------------------- | ----------------------------------------- | ------------------------------------- | ----------------------------------------------------- |
| ONNX |||| | | |
| TensorFlow | 🔨 | | ||| |
@@ -44,19 +48,6 @@ pip install "nnsmith[torch,onnx]" --upgrade
</div>
</details>

<details><summary><b>Install latest pre-release: </b> <i>[expand]</i></summary>
<div>

```shell
pip install "nnsmith[torch,onnx]" \
--pre --upgrade \
--index-url https://test.pypi.org/simple/ \
--extra-index-url https://pypi.org/simple/
```

</div>
</details>


## Quick Start

@@ -86,63 +77,11 @@ nnsmith.model_gen model.type=onnx debug.viz=true

See other commands under [`doc/cli`](doc/cli.md). We use [hydra](https://hydra.cc/) to manage configurations. See `nnsmith/config/main.yaml`.

## Developer Notes

- `pip install -r requirements/core.txt` to run generation and fuzzing;
- `pip install --upgrade --pre -r requirements/sys/[system].txt` to allow generating and running specific frameworks;
- **Why "--upgrade --pre"?** All the sources under `requirements/sys/` are nightly releases (except tvm), as we want to "save the world" by catching new bugs;

<details><summary><b>Pre-commits</b> <i>[expand]</i></summary>
<div>

You can use `pre-commit` to simplify development:

- `pip install -r requirements/dev.txt`;
- `pre-commit install`;
- `pre-commit` will run upon a commit; To explicitly run `pre-commit` for all files: `pre-commit run --all-files`.

</div>
</details>

<details><summary><b>Local development</b> <i>[expand]</i></summary>
<div>

- Develop locally by setting `export PYTHONPATH=$PYTHONPATH:$(pwd)` (`pwd` should be this git folder.)
- Set `PYTHONPATH=""` when doing `pip install nnsmith` from online version.

</div>
</details>

<details><summary><b>Simplify the code</b> <i>[expand]</i></summary>
<div>

*Simplicity is prerequisite for reliability.* --Edsger W. Dijkstra

We want **code simplicity**: keeping minimal dependencies and focusing on a small set of simple APIs to make NNSmith maintainable to developers and reliable to users.

</div>
</details>

<details><summary><b>Test before commit</b> <i>[expand]</i></summary>
<div>

```shell
# env of torch & tf will conflict so split their unit tests.
pytest tests/core -s
pytest tests/torch -s
pytest tests/tensorflow -s
```

</div>
</details>

## Notes
## Contributing Guide

+ NNSmith is modularized and can be extended as a 3rd-party library, which allows you to patch your own backend and do fuzzing without modifying NNSmith's source code.
+ Meanwhile, feel free to [request](https://github.com/ise-uiuc/nnsmith/issues) a backend support: the project maintainer is happy to support DL systems that care about software reliability and quality to benefit the whole DL software stack.
+ It would be great if you can [let us know](https://github.com/ise-uiuc/nnsmith/issues) if you find new bugs with NNSmith or build a new system inspired by NNSmith.
Please check [doc/CONTRIBUTING.md](doc/CONTRIBUTING.md).

## Paper
## Papers

<details><summary><b>NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers.</b> <i>[expand citation]</i></summary>
<div>
120 changes: 120 additions & 0 deletions doc/CONTRIBUTING.md
@@ -0,0 +1,120 @@
# Contributor Guide 🤗

We welcome various sorts of contributions to NNSmith,
including reporting an [issue](https://github.com/ise-uiuc/nnsmith/issues) or submitting a [PR](https://github.com/ise-uiuc/nnsmith/pulls).

## Reporting an issue

We appreciate developers reporting current limitations of NNSmith on the [GitHub issue tracking system](https://github.com/ise-uiuc/nnsmith/issues),
including but not limited to:

1. **Bug reports**: unexpected behaviors of NNSmith (e.g., a flaw in an operator rule/specification);
2. **Feature requests**: tell us which features would make NNSmith stronger!

## Submitting a PR

The general flow for submitting a PR (Pull Request):

> (Optional for small fixes, necessary for new features) Submit an issue to discuss the PR first;

1. [Fork](https://github.com/ise-uiuc/nnsmith/fork) NNSmith;
2. Clone your fork: `git clone git@github.com:$[YOUR_FORK]/nnsmith.git`;
3. Make the local checkout importable: `cd nnsmith && export PYTHONPATH=$PYTHONPATH:$(pwd)`;
4. Write your code, then commit and push it;
5. Submit a PR [here](https://github.com/ise-uiuc/nnsmith/pulls);
6. Code review;
7. Merge!
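
Steps 2 to 4 above can be sketched as a small shell helper. This is only an illustration: the function name `start_contribution`, the topic branch, and the example arguments are assumptions rather than part of NNSmith, and working on a topic branch is a common habit, not a stated requirement.

```shell
# Hypothetical helper: nothing runs until you call the function yourself.
start_contribution() {
    fork_owner="$1"   # GitHub account that owns your fork (placeholder)
    branch="$2"       # topic branch name of your choice (placeholder)

    # Step 2: clone your fork over SSH.
    git clone "git@github.com:${fork_owner}/nnsmith.git" || return 1
    # Step 3: enter the checkout and make it importable.
    cd nnsmith || return 1
    export PYTHONPATH="${PYTHONPATH:+${PYTHONPATH}:}$(pwd)"
    # Optional: keep your work on a separate branch.
    git checkout -b "${branch}"
}

# Example call (values are made up):
#   start_contribution my-github-user fix-doc-typo
# Step 4, after editing:
#   git add -u && git commit -m "fix: typo in doc" && git push -u origin fix-doc-typo
```

The `${PYTHONPATH:+...}` expansion adds the `:` separator only when `PYTHONPATH` is already set, which avoids a leading empty entry.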

### Do I need to submit an issue before a PR?

- **No**: for minor or nice-to-have changes such as typo fixes and bug fixes.
- **Yes**: for new features (e.g., adding a new backend) and fundamental changes.

### Will my contributions be rejected?

Usually not, but in rare cases yes (which is why we suggest submitting an issue for discussion first).

**S-sized contributions** are usually easy to accept. They include bug/typo fixes, CI improvements, test-case improvements, etc.,
as long as they are beneficial and satisfy the properties in the "General coding guidance" section.

**M-sized contributions**, such as adding new frontends, backends, or fuzzing strategies, are welcome as well,
as long as they show a clear improvement.
However, for maintainability, such a contribution may be placed in the temporary "contrib" folder if it is unclear whether it can be well maintained.
For example, say we supported backend "X" in the "contrib" folder and started submitting bug reports to the "X" community.
If the "X" community later turns out to be uninterested in fixing bugs,
we no longer have to support "X" as a backend and can simply drop it.

**L-sized contributions** are those that conflict with the fundamental designs and goals of NNSmith.
For example, NNSmith is fundamentally a model generator, and supporting something like "distributed training" is too much for it.
As a result, such changes might not be accepted unless there is a compelling justification
-- but NNSmith is under Apache-2.0, so you can always take it in the direction you like via a fork :).
Of course, some L-sized contributions can still be accepted,
such as improving the operator specification or developing a more promising intermediate representation than GraphIR,
as long as we agree that the benefits clearly outweigh the effort.

## General coding guidance

### `pre-commit`

[`pre-commit`](https://pre-commit.com/) is a convenient tool that checks and formats your code when you commit.

To set up `pre-commit`:

```shell
pip install -r requirements/dev.txt
pre-commit install
```

Now it will run checks and auto-formatting whenever you commit:

```shell
git commit ...
# if [NOTHING HAPPENS], you are good to go;
# if [IT FAILS], the auto-formatting has been applied automatically;
# you just need to review the changes, `git add` them, and re-commit.
```

### Testing

If applicable (e.g., adding a new backend), add a few tests to validate your implementation. Examples can be found in:

1. [Python unit-tests](https://github.com/ise-uiuc/nnsmith/tree/main/tests);
2. [End-to-end testing](https://github.com/ise-uiuc/nnsmith/blob/main/.github/workflows/ci.yaml).

To run the Python tests:

```shell
# The torch and tf environments conflict, so run their unit tests separately.
pytest tests/core -s
pytest tests/torch -s
pytest tests/tensorflow -s
```
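
The commands above can also be wrapped in a small loop that keeps one `pytest` process per suite and stops at the first failure. This is a sketch only: `run_suites` is a made-up helper name, and it assumes the frameworks needed by the suites you pass are installed in the environment you run it from. If torch and tf cannot coexist in your environment, call it once per environment with the matching suites.

```shell
# Hypothetical helper: run each test suite in its own pytest process.
run_suites() {
    for suite in "$@"; do
        echo ">>> pytest tests/${suite} -s"
        # Separate invocations keep the suites apart; stop at the first failure.
        pytest "tests/${suite}" -s || return 1
    done
}

# Example calls (run from the repository root):
#   run_suites core torch
#   run_suites tensorflow
```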

### Simple code

> “Simplicity is the prerequisite for reliability.” - Edsger W. Dijkstra

Maintaining code is hard, especially when
(i) the initial code owners are not available; and
(ii) the code is too complicated to understand or modify.
As a result, contributors are encouraged to write simple code that is
(i) easy to understand;
(ii) well organized and easy to extend;
(iii) well documented where the concept is tricky;
and (iv) free of changes that bring little improvement at the cost of high complexity.

For example, the structure of test cases in NNSmith is non-trivial;
consequently, the initial maintainers spent considerable effort making it systematically structured,
so that it is easier to use and extend.
(@ganler: I know it could be boring, but it is indeed important for a long-lived project.)

![](https://gist.github.com/ganler/bdf7e867e57c96e8c09ff31cb0b90a1f/raw/4667ad9b7dcb0b77cb722e7025402105560ebf41/datastructure.png)

A few more concrete guidelines to consider:

1. Try not to introduce new dependencies:
    - If we only need "one" function from the prospective dependency, implement it on our own if possible;
    - If we have to use a dependency, consider "reliable" ones first, for example, those that have been tested by millions of developers (such as NumPy).
2. Avoid bringing data files into the repository -- they bloat the codebase, making it harder to distribute.
    - If it is a picture, upload it to a gist or another "storage" repo and reference it by URL.
    - If it is a configuration file or data file, use a script to re-generate it (if easy) or distribute it via ["Releases"](https://github.com/ise-uiuc/nnsmith/releases) (if large).
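
To illustrate the "re-generate it with a script" option, here is a minimal sketch. The helper name `regen_fixture`, the output path, and the file contents are all invented for this example; the point is only that a small deterministic script can stand in for a committed data file.

```shell
# Hypothetical example: rebuild a small data file on demand instead of committing it.
regen_fixture() {
    out="${1:-build/squares.csv}"      # made-up default output path
    mkdir -p "$(dirname "${out}")"
    {
        echo "n,square"
        # Deterministic content: the same file is produced on every run.
        for n in 1 2 3 4 5; do
            echo "${n},$((n * n))"
        done
    } > "${out}"
}

# Example call:
#   regen_fixture build/squares.csv
```

Listing the output directory in `.gitignore` keeps the generated file out of the repository.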
