[DOCS] Fix broken links #351

Open: wants to merge 2 commits into main
2 changes: 1 addition & 1 deletion CITATION.cff
@@ -12,4 +12,4 @@ authors:
title: "nebullvm"
version: 0.4.3
date-released: 2022-10-10
url: "https://github.com/nebuly-ai/nebullvm"
url: "https://github.com/nebuly-ai/nebuly"
@@ -474,7 +474,7 @@ Finally, extending the model and training algorithm to support larger matrix siz

## Speedster integration of AlphaTensor

-AlphaTensor opens the door to further improvements to Speedster. [Speedster](https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster) is an open-source module designed to speed up AI inference with just a few lines of code. The library automatically applies the best set of SOTA optimization techniques to achieve maximum inference speed-up.
+AlphaTensor opens the door to further improvements to Speedster. [Speedster](https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster) is an open-source module designed to speed up AI inference with just a few lines of code. The library automatically applies the best set of SOTA optimization techniques to achieve maximum inference speed-up.

Within Speedster, AlphaTensor will use its optimized kernels for matrix multiplication to find the optimal set of sub-operations for each layer in the AI model that involves matrix multiplication, including linear layers, attention layers, and convolution layers. The matrix multiplications will be decomposed into sub-matrix multiplications up to the maximum size supported by AlphaTensor, and the fastest decomposition will be selected for each layer. This optimization process will be applied to every such layer in the network, resulting in a model with significantly faster inference.
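As a minimal illustration of this idea (a sketch, not AlphaTensor's actual kernels), the classic Strassen scheme below computes a block matrix product with 7 sub-multiplications instead of the naive 8; AlphaTensor searches for decompositions of exactly this kind, including better ones for specific sizes and hardware:

```python
import numpy as np

def strassen_2x2_blocks(A, B):
    """One level of Strassen's decomposition: multiply two square
    matrices via 7 sub-matrix multiplications instead of 8.
    Illustrative only; AlphaTensor discovers such decompositions
    automatically."""
    n = A.shape[0]
    assert A.shape == B.shape == (n, n) and n % 2 == 0
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]

    # The 7 sub-products (each could itself be decomposed recursively).
    M1 = (A11 + A22) @ (B11 + B22)
    M2 = (A21 + A22) @ B11
    M3 = A11 @ (B12 - B22)
    M4 = A22 @ (B21 - B11)
    M5 = (A11 + A12) @ B22
    M6 = (A21 - A11) @ (B11 + B12)
    M7 = (A12 - A22) @ (B21 + B22)

    # Recombine the sub-products into the four output blocks.
    C = np.empty_like(A)
    C[:h, :h] = M1 + M4 - M5 + M7
    C[:h, h:] = M3 + M5
    C[h:, :h] = M2 + M4
    C[h:, h:] = M1 - M2 + M3 + M6
    return C

A = np.random.rand(4, 4)
B = np.random.rand(4, 4)
assert np.allclose(strassen_2x2_blocks(A, B), A @ B)
```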

2 changes: 1 addition & 1 deletion optimization/speedster/docs/en/docs/advanced_options.md
@@ -308,7 +308,7 @@ optimized_backbone = OptimizedWrapper(optimized_model, backbone_wrapper.output_n

```

-You can find other examples in the [notebooks](https://github.com/nebuly-ai/nebullvm/tree/main/notebooks/speedster) section available on GitHub.
+You can find other examples in the [notebooks](https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster/notebooks) section available on GitHub.

## Store the performance of all the optimization techniques

59 changes: 32 additions & 27 deletions optimization/speedster/docs/en/docs/installation.md
@@ -147,28 +147,31 @@ The following table shows the supported combinations of frameworks, backends and

If you want to manually install the requirements, this section collects links to the official installation guides for all frameworks and compilers supported by `Speedster`.

-#### Deep Learning frameworks/backends
-- PyTorch: https://pytorch.org/get-started/locally/
-- TensorFlow: https://www.tensorflow.org/install
-- ONNX: https://github.com/onnx/onnx#installation
-- HuggingFace: https://huggingface.co/transformers/installation.html
-- Diffusers: https://github.com/huggingface/diffusers#installation
-
-#### Deep Learning compilers
-- DeepSparse: https://github.com/neuralmagic/deepsparse#installation
-- TensorRT: https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html
-- Torch TensorRT: https://pytorch.org/TensorRT/getting_started/installation.html#installation
-- ONNXRuntime: https://onnxruntime.ai/docs/install/#python-installs
-- OpenVINO: https://docs.openvino.ai/latest/openvino_docs_install_guides_install_dev_tools.html#step-4-install-the-package
-- Intel Neural Compressor: https://github.com/intel/neural-compressor#installation
-- Apache TVM: https://tvm.apache.org/docs/install/index.html
-
-#### Other requirements
-- tf2onnx: https://github.com/onnx/tensorflow-onnx#installation (Install it if you want to convert TensorFlow models to ONNX)
-- polygraphy: https://github.com/NVIDIA/TensorRT/tree/main/tools/Polygraphy#installation (Install it if you want to use TensorRT)
-- onnx-simplifier: https://github.com/daquexian/onnx-simplifier#python-version (Install it if you want to use TensorRT)
-- onnx_graphsurgeon: https://github.com/NVIDIA/TensorRT/tree/master/tools/onnx-graphsurgeon#installation (Install it if you want to use TensorRT with Stable Diffusion)
-- onnxmltools: https://github.com/onnx/onnxmltools#install (Install it if you want to convert models to ONNX)
+### Deep Learning frameworks/backends
+
+- [PyTorch](https://pytorch.org/get-started/locally/)
+- [TensorFlow](https://www.tensorflow.org/install)
+- [ONNX](https://github.com/onnx/onnx#installation)
+- [HuggingFace](https://huggingface.co/transformers/installation.html)
+- [Diffusers](https://github.com/huggingface/diffusers#installation)
+
+### Deep Learning compilers
+
+- [DeepSparse](https://github.com/neuralmagic/deepsparse#installation)
+- [TensorRT](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html)
+- [Torch TensorRT](https://pytorch.org/TensorRT/getting_started/installation.html#installation)
+- [ONNXRuntime](https://onnxruntime.ai/docs/install/#python-installs)
+- [OpenVINO](https://docs.openvino.ai/latest/openvino_docs_install_guides_install_dev_tools.html#step-4-install-the-package)
+- [Intel Neural Compressor](https://github.com/intel/neural-compressor#installation)
+- [Apache TVM](https://tvm.apache.org/docs/install/index.html)
+
+### Other requirements
+
+- [tf2onnx](https://github.com/onnx/tensorflow-onnx#installation) (Install it if you want to convert TensorFlow models to ONNX; see the sketch after this list)
+- [polygraphy](https://github.com/NVIDIA/TensorRT/tree/main/tools/Polygraphy#installation) (Install it if you want to use TensorRT)
+- [onnx-simplifier](https://github.com/daquexian/onnx-simplifier#python-version) (Install it if you want to use TensorRT)
+- [onnx_graphsurgeon](https://github.com/NVIDIA/TensorRT/tree/master/tools/onnx-graphsurgeon#installation) (Install it if you want to use TensorRT with Stable Diffusion)
+- [onnxmltools](https://github.com/onnx/onnxmltools#install) (Install it if you want to convert models to ONNX)
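As an aside on the tf2onnx entry above, here is a minimal sketch of converting a Keras model to ONNX. The toy model is an assumption made up for the example; the `from_keras` helper is the converter documented in the tf2onnx README:

```python
import tensorflow as tf
import tf2onnx

# A toy Keras model, used only to make the example self-contained.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

# Convert the model to ONNX and write it to disk.
onnx_model, _ = tf2onnx.convert.from_keras(
    model,
    input_signature=[tf.TensorSpec((None, 224, 224, 3), tf.float32, name="input")],
    output_path="model.onnx",
)
```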

## (Optional) Download Docker images with frameworks and optimizers

@@ -182,7 +185,7 @@

docker run -ti --gpus=all nebulydocker/nebullvm:latest

-After optimizing the model, you may decide to deploy it to production. Note that you need to have the deep learning compiler used to optimize the model and other components inside the production Docker. For this reason, we have created several versions of the Docker nebullvm container on [Docker Hub](https://hub.docker.com/repository/docker/nebulydocker/nebullvm), each containing only one compiler. Pull the image with the compiler that optimized your model!
+After optimizing the model, you may decide to deploy it to production. Note that you need to have the deep learning compiler used to optimize the model and other components inside the production Docker. For this reason, we have created several versions of the Docker nebullvm container on [Docker Hub](https://hub.docker.com/r/nebulydocker/nebullvm), each containing only one compiler. Pull the image with the compiler that optimized your model!

## Set up Speedster on custom DL devices

Expand All @@ -208,18 +211,20 @@ You are now ready to use Speedster on TPUs! Speedster will automatically detect

### AWS Inferentia

-For AWS Inferentia, you must first create an AWS EC2 instance with the `inf1` instance type.
+For AWS Inferentia, you must first create an AWS EC2 instance with the `inf1` instance type.
You can find more information about `inf1` instances in the [official documentation](https://aws.amazon.com/it/ec2/instance-types/inf1/).

!!! info
-    AWS has recently released the `inf2` instance type, which is a more powerful version of `inf1`. For now, `inf2`
+    AWS has recently released the `inf2` instance type, which is a more powerful version of `inf1`. For now, `inf2`
    instances are only available in private preview; you can request them directly from AWS by filling out this [form](https://pages.awscloud.com/EC2-Inf2-Preview.html).

To use Speedster on AWS Inferentia, we will use the [`torch-neuron`](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/torch/torch-setup.html) library, which must be manually installed on `inf1` instances (on `inf2` instances it is already preinstalled if you use the PyTorch DLAMI provided by AWS).
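As a rough sketch of what the optimization call might look like once the instance is set up (the model, input shapes, and keyword arguments below are illustrative assumptions, not Inferentia-specific settings):

```python
import torch
import torchvision.models as models
from speedster import optimize_model

# Toy model and random calibration inputs, used only for illustration.
model = models.resnet18()
input_data = [((torch.randn(1, 3, 224, 224),), torch.tensor([0])) for _ in range(100)]

# Speedster benchmarks the optimization backends available on the machine
# (including the Neuron toolchain when it is detected) and returns the
# fastest optimized version of the model.
optimized_model = optimize_model(
    model,
    input_data=input_data,
    optimization_time="constrained",
)
```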

You can find the full guides for setting up the EC2 instances and installing the required libraries here:
-- `inf1`: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/torch/torch-neuron/setup/pytorch-install.html#install-neuron-pytorch
-- `inf2`: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/torch/torch-neuronx/setup/pytorch-install.html#pytorch-neuronx-install
+
+- `inf1`: <https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/torch/torch-neuron/setup/pytorch-install.html#install-neuron-pytorch>
+
+- `inf2`: <https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/torch/torch-neuronx/setup/pytorch-install.html#pytorch-neuronx-install>

After creating the EC2 instance and installing `torch_neuron`, you can follow these steps to set up Speedster:
- Check that the `torch_neuron` library is installed; you can do this by running `python -c "import torch_neuron; print(torch_neuron.__version__)"` in the console (if using `inf1` instances; otherwise replace `torch_neuron` with `torch_neuronx`);
2 changes: 1 addition & 1 deletion optimization/speedster/docs/en/docs/notebooks.md
@@ -8,4 +8,4 @@ In this section you can find optimization notebooks for multiple DL input models
- PyTorch
- TensorFlow

-Please check out notebooks and tutorials on GitHub at [this](https://github.com/nebuly-ai/nebullvm/tree/main/notebooks/speedster) link.
+Please check out notebooks and tutorials on GitHub at [this](https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster/notebooks) link.
@@ -627,9 +627,9 @@
"</center>\n",
"\n",
"<center> \n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#key-concepts\" target=\"_blank\" style=\"text-decoration: none;\"> How speedster works </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#documentation\" target=\"_blank\" style=\"text-decoration: none;\"> Documentation </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#quick-start\" target=\"_blank\" style=\"text-decoration: none;\"> Quick start </a> \n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#key-concepts\" target=\"_blank\" style=\"text-decoration: none;\"> How speedster works </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#documentation\" target=\"_blank\" style=\"text-decoration: none;\"> Documentation </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#quick-start\" target=\"_blank\" style=\"text-decoration: none;\"> Quick start </a> \n",
"</center>"
]
}
@@ -651,9 +651,9 @@
"</center>\n",
"\n",
"<center> \n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#key-concepts\" target=\"_blank\" style=\"text-decoration: none;\"> How speedster works </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#documentation\" target=\"_blank\" style=\"text-decoration: none;\"> Documentation </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#quick-start\" target=\"_blank\" style=\"text-decoration: none;\"> Quick start </a> \n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#key-concepts\" target=\"_blank\" style=\"text-decoration: none;\"> How speedster works </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#documentation\" target=\"_blank\" style=\"text-decoration: none;\"> Documentation </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#quick-start\" target=\"_blank\" style=\"text-decoration: none;\"> Quick start </a> \n",
"</center>"
]
}
@@ -649,9 +649,9 @@
"</center>\n",
"\n",
"<center> \n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#key-concepts\" target=\"_blank\" style=\"text-decoration: none;\"> How speedster works </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#documentation\" target=\"_blank\" style=\"text-decoration: none;\"> Documentation </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#quick-start\" target=\"_blank\" style=\"text-decoration: none;\"> Quick start </a> \n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#key-concepts\" target=\"_blank\" style=\"text-decoration: none;\"> How speedster works </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#documentation\" target=\"_blank\" style=\"text-decoration: none;\"> Documentation </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#quick-start\" target=\"_blank\" style=\"text-decoration: none;\"> Quick start </a> \n",
"</center>"
]
}
@@ -622,9 +622,9 @@
"</center>\n",
"\n",
"<center> \n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#key-concepts\" target=\"_blank\" style=\"text-decoration: none;\"> How speedster works </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#documentation\" target=\"_blank\" style=\"text-decoration: none;\"> Documentation </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#quick-start\" target=\"_blank\" style=\"text-decoration: none;\"> Quick start </a> \n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#key-concepts\" target=\"_blank\" style=\"text-decoration: none;\"> How speedster works </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#documentation\" target=\"_blank\" style=\"text-decoration: none;\"> Documentation </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#quick-start\" target=\"_blank\" style=\"text-decoration: none;\"> Quick start </a> \n",
"</center>"
]
}
@@ -731,9 +731,9 @@
"</center>\n",
"\n",
"<center> \n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#key-concepts\" target=\"_blank\" style=\"text-decoration: none;\"> How speedster works </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#documentation\" target=\"_blank\" style=\"text-decoration: none;\"> Documentation </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#quick-start\" target=\"_blank\" style=\"text-decoration: none;\"> Quick start </a> \n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#key-concepts\" target=\"_blank\" style=\"text-decoration: none;\"> How speedster works </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#documentation\" target=\"_blank\" style=\"text-decoration: none;\"> Documentation </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#quick-start\" target=\"_blank\" style=\"text-decoration: none;\"> Quick start </a> \n",
"</center>"
]
}
@@ -637,9 +637,9 @@
"</center>\n",
"\n",
"<center> \n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#key-concepts\" target=\"_blank\" style=\"text-decoration: none;\"> How speedster works </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#documentation\" target=\"_blank\" style=\"text-decoration: none;\"> Documentation </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#quick-start\" target=\"_blank\" style=\"text-decoration: none;\"> Quick start </a> \n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#key-concepts\" target=\"_blank\" style=\"text-decoration: none;\"> How speedster works </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#documentation\" target=\"_blank\" style=\"text-decoration: none;\"> Documentation </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#quick-start\" target=\"_blank\" style=\"text-decoration: none;\"> Quick start </a> \n",
"</center>"
]
}
@@ -576,9 +576,9 @@
"</center>\n",
"\n",
"<center> \n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#key-concepts\" target=\"_blank\" style=\"text-decoration: none;\"> How speedster works </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#documentation\" target=\"_blank\" style=\"text-decoration: none;\"> Documentation </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#quick-start\" target=\"_blank\" style=\"text-decoration: none;\"> Quick start </a> \n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#key-concepts\" target=\"_blank\" style=\"text-decoration: none;\"> How speedster works </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#documentation\" target=\"_blank\" style=\"text-decoration: none;\"> Documentation </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#quick-start\" target=\"_blank\" style=\"text-decoration: none;\"> Quick start </a> \n",
"</center>"
]
}
@@ -500,9 +500,9 @@
"</center>\n",
"\n",
"<center> \n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#key-concepts\" target=\"_blank\" style=\"text-decoration: none;\"> How speedster works </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#documentation\" target=\"_blank\" style=\"text-decoration: none;\"> Documentation </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/speedster#quick-start\" target=\"_blank\" style=\"text-decoration: none;\"> Quick start </a> \n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#key-concepts\" target=\"_blank\" style=\"text-decoration: none;\"> How speedster works </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#documentation\" target=\"_blank\" style=\"text-decoration: none;\"> Documentation </a> •\n",
" <a href=\"https://github.com/nebuly-ai/nebuly/tree/main/optimization/speedster#quick-start\" target=\"_blank\" style=\"text-decoration: none;\"> Quick start </a> \n",
"</center>"
]
}