dotnet build /p:SkipNative=true
dotnet build # for cuda support on Windows and Linux
dotnet test
dotnet pack
Requirements:
- Visual Studio
- git
- cmake (tested with 3.18)
NOTE: At this moment, VS versions 17.4.X will not build the native code library. Use 17.3.X until further notice. See: dotnet#858 for more information.
Requirements:
- requirements to run .NET Core 3.1
- git
- cmake (tested with 3.14)
- clang 6.x +
Example to fulfill the requirements in Ubuntu 16:
wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
sudo apt-add-repository "deb https://apt.llvm.org/xenial/ llvm-toolchain-xenial-6.0 main"
sudo apt-get -y update
sudo apt-get -y install clang-6.0 git cmake libunwind8 curl libomp-dev
Commands:
Requirements:
- Clang/LLVM 12.0.0
- git
- .NET SDK 5.0.300
- Cmake 3.20.3
Build with
dotnet build /p:SkipNative=true
An ephemeral feed of packages from Azure DevOps CI is available for those
- View link: https://dotnet.visualstudio.com/TorchSharp/_packaging?_a=feed&feed=SignedPackages
- Nuget feed: https://dotnet.pkgs.visualstudio.com/TorchSharp/_packaging/SignedPackages/nuget/v3/index.json
Some releases are pushed to nuget
dotnet build
dotnet pack
Locally built packages have names like the ones below; the date stamp in the name changes every day. If you repeatedly rebuild them locally, you may have to remove them from your local .nuget package cache.
bin/packages/Debug/TorchSharp.0.3.0-local-Debug-20200520.nupkg
bin/packages/Release/TorchSharp.0.3.0-local-Release-20200520.nupkg
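When cleaning stale local packages out of a cache, it can help to pick the version, configuration, and date stamp out of such a file name. The following is a minimal sketch; the pattern is inferred only from the two example names above, and `parse_local_package` is a hypothetical helper, not part of the TorchSharp build.

```python
import re

# Pattern inferred from the two example file names above (an assumption,
# not something the build system documents).
_LOCAL_PACKAGE = re.compile(
    r"^TorchSharp\.(?P<version>\d+\.\d+\.\d+)"
    r"-local-(?P<configuration>Debug|Release)"
    r"-(?P<date>\d{8})\.nupkg$"
)


def parse_local_package(file_name):
    """Return the name parts as a dict, or None if the name does not match."""
    match = _LOCAL_PACKAGE.match(file_name)
    return match.groupdict() if match else None


print(parse_local_package("TorchSharp.0.3.0-local-Debug-20200520.nupkg"))
```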
To change the TorchSharp package version update this file.
The TorchSharp package is pushed to nuget.org via the Azure DevOps CI release pipeline. Assuming you're not building or updating the LibTorch packages (BuildLibTorchPackages is false in azure-pipelines.yml), this is pretty simple once you have the permissions:
- Update the version number in ./build/BranchInfo.props and in the Release Notes file and then submit a PR.
Updating the major or minor version number should only be done after a discussion with repo admins. The patch number should be incremented by one each release and set to zero after a change to the major or minor version.
- Integrate code to main and wait for CI to process.
- Go to releases and choose "Create Release" (top right).
- Under "Artifacts-->Version" choose the pipeline build corresponding to the thing you want to release. It should be a successful build on main.
- Press "Create".
- Once the package has been successfully pushed and is available in the NuGet gallery, create a GitHub tag in the 'main' branch with the version as the name of the tag.
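The version rule from the first step can be sketched as a small function. This is an illustration only: `bump_version` is a hypothetical helper, it assumes a plain `major.minor.patch` string, and resetting the minor number on a major bump is an assumption borrowed from common versioning practice rather than something stated above.

```python
def bump_version(version, part="patch"):
    """Return the next 'major.minor.patch' version string."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "patch":
        # Normal release: increment the patch number by one.
        return f"{major}.{minor}.{patch + 1}"
    if part == "minor":
        # Patch goes back to zero after a minor change.
        return f"{major}.{minor + 1}.0"
    if part == "major":
        # Patch goes back to zero; resetting minor as well is an assumption.
        return f"{major + 1}.0.0"
    raise ValueError(f"unknown version part: {part}")


print(bump_version("0.3.0"))
print(bump_version("0.3.4", "minor"))
```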
The libtorch packages are huge (~3GB compressed combined for CUDA Windows) and cause a lot of problems to make and deliver due to NuGet package size restrictions.
These problems include:
- A massive 2GB binary in the Linux CUDA package and multiple 1.0GB binaries in the Windows CUDA package.
- Size limits of about 500MB on NuGet packages on the Azure DevOps CI system and about 250MB on nuget.org.
- Regular download/upload failures on these systems due to network interruptions for packages of this size.
- 10GB VM image size restrictions for the containers used to build these packages in the Azure DevOps CI system; we can easily run out of room.
- Complete libtorch-cpu packages can't be built using your local machine alone, since they won't contain the full range of native bits. Instead they are built using Azure Pipelines by combining builds.
For this reason, we do the following:
- The head, referenceable packages that deliver a functioning runtime are any of:
libtorch-cpu
libtorch-cuda-11.7-linux-x64
libtorch-cuda-11.7-win-x64
- These packages are combo packages that reference multiple parts. The parts are not independently useful. Some parts deliver a single vast file via primary and fragment packages. A build task is then used to "stitch" these files back together into one file on the target machine, with a SHA check. This is a hack, but there is no other realistic way to deliver these vast files as packages (the alternative is to abandon packaging and require a manual install/detect/link of PyTorch CUDA on all downstream systems, which is extremely problematic for many practical reasons).
For example, the CUDA package fragments are defined in libtorch-cuda. See more details later in this document.
- The libtorch-* packages are built in Azure DevOps CI using this build pipeline, but only in the main branch and only when BuildLibTorchPackages is set to true in azure-pipelines.yml in the main branch. You must currently edit this manually and submit to main to get new libtorch-* packages built. Also increment LibTorchPackageVersion if necessary. Do a push to main and the packages will build. This process could be adjusted but at least gets us off the ground.
- After a successful build, the libtorch-* packages can be trialled using the package feed from CI (see above). When they are appropriate they can be pushed to nuget using this manually invoked release pipeline in Azure DevOps CI (so they don't have to be manually downloaded and pushed to nuget.org).
b. Press 'New Release'.
c. Select the successful main CI build that includes the libtorch packages, create the release and wait for it to finish. You should see Initialize job, Download artifact - dotnet.TorchSharp - packages, NuGet push, Finalize Job succeeded.
d. All packages should now be pushed to nuget.org and will appear after indexing.
- If updating libtorch packages, remember to delete all massive artifacts from Azure DevOps and reset BuildLibTorchPackages to false in azure-pipelines.yml in the main branch.
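The primary/fragment idea described above (cut one vast file into pieces, then stitch the pieces back together and verify a hash) can be illustrated with a short sketch. This is not the code in FileRestitcher.cs; the function names, the piece naming scheme, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path):
    """Hash a file in 1 MiB chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as source:
        for chunk in iter(lambda: source.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_file(path, fragment_size, out_dir):
    """Write consecutive pieces of at most fragment_size bytes; return their paths in order."""
    pieces = []
    with open(path, "rb") as source:
        while True:
            data = source.read(fragment_size)
            if not data:
                break
            # Hypothetical naming scheme, used only by this sketch.
            piece = os.path.join(out_dir, f"{os.path.basename(path)}.piece{len(pieces)}")
            with open(piece, "wb") as target:
                target.write(data)
            pieces.append(piece)
    return pieces


def restitch(pieces, target_path, expected_sha256):
    """Concatenate the pieces in order and verify the result against the expected hash."""
    with open(target_path, "wb") as target:
        for piece in pieces:
            with open(piece, "rb") as source:
                shutil.copyfileobj(source, target)
    if sha256_of(target_path) != expected_sha256:
        os.remove(target_path)
        raise ValueError("restitched file does not match the expected hash")
    return target_path


def demo():
    """Round-trip a small file through split and restitch."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "big.bin")
        with open(original, "wb") as handle:
            handle.write(bytes(range(256)) * 40)  # 10,240 bytes
        expected = sha256_of(original)
        pieces = split_file(original, 4096, tmp)
        restored = restitch(pieces, os.path.join(tmp, "restored.bin"), expected)
        return len(pieces), sha256_of(restored) == expected
```

Calling `demo()` splits a 10,240-byte file into 4,096-byte pieces and checks that the stitched copy hashes identically to the original.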
This project grabs LibTorch and makes a C API wrapper for it, then calls the wrapper from C#. Updating to a newer version of PyTorch requires quite a lot of careful work.
- Make sure you have plenty of disk space, e.g. 15GB.
- Clean and reset to main:
git checkout main
git clean -xfd .
- Familiarise yourself with download links. See https://pytorch.org/get-started/locally/ for download links.
For example Linux, LibTorch 1.13.0 uses link
https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-1.13.0%2Bcpu.zip
Don't download anything yet, or manually. The downloads are acquired automatically in step 2.
To update the version, update this in Dependencies.props:
<LibTorchVersion>1.13.0</LibTorchVersion>
The libtorch version number is also referenced in source code, in the file 'src/TorchSharp/Torch.cs':
const string libtorchPackageVersion = "1.13.0.1";
- Run these to test downloads and update SHA hashes for the various LibTorch downloads:
dotnet build src\Redist\libtorch-cpu\libtorch-cpu.proj /p:UpdateSHA=true /p:TargetOS=linux /p:Configuration=Release /t:Build /p:IncludeLibTorchCpuPackages=true
dotnet build src\Redist\libtorch-cpu\libtorch-cpu.proj /p:UpdateSHA=true /p:TargetOS=mac /p:Configuration=Release /t:Build /p:IncludeLibTorchCpuPackages=true
dotnet build src\Redist\libtorch-cpu\libtorch-cpu.proj /p:UpdateSHA=true /p:TargetOS=windows /p:Configuration=Release /t:Build /p:IncludeLibTorchCpuPackages=true
dotnet build src\Redist\libtorch-cpu\libtorch-cpu.proj /p:UpdateSHA=true /p:TargetOS=windows /p:Configuration=Debug /t:Build /p:IncludeLibTorchCpuPackages=true
dotnet build src\Redist\libtorch-cuda-11.7\libtorch-cuda-11.7.proj /p:UpdateSHA=true /p:TargetOS=linux /p:Configuration=Release /t:Build /p:IncludeLibTorchCudaPackages=true
dotnet build src\Redist\libtorch-cuda-11.7\libtorch-cuda-11.7.proj /p:UpdateSHA=true /p:TargetOS=windows /p:Configuration=Release /t:Build /p:IncludeLibTorchCudaPackages=true
dotnet build src\Redist\libtorch-cuda-11.7\libtorch-cuda-11.7.proj /p:UpdateSHA=true /p:TargetOS=windows /p:Configuration=Debug /t:Build /p:IncludeLibTorchCudaPackages=true
Each of these will take a very long time, depending on your broadband connection. This can't currently be done in CI.
If file names in the distribution have changed, or files have been removed, you will get errors saying that files cannot be found. That's okay and will be taken care of in the next step.
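The seven SHA-update commands differ only in project, target OS, and configuration, so they can be generated from a small table. The sketch below only prints the command lines and runs nothing; the table and the function name `sha_update_commands` are illustrative, while the paths and MSBuild properties are copied from the commands above.

```python
# Project path and include flag per flavor, copied from the commands above.
PROJECTS = {
    "cpu": (r"src\Redist\libtorch-cpu\libtorch-cpu.proj", "IncludeLibTorchCpuPackages"),
    "cuda": (r"src\Redist\libtorch-cuda-11.7\libtorch-cuda-11.7.proj", "IncludeLibTorchCudaPackages"),
}

# (flavor, TargetOS, Configuration) combinations listed in this step.
MATRIX = [
    ("cpu", "linux", "Release"),
    ("cpu", "mac", "Release"),
    ("cpu", "windows", "Release"),
    ("cpu", "windows", "Debug"),
    ("cuda", "linux", "Release"),
    ("cuda", "windows", "Release"),
    ("cuda", "windows", "Debug"),
]


def sha_update_commands():
    """Build the command lines as strings, in the order listed above."""
    commands = []
    for flavor, target_os, configuration in MATRIX:
        project, include_flag = PROJECTS[flavor]
        commands.append(
            f"dotnet build {project} /p:UpdateSHA=true /p:TargetOS={target_os} "
            f"/p:Configuration={configuration} /t:Build /p:{include_flag}=true"
        )
    return commands


for command in sha_update_commands():
    print(command)
```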
- At this point you must very carefully update the <File Include= ... entries under the src\Redist projects for libtorch-cpu and libtorch-cuda.
This is the step in the upgrade process that takes the most effort and time. It requires extreme care.
Check the contents of the unzip of the archive, e.g.
bin\obj\x64.Debug\libtorch-cpu\libtorch-shared-with-deps-1.13.0\libtorch\lib
You may also need to precisely refactor the CUDA binaries into multiple parts so each package ends up under ~300MB. The NuGet gallery does not allow packages larger than 250MB, so files totalling 300MB are likely to end up smaller than 250MB after compression. However, watch out: if the compression is poor, packages may end up larger. Note that the limit is 250 million bytes, not 250*1024*1024 bytes. In other words, it is 250 MB, not 250 MiB. Windows Explorer shows file sizes in KiB, not thousands of bytes, so use 'dir' from a CMD window to get the exact size in bytes for each file. For example, the file libtorch_cpu.so shows up as 511,872 KB in Windows Explorer, but as 524,156,144 bytes in CMD. The 2.4% difference can be significant.
If the combined size of the files going into a part is smaller than 250MB, then everything is fine, and there is no need to split the part. It can be singular. If that is not the case, then the part should be fragmented into two or more parts that are linked together by their names.
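The MB versus MiB arithmetic can be checked with a few lines. This sketch assumes that Windows Explorer rounds sizes up to the next whole KiB, which is consistent with the libtorch_cpu.so figures quoted above; the helper names are illustrative.

```python
NUGET_LIMIT_BYTES = 250_000_000   # 250 MB, the limit quoted above
MIB_250_BYTES = 250 * 1024 * 1024  # 250 MiB, which is NOT the limit


def explorer_kb(size_bytes):
    """Size in KiB, rounded up (assumed to match what Windows Explorer displays)."""
    return -(-size_bytes // 1024)


def needs_fragmenting(size_bytes, limit=NUGET_LIMIT_BYTES):
    """True if the uncompressed size alone already exceeds the limit."""
    return size_bytes > limit


libtorch_cpu_bytes = 524_156_144  # exact size reported by 'dir' in the example

print(explorer_kb(libtorch_cpu_bytes))      # 511872
print(MIB_250_BYTES - NUGET_LIMIT_BYTES)    # 12144000 bytes between the two readings
print(needs_fragmenting(libtorch_cpu_bytes))
```

Comparing against the uncompressed size is deliberately conservative: a part that passes this check before compression will also pass after it.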
For example, the following snippet spreads the torch_cuda_cu.dll binary file across four fragments of at most 250 MB each (the last fragment takes whatever remains). After compression, they will be even smaller.
<File Include= "libtorch\lib\torch_cuda_cu.dll" PackageSuffix="part9-primary" FileUnstitchIndex="0" FileUnstitchStart="0" FileUnstitchSize="250000000" />
<File Include= "libtorch\lib\torch_cuda_cu.dll" PackageSuffix="part9-fragment1" FileUnstitchIndex="1" FileUnstitchStart="250000000" FileUnstitchSize="250000000" />
<File Include= "libtorch\lib\torch_cuda_cu.dll" PackageSuffix="part9-fragment2" FileUnstitchIndex="2" FileUnstitchStart="500000000" FileUnstitchSize="250000000" />
<File Include= "libtorch\lib\torch_cuda_cu.dll" PackageSuffix="part9-fragment3" FileUnstitchIndex="3" FileUnstitchStart="750000000" FileUnstitchSize="-1" />
They must all be called either 'primary', which should be the first fragment, or 'fragmentN', where 'N' is the ordinal number of the fragment, starting with '1'. The current logic allows for as many as 10 non-primary fragments. If more are needed, the code in FileRestitcher.cs and RestitchPackage.targets needs to be updated. Note that the size of each fragment is expressed in bytes, and that each fragment's start must be the sum of the sizes of all previous fragments. A '-1' should be used for the last fragment (and only for the last fragment): it means that the fragment size will be based on how much of the file is still left.
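The naming and offset rules above can be turned into a small calculation. The sketch below is illustrative only: `fragment_layout` is a hypothetical helper, the 900,000,000-byte file size is an assumed example value, and the tuples simply mirror the PackageSuffix ending, FileUnstitchIndex, FileUnstitchStart, and FileUnstitchSize attributes.

```python
def fragment_layout(file_size, fragment_size=250_000_000, max_extra_fragments=10):
    """Return (suffix, index, start, size) tuples for one fragmented file.

    The first entry is 'primary', later ones are 'fragmentN' starting at 1.
    Each start is the sum of all previous sizes, and only the last entry
    uses -1, meaning "whatever is left of the file".
    """
    if file_size <= 0 or fragment_size <= 0:
        raise ValueError("sizes must be positive")
    count = -(-file_size // fragment_size)  # ceiling division
    if count - 1 > max_extra_fragments:
        raise ValueError("more non-primary fragments than the current logic allows")
    layout = []
    for index in range(count):
        suffix = "primary" if index == 0 else f"fragment{index}"
        start = index * fragment_size
        size = -1 if index == count - 1 else fragment_size
        layout.append((suffix, index, start, size))
    return layout


# Assumed example size; any size between 750,000,001 and 1,000,000,000 bytes
# gives the same four-entry layout as the snippet above.
for entry in fragment_layout(900_000_000):
    print(entry)
```

A file that fits in a single fragment yields one 'primary' entry with size -1; in practice such a part would stay singular rather than be fragmented.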
Each part, whether singular or fragmented, should have its own .nupkgproj file in its own folder under pkg. The folder and file should have the same name as the part. If you need to add new fragments, it is straightforward to just copy an existing fragment folder and rename it as well as the project file to the new fragment. If you must fragment a previously singular part, it is best to rename the existing folder and file to '-fragment1' and then copy a '-primary' folder and rename with the right part name. This is because the primary .nupkgproj files look different from others. Specifically, they include different build targets:
<Content Include="..\common\NormalPackage.props" Pack="true" PackagePath="buildTransitive\netstandard2.0\$(MSBuildProjectName).props" />
<Content Include="..\common\NormalPackage.targets" Pack="true" PackagePath="buildTransitive\netstandard2.0\$(MSBuildProjectName).targets" />
vs.
<Content Include="..\common\RestitchPackage.props" Pack="true" PackagePath="buildTransitive\netstandard2.0\$(MSBuildProjectName).props" />
<Content Include="..\common\RestitchPackage.targets" Pack="true" PackagePath="buildTransitive\netstandard2.0\$(MSBuildProjectName).targets" />
It is the 'RestitchPackage.targets' that will trigger restitching packages on first build after a download.
Because file sizes change from release to release, it may be necessary to add or remove fragments. When you add a fragment, you also need to add a corresponding project folder under the pkg/ top-level folder. The process of doing so is copy-paste-rename of existing folders. The same goes for adding parts (whether fragmented or not): you should add a corresponding folder and project file. If you remove a fragment (or part), you should remove the corresponding folder, or CI will end up building empty packages.
Once you have carefully edited the parts and the files that go into them, clean the build directory and re-issue the libtorch downloads commands until there are no errors.
- Add the SHA files:
git add src\Redist\libtorch-cpu\*.sha
git add src\Redist\libtorch-cuda-11.7\*.sha
After this you may as well submit to CI just to see what happens, though keep going with the other steps below as well.
- Build the native and managed code without CUDA:
dotnet build /p:SkipCuda=true
The first stage unzips the archives, then CMake is run. Unzipping the archives may take quite a while.
Note that things may have changed in the LibTorch header files, linking flags etc. There is a CMakeLists.txt that acquires the cmake information delivered in the LibTorch download. It can be subtle.
If the vcxproj for the native code gets configured by cmake then you should now be able to start developing the C++ code in Visual Studio. In order to get the correct environment variables and PATH, start VS from the command line, not from the Start menu:
devenv TorchSharp.sln
e.g. the vcxproj is created here:
bin\obj\x64.Debug\Native\LibTorchSharp\LibTorchSharp.vcxproj
- Similarly build the native code with CUDA:
dotnet build
- You must also adjust the set of binaries referenced for tests; see various files under tests and NativeAssemblyReference in TorchSharp\Directory.Build.targets.
- Run tests:
dotnet build test -c Debug
dotnet build test -c Release
- Try building packages locally. The build (including CI) doesn't build libtorch-* packages by default, just the managed package. To get CI to build new libtorch-* packages, update this version and set BuildLibTorchPackages in azure-pipelines.yml:
<LibTorchPackageVersion>1.13.0.1</LibTorchPackageVersion>
dotnet pack -c Debug /p:SkipCuda=true
dotnet pack -c Release /p:SkipCuda=true
dotnet pack -c Debug
dotnet pack -c Release
- Submit to CI and debug problems.
- Remember to delete all massive artifacts from Azure DevOps and reset BuildLibTorchPackages to false in azure-pipelines.yml.
In order for builds to work properly using Visual Studio 2019 or 2022, you must start VS from the 'x64 Native Tools Command Prompt for VS 2022' (or 2019) in order for the full environment to be set up correctly. Starting VS from the desktop or taskbar will not work properly.