Skip to content

lianakoleva/no-libtorch-compile

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

49 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

CUDA MODE IRL Project -- WIP

Background:

  • AOTInductor is a specialized version of TorchInductor , designed to process exported PyTorch models, optimize them, and produce shared libraries as well as other relevant artifacts.

  • These compiled artifacts are specifically crafted for deployment in non-Python environments, which are frequently employed for inference deployments on the server side.

  • A somewhat little known feature is that torch.compile can convert a PyTorch program into a C++ .so binary file.

  • We can then load the shared library, enabling us to conduct model predictions directly within a C++ environment.

  • Unfortunately, today, running it still requires a libtorch dependency (very large filesize especially with CUDA).

  • Upon inspection of the symbol table and other attributes of the binary , we see that there is a fairly limited amount of APIs that need to be shimmed.

Goal: Compile a PyTorch program into a (dependency-free) binary through torch.compile()

  • Started with a simple transpose kernel, then tried to expand support to Bert Maher's llama2.so project

  • AOTInductor internals, patching ELF files, dynamic linking, ABI-compatibility

  • lots of simplifying assumptions (no support for dynamic shapes)

APIs to shim:

  • aoti_torch_create_cuda_stream_guard
  • aoti_torch_create_tensor_from_blob
  • aoti_torch_create_tensor_from_blob_v2
  • aoti_torch_delete_cuda_stream_guard
  • aoti_torch_delete_tensor_object
  • aoti_torch_device_type_cpu
  • aoti_torch_device_type_cuda
  • aoti_torch_dtype_float32
  • aoti_torch_empty_strided
  • aoti_torch_get_data_ptr
  • aoti_torch_get_storage_offset
  • aoti_torch_get_storage_size
  • aoti_torch_get_strides
  • aoti_torch_grad_mode_is_enabled
  • aoti_torch_grad_mode_set_enabled

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published