Deploying LLMs to ExecuTorch
ExecuTorch is designed to support all types of machine learning models, and LLMs are no exception. This section demonstrates how to use ExecuTorch to run state-of-the-art LLMs on-device, out of the box and with strong performance, using our provided export LLM APIs, acceleration backends, quantization libraries, tokenizers, and more.
We encourage you to use this project as a starting point and adapt it to your specific needs, which may include creating your own versions of the tokenizer, sampler, acceleration backends, and other components. We hope this project serves as a useful guide in your journey with LLMs and ExecuTorch.
Prerequisites
To follow this guide, you’ll need to install ExecuTorch. Please see Setting Up ExecuTorch.
Next steps
Deploying LLMs to ExecuTorch boils down to a two-step process: (1) exporting the LLM to a .pte file, and (2) running the .pte file using our C++ APIs or Swift/Java bindings.
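To make the shape of that process concrete, here is a minimal Python sketch of both steps. This is not the dedicated export LLM flow; it applies the generic ExecuTorch export APIs to a toy module, and the names TinyModel and model.pte are placeholders. The Python Runtime API used for step (2) is a convenient way to validate a .pte file on a host machine; on-device deployment would go through the C++ APIs or Swift/Java bindings mentioned above.

```python
from pathlib import Path

import torch
from executorch.exir import to_edge
from executorch.runtime import Runtime

# Placeholder module standing in for a real LLM. The export LLM APIs
# wrap this same flow with LLM-specific handling (checkpoints,
# tokenizers, quantization, backend delegation).
class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(16, 16)

    def forward(self, x):
        return self.linear(x)

model = TinyModel().eval()
example_inputs = (torch.randn(1, 16),)

# Step 1: capture the model graph, lower it to an ExecuTorch program,
# and serialize it to a .pte file.
exported = torch.export.export(model, example_inputs)
executorch_program = to_edge(exported).to_executorch()
with open("model.pte", "wb") as f:
    f.write(executorch_program.buffer)

# Step 2: load and run the .pte file (here via the Python Runtime API
# for quick host-side validation).
runtime = Runtime.get()
program = runtime.load_program(Path("model.pte"))
method = program.load_method("forward")
outputs = method.execute([torch.randn(1, 16)])
print(outputs)
```

The export LLM APIs, acceleration backends, and tokenizers covered in the rest of this section build on this same export-then-run pipeline.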