(success-stories)=

# Success Stories

See how organizations use ExecuTorch to deploy AI models at scale on edge devices.

---

## Featured Success Stories

::::{grid} 1
:gutter: 3

:::{grid-item-card} **Meta's Family of Apps**
:class-header: bg-primary text-white

**Industry:** Social Media & Messaging
**Hardware:** Android & iOS Devices
**Impact:** Billions of users, latency reduction

Powers Instagram, WhatsApp, Facebook, and Messenger with real-time on-device AI for content ranking, recommendations, and privacy-preserving features at scale.

[Read Blog →](https://engineering.fb.com/2025/07/28/android/executorch-on-device-ml-meta-family-of-apps/)
:::

:::{grid-item-card} **Meta Quest & Ray-Ban Smart Glasses**
:class-header: bg-success text-white

**Industry:** AR/VR & Wearables
**Hardware:** Quest 3, Ray-Ban Meta Smart Glasses, Meta Ray-Ban Display

Enables immersive mixed reality with real-time computer vision, hand tracking, voice commands, and translation on power-constrained wearable devices.
:::

:::{grid-item-card} **Liquid AI: Efficient, Flexible On-Device Intelligence**
:class-header: bg-info text-white

**Industry:** Artificial Intelligence / Edge Computing
**Hardware:** CPU via PyTorch ExecuTorch
**Impact:** 2× faster inference, lower latency, seamless multimodal deployment

Liquid AI builds foundation models that make AI work where the cloud can't. In its LFM2 series, the team uses PyTorch ExecuTorch within the LEAP Edge SDK to deploy high-performance multimodal models efficiently across devices. ExecuTorch provides the flexibility to support custom architectures and processing pipelines while reducing inference latency through graph optimization and caching. Together, they enable faster, more efficient, privacy-preserving AI that runs entirely on the edge.

[Read Blog →](https://www.liquid.ai/blog/how-liquid-ai-uses-executorch-to-power-efficient-flexible-on-device-intelligence) <!-- @lint-ignore -->
:::

:::{grid-item-card} **PrivateMind: Complete Privacy with On-Device AI**
:class-header: bg-warning text-white

**Industry:** Privacy & Personal Computing
**Hardware:** iOS & Android Devices
**Impact:** 100% on-device processing

PrivateMind delivers a fully private AI assistant using ExecuTorch's `.pte` format. Built with React Native ExecuTorch, it supports Llama, Qwen, Phi-4, and custom models, with offline speech-to-text and PDF chat capabilities.

[Visit →](https://privatemind.swmansion.com)
:::

:::{grid-item-card} **NimbleEdge: On-Device Agentic AI Platform**
:class-header: bg-danger text-white

**Industry:** AI Infrastructure
**Hardware:** iOS & Android Devices
**Impact:** 30% higher TPS on iOS, faster time-to-market with Qwen/Gemma models

NimbleEdge integrated ExecuTorch with its open-source DeliteAI platform to enable agentic workflows orchestrated in Python on mobile devices. The extensible ExecuTorch ecosystem made it possible to implement on-device optimization techniques that exploit contextual sparsity. ExecuTorch also accelerated the release of "NimbleEdge AI" for iOS, enabling models such as Qwen 2.5 with tool-calling support and achieving up to 30% higher tokens per second (TPS).

[Visit →](https://nimbleedge.com) • [Blog →](https://www.nimbleedge.com/blog/meet-nimbleedge-ai-the-first-truly-private-on-device-assistant) • [iOS App →](https://apps.apple.com/in/app/nimbleedge-ai/id6746237456)
:::

::::

---

## Featured Ecosystem Integrations and Interoperability

::::{grid} 2 2 3 3
:gutter: 2

:::{grid-item-card} **Hugging Face Transformers**
:class-header: bg-secondary text-white

Popular Hugging Face models can be exported to the ExecuTorch format for on-device deployment using Optimum ExecuTorch.

[Learn More →](https://github.com/huggingface/optimum-executorch/)
:::

:::{grid-item-card} **React Native ExecuTorch**
:class-header: bg-secondary text-white

Declarative toolkit for running AI models and LLMs in React Native apps with privacy-first, on-device execution.

[Explore →](https://docs.swmansion.com/react-native-executorch/) • [Blog →](https://expo.dev/blog/how-to-run-ai-models-with-react-native-executorch)
:::

:::{grid-item-card} **torchao**
:class-header: bg-secondary text-white

PyTorch-native quantization and optimization library for preparing efficient models for ExecuTorch deployment.

[Blog →](https://pytorch.org/blog/torchao-quantized-models-and-quantization-recipes-now-available-on-huggingface-hub/) • [Qwen Example →](https://huggingface.co/pytorch/Qwen3-4B-INT8-INT4) • [Phi Example →](https://huggingface.co/pytorch/Phi-4-mini-instruct-INT8-INT4)
:::

:::{grid-item-card} **Unsloth**
:class-header: bg-secondary text-white

Fine-tune LLMs with faster training and lower VRAM usage, then deploy the resulting models with ExecuTorch.

[Example Model →](https://huggingface.co/metascroy/Qwen3-4B-int8-int4-unsloth)
:::

::::

---

## Featured Demos

- **Text and multimodal LLM mobile apps** - Demo apps for text models (Llama, Qwen3, Phi-4) and multimodal models (Gemma3, Voxtral). [Try →](https://github.com/meta-pytorch/executorch-examples/tree/main/llm)

- **Voxtral** - Deploy an LLM that takes audio and text input on CPU (via XNNPACK) and on CUDA. [Try →](https://github.com/pytorch/executorch/blob/main/examples/models/voxtral/README.md)

- **LoRA adapter** - Export two LoRA adapters that share a single foundation weight file, saving memory and disk space. [Try →](https://github.com/meta-pytorch/executorch-examples/tree/main/program-data-separation/cpp/lora_example)

- **OpenVINO from Intel** - Deploy [YOLO12](https://github.com/pytorch/executorch/tree/main/examples/models/yolo12), [Llama](https://github.com/pytorch/executorch/tree/main/examples/openvino/llama), and [Stable Diffusion](https://github.com/pytorch/executorch/tree/main/examples/openvino/stable_diffusion) on [OpenVINO from Intel](https://www.intel.com/content/www/us/en/developer/articles/community/optimizing-executorch-on-ai-pcs.html).
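
For the LoRA adapter demo above, a little arithmetic shows roughly why sharing one foundation weight file saves space. The sketch below is not taken from the demo itself: the file size, layer count, model width, and rank are hypothetical values chosen only for illustration, and the helper names are made up for this example. It relies only on the standard LoRA fact that a rank-`r` adapter for a `d_out × d_in` weight matrix adds `r * (d_in + d_out)` parameters.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def total_bytes(base_bytes: int, adapter_bytes: int, n_adapters: int, shared: bool) -> int:
    """Disk footprint for n adapters, with or without a shared foundation file."""
    if shared:
        # One copy of the foundation weights plus each adapter's own weights.
        return base_bytes + n_adapters * adapter_bytes
    # Every adapter ships with its own full copy of the foundation weights.
    return n_adapters * (base_bytes + adapter_bytes)


# Hypothetical numbers, chosen only for illustration.
BASE_BYTES = 2_000_000_000   # a 2 GB foundation weight file
ADAPTED_MATRICES = 64        # weight matrices that receive a LoRA pair
D_MODEL = 2048               # square matrices of size D_MODEL x D_MODEL
RANK = 16                    # LoRA rank
BYTES_PER_PARAM = 2          # fp16 adapter weights

adapter_bytes = (
    ADAPTED_MATRICES * lora_param_count(D_MODEL, D_MODEL, RANK) * BYTES_PER_PARAM
)
separate = total_bytes(BASE_BYTES, adapter_bytes, n_adapters=2, shared=False)
shared = total_bytes(BASE_BYTES, adapter_bytes, n_adapters=2, shared=True)

print(f"one adapter:                 {adapter_bytes / 1e6:.1f} MB")
print(f"two self-contained programs: {separate / 1e9:.2f} GB")
print(f"shared foundation file:      {shared / 1e9:.2f} GB")
```

Under these assumed numbers each adapter is small next to the foundation weights, so the saving from sharing is close to one full copy of the foundation file for every adapter beyond the first.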

*Want to showcase your demo? [Submit here →](https://github.com/pytorch/executorch/issues)*