Success Stories
Discover how organizations are leveraging ExecuTorch to deploy AI models at scale on edge devices.
Featured Success Stories
Industry: Social Media & Messaging · Hardware: Android & iOS Devices · Impact: Billions of users, latency reduction
Powers Instagram, WhatsApp, Facebook, and Messenger with real-time on-device AI for content ranking, recommendations, and privacy-preserving features at scale.
Industry: AR/VR & Wearables · Hardware: Quest 3, Ray-Ban Meta Smart Glasses, Meta Ray-Ban Display
Enables immersive mixed reality with real-time computer vision, hand tracking, voice commands, and translation on power-constrained wearable devices.
Industry: Artificial Intelligence / Edge Computing · Hardware: CPU via PyTorch ExecuTorch · Impact: 2× faster inference, lower latency, seamless multimodal deployment
Liquid AI builds foundation models that make AI work where the cloud can’t. In its LFM2 series, the team uses PyTorch ExecuTorch within the LEAP Edge SDK to deploy high-performance multimodal models efficiently across devices. ExecuTorch provides the flexibility to support custom architectures and processing pipelines while reducing inference latency through graph optimization and caching. Together, they enable faster, more efficient, privacy-preserving AI that runs entirely on the edge.
Industry: Privacy & Personal Computing · Hardware: iOS & Android Devices · Impact: 100% on-device processing
PrivateMind delivers a fully private AI assistant using ExecuTorch’s .pte format. Built with React Native ExecuTorch, it supports LLaMA, Qwen, Phi-4, and custom models with offline speech-to-text and PDF chat capabilities.
Industry: AI Infrastructure · Hardware: iOS & Android Devices · Impact: 30% higher TPS on iOS, faster time-to-market with Qwen/Gemma models
NimbleEdge integrated ExecuTorch with its open-source DeliteAI platform to enable agentic workflows orchestrated in Python on mobile devices. ExecuTorch's extensible ecosystem made it possible to implement on-device optimization techniques that exploit contextual sparsity. ExecuTorch significantly accelerated the release of "NimbleEdge AI" for iOS, enabling models like Qwen 2.5 with tool-calling support and achieving up to 30% higher tokens per second.
Featured Ecosystem Integrations and Interoperability
Popular models from Hugging Face easily export to ExecuTorch format for on-device deployment.
PyTorch-native quantization and optimization library for preparing efficient models for ExecuTorch deployment.
Optimize LLM fine-tuning with faster training and reduced VRAM usage, then deploy efficiently with ExecuTorch.
Featured Demos
Text and multimodal LLM mobile demo apps - text-only (Llama, Qwen3, Phi-4) and multimodal (Gemma3, Voxtral). Try →
Voxtral - Deploy an audio-and-text-input LLM on CPU (via XNNPACK) and on CUDA. Try →
LoRA adapter - Export two LoRA adapters that share a single foundation weight file, saving memory and disk space. Try →
OpenVINO from Intel - Deploy YOLO12, Llama, and Stable Diffusion via the OpenVINO backend.
Want to showcase your demo? Submit here →