Success Stories
Discover how organizations are leveraging ExecuTorch to deploy AI models at scale on edge devices.
Featured Success Stories
Industry: Social Media & Messaging · Hardware: Android & iOS Devices · Impact: Billions of users, latency reduction
Powers Instagram, WhatsApp, Facebook, and Messenger with real-time on-device AI for content ranking, recommendations, and privacy-preserving features at scale.
Industry: AR/VR & Wearables · Hardware: Quest 3, Ray-Ban Meta Smart Glasses, Meta Ray-Ban Display
Enables immersive mixed reality with real-time computer vision, hand tracking, voice commands, and translation on power-constrained wearable devices.
Industry: Artificial Intelligence / Edge Computing · Hardware: CPU via PyTorch ExecuTorch · Impact: 2× faster inference, lower latency, seamless multimodal deployment
Liquid AI builds foundation models that make AI work where the cloud can’t. In its LFM2 series, the team uses PyTorch ExecuTorch within the LEAP Edge SDK to deploy high-performance multimodal models efficiently across devices. ExecuTorch provides the flexibility to support custom architectures and processing pipelines while reducing inference latency through graph optimization and caching. Together, they enable faster, more efficient, privacy-preserving AI that runs entirely on the edge.
Industry: Privacy & Personal Computing · Hardware: iOS & Android Devices · Impact: 100% on-device processing
PrivateMind delivers a fully private AI assistant using ExecuTorch’s .pte format. Built with React Native ExecuTorch, it supports LLaMA, Qwen, Phi-4, and custom models with offline speech-to-text and PDF chat capabilities.
Industry: AI Infrastructure · Hardware: iOS & Android Devices · Impact: 30% higher TPS on iOS, faster time-to-market with Qwen/Gemma models
NimbleEdge integrated ExecuTorch with its open-source DeliteAI platform to enable agentic workflows orchestrated in Python on mobile devices. ExecuTorch's extensible ecosystem made it possible to implement on-device optimization techniques that exploit contextual sparsity. ExecuTorch significantly accelerated the release of "NimbleEdge AI" for iOS, enabling models like Qwen 2.5 with tool-calling support and achieving up to 30% higher tokens per second.
Featured Ecosystem Integrations and Interoperability
Popular models from Hugging Face easily export to ExecuTorch format for on-device deployment.
PyTorch-native quantization and optimization library for preparing efficient models for ExecuTorch deployment.
Optimize LLM fine-tuning with faster training and reduced VRAM usage, then deploy efficiently with ExecuTorch.
Featured Demos
Text and multimodal LLM mobile demo apps - text-only (Llama, Qwen3, Phi-4) and multimodal (Gemma3, Voxtral). Try →
Voxtral - Deploy an audio-and-text-input LLM on CPU (via XNNPACK) and on CUDA. Try →
LoRA adapter - Export two LoRA adapters that share a single foundation weight file, saving memory and disk space. Try →
OpenVINO from Intel - Deploy YOLO12, Llama, and Stable Diffusion via the OpenVINO backend.
Want to showcase your demo? Submit here →