
How to use torch.compile on Windows CPU/XPU

Author: Zhaoqiong Zheng, Xu, Han

Introduction

TorchInductor is the new compiler backend that compiles the FX Graphs generated by TorchDynamo into optimized C++/Triton kernels.

This tutorial introduces the steps for using TorchInductor via torch.compile on Windows CPU/XPU.

Software Installation

This section walks you through the software needed to use torch.compile on Windows CPU/XPU, step by step.

Install a Compiler

A C++ compiler is required for TorchInductor optimization. This tutorial uses Microsoft Visual C++ (MSVC) as an example.

  1. Download and install MSVC.

  2. During installation, select Workloads and then Desktop & Mobile. Check Desktop Development with C++ and install.

../_images/install_msvc.png

Note

Inductor on Windows CPU also supports the LLVM Compiler and the Intel Compiler as C++ compilers for better performance. See Alternative Compiler for better performance on CPU.
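
To see which of these compilers can be found from the current command prompt, a small check like the one below may help. It is a minimal sketch that uses only the Python standard library; the helper name find_compilers is hypothetical and not part of PyTorch. Note that cl is normally found only after the MSVC environment has been activated, as described in Set Up Environment.

```python
import shutil

# Executable names used in this tutorial: MSVC, Intel Compiler, LLVM Compiler.
CANDIDATES = ("cl", "icx-cl", "clang-cl")


def find_compilers(names=CANDIDATES, which=shutil.which):
    """Map each compiler name to its resolved path, or None if not on PATH.

    `which` is injectable so the lookup can be exercised without a real PATH.
    """
    return {name: which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_compilers().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If cl is reported as not found, the MSVC environment is probably not active in the current prompt.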

Set Up Environment

Next, let’s configure our environment.

  1. Open a command line environment via cmd.exe.

  2. Activate MSVC with the following command:

    "C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Auxiliary/Build/vcvars64.bat"
    
  3. Create and activate a virtual environment.

  4. Install PyTorch 2.5 or later for CPU usage. For XPU usage, install PyTorch 2.7 or later by following Getting Started on Intel GPU.

  5. Here is an example of how to use TorchInductor on Windows:

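    import torch

    device = "cpu"  # or "xpu" for XPU

    def foo(x, y):
        # Note: y is accepted but unused; the result depends only on x.
        a = torch.sin(x)
        b = torch.cos(x)
        return a + b

    # Compile foo with the default backend (TorchInductor).
    opt_foo1 = torch.compile(foo)
    print(opt_foo1(torch.randn(10, 10).to(device), torch.randn(10, 10).to(device)))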
    
  6. The example prints output similar to the following. The exact values differ on each run because the inputs are random:

    tensor([[-3.9074e-02,  1.3994e+00,  1.3894e+00,  3.2630e-01,  8.3060e-01,
            1.1833e+00,  1.4016e+00,  7.1905e-01,  9.0637e-01, -1.3648e+00],
            [ 1.3728e+00,  7.2863e-01,  8.6888e-01, -6.5442e-01,  5.6790e-01,
            5.2025e-01, -1.2647e+00,  1.2684e+00, -1.2483e+00, -7.2845e-01],
            [-6.7747e-01,  1.2028e+00,  1.1431e+00,  2.7196e-02,  5.5304e-01,
            6.1945e-01,  4.6654e-01, -3.7376e-01,  9.3644e-01,  1.3600e+00],
            [-1.0157e-01,  7.7200e-02,  1.0146e+00,  8.8175e-02, -1.4057e+00,
            8.8119e-01,  6.2853e-01,  3.2773e-01,  8.5082e-01,  8.4615e-01],
            [ 1.4140e+00,  1.2130e+00, -2.0762e-01,  3.3914e-01,  4.1122e-01,
            8.6895e-01,  5.8852e-01,  9.3310e-01,  1.4101e+00,  9.8318e-01],
            [ 1.2355e+00,  7.9290e-02,  1.3707e+00,  1.3754e+00,  1.3768e+00,
            9.8970e-01,  1.1171e+00, -5.9944e-01,  1.2553e+00,  1.3394e+00],
            [-1.3428e+00,  1.8400e-01,  1.1756e+00, -3.0654e-01,  9.7973e-01,
            1.4019e+00,  1.1886e+00, -1.9194e-01,  1.3632e+00,  1.1811e+00],
            [-7.1615e-01,  4.6622e-01,  1.2089e+00,  9.2011e-01,  1.0659e+00,
            9.0892e-01,  1.1932e+00,  1.3888e+00,  1.3898e+00,  1.3218e+00],
            [ 1.4139e+00, -1.4000e-01,  9.1192e-01,  3.0175e-01, -9.6432e-01,
            -1.0498e+00,  1.4115e+00, -9.3212e-01, -9.0964e-01,  1.0127e+00],
            [ 5.7244e-04,  1.2799e+00,  1.3595e+00,  1.0907e+00,  3.7191e-01,
            1.4062e+00,  1.3672e+00,  6.8502e-02,  8.5216e-01,  8.6046e-01]])
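
Step 4 above states minimum PyTorch versions for each device. The following sketch shows one way to check an installed version string against those minimums. The names MIN_VERSION, parse_major_minor, and meets_minimum are illustrative helpers written for this example, not a PyTorch API.

```python
import re

# Minimum (major, minor) versions stated in this tutorial.
MIN_VERSION = {"cpu": (2, 5), "xpu": (2, 7)}


def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '2.7.0+cpu'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version, device):
    """Return True if `version` satisfies the tutorial's minimum for `device`."""
    return parse_major_minor(version) >= MIN_VERSION[device]


if __name__ == "__main__":
    # In a real environment, pass torch.__version__ instead of a literal.
    print(meets_minimum("2.5.1", "cpu"))
```

In practice you would call meets_minimum(torch.__version__, device) before compiling, where device is "cpu" or "xpu".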
    

Alternative Compiler for better performance on CPU

To improve Inductor performance on Windows CPU, you can use the Intel Compiler or the LLVM Compiler. Both rely on the runtime libraries from Microsoft Visual C++ (MSVC), so your first step should be to install MSVC.

Intel Compiler

  1. Download and install the Windows version of the Intel Compiler.

  2. Select it as the Windows Inductor compiler by setting the CXX environment variable: set CXX=icx-cl.

LLVM Compiler

  1. Download and install the LLVM Compiler, choosing the win64 version.

  2. Select it as the Windows Inductor compiler by setting the CXX environment variable: set CXX=clang-cl.
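
As an alternative to typing set CXX=... in the command prompt, the variable can be set from within Python before anything is compiled. The sketch below picks the first available compiler from a preference list and stores it in CXX. It assumes that Inductor reads CXX from the process environment when it first invokes the compiler, so it should run before the first call to a compiled function. The helper name set_cxx and the preference order are illustrative choices, not part of PyTorch.

```python
import os
import shutil


def set_cxx(env, preferred=("icx-cl", "clang-cl", "cl"), which=shutil.which):
    """Set env["CXX"] to the first compiler in `preferred` found on PATH.

    Returns the chosen name, or None (leaving `env` untouched) if none is found.
    The preference order is an assumption made for this example.
    """
    for name in preferred:
        if which(name) is not None:
            env["CXX"] = name
            return name
    return None


if __name__ == "__main__":
    chosen = set_cxx(os.environ)
    print("CXX =", chosen)
    # Import torch and call torch.compile(...) after this point,
    # in the same process, so the setting is visible to Inductor.
```

Setting the variable in the command prompt, as in the steps above, remains the documented route; this in-process variant is only a convenience.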

Conclusion

In this tutorial, we introduced how to use Inductor on Windows CPU with PyTorch 2.5 or later, and on Windows XPU with PyTorch 2.7 or later. You can also use the Intel Compiler or the LLVM Compiler to get better performance on CPU.
