How to use `torch.compile` on Windows CPU/XPU¶

Introduction¶

TorchInductor is the new compiler backend that compiles the FX Graphs generated by TorchDynamo into optimized C++/Triton kernels.

This tutorial introduces the steps for using TorchInductor via torch.compile on Windows CPU/XPU.

Software Installation¶

This section walks through the steps for using torch.compile on Windows CPU/XPU.

Install a Compiler¶

A C++ compiler is required for TorchInductor optimization. This tutorial uses Microsoft Visual C++ (MSVC) as an example.

Download and install MSVC.
During installation, select Workloads and then Desktop & Mobile. Check Desktop Development with C++ and install.

Note

On Windows CPU, Inductor also supports the LLVM Compiler and the Intel Compiler, which can give better performance. See Alternative Compiler for better performance on CPU.

Set Up Environment¶

Next, let’s configure our environment.

Open a command line environment via cmd.exe.

Activate MSVC with the following command:

"C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Auxiliary/Build/vcvars64.bat"

Create and activate a virtual environment.
Install PyTorch 2.5 or later for CPU usage. For XPU usage, install PyTorch 2.7 or later and refer to Getting Started on Intel GPU.
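Before compiling anything, it can help to confirm that the setup steps took effect. The following is a minimal sketch, not part of the official tutorial: `check_environment` is a hypothetical helper, and it assumes that Inductor uses the compiler named by the `CXX` environment variable and otherwise falls back to MSVC's `cl`. It reports whether that compiler is visible on `PATH` in the current shell and which PyTorch version, if any, is installed.

```python
import os
import shutil
from importlib import metadata


def check_environment():
    """Collect basic facts about the torch.compile prerequisites."""
    # Assumption: the compiler is taken from CXX, else MSVC's cl is used.
    compiler = os.environ.get("CXX", "cl")
    try:
        # Read the installed version without importing torch itself.
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        torch_version = None
    return {
        "compiler": compiler,
        # None means the executable is not on PATH, for example because
        # vcvars64.bat has not been run in this shell.
        "compiler_path": shutil.which(compiler),
        "torch_version": torch_version,
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `compiler_path` is reported as `None`, rerun the MSVC activation command in the same cmd.exe session before starting Python.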

Here is an example of how to use TorchInductor on Windows:

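import torch

device = "cpu"  # or "xpu" for XPU

def foo(x, y):
    # y is accepted but unused; the result depends only on x.
    a = torch.sin(x)
    b = torch.cos(x)
    return a + b

# Compile foo with torch.compile, which uses TorchInductor by default.
opt_foo1 = torch.compile(foo)
print(opt_foo1(torch.randn(10, 10).to(device), torch.randn(10, 10).to(device)))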

The example prints output similar to the following (the exact values differ from run to run because the inputs are random):

tensor([[-3.9074e-02,  1.3994e+00,  1.3894e+00,  3.2630e-01,  8.3060e-01,
        1.1833e+00,  1.4016e+00,  7.1905e-01,  9.0637e-01, -1.3648e+00],
        [ 1.3728e+00,  7.2863e-01,  8.6888e-01, -6.5442e-01,  5.6790e-01,
        5.2025e-01, -1.2647e+00,  1.2684e+00, -1.2483e+00, -7.2845e-01],
        [-6.7747e-01,  1.2028e+00,  1.1431e+00,  2.7196e-02,  5.5304e-01,
        6.1945e-01,  4.6654e-01, -3.7376e-01,  9.3644e-01,  1.3600e+00],
        [-1.0157e-01,  7.7200e-02,  1.0146e+00,  8.8175e-02, -1.4057e+00,
        8.8119e-01,  6.2853e-01,  3.2773e-01,  8.5082e-01,  8.4615e-01],
        [ 1.4140e+00,  1.2130e+00, -2.0762e-01,  3.3914e-01,  4.1122e-01,
        8.6895e-01,  5.8852e-01,  9.3310e-01,  1.4101e+00,  9.8318e-01],
        [ 1.2355e+00,  7.9290e-02,  1.3707e+00,  1.3754e+00,  1.3768e+00,
        9.8970e-01,  1.1171e+00, -5.9944e-01,  1.2553e+00,  1.3394e+00],
        [-1.3428e+00,  1.8400e-01,  1.1756e+00, -3.0654e-01,  9.7973e-01,
        1.4019e+00,  1.1886e+00, -1.9194e-01,  1.3632e+00,  1.1811e+00],
        [-7.1615e-01,  4.6622e-01,  1.2089e+00,  9.2011e-01,  1.0659e+00,
        9.0892e-01,  1.1932e+00,  1.3888e+00,  1.3898e+00,  1.3218e+00],
        [ 1.4139e+00, -1.4000e-01,  9.1192e-01,  3.0175e-01, -9.6432e-01,
        -1.0498e+00,  1.4115e+00, -9.3212e-01, -9.0964e-01,  1.0127e+00],
        [ 5.7244e-04,  1.2799e+00,  1.3595e+00,  1.0907e+00,  3.7191e-01,
        1.4062e+00,  1.3672e+00,  6.8502e-02,  8.5216e-01,  8.6046e-01]])

Alternative Compiler for better performance on CPU¶

To improve Inductor performance on Windows CPU, you can use the Intel Compiler or the LLVM Compiler. Both rely on the runtime libraries from Microsoft Visual C++ (MSVC), so your first step should be to install MSVC.

Intel Compiler¶

Download and install the Windows version of the Intel Compiler.
Select it as the Inductor compiler on Windows by setting an environment variable: set CXX=icx-cl.

LLVM Compiler¶

Download and install the LLVM Compiler, choosing the win64 version.
Select it as the Inductor compiler on Windows by setting an environment variable: set CXX=clang-cl.
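The `set CXX=...` steps for the Intel and LLVM compilers can also be scripted. The sketch below is illustrative only: `select_inductor_compiler` is a hypothetical helper, not a PyTorch API, and it assumes that setting `CXX` from Python has the same effect as `set CXX=...` provided it happens before `torch` is imported. It only sets the variable when the executable can actually be found on `PATH`.

```python
import os
import shutil


def select_inductor_compiler(executable):
    """Point CXX at `executable` if it can be found; return True on success."""
    if shutil.which(executable) is None:
        # Not on PATH: leave the environment untouched.
        return False
    os.environ["CXX"] = executable
    return True


if __name__ == "__main__":
    # Try the alternatives named in this tutorial, then fall back to MSVC.
    for candidate in ("icx-cl", "clang-cl", "cl"):
        if select_inductor_compiler(candidate):
            print(f"CXX set to {candidate}")
            break
    else:
        print("No compiler found; activate MSVC or install a compiler first.")
    # Assumption: import torch only after this point so it sees the new CXX.
```

The candidate order here is a design choice for the example, not a recommendation from the tutorial; reorder it to match the compiler you have installed.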

Conclusion¶

In this tutorial, we introduced how to use Inductor on Windows CPU with PyTorch 2.5 or later, and on Windows XPU with PyTorch 2.7 or later. You can also use the Intel Compiler or the LLVM Compiler to get better performance on CPU.
