---
myst:
html_meta:
description: Gradient mode guards in PyTorch C++ — NoGradGuard and InferenceMode for disabling gradient computation.
keywords: PyTorch, C++, NoGradGuard, InferenceMode, no_grad, inference, RAII guard
---
# Gradient Modes
PyTorch provides RAII guards to control gradient computation behavior.
## NoGradGuard
```{cpp:class} torch::NoGradGuard
RAII guard that disables gradient computation within its scope.
```
```{cpp:function} NoGradGuard()
Disables gradient computation.
```
```{cpp:function} ~NoGradGuard()
Restores previous gradient mode.
```
**Example:**
```cpp
{
  torch::NoGradGuard no_grad;
  // No gradients computed in this scope
  auto result = model->forward(input);
}
```
## InferenceMode
`c10::InferenceMode` is a RAII guard analogous to `NoGradMode` designed for use
when you are certain your operations will have no interactions with autograd
(e.g., model inference). Compared to `NoGradMode`, code run under this mode gets
better performance by disabling autograd-related work like view tracking and version
counter bumps. However, tensors created inside `InferenceMode` have more limitations
when interacting with the autograd system.
```{cpp:class} c10::InferenceMode
RAII guard that enables inference mode for optimized inference.
This is more efficient than NoGradGuard for inference-only workloads.
```
```{cpp:function} explicit InferenceMode(bool enabled = true)
Enables or disables inference mode.
```
**Inference Tensors:**
Inside an `InferenceMode` block, all newly allocated (non-view) tensors are marked as
inference tensors. Inference tensors:
- Do not have a version counter, so an error will be raised if you try to read their
  version (e.g., because you saved this tensor for backward).
- Are immutable outside `InferenceMode`. An error will be raised if you try to:
  - mutate their data outside `InferenceMode`, or
  - mutate them to `requires_grad=True` outside `InferenceMode`.

  To work around this, make a clone outside `InferenceMode` to get a normal tensor
  before mutating.

A non-view tensor is an inference tensor if and only if it was allocated inside `InferenceMode`.
A view tensor is an inference tensor if and only if it is a view of an inference tensor.
**Performance Guarantees:**
Inside an `InferenceMode` block:
- As in `NoGradMode`, operations do not record a `grad_fn` even if their inputs have
  `requires_grad=True`. This applies to both inference tensors and normal tensors.
- View operations on inference tensors do not perform view tracking; view and non-view
  inference tensors are indistinguishable.
- In-place operations on inference tensors are guaranteed not to bump the version counter.
For more implementation details, see the [RFC-0011-InferenceMode](https://github.com/pytorch/rfcs/pull/17).
**Basic Example:**
```cpp
{
  c10::InferenceMode guard;
  // Optimized inference without gradient tracking
  auto result = model->forward(input);
}
```
**Inference Workload Example:**
```cpp
c10::InferenceMode guard;
model.load_jit(saved_model);
auto inputs = preprocess_tensors(data);
auto out = model.forward(inputs);
auto outputs = postprocess_tensors(out);
```
**Nested InferenceMode:**
Unlike some other guards, `InferenceMode` can be nested with different enabled/disabled states:
```cpp
{
  c10::InferenceMode guard(true);
  // InferenceMode is on
  {
    c10::InferenceMode inner_guard(false);
    // InferenceMode is off
  }
  // InferenceMode is on
}
// InferenceMode is off
```
## InferenceMode vs NoGradMode
`InferenceMode` is preferred over `NoGradMode` for pure inference workloads because
it provides better performance. Key differences:
- Both guards cause tensor operations to skip work that is not needed for inference, but
  `InferenceMode` also affects tensor creation, while `NoGradMode` does not.
- Tensors created inside `InferenceMode` are marked as inference tensors and carry
  limitations that persist after exiting `InferenceMode`.
- `InferenceMode` can be nested with different enabled/disabled states.
## Migrating from AutoNonVariableTypeMode
The legacy `AutoNonVariableTypeMode` guard (since renamed to
`AutoDispatchBelowADInplaceOrView`) was commonly used for inference workloads,
but it is unsafe for that purpose: it can silently bypass safety checks and produce
incorrect results.
- **For inference-only workloads** (e.g. loading a pretrained JIT model and
running inference in C++ runtime), use `c10::InferenceMode` as a drop-in
replacement. It preserves the performance characteristics while providing
correctness guarantees.
- **For custom autograd kernels** that need to redispatch below the Autograd
dispatch key, use `AutoDispatchBelowADInplaceOrView` instead:
```cpp
// torch::autograd::Function uses CRTP: the derived class must be passed
// as the template argument.
class ROIAlignFunction : public torch::autograd::Function<ROIAlignFunction> {
 public:
  static torch::autograd::variable_list forward(
      torch::autograd::AutogradContext* ctx,
      const torch::autograd::Variable& input,
      const torch::autograd::Variable& rois,
      double spatial_scale, int64_t pooled_height,
      int64_t pooled_width, int64_t sampling_ratio, bool aligned) {
    ctx->saved_data["spatial_scale"] = spatial_scale;
    ctx->save_for_backward({rois});
    // Redispatch below the autograd dispatch keys to reach the backend kernel.
    at::AutoDispatchBelowADInplaceOrView guard;
    auto result = roi_align(input, rois, spatial_scale,
                            pooled_height, pooled_width, sampling_ratio, aligned);
    return {result};
  }
};
```