---
myst:
html_meta:
description: PyTorch C++ API for computing gradients — torch::autograd::backward and torch::autograd::grad functions for automatic differentiation.
keywords: PyTorch, C++, autograd, backward, grad, gradient, automatic differentiation
---
# Gradient Computation
PyTorch provides functions for computing gradients of tensors with respect
to graph leaves.
## Gradient Functions
```{cpp:function} void torch::autograd::backward(const variable_list& tensors, const variable_list& grad_tensors = {}, std::optional<bool> retain_graph = std::nullopt, bool create_graph = false, const variable_list& inputs = {})
Computes the sum of gradients of given tensors with respect to graph leaves.
The graph is differentiated using the chain rule. If any of `tensors`
are non-scalar (i.e. their data has more than one element) and require
gradient, then the Jacobian-vector product will be computed; in this case
the function additionally requires specifying `grad_tensors`. It should be a
sequence of matching length that contains the "vector" in the
Jacobian-vector product, usually the gradient of the differentiated function
w.r.t. the corresponding tensors (`torch::Tensor()` is an acceptable value for
all tensors that don't need gradient tensors).
This function accumulates gradients in the leaves — you might need to zero
them before calling it.
:param tensors: Tensors of which the derivative will be computed.
:param grad_tensors: The "vector" in the Jacobian-vector product, usually
gradients w.r.t. each element of corresponding tensors.
    `torch::Tensor()` values can be specified for scalar Tensors or ones
    that don't require grad. If a `torch::Tensor()` value is
    acceptable for all grad_tensors, then this argument is optional.
:param retain_graph: If `false`, the graph used to compute the grad will
be freed. Note that in nearly all cases setting this option to `true`
is not needed and often can be worked around in a much more efficient
way. Defaults to the value of `create_graph`.
:param create_graph: If `true`, the graph of the derivative will be
    constructed, allowing higher-order derivative products to be computed.
    Defaults to `false`.
:param inputs: Inputs w.r.t. which the gradient will be accumulated into
`at::Tensor::grad`. All other Tensors will be ignored. If not
provided, the gradient is accumulated into all the leaf Tensors that
were used to compute `tensors`.
```
```{doxygenfunction} torch::autograd::grad
```
**Example:**
```cpp
#include <torch/torch.h>
#include <iostream>

auto x = torch::randn({2, 2}, torch::requires_grad());
auto y = x * x;
auto z = y.sum();
// Compute gradients and accumulate them into x.grad()
z.backward();
std::cout << x.grad() << std::endl;

// Or use grad() to obtain gradients for specific inputs without
// accumulating into .grad(). The graph was freed by backward() above,
// so recompute the forward pass first.
auto z2 = (x * x).sum();
auto grads = torch::autograd::grad({z2}, {x});
std::cout << grads[0] << std::endl;
```
## Tensor Gradient Methods
Tensors have built-in methods for gradient computation:
```cpp
// Enable gradient tracking
auto x = torch::randn({2, 2}).requires_grad_(true);
// Check if gradient is required
bool needs_grad = x.requires_grad();
// Access the gradient after backward
auto grad = x.grad();
// Detach from computation graph
auto x_detached = x.detach();
```