Note
Go to the end to download the full example code.
PyTorch: Tensors#
Created On: Dec 03, 2020 | Last Updated: Sep 29, 2025 | Last Verified: Nov 05, 2024
A third order polynomial, trained to predict \(y=\sin(x)\) from \(-\pi\) to \(\pi\) by minimizing squared Euclidean distance.
This implementation uses PyTorch tensors to manually compute the forward pass, loss, and backward pass.
A PyTorch Tensor is basically the same as a numpy array: it does not know anything about deep learning or computational graphs or gradients, and is just a generic n-dimensional array to be used for arbitrary numeric computation.
The biggest difference between a numpy array and a PyTorch Tensor is that a PyTorch Tensor can run on either CPU or GPU. To run operations on the GPU, just cast the Tensor to a cuda datatype.
99 1039.164306640625
199 690.1633911132812
299 459.378173828125
399 306.76531982421875
499 205.84576416015625
599 139.10977172851562
699 94.97840118408203
799 65.79496765136719
899 46.496337890625
999 33.73435592651367
1099 25.294967651367188
1199 19.714000701904297
1299 16.023290634155273
1399 13.582684516906738
1499 11.968647003173828
1599 10.901314735412598
1699 10.195462226867676
1799 9.728677749633789
1899 9.41999626159668
1999 9.215839385986328
Result: y = -0.000988550134934485 + 0.8373526334762573 x + 0.00017054248019121587 x^2 + -0.09057258814573288 x^3
import torch
import math
dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU
# Create random input and output data
x = torch.linspace(-math.pi, math.pi, 2000, device=device, dtype=dtype)
y = torch.sin(x)
# Randomly initialize weights
a = torch.randn((), device=device, dtype=dtype)
b = torch.randn((), device=device, dtype=dtype)
c = torch.randn((), device=device, dtype=dtype)
d = torch.randn((), device=device, dtype=dtype)
learning_rate = 1e-6
for t in range(2000):
# Forward pass: compute predicted y
y_pred = a + b * x + c * x ** 2 + d * x ** 3
# Compute and print loss
loss = (y_pred - y).pow(2).sum().item()
if t % 100 == 99:
print(t, loss)
# Backprop to compute gradients of a, b, c, d with respect to loss
grad_y_pred = 2.0 * (y_pred - y)
grad_a = grad_y_pred.sum()
grad_b = (grad_y_pred * x).sum()
grad_c = (grad_y_pred * x ** 2).sum()
grad_d = (grad_y_pred * x ** 3).sum()
# Update weights using gradient descent
a -= learning_rate * grad_a
b -= learning_rate * grad_b
c -= learning_rate * grad_c
d -= learning_rate * grad_d
print(f'Result: y = {a.item()} + {b.item()} x + {c.item()} x^2 + {d.item()} x^3')
Total running time of the script: (0 minutes 0.224 seconds)