Note
Go to the end to download the full example code.
PyTorch: Tensors#
Created On: Dec 03, 2020 | Last Updated: Sep 29, 2025 | Last Verified: Nov 05, 2024
A third order polynomial, trained to predict \(y=\sin(x)\) from \(-\pi\) to \(\pi\) by minimizing squared Euclidean distance.
This implementation uses PyTorch tensors to manually compute the forward pass, loss, and backward pass.
A PyTorch Tensor is basically the same as a numpy array: it does not know anything about deep learning or computational graphs or gradients, and is just a generic n-dimensional array to be used for arbitrary numeric computation.
The biggest difference between a numpy array and a PyTorch Tensor is that a PyTorch Tensor can run on either CPU or GPU. To run operations on the GPU, just cast the Tensor to a cuda datatype.
99 112.06228637695312
199 78.61465454101562
299 56.05115509033203
399 40.81524658203125
499 30.516984939575195
599 23.54906463623047
699 18.829601287841797
799 15.62961196899414
899 13.457603454589844
999 11.981681823730469
1099 10.977642059326172
1199 10.293853759765625
1299 9.827648162841797
1399 9.509427070617676
1499 9.291967391967773
1599 9.143184661865234
1699 9.041280746459961
1799 8.97139835357666
1899 8.923420906066895
1999 8.890451431274414
Result: y = 0.007187704090029001 + 0.8516734838485718 x + -0.0012399984989315271 x^2 + -0.09260959923267365 x^3
import torch
import math
dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU
# Create random input and output data
x = torch.linspace(-math.pi, math.pi, 2000, device=device, dtype=dtype)
y = torch.sin(x)
# Randomly initialize weights
a = torch.randn((), device=device, dtype=dtype)
b = torch.randn((), device=device, dtype=dtype)
c = torch.randn((), device=device, dtype=dtype)
d = torch.randn((), device=device, dtype=dtype)
learning_rate = 1e-6
for t in range(2000):
# Forward pass: compute predicted y
y_pred = a + b * x + c * x ** 2 + d * x ** 3
# Compute and print loss
loss = (y_pred - y).pow(2).sum().item()
if t % 100 == 99:
print(t, loss)
# Backprop to compute gradients of a, b, c, d with respect to loss
grad_y_pred = 2.0 * (y_pred - y)
grad_a = grad_y_pred.sum()
grad_b = (grad_y_pred * x).sum()
grad_c = (grad_y_pred * x ** 2).sum()
grad_d = (grad_y_pred * x ** 3).sum()
# Update weights using gradient descent
a -= learning_rate * grad_a
b -= learning_rate * grad_b
c -= learning_rate * grad_c
d -= learning_rate * grad_d
print(f'Result: y = {a.item()} + {b.item()} x + {c.item()} x^2 + {d.item()} x^3')
Total running time of the script: (0 minutes 0.224 seconds)