
PyTorch: Tensors


A third order polynomial, trained to predict \(y=\sin(x)\) from \(-\pi\) to \(\pi\) by minimizing squared Euclidean distance.
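
Concretely, the model and the objective being minimized are

\[\hat{y} = a + b x + c x^2 + d x^3, \qquad L = \sum_i \left(\hat{y}_i - y_i\right)^2.\]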

This implementation uses PyTorch tensors to manually compute the forward pass, loss, and backward pass.
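
Because the model is a plain polynomial, the backward pass can be written by hand; the chain rule gives

\[\frac{\partial L}{\partial a} = \sum_i 2(\hat{y}_i - y_i), \quad \frac{\partial L}{\partial b} = \sum_i 2(\hat{y}_i - y_i)\,x_i, \quad \frac{\partial L}{\partial c} = \sum_i 2(\hat{y}_i - y_i)\,x_i^2, \quad \frac{\partial L}{\partial d} = \sum_i 2(\hat{y}_i - y_i)\,x_i^3,\]

which is exactly what the code below computes.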

A PyTorch Tensor is conceptually identical to a numpy array: it knows nothing about deep learning, computational graphs, or gradients; it is just a generic n-dimensional array to be used for arbitrary numeric computation.
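
As a minimal sketch (independent of the script below), a Tensor supports the same kind of generic numeric work as a numpy array:

import torch

# A tensor is just an n-dimensional array: create it, do elementwise math, reduce it.
p = torch.linspace(0.0, 1.0, 5)   # five evenly spaced values in [0, 1]
q = p ** 2 + 1.0                  # elementwise arithmetic, numpy-style
print(q.shape, q.sum().item())    # torch.Size([5]) and the scalar sum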

The biggest difference between a numpy array and a PyTorch Tensor is that a PyTorch Tensor can run on either CPU or GPU. To run operations on the GPU, simply specify the correct device when constructing the tensor, or move an existing tensor there with Tensor.to().
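
For example, a common pattern (a sketch, not part of the script below) is to choose the device at runtime:

import torch

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(3, device=device)  # create the tensor directly on that device
x = x.to("cpu")                    # or move an existing tensor with Tensor.to()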

Out:

99 820.1533203125
199 545.5382080078125
299 363.88311767578125
399 243.71710205078125
499 164.2242431640625
599 111.63668823242188
699 76.84640502929688
799 53.82978820800781
899 38.60157012939453
999 28.526073455810547
1099 21.859657287597656
1199 17.448596954345703
1299 14.529678344726562
1399 12.598100662231445
1499 11.319828987121582
1599 10.47384262084961
1699 9.913894653320312
1799 9.543272018432617
1899 9.297931671142578
1999 9.135520935058594
Result: y = 0.002754510845988989 + 0.8738998174667358 x + -0.0004751963715534657 x^2 + -0.09577110409736633 x^3

import torch
import math


dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU

# Create random input and output data
x = torch.linspace(-math.pi, math.pi, 2000, device=device, dtype=dtype)
y = torch.sin(x)

# Randomly initialize weights
a = torch.randn((), device=device, dtype=dtype)
b = torch.randn((), device=device, dtype=dtype)
c = torch.randn((), device=device, dtype=dtype)
d = torch.randn((), device=device, dtype=dtype)

learning_rate = 1e-6
for t in range(2000):
    # Forward pass: compute predicted y
    y_pred = a + b * x + c * x ** 2 + d * x ** 3

    # Compute and print loss
    loss = (y_pred - y).pow(2).sum().item()
    if t % 100 == 99:
        print(t, loss)

    # Backprop to compute gradients of a, b, c, d with respect to loss
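    # Since y_pred = a + b*x + c*x**2 + d*x**3 and loss = sum((y_pred - y)**2),
    # the chain rule gives d(loss)/d(coeff) = sum(grad_y_pred * d(y_pred)/d(coeff)).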
    grad_y_pred = 2.0 * (y_pred - y)
    grad_a = grad_y_pred.sum()
    grad_b = (grad_y_pred * x).sum()
    grad_c = (grad_y_pred * x ** 2).sum()
    grad_d = (grad_y_pred * x ** 3).sum()

    # Update weights using gradient descent
    a -= learning_rate * grad_a
    b -= learning_rate * grad_b
    c -= learning_rate * grad_c
    d -= learning_rate * grad_d


print(f'Result: y = {a.item()} + {b.item()} x + {c.item()} x^2 + {d.item()} x^3')
