Rate this Page

PyTorch: Tensors#

Created On: Dec 03, 2020 | Last Updated: May 08, 2026 | Last Verified: Nov 05, 2024

A third order polynomial, trained to predict \(y=\sin(x)\) from \(-\pi\) to \(\pi\) by minimizing squared Euclidean distance.

This implementation uses PyTorch tensors to manually compute the forward pass, loss, and backward pass.

A PyTorch Tensor is basically the same as a numpy array: it does not know anything about deep learning or computational graphs or gradients, and is just a generic n-dimensional array to be used for arbitrary numeric computation.

The biggest difference between a numpy array and a PyTorch Tensor is that a PyTorch Tensor can run on either CPU or GPU. To run operations on the GPU, just cast the Tensor to a cuda datatype.

99 5238.5478515625
199 3586.5361328125
299 2459.189453125
399 1688.9970703125
499 1162.2047119140625
599 801.4767456054688
699 554.1801147460938
799 384.45263671875
899 267.83099365234375
999 187.60887145996094
1099 132.3638916015625
1199 94.27767944335938
1299 67.99238586425781
1399 49.832298278808594
1499 37.2726936340332
1599 28.577613830566406
1699 22.55191421508789
1799 18.37211799621582
1899 15.469992637634277
1999 13.453141212463379
Result: y = -0.06360562145709991 + 0.8256180882453918 x + 0.010973028838634491 x^2 + -0.08890344947576523 x^3

import torch

dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU

# Create random input and output data
x = torch.linspace(-torch.pi, torch.pi, 2000, device=device, dtype=dtype)
y = torch.sin(x)

# Randomly initialize weights
a = torch.randn((), device=device, dtype=dtype)
b = torch.randn((), device=device, dtype=dtype)
c = torch.randn((), device=device, dtype=dtype)
d = torch.randn((), device=device, dtype=dtype)

learning_rate = 1e-6
for t in range(2000):
    # Forward pass: compute predicted y
    y_pred = a + b * x + c * x ** 2 + d * x ** 3

    # Compute and print loss
    loss = (y_pred - y).pow(2).sum().item()
    if t % 100 == 99:
        print(t, loss)

    # Backprop to compute gradients of a, b, c, d with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_a = grad_y_pred.sum()
    grad_b = (grad_y_pred * x).sum()
    grad_c = (grad_y_pred * x ** 2).sum()
    grad_d = (grad_y_pred * x ** 3).sum()

    # Update weights using gradient descent
    a -= learning_rate * grad_a
    b -= learning_rate * grad_b
    c -= learning_rate * grad_c
    d -= learning_rate * grad_d


print(f'Result: y = {a.item()} + {b.item()} x + {c.item()} x^2 + {d.item()} x^3')

Total running time of the script: (0 minutes 0.229 seconds)