PyTorch: Tensors
A third-order polynomial, trained to predict \(y=\sin(x)\) from \(-\pi\) to \(\pi\) by minimizing squared Euclidean distance.
This implementation uses PyTorch tensors to manually compute the forward pass, loss, and backward pass.
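Concretely, the model is \(\hat{y} = a + b x + c x^2 + d x^3\) and the loss is \(L = \sum_i (\hat{y}_i - y_i)^2\). The manual backward pass in the code below is just the chain rule applied to this loss:
\[\frac{\partial L}{\partial a} = \sum_i 2(\hat{y}_i - y_i),\qquad \frac{\partial L}{\partial b} = \sum_i 2(\hat{y}_i - y_i)\,x_i,\qquad \frac{\partial L}{\partial c} = \sum_i 2(\hat{y}_i - y_i)\,x_i^2,\qquad \frac{\partial L}{\partial d} = \sum_i 2(\hat{y}_i - y_i)\,x_i^3.\]
Each coefficient is then updated by plain gradient descent, e.g. \(a \leftarrow a - \eta\,\partial L/\partial a\), with learning rate \(\eta = 10^{-6}\).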
A PyTorch Tensor is conceptually identical to a numpy array: it knows nothing about deep learning, computational graphs, or gradients, and is just a generic n-dimensional array to be used for arbitrary numeric computation.
The biggest difference between a numpy array and a PyTorch Tensor is that a PyTorch Tensor can run on either CPU or GPU. To run operations on the GPU, simply construct the Tensor on a CUDA device (or move an existing Tensor there).
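To make the comparison concrete, here is a minimal sketch (not part of the tutorial script; it uses only standard numpy and torch calls) showing the two behaving alike, and a Tensor moving to the GPU:
import numpy as np
import torch

x_np = np.linspace(0.0, 1.0, 5)    # plain n-dimensional numpy array
x_t = torch.linspace(0.0, 1.0, 5)  # PyTorch Tensor with the same semantics

print(np.sin(x_np) * 2.0)    # numpy computes on the CPU
print(torch.sin(x_t) * 2.0)  # the same elementwise math on a Tensor

# Unlike a numpy array, a Tensor can be placed on a GPU:
if torch.cuda.is_available():
    x_gpu = x_t.to("cuda")          # copy the Tensor to the GPU
    print(torch.sin(x_gpu).device)  # operations on x_gpu run on the GPU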
A sample run of the script below prints the iteration number and the loss every 100 steps, followed by the fitted polynomial:
99 292.73968505859375
199 204.3029022216797
299 143.56146240234375
399 101.79446411132812
499 73.0423583984375
599 53.22780227661133
699 39.55778503417969
799 30.116783142089844
899 23.58966827392578
999 19.072429656982422
1099 15.943032264709473
1199 13.77301025390625
1299 12.26677131652832
1399 11.22032356262207
1499 10.492674827575684
1599 9.986249923706055
1699 9.633503913879395
1799 9.387591361999512
1899 9.216033935546875
1999 9.096254348754883
Result: y = -0.01618683896958828 + 0.8502035140991211 x + 0.0027924999594688416 x^2 + -0.09240050613880157 x^3
Because \(\sin(x)\) is an odd function, the even coefficients \(a\) and \(c\) converge toward zero while \(b\) and \(d\) carry the fit.
The complete script:
import torch
import math

dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0")  # Uncomment this to run on GPU

# Create random input and output data
x = torch.linspace(-math.pi, math.pi, 2000, device=device, dtype=dtype)
y = torch.sin(x)

# Randomly initialize weights
a = torch.randn((), device=device, dtype=dtype)
b = torch.randn((), device=device, dtype=dtype)
c = torch.randn((), device=device, dtype=dtype)
d = torch.randn((), device=device, dtype=dtype)

learning_rate = 1e-6
for t in range(2000):
    # Forward pass: compute predicted y
    y_pred = a + b * x + c * x ** 2 + d * x ** 3

    # Compute and print loss
    loss = (y_pred - y).pow(2).sum().item()
    if t % 100 == 99:
        print(t, loss)

    # Backprop to compute gradients of a, b, c, d with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_a = grad_y_pred.sum()
    grad_b = (grad_y_pred * x).sum()
    grad_c = (grad_y_pred * x ** 2).sum()
    grad_d = (grad_y_pred * x ** 3).sum()

    # Update weights using gradient descent
    a -= learning_rate * grad_a
    b -= learning_rate * grad_b
    c -= learning_rate * grad_c
    d -= learning_rate * grad_d

print(f'Result: y = {a.item()} + {b.item()} x + {c.item()} x^2 + {d.item()} x^3')
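As an optional sanity check (not part of the original example; it reuses the x, a, b, c, d tensors left in scope by the training loop), you can measure how closely the fitted polynomial tracks the true \(\sin\) curve:
# Maximum absolute error of the fitted polynomial over [-pi, pi]
y_fit = a + b * x + c * x ** 2 + d * x ** 3
print('max abs error:', (y_fit - torch.sin(x)).abs().max().item())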