Tutorial 1: PyTorch#
Week 1, Day 1: Basics and PyTorch
By Neuromatch Academy
Content creators: Shubh Pachchigar, Vladimir Haltakov, Matthew Sargent, Konrad Kording
Content reviewers: Deepak Raya, Siwei Bai, Kelson Shilling-Scrivo
Content editors: Anoop Kulkarni, Spiros Chavlis
Production editors: Arush Tagade, Spiros Chavlis
Tutorial Objectives#
We have a few specific objectives for this tutorial:
Learn about PyTorch and tensors
Tensor Manipulations
Data Loading
GPUs and CUDA Tensors
Train NaiveNet
Get to know your pod
Start thinking about the course as a whole
Setup#
Throughout your Neuromatch tutorials, most (probably all!) notebooks contain setup cells. These cells import the required Python packages (e.g., PyTorch, NumPy), set global or environment variables, and load helper functions for things like plotting. In some tutorials, you will notice that we install some dependencies even if they are preinstalled on Google Colab or Kaggle. This happens because we have added automation to our repository through GitHub Actions.
Be sure to run all of the cells in the setup section. Feel free to expand them and have a look at what you are loading in, but you should be able to fulfill the learning objectives of every tutorial without having to look at these cells.
If you start building your own projects on this code base, we highly recommend looking at these cells in more detail.
Install dependencies#
# @title Install dependencies
!pip install pandas --quiet
Install and import feedback gadget#
# @title Install and import feedback gadget
!pip3 install vibecheck datatops --quiet
from vibecheck import DatatopsContentReviewContainer
def content_review(notebook_section: str):
    return DatatopsContentReviewContainer(
        "",  # No text prompt
        notebook_section,
        {
            "url": "https://pmyvdlilci.execute-api.us-east-1.amazonaws.com/klab",
            "name": "neuromatch_dl",
            "user_key": "f379rz8y",
        },
    ).render()
feedback_prefix = "W1D1_T1"
# Imports
import time
import random
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# PyTorch libraries
import torch
from torch import nn
from torchvision import datasets
from torch.utils.data import DataLoader
from torchvision.transforms import ToTensor
Figure Settings#
# @title Figure Settings
import logging
logging.getLogger('matplotlib.font_manager').disabled = True
import ipywidgets as widgets
%config InlineBackend.figure_format = 'retina'
plt.style.use("https://raw.githubusercontent.com/NeuromatchAcademy/content-creation/main/nma.mplstyle")
Helper Functions#
# @title Helper Functions
def checkExercise1(A, B, C, D):
    """
    Helper function for checking Exercise 1.

    Args:
        A: torch.Tensor
            Torch Tensor of shape (20, 21) consisting of ones.
        B: torch.Tensor
            Torch Tensor of size([3,4])
        C: torch.Tensor
            Torch Tensor of size([20,21])
        D: torch.Tensor
            Torch Tensor of size([19])

    Returns:
        Nothing.
    """
    # The messages are f-strings so that the received and expected values are actually printed
    assert torch.equal(A.to(int), torch.ones(20, 21).to(int)), f"Got: {A} \n Expected: {torch.ones(20, 21)} (shape: {torch.ones(20, 21).shape})"
    assert np.array_equal(B.numpy(), np.vander([1, 2, 3], 4)), f"Got: {B} \n Expected: {np.vander([1, 2, 3], 4)} (shape: {np.vander([1, 2, 3], 4).shape})"
    assert C.shape == (20, 21), f"Got: {C} \n Expected (shape: {(20, 21)})"
    assert torch.equal(D, torch.arange(4, 41, step=2)), f"Got {D} \n Expected: {torch.arange(4, 41, step=2)} (shape: {torch.arange(4, 41, step=2).shape})"
    print("All correct")
def timeFun(f, dim, iterations, device='cpu'):
    """
    Helper function to calculate amount of time taken per instance on CPU/GPU

    Args:
        f: Callable
            Function to time; it is called as f(dim, device)
        dim: Integer
            Dimension argument passed on to f
        iterations: Integer
            Number of times f is called
        device: String
            Device on which respective computation is to be run

    Returns:
        Nothing
    """
    t_total = 0
    for _ in range(iterations):
        start = time.time()
        f(dim, device)
        end = time.time()
        t_total += end - start

    # The same message is printed for CPU and GPU runs
    print(f"time taken for {iterations} iterations of {f.__name__}({dim}, {device}): {t_total:.5f}")
Important note: Colab users
Scratch Code Cells
If you want to quickly try out something or take a look at the data, you can use scratch code cells. They allow you to run Python code, but will not mess up the structure of your notebook.
To open a new scratch cell go to Insert → Scratch code cell.
Section 1: Welcome to the Neuromatch Deep Learning course#
Time estimate: ~25mins
Video 1: Welcome and History#
This will be an intensive 3-week adventure. We will all learn Deep Learning (DL) in a group. Groups need standards. Read our Code of Conduct.
Submit your feedback#
# @title Submit your feedback
content_review(f"{feedback_prefix}_Welcome_and_History_Video")
Video 2: Why DL is cool#
Discuss with your pod: What do you hope to get out of this course? [in about 100 words]
Submit your feedback#
# @title Submit your feedback
content_review(f"{feedback_prefix}_Why_DL_is_cool_Video")
Section 2: The Basics of PyTorch#
Time estimate: ~2 hours 05 mins
PyTorch is a Python-based scientific computing package targeted at two sets of audiences:
A replacement for NumPy optimized for the power of GPUs
A deep learning platform that provides significant flexibility and speed
At its core, PyTorch provides a few key features:
A multidimensional Tensor object, similar to a NumPy array but with GPU acceleration.
An optimized autograd engine for automatically computing derivatives.
A clean, modular API for building and deploying deep learning models.
You can find more information about PyTorch in the Appendix.
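As a rough illustration of the three features listed above, the following sketch creates a tensor, lets autograd compute a derivative, and runs a tiny `nn.Linear` layer. The values and layer sizes are arbitrary choices made for illustration and are not part of the tutorial's exercises.

```python
import torch
from torch import nn

# 1. Tensor object: a multidimensional array, much like a NumPy array
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# 2. Autograd: y = sum(x^2), so dy/dx = 2 * x
y = (x ** 2).sum()
y.backward()
print(f"Gradient of y with respect to x: {x.grad}")

# 3. Modular API: a single linear layer mapping 3 inputs to 2 outputs
layer = nn.Linear(in_features=3, out_features=2)
out = layer(x.detach())
print(f"Output shape of the linear layer: {out.shape}")
```

The layer's weights are randomly initialised, so only the output shape, not the output values, is predictable here.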
Section 2.1: Creating Tensors#
Video 3: Making Tensors#
Submit your feedback#
# @title Submit your feedback
content_review(f"{feedback_prefix}_Making_Tensors_Video")
There are various ways of creating tensors, and when doing any real deep learning project, we will usually have to do so.
Construct tensors directly:
# We can construct a tensor directly from some common Python iterables,
# such as lists and tuples. Nested iterables can also be handled as long
# as the dimensions are compatible.
# tensor from a list
a = torch.tensor([0, 1, 2])
# tensor from a tuple of tuples
b = ((1.0, 1.1), (1.2, 1.3))
b = torch.tensor(b)
# tensor from a numpy array
c = np.ones([2, 3])
c = torch.tensor(c)
print(f"Tensor a: {a}")
print(f"Tensor b: {b}")
print(f"Tensor c: {c}")
Tensor a: tensor([0, 1, 2])
Tensor b: tensor([[1.0000, 1.1000],
[1.2000, 1.3000]])
Tensor c: tensor([[1., 1., 1.],
[1., 1., 1.]], dtype=torch.float64)
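Tensor `c` above has `dtype=torch.float64` because `torch.tensor` infers the dtype from its input, and NumPy defaults to 64-bit floats. The sketch below shows this, and also contrasts `torch.tensor`, which copies the data, with `torch.from_numpy`, which shares memory with the source array. The variable names are illustrative, and the last check assumes PyTorch's default dtype has not been changed.

```python
import numpy as np
import torch

arr = np.ones([2, 3])            # NumPy defaults to float64

copied = torch.tensor(arr)       # copies the data, keeps float64
shared = torch.from_numpy(arr)   # shares memory with `arr`

arr[0, 0] = 5.0                  # modify the NumPy array in place

print(f"dtype inferred from NumPy: {copied.dtype}")
print(f"copied[0, 0] after the change: {copied[0, 0].item()}")
print(f"shared[0, 0] after the change: {shared[0, 0].item()}")

# A Python list of floats gets PyTorch's default dtype (float32 unless changed)
from_list = torch.tensor([1.0, 2.0])
print(f"dtype inferred from a list: {from_list.dtype}")
```

Only the shared tensor reflects the in-place change, which is worth keeping in mind when mixing NumPy and PyTorch code.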
Some common tensor constructors:
# The numerical arguments we pass to these constructors
# determine the shape of the output tensor
x = torch.ones(5, 3)
y = torch.zeros(2)
z = torch.empty(1, 1, 5)
print(f"Tensor x: {x}")
print(f"Tensor y: {y}")
print(f"Tensor z: {z}")
Tensor x: tensor([[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.]])
Tensor y: tensor([0., 0.])
Tensor z: tensor([[[-3.4336e+34, 3.0897e-41, -3.3997e+34, 3.0897e-41, -1.5337e+08]]])
Notice that .empty() does not return zeros, but seemingly random numbers. Unlike .zeros(), which initialises the elements of the tensor with zeros, .empty() just allocates the memory and leaves whatever values were already there. It is hence a bit faster when you only need to allocate a tensor and will overwrite its contents later.
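A minimal sketch of that pattern, assuming you only need the memory and will overwrite every element before reading it:

```python
import torch

z = torch.empty(1, 1, 5)   # allocated but uninitialized: contents are arbitrary
print(f"Shape of z: {z.shape}")

# Overwrite every element before using the tensor
z.fill_(0.0)
print(f"z after fill_: {z}")

# After filling, z matches what torch.zeros would have returned
print(torch.equal(z, torch.zeros(1, 1, 5)))
```

Reading from an `.empty()` tensor before writing to it gives values that can differ from run to run, so treat it purely as pre-allocated storage.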
Creating random tensors and tensors like other tensors:
# There are also constructors for random numbers
# Uniform distribution
a = torch.rand(1, 3)
# Normal distribution
b = torch.randn(3, 4)
# There are also constructors that allow us to construct
# a tensor according to the above constructors, but with
# dimensions equal to another tensor.
c = torch.zeros_like(a)
d = torch.rand_like(c)
print(f"Tensor a: {a}")
print(f"Tensor b: {b}")
print(f"Tensor c: {c}")
print(f"Tensor d: {d}")
Tensor a: tensor([[0.4494, 0.8188, 0.0020]])
Tensor b: tensor([[ 0.3806, -0.6209, 0.9028, -0.2074],
[-0.3469, -1.1184, -0.2948, -1.6126],
[ 2.9554, -0.3908, 0.5899, 0.1914]])
Tensor c: tensor([[0., 0., 0.]])
Tensor d: tensor([[0.1873, 0.9869, 0.5485]])
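To make the behavior of these constructors concrete, the sketch below checks that `torch.rand` draws values in the interval [0, 1) and that the `_like` constructors copy the shape and dtype of the tensor they are given. The tensor `ref` is a hypothetical example input.

```python
import torch

ref = torch.ones(2, 3, dtype=torch.float64)

u = torch.rand(1000)           # uniform on [0, 1)
n = torch.randn(1000)          # standard normal: mean 0, variance 1

zeros = torch.zeros_like(ref)  # same shape and dtype as `ref`, filled with 0
noise = torch.rand_like(ref)   # same shape and dtype, uniform random values

print(f"rand range: [{u.min().item():.3f}, {u.max().item():.3f}]")
print(f"randn mean: {n.mean().item():.3f}")
print(f"zeros_like: shape {tuple(zeros.shape)}, dtype {zeros.dtype}")
print(f"rand_like:  shape {tuple(noise.shape)}, dtype {noise.dtype}")
```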
Reproducibility:
PyTorch Random Number Generator (RNG): You can use torch.manual_seed() to seed the RNG for all devices (both CPU and GPU):
import torch
torch.manual_seed(0)
For custom operators, you might need to set the Python seed as well:
import random
random.seed(0)
Random number generators in other libraries (e.g., NumPy):
import numpy as np
np.random.seed(0)
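As a quick check of the seeding calls above, the sketch below reseeds each generator with the same value and confirms that the draws repeat. The helper `draw_all` and the seed value 0 are illustrative choices, not part of the tutorial code.

```python
import random
import numpy as np
import torch


def draw_all(seed):
    """Seed the three generators and draw one sample from each."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    return random.random(), np.random.rand(3), torch.rand(3)


py_1, np_1, pt_1 = draw_all(0)
py_2, np_2, pt_2 = draw_all(0)

# Each comparison is True because the generators were reset to the same state
print(py_1 == py_2)
print(np.array_equal(np_1, np_2))
print(torch.equal(pt_1, pt_2))
```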
Here, we define a function called set_seed that does the job for you!
def set_seed(seed=None, seed_torch=True):
    """
    Function that controls randomness. NumPy and random modules must be imported.

    Args:
        seed : Integer
            A non-negative integer that defines the random state. Default is `None`.
        seed_torch : Boolean
            If `True` sets the random seed for pytorch tensors, so pytorch module
            must be imported. Default is `True`.

    Returns:
        Nothing.
    """
    if seed is None:
        seed = np.random.choice(2 ** 32)