- The tutorial has links to Jupyter notebooks and Google Colab notebooks!
- If running in Colab, be sure to enable the GPU (Runtime > Change runtime type)
Summaries / key takeaways
Make sure you know how to do the following (a quick sketch follows this list):
- Construct empty, random, all-ones, and all-zeros tensors
- Specify the tensor datatype
- Create a tensor with the same shape as another tensor, but a different datatype
- Look up tensor operations
- Add, subtract, and multiply tensors
- Convert between NumPy arrays and tensors
- Move tensors between CPU and GPU
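A minimal sketch of those operations (the shapes and values here are arbitrary examples, not from the tutorial):
import numpy as np
import torch

# Construct empty, random, all-ones, and all-zeros tensors
x = torch.empty(3, 4)            # uninitialized memory
r = torch.rand(3, 4)             # uniform random in [0, 1)
ones = torch.ones(3, 4)
zeros = torch.zeros(3, 4)

# Specify the datatype
longs = torch.zeros(3, 4, dtype=torch.long)

# Same shape as another tensor, but a different datatype
floats = torch.randn_like(longs, dtype=torch.float)

# Add, subtract, and multiply (element-wise); see the torch docs for the full list of ops
total = ones + r
diff = ones - r
prod = ones * r

# Convert between NumPy arrays and tensors
arr = total.numpy()
t = torch.from_numpy(np.ones(5))

# Move tensors between CPU and GPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
on_gpu = total.to(device)
back_on_cpu = on_gpu.to("cpu")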
- Useful takeaways from this tutorial:
- requires_grad determines whether the gradient is computed for a tensor, and with torch.no_grad() temporarily turns gradient tracking off (sketch below).
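A small sketch of both in action (nothing here beyond plain torch):
import torch

x = torch.ones(2, 2, requires_grad=True)   # requires_grad: track operations on x
y = (x * 3).sum()
y.backward()                                # fills in x.grad
print(x.grad)                               # a 2x2 tensor of 3s

with torch.no_grad():                       # gradients are not tracked inside this block
    z = x * 3
print(z.requires_grad)                      # False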
- Don’t know how important the other stuff is. Part 3 will go into how weight updates etc. are done in practice.
- Notice how they define the network as a class.
- Dense (fully connected) layers are called “Linear” (nn.Linear)
- What the network actually does is in “forward”
- Layers that don’t need weights (e.g. max pooling) aren’t members of the class. E.g. F.max_pool2d is used in forward but not defined in __init__.
- .parameters() gets the weights (see the sketch below)
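A minimal sketch of that pattern (the layer sizes and the 28x28 input are made-up examples):
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Layers with weights live in __init__
        self.conv = nn.Conv2d(1, 8, kernel_size=3)
        self.fc = nn.Linear(8 * 13 * 13, 10)    # a dense (“Linear”) layer

    def forward(self, x):
        # What the network actually does is here; max pooling has no weights,
        # so it is called from F rather than stored as a member
        x = F.max_pool2d(F.relu(self.conv(x)), 2)
        x = torch.flatten(x, 1)
        return self.fc(x)

net = Net()
out = net(torch.randn(1, 1, 28, 28))
print(sum(p.numel() for p in net.parameters()))  # .parameters() yields the weights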
- Using zero_grad at the start of each training loop iteration is important. I copied the training loop from the end of the document; it’s worth remembering:
optimizer.zero_grad() # Zero gradients
output = net(input) # Forward pass
loss = criterion(output, target) # Compute loss
loss.backward() # Backprop
optimizer.step() # Update step
- DataLoader and Dataset. Basically, you create a Dataset that provides individual examples, and a DataLoader wraps it to handle shuffling and batching (sketch below).
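A tiny sketch of the pattern (SquaresDataset is a made-up example, not from the tutorial):
import torch
from torch.utils.data import Dataset, DataLoader

class SquaresDataset(Dataset):
    # Provides (x, x**2) pairs; only __len__ and __getitem__ are required
    def __init__(self, n=100):
        self.x = torch.arange(n, dtype=torch.float32).unsqueeze(1)
        self.y = self.x ** 2

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]

loader = DataLoader(SquaresDataset(), batch_size=16, shuffle=True)
for xb, yb in loader:    # each iteration yields a shuffled batch
    pass                 # training step would go here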
- DataParallel splits batches and distributes them among GPUs. This can be useful since CSUA allows 2 GPUs per person by default!
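A sketch of the wrapping step (net here is just a stand-in model; the rest of the training loop stays the same):
import torch
import torch.nn as nn

net = nn.Linear(10, 2)                     # stand-in for any model

if torch.cuda.device_count() > 1:          # e.g. the 2 GPUs per person on CSUA machines
    net = nn.DataParallel(net)             # splits each batch across the visible GPUs
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = net.to(device)

x = torch.randn(64, 10).to(device)
out = net(x)                               # the batch is scattered and the outputs gathered automatically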