How to build models with PyTorch
Category: AI & Machine Learning
Short answer
PyTorch is a Python-first deep learning framework that uses dynamic computation graphs and an intuitive object-oriented API for model building.
Steps
- Install PyTorch and confirm GPU support by calling torch.cuda.is_available(), which returns True when a CUDA device is usable.
- Create tensors and move them to the GPU with the .to() method, for example tensor.to("cuda").
- Define models by subclassing nn.Module and implementing the forward method.
- Choose an optimizer from torch.optim and a loss function from torch.nn.
- Write a training loop that performs forward passes, loss computation, backward passes, and optimizer steps.
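The steps above can be sketched end to end as follows. This is a minimal illustration, not a recommended training setup: the synthetic data (y = 3x + 1 with a little noise), the class name TinyRegressor, the learning rate, and the epoch count are all assumptions chosen to keep the example small.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # make the sketch reproducible

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Synthetic regression data (hypothetical): y = 3x + 1 plus a little noise.
x = torch.linspace(-1, 1, 64).unsqueeze(1).to(device)
y = 3 * x + 1 + 0.01 * torch.randn_like(x)


# TinyRegressor is an illustrative name, not part of PyTorch.
class TinyRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, inputs):
        return self.linear(inputs)


model = TinyRegressor().to(device)  # parameters live on the same device as the data
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for epoch in range(200):
    optimizer.zero_grad()          # clear gradients left over from the previous step
    prediction = model(x)          # forward pass
    loss = loss_fn(prediction, y)  # loss computation
    loss.backward()                # backward pass: populate .grad on each parameter
    optimizer.step()               # parameter update
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Real projects would normally iterate over mini-batches from a torch.utils.data.DataLoader instead of passing the whole dataset in one call; the order of the five calls inside the loop stays the same.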
Tips
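- Use torch.nn.parallel.DistributedDataParallel for multi-GPU training. torch.nn.DataParallel needs less setup but is usually slower, and the PyTorch documentation recommends DistributedDataParallel even on a single machine.
- Leverage torchvision for standard datasets and transforms. torchtext offers similar utilities for text, but it is no longer actively developed, so check compatibility with your PyTorch version before relying on it.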
- Profile code with PyTorch Profiler to identify bottlenecks.
- Use torch.compile (available in PyTorch 2.0 and later) for graph-level optimizations.
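As one illustration of the profiling tip, the sketch below records CPU activity for a few forward passes and prints the most expensive operators. The model architecture, batch size, and iteration count are hypothetical; on a GPU machine you would typically also pass ProfilerActivity.CUDA.

```python
import torch
import torch.nn as nn
from torch.profiler import profile, ProfilerActivity

# A small stand-in model (hypothetical sizes) to give the profiler some work.
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
batch = torch.randn(32, 784)

# Record CPU-side operator timings for a handful of forward passes.
with profile(activities=[ProfilerActivity.CPU]) as prof:
    with torch.no_grad():
        for _ in range(5):
            model(batch)

# Summarize the recorded events, most expensive operators first.
report = prof.key_averages().table(sort_by="cpu_time_total", row_limit=5)
print(report)
```

The exact operator names and timings in the table depend on the PyTorch build and hardware, so treat the output as a starting point for investigation.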
Common issues
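- Forgetting to call optimizer.zero_grad() before each backward pass. PyTorch accumulates gradients by default, so stale gradients from earlier steps get added to the new ones.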
- Mismatched tensor shapes in layer definitions causing runtime errors.
- CPU-GPU tensor mismatches when not explicitly moving data.
- Memory fragmentation on the GPU leading to out-of-memory errors even when the total free memory looks sufficient.
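The first three issues are easy to reproduce on a small scale. The sketch below is only an illustration with made-up tensors and layer sizes: it shows gradients accumulating when they are not zeroed, a shape mismatch raising RuntimeError, and the usual pattern of moving a layer and its inputs to the same device.

```python
import torch
import torch.nn as nn

# 1. Gradient accumulation: without zeroing, a second backward pass adds to .grad.
w = torch.tensor([1.0, 2.0], requires_grad=True)
(w * 3).sum().backward()
first = w.grad.clone()          # gradient of sum(3 * w) is 3 per element

(w * 3).sum().backward()        # no zeroing in between
accumulated = w.grad.clone()    # 3 + 3 = 6 per element

w.grad.zero_()                  # what optimizer.zero_grad() does for its parameters
(w * 3).sum().backward()
fresh = w.grad.clone()          # back to 3 per element

# 2. Shape mismatch: the layer expects 3 input features but receives 2.
layer = nn.Linear(3, 1)
try:
    layer(torch.ones(4, 2))
    shape_error_raised = False
except RuntimeError:
    shape_error_raised = True

# 3. Device mismatch: keep the module and its inputs on the same device.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
layer = nn.Linear(2, 1).to(device)
inputs = torch.ones(4, 2).to(device)
output = layer(inputs)

print(first, accumulated, fresh, shape_error_raised, output.shape)
```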
Example
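import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Single fully connected layer: 784 input features to 10 outputs.
        self.fc = nn.Linear(784, 10)

    def forward(self, x):
        # x is expected to have shape (batch_size, 784).
        return self.fc(x)

model = Net()
# Adam updates every parameter registered on the model.
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)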
This snippet defines a minimal PyTorch model, a single linear layer that maps 784 input features to 10 outputs, and demonstrates the nn.Module subclass pattern and optimizer setup.
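To see the model in use, the following sketch runs one optimization step on random data. The batch of 32 single-channel 28x28 inputs (784 values each once flattened), the random labels, and the choice of nn.CrossEntropyLoss are assumptions for illustration; the source does not specify a dataset or loss.

```python
import torch
import torch.nn as nn


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(784, 10)

    def forward(self, x):
        return self.fc(x)


model = Net()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
loss_fn = nn.CrossEntropyLoss()  # assumed loss for a 10-class problem

# Hypothetical batch: 32 random "images" and 32 random class labels in [0, 10).
images = torch.randn(32, 1, 28, 28)
labels = torch.randint(0, 10, (32,))

# Flatten each image to 784 features so it matches nn.Linear(784, 10).
logits = model(images.flatten(start_dim=1))
loss = loss_fn(logits, labels)

optimizer.zero_grad()  # clear any old gradients
loss.backward()        # compute gradients for fc.weight and fc.bias
optimizer.step()       # apply one Adam update

print(logits.shape, loss.item())
```

Flattening before the linear layer is what avoids the shape mismatch listed under common issues: passing the unflattened (32, 1, 28, 28) tensor would fail because its last dimension is 28, not 784.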