Hello PyTorch
Published on Mar 9, 2025 in AI Fundamentals, PyTorch
In this short article, I'll show you how to use PyTorch to create a simple neural network. If you haven't already, I recommend reading my two previous articles on neural networks, part 1 and part 2. Here, we'll simply use PyTorch to implement the network from the second article.
Here’s the code:
import torch
import torch.nn as nn
import torch.optim as optim

class SimpleMLP(nn.Module):
    def __init__(self):
        super(SimpleMLP, self).__init__()
        # Define the layers
        self.hidden_layer = nn.Linear(2, 2)  # Hidden layer with 2 inputs and 2 neurons
        self.output_layer = nn.Linear(2, 1)  # Output layer with 2 inputs and 1 neuron
        self.sigmoid = nn.Sigmoid()          # Sigmoid activation function

    def forward(self, x):
        # Forward pass through the network
        hidden_output = self.sigmoid(self.hidden_layer(x))
        final_output = self.sigmoid(self.output_layer(hidden_output))
        return final_output

input_vectors = torch.tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]], dtype=torch.float32)
output_vectors = torch.tensor([[0.0], [1.0], [1.0], [0.0]], dtype=torch.float32)

network = SimpleMLP()

# Define the loss function and optimizer
criterion = nn.MSELoss()                             # Mean Squared Error loss
optimizer = optim.SGD(network.parameters(), lr=0.1)  # Stochastic Gradient Descent

# Train the network
max_epochs = 1000000
for epoch in range(max_epochs):
    optimizer.zero_grad()                      # Set all gradients to zero
    outputs = network(input_vectors)           # Forward pass
    loss = criterion(outputs, output_vectors)  # Calculate the loss
    loss.backward()                            # Backward pass
    optimizer.step()                           # Update the weights
    if (epoch + 1) % 1000 == 0:
        print(f"Epoch {epoch + 1}: Loss = {loss.item()}")
    if loss.item() < 0.0001:
        break

# Test the network
with torch.no_grad():  # Disable gradient tracking (no need to compute gradients for inference)
    for inputs in input_vectors:
        output = network(inputs)
        print(f"Input: {inputs.numpy()}, Output: {round(output.item())}")
If you've understood the previous articles, there isn't much left to explain.
When you create a neural network in PyTorch, your class has to inherit from nn.Module.
We then define the network layer by layer rather than neuron by neuron.
nn.Linear applies a linear transformation (a weight matrix plus a bias) to its input vector. It is the simplest kind of neuron layer.
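To make this concrete, here is a minimal sketch (the sizes are purely illustrative) showing that an nn.Linear layer stores a weight matrix and a bias vector and applies them to the input:

import torch
import torch.nn as nn

layer = nn.Linear(2, 3)            # 2 inputs, 3 neurons
x = torch.tensor([1.0, 2.0])

# The layer applies its learnable weight and bias: x @ W^T + b
manual = x @ layer.weight.T + layer.bias
print(layer(x))                    # same values as `manual`
print(layer.weight.shape)          # torch.Size([3, 2])
print(layer.bias.shape)            # torch.Size([3])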
There are also convolution and pooling layers to create CNNs, recurrent layers to create RNNs, normalisation layers, dropout layers that help avoid overfitting during training (when the network memorises the data instead of learning the underlying pattern), and even ready-to-use transformer layers…
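To give an idea of what these look like, here is a small sketch of how a few of those layer types are declared; the sizes are arbitrary and only meant to show the constructors:

import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)  # convolution layer (CNNs)
pool = nn.MaxPool2d(kernel_size=2)                               # pooling layer
rnn = nn.LSTM(input_size=10, hidden_size=20)                     # recurrent layer (RNNs)
norm = nn.BatchNorm2d(num_features=16)                           # normalisation layer
drop = nn.Dropout(p=0.5)                                         # dropout layer (against overfitting)
attention = nn.TransformerEncoderLayer(d_model=64, nhead=8)      # ready-to-use transformer layer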
The forward method specifies how to compute the network's output from its input.
Training is quite simple, since everything is already implemented. Of course, it can get more complex: you can freeze the weights of certain layers or enable dropout during training. But generally speaking, PyTorch makes good use of object-oriented design, and training a complex network is as simple as training our “Hello World” network. During training, PyTorch automatically and recursively updates all submodules of a module.
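For example, freezing a layer simply means disabling gradients on its parameters. Here is a minimal sketch, reusing the SimpleMLP class and imports from the code above:

network = SimpleMLP()

# Freeze the hidden layer: its weights will no longer be updated during training
for param in network.hidden_layer.parameters():
    param.requires_grad = False

# Only give the remaining trainable parameters to the optimizer
optimizer = optim.SGD((p for p in network.parameters() if p.requires_grad), lr=0.1)

# parameters() recursively collects the parameters of every submodule
print(sum(p.numel() for p in network.parameters()))  # 9 in total: 6 (hidden) + 3 (output)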
In our example, using PyTorch serves no purpose other than learning how to use PyTorch. In fact, this code is much slower than the previous from-scratch code. The real benefit of PyTorch is being able to use a GPU to speed up tensor computations. Tensors are a generalisation of vectors and matrices: vectors are tensors of order 1, matrices are tensors of order 2… But for a network with three neurons, that's not very useful.
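If you do have a GPU, moving the computation there is just a matter of sending the model and the tensors to the right device. A minimal sketch, again reusing the code above and assuming a CUDA-capable machine:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

network = SimpleMLP().to(device)      # move the model's parameters to the GPU
inputs = input_vectors.to(device)     # move the data to the same device
targets = output_vectors.to(device)

outputs = network(inputs)             # the forward pass now runs on the GPU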
But you have to start somewhere.