Multi Layer Perceptron

Simple feedforward Multilayer perceptron models

MLP

 MLP (n_in:int=784, n_h:int=64, n_out:int=10, dropout:float=0.2)

*Base class for all neural network modules.

Your models should also subclass this class.

Modules can also contain other Modules, allowing to nest them in a tree structure. You can assign the submodules as regular attributes::

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and will have their parameters converted too when you call :meth:to, etc.

.. note:: As per the example above, an __init__() call to the parent class must be made before assignment on the child.

ivar training: Boolean represents whether this module is in training or evaluation mode. :vartype training: bool*
	Type	Default	Details
n_in	int	784	input dimension e.g. (H,W) for image
n_h	int	64	hidden dimension
n_out	int	10	output dimension (= number of classes for classification)
dropout	float	0.2
Returns	None

Usage

image = torch.rand((5, 28*28))
mlp = MLP(n_in=28*28, n_h=64, n_out=10, dropout=0.1)
out = mlp(image)
print(out.shape)
# cfg = OmegaConf.load('../config/model/image/mlpx.yaml')
# model = instantiate(cfg.nnet)
# out = model(image)
# print(out.shape)

[00:06:00] INFO - MLP: init
[00:06:00] INFO - MLP: init

torch.Size([5, 10])
torch.Size([5, 10])

Training

# load from config file
cfg = OmegaConf.load('../config/image/data/mnist.yaml')
datamodule = instantiate(cfg.datamodule)
datamodule.prepare_data()
datamodule.setup()

x = datamodule.test_ds[0][0] # (C, H, W)
label = datamodule.test_ds[0][1] #(int)
print("original shape (C,H,W): ", x.shape)
print("reshape (C,HxW): ", x.view(x.size(0), -1).shape)
print(x[0][1])

# using nimrod datamodule
train_loader = datamodule.train_dataloader()
val_loader = datamodule.val_dataloader()
test_loader = datamodule.test_dataloader()

[16:43:14] INFO - Init ImageDataModule for mnist
[16:43:17] INFO - loading dataset mnist with args () from split train
[16:43:24] INFO - loading dataset mnist with args () from split test
[16:43:26] INFO - split train into train/val [0.8, 0.2]
[16:43:26] INFO - train: 48000 val: 12000, test: 10000

original shape (C,H,W):  torch.Size([1, 28, 28])
reshape (C,HxW):  torch.Size([1, 784])
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0.])

Training loop

device = get_device()

# data
cfg = OmegaConf.load('../config/image/data/mnist.yaml')
cfg.batch_size = 2048
datamodule = instantiate(cfg.datamodule)
datamodule.prepare_data()
datamodule.setup()

# model
model = mlp.to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)


n_epochs = 2
losses = []
lrs = []
current_step = 0
steps_per_epoch = len(datamodule.train_ds) // cfg.datamodule.batch_size
total_steps = steps_per_epoch * n_epochs
print(f"steps_per_epoch: {steps_per_epoch}, total_steps: {total_steps}")

for epoch in range(n_epochs):
    model.train()
    for images, labels in datamodule.train_dataloader():
        optimizer.zero_grad()
        images = images.view(-1, 28*28)
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        loss = criterion(outputs, labels)        
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
        current_lr = optimizer.param_groups[0]['lr']
        lrs.append(current_lr)
        if not (current_step % 100):
            print(f"Loss {loss.item():.4f}, Current LR: {current_lr:.10f}, Step: {current_step}/{total_steps}")
        current_step += 1

    model.eval()
    with torch.no_grad():
        correct = 0
        total = 0
        for images, labels in datamodule.test_dataloader():
            # model expects input (B,H*W)
            images = images.view(-1, 28*28).to(device)
            images = images.to(device)
            labels = labels.to(device)
            # Pass the input through the model
            outputs = model(images)
            # Get the predicted labels
            _, predicted = torch.max(outputs.data, 1)

            # Update the total and correct counts
            total += labels.size(0)
            correct += (predicted == labels).sum()

        # Print the accuracy
        print(f"Epoch {epoch + 1}: Accuracy = {100 * correct / total:.2f}%")

[16:43:26] INFO - Using device: mps
[16:43:26] INFO - Init ImageDataModule for mnist

CPU times: user 1 μs, sys: 0 ns, total: 1 μs
Wall time: 2.62 μs

[16:43:28] INFO - loading dataset mnist with args () from split train
[16:43:35] INFO - loading dataset mnist with args () from split test
[16:43:37] INFO - split train into train/val [0.8, 0.2]
[16:43:37] INFO - train: 48000 val: 12000, test: 10000

steps_per_epoch: 750, total_steps: 1500
Loss 2.2913, Current LR: 0.0010000000, Step: 0/1500
Loss 0.6561, Current LR: 0.0010000000, Step: 100/1500
Loss 0.5902, Current LR: 0.0010000000, Step: 200/1500
Loss 0.4121, Current LR: 0.0010000000, Step: 300/1500
Loss 0.4594, Current LR: 0.0010000000, Step: 400/1500
Loss 0.1972, Current LR: 0.0010000000, Step: 500/1500
Loss 0.2817, Current LR: 0.0010000000, Step: 600/1500
Loss 0.2748, Current LR: 0.0010000000, Step: 700/1500
Epoch 1: Accuracy = 93.09%
Loss 0.2815, Current LR: 0.0010000000, Step: 800/1500
Loss 0.5310, Current LR: 0.0010000000, Step: 900/1500
Loss 0.1237, Current LR: 0.0010000000, Step: 1000/1500
Loss 0.2789, Current LR: 0.0010000000, Step: 1100/1500
Loss 0.3667, Current LR: 0.0010000000, Step: 1200/1500
Loss 0.1536, Current LR: 0.0010000000, Step: 1300/1500
Loss 0.4022, Current LR: 0.0010000000, Step: 1400/1500
Epoch 2: Accuracy = 94.73%

# plt.figure(1)
# plt.subplot(211)
plt.ylabel('loss')
plt.xlabel('step')
plt.plot(losses)
# plt.subplot(212)
# plt.ylabel('lr')
# plt.xlabel('step')
# plt.plot(lrs)

MLP_X

source

MLP_X

 MLP_X (nnet:__main__.MLP, num_classes:int,
        optimizer:Callable[...,torch.optim.optimizer.Optimizer],
        scheduler:Optional[Callable[...,Any]]=None)

Helper class that provides a standard way to create an ABC using inheritance.

	Type	Default	Details
nnet	MLP		mlp neural net
num_classes	int		number of classes
optimizer	Callable		optimizer
scheduler	Optional	None	scheduler

Usage

cfg = OmegaConf.load('../config/model/image/mlpx.yaml')
model = instantiate(cfg.nnet)
b = torch.rand((16,1, 28*28))
y = model(b)
print(y.shape)

[13:18:42] INFO - MLP: init

torch.Size([16, 1, 10])

Nimrod training

MAX_EPOCHS = 5
# data module config
cfg = OmegaConf.load('../config/image/data/mnist.yaml')
cfg.datamodule.batch_size = 512
cfg.datamodule.num_workers = 0
cfg.datamodule.pin_memory = True
datamodule = instantiate(cfg.datamodule)
datamodule.prepare_data()
datamodule.setup()

# lr monitor
cfg = OmegaConf.load('../config/callbacks/learning_rate_monitor.yaml')
lr_monitor = instantiate(cfg.learning_rate_monitor)

# model
cfg = OmegaConf.load('../config/model/image/mlp.yaml')

trainer = Trainer(
    accelerator="auto",
    min_epochs=1,
    max_epochs=MAX_EPOCHS,
    logger=CSVLogger("logs", name="mnist_mlp"),
    callbacks=[lr_monitor],
    check_val_every_n_epoch=1
    )


tuner = Tuner(trainer)
lr_finder = tuner.lr_find(
    model,
    datamodule=datamodule,
    min_lr=1e-6,
    max_lr=1.0,
    num_training=100,  # number of iterations
    # attr_name="optimizer.lr",
)
fig = lr_finder.plot(suggest=True)
plt.show()
print(f"Suggested learning rate: {lr_finder.suggestion()}")

[16:44:24] INFO - Init ImageDataModule for mnist
[16:44:26] INFO - loading dataset mnist with args () from split train
[16:44:33] INFO - loading dataset mnist with args () from split test
[16:44:35] INFO - split train into train/val [0.8, 0.2]
[16:44:35] INFO - train: 48000 val: 12000, test: 10000
GPU available: True (mps), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[23], line 31
     20 trainer = Trainer(
     21     accelerator="auto",
     22     min_epochs=1,
   (...)
     26     check_val_every_n_epoch=1
     27     )
     30 tuner = Tuner(trainer)
---> 31 lr_finder = tuner.lr_find(
     32     model,
     33     datamodule=datamodule,
     34     min_lr=1e-6,
     35     max_lr=1.0,
     36     num_training=100,  # number of iterations
     37     # attr_name="optimizer.lr",
     38 )
     39 fig = lr_finder.plot(suggest=True)
     40 plt.show()

File ~/miniforge3/envs/nimrod/lib/python3.11/site-packages/lightning/pytorch/tuner/tuning.py:180, in Tuner.lr_find(self, model, train_dataloaders, val_dataloaders, dataloaders, datamodule, method, min_lr, max_lr, num_training, mode, early_stop_threshold, update_attr, attr_name)
    177 lr_finder_callback._early_exit = True
    178 self._trainer.callbacks = [lr_finder_callback] + self._trainer.callbacks
--> 180 self._trainer.fit(model, train_dataloaders, val_dataloaders, datamodule)
    182 self._trainer.callbacks = [cb for cb in self._trainer.callbacks if cb is not lr_finder_callback]
    184 return lr_finder_callback.optimal_lr

File ~/miniforge3/envs/nimrod/lib/python3.11/site-packages/lightning/pytorch/trainer/trainer.py:532, in Trainer.fit(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    498 def fit(
    499     self,
    500     model: "pl.LightningModule",
   (...)
    504     ckpt_path: Optional[_PATH] = None,
    505 ) -> None:
    506     r"""Runs the full optimization routine.
    507 
    508     Args:
   (...)
    530 
    531     """
--> 532     model = _maybe_unwrap_optimized(model)
    533     self.strategy._lightning_module = model
    534     _verify_strategy_supports_compile(model, self.strategy)

File ~/miniforge3/envs/nimrod/lib/python3.11/site-packages/lightning/pytorch/utilities/compile.py:111, in _maybe_unwrap_optimized(model)
    109     return model
    110 _check_mixed_imports(model)
--> 111 raise TypeError(
    112     f"`model` must be a `LightningModule` or `torch._dynamo.OptimizedModule`, got `{type(model).__qualname__}`"
    113 )

TypeError: `model` must be a `LightningModule` or `torch._dynamo.OptimizedModule`, got `MLP`

# 1-cycle sched
cfg = OmegaConf.load('../config/model/image/mlp.yaml')
cfg.scheduler.total_steps = len(datamodule.train_dataloader()) * MAX_EPOCHS
cfg.scheduler.max_lr = lr_finder.suggestion()
model = instantiate(cfg)
trainer.fit(model, datamodule.train_dataloader(), datamodule.val_dataloader())

[19:57:34] INFO - MLP: init
[19:57:34] INFO - MLP_X init
[19:57:34] INFO - Classifier: init
[19:57:47] INFO - Optimizer: AdamW (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.001
    maximize: False
    weight_decay: 1e-05
)
[19:57:47] INFO - Scheduler: <torch.optim.lr_scheduler.OneCycleLR object>

  | Name         | Type               | Params | Mode 
------------------------------------------------------------
0 | loss         | CrossEntropyLoss   | 0      | train
1 | train_acc    | MulticlassAccuracy | 0      | train
2 | val_acc      | MulticlassAccuracy | 0      | train
3 | test_acc     | MulticlassAccuracy | 0      | train
4 | train_loss   | MeanMetric         | 0      | train
5 | val_loss     | MeanMetric         | 0      | train
6 | test_loss    | MeanMetric         | 0      | train
7 | val_acc_best | MaxMetric          | 0      | train
8 | nnet         | MLP                | 50.9 K | train
------------------------------------------------------------
50.9 K    Trainable params
0         Non-trainable params
50.9 K    Total params
0.204     Total estimated model params size (MB)
14        Modules in train mode
0         Modules in eval mode

`Trainer.fit` stopped: `max_epochs=5` reached.

plt.plot(lr_monitor.lrs['lr-AdamW'])

csv_path = f"{trainer.logger.log_dir}/metrics.csv"
metrics = pd.read_csv(csv_path)
metrics.head(5)

	epoch	lr-AdamW	step	train/acc_epoch	train/acc_step	train/loss_epoch	train/loss_step	val/acc	val/acc_best	val/loss
0	NaN	0.002384	49	NaN	NaN	NaN	NaN	NaN	NaN	NaN
1	0.0	NaN	49	NaN	0.800781	NaN	0.653588	NaN	NaN	NaN
2	NaN	0.005817	99	NaN	NaN	NaN	NaN	NaN	NaN	NaN
3	0.0	NaN	99	NaN	0.894531	NaN	0.378841	NaN	NaN	NaN
4	0.0	NaN	117	NaN	NaN	NaN	NaN	0.924917	0.924917	0.262162

plt.figure()
plt.plot(metrics['step'], metrics['train/loss_step'], 'b.-')
plt.plot(metrics['step'], metrics['val/loss'],'r.-')
plt.show()

trainer.test(model, datamodule.test_dataloader())

/Users/slegroux/miniforge3/envs/nimrod/lib/python3.11/site-packages/lightning/pytorch/trainer/connectors/data_connector.py:424: The 'test_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃        Test metric        ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│         test/acc          │    0.9690999984741211     │
│         test/loss         │    0.10579969733953476    │
└───────────────────────────┴───────────────────────────┘

[{'test/loss': 0.10579969733953476, 'test/acc': 0.9690999984741211}]