[22:06:36] INFO - Init ImageDataModule for fashion_mnist
[22:06:56] INFO - split train into train/val [0.8, 0.2]
[22:06:56] INFO - train: 48000, val: 12000, test: 10000
[22:06:57] INFO - ConvNetX: init
[22:06:57] INFO - Classifier: init
/Users/slegroux/miniforge3/envs/nimrod/lib/python3.11/site-packages/lightning/pytorch/utilities/parsing.py:208: Attribute 'nnet' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['nnet'])`.
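The warning above can be silenced by excluding the module attribute from hyperparameter checkpointing. A minimal sketch, assuming a LightningModule wrapper along the lines of the Classifier initialized above (the real constructor signature may differ):

import torch.nn as nn
from lightning import LightningModule

class Classifier(LightningModule):
    def __init__(self, nnet: nn.Module, lr: float = 1e-3):
        super().__init__()
        self.nnet = nnet
        # skip 'nnet' so the nn.Module is not duplicated into hparams
        self.save_hyperparameters(ignore=['nnet'])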
from lightning.pytorch.tuner import Tuner
import matplotlib.pyplot as plt

tuner = Tuner(trainer)
lr_finder = tuner.lr_find(
    model,
    datamodule=dm,
    min_lr=1e-6,
    max_lr=1.0,
    num_training=100,  # number of LR-sweep iterations
    # attr_name="optimizer.lr",
)
fig = lr_finder.plot(suggest=True)
plt.show()
print(f"Suggested learning rate: {lr_finder.suggestion()}")
[22:09:59] INFO - Optimizer: Adam (
Parameter Group 0
amsgrad: False
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.06
maximize: False
weight_decay: 0
)
[22:09:59] INFO - Scheduler: <torch.optim.lr_scheduler.ReduceLROnPlateau object>
/Users/slegroux/miniforge3/envs/nimrod/lib/python3.11/site-packages/lightning/pytorch/core/optimizer.py:316: The lr scheduler dict contains the key(s) ['monitor', 'strict'], but the keys will be ignored. You need to call `lr_scheduler.step()` manually in manual optimization.
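For context, the 'monitor' key is how ReduceLROnPlateau learns which logged metric to track under automatic optimization. A minimal sketch of a configure_optimizers that would produce the optimizer and scheduler logged above (the metric name "val/loss" is an assumption):

import torch
from lightning import LightningModule

class LitModel(LightningModule):  # hypothetical minimal module
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(784, 10)

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=0.06)
        # ReduceLROnPlateau needs a metric to watch, hence the 'monitor' key
        scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min")
        return {
            "optimizer": optimizer,
            "lr_scheduler": {
                "scheduler": scheduler,
                "monitor": "val/loss",  # assumed metric name
                "strict": True,
            },
        }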
/Users/slegroux/miniforge3/envs/nimrod/lib/python3.11/site-packages/lightning/pytorch/trainer/connectors/data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument to `num_workers=11` in the `DataLoader` to improve performance.
/Users/slegroux/miniforge3/envs/nimrod/lib/python3.11/site-packages/lightning/pytorch/trainer/connectors/data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument to `num_workers=11` in the `DataLoader` to improve performance.
`Trainer.fit` stopped: `max_steps=100` reached.
Learning rate set to 9.120108393559098e-06
Restoring states from the checkpoint path at /Users/slegroux/Projects/nimrod/nbs/.lr_find_5828b967-b82f-4e1a-bda1-997d275f4d03.ckpt
Restored all states from the checkpoint at /Users/slegroux/Projects/nimrod/nbs/.lr_find_5828b967-b82f-4e1a-bda1-997d275f4d03.ckpt
/Users/slegroux/miniforge3/envs/nimrod/lib/python3.11/site-packages/lightning/pytorch/trainer/connectors/data_connector.py:424: The 'test_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument to `num_workers=11` in the `DataLoader` to improve performance.
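The recurring num_workers warnings can be addressed when the loaders are built. A minimal sketch, assuming the loaders are constructed directly from torchvision's FashionMNIST (the ImageDataModule may expose its own num_workers option instead):

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

train_ds = datasets.FashionMNIST(
    root="data", train=True, download=True, transform=transforms.ToTensor()
)
# num_workers=11 matches the warning's suggestion (one worker per CPU core)
train_loader = DataLoader(train_ds, batch_size=32, shuffle=True, num_workers=11)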
[14:58:07] INFO - Init ImageDataModule for fashion_mnist
[14:58:22] INFO - split train into train/val [0.8, 0.2]
[14:58:22] INFO - train: 48000, val: 12000, test: 10000
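The training loop below assumes a model, a device, and a logger are already in scope (dm is the ImageDataModule initialized above). A minimal sketch of plausible definitions for those three pieces; the actual model in this notebook may differ:

import logging
import torch
import torch.nn as nn

logger = logging.getLogger(__name__)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# hypothetical fully-connected autoencoder over 28*28 = 784-dim inputs
model = nn.Sequential(
    nn.Linear(784, 64), nn.ReLU(),    # encoder
    nn.Linear(64, 784), nn.Sigmoid()  # decoder
).to(device)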
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
N_EPOCHS = 5

for epoch in range(N_EPOCHS):
    model.train()
    for images, labels in dm.train_dataloader():
        optimizer.zero_grad()
        images, labels = images.to(device), labels.to(device)
        # flatten B x 1 x H x W -> B x (H*W)
        images = images.view(-1, images.size(2) * images.size(3))
        outputs = model(images)
        # output should be as close to the input as possible
        loss = criterion(outputs, images)
        loss.backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        total_loss, epoch_step = 0, 0
        for images, labels in dm.val_dataloader():
            images, labels = images.to(device), labels.to(device)
            images = images.view(-1, images.size(2) * images.size(3))
            outputs = model(images)
            eval_loss = criterion(outputs, images)
            epoch_len = len(images)
            epoch_step += epoch_len
            total_loss += eval_loss.item() * epoch_len
        logger.info(f"Epoch: {epoch}, len: {epoch_len}, Loss: {total_loss / epoch_step:.3f}")
[14:56:24] INFO - Epoch: 0, len: 32, Loss: 0.025
[14:56:27] INFO - Epoch: 1, len: 32, Loss: 0.025
[14:56:31] INFO - Epoch: 2, len: 32, Loss: 0.025
[14:56:35] INFO - Epoch: 3, len: 32, Loss: 0.025
[14:56:39] INFO - Epoch: 4, len: 32, Loss: 0.025