
Getting NaN as validation loss #1

@mobassir94

I was training a model with a cyclic learning rate, and at epoch 8 I get NaN as the validation loss. Isn't this the "exploding gradient" problem? Would gradient accumulation be able to solve this issue? I don't get such errors when I try Adam or AdamW.
I am facing this problem whenever I try the DeepMemory or DiffGrad optimizer. Any help?
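If it really is exploding gradients, would clipping the gradient norm before each optimizer step help? A minimal, self-contained sketch of what I mean, assuming a standard PyTorch loop (the tiny model, random data, and `max_norm=1.0` below are placeholders, not my real setup):

```python
import torch
import torch.nn as nn

# Placeholder model/data so the sketch runs on its own; swap in the
# real model, optimizer, and data loader.
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

inputs = torch.randn(32, 10)
targets = torch.randint(0, 2, (32,))

optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
# Clip the global gradient norm before stepping so a single bad batch
# cannot blow up the weights; max_norm=1.0 is a guess and needs tuning.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```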

Training statistics so far for the model I am trying:

```
Train Epoch: 0 LR: 0.0049400000 Loss: 1.878861
Dev loss: 2.1991
Train Epoch: 1 LR: 0.0086800000 Loss: 1.840539
Dev loss: 1.7849
Train Epoch: 2 LR: 0.0075800000 Loss: 1.847198
Dev loss: 1.9127
Train Epoch: 3 LR: 0.0038400000 Loss: 1.287447
Dev loss: 1.3331
Train Epoch: 4 LR: 0.0023000000 Loss: 1.416327
Dev loss: 1.2588
Train Epoch: 5 LR: 0.0060400000 Loss: 1.299999
Dev loss: 1.4838
Train Epoch: 6 LR: 0.0097800000 Loss: 1.540868
Dev loss: 1.5280
Train Epoch: 7 LR: 0.0064800000 Loss: 1.790969
Dev loss: 1.2738
Train Epoch: 8 LR: 0.0027400000 Loss: 1.092477
Dev loss: nan
```
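
To pin down where the NaN first appears, I was also going to try PyTorch's anomaly detection plus a cheap NaN guard in the loop; a minimal sketch (the `loss` tensor below is a stand-in for the real batch loss):

```python
import torch

# With anomaly detection on, loss.backward() raises as soon as any
# backward op produces NaN/Inf and prints a traceback of the forward
# op responsible. It is slow, so enable it only while debugging.
torch.autograd.set_detect_anomaly(True)

# Cheap runtime guard: skip the optimizer step when the loss is NaN.
loss = torch.tensor(float("nan"))  # stand-in for the real batch loss
if torch.isnan(loss):
    print("NaN loss; skipping this batch's optimizer step")
```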
