
Getting NaN as validation loss #1

@mobassir94

I was training a model with a cyclic learning rate, and at epoch 8 I get NaN as the validation loss. Isn't this the "exploding gradient" problem? Would gradient accumulation be able to solve this issue? I don't get such errors when I try Adam or AdamW.
I am facing this problem whenever I try the DeepMemory or DiffGrad optimizer. Any help?
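If it really is exploding gradients, would clipping the gradient norm before each optimizer step help? A minimal, self-contained sketch of what I mean, assuming a standard PyTorch loop (the tiny model, random data, and `max_norm=1.0` below are placeholders, not my real setup):

```python
import torch
import torch.nn as nn

# Placeholder model/data so the sketch runs on its own; swap in the
# real model, optimizer, and data loader.
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

inputs = torch.randn(32, 10)
targets = torch.randint(0, 2, (32,))

optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
# Clip the global gradient norm before stepping so a single bad batch
# cannot blow up the weights; max_norm=1.0 is a guess and needs tuning.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```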

Training statistics so far for the model I am trying:

```
Train Epoch: 0 LR: 0.0049400000 Loss: 1.878861
Dev loss: 2.1991
Train Epoch: 1 LR: 0.0086800000 Loss: 1.840539
Dev loss: 1.7849
Train Epoch: 2 LR: 0.0075800000 Loss: 1.847198
Dev loss: 1.9127
Train Epoch: 3 LR: 0.0038400000 Loss: 1.287447
Dev loss: 1.3331
Train Epoch: 4 LR: 0.0023000000 Loss: 1.416327
Dev loss: 1.2588
Train Epoch: 5 LR: 0.0060400000 Loss: 1.299999
Dev loss: 1.4838
Train Epoch: 6 LR: 0.0097800000 Loss: 1.540868
Dev loss: 1.5280
Train Epoch: 7 LR: 0.0064800000 Loss: 1.790969
Dev loss: 1.2738
Train Epoch: 8 LR: 0.0027400000 Loss: 1.092477
Dev loss: nan
```
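
To pin down where the NaN first appears, I was also going to try PyTorch's anomaly detection plus a cheap NaN guard in the loop; a minimal sketch (the `loss` tensor below is a stand-in for the real batch loss):

```python
import torch

# With anomaly detection on, loss.backward() raises as soon as any
# backward op produces NaN/Inf and prints a traceback of the forward
# op responsible. It is slow, so enable it only while debugging.
torch.autograd.set_detect_anomaly(True)

# Cheap runtime guard: skip the optimizer step when the loss is NaN.
loss = torch.tensor(float("nan"))  # stand-in for the real batch loss
if torch.isnan(loss):
    print("NaN loss; skipping this batch's optimizer step")
```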
