Current approach: 12 epochs
The baseline MLP also shows an early rise in accuracy but struggles in the final stages, around and beyond 90%.
The network learns a vector (embedding) for each possible integer value of a and b. It then concatenates the two vectors and uses a small MLP to map them to logits over mod_value classes.
embedding_dim = 16
AdamW: lr = 5e-4, weight_decay = 1e-4
MLP head: sequence of Linear + ReLU (+ optional Dropout) layers, final Linear with output dimension mod_value (the logits).
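A minimal sketch of this setup in PyTorch, assuming the components listed above; mod_value, hidden_dim, and the dropout rate are placeholder assumptions, not values taken from these runs:

```python
import torch
import torch.nn as nn

mod_value = 97      # assumption: modulus / number of output classes
embedding_dim = 16
hidden_dim = 128    # assumption: width of the MLP head

class ModAddMLP(nn.Module):
    def __init__(self):
        super().__init__()
        # One embedding vector per possible integer value of a and b.
        self.embed = nn.Embedding(mod_value, embedding_dim)
        # MLP head: Linear + ReLU (+ optional Dropout), final Linear -> logits.
        self.head = nn.Sequential(
            nn.Linear(2 * embedding_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.1),  # optional
            nn.Linear(hidden_dim, mod_value),
        )

    def forward(self, a, b):
        # Concatenate the two embeddings, then map to logits over mod_value classes.
        x = torch.cat([self.embed(a), self.embed(b)], dim=-1)
        return self.head(x)

model = ModAddMLP()
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, weight_decay=1e-4)
```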
Is this a record?