The tree-to-sequence ANC model currently only decreases its loss slightly, from roughly 8 down to around 4–5. One thing worth checking is the values/gradients of operations that can saturate, like softmax/sigmoid (see the sketch below for one way to instrument this). I'd also look at hyperparameters and initializations, especially the initialization related to the ANC. It may be helpful to examine the neural assembly the model generates after training. The current loss is computed over 10 input/output pairs per program; that count is itself a hyperparameter worth examining. This is the highest-priority task right now.
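A minimal sketch of the saturation/gradient check, assuming the model is written in PyTorch. `ToyANC` below is a hypothetical stand-in for the actual tree-to-sequence ANC model, so the hooks have something to attach to; in the real repo you'd register the same hooks on the trained model instead.

```python
# Hedged sketch, not this repo's code: ToyANC is a hypothetical stand-in.
import torch
import torch.nn as nn

class ToyANC(nn.Module):  # stand-in for the real tree-to-sequence ANC model
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(16, 16)
        self.gate = nn.Sigmoid()          # saturating op to monitor
        self.instr = nn.Softmax(dim=-1)   # saturating op to monitor

    def forward(self, x):
        return self.instr(self.gate(self.linear(x)))

def saturation_hook(name):
    # Forward hook: fraction of sigmoid/softmax outputs pinned near 0 or 1,
    # where the local gradient is effectively zero.
    def hook(module, inputs, output):
        frac = ((output < 1e-4) | (output > 1.0 - 1e-4)).float().mean().item()
        print(f"{name}: {frac:.1%} saturated")
    return hook

model = ToyANC()
for name, module in model.named_modules():
    if isinstance(module, (nn.Sigmoid, nn.Softmax)):
        module.register_forward_hook(saturation_hook(name))

x = torch.randn(8, 16)
loss = model(x).sum()  # placeholder for the real per-program I/O-pair loss
loss.backward()

# Per-parameter gradient norms; vanishingly small norms on the ANC-related
# parameters would implicate the initialization.
for name, p in model.named_parameters():
    print(f"{name}: grad norm {p.grad.norm().item():.3e}")
```

If a large fraction of the sigmoid/softmax outputs sit at the extremes, or the ANC parameters show near-zero gradient norms while the rest of the model does not, that would narrow the problem down to saturation or initialization rather than the loss formulation.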