This project was made to practice structured prediction.
structured_model.py
uses a structured prediction approach: it takes into account how likely a letter is given the previously predicted letter in the word.
simple_multiclass_model.py
uses a multiclass linear score W·x to predict each letter.
simple_structured_model.py
was an attempt at structured prediction using the score W·ϕ(x, y_hat) to predict.
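
The difference between per-letter multiclass prediction and prediction that also scores the previous letter can be sketched as below. This is a minimal illustration, not the code in this repository: it assumes Viterbi-style decoding, and the names `predict_multiclass`, `predict_structured`, `unary` and `bigram` are hypothetical. `unary[t][y]` stands for the score W·x of letter `y` at position `t`, and `bigram[prev][cur]` for a learned score of `cur` following `prev`.

```python
def predict_multiclass(unary):
    """Pick the highest-scoring letter at each position independently."""
    return [max(range(len(scores)), key=scores.__getitem__) for scores in unary]


def predict_structured(unary, bigram):
    """Viterbi decoding: maximize the sum of unary scores plus bigram[prev][cur]."""
    n_labels = len(unary[0])
    best = list(unary[0])  # best[y]: best score of any prefix ending in letter y
    back = []              # back-pointers, one list per position after the first
    for scores in unary[1:]:
        new_best, pointers = [], []
        for cur in range(n_labels):
            # Best previous letter for this current letter.
            prev = max(range(n_labels), key=lambda p: best[p] + bigram[p][cur])
            pointers.append(prev)
            new_best.append(best[prev] + bigram[prev][cur] + scores[cur])
        best = new_best
        back.append(pointers)
    # Follow the back-pointers from the best final letter.
    y = max(range(n_labels), key=best.__getitem__)
    path = [y]
    for pointers in reversed(back):
        y = pointers[y]
        path.append(y)
    return path[::-1]


# Toy example with two labels (0 and 1); the scores are made up.
unary = [[2.0, 0.0], [1.0, 0.9]]
bigram = [[-1.0, 1.0], [0.0, 0.0]]
independent = predict_multiclass(unary)
joint = predict_structured(unary, bigram)
```

In this toy case the independent predictor picks label 0 twice, while the joint decoder switches the second position to label 1 because the pair (0, 1) has a high bigram score.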
- Download Stanford's OCR dataset and place the files in `data/`
- Run `python structured_model.py`
- (Optional) Run the two other models for comparison
| | simple multiclass model | simple structured model | structured model |
|---|---|---|---|
| Accuracy | 75.218% | 74.344% | 80.353% |
See what the structured model learned about how likely each bigram is:
x axis - previous letter
y axis - current letter
For example, look at the pairs (q,u), (l,y), (n,g): all of these get high values since they are likely to appear together.
On the other hand, (u,u), (e,i), (a,a) get low values.
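
To read the same information without a plot, the learned bigram scores could be ranked directly. The sketch below is an assumption about how that might look, not code from this repository: `rank_bigrams` is a hypothetical helper, `bigram[prev][cur]` is assumed to hold the learned score for the pair, and the numbers in the toy matrix are made up.

```python
import string


def rank_bigrams(bigram, letters=string.ascii_lowercase):
    """Return (previous, current, weight) triples sorted from highest to lowest weight."""
    triples = [
        (letters[p], letters[c], bigram[p][c])
        for p in range(len(bigram))
        for c in range(len(bigram[p]))
    ]
    return sorted(triples, key=lambda t: t[2], reverse=True)


# Toy 2x2 matrix over the letters "q" and "u"; the weights are invented.
toy = [[-1.0, 2.0],
       [0.5, -3.0]]
ranked = rank_bigrams(toy, letters="qu")
most_likely = ranked[0]    # highest-scoring pair
least_likely = ranked[-1]  # lowest-scoring pair
```

With a full 26x26 matrix, the first entries of the result would correspond to pairs such as (q,u) and the last entries to pairs such as (u,u), if the model learned what the text above describes.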
- seaborn – data visualization package
Bar Katz – bar-katz on github – [email protected]