Currently, any element of the HTML that the tokenizer cannot recognize is tokenized into a `COMMENT` token. Because the catalog is inconsistent in its phrasing, our tokenizer can miss some of these cases. This ticket mostly involves going through `comments.json` to see which cases the tokenizer can address, and writing tests for them to ensure backwards compatibility with older majors (especially the XOM phrasing).
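As a rough illustration of the fallback described above, here is a minimal Python sketch. It is not the project's tokenizer: the `Token` type, the `COURSE` and `CREDITS` kinds, their regular expressions, and the `tokenize` and `unrecognized` helpers are all hypothetical names invented for this example. Only the `COMMENT` fallback comes from this issue.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Hypothetical patterns for illustration only; the real tokenizer's rules
# are not shown in this issue.
PATTERNS = [
    ("COURSE", re.compile(r"^[A-Z]{2,4} \d{3}$")),
    ("CREDITS", re.compile(r"^\d+ credits?$", re.IGNORECASE)),
]


def tokenize(elements):
    """Tokenize element texts, falling back to COMMENT when nothing matches."""
    tokens = []
    for text in elements:
        stripped = text.strip()
        for kind, pattern in PATTERNS:
            if pattern.match(stripped):
                tokens.append(Token(kind, stripped))
                break
        else:
            # No pattern recognized this element, so it becomes a COMMENT.
            tokens.append(Token("COMMENT", stripped))
    return tokens


def unrecognized(tokens):
    """Return the text of every COMMENT token, as candidates for new rules."""
    return [t.text for t in tokens if t.kind == "COMMENT"]
```

Running something like `unrecognized` over the real tokenizer's output would give a list of phrasings to review, and each phrasing that gets a new rule could then be pinned with a test asserting its expected token kind.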