Charabia v0.6.0
meili-bot released this on 22 Aug at 12:11
Changes
- Resolving version mismatches occurring in Lindera (#112) @mosuka
- Add Thai segmenter (#114) @aFluffyHotdog
- Optimize Thai segmenter (#115) @ManyTheFish
- Deactivate lowercase normalizer when the Script doesn't contain case modifiers (#116) @ManyTheFish
- Release v0.6.0 (#117) @ManyTheFish
Breaking changes ⚠️
- Add option to disable char map creation (#109) @matthias-wright
  The Token::original_lengths(..) method, used to find the original index of a character in a normalized string, needs the TokenizerBuilder::create_char_map(..) setting to be set to true to work properly.
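To illustrate why the char map matters, here is a minimal, self-contained sketch of the idea. It is not Charabia's code: `CharMap`, `normalize`, and `original_len` are hypothetical names, and the toy normalizer only handles a few characters. It assumes a char map is a per-character record of byte lengths before and after normalization, and shows that without that record a position in the normalized string cannot be mapped back to the original.

```rust
// Hypothetical sketch: these are NOT Charabia's types or functions.
// Each entry stores (original_len, normalized_len) in bytes for one character.
type CharMap = Vec<(u8, u8)>;

/// Toy normalizer: lowercases, strips one accent, expands 'ß'.
/// Records a char map only when `create_char_map` is true.
fn normalize(original: &str, create_char_map: bool) -> (String, Option<CharMap>) {
    let mut normalized = String::new();
    let mut map = if create_char_map { Some(Vec::new()) } else { None };
    for c in original.chars() {
        let before = normalized.len();
        match c {
            'é' | 'É' => normalized.push('e'),
            'ß' => normalized.push_str("ss"),
            _ => normalized.extend(c.to_lowercase()),
        }
        if let Some(m) = map.as_mut() {
            m.push((c.len_utf8() as u8, (normalized.len() - before) as u8));
        }
    }
    (normalized, map)
}

/// Maps a byte length in the normalized string back to a byte length in the
/// original string. Returns None when no char map was recorded.
fn original_len(map: Option<&CharMap>, normalized_bytes: usize) -> Option<usize> {
    let map = map?;
    let (mut orig, mut norm) = (0usize, 0usize);
    for &(o, n) in map {
        if norm >= normalized_bytes {
            break;
        }
        orig += o as usize;
        norm += n as usize;
    }
    Some(orig)
}

fn main() {
    // With the char map enabled, normalized offsets can be mapped back.
    let (normalized, map) = normalize("Café", true);
    println!("{normalized}: {:?}", original_len(map.as_ref(), normalized.len()));

    // With the char map disabled, there is nothing to map back with.
    let (_, no_map) = normalize("Café", false);
    println!("{:?}", original_len(no_map.as_ref(), 4));
}
```

Skipping the map saves the per-character bookkeeping, which is presumably the motivation for making it optional, but any caller that needs original offsets has to turn it on.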
Thanks again to @ManyTheFish, @aFluffyHotdog, @matthias-wright and @mosuka! 🎉