Releases: proycon/analiticcl
v0.3.1
v0.3.0
Major development updates:
- Initial implementation for finding matches in running text (error detection); search mode #2
- Support for Language Models to consider context
- Support for n-grams; decoding using Finite State Transducers
- Strict separation between lexicon and language model
- Still experimental
- Removed frequency from score component and added it as a separate score
- Frequency ranking is now an opt-in feature; the frequency score and the distance score are propagated separately to the output
- Removed lexicon weights
- Made distance score computations relative to input length
- Changed default weights so that Damerau-Levenshtein distance carries the most weight
- Implemented a Python binding (#1)
- Fixed insertions after deletion (#6), removed premature bound-check optimisations
- Implemented a learning mode that collects variants for a given lexicon, either in running text or matched against another test lexicon
- Implemented a cut-off threshold
- Allow frequency information in variant lists
- Adhere strictly to the lexicon/variant list loading order as specified on the command line
- Return all matching lexicons for a match rather than just one (in case an entry exists in multiple lexicons)
- More debug levels
- Anagram/edit distance can now be set to an absolute value or a ratio (relative to input length)
- Significant documentation updates
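
The two changes above that make distances relative to input length can be illustrated with a minimal sketch. This is not analiticcl's implementation: the helper names `resolve_max_distance` and `distance_score`, the convention that an integer means an absolute limit and a float means a ratio, and the rounding-down of ratios are all assumptions made for illustration.

```python
from typing import Union


def resolve_max_distance(input_length: int, limit: Union[int, float]) -> int:
    """Hypothetical helper: turn a distance limit into an absolute value.

    Assumption: an int is an absolute limit, a float is a ratio of the
    input length (rounded down).
    """
    if isinstance(limit, bool):
        raise TypeError("limit must be an int or a float")
    if isinstance(limit, int):
        return limit
    if not 0.0 <= limit <= 1.0:
        raise ValueError("a ratio limit must lie between 0 and 1")
    return int(input_length * limit)


def distance_score(distance: int, input_length: int) -> float:
    """Hypothetical helper: express a distance as a score in [0, 1]
    relative to the input length (1.0 means identical)."""
    if input_length <= 0:
        raise ValueError("input_length must be positive")
    return max(0.0, 1.0 - distance / input_length)


if __name__ == "__main__":
    word = "separate"
    print(resolve_max_distance(len(word), 2))     # absolute limit
    print(resolve_max_distance(len(word), 0.25))  # ratio of the input length
    print(distance_score(2, len(word)))
```

With a ratio, longer inputs tolerate more edits than shorter ones, whereas an absolute value applies the same limit to every input.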
v0.2.0
- This release replaces the underlying big integer library with ibig 0.3.2, which leads to a significant performance increase due to fewer heap allocations.
- Implemented explicit variant ingestion and matching (but still requires proper testing)
- Fixed benchmarks
- Allow some escape sequences in alphabet files
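
To see why a big integer library matters here, consider a minimal sketch of anagram hashing. It assumes anagram values are computed as the product of one prime per character, which is the usual anagram-hashing scheme; the alphabet, the prime assignment, and the function name `anagram_value` below are illustrative assumptions and are not taken from these release notes.

```python
# Assumption: a lowercase a-z alphabet, each letter mapped to one of the
# first 26 primes. A real alphabet file would define its own mapping.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]


def anagram_value(word: str) -> int:
    """Hypothetical helper: multiply the primes of all characters.

    Character order does not affect the product, so all anagrams of a
    word share the same value.
    """
    value = 1
    for char in word.lower():
        index = ord(char) - ord("a")
        if not 0 <= index < len(PRIMES):
            raise ValueError(f"character {char!r} is not in the alphabet")
        value *= PRIMES[index]
    return value


if __name__ == "__main__":
    print(anagram_value("listen") == anagram_value("silent"))
    # Long words quickly exceed what a 64-bit integer can hold.
    print(anagram_value("z" * 20).bit_length())
```

Python integers are arbitrary-precision, so the overflow is invisible here; in a language with fixed-width integers the same product needs a big integer type, and how that type allocates memory affects performance.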