0.3.24 (2024-08-12):
* Bug Fix: Catastrophic backtracking in regular expression for numerical references
* Improvement: Remove unicode dependency
0.3.23 (2021-05-03):
* Improvement: Refactor for Ruby 3.0 compatibility
0.3.22 (2018-09-23):
* Improvement: Initial support for Kazakh
0.3.21 (2018-08-30):
* Improvement: Add support for file formats
* Improvement: Add support for numeric references at the end of a sentence (e.g. Wikipedia references)
0.3.20 (2018-08-28):
* Improvement: Handle slanted single quotation mark as a single quote
* Bug Fix: Handle text that contains a single-character abbreviation as part of a list
* Bug Fix: Chinese book quotes
* Improvement: Add viz as abbreviation
0.3.19 (2018-07-19):
* Bug Fix: A parenthetical following an abbreviation is now included as part of the same segment. Example: "The parties to this Agreement are PragmaticSegmenterExampleCompanyA Inc. (“Company A”), and PragmaticSegmenterExampleCompanyB Inc. (“Company B”)." is now treated as one segment.
0.3.18 (2018-03-27):
* Improvement: Performance optimizations
0.3.17 (2017-12-07):
* Bug Fix: Regex for parsing HTML
0.3.16 (2017-11-13):
* Improvement: Support for Danish
0.3.15 (2017-06-28):
* Improvement: Handle em dashes that appear in the middle of a sentence and include a sentence-ending punctuation mark
0.3.14 (2017-06-28):
* Improvement: Add English abbreviation Rs. to denote the Indian currency
0.3.13 (2017-01-17):
* Bug Fix: Unexpected sentence break between abbreviation and hyphen
0.3.12 (2016-12-12):
* Bug Fix: Issue with words with leading apostrophes
0.3.11 (2016-11-08):
* Improvement: Update German abbreviation list
* Bug Fix: Refactor 'remove_newline_in_middle_of_sentence' method
0.3.10 (2016-07-01):
* Bug Fix: Change load order of dependencies
0.3.9 (2016-06-16):
* Improvement: Remove `guard-rspec` development dependency
0.3.8 (2016-03-03):
* Bug Fix: Fix bug that cleaned away single letter segments
0.3.7 (2016-01-12):
* Improvement: Add `unicode` gem and use it for downcasing to better handle Cyrillic languages
0.3.6 (2016-01-05):
* Improvement: Refactor SENTENCE_STARTERS into each individual language and add SENTENCE_STARTERS for German
0.3.5 (2016-01-04):
* Performance: Reduce GC by replacing #gsub with #gsub! where possible
0.3.4 (2015-12-22):
* Improvement: Large refactor
0.3.3 (2015-05-27):
* Bug Fix: Fix cleaner bug
0.3.2 (2015-05-27):
* Improvement: Add English abbreviations
0.3.1 (2015-03-02):
* Bug Fix: Fix undefined method 'gsub!' for nil:NilClass issue
0.3.0 (2015-02-04):
* Improvement: Add support for square brackets
* Improvement: Add support for consecutive exclamation points or question marks or combinations of both
* Bug Fix: Fix Roman numeral support
* Improvement: Add English abbreviations
0.2.0 (2015-01-26):
* Improvement: Add Dutch Golden Rules and abbreviations
* Improvement: Update README with additional tools
* Improvement: Update segmentation test scores in README with results of new Golden Rule tests
* Improvement: Add Polish abbreviations
0.1.8 (2015-01-22):
* Bug Fix: Fix bug in splitting new sentence after single quotes
0.1.7 (2015-01-22):
* Improvement: Add Alice in Wonderland specs
* Bug Fix: Fix parenthesis between double quotations bug
* Bug Fix: Fix split after quotation ending in dash bug
0.1.6 (2015-01-16):
* Bug Fix: Fix bug in numbered list finder (ignore longer digits)
0.1.5 (2015-01-13):
* Bug Fix: Fix comma at end of quotation bug
0.1.4 (2015-01-13):
* Bug Fix: Fix missing abbreviations
0.1.3 (2015-01-13):
* Improvement: Improve punctuation in bracket replacement
0.1.2 (2015-01-13):
* Bug Fix: Fix missing abbreviations
* Improvement: Add footnote rule to `cleaner.rb`
0.1.1 (2015-01-12):
* Bug Fix: Fix handling of German dates
0.1.0 (2015-01-12):
* Improvement: Add Kommanditgesellschaft Rule
0.0.9 (2015-01-12):
* Improvement: Improve handling of alphabetical and Roman numeral lists
0.0.8 (2015-01-12):
* Bug Fix: Fix error in `list.rb`
0.0.7 (2015-01-12):
* Improvement: Add change log to README
* Improvement: Add passing spec for new end of sentence abbreviation (EN)
* Improvement: Add Roman numeral list support
0.0.6 (2015-01-11):
* Improvement: Add rule for escaped newlines that include a space between the slash and character
* Improvement: Add Golden Rule #52 and code to make it pass
0.0.5 (2015-01-10):
* Improvement: Make symbol substitution safer
* Improvement: Refactor `process.rb`
* Improvement: Update cleaner with escaped newline rules
0.0.4 (2015-01-10):
* Improvement: Add `ConsecutiveForwardSlashRule` to cleaner
* Improvement: Refactor `segmenter.rb` and `process.rb`
0.0.3 (2015-01-07):
* Improvement: Add travis.yml
* Improvement: Add Code Climate
* Improvement: Update README
0.0.2 (2015-01-07):
* Improvement: Major design refactor
0.0.1 (2015-01-07):
* Initial Release