This proposal is to replace the Consolidation feature with a more general Pattern Replacement feature. The only real difference is that, instead of entering a, b: c (replace all occurrences of a and b with c), the user will enter a, b (replace all occurrences of a with b). That should be very easy to change on the back end. We could then add a regex option so that the user could enter a regex pattern 'a|b', 'c' (replace all occurrences of a OR b with c).
The addition of regex pattern matching will allow greater flexibility and solve problems like issue #956. And it should only require a few more lines of code.
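To make the difference between the two modes concrete, here is a minimal sketch in Python. The function names `replace_literal` and `replace_regex` are hypothetical and are not taken from the code base; the sketch only assumes that the back end can fall back on `str.replace` for plain patterns and `re.sub` for regex patterns.

```python
import re


def replace_literal(text: str, old: str, new: str) -> str:
    """Plain mode: replace every literal occurrence of `old` with `new`."""
    return text.replace(old, new)


def replace_regex(text: str, pattern: str, new: str) -> str:
    """Regex mode: replace every match of `pattern` with `new`."""
    return re.sub(pattern, new, text)


sample = "a cat and a bat"
# Plain mode handles one source string per rule.
print(replace_literal(sample, "cat", "dog"))
# Regex mode covers the old "a, b: c" consolidation with one alternation.
print(replace_regex(sample, r"cat|bat", "dog"))
```

With a regex alternation, a single rule does what the Consolidation syntax needed a list of source terms for.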
Update: The regex_replace branch implements a basic proof-of-concept regex pattern replacement. I chose to use the syntax a > b (for a non-regex replacement of a with b) and REGEX:^a > b (for a regex replacement of a at the start of a string with b). A > inside the pattern or the replacement must be escaped with a backslash. It doesn't yet handle capture groups, and it might be possible to make it faster, but it works.
Oops! Actually, capture groups do work. They can be referenced as \1, \2, etc.
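For anyone following along, the rule syntax described above could be parsed roughly like this. This is only a sketch under stated assumptions, not the code on the branch: `parse_rule` and `apply_rule` are hypothetical names, and the sketch assumes that whitespace around the separator is not significant.

```python
import re

REGEX_PREFIX = "REGEX:"
# A ">" that is not preceded by a backslash separates pattern and replacement.
_UNESCAPED_GT = re.compile(r"(?<!\\)>")


def parse_rule(line: str):
    """Split one rule line into (is_regex, pattern, replacement)."""
    is_regex = line.startswith(REGEX_PREFIX)
    if is_regex:
        line = line[len(REGEX_PREFIX):]
    parts = _UNESCAPED_GT.split(line, maxsplit=1)
    if len(parts) != 2:
        raise ValueError(f"no unescaped '>' separator in rule: {line!r}")
    # Assumption: surrounding whitespace is dropped, and "\>" means a literal ">".
    pattern, replacement = (p.strip().replace("\\>", ">") for p in parts)
    return is_regex, pattern, replacement


def apply_rule(text: str, line: str) -> str:
    """Apply a single rule line to text."""
    is_regex, pattern, replacement = parse_rule(line)
    if is_regex:
        # re.sub resolves \1, \2, ... in the replacement as capture groups.
        return re.sub(pattern, replacement, text)
    return text.replace(pattern, replacement)


print(apply_rule("apple a", "REGEX:^a > b"))
print(apply_rule("John Smith", r"REGEX:(\w+) (\w+) > \2 \1"))
```

Because the replacement string is passed straight to `re.sub`, references such as \1 and \2 work without any extra handling, which matches the behaviour noted above.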
I have refactored the pattern replacement method scrubber.py::pattern_replacement_handler() (and added one method), but I need help with how best to push these changes to the pattern_replacement branch that Scott made. :( (Sorry: I'm using VS Code, I'm on Scott's branch, and I've committed and pulled, but I'm not sure how to push back to pattern_replacement.)
The code now handles multiple replacement rules (one per line).
It allows for an "apply regex to each token" option once the UI provides a checkbox (this will only work for languages where tokens are separated by whitespace). Note: the code uses a single list comprehension, with each of the individual (regex) pattern replacements applied to each token in order.
Timing shows two edits running on mobyDict.txt in 1.5 sec, so it appears to be reasonably fast.
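The per-token behaviour described above might look something like the following. This is an illustrative sketch, not the refactored scrubber.py code: `apply_rules` and `replace_per_token` are hypothetical names, rules are assumed to be already parsed into (pattern, replacement) pairs, and all rules are treated as regex rules.

```python
import re
from typing import List, Tuple

Rule = Tuple[str, str]  # (regex pattern, replacement); hypothetical shape


def apply_rules(token: str, rules: List[Rule]) -> str:
    """Apply every rule to one token, in the order the rules were given."""
    for pattern, replacement in rules:
        token = re.sub(pattern, replacement, token)
    return token


def replace_per_token(text: str, rules: List[Rule]) -> str:
    """Apply the rules to each whitespace-separated token of text."""
    # Splitting on a captured group keeps the whitespace runs in the result,
    # so the original spacing and line breaks survive the round trip.
    pieces = re.split(r"(\s+)", text)
    # One list comprehension: whitespace and empty pieces pass through
    # unchanged, every other piece gets the full rule chain.
    return "".join(
        [p if (not p or p.isspace()) else apply_rules(p, rules) for p in pieces]
    )


rules = [(r"^un", ""), (r"ing$", "")]
print(replace_per_token("undoing  the unwinding", rules))
```

Applying rules per token means anchors such as ^ and $ refer to the start and end of each token rather than of the whole document, which is why the option only makes sense for whitespace-delimited languages.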