Matchers are stateful expressions that use interlaced capture hooks (i.e. `entities`) to mutate both state and matches. This simple abstraction makes it possible to leverage Regular Expressions beyond their traditional context-free limitations.
`Matcher` extends the `exec` method of the `RegExp` class to iterate over the `entities` array for the respective capture(s).
By default:

- A capture position is any position in the `match` array except for the match position itself (i.e. `match[0]`).
- A capture represents an initialized element in a capture position that is not `undefined`.
- An entity position in the `entities` array maps to the capture position of the following index in the `match`, where `entityIndex == captureIndex - 1`.
- A `null` entity in the `entities` array is always skipped.
- A `‹identity›` entity that is non-callable, and which is expected to always be coercible to a `string` or `symbol`, is assigned to the respective `match.capture[‹identity›]`, where the intact value of the `‹identity›` last captured is also iteratively reflected as `match.identity`.
- A `‹handler›` entity that is callable is called with `match[0]`, the capture `index`, the `match`, and the `state` of the `Matcher` instance, and is expected to independently handle all mutations of the `match` instance, including mutations to `match.capture` or `match.identity` if necessary.

  Note: By design, there are no safeguards in place for preserving `‹handler›` mutations to any `match` or `match.capture` fields; such safeguards can be implemented by extensions that can justify the added cost.
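The default rules above can be sketched roughly as follows. This is a minimal illustration, not the code in `lib/matcher.js`: the constructor signature `(pattern, flags, entities, state)` is hypothetical, and storing the captured text under `match.capture[‹identity›]` is an assumption about what "assigned" means here.

```javascript
// Hypothetical sketch of a Matcher; the constructor signature is an assumption.
class Matcher extends RegExp {
  constructor(pattern, flags, entities = [], state = {}) {
    super(pattern, flags);
    this.entities = entities; // entities[i] pairs with match[i + 1]
    this.state = state; // mutable state handed to every handler
  }

  exec(source) {
    const match = super.exec(source);
    if (match === null) return match;

    match.capture = {};
    match.identity = undefined;

    // Position 0 is the match itself, so capture positions start at 1.
    for (let index = 1; index < match.length; index++) {
      const text = match[index];
      if (text === undefined) continue; // position did not capture

      const entity = this.entities[index - 1]; // entityIndex == captureIndex - 1
      if (entity == null) continue; // null (or missing) entities are skipped

      if (typeof entity === 'function') {
        // A handler is responsible for any mutation of the match.
        entity(match[0], index, match, this.state);
      } else {
        // Assumption: the captured text is stored under the identity key.
        match.capture[entity] = text;
        match.identity = entity;
      }
    }
    return match;
  }
}

// Usage: one identity entity, one handler entity, and one skipped entity.
const state = { words: 0 };
const countWord = (text, index, match, state) => {
  state.words++;
  match.identity = 'word';
};
const matcher = new Matcher(/(\d+)|([a-z]+)|(\s+)/g, undefined, ['number', countWord, null], state);

let match;
while ((match = matcher.exec('42 abc')) !== null) {
  console.log(JSON.stringify(match[0]), match.identity);
}
```

Because the handler receives the live `match` and `state` objects, nothing stops it from overwriting fields set by earlier entities, which is the trade-off the note above describes.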
`Token Matcher` extends the `Matcher` interface with a `tokenize` method.
- Experiments
- Implementation
  - Implement base Matcher: `lib/matcher.js`
  - Refactor Tokenizer helpers: `lib/token-matcher.js`
  - Refactor Segmenter helpers and overrides: `segment-matcher.js`
  - Refactor Debugging helpers: `lib/debug.js`
  - Refactor Matches wrapper: `lib/matches.js`
  - Refactor RegExpRange: `lib/range.js`