-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
enh(gcode): rewrote language for moden gcode support #4040
Conversation
e416841
to
aaeb9f5
Compare
aaeb9f5
to
423b2e1
Compare
src/languages/gcode.js
Outdated
variants: [ | ||
// G General functions: G0, G5.1, G5.2, … | ||
// M Misc functions: M0, M55.6, M199, … | ||
{ match: /(?<![A-Z])[GM]\s*\d+(\.\d+)?/ }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't use look-behinds until version 12 (breaking change)... so we'll need another way, or this PR will have to wait till then... no ETA on version 12 yet.
efedc61
to
3bb7e65
Compare
src/languages/gcode.js
Outdated
const NUMBER = hljs.inherit(hljs.C_NUMBER_MODE, { begin: '([-+]?((\\.\\d+)|(\\d+)(\\.\\d*)?))|' + hljs.C_NUMBER_RE }); | ||
|
||
|
||
const LETTER_BOUNDARY_RE = /(?<![A-Z])/; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const LETTER_BOUNDARY_RE = /(?<![A-Z])/; | |
// TODO: post v12 lets use look-behind for more accuracy, until then \b should suffice | |
// const LETTER_BOUNDARY_RE = /(?<![A-Z])/; | |
const LETTER_BOUNDARY_RE = /\b/; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was still thinking about that — I unfortunately don't think that \b
is good enough. It completely breaks the spaceless gcode snippet I included in the markup test, because \b
includes [0-9]
, meaning a sequence like G1A2
does not match G1
and A2
separately like it should.
It's pretty common for gcode to be generated by software and not written by humans, and to reduce file size and transmission time to the machines, they strip all whitespace.
I wanted to look into \b
and an on:begin
filter using response.ignoreMatch()
if a digit is found before the first letter.
Would that be fine or discouraged?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'd have to do that per mode or course... we don't use that much because it's SO expensive... but I think if we were willing to remove GCODE from the list of auto-detectable languages with disableAutodetect
that on:begin
could then be used... since we'd then be certain we were at least doing all that work in the service of highlighting actual GCODE.
And lets still leave a comment somewhere that the better solution is look-behind - for the future when we can use that.
@joshgoebel thank you for your patience. My latest commit added This means that readable gcode with sane spacing will rarely, if ever, run into the callback filter. I've tested both implementations with the existing markup test, as well as some 100lines of spaceless gcode. Running Benchmark & ResultsShell commands to download v12 and install bennywget https://raw.githubusercontent.com/barthy-koeln/highlight.js/2c55db96e8a0523fdbdd6d4069c7007f75d5288b/src/languages/gcode.js -O src/languages/gcode-v12.js
npm run build gcode gcode-v12
cd build
npm install benny
# create benchmark script here e.g. bench.mjs
node bench.mjs Benchmark scriptimport b from 'benny'
import fs from 'fs'
import hljs from 'highlight.js'
const code = fs.readFileSync(import.meta.dirname + '/../test/markup/gcode/extended.txt').toString('utf-8')
b.suite(
'gcode bench',
b.add('v11', () => {
hljs.highlight(code, { language: 'gcode' })
}),
b.add('v12', () => {
hljs.highlight(code, { language: 'gcode-v12' })
}),
b.cycle(),
b.complete(),
b.save({ file: 'gcode', version: '1.0.0' }),
b.save({ file: 'gcode', format: 'chart.html' })
) Results |
42619e9
to
1daed93
Compare
src/languages/gcode.js
Outdated
const charBeforeMatch = matchdata.input[matchdata.index - 1]; | ||
if (charBeforeMatch >= '0' && charBeforeMatch <= '9') { | ||
return; | ||
} | ||
|
||
response.ignoreMatch(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the regex is ![A-Z]
why wouldn't the conditional match? ie:
const charBeforeMatch = matchdata.input[matchdata.index - 1]; | |
if (charBeforeMatch >= '0' && charBeforeMatch <= '9') { | |
return; | |
} | |
response.ignoreMatch(); | |
const charBeforeMatch = matchdata.input[matchdata.index - 1]; | |
if (charBeforeMatch >= 'A' && charBeforeMatch <= 'Z') { | |
response.ignoreMatch(); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to deal with lowercase a-z also?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the begin
with on:begin
variant was the only matching logic, we'd have to check for ![a-zA-Z]
.
But since we have a variant with \b
we can be sure that any instance that follows a word boundary ![a-zA-Z0-9_]
has been found already. So we only need to additionally find those that follow 0-9
and ignore everything else.
Doing that means we don't have to account for upper/lowercase.
I just noticed the underscore _
isn't handled by that callback, but it should be. I will push another commit (but will wait for feedback on the thing above).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: I've committed an additional check for underscores, to align the filter with what \b
and ![A-Z]
mean.
I've added two lines to the markup test and verified in my v1 branch feat/gcode-v12
that the result is the same.
38136d0
to
e063f58
Compare
Hey @joshgoebel, this is just a careful ping to see if I can or should do anything else here? Edit: I seriously understand everyone's time is precious and this is free open-source work, so I'll accept any answer including "not now, will revisit later" :D |
e063f58
to
54acfb5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry this took so very long!
Build Size ReportChanges to minified artifacts in 5 files changedTotal change +289 B View Changes
|
Complete rework of the
gcode
language to allow for extended uses cases.Not all scope rules will apply to all implementations of g-code, but many applications of gcode have added a lot on top of the original spec.
This language implementation aims to be more extensive but still flexible.
My research has used the following documentations:
And countless code examples extracted from GitHub's search:
Changes
Question about code
I have used this pattern to re-use complex regexes:
I would like to use this for certain duplicated parts (e.g.
(?<![A-Z])
) as well. Is this acceptable?The performance impact is surely negligible since this only happens during initialisation.
It greatly increases readability and IDE support in my opinion.
Screenshots
Checklist
CHANGES.md