Changed Parsing Method of Identifiers #138

hexofyore · 2023-06-05T04:39:03Z

No description provided.

hexofyore · 2023-06-05T04:49:29Z

@ISibboI Sorry, I only checked the one test I ran and forgot to run the full test suite. I will look into it. Meanwhile, you can check the logic.

hexofyore · 2023-06-07T16:33:17Z

@ISibboI Is there something wrong with the new one?

benwr · 2023-06-08T10:46:05Z

(randomly happened to see this PR - wanted to chime in to say that I've actually benefitted a lot from the flexibility in identifier names, since it lets you define custom conventions that "look like" special syntax but are actually implemented by the Context)

ISibboI

Thanks, this is great.

Given @benwr's comment, I propose making the identifier parsing generic using a (zero-cost) strategy pattern. This implies quite a large number of changes, though. I would be happy if you worked on that, but otherwise I can also do it.

The Context trait should be extended with an associated type IdentifierValidatorType (or similar). This associated type should be bounded by a trait IdentifierValidator (or similar) with a single function validate_identifier(&str) -> EvalexprResult<()>.

Then, when parsing a literal, it should first attempt to parse boolean and all numeric variants (int, float, scientific notation), and then pass it to validate_identifier. This function can then return any error it wants to if it disagrees with the identifier.
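The parsing order described above could look roughly like the following sketch. It is heavily simplified and not the crate's actual tokenizer: `Token`, `classify_literal`, `accept_any` and `reject_all` are hypothetical names, errors are plain strings, and the validator is passed as a function pointer instead of going through the Context.

```rust
// Hypothetical, reduced token type for illustration only.
#[derive(Debug, PartialEq)]
enum Token {
    Boolean(bool),
    Int(i64),
    Float(f64),
    Identifier(String),
}

// Try boolean and numeric interpretations first; only a literal that is
// none of those is handed to the identifier validator.
fn classify_literal(
    literal: &str,
    validate_identifier: fn(&str) -> Result<(), String>,
) -> Result<Token, String> {
    if let Ok(boolean) = literal.parse::<bool>() {
        return Ok(Token::Boolean(boolean));
    }
    if let Ok(int) = literal.parse::<i64>() {
        return Ok(Token::Int(int));
    }
    // Note: Rust's f64 parser also accepts scientific notation ("1e3")
    // and words such as "inf" and "nan".
    if let Ok(float) = literal.parse::<f64>() {
        return Ok(Token::Float(float));
    }
    // The validator may reject the literal with any error it likes.
    validate_identifier(literal)?;
    Ok(Token::Identifier(literal.to_string()))
}

// Validator that accepts every literal.
fn accept_any(_identifier: &str) -> Result<(), String> {
    Ok(())
}

// Validator that rejects every literal, to show numbers bypass it.
fn reject_all(identifier: &str) -> Result<(), String> {
    Err(format!("illegal identifier: {identifier}"))
}
```

With `reject_all`, `42` would still come out as an integer token, because the validator is only consulted once the boolean and numeric attempts have failed.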

There can be an "any" implementation that reflects the current behaviour, an implementation that reflects the Rust behaviour (like what you implemented now), and also an implementation that allows e.g. only ASCII letters, digits, underscores and colons, if you want to.
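A minimal sketch of the proposed strategy pattern, to make the shape concrete. The trait, associated type and function names follow the proposal above; everything else is an assumption made for illustration: the reduced `EvalexprError` and `Context` stand-ins, the three validator structs, the two example contexts and the `check_identifier` helper are hypothetical and not part of the crate.

```rust
// Stand-in for the crate's error type; reduced to one variant for the sketch.
#[derive(Debug, PartialEq)]
pub enum EvalexprError {
    IllegalIdentifierSequence(String),
}

pub type EvalexprResult<T> = Result<T, EvalexprError>;

/// Strategy trait: decides whether a literal is acceptable as an identifier.
pub trait IdentifierValidator {
    fn validate_identifier(identifier: &str) -> EvalexprResult<()>;
}

/// Accepts everything, mirroring the current permissive behaviour.
pub struct AnyIdentifier;

impl IdentifierValidator for AnyIdentifier {
    fn validate_identifier(_identifier: &str) -> EvalexprResult<()> {
        Ok(())
    }
}

/// Roughly Rust-like: starts with a letter or `_`, continues with
/// letters, digits or `_`. (Real Rust identifiers follow Unicode XID rules.)
pub struct RustLikeIdentifier;

impl IdentifierValidator for RustLikeIdentifier {
    fn validate_identifier(identifier: &str) -> EvalexprResult<()> {
        let mut chars = identifier.chars();
        let valid_start = matches!(chars.next(), Some(c) if c.is_alphabetic() || c == '_');
        if valid_start && chars.all(|c| c.is_alphanumeric() || c == '_') {
            Ok(())
        } else {
            Err(EvalexprError::IllegalIdentifierSequence(identifier.to_string()))
        }
    }
}

/// Non-empty, and only ASCII letters, digits, underscores and colons.
pub struct AsciiIdentifier;

impl IdentifierValidator for AsciiIdentifier {
    fn validate_identifier(identifier: &str) -> EvalexprResult<()> {
        let valid = !identifier.is_empty()
            && identifier
                .chars()
                .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == ':');
        if valid {
            Ok(())
        } else {
            Err(EvalexprError::IllegalIdentifierSequence(identifier.to_string()))
        }
    }
}

/// Reduced stand-in for the Context trait, showing only the new associated type.
pub trait Context {
    type IdentifierValidatorType: IdentifierValidator;
}

pub struct PermissiveContext;
impl Context for PermissiveContext {
    type IdentifierValidatorType = AnyIdentifier;
}

pub struct StrictContext;
impl Context for StrictContext {
    type IdentifierValidatorType = RustLikeIdentifier;
}

/// The validator is chosen at compile time through the context type,
/// so no dynamic dispatch is involved.
pub fn check_identifier<C: Context>(literal: &str) -> EvalexprResult<()> {
    <C::IdentifierValidatorType as IdentifierValidator>::validate_identifier(literal)
}
```

Because `validate_identifier` takes no `self`, the validators can be zero-sized marker types, which is what would make the pattern zero-cost in this sketch.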

Then this requires a bunch of changes in context implementations that I cannot describe without trying...

So yeah, a bunch of work.

ISibboI · 2023-06-08T13:23:17Z

src/token/mod.rs

+                    let contains_alphabet = literal.contains(|x: char| x.is_alphabetic());
+                    if let Ok(boolean) = literal.parse::<bool>() {
+                        Some(Token::Boolean(boolean))
+                    } else if starts_with_alphabet_or_underscore


The decomposition of the conditions into separate variables is great!
Here we should first check only starts_with_alphabet_or_underscore, and if it is true but any of the other three are false, then return the error variant IllegalIdentifierSequence.

I did boolean first because true and false would both be interpreted as identifiers.

I mean that the branch after boolean should only check for starts_with_alphabet_or_underscore, and then should have an inner branch that checks the rest.
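The nested branching requested here could be sketched as below. This is an illustration only, not the PR's code: the diff excerpt does not show the other condition variables, so a single hypothetical `rest_is_valid` stands in for them, and `parse_word` with its reduced `Token` and `EvalexprError` types is made up for the example.

```rust
#[derive(Debug, PartialEq)]
enum Token {
    Boolean(bool),
    Identifier(String),
}

#[derive(Debug, PartialEq)]
enum EvalexprError {
    IllegalIdentifierSequence(String),
}

// Returns Ok(None) when the literal is not identifier-like at all,
// leaving it to the numeric parsing that would follow.
fn parse_word(literal: &str) -> Result<Option<Token>, EvalexprError> {
    let starts_with_alphabet_or_underscore = literal
        .chars()
        .next()
        .map_or(false, |c| c.is_alphabetic() || c == '_');
    // Hypothetical stand-in for the remaining conditions of the PR.
    let rest_is_valid = literal
        .chars()
        .skip(1)
        .all(|c| c.is_alphanumeric() || c == '_');

    if let Ok(boolean) = literal.parse::<bool>() {
        // Boolean first, since `true` and `false` also look like identifiers.
        Ok(Some(Token::Boolean(boolean)))
    } else if starts_with_alphabet_or_underscore {
        // Outer branch: the literal is meant to be an identifier.
        // Inner branch: reject it if the rest does not conform.
        if rest_is_valid {
            Ok(Some(Token::Identifier(literal.to_string())))
        } else {
            Err(EvalexprError::IllegalIdentifierSequence(literal.to_string()))
        }
    } else {
        Ok(None)
    }
}
```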

ISibboI · 2023-06-08T13:23:51Z

src/token/mod.rs

@@ -370,10 +384,16 @@ fn partial_tokens_to_tokens(mut tokens: &[PartialToken]) -> EvalexprResult<Vec<T
                                    cutoff = 3;
                                    Some(Token::Float(number))
                                } else {
-                                    Some(Token::Identifier(literal.to_string()))
+                                    return Err(EvalexprError::IllegalIdentifierSequence(


Here we are parsing the literal as a number, so this should be a new error variant like IllegalNumericLiteral.

Ok. I will change this
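As a rough sketch of the distinction being asked for, a separate variant lets the error say what the tokenizer was actually trying to parse. The variant name IllegalNumericLiteral comes from the review comment; the payload type, the messages and the `parse_numeric_literal` helper are assumptions for illustration and do not reflect the crate's real definitions.

```rust
use std::fmt;

#[derive(Debug, PartialEq)]
enum EvalexprError {
    // Raised when a literal was meant to be an identifier.
    IllegalIdentifierSequence(String),
    // Raised when a literal was being parsed as a number.
    IllegalNumericLiteral(String),
}

impl fmt::Display for EvalexprError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            EvalexprError::IllegalIdentifierSequence(s) => {
                write!(f, "illegal identifier: {s}")
            }
            EvalexprError::IllegalNumericLiteral(s) => {
                write!(f, "illegal numeric literal: {s}")
            }
        }
    }
}

// Hypothetical helper: in a branch that already expects a number,
// a parse failure is reported as a numeric error, not an identifier error.
fn parse_numeric_literal(literal: &str) -> Result<f64, EvalexprError> {
    literal
        .parse::<f64>()
        .map_err(|_| EvalexprError::IllegalNumericLiteral(literal.to_string()))
}
```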

ISibboI · 2023-06-08T13:23:56Z

src/token/mod.rs

                                }
                            },
-                            _ => Some(Token::Identifier(literal.to_string())),
+                            _ => {
+                                return Err(EvalexprError::IllegalIdentifierSequence(


Here we are parsing the literal as a number, so this should be a new error variant like IllegalNumericLiteral.

hexofyore · 2023-06-09T10:17:42Z

Ok. I will probably do this after this weekend.

zeroishero added 2 commits June 8, 2023 15:53

Changed Parsing Method of Identifiers

e0c91f0

Changed,removed tests with illegal identifiers.

8431650

ISibboI force-pushed the feature_literal_errors branch from 53d4507 to 8431650 Compare June 8, 2023 12:53

ISibboI reviewed Jun 8, 2023

Merge branch 'main' into feature_literal_errors

7911d4e

ISibboI force-pushed the main branch 6 times, most recently from e4a8571 to 6608b16 Compare October 11, 2024 15:01
