lrpar:%prec not followed by token name #450

Closed
liluyue opened this issue May 13, 2024 · 11 comments · Fixed by #453

Comments

@liluyue

liluyue commented May 13, 2024

Whenever I use %nonassoc, I always get an error.
[Screenshot 2024-05-13 14 48 55]

@ltratt
Member

ltratt commented May 13, 2024

If you can include the raw text of the input and the error that's produced, that would be helpful.

@liluyue
Author

liluyue commented May 13, 2024

If you can include the raw text of the input and the error that's produced, that would be helpful.
Here is the example with precedence declarations added:

%start Expr
%left '+' '-'
%left '*' '/'
%nonassoc 'u'
%%
Expr -> Result<u64, ()>:
      Expr '+' Expr { Ok($1? + $3?) }
    | Expr '-' Expr { Ok($1? - $3?) }
    |  Expr '*' Expr { Ok($1? * $3?) }
    | Expr '/' Expr { Ok($1? / $3?) }
    | Factor { $1 }
    | '-' Factor %prec 'u' { Ok(-$2?) }
    ;


Factor -> Result<u64, ()>:
      '(' Expr ')' { $2 }
    | 'INT'
      {
          let v = $1.map_err(|_| ())?;
          parse_int($lexer.span_str(v.span()))
      }
    ;
%%
// Any functions here are in scope for all the grammar actions above.

fn parse_int(s: &str) -> Result<u64, ()> {
    match s.parse::<u64>() {
        Ok(val) => Ok(val),
        Err(_) => {
            eprintln!("{} cannot be represented as a u64", s);
            Err(())
        }
    }
}

@ltratt
Member

ltratt commented May 13, 2024

Can you also include the error you're seeing please?

@liluyue
Author

liluyue commented May 13, 2024

[Error] in src/calc.y
    12|     | '-' Factor %prec 'u' { Ok(-$2?) }
                                %prec not followed by token name
    
    11|     | Factor { $1 }
              ^^^^^^ Unknown reference to rule 'Factor'
    

@ltratt
Member

ltratt commented May 13, 2024

That probably means that your lexer isn't defining u as a token type? [It could be a bug in cfgrammar of course, but it has a check if self.ast.tokens.contains(&sym) that looks sensible to me, at least at first glance.]
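
For illustration, here is a minimal sketch of the kind of membership check described above. It is not cfgrammar's code: the `DeclaredTokens` type and the `check_prec` method are hypothetical names invented for this example, and only the error text is taken from the output quoted in this thread.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the set of token names a grammar knows about.
/// This is NOT cfgrammar's real AST type.
struct DeclaredTokens {
    tokens: HashSet<String>,
}

impl DeclaredTokens {
    /// Build the set from a list of token names.
    fn new(names: &[&str]) -> Self {
        DeclaredTokens {
            tokens: names.iter().map(|s| s.to_string()).collect(),
        }
    }

    /// Accept the operand of `%prec` only if it names a known token.
    fn check_prec(&self, sym: &str) -> Result<(), String> {
        if self.tokens.contains(sym) {
            Ok(())
        } else {
            Err("%prec not followed by token name".to_string())
        }
    }
}
```

Under this sketch, `check_prec("u")` succeeds only when `u` is in the token set, which is why a check of this shape would reject `%prec 'u'` if `'u'` were never recorded as a token.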

@liluyue
Author

liluyue commented May 13, 2024

It can be troublesome to resolve shift/reduce conflicts between rules without %nonassoc. According to yacc's rules, %nonassoc should not be bound to a token.
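
As background on why the precedence declarations matter, the conventional yacc rule for resolving a shift/reduce conflict can be sketched as follows. This is a simplified model, not grmtools code: the types `Assoc`, `Prec`, `Resolution` and the function `resolve` are hypothetical names, and the sketch assumes both the production and the lookahead token have a declared precedence.

```rust
/// Associativity attached to a precedence level.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Assoc {
    Left,
    Right,
    Nonassoc,
}

/// A precedence level; later declarations get a higher `level`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Prec {
    level: u32,
    assoc: Assoc,
}

/// Outcome of resolving a shift/reduce conflict.
#[derive(Debug, PartialEq)]
enum Resolution {
    Shift,
    Reduce,
    Error,
}

/// Compare the production's precedence (taken from `%prec`, or from its
/// last token) with the lookahead token's precedence.
fn resolve(prod: Prec, tok: Prec) -> Resolution {
    if prod.level > tok.level {
        Resolution::Reduce
    } else if prod.level < tok.level {
        Resolution::Shift
    } else {
        // Equal levels: associativity decides.
        match tok.assoc {
            Assoc::Left => Resolution::Reduce,
            Assoc::Right => Resolution::Shift,
            Assoc::Nonassoc => Resolution::Error,
        }
    }
}
```

With levels numbered in declaration order (`'+' '-'` as 1, `'*' '/'` as 2, `'u'` as 3), a production marked `%prec 'u'` outranks every binary operator, so the model chooses to reduce it before a following `'*'` is shifted.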

@ltratt
Member

ltratt commented May 13, 2024

I don't think %nonassoc is relevant to the error you're seeing but @ratmice might have a better idea than me.

@ratmice
Collaborator

ratmice commented May 13, 2024

To get the example working through nimbleparse (besides adding a lex file for it), I had to add the following,
Edit: It seems like that might not be producing the AST I would expect though.

%token 'u'
%expect-unused 'u'

as well as using the Unmatched trick from https://softdevteam.github.io/grmtools/master/book/errorrecovery.html in the lexer, which seemed to work.

Edit:
I do find it odd that, even when using the Unmatched trick, %token still seems to be required.

Also, unrelated, but it is strange that the following error lacks the ^^^^ underlining, presumably under %prec.
Perhaps this error has a zero-length span on the line?

 [Error] in src/calc.y
    12|     | '-' Factor %prec 'u' { Ok(-$2?) }
                                %prec not followed by token name

@ltratt
Member

ltratt commented May 15, 2024

I think there's at least 1 bug in grmtools here. Investigating.

@ltratt
Member

ltratt commented May 15, 2024

OK so with this lexer:

%%
[0-9]+ "INT"
\+ "+"
\- "-"
/ "/"
\* "*"
\( "("
\) ")"
u "u"
[\t \n\r]+ ;

and this grammar:

%start Expr
%left '+' '-'
%left '*' '/'
%nonassoc 'u'
%expect-unused 'u'
%%
Expr -> Result<u64, ()>:
      Expr '+' Expr { Ok($1? + $3?) }
    | Expr '-' Expr { Ok($1? - $3?) }
    |  Expr '*' Expr { Ok($1? * $3?) }
    | Expr '/' Expr { Ok($1? / $3?) }
    | Factor { $1 }
    | '-' Factor %prec 'u' { Ok(-$2?) }
    ;


Factor -> Result<u64, ()>:
      '(' Expr ')' { $2 }
    | 'INT'
      {
          let v = $1.map_err(|_| ())?;
          parse_int($lexer.span_str(v.span()))
      }
    ;
%%
// Any functions here are in scope for all the grammar actions above.

fn parse_int(s: &str) -> Result<u64, ()> {
    match s.parse::<u64>() {
        Ok(val) => Ok(val),
        Err(_) => {
            eprintln!("{} cannot be represented as a u64", s);
            Err(())
        }
    }
}

I was able to replicate the problem. #453 fixes the problem for me (and with inputs such as "2+3*4" gives a parse tree I'd expect), even if it's not quite the fix I might have hoped for. Will see what @ratmice says in the review though, as my memory is fuzzy on some details.
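
To make concrete what grouping one would expect for an input such as "2+3*4" under these precedence declarations, here is a small hand-written precedence-climbing sketch. It is an independent illustration, not grmtools or its LR parser: the function names are hypothetical, it assumes well-formed input, and it is slightly more permissive than the grammar above (for example, it accepts repeated unary minus).

```rust
/// Binding level of a binary operator, mirroring `%left '+' '-'`
/// followed by `%left '*' '/'`.
fn binary_level(op: char) -> Option<u8> {
    match op {
        '+' | '-' => Some(1),
        '*' | '/' => Some(2),
        _ => None,
    }
}

/// Parse an integer, a parenthesised expression, or a unary minus.
/// Unary minus binds tighter than any binary operator.
fn parse_atom(chars: &[char], pos: &mut usize) -> String {
    match chars[*pos] {
        '-' => {
            *pos += 1;
            let operand = parse_atom(chars, pos);
            format!("(-{})", operand)
        }
        '(' => {
            *pos += 1;
            let inner = parse_expr(chars, pos, 1);
            *pos += 1; // skip the closing ')'
            inner
        }
        _ => {
            let start = *pos;
            while *pos < chars.len() && chars[*pos].is_ascii_digit() {
                *pos += 1;
            }
            chars[start..*pos].iter().collect()
        }
    }
}

/// Precedence climbing: only consume operators at or above `min_level`.
fn parse_expr(chars: &[char], pos: &mut usize, min_level: u8) -> String {
    let mut lhs = parse_atom(chars, pos);
    while *pos < chars.len() {
        let op = chars[*pos];
        let level = match binary_level(op) {
            Some(l) if l >= min_level => l,
            _ => break,
        };
        *pos += 1;
        // `level + 1` makes the operator left-associative.
        let rhs = parse_expr(chars, pos, level + 1);
        lhs = format!("({}{}{})", lhs, op, rhs);
    }
    lhs
}

/// Return the input with explicit parentheses showing the grouping.
fn group(input: &str) -> String {
    let chars: Vec<char> = input.chars().filter(|c| !c.is_whitespace()).collect();
    let mut pos = 0;
    parse_expr(&chars, &mut pos, 1)
}
```

In this sketch, `group("2+3*4")` yields `(2+(3*4))`, and `group("-2*3")` yields `((-2)*3)`, matching the intent that the unary minus rule has the highest precedence.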

@liluyue liluyue closed this as completed May 15, 2024
@ltratt
Member

ltratt commented May 15, 2024

I'm going to keep this one open until #453 is reviewed, because I'm not 100% confident of my fix!

@ltratt ltratt reopened this May 15, 2024