Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace ~ with homedir #49

Closed
wants to merge 10 commits into from
Closed

Replace ~ with homedir #49

wants to merge 10 commits into from

Conversation

prsabahrami
Copy link
Contributor

@prsabahrami prsabahrami commented Sep 6, 2024

As an extension to #37 and fix for #44

@prsabahrami prsabahrami marked this pull request as draft September 6, 2024 21:58
@prsabahrami
Copy link
Contributor Author

prsabahrami commented Sep 6, 2024

At first I thought we can do the replacement while parsing and that way ls ~/Desktop and other commands would work; but, there wouldn't be anyway to handle echo ~user. Hence, reverting parser modification and making the changes on args.rs. This will result in making args mutable.

@prsabahrami prsabahrami marked this pull request as ready for review September 6, 2024 23:33
@certik
Copy link
Collaborator

certik commented Sep 7, 2024

What does echo ~user do? On Ubuntu 22 I get:

image

@prsabahrami
Copy link
Contributor Author

What does echo ~user do? On Ubuntu 22 I get:

image

I believe you should replaceuser with the username. For exampe, in your case, it would be echo ~ondrej. This returns the home dir to a specific user.

@certik
Copy link
Collaborator

certik commented Sep 7, 2024

Ah I see, indeed it does:

image

I didn't know this is possible. If it is hard to support, I think we don't need to support this.

@prsabahrami
Copy link
Contributor Author

I don't think it is hard to support. Rather, it makes it impossible to expand ~ to homedir at parsing time. I believe there are other examples too where an unquoted ~ is not directly expanded. Hence, I think what I just implemented, only expands ~ in commands where it should be. With this implementation, we can support echo ~user easily as well.

@wolfv
Copy link
Member

wolfv commented Sep 7, 2024

Hmm, complicated. I wonder if we should treat it as a special token.

The following behavior is observed in zsh:

╭─wolfv@pixi ~/Programs/shell ‹multiline●› 
╰─$ echo foo~asd 
foo~asd
╭─wolfv@pixi ~/Programs/shell ‹multiline●› 
╰─$ echo ~/foo  
/Users/wolfv/foo
╭─wolfv@pixi ~/Programs/shell ‹multiline●› 
╰─$ echo "~"
~
╭─wolfv@pixi ~/Programs/shell ‹multiline●› 
╰─$ echo ~user
zsh: no such user or named directory: user
╭─wolfv@pixi ~/Programs/shell ‹multiline●› 
╰─$ echo ~wolfv                                                                                                                    
/Users/wolfv

So only when ~ is unquoted and at the beginning of a string it does "something". And the remainder string is parsed until a / is found? (not sure about that last rule).

@wolfv
Copy link
Member

wolfv commented Sep 7, 2024

PS: I haven't tested your branch yet! So maybe it works already exactly as bash here.

arg.to_string()
};
args.push(OsString::from(expanded_arg));
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ls should not be handling ~ explicitly.

@certik
Copy link
Collaborator

certik commented Sep 7, 2024

I think it's time to step back and separate parsing, semantics and execution.

The parser should take care of handling ~ and distinguish if it is part of the token, or special handling. If special handling, it needs to use a dedicated AST node. It should produce an AST.

The semantic phase goes over the AST and checks all semantics (such as: are options valid for "ls", "cd" and other commands).

It should create an IR (intermediate representation) that is semantically valid.

Conceptually we then need to add several IR->IR passes:

  • Add IR->IR pass that expands ~ node and other things (variable substitutions, etc.).

The backend then takes the IR and executes. Ideally it does not handle any ~, any variables (unless they are runtime), etc. Everything is prepared.

We then have a choice: either we have an explicit IR->IR passes, or we can merge them with AST->IR, or with IR->backend. However, conceptually (in our head) we should think of these operations as independent, standalone IR->IR passes.

I made this #70.

@prsabahrami
Copy link
Contributor Author

Hmm, complicated. I wonder if we should treat it as a special token.

The following behavior is observed in zsh:

╭─wolfv@pixi ~/Programs/shell ‹multiline●› 
╰─$ echo foo~asd 
foo~asd
╭─wolfv@pixi ~/Programs/shell ‹multiline●› 
╰─$ echo ~/foo  
/Users/wolfv/foo
╭─wolfv@pixi ~/Programs/shell ‹multiline●› 
╰─$ echo "~"
~
╭─wolfv@pixi ~/Programs/shell ‹multiline●› 
╰─$ echo ~user
zsh: no such user or named directory: user
╭─wolfv@pixi ~/Programs/shell ‹multiline●› 
╰─$ echo ~wolfv                                                                                                                    
/Users/wolfv

So only when ~ is unquoted and at the beginning of a string it does "something". And the remainder string is parsed until a / is found? (not sure about that last rule).

Yes, that is correct.
I think the following gives proper explanation on this.
https://pubs.opengroup.org/onlinepubs/9799919799/utilities/V3_chap02.html#tag_19_06_01
In regards to the last rule.

Otherwise, the characters in the tilde-prefix following the tilde shall be treated as a possible login name from the user database.

@prsabahrami prsabahrami marked this pull request as draft September 7, 2024 22:26
@wolfv
Copy link
Member

wolfv commented Sep 7, 2024

Ah great, the reference you found is indeed pretty clear :)

@prsabahrami
Copy link
Contributor Author

I have moved this to draft for now. I will try to expand during parsing just as @certik mentioned.

@wolfv
Copy link
Member

wolfv commented Sep 7, 2024

I think the pest grammar should turn this into something like TILDE{ username: Option<string> } and then later, we would expand this.

@certik
Copy link
Collaborator

certik commented Sep 8, 2024

I will try to expand during parsing just as @certik mentioned.

I was suggesting NOT to expand during parsing, but rather create a TILDE{ username: Option<string> } AST node for things like ~/bin and ~user, but not for a~b. We expand this later in the compiler.

@prsabahrami prsabahrami closed this Sep 8, 2024
@prsabahrami prsabahrami deleted the cd_replace_homedir branch September 8, 2024 16:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants