[FEAT] Implement wildcard expansion for the current directory #292

Merged
merged 21 commits into from
Apr 1, 2024

Conversation

itislu
Collaborator

@itislu itislu commented Mar 30, 2024


Matching filenames

  • Neither the heredoc content nor the heredoc end delimiter undergoes wildcard expansion.
  • A * inside any quotes is not expanded and only matches the literal character *.
  • If no filenames match, or the current directory cannot be read, the string is left unmodified by wildcard expansion.
  • Filenames starting with . have to be matched explicitly, for example with .*.
  • If any character is quoted, the quoted character itself shall be matched.
    This means that, even though filename expansion occurs before quote removal, *"fi"le should not try to match <anything>"fi"le, but <anything>file. (reference)
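
The matching rules above can be sketched roughly as follows. This is only an illustration, not the code from this PR: `match_pattern()`, `matches_filename()` and the `mask` parameter are hypothetical names. The sketch assumes the pattern has already had its quotes removed and that `mask` has the same length as the pattern, with 'W' marking each `*` that was unquoted and may therefore expand.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: match `name` against `pattern`.
 * A '*' acts as a wildcard only where mask[i] == 'W'; any other '*'
 * (one that was quoted) must match a literal '*' in the name.
 */
static bool	match_pattern(const char *pattern, const char *mask,
				const char *name)
{
	size_t	p = 0;
	size_t	n = 0;
	size_t	star_p = 0;
	size_t	star_n = 0;
	bool	has_star = false;

	while (name[n] != '\0')
	{
		if (pattern[p] == '*' && mask[p] == 'W')
		{
			/* Remember the wildcard so we can backtrack to it later. */
			has_star = true;
			star_p = p++;
			star_n = n;
		}
		else if (pattern[p] != '\0' && pattern[p] == name[n])
		{
			p++;
			n++;
		}
		else if (has_star)
		{
			/* Let the last wildcard swallow one more character. */
			p = star_p + 1;
			n = ++star_n;
		}
		else
			return (false);
	}
	/* Trailing wildcards may match the empty string. */
	while (pattern[p] == '*' && mask[p] == 'W')
		p++;
	return (pattern[p] == '\0');
}

/* Hidden files only match when the pattern starts with a literal '.'. */
bool	matches_filename(const char *pattern, const char *mask,
			const char *name)
{
	if (name[0] == '.' && pattern[0] != '.')
		return (false);
	return (match_pattern(pattern, mask, name));
}
```

With this representation, `*"*"*` becomes the pattern `***` with the mask `W.W`: it matches a name such as `a*b`, but not `ab`.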

Order of filenames

The order of the matching filenames imitates bash's sorting order for ASCII characters, following these rules, which were worked out by testing:

  1. Numeric characters come before alphabetic characters.
  2. Alphabetic characters are compared ignoring case.
  3. Lowercase characters come before uppercase characters.
  4. If equal alphanumerically, non-alphanumeric characters are given priority.
  5. If both are non-alphanumeric, lower ASCII value comes first.
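
One possible reading of these rules as a comparison function is sketched below. It is an illustration under assumptions, not the PR's implementation: `compare_filenames()` and its helpers are hypothetical names, only ASCII is considered, and the rules are applied as three successive passes (alphanumerics ignoring case, then case, then non-alphanumerics). Digits sort before letters simply because their ASCII values are lower.

```c
#include <ctype.h>

/* Advance to the next alphanumeric character (or the end of the string). */
static const char	*skip_non_alnum(const char *s)
{
	while (*s != '\0' && !isalnum((unsigned char)*s))
		s++;
	return (s);
}

/*
 * Compare only the alphanumeric characters of both names.
 * fold_case != 0: ignore case (rules 1 and 2).
 * fold_case == 0: the first case difference decides, lowercase first (rule 3).
 */
static int	cmp_alnum(const char *a, const char *b, int fold_case)
{
	unsigned char	ca;
	unsigned char	cb;

	a = skip_non_alnum(a);
	b = skip_non_alnum(b);
	while (*a != '\0' && *b != '\0')
	{
		ca = (unsigned char)*a;
		cb = (unsigned char)*b;
		if (fold_case && tolower(ca) != tolower(cb))
			return (tolower(ca) - tolower(cb));
		if (!fold_case && ca != cb)
			return (islower(ca) ? -1 : 1);
		a = skip_non_alnum(a + 1);
		b = skip_non_alnum(b + 1);
	}
	/* The name that runs out of alphanumerics first sorts first. */
	return ((*a != '\0') - (*b != '\0'));
}

/* Rules 4 and 5: decide on the first raw character that differs. */
static int	cmp_non_alnum(const char *a, const char *b)
{
	unsigned char	ca;
	unsigned char	cb;

	while (*a != '\0' && *a == *b)
	{
		a++;
		b++;
	}
	ca = (unsigned char)*a;
	cb = (unsigned char)*b;
	if (!isalnum(ca) && isalnum(cb))
		return (-1);
	if (isalnum(ca) && !isalnum(cb))
		return (1);
	return (ca - cb);
}

/* Negative if a sorts before b, positive if after, 0 if identical. */
int	compare_filenames(const char *a, const char *b)
{
	int	result;

	result = cmp_alnum(a, b, 1);
	if (result == 0)
		result = cmp_alnum(a, b, 0);
	if (result == 0)
		result = cmp_non_alnum(a, b);
	return (result);
}
```

Under this reading, `makefile` sorts before `Makefile`, and `a-b` sorts before `ab` because the two only differ in a non-alphanumeric character.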

However, bash itself uses a much more complicated system called collation, where each Unicode character has a specific place in a list of all characters.
This list can be found at this path:

/usr/share/i18n/locales/iso14651_t1_common

For all possible characters from the German keyboard, the order is as follows:

\t\n\v\f\r !"#%&'()*+,-./:;<=>?@[\]^_`{|}~¡§¨©«¬®¯°±´¶·¸»¿×÷ˇˍ˘˙˚˛˝–—‘’‚“”„…′″‹›←↑→↓¤¢$£¥€01¹½¼⅛2²3³⅜45⅝67⅞89aAªäÄæÆbBcCdDđðÐeEfFgGhHħĦiIıjJkKlLłŁmMnNŋŊoOºöÖøØpPqQĸrRsSſßẞtT™ŧŦuUüÜvVwWxXyYzZþÞµΩ

C provides the function strcoll(), which compares two strings according to the collation order of the current locale.
Further reading:
https://unix.stackexchange.com/questions/423345/generate-collating-order-of-a-list-of-individual-characters
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html#tag_07_03_02
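
For comparison, a locale-aware sort built on strcoll() could look like the sketch below. It is not part of this PR, and `sort_filenames_by_locale()` is a hypothetical name. Note that strcoll() only follows the user's collation order if the program has called `setlocale(LC_COLLATE, "")` beforehand; otherwise the default "C" locale applies and the comparison behaves like strcmp().

```c
#include <stdlib.h>
#include <string.h>

/* qsort() passes pointers to the array elements, i.e. pointers to char *. */
static int	cmp_collate(const void *a, const void *b)
{
	const char *const	*sa = a;
	const char *const	*sb = b;

	return (strcoll(*sa, *sb));
}

/* Sort an array of filenames in place using the current LC_COLLATE order. */
void	sort_filenames_by_locale(char **names, size_t count)
{
	qsort(names, count, sizeof(*names), cmp_collate);
}
```

A caller would typically run `setlocale(LC_COLLATE, "")` (declared in `<locale.h>`) once at startup and then pass the list of matched filenames to `sort_filenames_by_locale()`.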



Test cases

export A=$
echo "$A"USER
export A="*"
echo $A""
echo *""file
echo *"f"ile
echo *"*"*
echo *" "*
export B="a  b  c Make"
echo ""$B""*$B*""
echo "*"*""*
echo *"*"*""**
echo "M"*"e"
touch echo file
* 123

@itislu itislu added the feature New feature of the project label Mar 30, 2024
@itislu itislu force-pushed the feat-wildcard-expansion branch from 3acdf70 to 7c32b64 Compare March 30, 2024 21:49
@itislu itislu added this to the Expander milestone Mar 31, 2024
@LeaYeh LeaYeh force-pushed the feat-wildcard-expansion branch from aa251be to d80bd98 Compare March 31, 2024 18:29
@itislu itislu force-pushed the feat-wildcard-expansion branch 5 times, most recently from 913c611 to eb7b041 Compare April 1, 2024 10:04
@itislu itislu linked an issue Apr 1, 2024 that may be closed by this pull request
itislu added 17 commits April 1, 2024 15:17
TODO:
- Don't expand '*' if it is quoted.
- Remove quotes from the pattern to match filenames against.
- Filename list sorting is not correct yet.
This is to better differentiate between the `t_expander_task` type and `t_list` type.
BUG: This breaks the behavior of quotes that come from variable expansions, as was already solved in PR #177.

- Add wildcard tasks to the task list to be able to differentiate which asterisks should be expanded and which should not because they were quoted.

- Append quote tasks only after parameter expansion together with wildcard tasks. This is so that the wildcard tasks can be appended to the task list in correct order with the quote tasks. Otherwise, it would cause issues with the current implementation of how updating the task_list works after an expansion.

- Also fix norminette for wildcard expansion and separate new functions into other files.
- Revert back to adding the quote tasks together with the variable expansion tasks in order to not take quotes that came from variable expansion into consideration.

- In order to keep the quote interaction with wildcards correct (only quotes that were in the string before variable expansion should affect wildcard expansion), when searching every character of all words for a '*', also search for quotes. If one is found, check the task_list to see whether the current character is part of a task, and if so move the first node from the old task list to the new task list.
At this stage there should be only quote tasks in the old task list, so afterwards all nodes of the old task list should have been moved to the new task list, with the new wildcard tasks inserted in the correct positions.
1. Compare considering only alphanumeric, not considering case.
2. If equal so far, lowercase before uppercase.
3. If equal still, give priority to non-alphanumeric.
4. If both are non-alphanumeric, just check which has lower ASCII value.
Don't do wildcard expansion in heredoc content or end.

Known issues:
- #295 Word splitting and filename expansion are not performed when assigning values to env variables
- #296 In --posix mode, words after redirection operators do not undergo filename expansion if shell is not interactive
itislu added 3 commits April 1, 2024 15:17
Also reduce 2 unnecessary lines in `expand_wildcard()`.
Also change `any_wildcard_task()` to `any_task_of_type()` for more general use.
It contained a lot of tricky filenames, so it's good to keep it in the history.
@itislu itislu force-pushed the feat-wildcard-expansion branch from eb7b041 to 632a501 Compare April 1, 2024 13:17
@itislu itislu changed the title [FEAT] Implement wildcard expansion [FEAT] Implement wildcard expansion for the current directory Apr 1, 2024
@itislu
Collaborator Author

itislu commented Apr 1, 2024

@LeaYeh I left the function name as get_next_wildcard() for consistency because we always use get_... for this type of function.
If that's okay, this PR can be merged.

@LeaYeh LeaYeh merged commit c0aef05 into main Apr 1, 2024
30 checks passed
@LeaYeh LeaYeh deleted the feat-wildcard-expansion branch April 1, 2024 17:53
Labels
feature New feature of the project
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[EXPLANATION] Wildcard behavior
2 participants