-
-
Notifications
You must be signed in to change notification settings - Fork 230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
std::os::argparse module #1897
base: master
Are you sure you want to change the base?
std::os::argparse module #1897
Conversation
I don't know if the API for std argparse has already been discussed (It's not obvious from a quick search of issues or looking at the test runner pr). If not, I've got opinions 😄 and code I'd be willing to donate. But if this has already been decided I don't want to derail. |
There are a bunch of tests of argparse there in the test runner PR. So you may try to get a sense of it API. Anyway, I'm open for ideas. |
@hwchen did you have some feedback? |
Just want to be clear that I'm not really commenting on the the current implementation. I'm more interested in whether there's a certain type of API we're looking for in an argparse module. I come from Rust, and not C, so I'll explain in terms of those libraries.
The ripgrep crate moved away from clap to lexopt, in part to reduce dependencies, and also because lexopt would end up providing more control over arg parsing (at the cost of having to implement more boilerplate). I feel that the current PR API sits between the two (more towards simplicity). I think for stdlib, I'd prefer either extreme; if it's simpler, more complex parsers can be built on it, and if it has more features it can be used easily as-is for more scenarios. Odin ended up with something more comprehensive (can defined opts using a struct with tags). Also, I believe that wherever we want to sit on the spectrum, it's good to be explicit about it. As for my own biases, I've written an arg parsing library for c3 which follows the general structure of lexopt's API. I might prefer something like it in the std library, but I can also see the appeal of other approaches. And seeing as everybody ends up writing their own argparse, there's probably a lot of other opinions out there too :) |
I have a bit of feedback on this. I feel like the API is fine, it's exactly what it says that it is, argument parser. If something more like a full blown CLI app API would be needed (to have 0 hassle subcommands and what not) I could be in another module, which would utilize argparse under the hood. What I currently don't see is a way to provide an array option, since from the implementation it would seem that providing a single option multiple times would result in an error of "duplicated option". The value of the option could be handled by the The only other thing that came to mind was a bit more "hackability", for example if I wanted to somehow implement validation of a parameter, I would have to do it myself after the parsing and I would also have to write the extra help info (if it was for example an enum). But again, this could be solved by the wrapping module, which would hold the users hand a bit more. Altough my view is similar to hwchens' above, I feel like the current implementation here is good enough and if one wants to opt-out of some of the features, they still can (for example the help option is opt-in). Edit: I see now that the callback function can return optional, which makes the hackability possible for validation or exclusivity of options. |
This is a kind a thing I was thinking about. I think it's common for CLI to have accumulated values, e.g. So by design, the callback mechanism is the way to extend the argparse to whatever is needed. I can refine callbacks and arrays of arguments after PR approval.
All hackability is implemented via callbacks, or explicit param validation after parsing in the main (or other function). |
So for the arrays, just a simple flag Since now it would call the callback once and then error. I'm fine with arrays not being available by default and requiring custom implementation. |
FYI, I found array args impractical in most cases, I barely can remember anything I used with array args except maybe gcc :). For simple use, it's possible to use |
That's what most tools do, single flag with values separated by some character. But sometimes you might want those values to be arbitrary strings and there might not be a feasible separator, like when specifying ENV variables for docker etc. |
I am sorry this one isn't looked at yet. It's half past midnight and I don't have the time this lib deserves to check it. I'll need to push it to the weekend. |
Maybe I'm not the kind of audience who is using something like this, but for me it's more natural with a simpler design, as you might have guessed from the way build_options.c work. It is quite simple: have a switch which looks at each arg, then if the arg starts with This way checking is trivially stateful, which can be useful. So the useful functionality is not parsing the arguments but rather:
What are your thoughts? |
My library is also basically a big switch, but there's some additional lexing it does which I think is an improvement over just checking for
I had also added some convenience methods for parsing ints and paths, although I don't think they're really necessary. I might change up the error handling a bit; I want some decent canned error messages, but I also want it to be easy to silence when the user wants custom error messages. (Not sure how this will work on windows; lexopt had a decent amount of logic for handling windows) Example switch in loop:
|
Just a comment there, this: while (true) {
Arg arg = opter.next()!;
if (arg.type == EOF) break;
switch { Should be possible to rewrite as: while (try arg = opter.next())
{
switch { ... }
} |
But I think the one benefit of a module is to create some uniform way of presenting the help. I think the actual parsing might be something of a red herring. enum Options : (String short_opt, String long_opt, String description)
{
UNKNOWN = { "", "", "" }, // Mandatory
VALUE = { "", "", "The value" },
N = { "n=NUM", "number=NUM", "The number to use" },
SHOUT = { "", "shout", "Shout out" }
} In the above then, we can imagine: @parse_opt("foo.exe [options]", strings, Options; @body(OptParser* parser, OptArg arg)
{
switch (arg.type)
{
case UNKNOWN:
// Handle anything else here
// Example using another argument
OptArg! next = parser.next();
if (catch next_thing = parser.next())
{
return report_error("Expected an argument after %s.", arg.string);
}
io::printfn("The argument after was %s.", next.string);
case VALUE:
// Value arguments are handled in UNKNOWN
unreachable();
case N:
...
case SHOUT:
...
}
} Becoming:
But then we can imagine essentially a mini DSL defining things instead: enum Options : (String short_opt, String long_opt, String description)
{
UNKNOWN = { "EMPTY" },
VALUE = { "The value" },
N = { "n=;number=;NUM;The number to use" },
DEBUG = { "g;debug;Use debug" }, // -g, --debug
SHOUT = { "shout;Shout out" }
} It all depends on how much is in the DSL and how much is in the language. It's nice if the description and the commands are in sync, and that's the basic service provided. Then providing those utility methods for matching multiple options and so on. |
Let me explain and reason about my design decisions for C3 argparse is designed with the following requirements in mind:
@hwchen alternative library looks more like syntactic sugar to me, which requires working through each argument separately, and making printout manually. In order to have fair comparison, we need to implement the same functionality using alternative argparse libraries. And judge which one provides cleaner or shorter or more flexible code. It has to be different use case, different options and argument types, maybe sub commands. |
Not sure how I feel about it, but I implemented @lerno 's suggested API on top of my library (without help generation yet; I don't think that's the difficult part). Implementation: https://github.com/hwchen/opter-c3/blob/691e5b19ba5cc8e45418f7e717c1351072f083f3/fancy.c3 I think this demonstrates one perspective on my library: it's the minimum amount of lexing/parsing code that's still actually helpful. With the basic lexing/parsing out of the way, it's easy to focus on adding features if building a more advanced cli library, or handling some really twisted config logic in an application. To me, this gives a feeling of "hackability", which I associate with C3. So if we end up taking this general approach of just a switch statement (and using some of my code), I think it would be nice if the "lower-level" was still exposed, even if there was a higher-level API. |
Example of using callbacks, 2 flavors of callbacks: fn void test_custom_type_callback_unknown_type()
{
char val = '\0';
ArgParseCallbackFn cbf = fn void! (ArgOpt* opt, String value) {
io::printfn("flt--callback");
test::eq(value, "bar");
*anycast(opt.value, char)! = value[0];
};
// NOTE: pretends app struct
ArgParse my_app_state = {};
ArgParse agp = {
.options = {
{
.short_name = 'f',
.long_name = "flt",
.value = &val,
.callback = cbf
},
{
.short_name = 'o',
.long_name = "other",
.value = &my_app_state,
.callback = fn void! (ArgOpt* opt, String value) {
ArgParse* ctx = anycast(opt.value, ArgParse)!;
io::printfn("other--callback");
// NOTE: pretends to update important app struct
ctx.usage = value;
}
}
}
};
String[] args = { "testprog", "--flt=bar", "--other=my_callback" };
test::eq(val, '\0');
test::eq(agp.options[0]._is_present, false);
agp.parse(args)!!;
test::eq(val, 'b');
test::eq(my_app_state.usage, "my_callback");
test::eq(agp.options[0]._is_present, true);
} |
@tomaskallup With small amendments I managed to add multiple options parsing via callbacks fn void test_custom_type_callback_int_accumulator()
{
List(<int>) numbers;
numbers.new_init();
defer numbers.free();
argparse::ArgParse agp = {
.options = {
{
.short_name = 'n',
.long_name = "num",
.value = &numbers,
.callback = fn void! (ArgOpt* opt, String value) {
io::printfn("value: %s", value);
List(<int>) * ctx = anycast(opt.value, List(<int>))!;
int val = value.to_integer(int)!;
ctx.push(val);
}
},
}
};
String[] args = { "testprog", "--num=1", "-n", "5", "--num", "2" };
test::eq(numbers.len(), 0);
test::eq(agp.options[0]._is_present, false);
agp.parse(args)!!;
io::printfn("%s", numbers);
test::eq(numbers.len(), 2);
test::eq(numbers[0], 1);
test::eq(numbers[1], 5);
test::eq(numbers[2], 2);
test::eq(agp.options[0]._is_present, true);
}
|
Ok guys, if you wish moar control, why not, it's easy to add to This is an alternative way of parsing args (while preserving old path, so you can pick whichever you like more): fn void test_argparse_next_all_together()
{
List(<String>) args;
args.new_init();
defer args.free();
String[] argv = { "testprog", "-n", "5", "--num", "2", "--foo=3", "--", "-fex", "-I./foo"};
ArgParse agp;
while (String arg = agp.next(argv)!!) {
args.push(arg);
}
io::printfn("%s", args);
// prints
// [-n, 5, --num, 2, --foo, 3, --, -fex, -I./foo]
}
fn void test_argparse_next_switch()
{
List(<String>) args;
args.new_init();
defer args.free();
ArgParse agp = {
.description = "Test number sum program",
.options = {
{
.short_name = 'a',
.help = "a short name flag"
},
{
.long_name = "flag",
.help = "a long name flag"
},
}
};
String[] argv = { "testprog", "-n", "5", "--num", "2", "--num=3"};
int n_sum = 0; // expected to be 5+2+3
while (String arg = agp.next(argv)!!) {
switch (arg) {
case "-n":
case "--num":
String value = agp.next(argv)!!;
n_sum += value.to_integer(int)!!;
default:
// you may use ArgParse options only for usage display as well
// OR make your own usage printout
agp.print_usage()!!;
test::eq(1, 0); // unexpected here
}
args.push(arg);
}
io::printfn("%s", args);
test::eq(args.len(), 3);
// NOTE: we skipped values in switch "--num" handler
test::eq(args[0], "-n");
test::eq(args[1], "--num");
test::eq(args[2], "--num");
test::eq(n_sum, 5+2+3);
}
|
So currently I evolve |
…evel parser + unit test module ported
…alexveden/args_parse
std::os::argsparse module