State of CLI in the CPP World
Something doesn't really look right in the statically typed language landscape
After playing with some experimental code here and there, a moment comes when you want to control it via some arguments. Most of the time we do that in a dynamic language, but what happens in the world of C++?
Well, we get our arguments conveniently split at our main entry point, main(int argc, char* argv[]), and from then on we are on our own. Let's take a look at a couple of examples of command line parsing libraries that help us manage those arguments and build a nice control flow into our code.
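To make "on our own" concrete, here is a minimal sketch of parsing argv by hand. The Settings struct, the option names and parse_by_hand are all made up for illustration; they do not come from any of the libraries discussed.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical settings for a toy program; the names are invented.
struct Settings {
    int num_threads = 1;
    bool verbose = false;
    std::string input;
};

// Walks argv by hand. Every new option means another branch in this loop.
inline bool parse_by_hand(int argc, const char* const argv[], Settings& out) {
    for (int i = 1; i < argc; ++i) {
        std::string_view arg = argv[i];
        if (arg == "--verbose") {
            out.verbose = true;
        } else if (arg == "--num-threads") {
            if (i + 1 >= argc) return false;  // the value is missing
            try {
                out.num_threads = std::stoi(argv[++i]);
            } catch (const std::exception&) {
                return false;  // not a number, or out of range
            }
        } else if (!arg.empty() && arg.front() != '-') {
            out.input = std::string(arg);  // treat it as a positional argument
        } else {
            return false;  // unknown option
        }
    }
    return true;
}
```

It works, but validation, conversion and help text all end up tangled in one loop, which is the part a library should take off our hands.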
Current Popular Solutions
My favorites from this list are argparse, mostly because I like the original Python argparse, clipp because it's really compact, and CLI11 because it offers a lot of features. While features are important, ease of use should matter more for a library of this type.
Many Options, but what's wrong?!
Well, first off we should set our expectations for what a library that parses arguments should be. Most importantly, it should be intuitive and easy to use; we don't want to spend hours just setting up how our arguments are handled.
- Single header library (who wants a behemoth just to handle arguments 😓)
- Should not contribute to code clutter (meaning too much boilerplate)
- Validate and/or convert values to a proper type for ease of use
- Nested argument hierarchy (something akin to git commit and git clone)
- Raw access to values of the proper type (C++ is still a strictly typed language)
So far most libraries give you all of these. The biggest issue, though, is the last point: getting values with their proper type. Given an option named num-threads that should be an integer, we should expect that fetching it gives us a value of type int, not a string or something that needs additional conversion.
After doing some research on the libraries listed above and some thinking about what is really possible in the language, there are a couple of ways of handling it.
Handling argument values
Adding an option to our command line argument parser should be combined with selecting a type for it, either for validation or for conversion; otherwise we are just mapping names to strings (the arguments).
Bind to a variable: When introducing an option we could give it a reference/pointer to an existing variable. This way we get direct access to the value and a proper name to refer to it by. Sounds fine, but if we have to do it for all the options this becomes a mess of forward-declared variables, making our argument parser mostly an empty shell.
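A rough sketch of the binding idea, assuming a hypothetical BoundParser class that only understands "--name value" pairs. It is not modelled on any particular library.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical parser that writes straight into caller-owned variables.
class BoundParser {
public:
    void bind(const std::string& name, int& target) {
        // std::stoi throws on bad input; error handling is left out of this sketch.
        setters_[name] = [&target](const std::string& text) { target = std::stoi(text); };
    }
    void bind(const std::string& name, std::string& target) {
        setters_[name] = [&target](const std::string& text) { target = text; };
    }
    // Expects "--name value" pairs; returns false on an unknown name or a missing value.
    bool parse(const std::vector<std::string>& args) {
        for (std::size_t i = 0; i < args.size(); i += 2) {
            auto it = setters_.find(args[i]);
            if (it == setters_.end() || i + 1 >= args.size()) return false;
            it->second(args[i + 1]);
        }
        return true;
    }

private:
    std::unordered_map<std::string, std::function<void(const std::string&)>> setters_;
};
```

The values are typed and directly usable, but every option needs its own variable declared up front, and those variables have to outlive the parser.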
Store as a string: A sad option to be honest. Keep the argument as it is, maybe validate it based on a specified type, and then specify the proper type again when we want to convert it, like ['--option'].as<Type>.
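A small sketch of what the as<Type> style could look like. StringArgs and its members are hypothetical names, and the conversion simply leans on operator>>.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical container: every argument is kept as the raw string it arrived as.
class StringArgs {
public:
    struct Value {
        std::string text;

        // Converts on demand; the caller has to name the type on every access.
        template <typename T>
        T as() const {
            std::istringstream in(text);
            T result{};
            // Reject both failed reads and trailing garbage such as "12x".
            if (!(in >> result) || !(in >> std::ws).eof())
                throw std::invalid_argument("cannot convert '" + text + "'");
            return result;
        }
    };

    void set(const std::string& name, std::string text) {
        values_[name] = Value{std::move(text)};
    }
    const Value& operator[](const std::string& name) const { return values_.at(name); }

private:
    std::unordered_map<std::string, Value> values_;
};
```

Usage reads like args["--num-threads"].as<int>(), so the type appears at every call site and a mistake only surfaces at runtime.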
Type erased: Type erasure sounds like a great idea, since all our options behave the same way and store the value. But to retrieve the value we still have to specify the type, just like when storing it as a string.
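As a sketch of the type-erased variant, assuming std::any as the erasure mechanism (other techniques exist). AnyOptions and get_as are names invented for this example.

```cpp
#include <any>
#include <cassert>
#include <string>
#include <unordered_map>

// Every option lives in the same container regardless of its type.
using AnyOptions = std::unordered_map<std::string, std::any>;

// Retrieval still needs the type spelled out; a mismatch throws std::bad_any_cast.
template <typename T>
const T& get_as(const AnyOptions& options, const std::string& name) {
    return std::any_cast<const T&>(options.at(name));
}
```

Storing is uniform, but get_as<int>(options, "--num") repeats the type we already gave when declaring the option.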
It seems like we have a lot of options for handling the values, but all of them make us refer to the type we want multiple times, which is rather cumbersome in my opinion. That being said, CLI11 and Boost.Program_options are great libraries, if rather feature heavy when you just want to wrap up a project fast by adding options.
What is left to do is search for a new way to tackle this issue. Our biggest obstacle is of course the way types are handled in C++: everything that needs a specific type must be known at compile time. Reflection might be the cure but we are not there yet; maybe C++23 will rescue us from ugly code.
Different approaches to storing values
Here are some of the different strategies we could use to store the options, with simple examples. They are not the best way to implement things, just a way to illustrate the concept.
Store as strings
Map each option/flag name to the string value it consumed from the passed parameters. Simple: all the types you use are strings, and you could sort the names lexicographically as an optimization, maybe.
#include <string>
#include <unordered_map>
struct Option{
    std::string value; // raw text consumed from the command line
    bool required;
    std::string desc;
    std::string help;
};
std::unordered_map<std::string,Option> options;
// Iterable and simple to retrieve
Store as std::variant
Same style of mapping, but we store the real converted values in a proper variant. This still leaves us with the problem that we need to specify the type to get the value out.
using OptionTypes = std::variant<Option<int>,Option<float>,Option<std::string>>;
std::unordered_map<std::string,OptionTypes> options;
// Iterable and simple to retrieve
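To show where the type comes back in, here is a simplified sketch that stores plain values instead of Option<T> objects. OptionValue, OptionMap and to_text are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <unordered_map>
#include <variant>

using OptionValue = std::variant<int, float, std::string>;
using OptionMap = std::unordered_map<std::string, OptionValue>;

// A visitor can act on the stored value without the caller naming its type.
inline std::string to_text(const OptionValue& value) {
    return std::visit(
        [](const auto& v) -> std::string {
            using T = std::decay_t<decltype(v)>;
            if constexpr (std::is_same_v<T, std::string>)
                return v;
            else
                return std::to_string(v);
        },
        value);
}
```

Reading the value itself still means writing std::get<int>(options.at("--num")), which throws std::bad_variant_access if the stored alternative is something else, or using std::get_if and checking for a null pointer.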
Store in a tuple
Iterating a tuple of elements is tricky but possible; retrieving the values is the problematic part. One approach leads to the return value being a variant, for which we still have to put in the type once more to get the final value. A second approach would be type tagging, using a user-defined literal to convert a string to the appropriate index. Another approach I found was a table mapping to the get<Index> function pointers, but that has the constraint of a single return type (our table can store only one function pointer type); more about that here.
An example of this is shown here.
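The iteration part can be sketched with std::apply and a fold expression. Named and for_each_in_tuple are illustrative names, not taken from the linked examples.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Minimal stand-in for an option: a name plus a typed value.
template <typename T>
struct Named {
    std::string name;
    T value;
};

// Calls fn(element) for every tuple element, in order.
template <typename Tuple, typename Fn>
void for_each_in_tuple(Tuple& tuple, Fn&& fn) {
    std::apply([&](auto&... element) { (fn(element), ...); }, tuple);
}
```

The callback has to be a generic lambda (or an overload set), because each element can have a different type. That is fine for printing help text or offering arguments to every option, but it does not give us a typed value back by name.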
Store in a struct
Options are defined in a struct. There was a library I saw with an approach like this but it required a lot of boilerplate. My concept is having a POD type and converting it to a tuple to iterate over it. Converting the struct to a tuple is the tricky part. Retrieving the values becomes trivial because they are just members of a struct. This would be a viable option if there were reflection capabilities in the language.
struct MyOptions{
Option<int> num = Option<int>().required().desc("");
Option<float> x = Option<float>().required().desc("");
Option<float> y = Option<float>().required().desc("");
};
auto options = MyOptions();
parse(argc,argv,options);
if(options.num.value() == 12){ // num is a member of the options struct
}
Going with this approach loses our ability to iterate over the items. Here structured bindings come in, getting us all the items from the structure so they can easily be put into a tuple.
struct test{
int index;
float freq;
};
auto opts = test{12,3.4};
auto &[a,b] = opts;
auto tpl = std::make_tuple(a,b);
Great solution at first glance, but there is no way to handle an arbitrary number of structure members unless we hardcode (generate) the bindings up to a certain size. This has already been done in the Boost.PFR library, a great way to interact with POD structures.
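A sketch of the hardcoded version, assuming the caller passes the member count explicitly. Boost.PFR detects the count by itself; this toy as_tuple does not, and it only covers one to three members.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>

// Example aggregate, mirroring the struct used above.
struct Sample {
    int index;
    float freq;
};

// One hand-written branch per member count N; N is supplied by the caller.
template <std::size_t N, typename T>
auto as_tuple(T& object) {
    static_assert(N >= 1 && N <= 3, "this sketch only handles 1 to 3 members");
    if constexpr (N == 1) {
        auto& [a] = object;
        return std::tie(a);
    } else if constexpr (N == 2) {
        auto& [a, b] = object;
        return std::tie(a, b);
    } else {
        auto& [a, b, c] = object;
        return std::tie(a, b, c);
    }
}
```

Because std::tie builds a tuple of references, writing through the tuple changes the original struct. Supporting more members means adding more branches, which is why this kind of code is usually generated.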
My approach
Let's start with the basic premise: each option has a name, meta information (like help text, format, default value) and a type. The type will serve as validator, storage and an extension point.
Option<Type>("--name")
Now in terms of storing those options, instead of type erasure I found that tuples can be used. Most of the work is done by the options anyway, so we just need some glue logic.
All the meta information or additional properties can be assigned to the option using chained function calls to keep it compact ( Option<float>().required().desc("") ). So far so good. Parsing now becomes the simple task of going over the separated command line argument strings and iterating over the tuple elements to try to consume them (the option class has a member function consume that does this).
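The glue logic could look roughly like this. Opt is a simplified stand-in for Option<T> and parse_all is a hypothetical name; the real classes in the repository do more than this.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <sstream>
#include <string>
#include <tuple>
#include <vector>

// Simplified stand-in for the article's Option<T>; not the library's real class.
template <typename T>
struct Opt {
    std::string name;
    std::optional<T> value;

    // Tries to consume "name value" starting at args[pos]; advances pos on success.
    bool consume(const std::vector<std::string>& args, std::size_t& pos) {
        if (pos + 1 >= args.size() || args[pos] != name) return false;
        std::istringstream in(args[pos + 1]);
        T parsed{};
        if (!(in >> parsed)) return false;
        value = parsed;
        pos += 2;
        return true;
    }
};

// Offers the current argument to each option in turn; the fold over || stops
// at the first option that consumes it. Fails on an argument nobody accepts.
template <typename... Options>
bool parse_all(const std::vector<std::string>& args, Options&... options) {
    auto all = std::tie(options...);
    std::size_t pos = 0;
    while (pos < args.size()) {
        bool consumed = std::apply(
            [&](auto&... option) { return (option.consume(args, pos) || ...); }, all);
        if (!consumed) return false;
    }
    return true;
}
```

The parser itself knows nothing about types. Each option decides whether an argument belongs to it and how to convert it.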
Now on the topic of reading the parsed value: tuples are accessed by an index at compile time, but what we want to do instead is get the value by name (a string).
Impossible? Time for some fancy constexpr code
template<size_t I = 0>
arg_var fetch(const char* name) {
if (std::get<I>(opts).name == name) {
return std::get<I>(opts).value;
}
if constexpr (I + 1 != std::tuple_size_v<arg_opt_tupl>) {
return fetch<I + 1>(name);
}else{
throw std::runtime_error{ "Argument name doesn't exist" }; // from <stdexcept>; std::exception has no standard string constructor
}
}
It's a way to get the value, but it requires arg_var to be a std::variant<> of all the types that can be returned. Still, we get a variant at the end. The only thing stopping us from mapping a string to a tuple item is the restriction of not being able to use a string as a template argument. That might be possible in the next C++ standard but so far it's not. One hack that might be used is converting strings to integers, like a hash, using user-defined string literals. I would not call that a solution, more like a band-aid, and totally unnecessary.
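For the curious, the hashing hack could be sketched as below. The _id literal, the FNV-1a hash and default_for are choices made for this example only, and hash collisions are silently ignored, which is one more reason to call it a band-aid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// FNV-1a hash, usable in constant expressions.
constexpr std::uint64_t fnv1a(const char* text, std::size_t length) {
    std::uint64_t hash = 14695981039346656037ull;
    for (std::size_t i = 0; i < length; ++i) {
        hash ^= static_cast<unsigned char>(text[i]);
        hash *= 1099511628211ull;
    }
    return hash;
}

// User-defined literal: "--num"_id becomes an integer known at compile time.
constexpr std::uint64_t operator""_id(const char* text, std::size_t length) {
    return fnv1a(text, length);
}

// Hypothetical lookup keyed by the hashed name instead of a tuple index.
template <std::uint64_t Id>
constexpr int default_for() {
    if constexpr (Id == "--num"_id)
        return 3;
    else if constexpr (Id == "--threads"_id)
        return 1;
    else
        return 0;
}
```

A call then looks like default_for<"--num"_id>(), so the name does reach the template, just disguised as a number.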
Having a variant lets us use another feature, the visitor pattern, which can alleviate some pain by letting us do a comparison/check on values without actually getting them.
bool testing_cmp = p.compare("--num", 12);
Testing for a value cannot be easier. The only requirement is the comparison operator, and the function takes an optional, defaulted parameter for the comparison function (internally it uses std::equal_to{} ).
bool testing_cmp = p.compare("--num", 20, std::greater_equal{});
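A compare of this kind could be built on std::visit roughly as follows. This free function takes the variant directly rather than looking it up by name as p.compare does, and ArgVar is an assumed alias, so treat it as a sketch of the idea rather than the repository's code.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <type_traits>
#include <variant>

using ArgVar = std::variant<int, float, std::string>;

// Applies cmp(stored, expected) when the stored alternative has the same type
// as `expected`; any other alternative simply compares as false.
template <typename T, typename Cmp = std::equal_to<>>
bool compare(const ArgVar& stored, const T& expected, Cmp cmp = {}) {
    return std::visit(
        [&](const auto& held) -> bool {
            if constexpr (std::is_same_v<std::decay_t<decltype(held)>, T>)
                return cmp(held, expected);
            else
                return false;
        },
        stored);
}
```

The type is deduced from the value we compare against, so the call site never has to spell it out.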
And finally you might just want to fetch the options themselves (all of them) by using the aforementioned structured bindings with two simple functions. There is no such thing as nested structured bindings, so we cannot get the inner options of a command this way, meaning we hit another limitation of this approach.
Clip p = Clip("name",
Command("commit",
Option<int>("--test"),
Option<float>("--ding")),
Command("push",
Option<float>("--test"),
Option<float>("--dong")),
List<std::string, -1, 1>("--paths"),
Option<int>("--num")
.desc("This is great")
.default_value(3),
Option<float>("--fl")
.desc("This is great")
.default_value(5.14f),
Option<float>("--flz")
.desc("This is not great")
.default_value(5.14f)
);
const auto& [cmd1,cmd2] = p.fetchAllCommands();
const auto& [paths,num, fl, flz] = p.fetchAllOptions();
This would be an example declaration of options and their full fetch; the number of options is checked at compile time. One last thing remains: getting the value of an option without fetching all of them.
int temp23 = p.fetcher<int>("--num").value();
This is a simple wrapper so we don't have std::get<> in our code, and it could be wrapped even more so we don't need the .value() call. It looks clean, and it could have been cleaner had I found a way to deduce the parameter for fetcher, which would make it the perfect clean solution in my opinion.
Taking a new look at this, you might think: why is this wrapped in a class? Why don't we just create all the options and pass them to the parsing function? That would do exactly the same thing. Now we can access the options as they are, as normal variables.
auto num = Option<int>("--num")
.desc("This is great")
.default_value(3);
auto fl = Option<float>("--fl")
.desc("This is great")
.default_value(5.14f);
parse(argc,argv,
num,fl);
if(num.value == 12){} // We still have an optional value
Now this is an interesting concept: our types handle parsing and validating the values, and simple functions like parse() and help() glue them together. No inheritance, just some simple types and functions that use them. The implementation I did is in no way perfect or even usable in production; it is more of an exploration of C++17.
The three implemented classes Option, List and Command are the ones doing most of the work. Each one of them contains a name and a consume function that handles consuming values when we encounter their command line parameter.
Option
Allows us to parse any type we want with the help of the class (functor) Converter<>, which has one purpose: to convert a captured string value to our type.
Example: Option<float,1> would capture a single float value at position 1
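A converter functor of this kind might be sketched as follows. The std::optional return type and the Percent type are assumptions for this example; the Converter in the repository may have a different interface.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Primary template left undefined: each supported type opts in explicitly.
template <typename T>
struct Converter;

template <>
struct Converter<int> {
    std::optional<int> operator()(std::string_view text) const {
        int value = 0;
        const char* last = text.data() + text.size();
        auto [ptr, ec] = std::from_chars(text.data(), last, value);
        if (ec != std::errc{} || ptr != last) return std::nullopt;  // reject "4x"
        return value;
    }
};

template <>
struct Converter<std::string> {
    std::optional<std::string> operator()(std::string_view text) const {
        return std::string(text);
    }
};

// A user-defined type plugs in the same way (hypothetical example type).
struct Percent {
    int value;
};

template <>
struct Converter<Percent> {
    std::optional<Percent> operator()(std::string_view text) const {
        if (text.empty() || text.back() != '%') return std::nullopt;
        text.remove_suffix(1);
        auto number = Converter<int>{}(text);
        if (!number || *number < 0 || *number > 100) return std::nullopt;
        return Percent{*number};
    }
};
```

This is what makes the type an extension point: supporting a new type only takes a new specialization, and an option of that type validates and converts in one place.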
List
Similar to Option, but allows us to capture multiple values at once into a vector of values. Using template parameters we can specify the type, position, and minimum and maximum number of values to capture.
Example: List<int,-1,1,3> would capture one, two or three integers at any position
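The min/max capture can be sketched like this. ListOpt is a simplified stand-in that leaves out the position parameter and uses operator>> for conversion, so it is not the library's List.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for List<T, position, min, max>, without the position.
template <typename T, std::size_t Min, std::size_t Max>
struct ListOpt {
    std::string name;
    std::vector<T> values;

    // After matching the name, greedily captures up to Max convertible values.
    // Succeeds only if at least Min were found; pos is advanced on success.
    bool consume(const std::vector<std::string>& args, std::size_t& pos) {
        if (pos >= args.size() || args[pos] != name) return false;
        std::vector<T> found;
        std::size_t next = pos + 1;
        while (next < args.size() && found.size() < Max) {
            std::istringstream in(args[next]);
            T parsed{};
            // Stop at the first argument that does not convert cleanly.
            if (!(in >> parsed) || in.peek() != std::char_traits<char>::eof()) break;
            found.push_back(parsed);
            ++next;
        }
        if (found.size() < Min) return false;
        values = std::move(found);
        pos = next;
        return true;
    }
};
```

Stopping at the first value that fails to convert is what lets a list sit in the middle of a command line without swallowing the next option.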
Command
Using a command we can handle subcommands like those in command line tools such as git, for example git commit or git blame, where commit and blame are the names of the commands. Each command is constructed with a name and any number of Option and List arguments (those are specific to the Command and will only be matched when we are inside the context of that command).
The workhorses of the library are the functions parse and help. Some of their features are rather barebones because I was getting too involved in this library, but there is at least a foundation laid out if anyone wants to extend or use the library.
Final Example of Usage
Enough talking, let's show some example usage. First off, the name: tuple_of_args, because we just have a couple of arguments in a tuple. :)
using namespace tuple_of_args;
auto commit = Command("commit",
Option<int>("--test"),
Option<float>("--ding"));
auto push = Command("push",
Option<float>("--test"),
Option<float>("--dong"));
auto paths = List<std::string, -1, 2>("--paths");
auto num = Option<int,1>("num")
.desc("This is great")
.default_value(3)
.choice({ 3,12,24 });
auto fl = Option<float>("--fl")
.desc("This is great")
.default_value(5.14f);
auto flz = Option<float>("--flz")
.desc("This is not great")
.default_value(5.14f);
bool parsed = parse(argc, argv, commit, push, paths, num, fl, flz);
if(!parsed)
help(commit, push, paths, num, fl, flz);
if (num.value == 12) {
std::cout << "Found the value";
}
All of this is possible using the power of the STL and some type arranging and filtering. The implementation is nowhere near as feature complete as any of the ones listed at the start of the article, but it was a fun adventure to go on.
Conclusion
For a language that has been used and developed for so many years, it still feels like there is no proper idiomatic solution that looks and works as one would expect. Maybe it's just me being influenced by the ease of use of these sorts of libraries in dynamic languages, who knows. I hope someone finds some parts of it useful: maybe you will find some of the neat type filtering and tuple filtering in the foundation helpful, or maybe you will find a new and more convenient way to implement it.
Repository
You can find the final implementation here https://github.com/Bloodb0ne/tuple_of_args.