Included millert, since he imported this code. He might shed some more light on what's intended.
On 3/15/20 11:27 PM, 0xef967...@gmail.com wrote: > On Sun, Mar 15, 2020 at 07:32:52PM +0100, Martijn van Duren wrote: >> tl;dr new diff below > > Check with: > > OPTS=r- ./getopt-test-MvD2 -r- > getopt-test-MvD2: unknown option -- - > <r> #U- | > > With the unpatched code, my patch, and the original 44BSD, it's: > > OPTS=r- ./getopt-test-O -r- > {./getopt-test-O} <r> <-> | > > See proposed fix at the end. This is intentional. From getopt_long(3): OpenBSD a ‘-’ within the option string matches a ‘-’ (single dash) on the command line. This functionality is provided for backward compatibility with programs, such as su(1), that use ‘-’ as an option flag. This practice is wrong, and should not be used in any current development. The original might have accepted this, but if the getopt_long(3) manpage is rather specific about this and the current code goes out of its way to make sure that it only appears at the end of the string (-q-q doesn't work). So why go out of your way to disallow that behaviour, but not catch the case you describe here (and is not documented as such). If we do care about this particular feature then why add the additional checks at all? Why not let all the code prior to this section handle all the common "-" cases and let the rest utilize "-" as any normal getopt character, including an optstring with "-:" which would allow -q-<arg> and "- <arg>". My reasoning is: Let's restrict this misfeature as much as possible and only use it for where it's needed (su(1)). But let's see what millert or any of the other more experienced devs have to say. > >> optreset and flags. Note that on line 317 we increment the pointer on >> options. This means that the "oli == options" check in your code is >> not needed and even introduces a bug when someone would be stupid >> enough to create an optstring which starts with "--", where a "-" >> option stop to being recognized: > > Yes, that check was dumb, it should be removed. > >> On 3/15/20 11:21 AM, 0xef967...@gmail.com wrote: >>> Notice that OpenBSD's getopt(3) explicitly allows to use "-" as an >>> option character, and there may be a /separate/ bug related to that, >>> but my patch doesn't change or affect it in any way. >> >> This is in line with getopt(3) and getopt_long(3) expands on that >> saying it's (at least) for su(1). So if we want to remove this >> (mis)feature a lot more scrutiny needs to be done and I'm not feeling >> like going down that rab(bit|ies)-hole any time soon. > > I didn't say that that feature should be removed, but that there may > be another bug/incompat related to that. Since it was there since 44BSD, > it should be left as is (IMVHO). > >> I did some pointer dumping inside your test-application and inside >> getopt_internal and found that the OPTS environment variable is 1 byte >> from the last place pointer. When you overwrite every byte of the >> original avs with ':' (including the terminating NUL) you expand the >> string length of the place variable inside getopt_internal and it starts >> to parse the "garbage" you see in your example. > > That was of course intentional. It's perfectly legit to overwrite > the args strings (including their terminating \0). And it makes for > more dramatic testcases ;-) > >> Since our manpage explicitly states that you need to set optreset I >> don't know if it's worth adding a "fix" for, if such a thing is at > > No need to add another fix for it; that return (-1) that my (and your > current patch) eliminates is the only path through which -1 can be > returned /without/ setting "place = EMSG". > >> if ((optchar = (int)*place++) == (int)':' || >> - (optchar == (int)'-' && *place != '\0') || >> - (oli = strchr(options, optchar)) == NULL) { >> - /* >> - * If the user specified "-" and '-' isn't listed in >> - * options, return -1 (non-option) as per POSIX. >> - * Otherwise, it is an unknown option character (or ':'). >> - */ >> - if (optchar == (int)'-' && *place == '\0') >> - return (-1); >> + (oli = strchr(options, optchar)) == NULL || >> + (optchar == '-' && >> + (oli == NULL || strcmp(nargv[optind], "-") != 0))) { >> if (!*place) >> ++optind; >> if (PRINT_ERROR) > > ... || (foo = bar) == NULL || (baz && (foo == NULL || quux)) > > is the same as > > ... || (foo = bar) == NULL || (baz && quux) > > since the "||" is shortcutting. > > But I still don't see how "quux" (strcmp(nargv[optind], "-") != 0) > may happen with a args which is just "-" (since it's already caught by > the "If we have "-" do nothing" check above it). > > It will however BREAK a case like that mentioned at the beginning > of this message. > > So make it just ".. || (foo = bar) == NULL": > > - (optchar == (int)'-' && *place != '\0') || > - (oli = strchr(options, optchar)) == NULL) { > - /* > - * If the user specified "-" and '-' isn't listed in > - * options, return -1 (non-option) as per POSIX. > - * Otherwise, it is an unknown option character (or ':'). > - */ > - if (optchar == (int)'-' && *place == '\0') > - return (-1); > + (oli = strchr(options, optchar)) == NULL)) { >