Stefan Beller wrote:
> On Fri, Feb 19, 2016 at 3:07 PM, Jonathan Nieder <[email protected]> wrote:
[...]
>>> + strbuf_addf(err, "Skipping submodule '%s'\n",
>>> + displaypath);
>>
>> Does the caller expect a newline at the end of err?
>>
>> In the refs code that uses an err strbuf, the convention is to
>> not end the message with a newline. That way, a function like
>> die() can insert a newline and messages are guaranteed to be
>> newline-terminated even if someone is sloppy and does the wrong thing
>> when generating an error.
>
> Oh! So we need to add new lines ourselves here.
Now I'm confused. The code in this patch is inconsistent. I was
recommending consistently leaving out the \n.
It looks like pp_start_one takes the content of err without adding
a \n. That's a bug in pp_start_one and pp_collect_finished, IMHO.
For example, default_task_finished and default_start_failure don't
put a newline at the end of the message. I don't think that's a
bug --- they're doing the right thing, but pp_start_one and
pp_collect_finished are just not handling it correctly.
>>> + if (pp->reference)
>>> + argv_array_push(&cp->args, pp->reference);
>>> + if (pp->depth)
>>> + argv_array_push(&cp->args, pp->depth);
>>
>> What does 'cp' stand for? Would a name like 'child' work?
>
> child process ? (any code which had a struct child_process referred to
> it as *cp)
Output from 'git grep -F -e "child_process *"' finds variables with
various names, corresponding to what kind of child each one is.
[...]
>> Is this the 'list' subcommand?
>
> no. :(
No problem --- that's what review is for.
[...]
>>> + if (pp.print_unmatched) {
>>> + printf("#unmatched\n");
>>
>> I'm still confused about this. I think a comment where
>> 'print_unmatched' is declared would probably clear it up.
>
> Probably this is a too literal translation from shell to C.
> By printing #unmatched the shell on the other end of the pipe
> of this helper program knows to just stop and error out.
>
> So this is telling the downstream program to stop.
>
> Maybe we can name the variable 'quickstop' ?
> 'abort_and_exit' ?
Interesting.
Would it work for the helper to exit at that point with nonzero status?
Lacking "set -o pipefail", it's a little messy to handle in the shell,
but it's possible:
	{
		git submodule--helper \
			--foo \
			--bar \
			--baz ||
		... handle error, e.g. by printing something
		that the other end of the pipe wants to see ...
	} |
	...
[...]
>> (just curious) why are these saved up and printed all at once instead
>> of being printed as it goes?
>
> To have a clean abort path, i.e. we need to be done with all the parallel
> stuff
> before we start on doing local on-disk stuff.
Interesting.
That's subtle. Probably worth a comment so people know not to break
it. (E.g.
	/*
	 * We save the output and print it all at once at the end.
	 * That means:
	 *  - the listener does not have to interleave their (checkout)
	 *    work with our fetching.  The writes involved in a
	 *    checkout involve more straightforward sequential I/O.
	 *  - the listener can avoid doing any work if fetching failed.
	 */
).
Thinking out loud: the other side could write their output to a
temporary file (e.g. under .git) and save us the trouble of buffering
it. But the list of submodules and statuses is not likely to be huge
--- buffering doesn't hurt much.
[...]
> In a next step, we can do the checkout in parallel if there is
> nothing we assume will lead to trouble, i.e. the update strategy is
> checkout and the submodule is in a clean state at the tip of a
> branch.
Somebody tried parallelizing file output during checkout and found it
sped up the checkout on some systems by a small amount. But then
later operations to read back the files were much slower. It seems
that the output pattern affected the layout of the files on disk in a
bad way.
I'm not too afraid. Just saying that the benefit of parallel checkout
would be something to measure.
A bigger worry is that checkout might be I/O bound and not benefit
much from parallelism.
> All problems which need to be solved by the user should be presented
> in a sequential way, i.e. present one problem and then full stop.
> That seems to be more encouraging than if we'd throw a "Here are a
> dozen broken submodule repositories which we expect the user to fix
> up" at them.
It depends on the problem --- sometimes people want to see a list of
errors and fix them all (hence tools like "make -k"). I agree that
stop-on-first-error is a good default and that it's not worth worrying
about introducing --keep-going until someone asks for it.
Thanks for your thoughtfulness,
Jonathan