On Thu, Feb 1, 2024, 09:09 alex xmb sw ratchev <fxmb...@gmail.com> wrote:
>
> On Wed, Jan 31, 2024, 20:36 Robert Elz <k...@munnari.oz.au> wrote:
>
>> Date: Wed, 31 Jan 2024 11:35:57 -0500
>> From: Chet Ramey <chet.ra...@case.edu>
>> Message-ID: <1e50aa99-8d53-4cdf-ba5e-6aaf3ccc6...@case.edu>
>>
>>   | Not quite. `new' in this sense is the opposite of `anything in the past'
>>   | as Dale described it -- already notified and removed from the jobs list.
>>
>> I guess the part about bash that I am not understanding here is how the
>> "already notified" works. To me there are just two ways for that, either
>> the user has done a "wait" which has collected that pid already (either
>> without -n, and no pid args, or with pid args and one of those is the pid
>> in question) or with -n and the pid in question was the one whose status
>> was returned, or the user/script did the jobs command (or jobs -l) and the
>> job in question was shown as completed.
>>
>
> i say additional datastructure for the saving purpose ..
> it d need new uid , real-unique-id , or some special hash of the jobs/pids/cmdlines

Is there some other way?
>>
>>   | Half the problem here is that bash aggressively marks dead jobs as being
>>   | notified in non-interactive shells without job control enabled, and moves
>>   | them out of the jobs table.
>>
>> That might be more than half the problem, it might be the entire problem.
>>
>>   | If you use wait -n without arguments, you probably don't care,
>>
>> No you do, that just means any of the children ... the script could make
>> a list of all of them and supply that list, but if the list is just going
>> to contain all the existing children, why bother? (With -n - and not
>> exactly one pid arg, -p is generally going to be required, but that option
>> has no bearing on which process is selected, or might be, which is the
>> issue here).
>>
>>   | but if you
>>   | do, or if you use wait -n with pid/job arguments (which you've presumably
>>   | saved yourself) you're going to need slightly different semantics than we
>>   | have now to answer that reliably. And that will probably need a new option.
>>
>> That's a pity, particularly since the current semantics don't seem to
>> be useful in general. Since the sole issue provoking that seems to be
>> the wait over and over policy, rather than "wait once, and remove completely"
>> perhaps rather than a new, but different, -n like option, a better idea would
>> be a "only once" option (ie: if the option (-r (remove) or -c (cleanup) or -o
>> (once only)) is set, then when the wait with that option returns status or,
>> or waits until termination without returning status (in the not -n case, with
>> no pid args, or many pid args) then the processes are completely deleted from
>> everywhere in the shell. Using that option would make a changed -n safe
>> to use in loops. If you do that, also add an option (maybe the upper case
>> version of whatever is selected for that one, or just some other letter) to
>> mean "don't wait" (kind of like wait(2) WNOWAIT) - which in default bash would
>> just be a no-op (except in posix mode, apparently - whereas the -[cor] option
>> would be a no-op in posix mode).
>>
>> If you were to do that, other shells could add the same (except in probably
>> all of them, -[cor] would always be the default, and the other one would be
>> the one which changes behaviour).
>>
>>   | And that's why I used `more': there are several differences, so which
>>   | of those differences should we attempt to change?
>>
>> Just the one.
>>
>>   | > The one change that should be made is
>>   | > to allow wait -n to collect processes/jobs that have already terminated.
>>   |
>>   | Yes, that's one of the things we're talking about.
>>   | I don't have any problem
>>   | with it, but should it take a new option to change those semantics?
>>
>> Good, though I think some more thought should go into that. In another
>> thread you said (paraphrasing) correctly, that scripts should not be
>> relying upon bugs, and the current wait -n behaviour is a bug - that it
>> might have been intentionally coded that way doesn't make it any less so.
>> It isn't as if it was ever documented to work the way it does, or everyone
>> would have known about it already.
>>
>>   | > Changing it to wait for all the listed pids
>>   | It's never done that.
>>   | We're not going to change the return value from wait.
>>
>> Good, I only mentioned those possibilities because your earlier
>> message was unclear about what "more like wait without -n" meant.
>>
>>   | Yeah, but we're talking about bash here. It doesn't really matter what
>>   | the Bourne shell did; there are likely plenty of scripts that assume
>>   | the historical bash behavior.
>>
>> Really? Why? What's the point of collecting the status twice?
>> It can't change in the meantime can it, once a process has done exit(N)
>> its exit status should always be N, regardless of how often it is waited
>> upon.
>>
>> [Aside: this should be obvious, but when one is collecting status changes,
>> rather than just "terminated" status, then the pid isn't removed if it
>> returns a "stopped" or "continued" status.]
>>
>>   | > I meant the distinction between processes
>>   | > that the shell has already collected status for, and those for which it
>>
>>   | You're not the first to propose something like that, but I'm not going to
>>   | be writing that code any time soon.
>>
>> Nor am I, if you go back to the message where I first mentioned it,
>> which I can't locate at the minute, I am fairly sure I said that while
>> it might help in this case, I doubt it is worth the effort. Or something
>> like that.
>>
>> Actually, found it eventually (this is quoting myself, earlier):
>> >> But as long as it is just a matter of cleaning up, and jobs works for
>> >> that, I don't currently see the need.
>>
>>   | It is, in fact, true in the current implementation, as long as the pid
>>   | is in the jobs list.
>>
>> That caveat is the problem.
>>
>>   | It's always been true. If there is a job marked
>>   | (internally, if you must) as dead for which the user has not yet received
>>   | notification, wait -n returns it and marks it as notified (and deletes
>>   | it from the jobs list).
>>
>> That part is good.
>>
>>   | Yes, that's one of the things we're talking about: whether wait -n should
>>   | consider pids/jobs *not* in the jobs list, the way wait without -n does.
>>   | That's about the only thing we're talking about changing here so far.
>>
>> Maybe a better discussion, and potential change, would be to whatever
>> other that the use of the wait, or jobs, commands can result in a job
>> moving out of the jobs list. If there were nothing other than those,
>> (and jobs list overflow or similar) then we'd be fine, and it seems to
>> me now, no change to the -n operation would be needed.
>>
>>   | That hasn't actually been true with bash running in default mode for a
>>   | very long time now. Bash has allowed multiple waits for the same pid for
>>   | many years, whether or not you or I think it's a good idea or the correct
>>   | semantics. Even if it was an accident of the implementation, and maybe you
>>   | could say it was, we are stuck with it.
>>
>> Which is why I suggested an option (just above) to turn that misfeature off.
>> Even better perhaps might be a bash shopt.
>>
>>   | It's ok, we got one.
>>
>> A kind of unlikely one.
>>
>> kre
>>
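
For reference, the "wait without -n" case mentioned above (a pid saved from $! can still be waited for after the child has already exited) can be sketched roughly as below. This is only an illustration, not code from the thread: collect_statuses and the child0=... labels are made-up names, and the children do nothing except exit with the codes passed in.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): start one child per exit code
# given, remember each pid from $!, then collect every status with a plain
# `wait PID` (no -n), even though the children have probably exited already.
collect_statuses() {
    pids=""
    for code in "$@"; do
        ( exit "$code" ) &          # child exits immediately with $code
        pids="$pids $!"             # save the pid ourselves
    done

    results=""
    i=0
    for pid in $pids; do
        wait "$pid"                 # returns the saved pid's exit status
        status=$?
        results="$results child$i=$status"
        i=$((i + 1))
    done

    # Drop the leading space before printing.
    echo "${results# }"
}

collect_statuses 0 3 7
```

Each saved pid is waited for exactly once here, so the sketch does not depend on whether a second wait for the same pid would succeed.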