On 2020-07-15 at 10:11, Bob Weber wrote:

> On 7/15/20 8:44 AM, Greg Wooledge wrote:
>
>> On Wed, Jul 15, 2020 at 08:34:36AM -0400, Bob Weber wrote:
>>
>>> My only purpose was to show how tr could be used to handle
>>> multiple characters as a delimiter either as tr -s '\\\|' '\|'
>>> or
>>
>> The problem is, it can't, at least not the way you showed. The
>> original example, sadly, did NOT contain instances of the | and \
>> characters in isolation, so one might be lulled into a false sense
>> of security, and write code that (for example) simply deletes all
>> of the \ characters, and then splits on the | characters.
>>
>> But that won't work in the general case, where | and \ might appear
>> as literal data characters.
>>
>> My own solution, which involved using awk to convert the \| pairs
>> into NUL bytes, is also technically incorrect. However, there was
>> an additional stipulation: the stream was to be converted into a
>> bash array. A bash array is a list of C strings, so they cannot
>> contain NUL bytes. Therefore you can't possibly have NUL bytes in
>> the original input stream (at least, not and still produce a bash
>> array), so my conversion of the multi-character delimiters into NUL
>> bytes will "work".
>>
>> But it's a freaking ugly problem any way you look at it, and it
>> just got uglier when it was revealed that the OP might be trying to
>> write shell code that parses shell code. Especially if the code in
>> question is a series of poorly written GNU-tainted grep commands.
>>
> Which is why I showed this:
>
> tr -s '\\\|' '\|'
>
> which replaces \| with a single character which is known not to be in
> the input data

How do you know that?

We don't necessarily have the full input data set. We have a sample
input data set, which may or may not be the only one that will ever be
used.

If '|' were guaranteed to never occur in the input data, it would
probably have been selected as the delimiter standalone, rather than
only as part of the '\|' pair.

-- 
   The Wanderer

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on the unreasonable man.         -- George Bernard Shaw
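
For illustration, a minimal sketch of the point under discussion,
assuming a POSIX shell and tr. The helper name collapse_delims and both
sample records are made up for this example. The sets are spelled '\\|'
and '||' (equal length) rather than exactly as quoted above, so that the
result does not depend on how a given tr pads a short second set. tr
maps individual characters, not the two-character sequence, so a lone \
or | in the data ends up looking like a delimiter:

```sh
#!/bin/sh
# Hypothetical helper: map both \ and | to |, then squeeze runs of |.
# This is a character-set translation; it never looks at \| as a pair.
collapse_delims() {
    tr -s '\\|' '||'
}

# Record 1 (made up): \| appears only as the delimiter.
printf '%s\n' 'one\|two\|three' | collapse_delims
# prints: one|two|three

# Record 2 (made up): two fields, 'a|b' and 'c\d', each holding a
# literal delimiter character as data.
printf '%s\n' 'a|b\|c\d' | collapse_delims
# prints: a|b|c|d
```

The first record splits cleanly into three fields. The second holds only
two fields, yet comes out looking like four, which is why the approach
is safe only if | and \ are guaranteed never to appear in the data.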