On Thu, Feb 27, 2020 at 01:40:22PM +0100, Albretch Mueller wrote:
> I need to find all files which names satisfy a pattern and contain a
> certain string, then from those files I need to printf some metadata,
> a la:
>
> find "${_SDIR}" -type f -iregex .*"${_X}" -printf '"%TD
> %TT",%Ts,%s,"%P"\n' > "${_TMPFL}" 2>&1

The quoting is wrong for the regex.  And you probably don't even want
to use a regex.

It would help if we had some clue what's in the _X variable, and why
you're trying to use a regex instead of a standard glob.

Did you simply want all the files whose names end with the contents of
that variable?  If so:

find "$sdir" -type f -name "*$x" -printf '...'

(or -iname if you want it to be case-insensitive).

The way you've got it quoted now, the .*$x bit will be expanded by the
shell against the contents of the CURRENT directory, which is absolutely
NOT what you want.  The glob (or regex) needs to be quoted so that the
shell WON'T expand it, but find WILL.

Generally speaking, always use a glob if a glob will do the job.  Don't
use a regex unless it's absolutely necessary.

> I am trying to do all steps in one go,

So, start by saying what all of the steps ARE.  Earlier, you mentioned
"contain a certain string", but that's really vague.

> I know I am being silly, since on that statement there are in fact
> two searches one on the metadata and one through the content of the
> patterned files, but you can see what I mean.

We can't really "see what you mean" until you show us.

Why don't you just tell us the actual problem?  It can't be THAT
embarrassing.

"I want to find all of the files whose names end with .txt and which
contain the string penis."

find . -type f -name '*.txt' -exec grep -l penis {} +

> There should be a way to
> do it in one swoop. Or probably there is another utility to do that.
> The thing is that I work on corpora research and very often you need
> to search large amounts of text files in no time.

So wait, there's a *third* part of the problem?  It has to handle
extremely large inputs, and be very fast?

It sounds like you need highly specialized tools that perform indexing
of your content.  Like Xapian or something.  It's not really my area
of expertise, so someone else may be able to suggest a more suitable
tool set.
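
If the goal really is "name ends with some suffix, file contains some
fixed string, print metadata for the matches", the two find commands
above can probably be combined along these lines.  This is only a
sketch under assumptions: the function name list_matches and its three
parameters are made up for illustration, the string is treated as a
fixed string rather than a pattern, and it relies on GNU find for
-iname and -printf.  The -printf format is the one from the original
question.

```shell
#!/bin/sh
# Sketch only: list_matches and its parameter names are hypothetical.
# Requires GNU find (-iname and -printf are not in POSIX find).
list_matches() {
    sdir=$1      # directory to search
    suffix=$2    # filename ending, e.g. ".txt" (find treats it as part of a glob)
    needle=$3    # fixed string that must appear somewhere in the file

    # The glob is quoted, so the shell leaves it alone and find expands it.
    # -exec ... \; acts as a test here: it is true only when grep exits 0,
    # so -printf is reached only for files that contain the string.
    find "$sdir" -type f -iname "*$suffix" \
        -exec grep -qF -e "$needle" {} \; \
        -printf '"%TD %TT",%Ts,%s,"%P"\n'
}
```

Called as, say, list_matches "$sdir" .txt "some string" > "$tmpfile",
this writes one line per matching file.  It runs one grep per candidate
file, so it is unlikely to be fast on a large corpus; the
-exec grep -l ... {} + form batches files and should scale better, at
the cost of a second step to print the metadata.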