[bug #58197] "find" fails to optimize "-path /usr/foo -o -path /usr/bar" to "-regex '/usr/\(foo\|bar\)'"

Bernhard Voelker Tue, 17 May 2022 23:30:17 -0700

Follow-up Comment #2, bug #58197 (project findutils):

The main point of the optimizer in find(1) is to avoid
costly activities when cheap ones can already rule out that
the current item has to be processed further.
Costly activities are e.g. extra stat() calls.
Cheap activities are conditions like -name that are evaluated
purely in the CPU, i.e., no further OS call is required.
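
As a rough illustration of the cheap-before-costly idea, the
sketch below puts a -name test in front of a -size test on a
scratch tree.  The directory and file names are made up for the
example; that -name needs no extra OS call while -size may need
a stat() is taken from the explanation above and is not measured
here.

```shell
# Scratch tree with hypothetical file names.
tmp=$(mktemp -d)
printf 'x' > "$tmp/small.log"
printf 'x' > "$tmp/small.txt"
head -c 5000 /dev/zero > "$tmp/big.log"

# -name is checked first; -size is only evaluated for entries
# that already passed -name, so small.txt never reaches it.
matches=$(find "$tmp" -name '*.log' -size -1000c)
echo "$matches"

rm -rf "$tmp"
```

Only small.log passes both tests: small.txt is rejected by -name
alone, and big.log is rejected by -size.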


Therefore "-path ... -o -path ..." is considered about as
expensive as "-regex ...|...": both are evaluated in memory,
and the difference between them is negligible next to tests
that require an extra stat(), which is orders of magnitude
slower.

It does not matter much that -path appears several times in
the expression tree.  The same holds for more expensive
conditions like -size, which really do need an extra stat():
once `find` has the information for an item, it reuses that
information when evaluating the next condition.  E.g.
"-size -1000c -size -999c" will only lead to one stat()
invocation.

Re. the 2 examples below: I've also run both commands here,
and after just a couple of runs each they take almost the
same time ... here the -regex variant even takes 1.22s, while
the one with 2x -path needs 1.16s.
Therefore, changing this would be over-optimization that just
complicates the code further.
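
For anyone who wants to repeat the comparison, here is a minimal
sketch, assuming GNU find and a throwaway directory in place of
/usr.  The foo/bar/baz names mirror the bug title and are
otherwise hypothetical.  It only checks that both forms select
the same items; it does not reproduce the timings above.

```shell
# Build a small scratch tree instead of scanning /usr.
tmp=$(mktemp -d)
mkdir -p "$tmp/foo" "$tmp/bar" "$tmp/baz"
touch "$tmp/foo/a" "$tmp/bar/b" "$tmp/baz/c"

# Form 1: two -path tests joined with -o (parentheses are only
# there for readability).
by_path=$(find "$tmp" \( -path "$tmp/foo" -o -path "$tmp/bar" \) | sort)

# Form 2: a single -regex test.  GNU find's default regex syntax
# is Emacs-style, so grouping and alternation are \( ... \| ... \).
by_regex=$(find "$tmp" -regex "$tmp/\(foo\|bar\)" | sort)

if [ "$by_path" = "$by_regex" ]; then
    echo "both forms match the same items"
fi

rm -rf "$tmp"
```

To compare run times, prefix each `find` with `time` and repeat
every command a few times, so that both are measured with a warm
cache.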

BTW: the optimizer in `find` is already too tricky and doesn't
consider some side effects of doing things earlier than specified
by the user on the command line.
Removing at least parts of the optimizer (arm-swapping) and
adhering to the left-to-right rule of POSIX again has already
been discussed.


    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?58197>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/

[bug #58197] "find" fails to optimize "-path /usr/foo -o -path /usr/bar" to "-regex '/usr/\(foo\|bar\)'"