On 07.04.25 at 15:35, Norbert wrote:
On 07.04.25 at 14:19, Howard Chu wrote:
Norbert wrote:
Hi,
Version used: 2.5.19.1 (ltb)
We have an LDAP directory with about 4.6 million entries and an indexed attribute
that occurs around 3.9 million times. We typically filter on that attribute with a
specific value (eq), which is normally very fast and causes no problems. As soon as
the same value is used by two entries, however, the filter becomes really slow,
even when additional criteria in the filter limit the result to exactly 1 entry.
Searches that can only match a single entry take about 5% of the time of searches
that could potentially match 2 entries.
Some more details on how this was determined:
1) Enable "stats" logging on the production server for 5 minutes.
2) Collect the slowest ~1200 of the several thousand searches logged within those
5 minutes.
3) Create a separate LDAP server with exactly the same data and configuration
(imported with slapadd).
4) Use a script, run locally on the extra server, that executes the 1200
filters one after the other, and measure the script's total execution time.
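Steps 1) and 2) could be scripted roughly as follows. This is only a sketch: it assumes the stats log contains "SRCH ... filter=" and "SEARCH RESULT ... etime=" lines that can be paired by conn= and op=, and the name slowest_searches is made up for illustration. The line shapes in the comments are illustrative, not copied from the production log, so the regular expressions may need adjusting.

```python
import re
from typing import Iterable, List, Tuple

# Assumed log line shapes (illustrative, check against your own stats log):
#   ... conn=1 op=1 SRCH base="dc=example,dc=com" scope=2 deref=0 filter="(uid=a)"
#   ... conn=1 op=1 SEARCH RESULT tag=101 err=0 qtime=0.000010 etime=0.000200 nentries=1 text=
SRCH_RE = re.compile(
    r'conn=(\d+) op=(\d+) SRCH base="([^"]*)" scope=\d+ .*filter="(.*)"\s*$'
)
RESULT_RE = re.compile(r'conn=(\d+) op=(\d+) SEARCH RESULT .*\betime=([0-9.]+)')


def slowest_searches(lines: Iterable[str], limit: int = 1200) -> List[Tuple[float, str, str]]:
    """Return up to `limit` (etime, base, filter) tuples, slowest first."""
    pending = {}   # (conn, op) -> (base, filter), waiting for the result line
    finished = []  # completed searches as (etime, base, filter)
    for line in lines:
        m = SRCH_RE.search(line)
        if m:
            pending[(m.group(1), m.group(2))] = (m.group(3), m.group(4))
            continue
        m = RESULT_RE.search(line)
        if m:
            key = (m.group(1), m.group(2))
            if key in pending:
                base, ldap_filter = pending.pop(key)
                finished.append((float(m.group(3)), base, ldap_filter))
    finished.sort(key=lambda item: item[0], reverse=True)
    return finished[:limit]
```

Called with an open log file, e.g. slowest_searches(open(path)), where path is wherever slapd logs on your system, it yields the filters to replay in step 4).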
With production data I measure around 11s for the ~1200 searches. In all of these
searches, one attribute in the filter could have 2 hits, but the result is actually
limited to 1 hit by the rest of the filter:
"(&(objectClass=value)(almost_unique_attr=value)(another_attr=*))". That is, searching
with only "almost_unique_attr=value" as the filter would return 2 results, but
objectClass and another_attr limit it to exactly 1 entry.
When I now remove the second entry from the LDAP server for each of these ~1200
filters, the script run time drops to ~0.5s.
If I re-add those ~1200 entries, the run time is around 5s (and after
completely recreating the DB it is 11s again).
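For reference, a timing script of the kind described in step 4) might look like the sketch below. It is not the script used for the numbers above: the server URI, base DN, and function names are placeholders, and it shells out to ldapsearch once per filter, so process start-up overhead is included in the measured time.

```python
import subprocess
import time
from typing import Callable, Iterable, Tuple


def ldapsearch_once(uri: str, base: str, ldap_filter: str) -> int:
    """Run one subtree search with simple anonymous bind; return the entry count."""
    # "1.1" asks the server to return no attributes, only the DNs.
    cmd = ["ldapsearch", "-x", "-LLL", "-H", uri, "-b", base, "-s", "sub",
           ldap_filter, "1.1"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return sum(1 for line in out.splitlines() if line.startswith("dn:"))


def time_filters(filters: Iterable[str],
                 search: Callable[[str], int]) -> Tuple[float, int]:
    """Execute the filters one after the other; return (seconds, total entries)."""
    start = time.perf_counter()
    total = 0
    for ldap_filter in filters:
        total += search(ldap_filter)
    return time.perf_counter() - start, total


# Example call (placeholder URI and base DN):
# elapsed, hits = time_filters(
#     filters,
#     lambda f: ldapsearch_once("ldap://localhost", "dc=example,dc=com", f),
# )
```

Passing the search function in as a parameter makes it easy to repeat the same run with a different base DN or with a reduced filter and compare the totals.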
Limiting the search scope by using a more specific base DN for the search does
not change the execution time.
So the question is: 1) Can I change anything on the server side to speed up
these searches?
How common is 'another_attr'? Is there a presence index on it?
another_attr is the most frequently occurring attribute on the server. Its values
typically occur once, but in this particular
case the majority of values are referenced by 2 entries. The
index for this attribute is
configured as "eq,sub".
Sorry, I got confused by my own arbitrary names. Each entry of interest has another_attr set. But removing
(another_attr=*) from the test filters has no impact on search performance: with or without it, the run time
is the same, and the search returns 1 entry because objectClass is what decides in these cases.
another_attr has an eq index but no pres index. Many entries actually have the same value in this case.
Regards,
Norbert