Hi, we need to implement a boolean filter with up to 500k values.
We took the post filter approach. Our environment has 7 servers, each with 128 GB of RAM and 64 CPUs. We have 20-40m very large documents. Each Solr instance has 64 shards with 2 replicas, and JVM memory (-Xms and -Xmx) is set to 31 GB. We are seeing that a single post filter with 1000 values on 20m documents takes about 4.5 seconds.

The logic in our collect method:

    numericDocValues = reader.getNumericDocValues(FileFilterPostQuery.this.metaField);
    if (numericDocValues != null && numericDocValues.advanceExact(docNumber)) {
        longVal = numericDocValues.longValue();
    } else {
        return;
    }
    if (numericValuesSet.contains(longVal)) {
        super.collect(docNumber);
    }

Is this the best we can get?

Thanks,
Artur Rudenko
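For anyone who wants to reproduce the check outside Solr, here is a minimal standalone sketch of the same collect logic. It is only an illustration under stated assumptions: the class name `PostFilterSketch`, the method `collectMatching`, and the `values`/`hasValue` arrays are hypothetical stand-ins for the real `NumericDocValues` reader, and the allowed set is assumed to be a `Set<Long>`, which the original snippet does not confirm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Standalone stand-in for the post filter's collect logic; no Solr/Lucene types are used.
class PostFilterSketch {

    // Hypothetical stand-in for per-document numeric doc values:
    // values[doc] is the field value, hasValue[doc] says whether the doc has one.
    // Returns the doc ids that the filter would pass on to the delegate collector.
    static List<Integer> collectMatching(long[] values, boolean[] hasValue, Set<Long> allowed) {
        List<Integer> collected = new ArrayList<>();
        for (int doc = 0; doc < values.length; doc++) {
            if (!hasValue[doc]) {
                continue; // mirrors the early return when advanceExact(docNumber) is false
            }
            if (allowed.contains(values[doc])) {
                collected.add(doc); // mirrors super.collect(docNumber)
            }
        }
        return collected;
    }
}
```

Calling `PostFilterSketch.collectMatching` with a small array of values and a set of allowed ids returns the positions that pass the filter. It says nothing about the timing seen on the real index.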