I would say "a" - index and filter on the metadata in Solr. Otherwise you will have larger result lists from Solr that have to be queried against the database, negating any minor speedup in the Solr portion of the search.

Solr filter queries, with their associated caching, should result in very reasonable query performance.
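
As a rough illustration of the filter-query approach, here is a minimal Python sketch that builds a single Solr select request with the full-text term in `q` and each metadata restriction in its own `fq` parameter, so that each filter can be cached separately. The core URL and the field names (`body`, `author`, `doc_type`) are hypothetical and not taken from the original question.

```python
from urllib.parse import urlencode


def build_select_url(base_url, text_query, filters, rows=10):
    """Build a Solr /select URL with one fq parameter per metadata filter."""
    params = [("q", text_query), ("rows", str(rows)), ("wt", "json")]
    # Send each filter as its own fq so Solr can cache it independently
    # of the main query and of the other filters.
    for field, value in sorted(filters.items()):
        # Escape backslashes and double quotes inside the phrase value.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        params.append(("fq", f'{field}:"{escaped}"'))
    return f"{base_url}/select?{urlencode(params)}"


url = build_select_url(
    "http://localhost:8983/solr/docs",  # hypothetical core URL
    "body:performance",                 # hypothetical full-text field
    {"author": "smith", "doc_type": "report"},  # hypothetical metadata
)
print(url)
```

The sketch only constructs the request URL; sending it and parsing the response are left to whatever HTTP client the application already uses.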

The metadata would not need to be stored in Solr to perform the filter queries, but would need to be stored if the application wishes to retrieve the metadata for result documents.
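
To make the indexed-versus-stored distinction concrete, the sketch below builds field definitions in the JSON shape accepted by the Schema API of more recent Solr releases (older installations would set the same `indexed` and `stored` attributes in schema.xml instead). The field names and the choice of the `string` type are assumptions for illustration only.

```python
import json


def add_field_command(name, stored):
    """Return an add-field command for an indexed metadata field."""
    return {
        "add-field": {
            "name": name,
            "type": "string",   # assumed field type for short metadata values
            "indexed": True,    # required so the field can be used in fq
            "stored": stored,   # only needed if the value is returned in results
        }
    }


# Hypothetical fields: one used only for filtering, one also returned to the client.
filter_only = add_field_command("doc_type", stored=False)
filter_and_return = add_field_command("author", stored=True)

print(json.dumps(filter_only, indent=2))
print(json.dumps(filter_and_return, indent=2))
```

A field defined like `doc_type` here can still be filtered on, but its value will not come back in the result documents.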

-- Jack Krupansky

-----Original Message-----
From: savvyzac
Sent: Saturday, August 11, 2012 11:27 AM
To: solr-user@lucene.apache.org
Subject: Performance Multiple fields

Hello,

I have to index around 5-6 million documents in Solr for full text search.
Each of these documents has around 25 additional metadata fields attached
to it. Each metadata field is small (up to 64 characters). Common queries
would involve a search term along with multiple metadata fields used to
filter the data. So my question is which of the following would provide
better performance with respect to search response time (indexing time is
not a concern):

a. Index the text data and push all metadata fields into Solr as stored
fields, then query Solr for all the fields using a single query.
(Effectively, Solr does the metadata filtering as well as the search.)

b. Store the metadata fields in a database such as MySQL. Use Solr only for
full text search, then pass the document ids returned from Solr to the
database and filter on the metadata there to retrieve the final set of
documents.

Thanks
Arijit



--
View this message in context: http://lucene.472066.n3.nabble.com/Performance-Multiple-fields-tp4000599.html
Sent from the Solr - User mailing list archive at Nabble.com.
