Luwak may be relevant here (https://github.com/flaxsearch/luwak)?
Or it may help to describe the difference from Luwak's solution to further fine-tune your requirement.

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency

On Thu, Jul 3, 2014 at 6:10 PM, Matt Stunfield <heceri...@gmail.com> wrote:
> Hi,
>
> I'm new to Lucene/SOLR and I'm researching if SOLR would fit to our case
> requirements. I would be very happy if You could help me :).
>
> Environment:
> We have a database storing some (mostly) text information. There are
> elements containing multiple sections of information. Each section is
> stored separately. This section content would be Solr document. Section
> content can be any number of text paragraphs - usually it is less than 200
> words. We have a dictionary of terms - currently 1600 items and there will
> be more. Term might be a single word or phrase - 3 to 4 words.
>
> Task: find if terms occur in single section. Found terms must be
> distinguished. In query result there is (highlighted?) found terms
> positions.
>
> I know that Solr is perfect for searching items in large amounts of data,
> but (1.) *will it cope (reasonably fast) with searching many items in a
> small bit of information?*
>
> Additional info: Solr instance will be accessed from one of .NET clients.
> Query text and indexed text will be stemmed.
>
> 2. I suppose it isn’t possible to perform search with 1600 search items
> like: “term_1” “term_2” … “term_N” at once, is it?
> 3. Is Lucene/Solr capable of performing simultaneous 1600 queries in
> separate threads?
> 4. Can I store a query, so that its content won’t be stemmed again
> unnecessarily? Terms dictionary won’t be changed frequently.
>
> I would be glad for any help.
> H.
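
To make the task in the quoted message concrete, here is a minimal sketch in plain Python of the "reverse search" idea: the dictionary terms are prepared once, and each short section is then scanned against them, reporting which terms were found and at which character positions. This is not Luwak's or Solr's API. The names `normalize`, `build_index` and `find_terms` are hypothetical, and `normalize` only lowercases as a stand-in for the stemming mentioned in the question.

```python
import re
from collections import defaultdict

WORD = re.compile(r"\w+")


def normalize(token):
    # Assumption: lowercasing stands in for a real stemmer here.
    return token.lower()


def build_index(terms):
    """Prepare the term dictionary once, keyed by each term's first token."""
    index = defaultdict(list)
    for term in terms:
        tokens = tuple(normalize(t) for t in WORD.findall(term))
        if tokens:
            index[tokens[0]].append((term, tokens))
    return index


def find_terms(section, index):
    """Return (term, start, end) for every dictionary term found in section."""
    spans = [(normalize(m.group()), m.start(), m.end())
             for m in WORD.finditer(section)]
    words = [w for w, _, _ in spans]
    hits = []
    for i, word in enumerate(words):
        # Only terms that start with this word are compared token by token.
        for term, term_tokens in index.get(word, ()):
            n = len(term_tokens)
            if tuple(words[i:i + n]) == term_tokens:
                hits.append((term, spans[i][1], spans[i + n - 1][2]))
    return hits


if __name__ == "__main__":
    index = build_index(["heart attack", "aspirin"])
    text = "Aspirin lowers the risk of a heart attack."
    for term, start, end in find_terms(text, index):
        print(term, start, end, repr(text[start:end]))
```

Because the dictionary is normalized once in `build_index`, the terms are not re-processed for every section, which is the concern raised in question 4. The returned offsets could be used by a client to highlight the found terms.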