Phil,

From what you described so far, I don't see any red flags. I would pay attention to reading those timestamps (covered on the Wiki and ML archives), that's all.
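
On the timestamp point, here is a minimal sketch of the kind of conversion involved. It assumes the logs use Apache-style timestamps such as "13/Jan/2009:20:49:33 -0500" (the thread does not say which webserver or log format is in use) and that the target is a Solr date field, which expects UTC in the form 1995-12-31T23:59:59Z. The function name to_solr_date is made up for illustration.

```python
from datetime import datetime, timezone


def to_solr_date(log_timestamp: str) -> str:
    """Convert an Apache-style log timestamp to Solr's UTC date format.

    Assumption: input looks like "13/Jan/2009:20:49:33 -0500".
    Note that %b (month abbreviation) depends on the current locale;
    this sketch assumes an English/C locale.
    """
    # Parse day/month/year:time plus the numeric UTC offset.
    parsed = datetime.strptime(log_timestamp, "%d/%b/%Y:%H:%M:%S %z")
    # Normalize to UTC and emit the trailing "Z" that Solr date fields expect.
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    print(to_solr_date("13/Jan/2009:20:49:33 -0500"))
```

Normalizing to UTC before indexing keeps range queries on the time field consistent even if the logs come from servers in different time zones.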

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: philmccarthy <philmccar...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, January 13, 2009 8:49:33 PM
> Subject: Indexing the same data in many records
>
> Hi,
>
> I'd like to use Solr to index some webserver logs, in order to allow easy
> ad-hoc querying and analysis. Each Solr Document will represent a single
> request to the webserver, with fields for time, request URL, referring URL
> etc.
>
> I'm also planning to fetch the page source of each referring URL, and add
> that as an indexed field in the Solr document. The aim is to allow queries
> like "find hits to /xyz.html where the referring page contains the word
> 'foobar'".
>
> Since hundreds or even thousands of hits may all come from the same
> referring page, would this approach be horribly inefficient? (Note the page
> source won't be stored in each Document, just indexed). Am I going to
> dramatically increase the index size if I do this?
>
> If so, is there a more elegant way to do what I want?
>
> Many thanks,
> Phil
>
> --
> View this message in context:
> http://www.nabble.com/Indexing-the-same-data-in-many-records-tp21448465p21448465.html
> Sent from the Solr - User mailing list archive at Nabble.com.