Phil,

From what you described so far, I don't see any red flags. I would pay
attention to reading those timestamps (covered on the Wiki and in the ML
archives), that's all.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: philmccarthy <philmccar...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, January 13, 2009 8:49:33 PM
> Subject: Indexing the same data in many records
> 
> 
> Hi,
> 
> I'd like to use Solr to index some webserver logs, in order to allow easy
> ad-hoc querying and analysis. Each Solr Document will represent a single
> request to the webserver, with fields for time, request URL, referring URL
> etc.
> 
> I'm also planning to fetch the page source of each referring URL, and add
> that as an indexed field in the Solr document. The aim is to allow queries
> like "find hits to /xyz.html where the referring page contains the word
> 'foobar'".
> 
> Since hundreds or even thousands of hits may all come from the same
> referring page, would this approach be horribly inefficient? (Note the page
> source won't be stored in each Document, just indexed). Am I going to
> dramatically increase the index size if I do this?
> 
> If so, is there a more elegant way to do what I want?
> 
> Many thanks,
> Phil
> 
> 
> 
> -- 
> View this message in context: 
> http://www.nabble.com/Indexing-the-same-data-in-many-records-tp21448465p21448465.html
> Sent from the Solr - User mailing list archive at Nabble.com.
