Hi John,

Have you considered buying an existing commercial product that delivers
what you want (searching over log files / maybe monitoring)? It may be
cheaper than developing it... http://www.splunk.com/product

Just a disclaimer: I'm not affiliated with the company or the product,
so if you need more information, have a look at their website or
contact them.

Regards,
Daniel

  

-----Original Message-----
From: Jibo John [mailto:jiboj...@mac.com] 
Sent: 23 July 2009 18:44
To: solr-user@lucene.apache.org
Subject: Re: Storing string field in solr.ExternalFieldFile type

Thanks for the quick response, Otis.

We have been able to achieve a ratio of 2 with different settings.
However, considering the huge volume of data that we need to deal with
(600 GB of data per day, which we need to keep in the index for 3 days),
we're looking at all possible ways to reduce the index size further.
Will definitely keep exploring the straightforward things and see if we
can find a better setting.
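
As a rough back-of-the-envelope check on the figures above, the retained
index size can be estimated from the daily volume, the retention period,
and the index-to-raw-data ratio. This is only a sketch: the function name
is hypothetical, and it assumes the ratio applies uniformly to all of the
data, which has not been measured here.

```python
def estimate_index_size_gb(daily_raw_gb, retention_days, index_to_raw_ratio):
    """Estimate the total index size (GB) held at steady state.

    Assumes every day's raw data expands by the same ratio once indexed.
    """
    return daily_raw_gb * retention_days * index_to_raw_ratio


# Figures from this thread: 600 GB/day, kept for 3 days, ratio of 2.
print(estimate_index_size_gb(600, 3, 2))  # 3600 GB, i.e. about 3.6 TB

# Same volume at the ratio of 9 that Otis mentions for one of his indexes.
print(estimate_index_size_gb(600, 3, 9))  # 16200 GB
```

Even at a ratio of 2, this estimate puts the retained index in the
multi-terabyte range, which is why further reduction matters.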


Thanks,
-Jibo

On Jul 23, 2009, at 9:49 AM, Otis Gospodnetic wrote:

> I'm not sure if there is a lot of benefit from storing the literal 
> values in that external file vs. directly in the index.  There are a 
> number of things one should look at first, as far as performance is 
> concerned - JVM settings, cache sizes, analysis, etc.
>
> For example, I have one index here that is 9 times the size of the 
> original data because of how its fields are analyzed.  I can change 
> one analysis-level setting and make that ratio go down to 2.  So I'd 
> look at other, more straightforward things first.  There is a page
> on either the Solr or the Lucene Wiki dedicated to various search
> performance tricks.
>
> Otis
> --
> Sematext is hiring: http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>
>
>
> ----- Original Message ----
>> From: Jibo John <jiboj...@mac.com>
>> To: solr-user@lucene.apache.org
>> Sent: Thursday, July 23, 2009 12:08:26 PM
>> Subject: Re: Storing string field in solr.ExternalFieldFile type
>>
>> Thanks for the response, Eric.
>>
>> We have seen that the size of the index has a direct impact on the
>> search speed, especially when the index size is in GBs, so we are
>> trying all possible ways to keep the index size as low as we can.
>>
>> We thought the solr.ExternalFileField type would help to keep the
>> index size low by storing a text field outside of the index.
>>
>> Here's what we were planning: initially, all the fields except the
>> solr.ExternalFileField type field will be queried and displayed to
>> the end user. Subsequent calls from the UI will then pull the
>> solr.ExternalFileField field, which will be loaded lazily.
>>
>> However, we realized that solr.ExternalFileField only supports the
>> float type, whereas the data that we're planning to keep as an
>> external field is a string.
>>
>> Thanks,
>> -Jibo
>>
>>
>>
>> On Jul 22, 2009, at 1:46 PM, Erick Erickson wrote:
>>
>>> Hoping the experts chime in if I'm wrong, but....
>>> As far as I know, while storing a field increases the size of an 
>>> index, it doesn't have much impact on the search speed. Which you 
>>> could pretty easily test by creating the index both ways, firing
>>> off some timing queries, and comparing... although it would be
>>> time-consuming.
>>>
>>> I believe there's some info on the Lucene Wiki about this, but my 
>>> memory isn't what it used to be.
>>>
>>> Erick
>>>
>>>
>>> On Tue, Jul 21, 2009 at 2:42 PM, Jibo John wrote:
>>>
>>>> We're in the process of building a log searcher application.
>>>>
>>>> In order to reduce the index size to improve the query performance,
>>>> we're exploring the possibility of having:
>>>>
>>>> 1. One field for each log line with 'indexed=true & stored=false'
>>>> that will be used for searching
>>>> 2. Another field for each log line of type solr.ExternalFileField
>>>> that will be used only for display purposes.
>>>>
>>>> We realized that currently solr.ExternalFileField supports only 
>>>> float type.
>>>>
>>>> Is there a way we can override this to support string type? Any 
>>>> issues with this approach?
>>>>
>>>> Any ideas are welcome.
>>>>
>>>>
>>>> Thanks,
>>>> -Jibo
>>>>
>>>>
>>>>
>

