Hi,
We are planning to use Solr for indexing the server log contents.
The expected processed log file size per day: 100 GB.
We expect to retain these indexes for 30 days (100 GB * 30 ~ 3 TB).
Can anyone advise what the optimal size of the index is that I can
store on a single server,
down (2GB RAM sticks
> are much cheaper
> than 4GB RAM sticks $20 < $100).
>
> Ian.
>
> On Wed, Aug 5, 2009 at 1:44 PM, Silent Surfer wrote:
>
> >
> > Hi ,
> >
> > We are planning to use Solr for indexing the server
> log contents.
> > The e
> wunder
>
> On Aug 5, 2009, at 10:08 PM, Silent Surfer wrote:
>
> >
> > Hi,
> >
> > That means we need approximately 3000 GB (Index
> Size)/24 GB (RAM) =
> > 125 servers.
> >
> > It would be very hard to convince my org to go for 125
Hi,
I am a newbie to Solr. We recently started using Solr.
We are using Solr to process the server logs. We are creating indexes for
each line of the logs, so that users can do fine-grained searches
down to the second/millisecond.
Now what we are observing is that the index size that is being create
Hi,
We observed that when we use the setting "compressed=true", the index size is
around 0.66 times the actual log file size, whereas if we do not use the
compressed=true setting, the index size is almost 2.6 times the log file size.
Our sample Solr document size is approximately 1000 bytes. In addition to
Hi,
If you have not gone live already, I would suggest using a long field instead
of a date field. According to our testing, searches based on date fields are very
slow compared to searches based on a long field.
You can use System.currentTimeMillis() to get the time.
When showing it to the user, a
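A minimal sketch of that approach in Java: index the epoch milliseconds from System.currentTimeMillis() in the long field, and format it back into a readable timestamp only at display time. The class name, UTC zone, and formatter pattern below are assumptions for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class EpochMillisField {
    // Display-time formatter; UTC and this pattern are illustrative choices.
    static final DateTimeFormatter FMT = DateTimeFormatter
            .ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'").withZone(ZoneOffset.UTC);

    // Value to put into the Solr long field at index time.
    static long nowMillis() {
        return System.currentTimeMillis();
    }

    // Convert the stored long back to a readable timestamp only when rendering.
    static String display(long millis) {
        return FMT.format(Instant.ofEpochMilli(millis));
    }

    public static void main(String[] args) {
        System.out.println(display(0L)); // prints 1970-01-01T00:00:00.000Z
    }
}
```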
Hi,
Currently we are using Solr 1.3 and we have the following requirement.
As we need to process very high volumes of documents (on the order of 400 GB
per day), we are planning to separate the indexer(s) and searcher(s), so that
there won't be a performance hit.
Our idea is to have a set of s
Hi,
Is there any way to dynamically point the Solr servers to an index/data
directories at run time?
We are generating 200 GB worth of index per day and we want to retain the index
for approximately 1 month. So our idea is to keep the first 1 week of index
available at any time for the users i.
Hi,
Thank you Michael and Chris for the response.
Today after the mail from Michael, we tested with the dynamic loading of cores
and it worked well. So we need to go with the hybrid approach of Multicore and
Distributed searching.
As per our testing, we found that a Solr instance with 20 GB o
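The dynamic core loading mentioned above is driven through Solr's CoreAdmin handler (action=CREATE with an instanceDir). A small sketch that only builds the request URL; the core name "week1" and the directory path are made-up examples, not values from this thread:

```java
public class CoreAdminUrl {
    // Build the CoreAdmin CREATE request used to load a core at runtime.
    // The core name and instanceDir arguments are caller-supplied examples.
    static String createCoreUrl(String solrBase, String name, String instanceDir) {
        return solrBase + "/admin/cores?action=CREATE&name=" + name
                + "&instanceDir=" + instanceDir;
    }

    public static void main(String[] args) {
        System.out.println(createCoreUrl(
                "http://localhost:8983/solr", "week1", "/data/solr/week1"));
        // prints http://localhost:8983/solr/admin/cores?action=CREATE&name=week1&instanceDir=/data/solr/week1
    }
}
```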
> extraneous storage from Solr
> -- the stored data
> is mixed in with the index data and so it slows down
> searches.
> You could also put all 200G onto one Solr instance rather
> than 10 for >7days
> data, and accept that those searches will be slower.
>
> Micha
Hi Lici,
You may want to try the following snippet:
---
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.params.ModifiableSolrParams;

SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("wt", "json"); // response writer: can be json or standard (XML)
Hi,
I am new to the Lucene forum and this is my first question. I need a
clarification from you.
Requirement: 1. Build an IT search tool for logs similar to
that of Splunk (only with respect to searching logs, not in terms of reporting,
graphs, etc.) using Solr/Lucene. The log files are mainly the
Hi,
Any help/pointers on the following message would really help me..
Thanks,
Surfer
--- On Tue, 6/2/09, Silent Surfer wrote:
From: Silent Surfer
Subject: Questions regarding IT search solution
To: solr-user@lucene.apache.org
Date: Tuesday, June 2, 2009, 5:45 PM
Hi,
I am new to Lucene forum
Personally, I'd start with Hadoop instead of Solr. Putting logs in a
> search index is guaranteed to not scale. People were already trying
> different approaches ten years ago.
>
> wunder
>
> On 6/4/09 8:41 AM, "Silent Surfer" wrote:
>
>> Hi,
>> Any h
so that at query time you send (or distribute
> if you have to) the query to only those shards that have the data (if your
> query is for a limited time period).
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> - Original Message
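Otis's suggestion above (send the query only to the shards that hold the relevant time range) can be sketched as follows, assuming one core/shard per day and a made-up "hostN:8983/solr" naming scheme; none of these names come from the thread:

```java
import java.util.ArrayList;
import java.util.List;

public class ShardPicker {
    // Assume one shard per day; only the shards covering the queried day
    // range are included in the value of Solr's "shards" parameter.
    static String shardsParam(int firstDay, int lastDay) {
        List<String> shards = new ArrayList<String>();
        for (int d = firstDay; d <= lastDay; d++) {
            shards.add("host" + d + ":8983/solr");
        }
        return String.join(",", shards);
    }

    public static void main(String[] args) {
        // A query spanning days 3..5 touches only three of the 30 daily shards.
        System.out.println(shardsParam(3, 5));
        // prints host3:8983/solr,host4:8983/solr,host5:8983/solr
    }
}
```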
'X'. Is there any preference/option available in Solr which can be set so that
the search results contain only the 3 lines above and 3 lines below the line
where the keyword matches successfully.
Thanks,
Silent Surfer
Hi Mitch,
The configuration that you have seems to be perfectly fine.
Could you please let us know what error you are seeing in the logs ?
Also, could you please confirm whether you have the
mysql-connector-java-5.1.12-bin.jar under the lib folder ?
Following is my configuration that I used a
Hi Ankit,
Try the following approach.
Create a query like [1900-01-01T16:00:00Z/HOUR TO 1900-01-01T18:00:00Z/HOUR]
Solr will automatically take care of rounding down to the HOUR specified.
For example:
the query [1900-01-01T16:43:42Z/HOUR TO 1900-01-01T18:55:23Z/HOUR]
would be equivalent t
Small typo. Corrected and sending:
the query [1900-01-01T16:43:42Z/HOUR TO 1900-01-01T18:55:23Z/HOUR]
would be equivalent to
[1900-01-01T16:00:00Z TO 1900-01-01T18:00:00Z]
Thx,
Tiru
- Original Message
From: Silent Surfer
To: solr-user@lucene.apache.org
Sent: Wed, March 31
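What Solr's /HOUR date-math rounding does can be illustrated locally with java.time; note this is a stand-in reimplementation for illustration, not Solr's own DateMathParser:

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class HourFloor {
    // Floor a timestamp to the start of its hour, mirroring what
    // appending /HOUR to a Solr date does.
    static Instant floorToHour(Instant t) {
        return t.truncatedTo(ChronoUnit.HOURS);
    }

    public static void main(String[] args) {
        System.out.println(floorToHour(Instant.parse("1900-01-01T16:43:42Z")));
        // prints 1900-01-01T16:00:00Z
    }
}
```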