Hi Guys,
Are you aware of any standard/specification (like JSR 168/286 for portals, or CMIS
for CMS) for search engines?
Is there any such specification people are currently working on?
Regards,
Sourav
med out. If it's not, then
you're looking more at a query-type solution, where Solr would be
less interesting.
-- Ken
>-----Original Message-----
>From: souravm
>Sent: Saturday, December 06, 2008 9:41 PM
>To: solr-user@lucene.apache.org
>Subject: Limitations of Distributed Search
Hi,
Any inputs on this would be really helpful. Looking for suggestions/viewpoints
from you guys.
Regards,
Sourav
-----Original Message-----
From: souravm
Sent: Saturday, December 06, 2008 9:41 PM
To: solr-user@lucene.apache.org
Subject: Limitations of Distributed Search
Hi,
We are planning to use Solr to process a large volume of application log
files (around 10 billion documents, 5-6 TB in total).
One of the approaches we are considering is to use Distributed Search
extensively.
What we have in mind is distributing the log files across multiple boxes
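A minimal sketch of how such a distributed query would be issued - in Solr, any one node is queried with a `shards` parameter listing all the shard instances, and that node fans the request out and merges the replies. The hostnames here are hypothetical:

```python
from urllib.parse import urlencode

# Hypothetical shard hosts, one Solr instance per box of log data.
shards = [
    "box1:8983/solr",
    "box2:8983/solr",
    "box3:8983/solr",
]

# Any one node can act as the coordinator; it fans the query out to
# every shard listed and merges the responses.
params = urlencode({
    "q": "level:ERROR",
    "shards": ",".join(shards),
    "rows": 10,
})
url = "http://box1:8983/solr/select?" + params
print(url)
```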
Hi All,
Through my testing I found that query performance, when it is not served from
the cache, depends largely on the number of hits and the number of concurrent queries.
In both cases the query is essentially CPU bound.
Just wondering whether we can note this somewhere in the Wiki, as this would
be useful to others.
Hi,
I have huge index files to query. On a first-cut calculation it looks like I
would need around 3 boxes (each box holding no more than 125M records, about 12.5GB)
for each of around 25 apps - so 75 boxes altogether.
However, the number of concurrent users would be smaller - probably not more than
20 at a time.
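The arithmetic behind the box count can be made explicit. The per-box ceiling is the one stated in the post; the per-app total of 375M records is an assumption back-derived from "3 boxes per app":

```python
import math

records_per_app = 375_000_000      # assumed total per app (back-derived)
max_records_per_box = 125_000_000  # stated per-box ceiling (~12.5GB)
apps = 25

boxes_per_app = math.ceil(records_per_app / max_records_per_box)
total_boxes = boxes_per_app * apps
print(boxes_per_app, total_boxes)  # 3 boxes per app, 75 boxes in total
```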
Hi All,
Say I have started a new Solr server instance using start.jar. When exactly
would a new Searcher be created for this Solr server instance?
I am aware of the following scenarios -
1. When the instance is started, a new Searcher is created and autowarmed. But
not sure w
Solr with Hadoop
Ah sorry, I had misread your original post. 3-6M docs per hour can be
challenging.
Using the CSV loader, I've indexed 4000 docs per second (14M per hour)
on a 2.6GHz Athlon, but they were relatively simple and small docs.
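The rate quoted above is easy to sanity-check:

```python
docs_per_second = 4000
docs_per_hour = docs_per_second * 3600
print(docs_per_hour)  # 14,400,000 - i.e. roughly the "14M per hour" quoted
```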
On Fri, Nov 28, 2008 at 9:54 PM, souravm <[EMAIL PRO
You could also leave the indexes on multiple
boxes and use Solr's distributed search to search across them
(assuming you don't really need everything on a single box).
-Yonik
On Fri, Nov 28, 2008 at 7:01 PM, souravm <[EMAIL PROTECTED]> wrote:
> Hi Yonik,
>
> Let me
the bigger the indexing job, the more it
makes sense to do it in parallel. If you're not doing any link inversion
for web search, it doesn't seem like Hadoop is needed for parallelism.
If you are doing web crawling, perhaps look to nutch, not hadoop.
-Yonik
On Fri, Nov 28, 2008 at 1:31 PM, souravm <[E
Hi All,
I have a huge number of documents to index (say, per hour) and within an hour I cannot
complete it using a single machine. Having them distributed across multiple boxes
and indexing them in parallel is not an option, as my target doc count per hour
itself can be very huge (3-6M). So I am considering using Hadoop.
:40 AM
To: solr-user@lucene.apache.org
Cc: souravm
Subject: Re: Sorting and JVM heap size
On Tue, Nov 25, 2008 at 7:49 AM, souravm <[EMAIL PROTECTED]<mailto:[EMAIL
PROTECTED]>> wrote:
3. Another case is - if there are 2 search requests concurrently hitting the
server, each wit
, souravm <[EMAIL PROTECTED]> wrote:
> Hi Yonik,
>
> Thanks again for the detail input.
>
> Let me try to re-confirm my understanding -
>
> 1. What you say is - if sorting is asked for a field, the same field from all
> documents which are indexed would be put in a memory-resident cache.
correct.
Regards,
Sourav
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley
Sent: Monday, November 24, 2008 6:03 PM
To: solr-user@lucene.apache.org
Subject: Re: Sorting and JVM heap size
On Mon, Nov 24, 2008 at 8:48 PM, souravm <[EM
Subject: Re: Sorting and JVM heap size
On Mon, Nov 24, 2008 at 6:26 PM, souravm <[EMAIL PROTECTED]> wrote:
> I have indexed data of size around 20GB. My JVM memory is 1.5GB.
>
> For this data if I do a query with sort flag on (for a single field) I always
> get a Java out-of-memory exception
Hi,
I have indexed data of size around 20GB. My JVM heap is 1.5GB.
For this data, if I do a query with the sort flag on (for a single field) I always
get a Java out-of-memory exception, even if the number of hits is 0. With no
sorting (or the default sorting by score) the query works perfectly fine.
I
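The behaviour described is consistent with Lucene's FieldCache: sorting on a field loads one entry per indexed document onto the heap, regardless of how many documents match the query. A rough back-of-the-envelope estimate - the document count and bytes-per-entry below are assumptions for illustration, not figures from the post:

```python
num_docs = 200_000_000   # assumed doc count for a ~20GB index
bytes_per_entry = 8      # e.g. a long/double sort value, one per doc

# FieldCache cost is per-document, independent of the hit count.
field_cache_bytes = num_docs * bytes_per_entry
print(f"{field_cache_bytes / 2**30:.1f} GiB")  # ~1.5 GiB for this one field
```

That single array alone would consume essentially the whole 1.5GB heap, which matches the out-of-memory behaviour even with zero hits.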
> then reducing the results afterwards).
Have a look at http://wiki.apache.org/solr/DistributedSearch for more info.
You could also take a look at Hadoop. (http://hadoop.apache.org/)
regards,
Aleks
On Mon, 24 Nov 2008 06:24:51 +0100, souravm <[EMAIL PROTECTED]> wrote:
> Hi,
Hi,
Looking for some insight on distributed search.
Say I have an index distributed across 3 boxes, and the index contains time and text
data (a typical log file). Each box holds the index for a different timeline - say Box 1
for Jan to April, Box 2 for May to August, and Box 3 for Sep to Dec.
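With that layout, a client could even pick only the shards whose timeline overlaps the query window, rather than fanning out to all three. A sketch, using the hypothetical box names and the month ranges from the example:

```python
# Month ranges covered by each hypothetical box (inclusive, 1=Jan .. 12=Dec).
SHARDS = {
    "box1:8983/solr": (1, 4),   # Jan-April
    "box2:8983/solr": (5, 8),   # May-August
    "box3:8983/solr": (9, 12),  # Sep-Dec
}

def shards_for(start_month: int, end_month: int) -> list[str]:
    """Return only the shards whose range overlaps [start_month, end_month]."""
    return [
        host
        for host, (lo, hi) in SHARDS.items()
        if lo <= end_month and start_month <= hi
    ]

print(shards_for(2, 6))  # a Feb-June query needs only box1 and box2
```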
Now if I tr
Hi,
Is there a way to specify sort criteria through the Solr admin UI? I tried doing it
through the query statement box but it did not work.
Regards,
Sourav
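For reference, sorting in Solr is an ordinary request parameter rather than an admin-UI feature, so it can be appended to the query the admin form produces. A sketch of building such a request (the field name `timestamp` is hypothetical):

```python
from urllib.parse import urlencode

params = urlencode({
    "q": "solr",
    "sort": "timestamp desc",  # hypothetical field; syntax is "<field> asc|desc"
    "rows": 10,
})
print("/select?" + params)
```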
separately, and tossing it in as a
"plugin". We probably should be creating all these sorts of goodies
as independent modules of code that aren't "core", but then it gets
fuzzy to say what is core and what isn't.
Erik
On Nov 13, 2008, at 8:26 PM, so
Hi,
As I understand it, the STATS functions (Min, Max, Average, Standard Deviation,
etc.) will be available in Solr 1.4.
Just wondering if they are already there in the latest trunk. If not, can anyone
suggest any other tool which can be used with Solr 1.3 to achieve this
requirement?
Regards,
Sou
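Until StatsComponent is available, the same figures can be computed client-side from the returned field values. A minimal sketch using Python's standard library (the sample values are made up):

```python
import statistics

# Values of a numeric field pulled from query results (made-up sample).
values = [12.0, 7.5, 9.1, 14.2, 11.3]

print("min   :", min(values))
print("max   :", max(values))
print("mean  :", statistics.mean(values))
print("stdev :", statistics.stdev(values))
```

This only works when the client can fetch all the relevant field values, so it is a stopgap rather than a replacement for server-side stats.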
oop/Hive
http://wiki.apache.org/hadoop/Chukwa
http://incubator.apache.org/pig/
On Fri, Nov 7, 2008 at 9:03 PM, souravm <[EMAIL PROTECTED]> wrote:
> Hi Guys,
>
> Here I'm struggling to decide whether Solr would be a fitting solution
> for me. I'd highly appreciate your inputs.
Hi Guys,
Here I'm struggling to decide whether Solr would be a fitting solution for
me. I'd highly appreciate your inputs.
The key requirements can be summarized as below -
1. Need to process a very high volume of data online from log files of various
applications - around 100s of millions of documents, of total size
Thanks Noble for your answer.
Regards,
Sourav
-Original Message-
From: Noble Paul നോബിള് नोब्ळ् [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 06, 2008 7:41 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Multicore ...
On Fri, Nov 7, 2008 at 3:28 AM, souravm <[EMAIL PROTEC
requests to other
Solr instances you specified and will merge the results.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: souravm <[EMAIL PROTECTED]>
> To: "solr-user@lucene.apache.org"
> Sent: Thursday, No
Hi,
Can I use the multicore feature to have multiple indexes (that is, each core would
hold one type of index) within a single Solr instance?
Will there be any performance impact due to this type of setup?
Regards,
Sourav
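For reference, this is exactly what multicore is for: each core gets its own configuration and index directory inside one Solr instance. A minimal solr.xml sketch in the 1.3-era format (the core names and paths are hypothetical):

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- One core per index type; each has its own conf/ and data/ dirs. -->
    <core name="logs"     instanceDir="logs"/>
    <core name="products" instanceDir="products"/>
  </cores>
</solr>
```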
Hi,
I have a query about distributed search.
The wiki mentions that Solr can query and merge results from an index split
into multiple shards.
My question is: which server actually does the job of merging? Will there be a
separate master node/shard to do the merging job?
Also, is there any performan
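The merging idea itself is simple: whichever node receives the request with the `shards` parameter acts as the coordinator, collects each shard's sorted response, and keeps the global top N. A toy sketch of that merge step (the scores and doc IDs are made up):

```python
import heapq

# Hypothetical per-shard results: (score, doc_id) pairs, already sorted desc.
shard1 = [(0.91, "a"), (0.55, "c")]
shard2 = [(0.87, "b"), (0.40, "d")]

# The coordinating node merges the shard responses and keeps the global
# top-N by score - the same idea Solr's coordinator applies to shard replies.
top = heapq.nlargest(3, shard1 + shard2)
print(top)  # [(0.91, 'a'), (0.87, 'b'), (0.55, 'c')]
```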
Hi Fergus,
Does the 6.6M-document index reside on a single box (node) or on multiple boxes? Do you use
distributed search?
Regards,
Sourav
- Original Message -
From: Fergus McMenemie <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Wed Nov 05 08:21:45 2008
Subject: Re: Large Data Set Suggestions
Hi,
I'm new to Solr. Here is a query about distributed search.
I have a huge volume of log files which I would like to search. Apart from
generic text search I would also like to get statistics - say each record has a
field giving the request processing time, and I would like to get the average of
processing time.
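Averages merge cleanly across shards as long as each shard returns a (sum, count) pair rather than its own average - averaging the per-shard averages would be wrong when shard sizes differ. A sketch of the client-side combination (the per-shard numbers are made up):

```python
# Hypothetical (sum_of_processing_time, record_count) per shard.
shard_stats = [
    (1200.0, 400),   # box 1
    (950.0, 310),    # box 2
    (1430.0, 505),   # box 3
]

total_sum = sum(s for s, _ in shard_stats)
total_count = sum(c for _, c in shard_stats)
print(total_sum / total_count)  # global average, not an average of averages
```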