We will be using Solr for indexing and Cassandra/Membase/HBase instead of a
database.
That is the idea for now, unless somebody suggests a better solution :-)
thanks
--Siju
On Tue, Aug 31, 2010 at 11:39 AM, Amit Nithian wrote:
I am curious about this too... Are you talking about using HBase/Cassandra as
an auxiliary store for large data, or using Cassandra to store the actual
Lucene index (as in Lucandra)?
On Mon, Aug 30, 2010 at 11:06 PM, Siju George wrote:
Thanks a million Nick,
We are currently debating whether we should use Cassandra, Membase, or
HBase with Solr.
Do you have any advice to contribute?
Thanks again :-)
--Siju
Hey,
Currently we have indexed some biological fulltext files. I was wondering how
to configure the schema.xml so that gene names (e.g. 'met1', 'met2', 'met3')
won't be stemmed into the same word ('met'). I added these gene names to the
protwords.txt file, but it doesn't seem to work.
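For reference, a sketch of an analyzer whose stemmer honors protwords.txt.
This assumes the Snowball stemmer; if your field type uses a different
stemming filter, the protected list is ignored, and a re-index is needed
after any analyzer change:

<fieldType name="text_genes" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- tokens listed in protwords.txt (one per line, e.g. met1)
         pass through unstemmed -->
    <filter class="solr.SnowballPorterFilterFactory" language="English"
            protected="protwords.txt"/>
  </analyzer>
</fieldType>

Since the protected check runs after lower-casing here, the protwords.txt
entries must be lower case too.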
Has anyone managed to deploy Solr 1.4.1 into JBoss AS 6? If yes, could
you provide the required steps for deployment?
Thanks,
Bruno
There are synchronization points, which become chokepoints at some
number of cores. I don't know where they cause Lucene to top out.
Lucene apps are generally disk-bound, not CPU-bound, but yours will
be. There are so many variables that it's really not possible to give
any numbers.
Lance
On Mon,
Lance,
That makes sense. I have heard about the long GC times on large heaps, but I
personally haven't experienced a slowdown, though that doesn't mean much
either :-). Agreed that tuning the Solr caching is the way to go.
I haven't followed all the Solr/Lucene changes, but from what I remember
there
I am also curious, as Amit is. Can you give an example of the garbage
collection problem you mentioned?
- Original Message -
From: "Lance Norskog"
To:
Sent: Tuesday, August 31, 2010 9:14 AM
Subject: Re: Hardware Specs Question
This is a mass batch-processing task, rather than a search task.
Mahout is the right Apache project for implementing this. It would
then create a set of (document->document list). You could then add
this to a Solr index. (And invert the graph and add those lists.)
It might be possible to do this w
: how could I have the highlighting component return only the terms that were
: matched, without any surrounding text ?
I'm not a Highlighter expert, but this is something that certainly
*sounds* like it should be easy.
I took a shot at it and this is the best I could come up with...
http://lo
It generally works best to tune the Solr caches and allocate enough
RAM to run comfortably. Linux, Windows, et al. have their own cache
of disk blocks, and they use very good algorithms for managing it.
Also, they do not make long garbage collection passes.
On Mon, Aug 30, 2010 at 5:48 PM, Am
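For reference, those caches are configured in solrconfig.xml; a sketch with
illustrative sizes (tune them against the hit ratios on the admin stats page):

<!-- solrconfig.xml; the sizes below are placeholders, not recommendations -->
<filterCache class="solr.FastLRUCache" size="512"
             initialSize="512" autowarmCount="128"/>
<queryResultCache class="solr.LRUCache" size="512"
                  initialSize="512" autowarmCount="32"/>
<documentCache class="solr.LRUCache" size="512"
               initialSize="512"/>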
Lance,
Thanks for your help. What do you mean by the OS keeping the index in
memory better than Solr? Do you mean that you should use another means to
keep the index in memory (i.e. a ramdisk)? Is there a generally accepted
heap size to index size ratio that you follow?
Thanks
Amit
On Mon, Aug 30, 20
Short summary:
* Multiple simultaneous phrase boosts with different ps2 parameters
are working very nicely for me on a few million doc QA system.
* I've submitted an updated patch to Jira incorporating feedback
from the jira comments. Will be testing it more this week.
https://issues
The price-performance knee for small servers is 32 GB RAM, 2-6 SATA
disks in a RAID, and 8/16 cores. You can buy these servers and half-fill
them, leaving room for expansion.
I have not done benchmarks on the max # of processors that can be
kept busy during indexing or querying, and the total numbers
Hi all,
I am curious to get some opinions on at what point having more CPU
cores shows diminishing returns in terms of QPS. Our index size is about 8GB
and we have 16GB of RAM on a quad-core 4 x 2.4 GHz AMD Opteron 2216.
Currently I have the heap set to 8GB.
We are looking to get more servers to
Yes, we are using Cassandra. There is nothing much to say really, it just
works. Note we are generating Solr indexes using Java & SolrJ (embedded
mode) and reading data out of Cassandra with Java. Index generation is fast.
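For anyone wondering what the embedded indexing side looks like, a minimal
SolrJ sketch; it assumes solr.home points at a configured core, and the
field names are made up:

import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreContainer;

public class EmbeddedIndexer {
  public static void main(String[] args) throws Exception {
    // Bootstraps the core(s) from solr.home / solr.xml.
    CoreContainer container = new CoreContainer.Initializer().initialize();
    EmbeddedSolrServer server = new EmbeddedSolrServer(container, "");

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "row-1");                         // hypothetical fields
    doc.addField("body", "a row read out of Cassandra");
    server.add(doc);
    server.commit();

    container.shutdown();
  }
}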
Hi-
Here is how it works: Lucene uses TF/DF as the "relevance" formula.
This means "term frequency divided by document frequency": the
number of times a term appears in one document over the number of
documents that term appears in.
This is the basic idea: suppose there are 10 documents say "s
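A hedged worked example, using the formulas from Lucene's DefaultSimilarity
(boosts and length norms left out):

tf(t, d) = sqrt(freq)                       freq = 4       ->  tf  = 2.0
idf(t)   = 1 + ln(numDocs / (docFreq + 1))  numDocs = 10,
                                            docFreq = 2    ->  idf ~= 2.2

So a term occurring 4 times in a document but found in only 2 of the 10
documents contributes roughly tf * idf = 4.4, while a term found in all 10
documents gets idf ~= 0.9 and contributes much less.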
On 26.08.2010 at 21:07, Ingo Renner wrote:
For those interested and for "the" Google, I found a working solution myself.
The QParser is now down to this:
public AccessFilterQParser(String qstr, SolrParams localParams,
    SolrParams params, SolrQueryRequest req) {
  super(qstr, localParams, params, req);
}
Hi all,
I'm looking for examples or pointers to some info on implementing custom
scoring in solr/lucene. Basically, what we're looking at doing is to augment
the score from a dismax query with some custom signals based on data in fields
from the row initially matched. There will be several of t
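One common way to get part of this without custom code, sketched here with
made-up field names: the dismax bf parameter adds function-query boosts
computed from fields of each matched document:

q=laptop&defType=dismax
  &qf=title^2 body
  &bf=log(popularity)^0.5 recip(rord(creationDate),1,1000,1000)^0.3

For signals that bf functions can't express, a custom function query (a
ValueSourceParser plugin) is usually a smaller job than replacing the
Similarity.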
The new spatial filtering (SOLR-1586) works great and is much faster
than fq={!frange. However, I am having problems sorting by distance.
If I try
GET
'http://localhost:8983/solr/select/?q=*:*&sort=dist(2,latitude,longitude,0,0)+asc'
I get an error:
Error 400 can not sort on unindexed field: dist(
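A hedged workaround from before function-query sorting (SOLR-1297) was
available: make the distance the score, keep the real query in an fq, and
sort on score. The fq value here is made up:

GET 'http://localhost:8983/solr/select/?q={!func}dist(2,latitude,longitude,0,0)&fq=type:pizza&sort=score+asc'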
please come to the Southern California area
On Mon, Aug 30, 2010 at 1:14 PM, Grant Ingersoll wrote:
I'm pleased to announce the very first ever RTP area (Raleigh, Durham, Chapel
Hill NC) Lucene/Solr meetup on Sept. 21. The event will be held at Lulu Press
and co-sponsored by Lucid Imagination. To learn more and RSVP, please see
http://www.meetup.com/RTP-Apache-Solr-Lucene-Meetup/
Hope to se
Thanks for the section on "Passing parameters to DIH config":
I'm going to try the parameter passing to allow the DIH to index
different DBs based on the system environment (local dev machine or
production machine).
@tommychheng
Programmer and UC Irvine Graduate Student
Find a great grad sch
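For the record, a sketch of what that looks like; the jdbcUrl parameter name
is invented. Anything extra on the request URL becomes visible in
data-config.xml as ${dataimporter.request.*}:

<dataSource driver="com.mysql.jdbc.Driver"
            url="${dataimporter.request.jdbcUrl}"
            user="solr" password="secret"/>

invoked per environment with:

/dataimport?command=full-import&jdbcUrl=jdbc:mysql://localhost/dev_db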
Thanks Lance.
I have decided to just put all of my processing on a bigger server along
with solr. It's too bad, but I can manage.
-Max
On Sun, Aug 29, 2010 at 9:59 PM, Lance Norskog wrote:
> No. Document creation is all-or-nothing, fields are not updateable.
>
> I think you have to filter all
On 8/29/2010 2:17 PM, Erick Erickson wrote:
<<>>
Try putting this after any instances of, say, WhitespaceTokenizerFactory
in your analyzer definition, and I believe you'll see that this is not
true.
At least looking at this in the analysis page from Solr admin sure doesn't
seem to support that
Hello everyone,
I installed the JTeam Solr spatial plugin into Solr 1.4.
It seems to work fine, except that I am unable to get the calculated
distance field back.
q={!spatial lat=49.294854 long=8.36869 radius=100 unit=km calc=arc
threadCount=2}*:*
fl=geo_distance
Any help would greatly be appreci
Hi Grant,
Thanks for the explanation.
Regards
ericz
On Mon, Aug 30, 2010 at 3:22 PM, Grant Ingersoll wrote:
>
> On Aug 30, 2010, at 7:20 AM, Eric Grobler wrote:
>
> > Hi Solr Community
> >
> > If you use a filter like:
> > q=*:*
> > fq=make:Volkswagen
> >
> > and then the next query is:
> >
Hi,
Several documents in my index contain the phrase "PS et".
However, PS is expanded to "parti socialiste", and a phrase search for
"PS et" fails.
A phrase search for "parti socialiste et" succeeds.
Can I make both queries work?
Here's the field type:
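Whatever the exact field type in play, a hedged sketch of the usual fix:
multi-word synonyms expanded at query time break phrase queries, so apply
the SynonymFilterFactory at index time only, e.g.:

<fieldType name="text_syn" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- expand multi-word synonyms while indexing... -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- ...and not at query time, so phrases stay intact -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

This requires a re-index, since the expansion moves to index time.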
Hi,
Is there any implementation in Solr or Lucene of "affinity ranking"? I've
been doing some research on content-based ranking models and came across
the paper "Improving Search Results Using Affinity Graph":
http://research.microsoft.com/apps/pubs/default.aspx?id=67818
Any thoughts?
Cheers
Uk
Some of it will also depend on things like your caches, heap size, etc.
-Grant
On Aug 26, 2010, at 12:37 AM, Chengyang wrote:
> We have about 500 million documents indexed. The index size is about 10G.
> Running on a 32-bit box. During the pressure testing, we monitored that the
> JVM GC is
On Aug 30, 2010, at 7:20 AM, Eric Grobler wrote:
> Hi Solr Community
>
> If you use a filter like:
> q=*:*
> fq=make:Volkswagen
>
> and then the next query is:
> q=blue
> fq=make:Volkswagen
>
> will Solr use the filter cache before the main query, or only after a "blue"
> subset?
The firs
After wasting a few days navigating the somewhat uncharted and murky
waters of DIH, thought I'd share my insights with the community to save
other newbies time, so here goes...
First off, this is not to say DIH is bad; I think it's great and it
works really well for my uses, but it has a few undoc
Hi all,
I am using Solr 1.4.0 with Java.
Recently I observed in my Solr logs, because of an invalid username:
java.sql.SQLException: Access denied for user '1234'@'localhost'
I resolved this, but I am not able to capture this error in my code so as
to throw a proper message to the user.
h
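If the exception is coming out of the DataImportHandler (an assumption
here), two things may help: the entity-level onError attribute decides
whether a bad row aborts the whole import, and the handler's status
response is something client code can poll and inspect for failure
messages:

<!-- data-config.xml: skip offending rows instead of aborting -->
<entity name="users" onError="skip"
        query="select id, name from users"/>

/dataimport?command=status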
Hi Solr Community
If you use a filter like:
q=*:*
fq=make:Volkswagen
and then the next query is:
q=blue
fq=make:Volkswagen
will Solr use the filter cache before the main query, or only after a "blue"
subset?
In other words, would this query make more sense?
q=(blue) AND (make:Volkswagen)
On 26.08.2010 at 21:07, Ingo Renner wrote:
Hi again,
> I implemented a custom filter and am using it through a QParserPlugin. I'm
> wondering however, whether my implementation is that clever yet...
>
> Here's my QParser; I'm wondering whether I should apply the filter to all
> documents in