Facet Query and Query

2008-11-25 Thread Jae Joo
I am having some trouble using the facet query. As far as I know, the facet query has better performance than a simple query (q). Here is the example: http://localhost:8080/test_solr/select?q=*:*&facet=true&fq=state:CA&facet.mincount=1&facet.field=city&facet.field=sector&facet.limit=-
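For readers following this thread, here is a minimal sketch of how a request like the one in the post could be assembled, assuming Python's standard library on the client side. The helper name build_facet_url is hypothetical; the host, path, and parameter values are taken from the example URL. The facet.limit value is cut off in the archive, so it is left out rather than guessed.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_facet_url(base, query, filters, facet_fields, mincount=1):
    """Assemble a Solr select URL with filter queries and field facets.

    Hypothetical helper for illustration; it only builds the query string.
    """
    params = [("q", query), ("facet", "true")]
    # Each filter query (fq) narrows the result set without affecting scoring.
    params += [("fq", f) for f in filters]
    params.append(("facet.mincount", str(mincount)))
    # facet.field may be repeated, once per field to facet on.
    params += [("facet.field", f) for f in facet_fields]
    return base + "?" + urlencode(params)


url = build_facet_url(
    "http://localhost:8080/test_solr/select",
    "*:*",
    ["state:CA"],
    ["city", "sector"],
)
print(url)
# Decode the query string again to check what Solr would receive.
print(parse_qs(urlsplit(url).query))
```

Note that urlencode percent-encodes characters such as ":" and "*", so the printed URL will not look identical to the hand-written one in the post, although it decodes to the same parameters.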

Re: newbie question on SOLR distributed searches with many "shards"

2008-11-25 Thread Noble Paul നോബിള്‍ नोब्ळ्
Anything that is passed as a request parameter can be put into the SearchHandler's defaults or invariants section. This is equivalent to passing the shard URL in the request. However, this means you may need to set up a load balancer if a shard has more than one host. On Wed, Nov 26, 2008 a
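As a rough illustration of the suggestion above, the following Python sketch generates a request handler entry whose defaults section carries the shards parameter. The handler name "/distrib" and the host names are assumptions made up for the example, and the helper shards_handler is hypothetical; the element layout (requestHandler, lst, str) follows the usual solrconfig.xml conventions.

```python
import xml.etree.ElementTree as ET


def shards_handler(name, shard_urls, section="defaults"):
    """Build a <requestHandler> element with a 'shards' parameter.

    section is "defaults" (a request may override it) or
    "invariants" (a request may not override it).
    """
    handler = ET.Element(
        "requestHandler", {"name": name, "class": "solr.SearchHandler"}
    )
    lst = ET.SubElement(handler, "lst", {"name": section})
    param = ET.SubElement(lst, "str", {"name": "shards"})
    # The shards parameter is a comma-separated list of shard addresses.
    param.text = ",".join(shard_urls)
    return handler


# Hypothetical handler name and shard hosts, for illustration only.
handler = shards_handler("/distrib", ["host1:8983/solr", "host2:8983/solr"])
print(ET.tostring(handler, encoding="unicode"))
```

Generating the entry from a list keeps a long shard list out of every request URL, which is the problem the original question raised.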

Re: CachedSqlEntityProcessor's purpose

2008-11-25 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Tue, Nov 25, 2008 at 11:35 PM, Amit Nithian <[EMAIL PROTECTED]> wrote: > Thanks for the responses. Few follow-ups: > 1) It seems that the CachedSQLEntityProcessor performs the where clause in > memory on the cache. Is this cache an in memory RDBMS or maps? It is a hashmap in memory > 2) In the e

Re: copyField stored values question

2008-11-25 Thread Yonik Seeley
On Tue, Nov 25, 2008 at 9:24 PM, Michael Henson <[EMAIL PROTECTED]> wrote: > I set the indexed fields to be stored so that I could see what exactly > my custom types' filters produce. In the Analyzer utility in the Admin > webapp seems to apply the filters properly. However, query results > against

copyField stored values question

2008-11-25 Thread Michael Henson
Hello, I am using copyField to send the raw name of an entity into different fields for indexing: # schema.xml snippet I set the indexed fields to be stored so that I could see what exactly my custom types' filters produce. In the Analyzer utility in the Admin webapp see

Re: Increased garbage with Solr 1.3?

2008-11-25 Thread Walter Underwood
Searching. No facets, but fuzzy matching. --wunder On 11/25/08 5:08 PM, "Yonik Seeley" <[EMAIL PROTECTED]> wrote: > On Tue, Nov 25, 2008 at 7:56 PM, Walter Underwood > <[EMAIL PROTECTED]> wrote: >> We are moving from Solr 1.1 to 1.3, and have noticed that 1.3 is working >> the garbage collector a

Re: Increased garbage with Solr 1.3?

2008-11-25 Thread Yonik Seeley
On Tue, Nov 25, 2008 at 7:56 PM, Walter Underwood <[EMAIL PROTECTED]> wrote: > We are moving from Solr 1.1 to 1.3, and have noticed that 1.3 is working > the garbage collector a lot more. Has anyone else seen this? During indexing or searching? Indexing uses the SolrDocument class as an intermedia

Increased garbage with Solr 1.3?

2008-11-25 Thread Walter Underwood
We are moving from Solr 1.1 to 1.3, and have noticed that 1.3 is working the garbage collector a lot more. Has anyone else seen this? wunder

Re: port of Nutch CommonGrams to Solr for help with slow phrase queries

2008-11-25 Thread Norberto Meijome
On Wed, 26 Nov 2008 10:08:03 +1100 Norberto Meijome <[EMAIL PROTECTED]> wrote: > We didn't notice any severe performance hit but : > - data set isn't huge ( ca 1 MM docs). > - reindexed nightly via DIH from MS-SQL, so we can use a separate cache layer > to lower the number of hits to SOLR. To mak

Re: port of Nutch CommonGrams to Solr for help with slow phrase queries

2008-11-25 Thread Norberto Meijome
On Mon, 24 Nov 2008 13:31:39 -0500 "Burton-West, Tom" <[EMAIL PROTECTED]> wrote: > The approach to this problem used by Nutch looks promising. Has anyone > ported the Nutch CommonGrams filter to Solr? > > "Construct n-grams for frequently occuring terms and phrases while > indexing. Optimize phr

Re: Unknown field error using JDBC

2008-11-25 Thread Jon Baer
This sounds like exactly the same issue I had when going from 1.3 to 1.4 ... it sounds like DIH is trying to automagically figure out the columns :-\ - Jon On Nov 25, 2008, at 6:37 AM, Joel Karlsson wrote: Hello, I get an Unknown field error when I'm indexing an Oracle DB. I've reduced the number of

Stuck threads on Weblogic

2008-11-25 Thread Alexander Ramos Jardim
Hello guys, I am getting some stuck threads in my application when it connects to Solr. The stuck threads occur at regular intervals: after every 3 days that the app is online, it hangs the entire cluster. I don't know if there's any direct relation to Solr, but I get the following exception

Spellcheck for phrase queries

2008-11-25 Thread Manepalli, Kalyan
Hi, I am trying to implement spell check functionality on a particular field. I need to do a complete phrase spell check when the user enters multiple words. For example: if the user enters "great Hyat", the current implementation would suggest "great Hyatt", just correcting the word "hyat

Re: Using Solr for indexing emails

2008-11-25 Thread Timo Sirainen
On Tue, 2008-11-25 at 20:45 +0530, Shalin Shekhar Mangar wrote: > On Mon, Nov 24, 2008 at 11:51 PM, Timo Sirainen <[EMAIL PROTECTED]> wrote: > > > > > DIH seems to be about Solr pulling data into it from an external source. > > That's not really practical with Dovecot since there's no central > >

Re: Keyword extraction

2008-11-25 Thread Ryan McKinley
Lots of approaches out there... The easiest "off the shelf" method would be to use the MoreLikeThisHandler and get the top "interesting" terms: http://wiki.apache.org/solr/MoreLikeThisHandler ryan On Nov 25, 2008, at 2:09 PM, Plaatje, Patrick wrote: Hi all, Struggling with a question I r

Keyword extraction

2008-11-25 Thread Plaatje, Patrick
Hi all, Struggling with a question I recently got from a colleague: is it possible to extract keywords from indexed content? In my opinion it should be possible to find out for which words the indexed content ranks highest (Lucene or Solr), but I have no clue where to begin. Anyone havi

newbie question on SOLR distributed searches with many "shards"

2008-11-25 Thread Gerald De Conto
I wasn't able to find examples/anything via google so thought I'd ask: Say I want to implement a solution using distributed searches with many "shards" in SOLR 1.3.0. Also, say there are too many shards to pass in via the URL (dozens, hundreds, whatever) Is there a way to specify in solrcon

Re: [VOTE] Community Logo Preferences

2008-11-25 Thread Thomas Dowling
https://issues.apache.org/jira/secure/attachment/12394266/apache_solr_b_red.jpg https://issues.apache.org/jira/secure/attachment/12394314/apache_soir_001.jpg https://issues.apache.org/jira/secure/attachment/12394264/apache_solr_a_red.jpg

Re: [VOTE] Community Logo Preferences

2008-11-25 Thread Brendan Grainger
https://issues.apache.org/jira/secure/attachment/12394264/apache_solr_a_red.jpg https://issues.apache.org/jira/secure/attachment/12394282/solr2_maho_impression.png https://issues.apache.org/jira/secure/attachment/12394266/apache_solr_b_red.jpg https://issues.apache.org/jira/secure/attachment/12394

Re: [VOTE] Community Logo Preferences

2008-11-25 Thread Chris Harris
https://issues.apache.org/jira/secure/attachment/12394282/solr2_maho_impression.png https://issues.apache.org/jira/secure/attachment/12394475/solr2_maho-vote.png

Re: CachedSqlEntityProcessor's purpose

2008-11-25 Thread Amit Nithian
Thanks for the responses. Few follow-ups: 1) It seems that the CachedSQLEntityProcessor performs the where clause in memory on the cache. Is this cache an in memory RDBMS or maps? 2) In the example, there were two use cases, one that is like query="select * from Y where xid=${X.ID}" and another whe

Re: matching exact terms

2008-11-25 Thread Ryan McKinley
On Nov 25, 2008, at 11:40 AM, Brian Whitman wrote: This is probably severe user error, but I am curious about how to index docs to make this query work: happy birthday to return the doc with n_name:"Happy Birthday" before the doc with n_name:"Happy Birthday, Happy Birthday" . As it is now, t

Re: Sorting and JVM heap size ....

2008-11-25 Thread Shalin Shekhar Mangar
On Tue, Nov 25, 2008 at 9:37 PM, souravm <[EMAIL PROTECTED]> wrote: > > Could you please explain a bit more on how the new searcher can double the > memory ? > Take a look at slide 13 of Yonik's presentation available at http://people.apache.org/~yonik/ApacheConEU2006/Solr.ppt Each searcher in S

matching exact terms

2008-11-25 Thread Brian Whitman
This is probably severe user error, but I am curious about how to index docs to make this query work: happy birthday to return the doc with n_name:"Happy Birthday" before the doc with n_name:"Happy Birthday, Happy Birthday" . As it is now, the latter appears first for a query of n_name:"happy birt

RE: Sorting and JVM heap size ....

2008-11-25 Thread souravm
Hi Shalin, Thanks for the clarifications. Could you please explain a bit more on how the new searcher can double the memory? Based on your explanation, when a new set of documents gets committed a new searcher is created. So what I understand is whenever an update/delete query and search quer

Re: CachedSqlEntityProcessor's purpose

2008-11-25 Thread Noble Paul നോബിള്‍ नोब्ळ्
Every row emitted by an outer entity results in a new SQL query in the inner entity (yes, 50 queries on the inner entity). So, if you wish to join multiple tables, then nested entities are the way to go. CachedSqlEntityProcessor is meant to help you reduce the number of queries fired on sub-entities.
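The difference described above can be sketched in Python with an in-memory SQLite database. This is not the Data Import Handler's code, only an illustration of the idea under stated assumptions: tables x and y stand in for the thread's X and Y, the 50 outer rows mirror the "50 queries" mentioned in the reply, and the function names are hypothetical. The cache is a plain dictionary, in line with the answer elsewhere in this thread that the cache is an in-memory hashmap.

```python
import sqlite3
from collections import defaultdict


def make_db():
    """Create a toy database: 50 outer rows in x, two inner rows each in y."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE x (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE y (xid INTEGER, tag TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO x VALUES (?, ?)", [(i, f"item{i}") for i in range(50)]
    )
    conn.executemany(
        "INSERT INTO y VALUES (?, ?)",
        [(i, f"tag{i}{suffix}") for i in range(50) for suffix in "ab"],
    )
    conn.commit()
    return conn


def import_nested(conn):
    """Nested entities without a cache: one inner query per outer row."""
    inner_queries = 0
    docs = []
    outer = conn.execute("SELECT id, name FROM x ORDER BY id").fetchall()
    for xid, name in outer:
        rows = conn.execute(
            "SELECT tag FROM y WHERE xid = ? ORDER BY rowid", (xid,)
        ).fetchall()
        inner_queries += 1
        docs.append({"id": xid, "name": name, "tags": [t for (t,) in rows]})
    return docs, inner_queries


def import_cached(conn):
    """Cached variant: load the inner table once, then look rows up in a dict."""
    cache = defaultdict(list)
    for xid, tag in conn.execute("SELECT xid, tag FROM y ORDER BY rowid"):
        cache[xid].append(tag)
    inner_queries = 1  # the single query that filled the cache
    docs = []
    outer = conn.execute("SELECT id, name FROM x ORDER BY id").fetchall()
    for xid, name in outer:
        # The "where xid = ..." clause becomes an in-memory dictionary lookup.
        docs.append({"id": xid, "name": name, "tags": list(cache.get(xid, []))})
    return docs, inner_queries


conn = make_db()
nested_docs, nested_count = import_nested(conn)
cached_docs, cached_count = import_cached(conn)
print(nested_count, cached_count, nested_docs == cached_docs)
```

The trade-off is memory: the cached variant holds the whole inner table in the process, which is only reasonable when that table fits comfortably in the heap.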

Re: port of Nutch CommonGrams to Solr for help with slow phrase queries

2008-11-25 Thread Shalin Shekhar Mangar
Hi Tom, I don't think anybody has worked on adding this to Solr yet. Do you mind opening a jira issue? On Tue, Nov 25, 2008 at 12:01 AM, Burton-West, Tom <[EMAIL PROTECTED]>wrote: > Hello all, > > We are having problems with extremely slow phrase queries when the > phrase query contains a common

Re: Using Solr for indexing emails

2008-11-25 Thread Shalin Shekhar Mangar
On Mon, Nov 24, 2008 at 11:51 PM, Timo Sirainen <[EMAIL PROTECTED]> wrote: > > DIH seems to be about Solr pulling data into it from an external source. > That's not really practical with Dovecot since there's no central > repository of any kind of data, so there's no way to know what has > changed

Re: CachedSqlEntityProcessor's purpose

2008-11-25 Thread Shalin Shekhar Mangar
On Tue, Nov 25, 2008 at 1:52 PM, Amit Nithian <[EMAIL PROTECTED]> wrote: > > I like the concept of having multiple entity blocks for clarity but why > wouldn't I have (for DB efficiency), the following as one entity's SQL > statement "select * from X,Y where x.id=y.xid" and have two fields > point

Re: Analyzing CSV phrase fields

2008-11-25 Thread Yonik Seeley
The easiest solution would be to create the documents you send to solr with multiple keywords fields... they will be separated by a positionIncrement so a phrase query won't see yankees adjacent to cleveland. If you can't do that, then perhaps patch PatternTokenizer filter to put a larger position
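To make the position-increment idea concrete, here is a small Python model of how token positions could be assigned across several values of one field. It is a simplified sketch, not Lucene's implementation: the whitespace tokenizer, the function names, and the sample values are assumptions chosen for illustration (the thread itself only mentions "yankees" and "cleveland").

```python
def index_positions(values, gap):
    """Assign a position to every token across multiple field values.

    'gap' is the extra position increment added between consecutive values.
    """
    positions = {}
    pos = 0
    for i, value in enumerate(values):
        if i > 0:
            pos += gap  # leave a hole between one value and the next
        for token in value.lower().split():
            positions.setdefault(token, []).append(pos)
            pos += 1
    return positions


def phrase_match(positions, phrase):
    """Return True if the phrase's tokens occur at consecutive positions."""
    tokens = phrase.lower().split()
    if not tokens:
        return False
    for start in positions.get(tokens[0], []):
        if all(start + i in positions.get(tok, []) for i, tok in enumerate(tokens)):
            return True
    return False


# Hypothetical keyword values, each sent as a separate field value.
keywords = ["new york yankees", "cleveland indians"]

no_gap = index_positions(keywords, gap=0)
with_gap = index_positions(keywords, gap=100)

print(phrase_match(no_gap, "yankees cleveland"))
print(phrase_match(with_gap, "yankees cleveland"))
print(phrase_match(with_gap, "new york"))
```

With no gap, the last token of one value and the first token of the next sit at adjacent positions, so a phrase query can match across the boundary; with a gap, phrases still match inside a value but not across values.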

Re: Sorting and JVM heap size ....

2008-11-25 Thread Shalin Shekhar Mangar
On Tue, Nov 25, 2008 at 7:49 AM, souravm <[EMAIL PROTECTED]> wrote: > > 3. Another case is - if there are 2 search requests concurrently hitting > the server, each with sorting on the same 20 character date field, then also > it would need 2x2GB memory. So if I know that I need to support at least

Re: [VOTE] Community Logo Preferences

2008-11-25 Thread Marcus Stratmann
https://issues.apache.org/jira/secure/attachment/12394282/solr2_maho_impression.png https://issues.apache.org/jira/secure/attachment/12394376/solr_sp.png https://issues.apache.org/jira/secure/attachment/12393936/logo_remake.jpg https://issues.apache.org/jira/secure/attachment/12394264/apache_solr_

Re: [VOTE] Community Logo Preferences

2008-11-25 Thread Shalin Shekhar Mangar
https://issues.apache.org/jira/secure/attachment/12394266/apache_solr_b_red.jpg https://issues.apache.org/jira/secure/attachment/12394070/sslogo-solr-finder2.0.png https://issues.apache.org/jira/secure/attachment/12394282/solr2_maho_impression.png https://issues.apache.org/jira/secure/attachment/12

Re: Schema Design Guidance

2008-11-25 Thread Shalin Shekhar Mangar
Even if you go for the 400,000 documents way, the size of data and number of unique tokens would remain the same. With your data size, you should think about sharding and distributed search. Is the availability of a product a boolean value or the number of items? To make sure that you don't need to

Re: Unknown field error using JDBC

2008-11-25 Thread Joel Karlsson
I actually don't know which version I was using, but now I've upgraded to 1.3 and it works like a charm!! Thanks a lot! 2008/11/25 Noble Paul നോബിള്‍ नोब्ळ् <[EMAIL PROTECTED]> > which version of DIH are you using? > > On Tue, Nov 25, 2008 at 5:24 PM, Joel Karlsson <[EMAIL PROTECTED]> > wrote: >

Re: solr internationalization support

2008-11-25 Thread Shalin Shekhar Mangar
On Mon, Nov 24, 2008 at 7:56 PM, rameshgalla <[EMAIL PROTECTED]>wrote: > > 1)Which languages solr supports out-of-the box other than english? Solr does not know about any languages. It will apply whatever analyzers you specify in the schema.xml for that field type. > 2)What are the analyzers(s

Re: Unknown field error using JDBC

2008-11-25 Thread Noble Paul നോബിള്‍ नोब्ळ्
Which version of DIH are you using? On Tue, Nov 25, 2008 at 5:24 PM, Joel Karlsson <[EMAIL PROTECTED]> wrote: > Hello, > > I get an Unknown field error when I'm indexing an Oracle DB. I've reduced the > number of fields/columns in order to troubleshoot. If I change the uniqueKey > to timestamp (for ex

Re: Using Solr for indexing emails

2008-11-25 Thread Norberto Meijome
On Tue, 25 Nov 2008 03:59:31 +0200 Timo Sirainen <[EMAIL PROTECTED]> wrote: > > would it be faster to say q=user: AND highestuid:[ * TO *] ? > > Now that I read again what fq really did, yes, sounds like you're right. you may want to compare them both to see which one is better... I just went

Unknown field error using JDBC

2008-11-25 Thread Joel Karlsson
Hello, I get an Unknown field error when I'm indexing an Oracle DB. I've reduced the number of fields/columns in order to troubleshoot. If I change the uniqueKey to timestamp (for example) and create a dynamic field, the indexing works fine, except that the id-field is empty. --data-config.xml

CachedSqlEntityProcessor's purpose

2008-11-25 Thread Amit Nithian
I am starting to look at Solr's Data Import Handler framework and am quite impressed with it so far. My question is in trying to reduce the number of SQL queries issued to the database and saw this entity processor. In the following example: I like the concept of having multiple entit