Re: Auto-suggest internal terms

2010-06-03 Thread Michael Kuhlmann
The only solution without "doing any custom work" would be to perform a normal query for each suggestion. But you might run into performance trouble with that, because suggestions are typically requested much more often than complete searches. The much faster solution, which requires some custom work, would be

Re: Importing large datasets

2010-06-03 Thread Erik Hatcher
Frankly, if you can create a script that'll turn your data into valid CSV, that might be the easiest, quickest way to ingest your data. Pragmatic, at least. Avoids the complexity of DIH, allows you to script the export from your DB in the most efficient manner you can, and so on. Solr's

Re: Using solrJ to get all fields in a particular schema/index

2010-06-03 Thread Rakhi Khatwani
Hi, Oh okay. Thanks a ton On Tue, May 25, 2010 at 11:15 PM, Chris Hostetter wrote: > > :Is there any way to get all the fields (irrespective of > whether > : it contains a value or null) in solrDocument. > > no. a document only has "Field" instances for the fields which it has >

Re: Importing large datasets

2010-06-03 Thread Grant Ingersoll
On Jun 2, 2010, at 10:30 PM, Blargy wrote: > Whats more efficient a batch size of 1000 or -1 for MySQL? Is this why its > so slow because I am using 2 different datasources? > By batch size, I meant the number of docs sent from the client to Solr. MySQL Batch Size is broken. The only thing th

Re: Error loading class 'solr.HTMLStripStandardTokenizerFactory'

2010-06-03 Thread Koji Sekiguchi
(10/06/03 13:46), Terance Dias wrote: Hi, I'm trying to use the field collapsing feature. For that I need to take a checkout of the trunk and apply the patch available at https://issues.apache.org/jira/browse/SOLR-236 When I take a checkout and run the example-DIH, I get following error in brows

Re: Highlighting arbitrary text without really indexing it

2010-06-03 Thread Binesh Gummadi
Hi, The only related topic I found is the following link, and I played with FieldAnalysisRequestHandler. As it expects the field value as an input, this is not really useful. I DO NOT want to q

Re: Auto-suggest internal terms

2010-06-03 Thread Andrzej Bialecki
On 2010-06-03 09:56, Michael Kuhlmann wrote: > The only solution without "doing any custom work" would be to perform a > normal query for each suggestion. But you might get into performance > troubles with that, because suggestions are typically performed much > more often than complete searches.

Deleted documents appearing in facet fields

2010-06-03 Thread Rakhi Khatwani
Hi, I have a schema with id as one of the fields. I index some documents (by adding and deleting some documents). When I perform faceting on all documents (q=*:*) with facet.field=id, I even get those ids for which the document is deleted. For example: (025_null,026_null are deleted

Re: Deleted documents appearing in facet fields

2010-06-03 Thread Ahmet Arslan
> Hi, >        I have a schema with id as > one of the fields. i index some documents > (by adding and deleting some documents). > when i perform faceting on all documents(q=*:*) with > facet.field=id, i even > get those id's for which the document is deleted > > for example: (025_null,026

Re: Auto-suggest internal terms

2010-06-03 Thread Michael Kuhlmann
On 03.06.2010 13:02, Andrzej Bialecki wrote: > ..., and deploy this > index in a separate JVM (to benefit from other CPUs than the one that > runs your Solr core) Every known webserver is multithreaded by default, so putting different Solr instances into different JVMs will be of no use. -Mich

Re: Deleted documents appearing in facet fields

2010-06-03 Thread Rakhi Khatwani
Hi Ahmet, but i use solrj to commit documents. and there is no commit method which allows you to mention expungeDeletes. BTW i am using solr 1.4 Regards, Raakhi On Thu, Jun 3, 2010 at 5:03 PM, Ahmet Arslan wrote: > > Hi, > >I have a schema with id as > > one of the field

Re: Deleted documents appearing in facet fields

2010-06-03 Thread Ahmet Arslan
> Hi Ahmet, >               but i use > solrj to commit documents. and there is no commit > method which allows you to mention expungeDeletes. > Alternatively you can do it with SolrQuery. final SolrQuery query = new SolrQuery(); query.set("qt", "/update"); query.set("commit", true); query.set("
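The request described above can also be read as a plain HTTP call to the update handler. What follows is a minimal sketch, using only the JDK, of the URL such a request would amount to. The snippet above is cut off after `query.set("`, so treating the third parameter as `expungeDeletes=true` is an assumption based on the preceding message in this thread; the class name, method names, and base URL are hypothetical and not part of SolrJ.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of SolrJ): builds the URL of a commit
// request that also asks Solr to expunge deleted documents.
class ExpungeDeletesUrl {

    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    static String buildCommitUrl(String solrBaseUrl) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("commit", "true");
        params.put("expungeDeletes", "true"); // assumed: the truncated parameter

        StringBuilder url = new StringBuilder(solrBaseUrl).append("/update");
        char separator = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            url.append(separator)
               .append(encode(e.getKey()))
               .append('=')
               .append(encode(e.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Base URL is a placeholder for a local example instance.
        System.out.println(buildCommitUrl("http://localhost:8983/solr"));
    }
}
```

Issuing the same parameters through SolrQuery, as the message suggests, should be equivalent as far as the update handler is concerned; the sketch only makes the parameters visible.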

Highlighting Issue

2010-06-03 Thread homerlex
We have Highlighting enabled. We specify that we only want highlighting on the "body" property. When doing a query like this: (body:tester OR project_id:704) the text highlighted in the body includes any text that is "tester" and any text that is "704". Is there a way to prevent the highlightin

Re: TikaEntityProcessor not working?

2010-06-03 Thread David George
Which version of Tika do you have? There was a problem introduced somewhere between Tika 0.6 and Tika 0.7 whereby the TikaConfig method config.getParsers() returned an empty parser list due to class loader scope issues with Solr running under an application server. There is a fix in the Tika 0

Different mergeFactor for master and slaves

2010-06-03 Thread Kris Jack
Hi everyone, I have set up a master-slave configuration where the master machine will be used primarily for indexing while the slave machines will be used for searching. I have just discovered that changing the mergeFactor allows you to affect the indexing/searching performance tradeoff. Is it

Indexing stops after exception

2010-06-03 Thread David George
I have a list of files in a database that I am indexing (it is a liferay database and the file lists are attachments). I'm encountering the following error https://issues.apache.org/jira/browse/PDFBOX-709 on one of the PDF documents and this causes indexing to stop (the TikaEntityProcessor) thro

Re: Highlighting Issue

2010-06-03 Thread Ahmet Arslan
> We have Highlighting enabled.  We specify that we only > want highlighting on > the "body" property. > > When doing a query like this: (body:tester OR > project_id:704) the text > highlighted in the body includes any text that is "tester" > and any text that > is "704". > > Is there a way to pr

Re: Deleted documents appearing in facet fields

2010-06-03 Thread Rakhi Khatwani
Thank you so much :) On Thu, Jun 3, 2010 at 6:47 PM, Ahmet Arslan wrote: > > Hi Ahmet, > > but i use > > solrj to commit documents. and there is no commit > > method which allows you to mention expungeDeletes. > > > > Alternatively you can do it with SolrQuery. > > final SolrQuery q

Re: Deleted documents appearing in facet fields

2010-06-03 Thread Erik Hatcher
Ahmet - this would be a good one to genericize on the SolrJ wiki. It seems there is a lot of confusion about SolrJ, where users of it often don't think they can make requests to Solr like you see URL examples of, but of course they can make requests to any request handler as you've shown.

index growing with updates

2010-06-03 Thread Nagelberg, Kallin
Hey, If I add a document to the index that already exists (same uniquekey) what is the expected behavior? I would imagine that if the document is the same then the index should not grow, but mine appears to be growing. Any ideas? Thanks, -Kallin Nagelberg

Re: Auto-suggest internal terms

2010-06-03 Thread Andrzej Bialecki
On 2010-06-03 13:38, Michael Kuhlmann wrote: > On 03.06.2010 13:02, Andrzej Bialecki wrote: >> ..., and deploy this >> index in a separate JVM (to benefit from other CPUs than the one that >> runs your Solr core) > > Every known webserver is multithreaded by default, so putting different > Solr

Re: Auto-suggest internal terms

2010-06-03 Thread Michael Kuhlmann
On 03.06.2010 16:45, Andrzej Bialecki wrote: > You are right to a certain degree. Still, there are some contention > points in Lucene/Solr, how threads are allocated on available CPU-s, and > how the heap is used, which can make a two-JVM setup perform much better > than a single-JVM setup given

Reading newSearcher and firstSearcher Queries from Separate File

2010-06-03 Thread Kris Jack
Hi everyone, I am adding queries to the newSearcher and firstSearcher listeners for warming solr up. Unfortunately, I have hundreds of queries and don't particularly want my solrconfig.xml file to become so big. Is there a way that I can point them to another file where these queries are writte

Re: Highlighting Issue

2010-06-03 Thread homerlex
iorixxx wrote: > > hl.requireFieldMatch=true > > http://wiki.apache.org/solr/HighlightingParameters#hl.requireFieldMatch > > I had tried this before but it did not work because in the query I did not specify body:. This does work fine if I put body:terms in the query. Is there a simpler s
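To make the combination discussed in this thread concrete, here is a small sketch that assembles the request parameters: highlighting restricted to the body field, with `hl.requireFieldMatch=true`, and a query whose term is explicitly qualified with `body:`. The parameter names are the standard Solr highlighting parameters; the class and method names are hypothetical, and the query string is the example from the original question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects the highlighting parameters discussed in
// the thread and renders them as a URL query string.
class FieldMatchHighlightParams {

    static Map<String, String> build(String query, String highlightField) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", query);
        params.put("hl", "true");
        params.put("hl.fl", highlightField);
        // Only highlight terms whose query clause targets the highlighted field.
        params.put("hl.requireFieldMatch", "true");
        return params;
    }

    static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = build("body:tester OR project_id:704", "body");
        System.out.println(toQueryString(params));
    }
}
```

As reported above, the field-match restriction only has an effect when the query clause names the field, which is why the example keeps the `body:` prefix.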

Custom faceting question

2010-06-03 Thread Blargy
I believe I'll need to write some custom code to accomplish what I want (efficiently that is) but I'm unsure of what would be the best route to take. Will this require a custom request handler? Search component? Ok the easiest way to explain is to show you what I want. http://shop.ebay.com/?_fro

Re: index growing with updates

2010-06-03 Thread Erick Erickson
Assuming your config is set up to replace unique keys, you're really doing a delete and an add (under the covers). It could very well be that the deleted version of the document is still in your index taking up space and will be until it is purged. HTH Erick On Thu, Jun 3, 2010 at 10:22 AM, Nagel

Re: Highlighting Issue

2010-06-03 Thread Erick Erickson
Have you looked at DisMaxRequestHandler? Best Erick On Thu, Jun 3, 2010 at 11:23 AM, homerlex wrote: > > > iorixxx wrote: > > > > hl.requireFieldMatch=true > > > > http://wiki.apache.org/solr/HighlightingParameters#hl.requireFieldMatch > > > > > > I had tried this before but it did not work bec

Re: DataImportHandler and running out of disk space

2010-06-03 Thread wojtekpia
https://issues.apache.org/jira/browse/SOLR-1939 SOLR-1939 created. -- View this message in context: http://lucene.472066.n3.nabble.com/DataImportHandler-and-running-out-of-disk-space-tp835125p868133.html Sent from the Solr - User mailing list archive at Nabble.com.

RE: index growing with updates

2010-06-03 Thread Nagelberg, Kallin
Is there a way to trigger a purge, or under what conditions does it occur? -Kallin Nagelberg -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Thursday, June 03, 2010 12:40 PM To: solr-user@lucene.apache.org Subject: Re: index growing with updates Assuming y

solr-lucene AND vs "+"

2010-06-03 Thread Jonathan Rochkind
Using solr-lucene query parser, is there a difference between using "AND" and using "+" in queries like this: 1) q= some_field:( one AND two AND "some phrase") 2) q= some_field:(+one +two +"some phrase") Are those always exactly identical in all respects, or are there any differences in

Re: index growing with updates

2010-06-03 Thread Chris Rode
As I understand it one of the easiest ways of ensuring a purge is to trigger an optimisation. This happens periodically depending on your configuration or you can trigger it manually. Regards Chris On 3 June 2010 18:36, Nagelberg, Kallin wrote: > Is there a way to trigger a purge, or under wha

general debugging techniques?

2010-06-03 Thread Jim Blomo
I am new to debugging Java services, so I'm wondering what the best practices are for debugging solr on tomcat. I'm running into a few issues while building up my index, using the ExtractingRequestHandler to format the data from my sources. I can read through the catalina log, but this seems to j

RE: general debugging techniques?

2010-06-03 Thread Nagelberg, Kallin
How much memory have you given tomcat? The default is 64M which is going to be really small for 5MB documents. -Original Message- From: jim.bl...@pbwiki.com [mailto:jim.bl...@pbwiki.com] On Behalf Of Jim Blomo Sent: Thursday, June 03, 2010 2:05 PM To: solr-user@lucene.apache.org Subject:

Re: general debugging techniques?

2010-06-03 Thread Jim Blomo
On Thu, Jun 3, 2010 at 11:17 AM, Nagelberg, Kallin wrote: > How much memory have you given tomcat? The default is 64M which is going to > be really small for 5MB documents. -Xmx128M - my understanding is that this bumps heap size to 128M. What is a reasonable size? Are there other memory flags

RE: general debugging techniques?

2010-06-03 Thread Nagelberg, Kallin
That is still really small for 5MB documents. I think the default solr document cache is 512 items, so you would need at least 3 GB of memory if you didn't change that and the cache filled up. Try disabling the document cache by removing the documentCache block from your solrconfig, or at least turn it do
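The "at least 3 GB" figure can be checked with simple arithmetic. The sketch below assumes, as the message does, a document cache of 512 entries and roughly 5 MB retained per cached document. It is a worst-case upper bound for the cache alone, and the class and method names are hypothetical.

```java
// Hypothetical back-of-the-envelope estimate of the heap a full document
// cache could hold, assuming every entry retains a whole document.
class DocumentCacheEstimate {

    static long worstCaseMegabytes(int cacheEntries, int megabytesPerDocument) {
        return (long) cacheEntries * megabytesPerDocument;
    }

    public static void main(String[] args) {
        long mb = worstCaseMegabytes(512, 5);
        // 512 entries * 5 MB = 2560 MB, i.e. 2.5 GB for the cache alone.
        System.out.println(mb + " MB = " + (mb / 1024.0) + " GB");
    }
}
```

Adding headroom for the rest of the heap on top of those 2.5 GB is consistent with the roughly 3 GB quoted in the message.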

Does SolrJ support nested annotated beans?

2010-06-03 Thread Peter Hanning
When modeling documents with a lot of fields (hundreds) the bean class used with SolrJ to interact with the Solr index tends to get really big and unwieldy. I was hoping that it would be possible to extract groups of properties into nested beans and move the @Field annotations along. Basically, I

Re: solr-lucene AND vs "+"

2010-06-03 Thread Ahmet Arslan
> Using solr-lucene query parser, is > there a difference between using "AND" and using "+" in > queries like this: > > 1)   q= some_field:( one AND two AND "some > phrase") > > 2)   q= some_field:(+one +two +"some > phrase") > > Are those always exactly identical in all respects, or are > ther

Re: Different mergeFactor for master and slaves

2010-06-03 Thread Jason Rutherglen
Kris, That wouldn't do anything because all merging occurs on the master. Jason On Thu, Jun 3, 2010 at 6:25 AM, Kris Jack wrote: > > Hi everyone, > > I have set up a master-slave configuration where the master machine will be > used primarily for indexing while the slave machines will be used f

Re: Different mergeFactor for master and slaves

2010-06-03 Thread Kris Jack
Thanks for the reply.

Re: Reading newSearcher and firstSearcher Queries from Separate File

2010-06-03 Thread Koji Sekiguchi
(10/06/04 0:16), Kris Jack wrote: Hi everyone, I am adding queries to the newSearcher and firstSearcher listeners for warming solr up. Unfortunately, I have hundreds of queries and don't particularly want my solrconfig.xml file to become so big. Is there a way that I can point them to another

Claus Daldorph Nielsen is out of the office.

2010-06-03 Thread Claus Daldorph Nielsen
I will be out of the office starting 04-06-2010 and will not return until 21-06-2010. I will respond to your message when I return.

exclude docs with null field

2010-06-03 Thread bluestar
Hi there, say my search query is "new york", and I am searching field1 and field2 for it. How do I specify that I want to exclude docs where field3 doesn't exist? Thanks
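One common way to express this in the standard Solr/Lucene query syntax is an open-ended range clause, `field3:[* TO *]`, which matches only documents that have some value in that field. The sketch below is not an answer taken from the thread; it just builds such a query string with the field names from the question. The class and method names are hypothetical, and escaping of special characters inside the phrase is deliberately left out.

```java
// Hypothetical helper: builds a query that searches two fields for a phrase
// and requires a third field to have a value.
class RequireFieldQuery {

    // field:"phrase" (no escaping of quotes inside the phrase in this sketch)
    static String phraseIn(String field, String phrase) {
        return field + ":\"" + phrase + "\"";
    }

    // Open-ended range: matches documents with any value in the field.
    static String hasValue(String field) {
        return field + ":[* TO *]";
    }

    static String build(String phrase) {
        return "(" + phraseIn("field1", phrase) + " OR " + phraseIn("field2", phrase) + ")"
                + " AND " + hasValue("field3");
    }

    public static void main(String[] args) {
        System.out.println(build("new york"));
    }
}
```

The `field3:[* TO *]` clause could presumably also be sent as a separate filter query (`fq`) rather than being ANDed into `q`, which keeps it out of relevance scoring.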