Re: Lock timed out 2 worker running

2009-07-10 Thread Renz Daluz
Is the best approach here to implement my own locking mechanism? Thanks /Renz 2009/7/10 Renz Daluz > Hi all, > I have 2 workers running (an app that builds the index) and both are > pointing to the same "Solr" (1.3.0) master instance when updating/committing > documents. I'm using SolrJ to save the doc

Re: DisMax query parser syntax for the fq parameter

2009-07-10 Thread gistolero
Yes, it works :-) Thanks Erik! > > I am using the dismax query parser syntax for the fq param: > > > > .../select?qt=dismax&rows=30&q.alt=*:*&qf=content&fq={!dismax > > qf=contentKeyword^1.0 mm=0%}Foo&fq=+date:[2009-03-11T00:00:00Z TO > > 2009-07-09T16:41:50Z]&fl=id,date,content > > > > > > N

Re: Using curl comparing with using WebService::Solr

2009-07-10 Thread Shalin Shekhar Mangar
On Fri, Jul 10, 2009 at 11:50 AM, Francis Yakin wrote: > I guess I also commit too often: we have 1000 folders, so each loop > executes the load and a commit. > So 1000 loops with 1000 commits. I think it will help if I only commit > once after the 1000 loops have completed. > > Any inputs?

RE: Using curl comparing with using WebService::Solr

2009-07-10 Thread Francis Yakin
How are you batching all documents in one curl call? Do you have a sample, so I can modify my script and try it again? Right now I run curl on each document (I have 1000 docs in each folder and I have 1000 folders) using: curl http://localhost:7001/solr/update --data-binary @abc.xml -H 'Content-

Re: posting binary file and metadata in two separate documents

2009-07-10 Thread rossputin
Hi. Apologies for bumping this one, but another question occurred to me... is there a limit to the number of &ext.literal components I can put in my curl command... if so, i will definitely need to find another way to get this data in, as I am building up relationships between documents, and ther

Re: Problem using ExtractingRequestHandler with tomcat

2009-07-10 Thread solenweg
I'm in the same situation, but I don't understand what this ant example is about. I can't find anything in Solr about it. Could anyone describe more specifically what one has to do to get rid of the Error loading class 'org.apache.solr.handler.extraction.ExtractingRequestHandler' exception?

Re: Using curl comparing with using WebService::Solr

2009-07-10 Thread Shalin Shekhar Mangar
On Fri, Jul 10, 2009 at 1:17 PM, Francis Yakin wrote: > How are you batching all documents in one curl call? Do you have a sample, so I > can modify my script and try it again? > > Right now I run curl on each document (I have 1000 docs in each folder and > I have 1000 folders) using: > > curl http
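
The batching question above can be illustrated with a minimal sketch. It is not taken from the thread's reply (which is cut off here); it only assumes that each file holds a standard Solr XML <add> message, and the helper name merge_add_payloads is hypothetical. It merges the <doc> elements of several <add> messages into one, so a folder can be posted in a single request and followed by a single <commit/>.

```python
import xml.etree.ElementTree as ET

# Posted once to /update after all batches, instead of once per document.
COMMIT_PAYLOAD = "<commit/>"


def merge_add_payloads(xml_strings):
    """Merge several Solr <add> XML messages into a single <add> message.

    Hypothetical helper for illustration: each input string is assumed to be
    a complete <add>...</add> document containing one or more <doc> elements.
    """
    merged = ET.Element("add")
    for xml_text in xml_strings:
        root = ET.fromstring(xml_text)
        if root.tag != "add":
            raise ValueError("expected <add> root, got <%s>" % root.tag)
        # Move every <doc> from this message into the merged message.
        merged.extend(root.findall("doc"))
    return ET.tostring(merged, encoding="unicode")
```

The merged string could then be written to one file and sent with the same curl --data-binary call used for a single document, with COMMIT_PAYLOAD posted once at the end of the run.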

Re: Create incremental snapshot

2009-07-10 Thread Asif Rahman
Tushar: Is it necessary to do the optimize on each iteration? When you run an optimize, the entire index is rewritten. Thus each index file can have at most one hard link and each snapshot will consume the full amount of space on your disk. Asif On Thu, Jul 9, 2009 at 3:26 AM, tushar kapoor <

Modifying a stored field after analyzing it?

2009-07-10 Thread Michael _
Hello, I've got a stored, indexed field that contains some actual text, and some metainfo, like this: one two three four [METAINFO] oneprime twoprime threeprime fourprime I have written a Tokenizer that skips past the [METAINFO] marker and uses the last four words as the tokens for the field,

Re: Retrieve docs with > 1 multivalue field hits

2009-07-10 Thread Erik Hatcher
On Jul 9, 2009, at 5:37 PM, A. Steven Anderson wrote: A simple example would be if a schema included a phoneNum multiValued field and I wanted to return all docs that contained more than 1 phoneNum field value. all docs that contain more than one phone number - regardless of matching a pa

Re: Retrieve docs with > 1 multivalue field hits

2009-07-10 Thread A. Steven Anderson
> > all docs that contain more than one phone number - regardless of matching a > particular query? > Exactly. > knowing that was a useful query, i'd change my indexer to also provide > either a field with the count of phone number values, or a boolean field > saying whether there are more than
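
The suggestion quoted above (have the indexer also provide a count field) might look roughly like the sketch below. The function name and the phoneNumCount field are hypothetical; the document is modelled as a plain dict before it is sent to Solr.

```python
def with_phone_count(doc, source_field="phoneNum", count_field="phoneNumCount"):
    """Return a copy of doc with an extra field holding the number of values.

    Illustrative only: field names are assumptions, and doc is a plain dict
    whose multi-valued fields are lists.
    """
    values = doc.get(source_field, [])
    if isinstance(values, str):
        # A single value may arrive as a bare string rather than a list.
        values = [values]
    enriched = dict(doc)
    enriched[count_field] = len(values)
    return enriched
```

Assuming the count is indexed in a field type that supports range queries, a filter such as phoneNumCount:[2 TO *] would then select the docs with more than one phone number.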

Re: Modifying a stored field after analyzing it?

2009-07-10 Thread Shalin Shekhar Mangar
On Fri, Jul 10, 2009 at 5:56 PM, Michael _ wrote: > Hello, > I've got a stored, indexed field that contains some actual text, and some > metainfo, like this: > > one two three four [METAINFO] oneprime twoprime threeprime fourprime > > I have written a Tokenizer that skips past the [METAINFO] ma

Re: Modifying a stored field after analyzing it?

2009-07-10 Thread solrcoder
Shalin Shekhar Mangar wrote: > > Can't you have two fields like this? > > f1 (indexed, not stored) -> one two three four [METAINFO] oneprime > twoprime > threeprime fourprime > f2 (not indexed, stored) -> one two three four > Perhaps I don't understand highlighting, but won't that prevent sni
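
A minimal sketch of the two-field split quoted above, done on the client before the document is sent. The names f1 and f2 follow the quoted suggestion; the function name is hypothetical, and the sketch does not address the highlighting concern raised in the reply.

```python
MARKER = "[METAINFO]"


def split_fields(raw):
    """Build the two field values from one raw string.

    f1 keeps the full text (to be indexed, not stored); f2 keeps only the
    text before the marker (to be stored, not indexed).
    """
    # partition returns the whole string as the first part when the marker
    # is absent, so f2 falls back to the full text in that case.
    visible, _, _ = raw.partition(MARKER)
    return {"f1": raw, "f2": visible.strip()}
```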

Re: Modifying a stored field after analyzing it?

2009-07-10 Thread Mark Miller
> > Is there some clever way that I'm missing to build my token stream outside > of Solr, and store just the original text and index my token stream? > > Coming soon. First step was here: http://issues.apache.org/jira/browse/LUCENE-1699 Trunk doesn't have that version of Lucene yet though (I believ

Re: Modifying a stored field after analyzing it?

2009-07-10 Thread solrcoder
markrmiller wrote: > > Coming soon. First step was here: > http://issues.apache.org/jira/browse/LUCENE-1699 > Trunk doesn't have that version of Lucene yet though (I believe thats > still > the case). > > Replacing the RunUpdateProcessor give you full control of the Lucene > document creation.

Re: Modifying a stored field after analyzing it?

2009-07-10 Thread Mark Miller
On Fri, Jul 10, 2009 at 2:02 PM, solrcoder wrote: > > > markrmiller wrote: > > > > Coming soon. First step was here: > > http://issues.apache.org/jira/browse/LUCENE-1699 > > Trunk doesn't have that version of Lucene yet though (I believe thats > > still > > the case). > > > > Replacing the RunUpd

printing scores

2009-07-10 Thread Marc Sturlese
I have noticed a weird behaviour doing score testing. I do a search using the dismax request handler with no extra boosting in an index of a million docs, searching in five fields. Printing the score of the 3rd, 4th, 5th and 6th docs I can see that it is the same. If I build the index with my own lucene indexer a

Re: printing scores

2009-07-10 Thread Erick Erickson
Why do you care? I'm not being too much of a jerk here, because scores between separate queries are irrelevant. See: http://wiki.apache.org/lucene-java/ScoresAsPercentages So, the scores aren't important, the important thing is whether the doc

Index-time boost propagated to copyField?

2009-07-10 Thread Mat Brown
Hi all, If I have two fields that are copied into a copyField, and I index data in these fields using different index-time boosts, are those boosts propagated into the copyField? Thanks! Mat

Re: Distributed Search in Solr

2009-07-10 Thread Grant Ingersoll
On Jul 9, 2009, at 11:58 PM, Sumit Aggarwal wrote: Hi, 1. Calls made to multiple shards are made in some concurrent fashion or serially? Concurrent 2. Any idea of algorithm followed for merging data? I mean how efficient it is? Not sure, but given that Yonik implemented it, I suspect

Re: printing scores

2009-07-10 Thread Marc Sturlese
Well I was asking it because I have a custom FieldComparatorSource that uses lucene score among other params to calculate the sorting. The thing is that with my own lucene servlet I am getting different results than using solr now (because score values are different and Solr is giving me back the

Re: Modifying a stored field after analyzing it?

2009-07-10 Thread solrcoder
markrmiller wrote: > > When you specify a custom UpdateProcessor chain, you will normally make > the > RunUpdateProcessor the last processor in the chain, as it will add the doc > to Solr. > Rather than using the built in RunUpdateProcessor though, you could simply > specify your own UpdateProce

Re: Boosting for most recent documents

2009-07-10 Thread vivek sar
Thanks Bill. Couple of questions, 1) Would the function query load all unique terms (for that field) in memory the way sort (field cache) does? If so, that wouldn't work for us as we can have over 5 billion records spread across multiple shards (up to 10 indexer instances), that would surely kill

Question About Solr Cores

2009-07-10 Thread danben
Hi, I'm building an application that dynamically instantiates a large number of solr cores on a single machine (large would ideally be as high as I can get it, in the millions, if it is possible to do so without significant performance degradation and/or system failure). I already tried this sam

Re: Aggregating/Grouping Document Search Results on a Field

2009-07-10 Thread Bradford Stephens
Does the facet aggregation take place on the Solr search server, or the Solr client? It's pretty slow for me -- on a machine with 8 cores/ 8 GB RAM, 50 million document index (about 36M unique values in the "author" field), a query that returns 131,000 hits takes about 20 seconds to calculate the

Re: Custom sort

2009-07-10 Thread dontthinktwice
Marc Sturlese wrote: > > I have been able to create my custom field. The problem is that I have > loaded in the solr core a couple of HashMaps > from a DB with values that will influence the sort. My problem is that > I don't know how to let my custom sort have access to these HashMaps. > I a

Solr 1.2 and 1.3 - different Stamming

2009-07-10 Thread Jae Joo
I have found that the stemming in Solr 1.2 and 1.3 is different for "communication". We have an index built in Solr 1.2, and it is being queried by 1.3. Is there any way to adjust it? Jae Joo

Update Preprocessing

2009-07-10 Thread jonarino
I am investigating the possibilities of preprocessing my data before it is indexed. Specifically, I would like to add fields or modify field values based on other fields in the XML I am injecting. I am a little confused on where this is supposed to happen; whether as part of the UpdateRequestProce

Re: Update Preprocessing

2009-07-10 Thread Mark Miller
On Fri, Jul 10, 2009 at 6:40 PM, jonarino wrote: > > I am investigating the possibilities of preprocessing my data before it is > indexed. Specifically, I would like to add fields or modify field values > based on other fields in the XML I am injecting. > I am a little confused on where this is s

Re: Solr 1.2 and 1.3 - different Stamming

2009-07-10 Thread Mark Miller
Sorry. From the CHANGES for 1.3: {quote} The Porter snowball based stemmers in Lucene were updated (LUCENE-1142), and are not guaranteed to be backward compatible at the index level (the stem of certain words may have changed). Re-indexing is recommended. {/quote} Would have been nice to leave a

Re: Aggregating/Grouping Document Search Results on a Field

2009-07-10 Thread Avlesh Singh
> > Does the facet aggregation take place on the Solr search server, or the > Solr client? > Solr server. Faceting is an expensive operation by nature, especially when the hits are large in number. Solr caches these values once computed. You might want to tweak cache related parameters in your sol

Re: Modifying a stored field after analyzing it?

2009-07-10 Thread Mark Miller
On Fri, Jul 10, 2009 at 3:42 PM, solrcoder wrote: > > > markrmiller wrote: > > > > When you specify a custom UpdateProcessor chain, you will normally make > > the > > RunUpdateProcessor the last processor in the chain, as it will add the > doc > > to Solr. > > Rather than using the built in RunUp

How to do a "reverse distance" search?

2009-07-10 Thread Development Team
Hi everybody, Let's say we have 10,000 traveling sales-people spread throughout the country. Each of them has their own territory, and most of the territories overlap (e.g. 100 sales-people in a particular city alone). Each of them also has a maximum distance they can travel. Some can trave

Re: Search results depending on search word length?

2009-07-10 Thread Jeff Newburn
I am guessing that the field is actually just a string or a really long word. Solr looks for occurrences of the term/token. It does not however search within a given token without the *. So in your example the system will not match thisisavery with thisisaverylongtesttitle even though they have

Using DirectConnection or EmbeddedSolrServer, within a component

2009-07-10 Thread Matt Mitchell
Hi, I'm experimenting with Solr components. I'd like to be able to use a nice-high-level querying interface like the DirectSolrConnection or EmbeddedSolrServer provides. Would it be considered absolutely insane to use one of those *within a component* (using the same core instance)? Matt

Re: Custom sort

2009-07-10 Thread Ben
It could be that you should be providing an implementation of "SortComparatorSource" I have missed the earlier part of this thread, I assume you're trying to implement some form of custom search? B dontthinktwice wrote: Marc Sturlese wrote: I have been able to create my custom field. The

RE: How to do a "reverse distance" search?

2009-07-10 Thread Stuart Yeates
The easiest modification is to use: calc_square_of_distance(CLIENT_LAT, CLIENT_LONG, lat, long) < maxSquareOfTravelDist This has the same ordering as before, but is much cheaper to calculate. You can then calculate the actual distance in the GUI, where you're only showing a handful of values.
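
The squared-distance comparison described above can be sketched as follows. This assumes planar coordinates in the same unit as the travel distance (a simplification; real latitude/longitude would need a projection or a great-circle formula), and the names are hypothetical stand-ins for calc_square_of_distance and maxSquareOfTravelDist.

```python
def squared_distance(x1, y1, x2, y2):
    """Squared Euclidean distance; avoids the square root entirely."""
    dx = x2 - x1
    dy = y2 - y1
    return dx * dx + dy * dy


def can_reach(client, person):
    """True if the client lies strictly within the person's travel radius.

    client is an (x, y) tuple; person is a dict with "x", "y" and "max_dist".
    Comparing squared values gives the same result as comparing distances,
    because both sides are non-negative.
    """
    limit = person["max_dist"] ** 2  # could be precomputed and stored per person
    return squared_distance(client[0], client[1], person["x"], person["y"]) < limit
```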

Re: Custom sort

2009-07-10 Thread dontthinktwice
okobloko wrote: > > It could be that you should be providing an implementation of > "SortComparatorSource" > I have missed the earlier part of this thread, I assume you're trying to > implement some form of custom search? > > B > Yes, exactly. What I'm trying to do is sort the results of a

Re: SEVERE: java.lang.ArrayIndexOutOfBoundsException

2009-07-10 Thread Chris Hostetter
: :( : : that is all we have in there!!! : : Is there any way I can raise the logging level for it? it's not an issue of "logging level" -- that just affects which types of messages get logged, this message is getting logged so the level is fine. The problem is the log formatting. if this is

Re: I am getting HTTP Version Not Supported (505)Error

2009-07-10 Thread Chris Hostetter
: data). I gave the prepared URL to the URL class and I got the HTTP Version Not : Supported and the error code is 505. Solr never generates that error code. what servlet container are you using? : String urlStr = solrUrl + "/update?stream.body=" + strToAdd; : System.out.println("...

Re: Index-time boost propagated to copyField?

2009-07-10 Thread Koji Sekiguchi
Mat Brown wrote: Hi all, If I have two fields that are copied into a copyField, and I index data in these fields using different index-time boosts, are those boosts propagated into the copyField? Thanks! Mat No, but the norms of source fields of copyField are "propagated" into the destinat

solr jmx connection

2009-07-10 Thread J G
Hello, I have a SOLR JMX connection issue. I am running my JMX MBeanServer through Tomcat, meaning I am using Tomcat's MBeanServer rather than any other MBeanServer implementation. I am having a hard time trying to figure out the correct JMX Service URL on my localhost for accessing the SO