Index corruption due to permissions and autocommit

2009-04-30 Thread Jacob Singh
We rebooted a machine, and the permissions on the external drive where the index was stored had changed. We didn't realize it immediately, because searches were working and updates were not throwing errors back to the client. These ended up in catalina.out Apr 22, 2009 11:57:12 PM org.apache.sol

Re: NullPointException during creating snapshot

2009-04-30 Thread Noble Paul നോബിള്‍ नोब्ळ्
The snapshoot feature is not yet tested. Could you please post the full stacktrace from the server console. I shall open an issue --Noble On Fri, May 1, 2009 at 3:33 AM, Jian Han Guo wrote: > Hi, > > If there is no new document added since the last snapshot is created, the > request solr/replic

Re: Error java.net.SocketException: Connection reset with longer Solr Query

2009-04-30 Thread Noble Paul നോബിള്‍ नोब्ळ्
did you try to POST the query? On Fri, May 1, 2009 at 12:43 AM, ANKITBHATNAGAR wrote: > > Hi Guys, > I am using solr 1.3 for performing search. > > I am using facet search and I am getting : connection reset error when the > Query > > http://localhost:9000/solr/select/?q=*%3A*&fq=indexid_s%3A628&
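A minimal sketch of the suggestion above, assuming the select handler accepts a form-encoded POST body. It only builds the request and does not send it. The host, port and field names are taken from the quoted message; the helper name `build_select_post` is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_select_post(base_url, params):
    """Build (but do not send) a POST request that carries the query in the body."""
    body = urlencode(params).encode("utf-8")
    return Request(
        base_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"},
        method="POST",
    )


# A long list of OR'd values goes into the body instead of the URL.
values = ["0.25", "0.35", "0.44"]
params = [
    ("q", "*:*"),
    ("fq", "indexid_s:628"),
    ("fq", " OR ".join('column36_s:"%s"' % v for v in values)),
]
request = build_select_post("http://localhost:9000/solr/select/", params)
```

Because the parameters travel in the request body, the URL stays short no matter how many values are OR'd together.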

Re: understanding facets and tokens

2009-04-30 Thread Otis Gospodnetic
Hello Simon, I'll assume you are using Solr 1.3. Grab the latest Solr nightly and try with that - your multi-token facets should be faster (are you really sure you are ending up with a single token?). Also, most probably unrelated to this is the suspiciously large JVM heap. My guess is i

Re: Getting better performance frm Solr

2009-04-30 Thread Otis Gospodnetic
Several options: - use tmpfs (loads everything in RAM) - force the FS/OS to cache your index (e.g. cat /path/to/index/* > /dev/null) - warm up your index well (e.g. *:* + sort queries if you sort) - make use of Solr caches Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch ---
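The `cat /path/to/index/* > /dev/null` option above can be sketched in Python as follows. This is only an illustration of reading every index file once so that the OS page cache may keep it; whether the data stays cached depends on available memory. The function name `read_through` is hypothetical.

```python
import os


def read_through(index_dir, chunk_size=1 << 20):
    """Read every regular file in index_dir once and return the bytes read."""
    total = 0
    for name in sorted(os.listdir(index_dir)):
        path = os.path.join(index_dir, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories and other non-regular entries
        with open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)  # the data itself is discarded
    return total
```

Calling `read_through("/path/to/index")` after a restart would be the rough equivalent of the shell one-liner.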

Re: Best way to gather span/token positions from query? (mis-posted to dev list...)

2009-04-30 Thread Otis Gospodnetic
Sean, I haven't looked into this very closely yet, but since you are on the bleeding edge Solr and are looking for pointers, let me point out the new FieldAnalysisRequestHandler and DocumentAnalysisRequestHandler. [o...@localhost src]$ ff Field\*java | grep Handler.java ./java/org/apache/solr

Re: Filter query with wildcard, fq=a*

2009-04-30 Thread Otis Gospodnetic
Hi, Yeah, most of the time that is the case because people tend to lower-case when indexing in order to get case-insensitive searches. Technically, wildcard queries could match even when they are upper-case - if the input was not lower-cased at index time. Otis -- Sematext -- http://sematex

Re: Filter query with wildcard, fq=a*

2009-04-30 Thread Otis Gospodnetic
Hi Andrew, Is this "starts with letter _" query going to be common in your system? If so, may I suggest you add another field that stores just the initial letter and stick that in fq? That will get rid of a single-char prefix wildcard query, which is *expensive*! (read: slow!) Otis -- Semat
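A minimal sketch of the extra-field idea, done on the client before documents are sent for indexing. The field name `artistText` comes from the thread; `artistInitial` and both helper names are hypothetical, and the new field would also have to be declared in the schema.

```python
def add_initial_field(doc, source_field="artistText", initial_field="artistInitial"):
    """Return a copy of doc with the lower-cased first letter in its own field."""
    value = doc.get(source_field, "").strip()
    enriched = dict(doc)
    if value:
        enriched[initial_field] = value[0].lower()
    return enriched


def initial_filter(letter, initial_field="artistInitial"):
    """Build an exact-match filter query that replaces a prefix wildcard."""
    return "%s:%s" % (initial_field, letter.lower())
```

With this in place, `fq=artistInitial:a` is a plain term match rather than a one-character prefix query.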

Re: NullPointException during creating snapshot

2009-04-30 Thread Otis Gospodnetic
Jianhan, You didn't mention how old your version of Solr is. Try with the latest nightly build first, please. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Jian Han Guo > To: solr-user@lucene.apache.org > Sent: Thursday, April 30, 200

Re: BooleanQuery

2009-04-30 Thread Otis Gospodnetic
All the reviews for all the hotels or all the reviews for all the matched hotels? Your query doesn't look like the latter, and the latter would make sense. For the latter, it should be something like this: +(hot.id:AAA OR hot.id:BBB) rev.hot.id:AAA rev.hot.id:BBB Otis -- Sematext -- http://sem

Re: BooleanQuery

2009-04-30 Thread Avlesh Singh
> > (+(rev.headline:beach^2.0) | rev.comments:beach^2.0)~0.01 ()) (+ > hot.id:5823 +hot.id:5847) > Your hot.id's are not OR'd. They are AND'ed. (+hot.id:5823 +hot.id:5847) should have been (hot.id:5823 hot.id:5847). Cheers Avlesh On Fri, May 1, 2009 at 4:26 AM, Ankush Goyal wrote: > Hey Guys,
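The difference between AND'ed and OR'd clauses can be sketched as a small query builder. The field name `hot.id` and the ids come from the thread; the helper names are hypothetical.

```python
def any_of(field, values):
    """OR the values together: a document may match any one of them."""
    return "(" + " OR ".join("%s:%s" % (field, v) for v in values) + ")"


def all_of(field, values):
    """Mark every clause as required, which is what the original query did."""
    return "(" + " ".join("+%s:%s" % (field, v) for v in values) + ")"


hotel_ids = [5823, 5847]
or_query = any_of("hot.id", hotel_ids)    # (hot.id:5823 OR hot.id:5847)
and_query = all_of("hot.id", hotel_ids)   # (+hot.id:5823 +hot.id:5847)
```

A single-valued id field can never hold both values at once, so the AND'ed form would be expected to match nothing.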

Re: Term highlighting with MoreLikeThisHandler?

2009-04-30 Thread Walter Underwood
It can be pretty confusing. The user didn't type those terms and the terms can be pretty odd. Effective, but odd. They might be half-phrases or other things that just look bad and distracting. wunder On 4/30/09 6:23 AM, "Eric Sabourin" wrote: > solr find the specified document, extracts its int

Re: facet results in order of rank

2009-04-30 Thread ristretto.rb
Thanks for the reply. Hopefully I'll get more, and turn this into a mini project I can commit back to the project, or at least make available to anyone who'd like the functionality. Of course, if I'm the only one who cares, it could be a long road. :) gene On Fri, May 1, 2009 at 9:41 AM, En

Re: Problems searching ranges of values with double datatype

2009-04-30 Thread Iris Soto
Iris Soto wrote: Iris Soto wrote: Hi all, I am having problems searching on the double datatype. For example, I am comparing the results of a search over the range "0 TO 1000" with the sum of the results of searches over the ranges "0 TO 500" and "501 TO 1000", and there are different r

BooleanQuery

2009-04-30 Thread Ankush Goyal
Hey Guys, I have 2 indexes with one having hotel content and other containing reviews for hotels. When a user queries for a location the logic first calls the hotel index to get hotels for the location, then it needs to call review index to ask for all the reviews for all the hotels. So, I need

Re: Problems searching ranges of values with double datatype

2009-04-30 Thread Koji Sekiguchi
Use sdouble instead of double for range queries, since the lexicographic ordering of double values isn't equal to their numeric ordering. Koji Iris Soto wrote: Iris Soto wrote: Hi all, I am having problems searching on the double datatype. For example, I am comparing the results of a search over the range
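The ordering problem can be illustrated with plain strings. This is a simplified sketch, not Solr's actual encoding: the zero-padding below only stands in for "some order-preserving representation" and handles non-negative values only.

```python
terms = ["5", "40", "300", "1000"]


def string_range(lo, hi, values):
    """Range check using lexicographic (string) comparison."""
    return [v for v in values if lo <= v <= hi]


def numeric_range(lo, hi, values):
    """Range check using numeric comparison."""
    return [v for v in values if lo <= float(v) <= hi]


def pad(value, width=8):
    """Illustrative order-preserving form for non-negative numbers."""
    return "%0*.2f" % (width, float(value))


lexicographic = string_range("0", "1000", terms)   # only "1000" survives
numeric = numeric_range(0, 1000, terms)            # all four values
padded = [pad(t) for t in terms]
padded_hits = string_range(pad(0), pad(1000), padded)
```

As strings, "5", "40" and "300" all sort after "1000", so a lexicographic range from "0" to "1000" drops them; once the values are stored in an order-preserving form, the string range and the numeric range agree.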

Re: Stats field with decimal values

2009-04-30 Thread Nasseam Elkarra
I get the error doing stats or facet query on an sfloat. So the following query: q = *:*&facet = true &facet.mincount=1&facet.sort=true&facet.limit=-1&facet.field=salePrice Gives the following error: java.lang.StringIndexOutOfBoundsException: String index out of range: 2 at java.

NullPointException during creating snapshot

2009-04-30 Thread Jian Han Guo
Hi, If there is no new document added since the last snapshot is created, the request solr/replication?command=snapshoot returns the following response 08java.lang.NullPointerException:java.lang.NullPointerException Is this something expected? There was no error message found in the log file.

Re: Problems searching ranges of values with double datatype

2009-04-30 Thread Iris Soto
Iris Soto wrote: Hi all, I am having problems searching on the double datatype. For example, I am comparing the results of a search over the range "0 TO 1000" with the sum of the results of searches over the ranges "0 TO 500" and "501 TO 1000", and there are different results. I've read tha

Last modified time for cores, taking into account uncommitted changes

2009-04-30 Thread James Brady
Hi, The lastModified field in the Solr status seems to only be updated when a commit/optimize operation takes place. Is there any way to determine when a core has been changed, including any uncommitted add operations? Thanks, James

RE: facet results in order of rank

2009-04-30 Thread Ensdorf Ken
> Hello Solrites (or Solrorians) I prefer "Solrdier" :) > > Is it possible to get the average ranking score for a set of docs that > would be returned for a given facet value. > > If not in SOLR, what about Lucene? > > How hard to implement? > > I have years of Java experience, but no Lucene codi

Problems searching ranges of values with double datatype

2009-04-30 Thread Iris Soto
Hi all, I am having problems searching on the double datatype. For example, I am comparing the results of a search over the range "0 TO 1000" with the sum of the results of searches over the ranges "0 TO 500" and "501 TO 1000", and there are different results. I've read that the correct datatyp

Re: facet results in order of rank

2009-04-30 Thread ristretto.rb
Hello Solrites (or Solrorians) Is it possible to get the average ranking score for a set of docs that would be returned for a given facet value. If not in SOLR, what about Lucene? How hard to implement? I have years of Java experience, but no Lucene coding experience. Would be happy to impleme

Re: Error java.net.SocketException: Connection reset with longer Solr Query

2009-04-30 Thread Caio Quiozini
Hi ANKITBHATNAGAR, have you already tried to set up column36 as sfloat and use a query like column36:[xxx.yy OR zzz.aa OR ... ]? I think it should help to decrease the number of characters. Some time ago I experienced problems because of the size of the URL passed to Tomcat. ANKITBHATN

Error java.net.SocketException: Connection reset

2009-04-30 Thread ANKITBHATNAGAR
Hi Guys, I am using solr 1.3 for performing search. I am using facet search and I am getting : connection reset error when the Query http://localhost:9000/solr/select/?q=*%3A*&fq=indexid_s%3A628&fq=column36_s%3A%220.25%22+OR+column36_s%3A%220.35%22+OR+column36_s%3A%220.44%22+OR+column36_s%3A%220

Re: Stats field with decimal values

2009-04-30 Thread Shalin Shekhar Mangar
On Fri, May 1, 2009 at 12:27 AM, Nasseam Elkarra wrote: > Hello, > > I'm getting an error when trying to create stats on an sfloat field. The > field is for price and when there is no decimal it works fine but when there > is a decimal (e.g., 24.99) I get an error: > java.lang.StringIndexOutOfBoun

Stats field with decimal values

2009-04-30 Thread Nasseam Elkarra
Hello, I'm getting an error when trying to create stats on an sfloat field. The field is for price and when there is no decimal it works fine but when there is a decimal (e.g., 24.99) I get an error: java.lang.StringIndexOutOfBoundsException: String index out of range: 2 Changing the fiel

Problems searching ranges of values with double datatype

2009-04-30 Thread Iris Soto
Hi all, I am having problems searching on the double datatype. I am comparing this result with the sum of the results of searches over the ranges "0 TO 500" and "501 TO 1000", and there are different results. I've read that the correct datatype for money fields is double, but something is not working

custome query parser.

2009-04-30 Thread Raju444us
How do I write a custom query parser? When I get a query from the client, I have to parse it and append a character to that field for searching on that field. Has anyone tried this? Please help me with this. How do I configure this query parser in Solr? Thanks, Raju -- View this message in context: ht

Re: How to submit code improvement/suggestions

2009-04-30 Thread Shalin Shekhar Mangar
Also see http://wiki.apache.org/solr/HowToContribute On Thu, Apr 30, 2009 at 11:03 PM, Eric Pugh wrote: > Yup! > > One thing though is that if you see some big changes you want to make, you > should probably join the solr-dev list and broach the topic there first to > make sure you are headed on

Re: How to submit code improvement/suggestions

2009-04-30 Thread Eric Pugh
Yup! One thing though is that if you see some big changes you want to make, you should probably join the solr-dev list and broach the topic there first to make sure you are headed on the right path. The committers typically don't want to introduce change for change's sake, but cleanup an

How to submit code improvement/suggestions

2009-04-30 Thread Amit Nithian
My apologies if this sounds like a silly question but for this project, how do I go about submitting code suggestions/improvements? They aren't necessarily bugs as such but rather just cleaning up some perceived strangeness (or even suggesting a package change). Would I need to create a JIRA ticket

Re: Term highlighting with MoreLikeThisHandler?

2009-04-30 Thread Matt Weber
Yes, I understand you can't highlight a document within a document. However, with MLT you are using the interesting terms from the source document(s) to find similar results. An obvious solution would be highlighting the interesting terms that matched and thus made the result similar. T

[Job] Solr Search Opportunity

2009-04-30 Thread Solr Opportunity
A top 100 website is currently looking for a developer with Solr/Lucene experience to serve as a search architect/engineer. You will be working on an R&D effort for a multi-lingual search platform based upon Apache Solr. This position is based just out of Atlanta, GA, and we are looking for someo

Re: Unique Identifiers

2009-04-30 Thread Shalin Shekhar Mangar
On Thu, Apr 30, 2009 at 8:42 PM, ahammad wrote: > > Hello, > > How would I go about creating an aggregate entry? Does it go in the > data-config.xml file? > > You can use the TemplateTransformer to fix the table name followed by the pk of the table. e.g. -- Regards, Shalin Shekhar Mangar.

Re: Unique Identifiers

2009-04-30 Thread ahammad
Hello, How would I go about creating an aggregate entry? Does it go in the data-config.xml file? Also, out of curiosity, how can I access the UUIDField variable? It may be required for something else. Cheers Erik Hatcher wrote: > > > On Apr 28, 2009, at 9:49 AM, ahammad wrote: >> Is it poss

Re: Filter query with wildcard, fq=a*

2009-04-30 Thread Rob Casson
it sounds to me like the field you're using (artistText) is tokenized and lowercased. it might be good to go over the wiki pages again: http://wiki.apache.org/solr/SolrFacetingOverview if you keep having problems, post your schema...cheers, rob On Thu, Apr 30, 2009 at 10:21 AM, Andrew McC

Re: java version for solr.

2009-04-30 Thread Walter Underwood
Java 1.4.2 is obsolete and unsupported. It is foolish to run production on such old software. Java 1.4 was originally released in 2002 and Sun stopped supporting it in Dec 2008. That means no more fixes, including security fixes. See this page: http://java.sun.com/j2se/1.4.2/ The only way to get

Re: Filter query with wildcard, fq=a*

2009-04-30 Thread Andrew McCombe
Hi It has now introduced a new issue.  I am now getting results for my query but they are not what I wanted.  I'm using a Dismax handler and am trying to filter a result set with a filter query: http://localhost:8180/solr/select?indent=on&version=2.2&q=i+love&start=0&rows=30&fl=*%2Cscore&qt=disma

Re: Filter query with wildcard, fq=a*

2009-04-30 Thread Andrew McCombe
Hi I've sorted it. Wildcard term must be lowercase to get results. Thanks Andrew 2009/4/30 Andrew McCombe > Hi > > I have half a million records indexed and need to filter results on a term > by the first letter. For Example the search term is 'I love' and returns a > few thousand records.
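The fix above can be sketched as a helper that lower-cases the prefix before building the filter, on the assumption that the field is lower-cased at index time and that wildcard terms are not passed through the analyzer. The helper name and its flag are hypothetical; `artistText` is the field from the thread.

```python
def prefix_filter(field, prefix, lowercased_at_index_time=True):
    """Build a prefix filter whose case matches the indexed terms."""
    term = prefix.lower() if lowercased_at_index_time else prefix
    return "%s:%s*" % (field, term)
```

For the case in this thread, `prefix_filter("artistText", "A")` yields `artistText:a*`.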

Re: Getting better performance frm Solr

2009-04-30 Thread Erick Erickson
From Erik Hatcher: Try adding &debugQuery=true to your request and look at the timings in the response. What is the QueryComponent time? As I understand it, if you look carefully at the output you'll see where time is spent, including the raw time the search takes. That'll inform your next step
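A small sketch of adding `debugQuery=true` to an existing request URL without duplicating the parameter. The helper name `with_debug` and the example URL are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_debug(url):
    """Return url with exactly one debugQuery=true parameter appended."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "debugQuery"
    ]
    query.append(("debugQuery", "true"))
    return urlunsplit(parts._replace(query=urlencode(query)))


debug_url = with_debug("http://localhost:8983/solr/select?q=solr&rows=10")
```

The timing section of the response to such a request is where the per-component times mentioned above should appear.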

Best way to gather span/token positions from query? (mis-posted to dev list...)

2009-04-30 Thread Sean O'Connor
Hello, I'm trying to find a decent approach for getting token positions out of (or is that into?) Solr query results. Is the best approach to extend a QueryComponent and/or HighlightComponent? I'm new to solr, and still on fairly shaky ground so any pointers or suggestions are quite welcome.

Filter query with wildcard, fq=a*

2009-04-30 Thread Andrew McCombe
Hi I have half a million records indexed and need to filter results on a term by the first letter. For Example the search term is 'I love' and returns a few thousand records. I need to filter these results for all artists beginning with 'A'. I've tried: 'fq=artistText:A*' But then get no resul

Re: Term highlighting with MoreLikeThisHandler?

2009-04-30 Thread Eric Sabourin
Solr finds the specified document, extracts its interesting terms as configured, and uses the interesting terms for its query, does it not? If so, would it be that inappropriate to highlight snippets in the "similar" documents it finds showing which interesting terms occur in which fields? Just a th

RE: java version for solr.

2009-04-30 Thread Radha C.
Thanks all. Is there any other way like retroweaver? -Original Message- From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.p...@gmail.com] Sent: Thursday, April 30, 2009 5:58 PM To: solr-user@lucene.apache.org Subject: Re: java version for solr. Solr uses java.util.concurrent packages which

Re: java version for solr.

2009-04-30 Thread Noble Paul നോബിള്‍ नोब्ळ्
Solr uses the java.util.concurrent package, which is not available in Java 1.4, so it may be impossible to make Solr work on Java 1.4. On Thu, Apr 30, 2009 at 5:47 PM, Smiley, David W. wrote: > Solr indeed requires Java 1.5. > > I am not sure if anyone has tried this but you may be able to get it

Re: java version for solr.

2009-04-30 Thread Smiley, David W.
Solr indeed requires Java 1.5. I am not sure if anyone has tried this but you may be able to get it to work after applying Retroweaver: http://retroweaver.sourceforge.net/ However I don't think retroweaver re-targets classes/methods in 1.5 that are not in 1.4. Not sure. ~ David Smiley On

java version for solr.

2009-04-30 Thread Radha C.
Hello List, Our production server is using j2sdk1.4.2_12 and Solr requires Java 1.5. So is it a must to use Java 1.5 in order to use Solr? Can anyone tell me what issues we could face if we use Java 1.4? And do we need to implement any utility to make use of Solr in a Java 1.4 environment? Than

Re: stress tests to DIH and deduplication patch

2009-04-30 Thread Marc Sturlese
I have already run out of memory after a cron job indexing as many times as possible during a day. I will activate GC logging to see what it says... Thanks! Shalin Shekhar Mangar wrote: > > On Wed, Apr 29, 2009 at 7:44 PM, Marc Sturlese > wrote: > >> >> Hey there, I am doing some stress tests index

Re: Date faceting - howto improve performance

2009-04-30 Thread Marcus Herou
Thanks should have grep'ed the source of course (like I always seem to end up with doing haha) /M On Wed, Apr 29, 2009 at 10:13 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > Some basic documentation is in the example schema.xml. Ask away if you have > specific questions. > > On T

Re: how to reset the index in solr

2009-04-30 Thread Marcus Herou
Or curl http://$server:$port/solr/$core/update -H "Content-Type: text/xml" --data-binary '*:*' curl http://$server:$port/solr/$core/update -H "Content-Type: text/xml" --data-binary '' curl http://$server:$port/solr/$core/update -H "Content-Type: text/xml" --data-binary '' //Marcus On Thu, Ap
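The XML payloads in the commands above appear to have been lost in archiving, and they are left as they are. As a hedged sketch, the standard Solr XML update messages for deleting everything and then committing can be built like this; the helper names are hypothetical, and which messages the original commands actually sent is not recoverable from the snippet.

```python
import xml.etree.ElementTree as ET


def delete_by_query(query):
    """Build a delete-by-query update message."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "query").text = query
    return ET.tostring(delete, encoding="unicode")


def commit():
    """Build a commit update message."""
    return ET.tostring(ET.Element("commit"), encoding="unicode")


delete_all = delete_by_query("*:*")
```

Each message would be posted to the core's `/update` URL with a `text/xml` content type, as in the curl commands.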

fragmenter regexp

2009-04-30 Thread arno13
Hi, I cannot get the regexp fragmenter functionality to work in Solr. I'm using Solr 1.3 and I defined my fragmenter like this in solrconfig.xml: [-\w ,/\n\"']{100,200} My query is the following: /solr/select?indent=on&version=2.2&q=fever&rows=100&start
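To see what the pattern in the message matches on its own, here is a sketch using Python's `re` module. It says nothing about how the Solr fragmenter applies the pattern, and Python's `\w` is Unicode-aware whereas Java's default is ASCII-only, so results may differ on non-ASCII text. The helper name is hypothetical.

```python
import re

# The pattern from the message: runs of 100 to 200 "sentence-like" characters.
FRAGMENT_PATTERN = re.compile(r"[-\w ,/\n\"']{100,200}")


def first_fragment(text):
    """Return the first run the pattern accepts, or None if there is none."""
    match = FRAGMENT_PATTERN.search(text)
    return match.group(0) if match else None
```

Text broken up by characters outside the class (a full stop, for instance) into runs shorter than 100 characters yields no match at all, which may be worth checking against the indexed content.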

Getting better performance frm Solr

2009-04-30 Thread mirage1987
Hi, I was wondering how we can improve the query response time from Solr. Is it possible to put my index into RAM to increase performance? Does Solr provide any such functionality (like RAMDirectory in Lucene)? (working on Linux) I tried to put the index in /dev/shm/ but didn't get any perfo

Re: Problem adding unicoded docs to Solr through SolrJ

2009-04-30 Thread Michael Ludwig
ahmed baseet wrote: I tried something stupid, but it works. I first converted the whole string to a byte array and then used that byte array to create a new UTF-8 encoded string like this: // Encode in Unicode UTF-8 byte[] utfEncodeByteArray = textOnly.getBytes(); This yi

Re: Problem adding unicoded docs to Solr through SolrJ

2009-04-30 Thread Gunnar Wagenknecht
ahmed baseet wrote: > I first converted the whole string to > byte array and then used that byte array to create a new utf-8 encoded string > like this, I'm not sure that this is required at all. Java strings have the same representation internally no matter what they were created from. Thus, the
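The point about encodings can be illustrated with a Python analogue (not the Java code from the thread): a string only turns into bytes when it is encoded, and problems appear when the bytes are decoded with a different charset than the one used to encode them. The sample text is made up.

```python
text = "na\u00efve caf\u00e9"          # contains two non-ASCII characters

utf8_bytes = text.encode("utf-8")

# Decoding with the same charset round-trips exactly.
round_tripped = utf8_bytes.decode("utf-8")

# Decoding with a different charset silently produces mojibake.
mangled = utf8_bytes.decode("latin-1")
```

In Java terms, calling `getBytes()` without naming a charset uses the platform default, so the result depends on the machine; naming the charset on both the encoding and the decoding side avoids that.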