Fun with Spatial (Haversine formula)

2010-08-17 Thread Lance Norskog
The Haversine formula in o.a.s.s.f.d.DistanceUtils.java gives these results for a 0.1 degree difference in miles: equator horizontal 0.1 deg: lat/lon 0.0/0.0 -> 396.320504 equator vertical 0.1 deg: lat/lon 0.0/0.0 -> 396.320504 NYC horizontal 0.1 deg: lat/lon -72.0/0.
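
For reference, 0.1 degree along the equator should come out near 6.9 miles, not 396. The sketch below is a plain Haversine implementation, not the code in DistanceUtils.java; the Earth radius of 3963.2 miles and both function names are assumptions chosen for illustration. It also shows that feeding 0.1 in as if it were already radians yields roughly the 396-mile figure quoted above, which is one possible (unconfirmed) explanation for those results.

```python
import math

EARTH_RADIUS_MILES = 3963.2  # assumed value, not taken from DistanceUtils.java


def haversine_radians(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_MILES):
    """Great-circle distance for coordinates that are already in radians."""
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def haversine_degrees(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_MILES):
    """Great-circle distance for coordinates given in degrees."""
    return haversine_radians(
        math.radians(lat1), math.radians(lon1),
        math.radians(lat2), math.radians(lon2),
        radius,
    )


if __name__ == "__main__":
    # Degrees converted to radians first: a few miles.
    print(haversine_degrees(0.0, 0.0, 0.0, 0.1))
    # Same numbers treated as radians: hundreds of miles.
    print(haversine_radians(0.0, 0.0, 0.0, 0.1))
```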

Re: indexing???

2010-08-17 Thread satya swaroop
hi, 1) i use tika 0.8... 2)the url is https://issues.apache.org/jira/browse/PDFBOX-709 and the file is samplerequestform.pdf 3)the entire error is::; curl " http://localhost:8080/solr/update/extract?stream.file=/home/satya/my_workings/satya_ebooks/8-Linux/sample

Re: Solr date "NOW" - format?

2010-08-17 Thread Shawn Heisey
My first attempt was adding this to the dismax handler: pd:[NOW-1MONTH TO NOW]^5.0 pd:[NOW-3MONTHS TO NOW-1MONTH]^3.0 pd:[NOW-1YEAR TO NOW-3MONTHS]^2.0 pd:[* TO NOW-1YEAR]^1.0 This results in scores that are quite a bit lower (9.5 max score instead of 11.7), but the order looks the same. No r
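
A minimal sketch of how the four ranges above could be sent as separate `bq` parameters on one request. The field name `pd` and the ranges come from the message; the function name and the idea of building the query string client-side are assumptions for illustration, and the handler is assumed to be a dismax handler that accepts repeated `bq` values.

```python
from urllib.parse import urlencode

# Ranges and boosts as quoted in the message above.
DATE_BOOSTS = [
    ("[NOW-1MONTH TO NOW]", 5.0),
    ("[NOW-3MONTHS TO NOW-1MONTH]", 3.0),
    ("[NOW-1YEAR TO NOW-3MONTHS]", 2.0),
    ("[* TO NOW-1YEAR]", 1.0),
]


def build_query_string(user_query, field="pd"):
    """Return a URL query string with one bq parameter per date range."""
    params = [("q", user_query)]
    for date_range, boost in DATE_BOOSTS:
        params.append(("bq", f"{field}:{date_range}^{boost}"))
    # A list of pairs keeps repeated keys, so each bq is sent separately.
    return urlencode(params)


if __name__ == "__main__":
    print(build_query_string("solr date boost"))
```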

Re: OutOfMemoryErrors

2010-08-17 Thread Grijesh.singh
A ramBufferSize of 128MB is preferred; going higher than that does not seem to improve performance. -- View this message in context: http://lucene.472066.n3.nabble.com/OutOfMemoryErrors-tp1181731p1199592.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: OutOfMemoryErrors

2010-08-17 Thread rajini maski
Yes, fine, I will do that. The mergeFactor was originally 10; after hitting this error I set it higher on the assumption that it might be the cause. I will change it back. The ramBufferSize is 256MB. Do I need to raise this value? On Wed, Aug 18, 2010 at 12:27 AM, Jay

Re: Solr date "NOW" - format?

2010-08-17 Thread Shawn Heisey
Would I do separate bq values for each of the ranges, or is there a way to include them all at once? If it's the latter, I'll need a full example with a field name, because I'm clueless. :) On 8/17/2010 2:29 PM, Lance Norskog wrote: I think 'bq=' is what you want. In dismax the main query st

Integrating Solr's SynonymFilter in lucene

2010-08-17 Thread Arun Rangarajan
I am trying to have multi-word synonyms work in Lucene using Solr's SynonymFilter. I need to match synonyms at index time, since many of the synonym lists are huge. Actually they are really not synonyms, but are words that belong to a concept. For example, I would like to map {"New York", "Los

changable DIH datasource based on environment variables

2010-08-17 Thread Tommy Chheng
I defined my DIH datasource in solrconfig.xml. Is there a way to define two sets of data sources and use one based on the current system's environment variable (e.g. APP_ENV=production or APP_ENV=development)? I run the DIH on my local machine and remote server. They use different mysql dataso

queryResultCache has no hits for date boost function

2010-08-17 Thread Peter Karich
Hi all, my queryResultCache has no hits. But if I remove one line from the bf section in my dismax handler, all is fine. Here is the line: recip(ms(NOW,date),3.16e-11,1,1) According to http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents this should be fi
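
To make the boost line concrete: `recip(x,m,a,b)` evaluates to `a / (m*x + b)`, so with the constants above a document roughly one year old gets a boost of about 0.5. The sketch below reproduces that arithmetic outside Solr. It also illustrates a likely (but here unconfirmed) reason for the cache misses: `NOW` has millisecond resolution, so every request differs, whereas a timestamp rounded down to the day stays the same for all requests on that day. The helper names are hypothetical.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000
MS_PER_YEAR = 365.25 * MS_PER_DAY


def recip(x, m, a, b):
    """Same shape as Solr's recip function query: a / (m*x + b)."""
    return a / (m * x + b)


def recency_boost(age_ms):
    """Boost for a document of the given age, using the constants from the thread."""
    return recip(age_ms, 3.16e-11, 1, 1)


def floor_to_day(epoch_ms):
    """Round a millisecond timestamp down to the start of its (UTC) day."""
    return epoch_ms - (epoch_ms % MS_PER_DAY)


if __name__ == "__main__":
    print(recency_boost(0))             # brand-new document
    print(recency_boost(MS_PER_YEAR))   # about one year old
    # Two requests a few seconds apart share the same rounded timestamp.
    print(floor_to_day(1282046400123) == floor_to_day(1282046405456))
```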

Re: Function query to boost scores by a constant if all terms are present

2010-08-17 Thread Ahmet Arslan
> Most of the time, items that match all three terms will > float to the top by > normal ranking, but sometimes there are only two terms that > are like a rash > across the record, and they end up with a higher score than > some items that > match all three query terms. > > I'd like to boost items

Re: Solr synonyms format query time vs index time

2010-08-17 Thread Lance Norskog
solr/admin/analysis.jsp lets you see how this works. Use the index boxes. Lance On Tue, Aug 17, 2010 at 11:56 AM, Steven A Rowe wrote: > Hi Michael, > > I think the problem you're seeing is that no document contains "reebox", and > you've used the "explicit" syntax (source=>dest) instead of the

Re: Solr date "NOW" - format?

2010-08-17 Thread Lance Norskog
I think 'bq=' is what you want. In dismax the main query string is assumed to go against a bunch of fields. This query is in the standard (Lucene++) format. The query strings should handle the ^number syntax. http://www.lucidimagination.com/search/document/CDRG_ch07_7.4.2.9 On Tue, Aug 17, 2010 a

sort order of "missing" items

2010-08-17 Thread Brad Dewar
When items are sorted, are all the docs with the sort field missing considered "tied" in terms of their sort order, or are they "indeterminate", or do they have some arbitrary order imposed on them (e.g. _docid_)? For example, would "b" be considered as part of the sort in the following query,

Re: Sort by date, filter by score?

2010-08-17 Thread Ahmet Arslan
> They want to sort by a date field but filter out all > results below a minimum relevancy score.  Is this possible?  Earlier Yonik proposed a solution to a similar need. http://search-lucene.com/m/4AHNF17wIJW1

Sort by date, filter by score?

2010-08-17 Thread Shawn Heisey
I have had a request from our development team. I did some searching and could not find an answer. They want to sort by a date field but filter out all results below a minimum relevancy score. Is this possible? I suspect that our only option will be to do the search sorted by relevancy and
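
If the filtering does end up happening on the client side, it could look roughly like the sketch below. This is an assumption about one possible workaround, not something the thread confirms; the document layout (dicts with `score` and `date` keys, dates as ISO 8601 strings) and the function name are made up for illustration.

```python
def filter_by_score_then_sort_by_date(docs, min_score):
    """Drop documents below min_score, then order the rest newest first.

    Assumes each doc is a dict with a numeric 'score' and an ISO 8601 'date'
    string, so plain string comparison orders the dates correctly.
    """
    kept = [doc for doc in docs if doc["score"] >= min_score]
    return sorted(kept, key=lambda doc: doc["date"], reverse=True)


if __name__ == "__main__":
    results = [
        {"id": "a", "score": 2.5, "date": "2010-08-01T00:00:00Z"},
        {"id": "b", "score": 0.4, "date": "2010-08-17T00:00:00Z"},
        {"id": "c", "score": 1.9, "date": "2010-08-10T00:00:00Z"},
    ]
    for doc in filter_by_score_then_sort_by_date(results, min_score=1.0):
        print(doc["id"], doc["date"])
```

Raw relevancy scores are not comparable across different queries, so any fixed threshold of this kind is only a heuristic.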

Re: OutOfMemoryErrors

2010-08-17 Thread Jay Hill
A merge factor of 100 is very high and out of the norm. Try starting with a value of 10. I've never seen a running system with a value anywhere near this high. Also, what is your setting for ramBufferSizeMB? -Jay On Tue, Aug 17, 2010 at 10:46 AM, rajini maski wrote: > yeah sorry I forgot to men

RE: Solr synonyms format query time vs index time

2010-08-17 Thread Steven A Rowe
Hi Michael, I think the problem you're seeing is that no document contains "reebox", and you've used the "explicit" syntax (source=>dest) instead of the "equivalent" syntax (term,term,term). I'm guessing that if you convert your synonym file from: reebox => Reebok to: reebox
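
The difference between the two rule styles can be sketched as a small parser. This is an illustration of the semantics only, not Solr's SynonymFilterFactory code: it ignores case handling and multi-word tokenization, and the `tv, television` rule is an invented example rather than the rule suggested in the message.

```python
def parse_synonym_rule(line, expand=True):
    """Return {input term: [terms emitted in its place]} for one rule line."""
    if "=>" in line:
        # Explicit mapping: each left-hand term is replaced by the right-hand terms.
        left, right = line.split("=>", 1)
        sources = [t.strip() for t in left.split(",") if t.strip()]
        targets = [t.strip() for t in right.split(",") if t.strip()]
        return {source: list(targets) for source in sources}
    # Equivalent list: with expand, every term maps to all terms;
    # without expand, every term is reduced to the first one.
    terms = [t.strip() for t in line.split(",") if t.strip()]
    if expand:
        return {term: list(terms) for term in terms}
    return {term: [terms[0]] for term in terms}


if __name__ == "__main__":
    print(parse_synonym_rule("reebox => Reebok"))
    print(parse_synonym_rule("tv, television"))
    print(parse_synonym_rule("tv, television", expand=False))
```

With the explicit rule, only "reebox" is rewritten; "Reebok" itself has no entry and passes through unchanged.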

Re: OutOfMemoryErrors

2010-08-17 Thread rajini maski
Yeah, sorry, I forgot to mention the others... 100 1000 10 1 Those are the values. Is the error caused by these values? Initially I had the mergeFactor parameter - 10 and maxMergeDocs - 1. With the same error I changed them to the values above, yet I still got the error once the index reached about 2 lakh (200,000) docs.

Re: OutOfMemoryErrors

2010-08-17 Thread Erick Erickson
There are more merge parameters; what values do you have for these: 10 1000 2147483647 1 See: http://wiki.apache.org/solr/SolrConfigXml Hope that formatting comes through the various mail programs OK. Also, what else happens while you're indexing? Do you search while indexing? How often

Re: Solr-HOW TO HANDLE THE LOCK FILE CREATION WHILE INDEXING AND OPERATION TIMED OUT WEB EXCEPTION ERROR

2010-08-17 Thread rajini maski
Yes, it is a networked drive, and on Windows. The Solr version is Solr-1.4.0, with Tomcat 6. The exception is a system.net.web exception error, "Operation has timed out"; httprequest.getresponse failed. For the web exception error, do I need to change the ramBufferSize parameter and the merge factor parameters in config.xml

Re: OutOfMemoryErrors

2010-08-17 Thread rajini maski
100 JVM Initial memory pool -256MB Maximum memory pool -1024MB long:ID str:Body 12 fields I have a solr instance in solr folder (D:/Solr) free space in disc is 24.3GB .. How will I get to know what portion of memory is solr using ? On Tue, Aug 17, 2010 at 10:11 PM, Erick E

Re: OutOfMemoryErrors

2010-08-17 Thread Erick Erickson
You shouldn't be getting this error at all unless you're doing something out of the ordinary. So, it'd help if you told us: >What parameters you have set for merging >What parameters you have set for the JVM >What kind of documents are you indexing? The memory you have is irrelevant if you only a

Re: Solr date "NOW" - format?

2010-08-17 Thread Shawn Heisey
On 4/9/2010 7:35 PM, Lance Norskog wrote: Function queries are notoriously slow. Another way to boost by year is with range queries: [NOW-6MONTHS TO NOW]^5.0 , [NOW-1YEARS TO NOW-6MONTHS]^3.0 [NOW-2YEARS TO NOW-1YEARS]^2.0 [* TO NOW-2YEARS]^1.0 Notice that you get to have a non-linear curve whe

Function query to boost scores by a constant if all terms are present

2010-08-17 Thread Bill Dueber
Let me describe what I'm trying to accomplish, first, since what I think is the solution is almost always wrong. :-) I'm doing dismax queries with mm set such that not all terms need to match, e.g. only 2 of 3 query terms need to match. Most of the time, items that match all three terms will floa

Re: Solr-HOW TO HANDLE THE LOCK FILE CREATION WHILE INDEXING AND OPERATION TIMED OUT WEB EXCEPTION ERROR

2010-08-17 Thread Erick Erickson
It would help a lot if you included the stack trace of the exception, perhaps it'll be in your SOLR logs. Also, what is your environment? Are you using any kind of networked drive for your index? Windows? What version of SOLR are you using? Anything else you think would be useful. Best Erick On

Re: autocomplete: case-insensitive and middle word

2010-08-17 Thread Avlesh Singh
This thread might help - http://www.lucidimagination.com/search/document/9edc01a90a195336/enhancing_auto_complete Cheers Avlesh @avlesh | http://webklipper.com On Tue, Aug 17, 2010 at 8:30 PM, Paul wrote: > I have a couple questions about implementing an autocomplete

autocomplete: case-insensitive and middle word

2010-08-17 Thread Paul
I have a couple questions about implementing an autocomplete function in solr. Here's my scenario: I have a name field that usually contains two or three names. For instance, let's suppose it contains: John Alfred Smith Alfred Johnson John Quincy Adams Fred Jones I'd like to have the autocomplet
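
The matching behaviour suggested by the subject line (case-insensitive, and matching on any word of the name rather than only the first) can be sketched in plain Python. This only illustrates the intended semantics under that reading of the question; it is not Solr configuration, and the function name is hypothetical. The names are the ones from the message.

```python
NAMES = [
    "John Alfred Smith",
    "Alfred Johnson",
    "John Quincy Adams",
    "Fred Jones",
]


def suggest(prefix, names):
    """Return names in which any word starts with prefix, ignoring case."""
    wanted = prefix.casefold()
    return [
        name for name in names
        if any(word.casefold().startswith(wanted) for word in name.split())
    ]


if __name__ == "__main__":
    print(suggest("al", NAMES))
    print(suggest("FRED", NAMES))
```

Note that under this word-prefix reading, "fred" matches "Fred Jones" but not "Alfred Johnson"; matching inside words would need a different rule.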

Re: Search document design problem

2010-08-17 Thread Peter Karich
Hi Wenca, > But maybe the right term for this use case is "filter" not a "search" ... > in this case I should use the term "search". no problem. a search query to solr could also contain filters. so 'search' is ok for all situations, I think ;-) > Afterwards the user selects a price level and a

Re: indexing???

2010-08-17 Thread Ken Krugler
On Aug 16, 2010, at 10:38pm, satya swaroop wrote: hi all, the error i got is ""Unexpected RuntimeException from org.apache.tika.parser.pdf.pdfpar...@8210fc"" when i indexed a file similar to the one in https://issues.apache.org/jira/browse/PDFBOX-709/samplerequestform.pdf 1. This URL d

Re: OutOfMemoryErrors

2010-08-17 Thread Peter Karich
Which method do you use to index? If you are using SolrJ you can use the streaming update server; it is a better option for the Solr server, because the server does not need to hold it all in memory. (If you are using the post.jar file, there was a bug which causes OOMs, but I don't remember exactly .

Re: Search document design problem

2010-08-17 Thread Wenca
Hi Peter, in fact I mainly want to search Hotels by any combination of its fields and its rooms and packages. Users can setup any combination in a dynamic form that changes after every change of the query. But maybe the right term for this use case is "filter" not a "search". The form enable

synonyms in EmbeddedSolrServer

2010-08-17 Thread Tim Terlegård
Synonyms don't seem to work in EmbeddedSolrServer (Solr 1.4.0) when mixing in multi-word synonyms. It works fine when I run Solr standalone. Did anyone else experience this? I have this in synonyms.txt: word => some, other stuff I index "some" and then search for "word". With a standalone solr

Re: stream.url problem

2010-08-17 Thread Travis Low
"Connection refused" (in any context) almost always means that nothing is listening on the TCP port that you are trying to connect to. So either the process you are connecting to isn't running, or you are trying to connect to the wrong port. On Tue, Aug 17, 2010 at 6:18 AM, satya swaroop wrote:

Re: Search document design problem

2010-08-17 Thread Peter Karich
Hi Wenca, I am not sure whether my information here is really helpful for you, sorry if not ;-) > I want only hotels that have room with 2 beds and the room has a package with all inclusive boarding and price lower than 400. You should tell us what you want to search and filter. Do you want only

Re: OutOfMemoryErrors

2010-08-17 Thread rajini maski
I am getting it while indexing data to Solr, not while querying... Though I have enough memory space, up to 40GB, and my indexing data is just 5-6 GB, that particular error is seldom observed... (SEVERE ERROR : JAVA HEAP SPACE , OUT OF MEMORY ERROR ) I could see one lock file generated in the data

Re: OutOfMemoryErrors

2010-08-17 Thread Peter Karich
> Is there a way to verify that I have added correctlly? > on linux you can do ps -elf | grep Boot and see if the java command has the parameters added. @all: why and when do you get those OOMs? while querying? which queries in detail? Regards, Peter.

Re: stream.url problem

2010-08-17 Thread rajini maski
If the connector port number on your localhost is the same as on the other system, then this error is likely. You can change the port number in server.xml on your system or on the other system to make them different. If they are already different, then the other possibility is whether remote access is enabled or not... Rajani Maski
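
Before changing any configuration, it can help to confirm whether anything is accepting TCP connections on the host and port used in the stream.url value. The sketch below uses only the Python standard library; the function name is hypothetical, and the host and port in the usage line are placeholders rather than values from the thread.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Placeholder host and port; substitute the ones from your stream.url.
    print(is_port_open("localhost", 8080))
```

A False result points at the remote service or a firewall rather than at Solr's remote streaming setting.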

Re: stream.url problem

2010-08-17 Thread Tim Terlegård
> hi all, >       i am indexing the documents to solr that are in my system. now i need > to index the files that are in remote system, i enabled the remote streaming > to true in solrconfig.xml and when i use the stream.url it shows the error > as ""connection refused"" and the detail of the error

stream.url problem

2010-08-17 Thread satya swaroop
hi all, i am indexing the documents to solr that are in my system. now i need to index the files that are in remote system, i enabled the remote streaming to true in solrconfig.xml and when i use the stream.url it shows the error as ""connection refused"" and the detail of the error is::: w

Re: Search document design problem

2010-08-17 Thread Wenca
Oops, it seems that the mailing list does not support attachments. Here's a link to the diagram image: http://dl.dropbox.com/u/10214557/model.png Wenca Dne 17.8.2010 11:30, Wenca napsal(a): Hi all, I would like to use Solr to replace our site search based on MySQL but I am not sure how to ma

Search document design problem

2010-08-17 Thread Wenca
Hi all, I would like to use Solr to replace our site search based on MySQL but I am not sure how to map entities into the search index. The model is described by the attached UML class diagram. I have a Hotel that resides in some City in some Country. The hotel has various Rooms. For each R

Re: OutOfMemoryErrors

2010-08-17 Thread Grijesh.singh
You can add it like this; it works, I am using it: JAVA_OPTS="$JAVA_OPTS -Xms1024m -Xmx4096m " -- View this message in context: http://lucene.472066.n3.nabble.com/OutOfMemoryErrors-tp1181731p1183229.html

Re: OutOfMemoryErrors

2010-08-17 Thread Chamnap Chhorn
Is there a way to verify that I have added it correctly? On Tue, Aug 17, 2010 at 2:41 PM, Chamnap Chhorn wrote: > Should I add this line with double quote or not? because if I don't, it > doesn't work at all in my /etc/init.d/tomcat6. > > > export CATALINA_OPTS="-Xms256m -Xmx1024m"; > > On Tue, Aug
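
One way to check is to look at the command line of the running java process (for example as printed by `ps`) and see whether the heap flags are present. The sketch below only parses such a command line once you have it as a string; the function name and the sample command line are made up for illustration.

```python
import re

# Matches -Xms / -Xmx followed by a size such as 256m or 4g.
HEAP_FLAG = re.compile(r"-(Xms|Xmx)(\d+[kKmMgG]?)")


def heap_flags(command_line):
    """Return the -Xms/-Xmx values found in a java command line."""
    return dict(HEAP_FLAG.findall(command_line))


if __name__ == "__main__":
    sample = "/usr/bin/java -Xms256m -Xmx1024m -classpath bootstrap.jar Bootstrap start"
    print(heap_flags(sample))
```

An empty result suggests the options never reached the JVM, for instance because the variable was set after the process had already started.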

Re: OutOfMemoryErrors

2010-08-17 Thread Chamnap Chhorn
Should I add this line with double quotes or not? If I don't, it doesn't work at all in my /etc/init.d/tomcat6. export CATALINA_OPTS="-Xms256m -Xmx1024m"; On Tue, Aug 17, 2010 at 1:36 PM, Grijesh.singh wrote: > > put that line in your startup script or u can set as env var > export CATALI

Re: maxMergeDocs and performance tuning

2010-08-17 Thread Andrew Clegg
Okay, thanks Marc. I don't really have any complaints about performance (yet!) but I'm still wondering how the mechanics work, e.g. when you have a number of segments equal to mergeFactor, and each contains maxMergeDocs documents. The docs are a bit fuzzy on this... -- View this message in conte