Re: sort by function
Can you please do some math to show the principle? Do you want to do something like this:

finalScore = score * rank
finalScore = rank

???

If the first is the case, then it is supported out of the box (have a look at the wiki example for making more recent documents more relevant). If the second is the case, then I would say you need a new sort function (I have never implemented something like that).

Hope this helps
- Mitch

-- View this message in context: http://lucene.472066.n3.nabble.com/sort-by-function-tp814380p821239.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: sort by function
Forget what I said about the second case. The second case is a simple sort on your field.
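To make the two cases in this thread concrete, here is a rough sketch of the request parameters each one might use. It assumes a numeric field named `rank` (taken from the formulas in the thread), a placeholder Solr URL, and the `{!boost}` query parser, which multiplies the relevance score by a function value. Treat it as an illustration under those assumptions, not a tested configuration.

```python
from urllib.parse import urlencode

# Assumed placeholder; point this at your own Solr instance.
SOLR_SELECT = "http://localhost:8983/solr/select"


def boosted_query_url(user_query, field="rank"):
    """Case 1: finalScore = score * rank, via the {!boost} query parser."""
    params = {"q": "{!boost b=%s}%s" % (field, user_query), "fl": "*,score"}
    return SOLR_SELECT + "?" + urlencode(params)


def field_sort_url(user_query, field="rank"):
    """Case 2: finalScore = rank, which is just a plain sort on the field."""
    params = {"q": user_query, "sort": "%s desc" % field}
    return SOLR_SELECT + "?" + urlencode(params)


print(boosted_query_url("ipod"))
print(field_sort_url("ipod"))
```

The first URL keeps relevance in play and scales it by the field value; the second ignores relevance and orders purely by the field.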
Re: Connection Pool
Sorry for hijacking the thread, but I have an additional question. Is there a way to achieve similar performance (SUSS-like) when targeting the extract request handler (/update/extract)? I guess one way would be to extract the content on the client side and then use SUSS to send the update requests, but then extraction needs to be taken care of locally in an asynchronous/batch manner.

Regards
Monmohan

On Sun, May 16, 2010 at 5:19 AM, Lance Norskog wrote:
> Connection pooling is specified by the underlying Apache Commons
> connection manager when you create the Server.
>
> The SUSS does socket pooling by default and is the preferred way to do
> concurrent indexing. There are some quirks in the Server
> implementation set, and SUSS avoids them. Unless you are willing to
> root around in the SolrJ Server code and understand exactly how it
> works, stay with the SUSS.
>
> On Fri, May 14, 2010 at 6:44 AM, gabriele renzi wrote:
> > On Fri, May 14, 2010 at 3:35 PM, Anderson vasconcelos wrote:
> >> Hi
> >> I want to know if there is any connection pool client to manage the
> >> connections with Solr. In my system, we have a lot of concurrent index
> >> requests. I can't share my connection; I need to create one per
> >> transaction. But if I create one per transaction, I think performance
> >> will go down.
> >>
> >> How do you resolve this problem?
> >
> > The CommonsHttpSolrServer class does connection pooling, and IIRC also
> > the StreamingUpdateSolrServer.
> >
> > --
> > blog en: http://www.riffraff.info
> > blog it: http://riffraff.blogsome.com
>
> --
> Lance Norskog
> goks...@gmail.com
match to non tokenizable word ("helloworld")
I get no match when searching for "helloworld", even though I have "hello world" in my index. How do people usually deal with this? Write a custom analyzer, with help from a collection of all dictionary words? thanks for suggestions/comments.
Re: bi-directional replication on solr 1.4?
On 5/14/10 8:08 PM, Chris Hostetter wrote:
> : It looks like SnapPuller.java doesn't allow for the possibility of the
> : slave having a later index version than the master. It only checks
> : whether the versions are equal.
> :
> : It's easy enough to add that check and prevent the index fetch when
> : the slave has a later version (in fact I'm running it in a sandbox
>
> I'm not 100% positive, but I believe a change like that could cause
> problems if the index on the master is completely rebuilt from scratch.
> indexVersion is guaranteed to increase as the index is modified (ie: add
> or merge segments), but I think an entirely new index (ie: delete the
> entire index directory, as deleteByQuery("*:*") does, and then reindex)
> could conceivably result in a new index with a lower indexVersion number
> than the index it replaces.

I think you should be good because the version starts as the current time in milliseconds? (and then is incremented by 1 on every commit thereafter)

> Yonik / Miller: does the SolrCloud branch already have support for master
> failover in a situation like this (ie: a two node "cloud") ?

No - no master/failover support in SolrCloud yet. We haven't integrated at all with replication yet.

> -Hoss

--
- Mark
http://www.lucidimagination.com
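The version argument in this message can be sketched in a few lines. This is not SnapPuller's actual code; the function names are made up for illustration, and the sketch only assumes what the thread states: a new index starts with a version equal to the current time in milliseconds, and each commit adds 1.

```python
import time


def new_index_version(now_ms=None):
    """Version of a freshly created index: current time in ms (per the thread)."""
    return int(time.time() * 1000) if now_ms is None else now_ms


def after_commits(version, commits):
    """Each commit bumps the version by 1 (per the thread)."""
    return version + commits


def should_fetch(master_version, slave_version):
    """Proposed check: fetch only when the master is strictly newer."""
    return master_version > slave_version


# Old index created at t=1_000_000 ms, then committed 500 times.
old = after_commits(new_index_version(1_000_000), 500)
# Master wiped and rebuilt one minute later, with a single commit.
rebuilt = after_commits(new_index_version(1_060_000), 1)

print(should_fetch(rebuilt, old))  # rebuilt master vs. slave holding the old index
print(should_fetch(old, rebuilt))  # the reverse situation
```

Under these assumptions, a rebuilt index would only end up with a lower version if the old index had accumulated more commits than milliseconds had elapsed before the rebuild, or if the clock moved backwards.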
Re: match to non tokenizable word ("helloworld")
You might want to look at ngrams and/or shingles. In this case I suspect that ngrams are better suited; I don't think shingles apply in the direction you stated, but your problem description is so short that I thought I'd mention them. Your collection of words can also work (think synonyms) if you have a pre-determined, probably small, list of equivalencies...

Best
Erick

On Sun, May 16, 2010 at 12:58 PM, siping liu wrote:
> I get no match when searching for "helloworld", even though I have "hello
> world" in my index. How do people usually deal with this? Write a custom
> analyzer, with help from a collection of all dictionary words?
>
> thanks for suggestions/comments.
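For readers unfamiliar with the two terms, here is a small sketch of what character n-grams and word shingles look like for this example. These are plain Python stand-ins written for illustration, not Solr's n-gram or shingle token filters, and the gram size of 3 is an arbitrary assumption.

```python
def char_ngrams(token, n):
    """All contiguous character n-grams of a token."""
    return [token[i:i + n] for i in range(len(token) - n + 1)]


def shingles(tokens, size, sep=" "):
    """Word shingles: runs of `size` adjacent tokens joined by `sep`."""
    return [sep.join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


indexed = "hello world".split()  # tokens as a whitespace tokenizer would see them
query = "helloworld"             # the untokenizable query term

indexed_grams = {g for tok in indexed for g in char_ngrams(tok, 3)}
query_grams = set(char_ngrams(query, 3))

# Trigrams shared between the indexed tokens and the query term.
print(sorted(indexed_grams & query_grams))
# Word-level shingles of the indexed text.
print(shingles(indexed, 2))
```

The overlap in trigrams is what lets an n-gram-based field match "helloworld" against "hello world", at the cost of a larger index and looser matching.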
Re: date slider
> Now I also want to offer a slider to define the range to
> include in the result set. However here I do not want to do
> faceting, instead I just want to find out the min and max
> date values in the result (without any of the facet filters
> applied) so I know the start and end points for the slider.
> The user can then move the sliders to further filter the
> result set.
>
> How can I best go about fetching just those min and max
> values, ideally without having to add a separate query just
> for this?

http://wiki.apache.org/solr/StatsComponent can give you min and max values. Since it calculates additional statistics, I am not sure which is faster: using the stats component, or fetching min and max separately with two queries:

q=query&start=0&rows=1&fl=date&sort=date asc
q=query&start=0&rows=1&fl=date&sort=date desc
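The two sorted one-row queries can be generated with a small helper like the one below. This is only a sketch: the Solr URL is a placeholder, and the helper name `edge_query_url` is made up for illustration; the parameters themselves are the ones from the message.

```python
from urllib.parse import urlencode

SOLR_SELECT = "http://localhost:8983/solr/select"  # assumed placeholder URL


def edge_query_url(user_query, field, direction):
    """Build a one-row query sorted on `field`; direction is 'asc' or 'desc'."""
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    params = {
        "q": user_query,
        "start": 0,
        "rows": 1,
        "fl": field,
        "sort": "%s %s" % (field, direction),
    }
    return SOLR_SELECT + "?" + urlencode(params)


min_url = edge_query_url("query", "date", "asc")   # first row holds the earliest date
max_url = edge_query_url("query", "date", "desc")  # first row holds the latest date
print(min_url)
print(max_url)
```

Each request returns a single document, so the extra cost is two small round trips rather than one larger statistics computation.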
Connection Reset Errors on a Distributed Index
Hello, For reference, I've posted about this before (but have new information now): http://lucene.472066.n3.nabble.com/Connection-reset-errors-during-commits-optimize-td484058.html#a484058 and have seen other similar posts as well: http://lucene.472066.n3.nabble.com/Question-on-Solr-Distributed-Search-td495188.html#a495191 During the aftermath of commits on a distributed index (3 shards, about 3M documents each with many many facets), I'm getting ConnectionReset errors (see below for the full trace). The place in the code where it happens is where the 'master server' is waiting on results from the other shards. I've been combing through the logs of all the shards at the time of the exceptions and have noticed that every exception is thrown on a search which does not appear in one of the other shard's logs. In addition, the shard which doesn't record the search is usually busy warming up uninverted fields for facet searches, via a newSearcher query, at the time. These exceptions are problematic because they tend to slow the server down to a crawl, sometimes permanently. Does anyone have any advice for me on how to proceed? Is it possible that a shard would stop responding while uninverting fields? 
Thanks,
-Harish

Trace:

SEVERE: org.apache.solr.common.SolrException: org.apache.solr.client.solrj.SolrServerException: java.net.SocketException: Connection reset
	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:282)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
	at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
	at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
	at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
	at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.solr.client.solrj.SolrServerException: java.net.SocketException: Connection reset
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:472)
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
	at org.apache.solr.handler.component.HttpCommComponent$1.call(SearchHandler.java:422)
	at org.apache.solr.handler.component.HttpCommComponent$1.call(SearchHandler.java:394)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
	... 1 more
Caused by: java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(SocketInputStream.java:168)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
	at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78)
	at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106)
	at org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1116)
	at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1413)
	at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1973)
	at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1735)
	at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098)
	at org.apache.commons.httpc
Re: date slider
> http://wiki.apache.org/solr/StatsComponent can give you
> min and max values.

Sorry, my bad: I just tested StatsComponent with a tdate field, and it does not work for date-typed fields. The wiki says it is for numeric fields.
Re: date slider
On 16.05.2010, at 21:01, Ahmet Arslan wrote:
>> http://wiki.apache.org/solr/StatsComponent can give you
>> min and max values.
>
> Sorry my bad, I just tested StatsComponent with tdate field. And it is
> not working for date typed fields. Wiki says it is for numeric fields.

OK, thanks for checking. Is my use case really so unusual? I guess I could store a Unix timestamp, or I could just use a fixed range. Hmm, if I use facets with a really large gap, will it always give me at least the min and max, maybe? I will try it out when I get home.

Regards
Lukas
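The Unix-timestamp idea mentioned here could look roughly like the sketch below: store the date a second time as a plain number, then ask StatsComponent for statistics on that numeric field. The field name `date_ts` is hypothetical, and whether this fits a given schema is an assumption; `stats` and `stats.field` are the StatsComponent request parameters.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def to_unix_seconds(dt):
    """Convert a timezone-aware datetime to whole seconds since the epoch."""
    return int(dt.timestamp())


def stats_params(user_query, numeric_field):
    """Request only statistics (rows=0) for one numeric field."""
    return urlencode({
        "q": user_query,
        "rows": 0,
        "stats": "true",
        "stats.field": numeric_field,
    })


# `date_ts` is a hypothetical numeric field holding the timestamp.
doc = {"id": "1", "date_ts": to_unix_seconds(datetime(2010, 5, 16, tzinfo=timezone.utc))}
print(doc)
print(stats_params("query", "date_ts"))
```

The min and max in the stats response would then be epoch seconds, which the slider code has to convert back to dates on the client side.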
RE: Solr Deployment Question
They are two web applications running on a single Tomcat instance.

Thanks
Madu

-Original Message-
From: findbestopensource [mailto:findbestopensou...@gmail.com]
Sent: Friday, 14 May 2010 4:38 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Deployment Question

Please explain how you have handled two indexes in a single VM. Is it multi core?

To identify memory consumption, you need to calculate used memory before and after loading the indexes; basically, calculate used memory before and after any checkpoint you want to analyse. The difference will give you the actual memory consumption.

Regards
Aditya
http://www.findbestopensource.com

On Fri, May 14, 2010 at 11:14 AM, Maduranga Kannangara <mkannang...@infomedia.com.au> wrote:
> But even when we used a single index, we were running out of memory.
> What do you mean by "active"? No queries on the masters.
> Only one index is being processed/optimized.
>
> Also, if I may add to my same question, how can I find the
> amount of memory that an index would use, theoretically?
> i.e.: Is there a formula etc?
>
> Thanks
> Madu
>
> -Original Message-
> From: findbestopensource [mailto:findbestopensou...@gmail.com]
> Sent: Friday, 14 May 2010 3:34 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Deployment Question
>
> You may use one index at a time, but both indexes are active and have
> loaded all their terms in memory. Memory consumption will certainly be higher.
>
> Regards
> Aditya
> http://www.findbestopensource.com
>
> On Fri, May 14, 2010 at 10:28 AM, Maduranga Kannangara <mkannang...@infomedia.com.au> wrote:
> > Hi
> >
> > We use separate JVMs to index and query.
> > (Client applications will query only slaves,
> > while master does only indexing)
> >
> > Recently we moved two master indexes to
> > a single JVM. Our memory allocation
> > for each index was 512Mb and 1Gb.
> >
> > Once we moved both indexes to a single VM,
> > we thought it would still index using 1Gb as we
> > use only one index at a time. But to our surprise
> > it needed more than that (1.2Gb) even though
> > only one index was used at a time.
> >
> > Can I know why, or can I know how to find
> > why this is?
> >
> > Solr 1.4
> > Java 1.6.0_20
> >
> > We use a VPS for deployment.
> >
> > Thanks in advance
> > Madu
Solr Search problem; cannot search the existing word in the index content
Hi, I have recently been working on an index/search project and found Solr, which is very fascinating to me. I followed the tutorial page successfully: starting up Jetty and adding the example XML files (user:~/solr/example/exampledocs$ java -jar post.jar *.xml). So far so good. Now I have created my own test file, westpac.xml, with the real data I intend to use, put it in exampledocs, and ran the command again (user:~/solr/example/exampledocs$ java -jar post.jar westpac.xml). Everything went well; however, when I searched for "rhode", which is in the content, the index returned nothing. Could anyone point out what I did wrong, and why I can't find that word even though it is in my indexed content? Thanks, Mint