Re: SORTING RESULTS BASED ON RELEVANCY
The default sort is by relevancy. So, if you are getting results in the wrong order, I think Solr is judging relevance differently than you expect. Depending on the algorithm you use, there are different boosting functions. You may need to give more details: which algorithm, how you would know if relevance sorting is working, etc. Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Wed, Sep 18, 2013 at 1:50 PM, PAVAN wrote: > Hi, > > I am using fuzzy logic and it is giving exact results, but I need to > sort the results based on relevancy, meaning closer matches come > first. > > Can anyone help with this? > > Regards, > Pavan. > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/SORTING-RESULTS-BASED-ON-RELAVANCY-tp4090789.html > Sent from the Solr - User mailing list archive at Nabble.com. >
Re: Re-Ranking results based on DocValues with custom function.
Got it! Just to share ... and maybe for inclusion in the Java API docs of ValueSource :) For sorting one needs to implement the method public double doubleVal(int) of the class ValueSource; then it works like a charm. cheers, Mathias On Tue, Sep 17, 2013 at 6:28 PM, Chris Hostetter wrote: > > : It basically allows for searching for text (which is associated to an > : image) in an index and then getting the distance to a sample image > : (base64 encoded byte[] array) based on one of five different low level > : content based features stored as DocValues. > > very cool. > > : So there is one little tiny question I still have ;) When I'm trying to > : do a "sort" I'm getting > : > : "msg": "sort param could not be parsed as a query, and is not a field > : that exists in the index: > : lirefunc(cl_hi,FQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA=)", > : > : for the call > http://localhost:9000/solr/lire/select?q=*%3A*&sort=lirefunc(cl_hi%2CFQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA%3D)+asc&fl=id%2Ctitle%2Clirefunc(cl_hi%2CFQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA%3D)&wt=json&indent=true > > Hmmm... > > I think the crux of the issue is your string literal. Function parsing > tries to make life easy for you by not requiring string literals to be > quoted unless they conflict with other function names or field names > etc. On top of that, the sort parsing code is kind of heuristic-based > (because it has to account for both functions or field names or wildcards, > followed by other sort clauses, etc...), so in that context the special > characters like '=' in your base64 string literal might be confusing the > heuristics. > > Can you try to quote the string literal and see if that works? > > For example, when I try using strdist with your base64 string in a sort > param using the example configs I get the same error... > > http://localhost:8983/solr/select?q=*:*&sort=strdist%28name,FQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA=,jw%29+asc > > but if I quote the string literal it works fine... > > http://localhost:8983/solr/select?q=*:*&sort=strdist%28name,%27FQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA=%27,jw%29+asc > > > > -Hoss -- Dr. Mathias Lux Assistant Professor, Klagenfurt University, Austria http://tinyurl.com/mlux-itec
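For the archives, a minimal sketch of that pattern: in the Lucene/Solr 4.x API the per-document double actually lives on the FunctionValues object that your ValueSource returns from getValues(), and implementing doubleVal(int) there is what makes sort=lirefunc(...) asc work once the function parses. The class and field names below are hypothetical and the actual distance computation is elided:

    import java.io.IOException;
    import java.util.Map;

    import org.apache.lucene.index.AtomicReaderContext;
    import org.apache.lucene.queries.function.FunctionValues;
    import org.apache.lucene.queries.function.ValueSource;
    import org.apache.lucene.queries.function.docvalues.DoubleDocValues;

    // Hypothetical ValueSource behind a lirefunc-style function: sortable
    // because it exposes a per-document double via doubleVal(int).
    public class ImageDistanceValueSource extends ValueSource {

      private final String field;   // DocValues field holding the image feature
      private final byte[] sample;  // decoded base64 sample feature

      public ImageDistanceValueSource(String field, byte[] sample) {
        this.field = field;
        this.sample = sample;
      }

      @Override
      public FunctionValues getValues(Map context, AtomicReaderContext readerContext) throws IOException {
        return new DoubleDocValues(this) {
          @Override
          public double doubleVal(int doc) {
            // look up this document's stored feature and compute its
            // distance to the sample (elided)
            return 0.0d;
          }
        };
      }

      @Override
      public String description() {
        return "distance(" + field + ", sample)";
      }

      @Override
      public boolean equals(Object o) {
        return o instanceof ImageDistanceValueSource
            && field.equals(((ImageDistanceValueSource) o).field);
      }

      @Override
      public int hashCode() {
        return field.hashCode();
      }
    }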
Re: Stop zookeeper from batch
Yeah, but it's not yet in ZooKeeper's latest releases. Is it fine to use it? On Wed, Sep 18, 2013 at 2:39 AM, Furkan KAMACI wrote: > Are you looking for that: > > https://issues.apache.org/jira/browse/ZOOKEEPER-1122 > > On Monday, 16 September 2013, Prasi S wrote: > > Hi, > > We have set up SolrCloud with ZooKeeper and 2 Tomcats. We are using a batch > > file to start ZooKeeper, upload config files and start the Tomcats. > > > > Now, I need to stop ZooKeeper from the batch file. How is this possible? > > > > I'm using Windows Server, ZooKeeper version 3.4.5. > > > > Please help. > > > > Thanks, > > Prasi > > >
Re: Facet with " " values are displayed in output
Filter them out in your query, or in your display code. Upayavira On Wed, Sep 18, 2013, at 06:36 AM, Prasi S wrote: > Hi, > I'm using Solr 4.4 for our search. When I query for a keyword, it returns > empty-valued facets in the response > > > > > > *1* > 1 > > > > > > > I have also tried using the facet.missing parameter, but no change. How can > we > handle this? > > > Thanks, > Prasi
Re: SORTING RESULTS BASED ON RELEVANCY
On 18 September 2013 12:39, Alexandre Rafalovitch wrote: > The default sort is by relevancy. So, if you are getting results in the wrong > order, I think Solr is judging relevance differently than you expect. Depending on the algorithm > you use, there are different boosting functions. [...] Also, you can get an explanation of the scoring by adding &debugQuery=on to the Solr search URL. Please see http://wiki.apache.org/solr/CommonQueryParameters#debugQuery Regards, Gora
Re: Solr SpellCheckComponent only shows results with certain fields
What about this query? Try it to see if you get suggestions here: /solr/collection1/select?q=*%3Abecaus&wt=json&indent=true&spellcheck=true On Wed, Sep 18, 2013 at 4:02 AM, jazzy wrote:
> I'm trying to get the Solr SpellCheckComponent working but am running into
> some issues. When I run
> .../solr/collection1/select?q=*%3A*&wt=json&indent=true
>
> these results are returned:
>
> {
>   "responseHeader": {
>     "status": 0,
>     "QTime": 1,
>     "params": {
>       "indent": "true",
>       "q": "*:*",
>       "_": "1379457032534",
>       "wt": "json"
>     }
>   },
>   "response": {
>     "numFound": 2,
>     "start": 0,
>     "docs": [
>       {
>         "enterprise_name": "because",
>         "name": "doc1",
>         "enterprise_id": "100",
>         "_version_": 1446463888248799200
>       },
>       {
>         "enterprise_name": "what",
>         "name": "RZTEST",
>         "enterprise_id": "102",
>         "_version_": 1446464432735518700
>       }
>     ]
>   }
> }
>
> Those are the values that I have indexed. Now when I want to query for
> spelling I get some weird results.
>
> When I run
>
> .../solr/collection1/select?q=name%3Arxtest&wt=json&indent=true&spellcheck=true
>
> the results are accurate and I get
>
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":4,
>     "params":{
>       "spellcheck":"true",
>       "indent":"true",
>       "q":"name:rxtest",
>       "wt":"json"}},
>   "response":{"numFound":0,"start":0,"docs":[]
>   },
>   "spellcheck":{
>     "suggestions":[
>       "rxtest",{
>         "numFound":1,
>         "startOffset":5,
>         "endOffset":11,
>         "suggestion":["rztest"]}]}}
>
> Anytime I run a query without the name values I get 0 results back.
>
> /solr/collection1/select?q=enterprise_name%3Abecaus&wt=json&indent=true&spellcheck=true
>
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":5,
>     "params":{
>       "spellcheck":"true",
>       "indent":"true",
>       "q":"enterprise_name:becaus",
>       "wt":"json"}},
>   "response":{"numFound":0,"start":0,"docs":[]
>   },
>   "spellcheck":{
>     "suggestions":[]}}
>
> My guess is that there is something wrong in my schema but everything looks
> fine.
>
> Schema.xml
>
> required="true" />
> stored="true"/>
> multiValued="true" />
> stored="true"/>
> stored="true" multiValued="true"/>
> stored="true" multiValued="true"/>
>
> <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> solrconfig.xml
>
> <requestHandler name="/select" class="solr.SearchHandler">
>   <lst name="defaults">
>     <str name="echoParams">explicit</str>
>     <int name="rows">10</int>
>     <str name="df">text</str>
>     <str name="spellcheck.dictionary">default</str>
>     <str name="spellcheck.dictionary">wordbreak</str>
>     <str name="spellcheck.onlyMorePopular">false</str>
>     <str name="spellcheck.extendedResults">false</str>
>     <str name="spellcheck.count">5</str>
>   </lst>
>   <arr name="last-components">
>     <str>spellcheck</str>
>   </arr>
> </requestHandler>
>
> <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
>   <lst name="spellchecker">
>     <str name="name">default</str>
>     <str name="classname">solr.IndexBasedSpellChecker</str>
>     <str name="field">name</str>
>     <str name="spellcheckIndexDir">./spellchecker</str>
>     <float name="accuracy">0.5</float>
>     <float name="thresholdTokenFrequency">.0001</float>
>     <str name="buildOnCommit">true</str>
>   </lst>
>   <lst name="spellchecker">
>     <str name="name">wordbreak</str>
>     <str name="classname">solr.WordBreakSolrSpellChecker</str>
>     <str name="field">name</str>
>     <str name="combineWords">true</str>
>     <str name="breakWords">true</str>
>     <int name="maxChanges">3</int>
>     <str name="buildOnCommit">true</str>
>   </lst>
>   <str name="queryAnalyzerFieldType">text_general</str>
> </searchComponent>
>
> Any help would be appreciated.
> Thanks!
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-SpellCheckComponent-only-shows-results-with-certain-fields-tp4090727.html
> Sent from the Solr - User mailing list archive at Nabble.com.
-- Regards, Raheel Hasan
Re: Facet with " " values are displayed in output
How do I filter them out in the query itself? Thanks, Prasi On Wed, Sep 18, 2013 at 1:06 PM, Upayavira wrote: > Filter them out in your query, or in your display code. > > Upayavira > > On Wed, Sep 18, 2013, at 06:36 AM, Prasi S wrote: > > Hi, > > I'm using Solr 4.4 for our search. When I query for a keyword, it returns > > empty-valued facets in the response > > > > > > > > > > > > *1* > > 1 > > > > > > > > > > > > > > I have also tried using the facet.missing parameter, but no change. How can > > we > > handle this? > > > > > > Thanks, > > Prasi >
Re: Facet with " " values are displayed in output
This is likely because you added an empty value to the Country field for one document (in that result set). I imagine this is a data issue and that you either need to clean up the data or avoid indexing blank values. Erik On Sep 18, 2013, at 1:36 AM, Prasi S wrote: > Hi, > I'm using Solr 4.4 for our search. When I query for a keyword, it returns > empty-valued facets in the response > > > > > > *1* > 1 > > > > > > > I have also tried using the facet.missing parameter, but no change. How can we > handle this? > > > Thanks, > Prasi
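If fixing this at index time is an option, one sketch of the "avoid indexing blank values" route, using the stock Solr 4.x update processors (the chain name is arbitrary, and you would reference it via the update.chain parameter on your update handler):

    <updateRequestProcessorChain name="strip-blanks">
      <processor class="solr.TrimFieldUpdateProcessorFactory"/>
      <processor class="solr.RemoveBlankFieldUpdateProcessorFactory"/>
      <processor class="solr.LogUpdateProcessorFactory"/>
      <processor class="solr.RunUpdateProcessorFactory"/>
    </updateRequestProcessorChain>

The trim step reduces whitespace-only values (like the single space above) to empty strings, and the next step then drops the empty fields before they are indexed.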
Generating similar (related) searches a la Google
I am using Apache Solr 3.6. I have been playing around with the idea of providing a "similar" search in the same way Google provides a link against some results, with the ability to search for pages similar to the current result: E.g. related:lucene.apache.org/solr/ apache solr One method I tried was to use MoreLikeThis on my title field to generate a list of results: ?q=experiment&fl=key,id,title&fq=view:item&bf=title^100 dc.description.abstract_sm^50&mlt=true&mlt.fl=title which gives me moreLikeThis results. If an item has a matching moreLikeThis result with numFound not equal to zero, I can go ahead and link to a new query using my /mlt request handler, using a unique item key and the keyword to build the query: q=key:com_jspace.item.96 AND experiment&fl=title&mlt.fl=title&start=0&rows=10&mlt.interestingTerms=details This works well, providing me with paging, etc., but one downside is the inability to highlight results with the keyword "experiment". It is my understanding that highlighting is not available as part of the mlt request handler, so I'm wondering if there is another way to generate my search results for items related to another item? Or perhaps I'm approaching this all wrong. Any direction, even "you can't do that", much appreciated. Cheers Hayden
Solr Cloud dataimport freezes
Hi guys, I have a problem with data import (based on SQL database queries) in Solr Cloud. I'm trying to import ~500,000,000 documents and I've created 30 logical shards on 2 physical machines. Documents are distributed by composite id. After some time (5-10 minutes; about 400,000 documents) Solr Cloud stops indexing documents. This is because the indexing thread parks and waits on a semaphore: org.apache.solr.update.SolrCmdDistributor#semaphore.acquire() in method submit. While indexing I see jdbc calls in the stack trace, but after it parks on the semaphore I don't see any jdbc calls (I see only Solr and JDK method calls). Version of Solr: 4.4 Version of Lucene: 4.4 *With one shard and one physical machine everything is OK* *With one shard and two physical machines (one leader, one replica) everything is OK* This is a really big problem for us because with the large number of documents we have to shard the index. We have unique queries with sorting, so without sharding it leads to 1-minute-long response times. Best, Kowish -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Cloud-dataimport-freezes-tp4090812.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Cloud dataimport freezes
Update: - it works for 8 shards. I'm going to test it on 16 shards. Any ideas what is going on? :-) -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Cloud-dataimport-freezes-tp4090812p4090832.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Problem indexing windows files
Thanks for your help. I tried to look at the logs but didn't see anything in the Solr or ManifoldCF log files. I don't know where the Tika log file is. I downloaded the Solr 4.4 binary and I am using the example in there. On Wed, Sep 18, 2013 at 12:02 AM, Furkan KAMACI wrote: > Firstly; > > This may not be a Solr-related problem. Did you check the log file of Solr? > Tika may have problems in some situations. For example, > when parsing HTML that has a base64 encoded image it may have some > problems. If you find the correct logs you can detect it. On the other hand, take > care of Manifold; there may be some problem there too. > > On Tuesday, 17 September 2013, Yossi Nachum wrote: > > Hi, > > > > I am trying to index my Windows PC files with ManifoldCF version 1.3 and > > Solr version 4.4. > > > > I created an output connection and a repository connection and started a new job > > that scans my E drive. > > > > Everything seems to work OK, but after a few minutes Solr stops getting > > new files to index. I am seeing that through the Tomcat log file. > > > > On the ManifoldCF crawler UI I see that the job is still running, but after a few > > minutes I am getting the following error: > > "Error: Repeated service interruptions - failure processing document: Read > > timed out" > > > > I am seeing that the Tomcat process constantly consumes 100% of one CPU (I > > have two CPUs), even after I get the error message from the ManifoldCF crawler > > UI. > > > > I checked the thread dump in the Solr admin and saw that the following threads > > take the most CPU/user time:
> > "
> > http-8080-3 (32)
> >
> >- java.io.FileInputStream.readBytes(Native Method)
> >- java.io.FileInputStream.read(FileInputStream.java:236)
> >- java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
> >- java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
> >- java.io.BufferedInputStream.read(BufferedInputStream.java:334)
> >- org.apache.tika.io.ProxyInputStream.read(ProxyInputStream.java:99)
> >- java.io.FilterInputStream.read(FilterInputStream.java:133)
> >- org.apache.tika.io.TailStream.read(TailStream.java:117)
> >- org.apache.tika.io.TailStream.skip(TailStream.java:140)
> >- org.apache.tika.parser.mp3.MpegStream.skipStream(MpegStream.java:283)
> >- org.apache.tika.parser.mp3.MpegStream.skipFrame(MpegStream.java:160)
> >- org.apache.tika.parser.mp3.Mp3Parser.getAllTagHandlers(Mp3Parser.java:193)
> >- org.apache.tika.parser.mp3.Mp3Parser.parse(Mp3Parser.java:71)
> >- org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> >- org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> >- org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
> >- org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
> >- org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
> >- org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> >- org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241)
> >- org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
> >- org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
> >- org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
> >- org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
> >- org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> >- org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> >- org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> >- org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> >- org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
> >- org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> >- org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> >- org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
> >- org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
> >- org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
> >- org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
> >- java.lang.Thread.run(Thread.java:679)
> > "
> > Does anyone know what I can do? How do I debug this issue? How can I check > > which file causes Tika to work so hard? > > I don't see anything in the log files and I
SolrCloud - Service unavailable
Hi All, I am using 3 Solr instances behind an Amazon ELB with 1 shard. Serving data via Solr works as expected; however, I noticed a few times a 503 error was popping up from the applications accessing Solr. Accessing Solr is done via the AWS ELB. 3 ZooKeeper instances also run on the same instances as Solr, on a separate disk. The Solr version is 4.4. This seems to be a sporadic issue. Has anyone else observed this kind of behavior? Thanks, Indika
Installation issue with solr server
Hi, I have installed the Solr server on Ubuntu 12.04 LTS. I am able to access http://machineip:8983/solr, but when I do curl "http://machineip:8983/solr" it gives me a "Proxy authorization error". What can be the problem? Is it due to corporate firewalls? I have given proxy settings in the .bashrc file, /etc/apt/apt.conf and the /etc/environment file and restarted the machine, but it did not work. Regards, Chhaya Vishwakarma
Re: SORTING RESULTS BASED ON RELEVANCY
Hi Alex, Thanks for your reply. Can you please check the following details and give me suggestions on how I can do it? That would be very helpful. I am passing query parameters like http://localhost:8080/solr/core/c=cityname&s=iphne+4&s1=iphne~0.5&s2=4~0.5 where "s" is the main string, split into s1 and s2 for fuzzy matching. If a user searches for "iphne 4", first it has to check for an exact match; if it is not found, then I split the string into two strings s1 and s2 and add ~0.5 to both. I need the "iphone 4" result first, and I did the configuration in the following way... AND fsw_title 15 {!edismax v=$s} city:All OR _query_:"{!field f=city v=$c}" true mpId 3 true edismax false tsw_title^15.0 tf_title^10.0 tsw_keywords^1 keywords^0.5 fsw_title~1^50.0 fsw_title~1^25.0 sum(product(typeId,100),weightage) OR fsw_title 20 _query_:"{!edismax qf=$qfs1 pf=$pfx pf2=$pf2x v=$s1}" AND _query_:"{!edismax qf=$qfs2 v=$s2}" city:All OR _query_:"{!field f=city v=$c}" true mpId 5 true false fsw_title^30 tsw_title^20 tf_title^15.0 keywords^1.0 tsw_title^15.0 tf_title^10.0 tsw_keywords^1 keywords^0.5 fsw_title~1^100.0 fsw_title~1^50.0 fsw_title~1^25.0 product(typeId,100) -- View this message in context: http://lucene.472066.n3.nabble.com/SORTING-RESULTS-BASED-ON-RELAVANCY-tp4090789p4090794.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Installation issue with solr server
On 18 September 2013 16:26, Chhaya Vishwakarma wrote: > Hi, > > I have installed the Solr server on Ubuntu 12.04 LTS. I am able to access > http://machineip:8983/solr, but when I do curl "http://machineip:8983/solr" > it gives me a "Proxy authorization error". > What can be the problem? Is it due to corporate firewalls? > I have given proxy settings in the .bashrc file, /etc/apt/apt.conf and the > /etc/environment file and restarted the machine, but it did not work. This question is off-topic for this list. You would be better off asking on an Ubuntu-specific list. However, please see if this helps: http://askubuntu.com/questions/15719/where-are-the-system-wide-proxy-server-settings Regards, Gora
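One quick check, assuming the proxy should simply be bypassed for the machine's own address: tell curl not to use it, e.g.

    curl --noproxy machineip "http://machineip:8983/solr"

or add machineip to the no_proxy environment variable before running curl.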
Re: Solr Cloud dataimport freezes
On 9/18/2013 3:40 AM, kowish.adamosh wrote: > I have a problem with data import (based on SQL database queries) in Solr Cloud. I'm > trying to import ~500,000,000 documents and I've created 30 logical > shards on 2 physical machines. Documents are distributed by composite id. > After some time (5-10 minutes; about 400,000 documents) Solr Cloud stops > indexing documents. This is because the indexing thread parks and waits on a > semaphore: > org.apache.solr.update.SolrCmdDistributor#semaphore.acquire() in method > submit. There are some SolrCloud bugs that we expect will be fixed in version 4.5. Basically what happens is that when a large number of updates are being distributed from whichever core receives them to the appropriate shard replicas, managing all those requests results in a deadlock. If everything goes well with the release, 4.5 will be out sometime within the next two weeks. You can always download and build the "branches/lucene_solr_4_5" code branch from SVN if you want to try out what will become Solr 4.5: http://wiki.apache.org/solr/HowToContribute#Getting_the_source_code SOLR-4816 is semi-related, because it helps avoid the problem in the first place when using CloudSolrServer in a Java program. I'm having a hard time finding the jira issue number(s) for the underlying problem(s), but I know some changes were committed recently specifically for this problem. Thanks, Shawn
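For reference, getting a runnable example from that branch should look roughly like this (untested here; see the wiki page above for details):

    svn checkout https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_5
    cd lucene_solr_4_5/solr
    ant example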
Re: SolrCloud - Service unavailable
On 9/18/2013 8:12 AM, Indika Tantrigoda wrote: > I am using 3 Solr instances behind an Amazon ELB with 1 shard. Serving > data via Solr works as expected; however, I noticed a few times a 503 error > was popping up from the applications accessing Solr. Accessing Solr is done > via the AWS ELB. > > 3 ZooKeeper instances also run on the same instances as Solr, on a separate > disk. > > The Solr version is 4.4. > > This seems to be a sporadic issue. Has anyone else observed this kind > of behavior? What kind of session timeouts have you configured on the Amazon load balancer? I've never used Amazon services, but hopefully this is configurable. If the timeout is low enough, it could be just that the request is taking longer than that to execute. You may need to increase that timeout. Aside from general performance issues, one thing that can cause long request times is stop-the-world Java garbage collections. This can be a sign that your heap is too small, too large, or that your garbage collection hasn't been properly tuned. http://wiki.apache.org/solr/SolrPerformanceProblems#GC_pause_problems http://wiki.apache.org/solr/SolrPerformanceProblems#How_much_heap_space_do_I_need.3F That same wiki page has another section about the OS disk cache. Not having enough memory for this is the cause of a lot of performance issues: http://wiki.apache.org/solr/SolrPerformanceProblems#OS_Disk_Cache Thanks, Shawn
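For illustration only (the numbers must be tuned for your own heap and hardware, per the wiki pages above), a CMS-based set of startup options along these lines is a common starting point:

    java -Xms4g -Xmx4g -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
         -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled \
         -jar start.jar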
Solrcloud - adding a node as a replica?
Hi, How do I add a node as a replica to a SolrCloud cluster? Here is my situation: some time ago, I created several collections with replicationFactor=2. Now I need to add a new replica. I thought just starting a new node and re-using the same ZooKeeper instance would make it automatically a replica, but that isn't the case. Do I need to delete and re-create my collections with the right replicationFactor (3 in this case) again? I am using Solr 4.3.0. Thanks, didier
Re: Solr SpellCheckComponent only shows results with certain fields
Hey, I figured it out! The reason that only the name field was working was that name was the only field configured in the solrconfig. Once I fixed that, I followed this link to solve the rest of the problem: SOLR suggester multiple field autocomplete -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-SpellCheckComponent-only-shows-results-with-certain-fields-tp4090727p4090891.html Sent from the Solr - User mailing list archive at Nabble.com.
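For anyone who finds this thread later, the usual shape of that fix is to copy every field you want suggestions from into one catch-all field and point the spellchecker at it. A sketch (the catch-all field name "spell" is arbitrary; name and enterprise_name are the fields from the schema quoted earlier in the thread): in schema.xml,

    <field name="spell" type="text_general" indexed="true" stored="false" multiValued="true"/>
    <copyField source="name" dest="spell"/>
    <copyField source="enterprise_name" dest="spell"/>

and in the spellchecker definition in solrconfig.xml:

    <str name="field">spell</str>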
Memory Usage in Faceted Search (UnInvertedFields)
Hello, I'm using Solr 4.3.1 for faceted search and have 4 fields used for faceting. My question is about memory consumption. I've set the heap size to use 6 GB of RAM, but I see in the resource monitor that it uses much more than that - up to 10 GB, where 4 GB is reported as shareable memory. I've calculated the size of the cached set of uninverted fields and it's 2 GB - I'm fine with that; both the GC monitor and the 'fieldValueCache' stats in the 'Plugins/Stats' UI for Solr report that. But I can't understand what that memory is that's being reserved after filling fieldValueCache with uninverted fields (right in the UnInvertedField.uninvert method) and not used (or not released). Is that a memory leak? Or is that something I should tune with the garbage collector by making it more aggressive (GC only shows me 2.x GB in Old Space and I see those UnInvertedField instances there in the heap dump)? Some info: Index size is 76 GB. I have 6 shards there. Windows OS. Java 6.0.24. Best regards, Anton.
Re: Facet with " " values are displayed in output
Any analysis happening on the country field during indexing? If so then facets are on tokens. -- View this message in context: http://lucene.472066.n3.nabble.com/FAcet-with-values-are-displayes-in-output-tp4090777p4090904.html Sent from the Solr - User mailing list archive at Nabble.com.
DIH field defaults or re-assigning field values
Hi All, I'm using the DataImportHandler to import documents to my index. I assign one of my document's fields by using a sub-entity from the root to look for a value in a file. I've got this part working. If the value isn't in the file or the file doesn't exist I'd like the field to be assigned a default value. Is there a way to do this? I think I'm looking for a way to re-assign the value of a field. If this is possible then I can assign the default value in the root entity and overwrite it if the value is found in the sub-entity. Ideas? Thanks, Tricia
Solr 4.4.0: Plugin init failure for [schema.xml] analyzer/tokenizer
Hello Experts, I am having trouble upgrading from Solr 3.6 to Solr 4.4.0. I have placed the required jars in the "lib" directory. When I start the Tomcat instance it throws the following error. I have also pasted part of the "conf/schema.xml" file. Solr 4.4.0 works perfectly if I comment out the following lines. Schema.xml: Error log:
575 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.schema.IndexSchema - Reading Solr Schema from schema.xml
583 [coreLoadExecutor-3-thread-2] INFO org.apache.solr.schema.IndexSchema - [nipTrendHistory] Schema name=NIP
597 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.schema.IndexSchema - [nip] Schema name=NIP
649 [coreLoadExecutor-3-thread-1] ERROR org.apache.solr.core.CoreContainer - Unable to create core: nip
org.apache.solr.common.SolrException: Plugin init failure for [schema.xml] fieldType "delimiterPatternMultiValue": Plugin init failure for [schema.xml] analyzer/tokenizer: class com.mycomp.as.sts.nps.solr.analysis.MultiValueTokenizerFactory
at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:467)
at org.apache.solr.schema.IndexSchema.(IndexSchema.java:164)
at org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)
at org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69)
at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:619)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:657)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:364)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.solr.common.SolrException: Plugin init failure for [schema.xml] analyzer/tokenizer: class com.mycomp.as.sts.nps.solr.analysis.MultiValueTokenizerFactory
at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
at org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:362)
at org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:95)
at org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
... 16 more
Caused by: java.lang.ClassCastException: class com.mycomp.as.sts.nps.solr.analysis.MultiValueTokenizerFactory
at java.lang.Class.asSubclass(Class.java:3018)
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:433)
at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:543)
at org.apache.solr.schema.FieldTypePluginLoader$2.create(FieldTypePluginLoader.java:342)
at org.apache.solr.schema.FieldTypePluginLoader$2.create(FieldTypePluginLoader.java:335)
at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
... 20 more
651 [coreLoadExecutor-3-thread-1] ERROR org.apache.solr.core.CoreContainer - null:org.apache.solr.common.SolrException: Unable to create core: nip
at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1150)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:666)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:364)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Re: Solr 4.4.0: Plugin init failure for [schema.xml] analyzer/tokenizer
: I am having trouble upgrading from Solr 3.6 to Solr 4.4.0. I have : placed the required jars in the "lib" directory. When I start the Tomcat : instance it throws the following error. Also pasted part of the : "conf/schema.xml" file. Which jars exactly? Where did you get the jars from? What version of Solr were your custom jars compiled against? : Caused by: java.lang.ClassCastException: class com.mycomp.as.sts.nps.solr.analysis.MultiValueTokenizerFactory According to that, your "MultiValueTokenizerFactory" can not be cast to the expected type -- this might be because of a classloader problem (ie: you may have multiple instances of the MultiValueTokenizerFactory class loaded that are confusing things, or multiple instances of the lucene TokenizerFactory, etc...). Alternatively it might be because your MultiValueTokenizerFactory was compiled against the wrong version of Solr (but IIRC that should generate a different error ... off the top of my head I'm not certain). -Hoss
Re: SORTING RESULTS BASED ON RELEVANCY
: Thanks for your reply. Can you please check the following details and : give me suggestions on how I can do it? That would be very helpful. You need to show us some examples of your documents, and the debugQuery=true output for those documents, for us to better understand the behavior you are seeing. Unless I'm missing something: FuzzyQuery defaults to using the "TopTermsScoringBooleanQueryRewrite" method based on the terms found in the index that match the fuzzy expression. So the results of a simple fuzzy query should already come back based on the tf/idf scores of the terms. : If a user searches for "iphne 4", first it has to check for an exact match; if it is not : found, then I split the string into two strings s1 and s2 and add : ~0.5 to both. : I need the "iphone 4" result first Ok, but you haven't given us any indication of what you are *actually* getting as your first result, so we really can't even begin to guess why you aren't getting the results you expect. If you are seeing identical scores for all documents, then it's possibly because of some of the other ways you have combined custom params to build a complex query. In particular: nowhere in your example URL, or the configured defaults you pasted below, do you show us how you are ultimately building up the "q" param from the various custom params you have defined... : and I did the configuration in the following way... : : : AND :fsw_title :15 :{!edismax v=$s} : :city:All OR _query_:"{!field f=city v=$c}" :true :mpId :3 :true : :edismax :false :tsw_title^15.0 tf_title^10.0 tsw_keywords^1 : keywords^0.5 :fsw_title~1^50.0 :fsw_title~1^25.0 :sum(product(typeId,100),weightage) : : OR :fsw_title :20 :_query_:"{!edismax qf=$qfs1 pf=$pfx pf2=$pf2x v=$s1}" AND : _query_:"{!edismax qf=$qfs2 v=$s2}" : :city:All OR _query_:"{!field f=city v=$c}" :true :mpId :5 :true : :false :fsw_title^30 tsw_title^20 tf_title^15.0 : keywords^1.0 :tsw_title^15.0 tf_title^10.0 tsw_keywords^1 : keywords^0.5 :fsw_title~1^100.0 :fsw_title~1^50.0 :fsw_title~1^25.0 :product(typeId,100) -Hoss
Re: Memory Usage in Faceted Search (UnInvertedFields)
On 9/18/2013 11:08 AM, an...@swooptalent.com wrote: > I'm using Solr 4.3.1 for faceted search and have 4 fields used for faceting. > My question is about memory consumption. > I've set the heap size to use 6 GB of RAM, but I see in the resource monitor that it > uses much more than that - up to 10 GB, where 4 GB is reported as shareable > memory. > I've calculated the size of the cached set of uninverted fields and it's 2 GB - > I'm fine with that; both the GC monitor and the 'fieldValueCache' stats in > the 'Plugins/Stats' UI for Solr report that. But I can't understand what that memory is > that's being reserved after filling fieldValueCache with uninverted > fields (right in the UnInvertedField.uninvert method) and not used (or not > released). > Is that a memory leak? Or is that something I should tune with the garbage > collector by making it more aggressive (GC only shows me 2.x GB in Old Space > and I see those UnInvertedField instances there in the heap dump)? I have noticed the same thing. I do not think there is an actual problem, just something strange with the operating system memory reporting. https://www.dropbox.com/s/zacp4n3gu8wb9ab/idxb1-top-sorted-mem.png In the screenshot above, you can see that there is 64GiB total memory. There is 44449012k being used by the OS disk cache and 9853824k free memory. If you add these two numbers up, you get a number that's roughly 51 GiB (54302836k). You can also see that it says Solr (4.2.1) has a resident size of 16g, with 11g of that in shareable memory. FYI, the max java heap is 6g, verified by the Solr dashboard and tools like jconsole. With these numbers, if Solr really did have a memory resident size of 16g, Solr's memory size plus the combined total of cached and free memory would require 3g of swap, but as you can see, there is zero swap in use. I don't know if the reporting problem can be fixed. It is interesting to know that the same thing happens on both Linux and Windows. Thanks, Shawn
Re: Solrcloud - adding a node as a replica?
Are you looking for this: http://lucene.472066.n3.nabble.com/SOLR-Cloud-Collection-Management-quesiotn-td4063305.html On Wednesday, 18 September 2013, didier deshommes wrote: > Hi, > How do I add a node as a replica to a SolrCloud cluster? Here is my > situation: some time ago, I created several collections > with replicationFactor=2. Now I need to add a new replica. I thought just > starting a new node and re-using the same ZooKeeper instance would make it > automatically a replica, but that isn't the case. Do I need to delete and > re-create my collections with the right replicationFactor (3 in this case) > again? I am using Solr 4.3.0. > > Thanks, > didier >
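Concretely, the usual recipe (hypothetical host and names, following the core-creation pattern discussed in that thread) is to start Solr on the new node pointed at the same ZooKeeper ensemble, then create a core that declares its collection and shard:

    http://newnode:8983/solr/admin/cores?action=CREATE&name=collection1_shard1_replica3&collection=collection1&shard=shard1

The new core registers itself as an additional replica of shard1; replicationFactor is only consulted when the collection is created, so there should be no need to delete and re-create the collection.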
Re: Re: Unable to get started with SOLR
I suggest you start from here: http://wiki.apache.org/solr/HowToCompileSolr On Sunday, 15 September 2013, Erick Erickson wrote: > If you're using the default jetty container, there's no log unless > you set it up; the content is echoed to the screen. > > About a zillion people have downloaded this and started it > running without issue, so you need to give us the exact > steps you followed. > > If you checked the code out from SVN, you need to build it: > go into /solr and execute > > ant example dist > > the "dist" bit isn't strictly necessary, but it builds the jars > that you link to if you try to develop custom plugins etc. > > Best, > Erick > > > On Fri, Sep 13, 2013 at 3:56 AM, Rah1x wrote: > >> I have the same issue. Can anyone tell me if they found a solution? >> >> >> >> -- >> View this message in context: >> http://lucene.472066.n3.nabble.com/Unable-to-getting-started-with-SOLR-tp3497276p4089761.html >> Sent from the Solr - User mailing list archive at Nabble.com. >> >
Re: SORTING RESULTS BASED ON RELEVANCY
: Unless I'm missing something: FuzzyQuery defaults to using the : "TopTermsScoringBooleanQueryRewrite" method based on the terms found in : the index that match the fuzzy expression. So the results of a simple : fuzzy query should already come back based on the tf/idf scores of the : terms. To give a concrete example... using 4.4, with the example configs & sample data, this query... http://localhost:8983/solr/select?defType=edismax&qf=features&q=blak~2&fl=score,id,features&debugQuery=true ...matches two documents with different scores. The resulting scores are based on both the edit distance of the word that matches the fuzzy term (which during query-rewriting is used as a term boost), and the tf/idf of those terms... A doc that contains "black" (edit distance 1 => boost * 0.75)...
0.39237294 = (MATCH) sum of:
  0.39237294 = (MATCH) weight(features:black^0.75 in 26) [DefaultSimilarity], result of:
    0.39237294 = score(doc=26,freq=1.0 = termFreq=1.0), product of:
      0.83205026 = queryWeight, product of:
        0.75 = boost
        3.7725887 = idf(docFreq=1, maxDocs=32)
        0.29406872 = queryNorm
      0.4715736 = fieldWeight in 26, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        3.7725887 = idf(docFreq=1, maxDocs=32)
        0.125 = fieldNorm(doc=26)
...compared to a doc that contains "book" (edit distance 2 => boost * 0.5)...
0.22888422 = (MATCH) sum of:
  0.22888422 = (MATCH) weight(features:book^0.5 in 5) [DefaultSimilarity], result of:
    0.22888422 = score(doc=5,freq=1.0 = termFreq=1.0), product of:
      0.5547002 = queryWeight, product of:
        0.5 = boost
        3.7725887 = idf(docFreq=1, maxDocs=32)
        0.29406872 = queryNorm
      0.4126269 = fieldWeight in 5, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        3.7725887 = idf(docFreq=1, maxDocs=32)
        0.109375 = fieldNorm(doc=5)
-Hoss
Re: solr performance against oracle
Martin Fowler and Sadalage have a nice book about this kind of architectural design: NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence. If you read it you will see why you would use a NoSQL store, an RDBMS, or both of them. On the other hand, I have 50+ million documents on replicated SolrCloud nodes and my average response time is ~10 ms. So it depends on your architecture, configuration and hardware specifications. On Thursday, 12 September 2013, Chris Hostetter wrote: > > Setting aside the excellent responses that have already been made in this > thread, there are fundamental discrepancies in what you are comparing in > your respective timing tests. > > First off: a micro benchmark like this is virtually useless -- unless you > really plan on only ever executing a single query in a single run of a > java application that then terminates, trying to time a single query is > silly -- you should do lots and lots of iterations using a large set of > sample inputs. > > Second: what you are timing is vastly different between the two cases. > > In your Solr timing, no communication happens over the wire to the solr > server until the call to server.query() inside your time stamps -- if you > were doing multiple requests using the same SolrServer object, the HTTP > connection would get re-used, but as things stand your timing includes all > of the network overhead of connecting to the server, sending the request, > and reading the response. > > In your oracle method however, the timestamps you record are only around > the call to executeQuery(), rs.next(), and rs.getString() ... you are > ignoring the timing necessary for the getConnection() and > prepareStatement() methods, which may be significant as they both involve > over-the-wire communication with the remote server (and it's not like > these are one-time execute-and-forget-about-them methods ... in a real > long-lived application you'd need to manage your connections, re-open them if > they get closed, recreate the prepared statement if your connection has to > be re-opened, etc...) > > Your comparison is definitely apples and oranges. > > > Lastly, as others have mentioned: 150-200ms to request a single document > by uniqueKey from an index containing 800K docs seems ridiculously slow, > and suggests that something is poorly configured about your solr instance > (another apples-to-oranges comparison: you've got an ad-hoc solr > installation set up on your laptop and you're benchmarking it against a > remote oracle server running on dedicated remote hardware that has > probably been heavily tuned/optimized for queries). > > You haven't provided us any details however about how your index is set up, > or how you have configured solr, or what JVM options you are using to run > solr, or what physical resources are available to your solr process (disk, > jvm heap ram, os file system cache ram), so there isn't much we can offer > in the way of advice on how to speed things up. > > > FWIW: On my laptop, using Solr 4.4 w/ the example configs and built-in > jetty (ie: "java -jar start.jar") I got a 3.4 GB max heap, and a 1.5 GB > default heap, with plenty of physical ram left over for the os file system > cache of an index I created containing 1,000,000 documents with 6 small > fields containing small amounts of random terms. I then used curl to > execute ~4150 requests for documents by id (using simple search, not the > /get RTG handler) and return the results using JSON. > > This completed in under 4.5 seconds, or ~1.0ms/request. > > Using the more verbose XML response format (after restarting solr to > ensure nothing was in the query result caches) only took 0.3 seconds longer on > the total time (~1.1ms/request) > > $ time curl -sS ' http://localhost:8983/solr/collection1/select?q=id%3A[1-100:241]&wt=json&indent=true' > /dev/null > > real0m4.471s > user0m0.412s > sys 0m0.116s > $ time curl -sS ' http://localhost:8983/solr/collection1/select?q=id%3A[1-100:241]&wt=xml&indent=true' > /dev/null > > real0m4.868s > user0m0.376s > sys 0m0.136s > $ java -version > java version "1.7.0_25" > OpenJDK Runtime Environment (IcedTea 2.3.10) (7u25-2.3.10-1ubuntu0.12.04.2) > OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode) > $ uname -a > Linux frisbee 3.2.0-52-generic #78-Ubuntu SMP Fri Jul 26 16:21:44 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux > > > > > > > -Hoss >
Re: Querying a non-indexed field?
: Subject: Re: Querying a non-indexed field? : : No. --wunder To elaborate just a bit... : query on a few indexed fields, getting a small # of results. I want to : restrict this further based on values from non-indexed, stored fields. : I can obviously do this myself, but it would be nice if Solr could do ...you could implement this in a custom SearchComponent, or a custom qparser that would generate PostFilter-compatible queries, that looked at the stored field values -- but it's extremely unlikely that you would ever convince any of the lucene/solr devs to agree to commit a general-purpose version of this type of logic into the code base -- because in the general case (an arbitrary, unknown number of documents matching the main query) it would be extremely inefficient and would encourage "bad" user behavior. -Hoss
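To make the shape of that concrete, here is a rough, unofficial sketch of such a PostFilter against the Solr 4.x API (the field name and match condition are placeholders); note that loading stored fields for every collected document is exactly the expensive part being warned about:

    import java.io.IOException;

    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.AtomicReaderContext;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.solr.search.DelegatingCollector;
    import org.apache.solr.search.ExtendedQueryBase;
    import org.apache.solr.search.PostFilter;

    public class StoredValuePostFilter extends ExtendedQueryBase implements PostFilter {

      @Override
      public boolean getCache() {
        // depends on stored values, so don't cache it as a filter
        return false;
      }

      @Override
      public int getCost() {
        // a cost of 100 or more makes Solr run this as a post filter,
        // after cheaper queries/filters have narrowed the result set
        return Math.max(super.getCost(), 100);
      }

      @Override
      public DelegatingCollector getFilterCollector(IndexSearcher searcher) {
        return new DelegatingCollector() {
          private AtomicReaderContext readerContext;

          @Override
          public void setNextReader(AtomicReaderContext context) throws IOException {
            this.readerContext = context;
            super.setNextReader(context);
          }

          @Override
          public void collect(int doc) throws IOException {
            // the expensive part: fetch the stored document for every candidate
            Document d = readerContext.reader().document(doc);
            String v = d.get("some_stored_field");   // placeholder field
            if ("wanted".equals(v)) {                // placeholder condition
              super.collect(doc);                    // keep the document
            }
          }
        };
      }
    }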
Merge problem with lucene 3 and 4 indices
We have a process that builds small indices and merges them into the master, and we are in the process of going from Solr 3.5 to Solr 4.3. So, during this process we are going to have to merge indices built with Solr 3 with ones built with Solr 4. I'm running into a problem with an index built from that process. It was merged from a set of Solr 3 indices by Solr 4 code, but it wrote a Solr 3 segment. Searching on the index works fine; however, this code: Directory[] indexes = new Directory[1]; indexes[0] = new NIOFSDirectory(new File(dir)); writer.addIndexes(indexes); fails in addIndexes() with Exception in thread "main" java.io.FileNotFoundException: _6.tis at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:266) at org.apache.lucene.index.SegmentInfoPerCommit.sizeInBytes(SegmentInfoPerCommit.java:88) at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:2319) If I rename the files from _0 to _6, it fails with Exception in thread "main" java.io.FileNotFoundException: /_ 0.si (No such file or directory) at java.io.RandomAccessFile.open(Native Method) at java.io.RandomAccessFile.(RandomAccessFile.java:212) at org.apache.lucene.store.FSDirectory$FSIndexInput.(FSDirectory.java:410) at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.(NIOFSDirectory.java:123) at org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:80) at org.apache.lucene.codecs.lucene3x.Lucene3xSegmentInfoReader.read(Lucene3xSegmentInfoReader.java:103) at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:301) at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:347) at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:783) at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:630) at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:343) If I copy the _0 files to _6, the merge works fine, but I don't understand why it's trying to find the _6 segment in the first place. Hex dump of the segments_1 file: 000: 3fd7 6c17 0873 6567 6d65 6e74 7300 ?.l..segments... 010: 0300 0100 020: 0102 5f30 084c 7563 656e 6533 78ff .._0.Lucene3x... 030: ff00 040: 00f6 90cf 88 . Index directory: drwxrwxr-x 2 hhight general 4.0K Sep 18 17:19 ./ drwxrwxr-x 3 hhight general 4.0K Sep 18 13:46 ../ -rw-rw-r-- 1 hhight general 34M Sep 18 13:25 _0.fdt -rw-rw-r-- 1 hhight general 26K Sep 18 13:25 _0.fdx -rw-rw-r-- 1 hhight general 2.7K Sep 18 13:25 _0.fnm -rw-rw-r-- 1 hhight general 3.9M Sep 18 13:25 _0.frq -rw-rw-r-- 1 hhight general 195K Sep 18 13:25 _0.nrm -rw-rw-r-- 1 hhight general 8.5M Sep 18 13:25 _0.prx -rw-rw-r-- 1 hhight general 343 Sep 18 13:25 _0.si -rw-rw-r-- 1 hhight general 118K Sep 18 13:25 _0.tii -rw-rw-r-- 1 hhight general 8.5M Sep 18 13:25 _0.tis -rw-rw-r-- 1 hhight general 29 Sep 18 13:25 _0_upgraded.si -rw-rw-r-- 1 hhight general 69 Sep 18 13:25 segments_1 -rw-rw-r-- 1 hhight general 20 Sep 18 13:25 segments.gen Any suggestions on the cause of this?
RE: Solr 4.4.0: Plugin init failure for [schema.xml] analyzer/tokenizer
Thanks for the reply. * Following are the jars placed in the "tomcat/lib" dir: annotations-api.jar el-api.jar jsp-api.jar lucene-core.jar solr-core-1.3.0.jar tomcat-dbcp.jar catalina-ant.jar jasper-el.jar jul-to-slf4j-1.6.6.jar private solr-dataimporthandler-4.4.0.jar tomcat-i18n-es.jar catalina-ha.jar jasper.jar log4j-1.2.16.jar servlet-api.jar solr-dataimporthandler-extras-4.4.0.jar tomcat-i18n-fr.jar catalina.jar jasper-jdt.jar log4j.properties slf4j-api-1.6.6.jar solr-solrj-4.4.0.jar tomcat-i18n-ja.jar catalina-tribes.jar jcl-over-slf4j-1.6.6.jar lucene-analyzers-common-4.2.0.jar slf4j-log4j12-1.6.6.jar tomcat-coyote.jar Jars in "tomcat/webapps/ROOT/WEB-INF/lib/": commons-cli-1.2.jar hadoop-common-2.0.5-alpha.jar lucene-core-4.4.0.jar nps-solr-plugin-1.0-SNAPSHOT.jar commons-codec-1.7.jar hadoop-hdfs-2.0.5-alpha.jar lucene-grouping-4.4.0.jar org.restlet-2.1.1.jar commons-configuration-1.6.jar httpclient-4.2.3.jar lucene-highlighter-4.4.0.jar org.restlet.ext.servlet-2.1.1.jar commons-fileupload-1.2.1.jar httpcore-4.2.2.jar lucene-memory-4.4.0.jar protobuf-java-2.4.0a.jar commons-io-2.1.jar httpmime-4.2.3.jar lucene-misc-4.4.0.jar solr-core-4.4.0.jar commons-lang-2.6.jar joda-time-2.2.jar lucene-queries-4.4.0.jar solr-dataimporthandler-4.4.0.jar concurrentlinkedhashmap-lru-1.2.jar lucene-analyzers-common-4.4.0.jar lucene-queryparser-4.4.0.jar solr-solrj-4.4.0.jar guava-14.0.1.jar lucene-analyzers-kuromoji-4.4.0.jar lucene-spatial-4.4.0.jar spatial4j-0.3.jar hadoop-annotations-2.0.5-alpha.jar lucene-analyzers-phonetic-4.4.0.jar lucene-suggest-4.4.0.jar wstx-asl-3.2.7.jar hadoop-auth-2.0.5-alpha.jar lucene-codecs-4.4.0.jar noggit-0.5.jar zookeeper-3.4.5.jar * I downloaded the solr-4.4.0 distribution from the Apache website. (http://www.apache.org/dyn/closer.cgi/lucene/solr/4.4.0) Most of the jars are from the "dist" directory and the "example" directory. * The custom jars are compiled against Solr 4.4.0. I copied most of the jars from the Apache website, and a few jars from www.java2s.com. Thanks Abhi -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Wednesday, September 18, 2013 1:41 PM To: solr-user@lucene.apache.org Subject: Re: Solr 4.4.0: Plugin init failure for [schema.xml] analyzer/tokenizer : I am having trouble upgrading from Solr 3.6 to Solr 4.4.0. I have : placed the required jars in the "lib" directory. When I start the Tomcat : instance it throws the following error. Also pasted part of the : "conf/schema.xml" file. Which jars exactly? Where did you get the jars from? What version of Solr were your custom jars compiled against? : Caused by: java.lang.ClassCastException: class com.mycomp.as.sts.nps.solr.analysis.MultiValueTokenizerFactory According to that, your "MultiValueTokenizerFactory" can not be cast to the expected type -- this might be because of a classloader problem (ie: you may have multiple instances of the MultiValueTokenizerFactory class loaded that are confusing things, or multiple instances of the lucene TokenizerFactory, etc...). Alternatively it might be because your MultiValueTokenizerFactory was compiled against the wrong version of Solr (but IIRC that should generate a different error ... off the top of my head I'm not certain). -Hoss
Re: DIH field defaults or re-assigning field values
You could also do this in an update request processor; there is a default-value one there. Also, I think the field definition in the schema allows defaults. Regards, Alex On 19 Sep 2013 02:20, "P Williams" wrote: > Hi All, > > I'm using the DataImportHandler to import documents to my index. I assign > one of my document's fields by using a sub-entity from the root to look for > a value in a file. I've got this part working. If the value isn't in the > file or the file doesn't exist I'd like the field to be assigned a default > value. Is there a way to do this? > > I think I'm looking for a way to re-assign the value of a field. If this > is possible then I can assign the default value in the root entity and > overwrite it if the value is found in the sub-entity. Ideas? > > Thanks, > Tricia >
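To spell out both of those options (the field and value names below are made up): a field in schema.xml can carry a default,

    <field name="rights" type="string" indexed="true" stored="true" default="unknown"/>

and the update-processor route uses solr.DefaultValueUpdateProcessorFactory in an update chain:

    <processor class="solr.DefaultValueUpdateProcessorFactory">
      <str name="fieldName">rights</str>
      <str name="value">unknown</str>
    </processor>

In both cases the default is only applied when the incoming document has no value at all for the field, which matches the "sub-entity found nothing" case.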
Re: Querying a non-indexed field?
Moreover, you may be trying to save/optimize in the wrong place. Maybe these additional indexed fields are not so costly. Maybe you can optimize in some other part of your setup. Otis Solr & ElasticSearch Support http://sematext.com/ On Sep 18, 2013 5:47 PM, "Chris Hostetter" wrote: > > : Subject: Re: Querying a non-indexed field? > : > : No. --wunder > > To elaborate just a bit... > > : query on a few indexed fields, getting a small # of results. I want to > : restrict this further based on values from non-indexed, stored fields. > : I can obviously do this myself, but it would be nice if Solr could do > > ...you could implement this in a custom SearchComponent, or a custom qparser > that would generate PostFilter-compatible queries, that looked at the > stored field values -- but it's extremely unlikely that you would ever > convince any of the lucene/solr devs to agree to commit a general-purpose > version of this type of logic into the code base -- because in the general > case (an arbitrary, unknown number of documents matching the main query) it > would be extremely inefficient and would encourage "bad" user behavior. > > -Hoss >
Re: SolrCloud - Service unavailable
Thanks Shawn, the links will be useful. I am still not sure if it's related to a timeout, because the 503 error is coming from Tomcat, which means the requests are going through. I can access the Solr admin panel and I see a message saying the core was not initialized. Thanks, Indika On 18 September 2013 21:27, Shawn Heisey wrote: > On 9/18/2013 8:12 AM, Indika Tantrigoda wrote: > > I am using 3 Solr instances behind an Amazon ELB with 1 shard. Serving > > data via Solr works as expected; however, I noticed a few times a 503 error > > was popping up from the applications accessing Solr. Accessing Solr is done > > via the AWS ELB. > > > > 3 ZooKeeper instances also run on the same instances as Solr, on a separate > > disk. > > > > The Solr version is 4.4. > > > > This seems to be a sporadic issue. Has anyone else observed this kind > > of behavior? > > What kind of session timeouts have you configured on the Amazon load > balancer? I've never used Amazon services, but hopefully this is > configurable. If the timeout is low enough, it could be just that the > request is taking longer than that to execute. You may need to increase > that timeout. > > Aside from general performance issues, one thing that can cause long > request times is stop-the-world Java garbage collections. This can be a > sign that your heap is too small, too large, or that your garbage > collection hasn't been properly tuned. > > http://wiki.apache.org/solr/SolrPerformanceProblems#GC_pause_problems > > http://wiki.apache.org/solr/SolrPerformanceProblems#How_much_heap_space_do_I_need.3F > > That same wiki page has another section about the OS disk cache. Not > having enough memory for this is the cause of a lot of performance issues: > > http://wiki.apache.org/solr/SolrPerformanceProblems#OS_Disk_Cache > > Thanks, > Shawn > >
Migrating an existing/split shard to a new node
Hi, We have started moving our current search from master/slave to SolrCloud. I have a couple of questions related to expanding the nodes dynamically. Please help. 1. What is the best way to migrate an existing shard to a new node? Is it just creating a core on the new node manually as below, or is there another way? http://localhost:/solr/admin/cores?action=CREATE&name=testcollection_shard1_replica1&collection=testcollection&shard=shard1&collection.configName=collection1 2. How do I create a new replica dynamically? Is it just creating a new core as below, or is there another way? http://localhost:/solr/admin/cores?action=CREATE&name=testcollection_shard1_replica2&collection=testcollection&shard=shard1&collection.configName=collection1 3. How do I add a brand new shard to a collection dynamically? Is it just creating a new core with a new shard name on a new node as below? Will documents be distributed to the newly created shard automatically? Or is this not the right way, and should we use shard splitting? http://localhost:/solr/admin/cores?action=CREATE&name=testcollection_shard2_replica1&collection=testcollection&shard=shard2&collection.configName=collection1 Thank you so much for the help!! -Umesh -- View this message in context: http://lucene.472066.n3.nabble.com/Migrating-a-existing-splited-shard-to-new-node-tp4090991.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Need help with delta import
I am using the configuration file below, and the problem is that I do not see any Solr documents committed into the Solr core 'db' (via the Core Selector). When I run full-import, it gives me this message: Indexing completed. Added/Updated: 0 documents. Deleted 0 documents. Requests: 1, Fetched: 8, Skipped: 0, Processed: 0 When I run delta-import, it gives me this message: Requests: 0, Fetched: 0, Skipped: 0, Processed: 0 solrconfig.xml == 4.4 db1-data-config.xml schema.xml solrp_id db1-data-config.xml = -- View this message in context: http://lucene.472066.n3.nabble.com/Need-help-with-delta-import-tp4025003p4090999.html Sent from the Solr - User mailing list archive at Nabble.com.
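For comparison, a delta-capable DIH entity usually carries all three queries, roughly along these lines (the table and column names are made up; solrp_id is taken from the uniqueKey fragment above, and ${dih.delta.solrp_id} is filled in from the ids the deltaQuery returns):

    <dataConfig>
      <dataSource driver="..." url="..." user="..." password="..."/>
      <document>
        <entity name="product" pk="solrp_id"
                query="select solrp_id, name from products"
                deltaQuery="select solrp_id from products where last_modified > '${dataimporter.last_index_time}'"
                deltaImportQuery="select solrp_id, name from products where solrp_id = '${dih.delta.solrp_id}'"/>
      </document>
    </dataConfig>

A full-import that reports rows Fetched but 0 Processed is also worth a second look: it usually means the fetched columns are not mapping onto any schema fields (or the uniqueKey is missing), so no documents get built.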
Re: Facet with " " values are displayed in output
No analysis is done on the facets. The facets are string fields. On Wed, Sep 18, 2013 at 11:59 PM, tamanjit.bin...@yahoo.co.in < tamanjit.bin...@yahoo.co.in> wrote: > Any analysis happening on the country field during indexing? If so then > facets are on tokens. > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/FAcet-with-values-are-displayes-in-output-tp4090777p4090904.html > Sent from the Solr - User mailing list archive at Nabble.com. >
Re: Facet with " " values are displayed in output
q=country:[* TO *] will find all docs that have a value in a field. However, it seems you have a space, which *is* a value. I think Erik is right - track down that record and fix the data. Upayavira On Wed, Sep 18, 2013, at 09:23 AM, Prasi S wrote: > How do I filter them out in the query itself? > > Thanks, > Prasi > > > On Wed, Sep 18, 2013 at 1:06 PM, Upayavira wrote: > > > Filter them out in your query, or in your display code. > > > > Upayavira > > > > On Wed, Sep 18, 2013, at 06:36 AM, Prasi S wrote: > > > Hi, > > > I'm using Solr 4.4 for our search. When I query for a keyword, it returns > > > empty-valued facets in the response > > > > > > > > > > > > > > > > > > *1* > > > 1 > > > > > > > > > > > > > > > > > > > > > I have also tried using the facet.missing parameter, but no change. How can > > > we > > > handle this? > > > > > > Thanks, > > > Prasi > >
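(To Prasi's earlier question about doing it in the query itself: that same range query can be used as a filter, e.g. &fq=country:[* TO *], to keep only docs that have some value in country - but as noted above a lone space counts as a value, so it will not hide this particular entry.)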