Date Search with q query parameter
Hi, I am facing an issue with a date field I have in my records. e.g. I am using the q query parameter and passing some string such as "test" as the search criteria. The query built from the q parameter takes this form: column1:test | column2:test | column3:test ... One of my columns is a date column, suffixed with _dt, e.g. column4_dt. Now, when it creates the query as column1:test | column2:test | column3:test | column4_dt:test it throws an exception saying "Invalid date format". Please suggest how I can prevent this. Thanks, Amit Garg
Highlighting the searched term in resultset
I was wondering if there is any way of highlighting the searched term directly in the result set, instead of having it as a separate "lst" element. Doing it through an XSL transformation would be one way. Has anybody implemented a better solution? e.g. a document with the field values "iPhone iphone sell buy", 2007-11-20T05:36:29Z, 2007-11-17T06:00:00Z, ARTICLE (the XML markup of the example was stripped in the archive). TIA.
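As far as I know, nothing ships with Solr 1.3 that merges snippets into the documents; a common workaround is to do the merge on the client. A minimal SolrJ sketch, assuming a uniqueKey field named "id" and a highlighted field named "title" (both illustrative names):

import java.util.List;
import java.util.Map;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

public class MergeHighlights {
  public static void main(String[] args) throws Exception {
    CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
    SolrQuery q = new SolrQuery("iphone");
    q.setHighlight(true);
    q.addHighlightField("title");
    QueryResponse rsp = server.query(q);
    // highlighting comes back keyed by uniqueKey, then by field name
    Map<String, Map<String, List<String>>> hl = rsp.getHighlighting();
    for (SolrDocument doc : rsp.getResults()) {
      String id = (String) doc.getFieldValue("id");
      Map<String, List<String>> docHl = (hl == null) ? null : hl.get(id);
      if (docHl != null && docHl.get("title") != null) {
        // overwrite the stored value with the first highlighted snippet
        doc.setField("title", docHl.get("title").get(0));
      }
    }
  }
}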
Re: Date Search with q query parameter
Is your final query in this format? col1:[2009-01-01T00:00:00Z+TO+2009-01-01T23:59:59Z] From: dabboo To: solr-user@lucene.apache.org Sent: Thursday, March 12, 2009 12:27:48 AM Subject: Date Search with q query parameter > [original message quoted in full above; trimmed]
How to do Date Search in Solr
Hi, I am implementing a date search in Solr using the q query parameter with the dismax request. This is the data I have (the field markup was stripped in the archive; repeated values collapsed): 2.4177726E-8, product_4100025, productIndex, Fiction, CE, 0205583527, 9781592402939, Daughter of the Mountains, 2/1/1993, 4100025, 1993-02-01T12:00:00Z, 51159. Now, if I want to search the records only on the date field, i.e. productPublicationDate_product_dt = 1993-02-01T12:00:00Z, how should I pass the parameter to match that value? Also, how can I form date range queries with the dismax request? Thanks in advance. Thanks, Amit Garg
Solr 1.3 and Solr 1.4 difference?
Hi, What is the exact difference between Solr 1.3 and Solr 1.4 (nightly build as of now)? I heard SolrJ is not part of Solr and that performance is great in Solr 1.4. Please tell me exactly what is going to differ in Solr 1.4, and if possible, please provide a pointer which describes the same. - Regards, Praveen
Re: Problem using DIH templatetransformer to create uniqueKey: solved
Folks, TemplateTransformer will fail to return a row if a variable is undefined; however, the RegexTransformer does still return it. So where the template-based field definition would fail, a regex-based one can be used instead (both config snippets were stripped in the archive). So I guess we have the best of both worlds! Fergus. >Hmmm. Just gave that a go! No luck. >But how many layers of defaults do we need? > >Rgds Fergus > >>What about having the template transformer support ${field:default} syntax? I'm assuming it doesn't support that currently, right? The replace stuff in the config files does, though. >> >> Erik >> >>On Feb 13, 2009, at 8:17 AM, Fergus McMenemie wrote: >>> Paul, >>> Following up your usenet suggestion: >>> (a config line with ignoreMissingVariables="true"; markup stripped in the archive) >>> and to add more to what I was thinking... >>> if the field is undefined in the input document, but schema.xml does allow a default value, then TemplateTransformer can use the default value. If there is no default value defined in schema.xml then it can fail as at present. This would allow "" or any other value to be fed into TemplateTransformer, and still enable avoidance of the partial strings you referred to. >>> Regards Fergus. > Hello, > TemplateTransformer behaves rather ungracefully if one of the replacement fields is missing. Looking at TemplateString.java I see that, left to itself, fillTokens would replace a missing variable with "". It is an extra check in TemplateTransformer that is throwing the warning and stopping the row from being returned. Commenting out the check seems to solve my problem. Having done this, an undefined replacement string in TemplateTransformer is replaced with "". However, a neater fix would probably involve making use of the default value which can be assigned to a field in schema.xml. > I am parsing a single XML document into multiple separate Solr documents. It turns out that none of the source document's fields can be used to create a uniqueKey alone. I need to combine two, using TemplateTransformer, as follows: > an entity with dataSource="myfilereader" processor="XPathEntityProcessor" url="${jc.fileAbsolutePath}" rootEntity="true" stream="false" forEach="/record | /record/mediaBlock" transformer="DateFormatTransformer,TemplateTransformer,RegexTransformer" (the field definitions were stripped in the archive; one used sourceColName="fileAbsolutePath") > The trouble is that vurl is only defined as a child of "/record/mediaBlock", so my attempt to create id, the uniqueKey, fails for the parent document "/record". > I am hacking around with TemplateTransformer.java to sort this, but was wondering if there was a good reason for this behavior. > -- === Fergus McMenemie Email:fer...@twig.me.uk Techmore Ltd Phone:(UK) 07721 376021 Unix/Mac/Intranets Analyst Programmer ===
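The config snippets were lost from the archived message, but a sketch of the failing half looks like this (the jc entity and vurl column come from Fergus's config above; the "x" entity name and "#" separator are placeholders). It illustrates the behavior he describes: TemplateTransformer drops the field for any row where a variable is unresolved, while his RegexTransformer-based equivalent still returns the row:

<!-- TemplateTransformer: for the parent /record rows, where vurl is
     undefined, this field is skipped and the uniqueKey is never built -->
<field column="id" template="${jc.fileAbsolutePath}#${x.vurl}"/>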
Re: Solr 1.3 and Solr 1.4 difference?
Here is the exhaustive list of all changes in 1.4: http://svn.apache.org/repos/asf/lucene/solr/trunk/CHANGES.txt On Thu, Mar 12, 2009 at 3:29 PM, Praveen Kumar Jayaram wrote: > [question quoted in full above; trimmed] -- --Noble Paul
Re: Date Search with q query parameter
On Thu, Mar 12, 2009 at 4:39 PM, dabboo wrote: > [message quoted in full below: how to pass a value to the date field to retrieve records of a specific date; trimmed] The format for range search is your_date_field:[minDate TO maxDate], and for a normal term query it is your_date_field:the_date. Each of the dates should be in the format described in the example schema.xml. -- Regards, Shalin Shekhar Mangar.
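Concretely, with the field name from the "How to do Date Search in Solr" message above, the two forms look like this. Note that the colons inside the date value must be escaped or the whole value quoted, otherwise the standard query parser splits on them; in a URL, the spaces around TO also need encoding, which is where the +TO+ in the earlier reply comes from:

q=productPublicationDate_product_dt:"1993-02-01T12:00:00Z"
q=productPublicationDate_product_dt:[1993-02-01T00:00:00Z TO 1993-02-01T23:59:59Z]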
Re: Date Search with q query parameter
Hi, The date range query is working fine for me, but the single-value query is not. This is the query I entered: q=productPublicationDate_product_dt:1993-02-01T12:00:00Z&version=2.2&start=0&rows=10&indent=on&qt=dismaxrequest It threw this exception: type Status report message Invalid Date String:'1993-02-01t12' description The request sent by the client was syntactically incorrect (Invalid Date String:'1993-02-01t12'). thanks, Amit Garg Shalin Shekhar Mangar wrote: > [reply quoted in full above; trimmed]
Re: Date Search with q query parameter
Hi, I am able to rectify that exception, but now what I am looking for is: how can I pass a value to the date field to search for the records of a specific date? e.g. I want to retrieve all the records of Jan 01, 2007. How will I pass the value with the column name? If I pass the value, it throws an exception saying that "it is expecting TO". Please suggest. thanks, Amit Garg Venu Mittal wrote: > [reply and quoted original message trimmed; both appear in full above]
Re: Tomcat holding deleted snapshots until it's restarted
The old IndexSearcher is being closed correctly: 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.core.SolrCore - [core_01] Registered new searcher searc...@c6692 main 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.search.SolrIndexSearcher - Closing searc...@1c5cd7 main hossman wrote: > : If the problem is not there, the other thing that comes to my mind is > : lucene2.9-dev... maybe there's a problem closing indexWriter?... obviously > : it's just a thought. > you never answered yonik's question about whether you see any "Closing Searcher" messages in your log. also it's useful to know what you see in the CORE section when you look at stats.jsp ... typically the "main" searcher is listed there twice, but during warming you'll see the old searcher as well ... if older searchers aren't getting closed for some reason, they should be listed there. > i'd start by confirming/ruling out the old searchers before speculating about the indexwriter or other problems. > : > On a quiet system, you should see the original searcher closed right > : > after the new searcher is registered. > : > Example: > : > Mar 11, 2009 2:22:25 PM org.apache.solr.core.SolrCore registerSearcher > : > INFO: [] Registered new searcher searc...@1f1cbf6 main > : > Mar 11, 2009 2:22:25 PM org.apache.solr.search.SolrIndexSearcher close > : > INFO: Closing searc...@acdd02 main > -Hoss
Re: Custom path for solr lib and data folder
Hoss, Assume my current working directory is C:/MyApplication/searchApp and in solr.xml I am specifying C:/lib as the shared lib; then the console output contains the following line: INFO: loading shared library: C:\MyApplication\searchApp\C:\lib Thanks con hossman wrote: > : > But how can i redirect solr to a separate lib directory that is outside of > : > the solr.home > : > Is this possible in solr 1.3 > : I don't believe it is possible (but please correct me if I'm wrong). From > : SolrResourceLoader: > : log.info("Solr home set to '" + this.instanceDir + "'"); > : this.classLoader = createClassLoader(new File(this.instanceDir + "lib/"), parent); > : So only a lib/ under the Solr home directory is used. It would be a nice > that's the lib directory specific to the core (hence it's relative to the instanceDir). > In con's original post he was claiming to have problems getting solr.xml's sharedLib option to point to an absolute path ... this should work fine. > con: when your solr.xml is loaded you should see an INFO message starting with "loading shared library:..." -- what path is listed on that line? > your sharedLib="%COMMON_LIB%" example won't work (for the reasons Noble mentioned) but your sharedLib="C:\lib" should work (assuming that path exists), and then immediately following the log message i mentioned above, you should see INFO messages like... > Adding file:///...foo.jar to Solr classloader > ...for each jar in that directory. if there are none, or the directory can't be found, you might see "Reusing parent classloader" or "Can't construct solr lib class loader" messages instead. > what do you see in your logs? > -Hoss
How to remove stemming from the analyzer - Finding "blah" when searching for "blah*"
Hi, I am trying to disable stemming in the analyzer, but I am not sure how to do it. For instance, I have a field that contains "blah", but when I search for "blah*" it cannot find it, whereas if I search for "bla*" it does. I was using the "text" field type from the example schema.xml. How should I modify it so that stemming is not done and I can find "blah" when I search for "blah*"? I have tried using the "textTight" type to no avail. Most of the fields in my documents have this structure: DOC1 field: gene name:brca2 DOC2 field: gene name:brca23 If I searched for "brca2*" I would like to find both documents. My field values normally contain colons ':' that should be treated as stop words. Thank you in advance, Bruno
Re: How to remove stemming from the analyzer - Finding "blah" when searching for "blah*"
Remove the EnglishPorterFilterFactory from your "text" analyzer configuration (both index and query sides). And reindex all documents. Erik On Mar 12, 2009, at 8:28 AM, Bruno Aranda wrote: > [question quoted in full above; the schema snippets were stripped in the archive]
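A minimal sketch of what such an unstemmed type can look like in schema.xml (the type name is illustrative; a single <analyzer> element applies to both index and query sides, and crucially no EnglishPorterFilterFactory appears in the chain):

<fieldType name="text_nostem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>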
Re: How to remove stemming from the analyzer - Finding "blah" when searching for "blah*"
Thanks for your answer, I am trying now with this custom text field (the markup was stripped in the archive; the surviving attributes show positionIncrementGap="100", a stop filter with words="stopwords.txt", and a WordDelimiter-style filter with generateWordParts="1" generateNumberParts="0" catenateWords="0" catenateNumbers="0" catenateAll="0" expand="0" splitOnCaseChange="0"). And still it does not find "blah" when using the wildcard and searching for "blah*". Am I missing something? Thanks, Bruno 2009/3/12 Erik Hatcher > [reply and earlier question quoted in full above; trimmed]
How to correctly boost results in Solr Dismax query
Hi, I have managed to build an index in Solr which I can search on keyword, produce facets, query facets, etc. This is all working great. I have implemented my search using a dismax query so it searches predetermined fields. However, my results are coming back sorted by score, which appears to be calculated by keyword relevancy only. I would like to adjust the score where fields have pre-determined values. I think I can do this with boost queries and boost functions, but the documentation here: http://wiki.apache.org/solr/DisMaxRequestHandler#head-6862070cf279d9a09bdab971309135c7aea22fb3 is not particularly helpful. I tried adding a bq argument to my search: &bq=media:DVD^2 (yes, this is an index of films!) but I find that when I start adding more and more: &bq=media:DVD^2&bq=media:BLU-RAY^1.5 the negative results - e.g. films that are DVD but not BLU-RAY - get negatively affected in their score. In the end it all seems to even out and my score is as it was before I started boosting. I must be doing this wrong, and I wonder whether "boost function" comes in somewhere. Any ideas on how to correctly use boost? Cheers, Pete -- Pete Smith Developer No.9 | 6 Portal Way | London | W3 6RU | T: +44 (0)20 8896 8070 | F: +44 (0)20 8896 8111 LOVEFiLM.com
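One detail worth noting (general dismax behavior, not specific to this index): each bq clause is added as an optional clause on the main query, so if every document matches some boost clause, the relative differences can largely wash out after score normalization, which matches the "evens out" effect described above. A sketch of the usual approach, boosting only the preferred values in a single bq parameter and reserving bf for function-based boosts (field names are Pete's; the bf function and its field are illustrative):

bq=media:DVD^2 media:BLU-RAY^1.5
bf=recip(rord(release_date),1,1000,1000)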
Re: How to remove stemming from the analyzer - Finding "blah" when searching for "blah*"
What is the full query you're issuing to Solr, and the corresponding request handler configuration? Chances are you're using the dismax query parser, which does not support wildcards. Other things to check: be sure you've tied the field to your new textIntact type, and that you're searching that field (see defaultSearchField in schema.xml). Try something like /solr/select?q=field_name:blah* Erik On Mar 12, 2009, at 9:09 AM, Bruno Aranda wrote: > [earlier messages quoted in full above; trimmed]
RE: Combination of EmbeddedSolrServer and CommonHttpSolrServer
Hi Shalin Shekhar Mangar, Thanks for your inputs. Please see my comments below. I wish to know if there is any user who used EmbeddedSolrServer for indexing and CommonsHttpSolrServer for search. I have found that this combination offers better performance for indexing. Searching becomes flexible as you can search from a larger number of HTTP clients simultaneously. Does anyone have any related performance data? Thanks, Ajit -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Wednesday, March 11, 2009 7:24 PM To: solr-user@lucene.apache.org Subject: Re: Combination of EmbeddedSolrServer and CommonHttpSolrServer On Wed, Mar 11, 2009 at 6:37 PM, Kulkarni, Ajit Kamalakar <ajkulka...@ptc.com> wrote: > If we index the documents using CommonsHttpSolrServer and search using the same, we get the updated results > That means we can search the latest added document as well, even if it is not committed to the file system That is not possible. Without calling commit, new documents will not be visible to a searcher. Ajit: When I tested using CommonsHttpSolrServer for indexing as well as searching, I could search the latest added document through the Solr admin page. I could also search the document through CommonsHttpSolrServer without explicitly calling commit. I am even more surprised to see the same result when using EmbeddedSolrServer for indexing and CommonsHttpSolrServer for searching. I used embeddedSolrServer = new EmbeddedSolrServer(SolrCore.getSolrCore()); which is a deprecated API. With this I did not need to call commit on CommonsHttpSolrServer to get the latest document searched, either on the Solr admin page or programmatically through CommonsHttpSolrServer. However, if I use CoreContainer multicore = new CoreContainer(); File home = new File( getSolrHome() ); File f = new File( home, "solr.xml" ); multicore.load( getSolrHome(), f ); embeddedSolrServer = new EmbeddedSolrServer( multicore, SolrIndexConstants.DEFAULT_CORE ); then I had to call commit on CommonsHttpSolrServer to search the latest added documents, and the document was available through the Solr admin page only when I searched programmatically after calling commit on CommonsHttpSolrServer. This is consistent with what you mentioned above. > So it looks like there is some kind of cache that is used by both the index and search logic inside Solr for a given SolrServer component (e.g. CommonsHttpSolrServer, EmbeddedSolrServer) Indexing does not create any cache. The caching is done only by the searcher. The old searcher/cache is discarded and a new searcher/cache is created when you call commit. Setting autowarmCount on the caches in solrconfig.xml makes the new searcher run some of the most recently used queries on the old searcher to warm up the new cache. > Calling commit on the SolrServer to sync with the index data may not be a good option, as I suppose it is an expensive operation. It is the only option. But you may be able to make the operation cheaper by tweaking the autowarmCount on the caches (this is specified in solrconfig.xml). However, caches are important for good search performance. Depending on your search traffic, you'll need to find a sweet spot. > The cache and hard disk data synchronization should be independent of the SolrServer instances managed by the Solr web application inside Tomcat. SolrServer is not really a server in itself. It is (a pointer to?) a server being used by a solrj client. The CommonsHttpSolrServer refers to a remote server url and makes calls through HTTP.
SolrCore is the internal class which manages the state of the server. A SolrCore is created by the Solr webapp. When you create another SolrCore for use by EmbeddedSolrServer, they do not know about each other. Therefore you need to notify it if you change the index through another core. Ajit: If the same JVM is managing the responding searchers for EmbeddedSolrServer as well as CommonsHttpSolrServer, then why can't the responding searcher be the same? I understand that the EmbeddedSolrServer and CommonsHttpSolrServer clients are separate, but if the searchers are managed in the same JVM, theoretically we should be able to attach a singleton searcher to every kind of SolrServer. This searcher should be a listener for the indexer. Since searching is a read operation, there won't be any threading or scalability issues, but there should be only one indexer. Since I don't have enough knowledge about Solr and Lucene, I may be totally wrong! > The issue still will be that EmbeddedSolrServer may directly access the hard index data, as it may bypass the Solr web app totally > I am embedding Tomcat in my RMI server. > The RMI server is going to use EmbeddedSolrServer, and it also hosts the Solr webapp inside its Tomcat instance > So I guess I should be able to manage a singleton cache th… [message truncated in the archive]
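A sketch of the combination under discussion (Solr 1.3 SolrJ; the URL, core name and field are illustrative). As the thread works out, the commit has to reach the core that serves the searches so that its searcher is reopened:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreContainer;

public class IndexEmbeddedSearchHttp {
  public static void main(String[] args) throws Exception {
    CoreContainer container = new CoreContainer.Initializer().initialize();
    SolrServer indexer = new EmbeddedSolrServer(container, "core0");
    SolrServer searcher = new CommonsHttpSolrServer("http://localhost:8080/solr/core0");

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "42");
    indexer.add(doc);
    indexer.commit();   // flush the embedded core's writer to disk
    searcher.commit();  // if the webapp holds a separate SolrCore, it must
                        // also commit so its own searcher reopens (see above)
    System.out.println(searcher.query(new SolrQuery("id:42")).getResults().getNumFound());
  }
}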
Operators and Minimum Match with Dismax handler
Hi All, I have a question regarding the dismax handler and minimum match (mm=). I have an index for which we are setting the default operator to AND. Am I right in saying that when using the dismax handler, the default operator in the schema file is effectively ignored? (This is the conclusion I've reached from my own testing.) So I have set the mm value to 100%. The issue I have with this is that if I want to include an OR in my phrase, it is effectively ignored. The parser still tries to match 100% of the search terms, e.g. 'lucene OR query' still only finds matches for 'lucene AND query'; the parsed query is: +(((drug_name:lucen) (drug_name:queri))~2) () I know I could programmatically set mm=0 if my phrase contains certain keywords, however this would get very complicated with more terms in the phrase (I'd have to preserve/inject operators to keep my "default"), and I assume I would effectively be duplicating what the dismax handler already does for the most part. Does anyone have any advice as to how I could deal with this kind of problem? Thanks Waseem
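For reference, mm accepts conditional specs as well as a flat percentage, which can be a middle ground between requiring everything and honoring OR (the spec format is from the dismax documentation; the thresholds here are illustrative). In solrconfig.xml the < must be XML-escaped:

<str name="mm">2&lt;-1 5&lt;-2 6&lt;90%</str>

Read as: up to 2 terms, all are required; 3-5 terms, all but one; 6 or more, 90% of them.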
Re: Solr 1.3; Data Import w/ Dynamic Fields
I was successful at distributing the Solr-1.4-DEV data import functionality within the Solr 1.3 war.
1. Copied the data import's src directory from 1.4 to 1.3.
2. Made sure to use the data import's build.xml already existing in Solr 1.3.
3. Commented out all code within the SolrWriter.rollback method.
4. Commented out the following import statement from SolrWriter: import org.apache.solr.update.RollbackUpdateCommand;
5. Copied the required libraries for logging from 1.4/lib to 1.3/lib: slf4j-api-1.5.5.jar, slf4j-jdk14-1.5.5.jar
I was planning on replacing the Solr 1.4 logging scheme with the style in Solr 1.3, but that was unnecessary work. Continuing my testing with this customized distribution. Thanks again, Wesley. On 3/11/09 6:35 AM, "Shalin Shekhar Mangar" wrote: > On Wed, Mar 11, 2009 at 4:01 PM, Noble Paul നോബിള്‍ नोब्ळ् <noble.p...@gmail.com> wrote: >> I guess you can take the trunk and comment out the contents of SolrWriter#rollback() and it should work with Solr1.3 > I agree. Rollback is the only feature which depends on enhancements in Solr/Lucene libraries. So if you remove this feature, everything else should work fine with 1.3 > -- > Regards, > Shalin Shekhar Mangar.
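Step 3 amounts to something like this (a sketch; in the 1.4 source the method body calls rollback machinery that 1.3 lacks):

// in SolrWriter.java, for the 1.3 backport
public void rollback() {
  // rollback relies on Solr/Lucene 1.4 APIs (RollbackUpdateCommand);
  // left as a no-op so the class compiles against Solr 1.3
}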
Is wiki page still accurate
Folks, Is the section titled "Full Import Example" on http://wiki.apache.org/solr/DataImportHandler still accurate? The steps referring to example-solr-home.jar and the SOLR-469 patch seem out of date with where 1.4 is today. Seems like the example-DIH stuff is a simpler/more direct example? Eric - Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com Free/Busy: http://tinyurl.com/eric-cal
Re: How to remove stemming from the analyzer - Finding "blah" when searching for "blah*"
Thanks again. This is the default request handler (the XML was stripped in the archive; only the value "explicit" survives). Doing this query: http://localhost:18080/solr/core_pub/select/?q=mitab:Nefh finds 1 result. The term "Nefh" is found in the field "mitab". Doing: http://localhost:18080/solr/core_pub/select/?q=mitab:Nefh* finds nothing. I have realised that Ne* and Nef* do not return results either, using the textIntact type... Thank you, Bruno 2009/3/12 Erik Hatcher > [reply and earlier messages quoted in full above; trimmed]
Re: How to remove stemming from the analyzer - Finding "blah" when searching for "blah*"
On Mar 12, 2009, at 10:47 AM, Bruno Aranda wrote: > [queries quoted in full above; trimmed] Ah... the problem is that wildcarded query terms do not get analyzed, nor do they get lowercased (there is an open issue with Solr to at least make lowercasing configurable; Lucene supports it). Try lowercasing in your query client, that should do the trick. Erik
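A client-side sketch of that workaround (the field name is from Bruno's example; it assumes the field's index analyzer lowercases tokens, as LowerCaseFilterFactory does):

import java.util.Locale;
import org.apache.solr.client.solrj.SolrQuery;

String userTerm = "Nefh*";
// wildcard terms bypass analysis, so lowercase them before sending
String term = (userTerm.indexOf('*') >= 0 || userTerm.indexOf('?') >= 0)
    ? userTerm.toLowerCase(Locale.ENGLISH) : userTerm;
SolrQuery q = new SolrQuery("mitab:" + term);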
Re: How to remove stemming from the analyzer - Finding "blah" when searching for "blah*"
Thank you! Next time I will remember not to change the words to make the example simpler... blah is not the same as Nefh :-) Thanks, Bruno 2009/3/12 Erik Hatcher > [reply quoted in full above; trimmed]
Programmatic access to other handlers
Hi, I've designed a "front" handler that will send request to other handlers and return a aggregated response. Inside this handler, I call other handlers like this (inside the method handleRequestBody): SolrCore core = req.getCore(); SolrRequestHandler mlt = core.getRequestHandler("/mlt"); ModifiableSolrParams params = new ModifiableSolrParams(req.getParams()); params.set("mlt.fl", "nFullText"); req.setParams(params); mlt.handleRequest(req, rsp); First question: is this the recommended way to call another handler? Second question: how could I call a handler of another core? -- View this message in context: http://www.nabble.com/Programmatic-access-to-other-handlers-tp22477731p22477731.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Tomcat holding deleted snapshots until it's restarted
I have noticed that the first time I execute a full-import (having an old index in the index folder), once it is done, the old IndexSearcher will be closed: 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.core.SolrCore - [core_01] Registered new searcher searc...@c6692 main 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.search.SolrIndexSearcher - Closing searc...@1c5cd7 The problem is that if I do another full-import, the old searcher will not be closed; only this line appears: 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.core.SolrCore - [core_01] Registered new searcher searc...@c6692 main If I keep doing full-imports, the old searchers will never be closed; it seems they are only closed on the first full-import... Does it mean something to anyone? Marc Sturlese wrote: > [previous message quoted in full above; trimmed]
Solr 1.4: filter documens using fields
Hi all! I'm using the StandardRequestHandler and I want to filter results by two fields in order to avoid duplicate results (in this case the documents are very similar, with differences only in fields that are not returned in a query response). For example, considering the response (field markup stripped in the archive): 285 186_Testing 3142 Locais / 285 186_Testing 3141 inventario / 285 186_Testing 3141 inventario / 285 186_Testing 3140 CPE / 285 186_Testing 3140 CPE I want to filter by instancekey and topologyid in order to get the following response: 285 186_Testing 3142 Locais / 285 186_Testing 3141 inventario / 285 186_Testing 3140 CPE I managed to do the filtering in the client, but then paging doesn't work as it should (some pages may contain more duplicated results than others). Is there a way (a query or another RequestHandler) to do this? Thanks, Rui Pereira
Re: Is wiki page still accurate
On Thu, Mar 12, 2009 at 8:05 PM, Eric Pugh wrote: > [question quoted in full above; trimmed] Yikes! I'll fix it. -- Regards, Shalin Shekhar Mangar.
RE: Replication in 1.3
Just so I'm clear on it, do you mean Windows replication via Cygwin is not supported or not possible? If it's possible, I'm just curious if anyone else on the list has experience with it. Thanks, Laurent -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Wednesday, March 11, 2009 5:03 PM To: solr-user@lucene.apache.org Subject: Re: Replication in 1.3 On Wed, Mar 11, 2009 at 1:29 PM, Vauthrin, Laurent wrote: > I'm hoping to use Solr version 1.4 but in the meantime I'm trying to get > replication to work in version 1.3. I'm running Tomcat as a Windows > service and have Cygwin installed. The rsync method of replication is not supported under Windows (due to differing OS/filesystem semantics). The Java-based synchronization in Solr 1.4 does support Windows though. -Yonik http://www.lucidimagination.com
Re: Is wiki page still accurate
On Thu, Mar 12, 2009 at 10:04 PM, Shalin Shekhar Mangar <shalinman...@gmail.com> wrote: > [previous messages quoted in full above; trimmed] I've updated the instructions. Thanks for reporting this, Eric. -- Regards, Shalin Shekhar Mangar.
Re: Replication in 1.3
On Thu, Mar 12, 2009 at 12:34 PM, Vauthrin, Laurent wrote: > Just so I'm clear on it, do you mean Windows replication via Cygwin is not > supported or not possible? Not really possible - the strategy the scripts use won't work on Windows because of the different filesystem semantics. Things like the fact that you can make a hard link, but you can't move or delete any of the links to an open file like you can with UNIX. -Yonik http://www.lucidimagination.com
Re: Tomcat holding deleted snapshots until it's restarted
: I have noticed that the first time I execute full import (having an old index : in the index folder) once it is done, the old indexsearcher will be closed: ... : The problem is that if I do another full-import... the old searcher will not : be closed, there will just appear the line: ... : If I keep doing full-imports the ols searchers will never be closed. seems : that they are just closed in the first full import... : Does it mean something to anyone? Hmmm... sounds like maybe DIH is triggering something weird. Just to clarify: a) what does the stats page show (in terms of the number of Searchers listed in the CORE section) after a couple of full imports? b) can you reproduce this doing full builds even with replication disabled? c) can you reproduce this using the example DIH configs? -Hoss
Re: Tomcat holding deleted snapshots until it's restarted
: Just to clarify: : a) what does the stats page show (in terms of the number of : Searchers listed in the CORE section) after a couple of full imports? After 4 full-imports it will show 3 IndexSearchers. I have also printed the var "_searchers" from SolrCore.java and it shows me 3 IndexSearchers. : b) can you reproduce this doing full builds even with replication : disabled? I have replication disabled. I use Solr collection distribution, but for all these tests I am not even using that. I just use one machine and index there. : c) can you reproduce this using the example DIH configs? My configs look really similar to the defaults. I get data from a MySQL database in data-config.xml. solrconfig.xml has the caches and warming the same as the defaults. I have disabled the SolrDeletionPolicy stuff (and replication as well). I have checked the official 1.3 release and I have seen that DirectUpdateHandler2.java is quite different from the one in the nightlies. In the commit method, 1.3 calls a closeSearcher function:
public void commit(CommitUpdateCommand cmd) throws IOException {
  if (cmd.optimize) {
    optimizeCommands.incrementAndGet();
  } else {
    commitCommands.incrementAndGet();
  }
  Future[] waitSearcher = null;
  if (cmd.waitSearcher) {
    waitSearcher = new Future[1];
  }
  boolean error = true;
  iwCommit.lock();
  try {
    log.info("start " + cmd);
    if (cmd.optimize) {
      closeSearcher();
      openWriter();
      writer.optimize(cmd.maxOptimizeSegments);
    }
    closeSearcher();
    closeWriter();
    ...
This closeSearcher function doesn't exist in the nightly (I suppose the whole process works in a different way now). It seems that once DataImportHandler does the first import, it touches something that makes IndexSearchers never get freed again. hossman wrote: > [previous message quoted in full above; trimmed]
stemming (maybe?) question
is it possible to make solr think that "omeara" and "o'meara" are the same thing? -jsd-
RE: Replication in 1.3
Thanks for the reply. Hopefully 1.4 will come soon enough so that we can still use Windows. -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Thursday, March 12, 2009 9:55 AM To: solr-user@lucene.apache.org Subject: Re: Replication in 1.3 On Thu, Mar 12, 2009 at 12:34 PM, Vauthrin, Laurent wrote: > Just so I'm clear on it, do you mean Windows replication via Cygwin is not supported or not possible? Not really possible - the strategy the scripts use won't work on Windows because of the different filesystem semantics. Things like the fact that you can make a hard link, but you can't move or delete any of the links to an open file like you can with UNIX. -Yonik http://www.lucidimagination.com
fl wildcards
If I wanted to hack Solr so that it has the ability to process wildcards for the field list parameter (fl), where would I look? (Perhaps I should look on the solr-dev mailing list, but since I am already on this one I thought I would start here). Thanks! -- -a "Ideally, a code library must be immediately usable by naive developers, easily customized by more sophisticated developers, and readily extensible by experts." -- L. Stein
Re: Tomcat holding deleted snapshots until it's restarted
On Thu, Mar 12, 2009 at 1:34 PM, Marc Sturlese wrote: > : Just to clarify: > : a) what does the stats page show (in terms of the number of > : Searchers listed in the CORE section) after a couple of full imports? > > After 4 full-imports it will show 3 indexsearchers. I have also printed the > var "_searchers" from SolrCore.java and it shows me 3 indexsearchers. Definitely seems like a bug somewhere... Could you try a recent nightly build to see if it's fixed or not? -Yonik http://www.lucidimagination.com
Adding authentication Token to the CommonsHttpSolrServer
Hi, We have installed Solr in a Tomcat server and enabled the security constraint at the Tomcat level. We need to pass the authentication token (cookie) with the search call that is made using CommonsHttpSolrServer. I would like to know how I can add the token to the CommonsHttpSolrServer. I appreciate any ideas on this. Thanks. Karthik
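One approach (a sketch; the cookie name, domain, and path are assumptions about the Tomcat setup, and sessionToken is a placeholder): CommonsHttpSolrServer exposes its underlying Commons HttpClient, and a cookie added to that client's HttpState is sent with matching requests:

import org.apache.commons.httpclient.Cookie;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://host:8080/solr");
// domain, name, value, path, expiry (null = session), secure
Cookie auth = new Cookie("host", "JSESSIONID", sessionToken, "/", null, false);
server.getHttpClient().getState().addCookie(auth);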
Re: stemming (maybe?) question
On Thu, Mar 12, 2009 at 1:36 PM, Jon Drukman wrote: > is it possible to make solr think that "omeara" and "o'meara" are the same > thing? WordDelimiter would handle it if the document had "o'meara" (but you may or may not want the other stuff that comes with WordDelimiterFilter). You could also use a PatternReplaceFilter to normalize tokens like this. -Yonik http://www.lucidimagination.com
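A sketch of the PatternReplaceFilter route (place it in both the index and query analyzers, after a tokenizer that keeps o'meara as a single token, e.g. WhitespaceTokenizerFactory):

<!-- strip apostrophes so o'meara and omeara index to the same token -->
<filter class="solr.PatternReplaceFilterFactory"
        pattern="'" replacement="" replace="all"/>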
DIH outer joins
I have queries with outer joins defined in some entities, and for the same root object I can have two or more rows with different objects. For example, taking the following 3 tables and a query defined in the entity with outer joins between the tables: Table1 -> Table2 -> Table3 I can have the following rows returned by the query:
Table1Instance1 -> Table2Instance1 -> Table3Instance1
Table1Instance1 -> Table2Instance1 -> Table3Instance2
Table1Instance1 -> Table2Instance2 -> Table3Instance3
Table1Instance2 -> Table2Instance3 -> Table3Instance4
I want to have a single document per root object instance (in this case, per Table1 instance) but with the values from the different rows returned. Is it possible to have this behavior in DataImportHandler? How? Thanks in advance, Rui Pereira
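One way DIH supports this shape is with nested entities rather than a single flat join: each child entity runs once per parent row, and repeated child values fold into multivalued fields of the one parent document. A sketch with the table names from the question (the join columns are assumptions, and the folded fields must be declared multiValued in schema.xml):

<entity name="t1" query="select id, f1 from Table1">
  <entity name="t2" query="select id, f2 from Table2 where t1_id = '${t1.id}'">
    <entity name="t3" query="select f3 from Table3 where t2_id = '${t2.id}'"/>
  </entity>
</entity>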
Re: Programmatic access to other handlers
I found this code to access another core from my custom request handler:
CoreContainer.Initializer initializer = new CoreContainer.Initializer();
CoreContainer cores = initializer.initialize();
SolrCore otherCore = cores.getCore("otherCore");
It seems to work in some quick testing. But is it a recommended approach? Pascal Dimassimo wrote: > [original question quoted in full above; trimmed]
Re: Programmatic access to other handlers
If you are doing this in a RequestHandler, implement SolrCoreAware and you will get a callback with the core: http://wiki.apache.org/solr/SolrPlugins#head-8b3ac1fc3584fe1e822924b98af23d72b02ab134 On Mar 12, 2009, at 3:04 PM, Pascal Dimassimo wrote: > [previous messages quoted in full above; trimmed]
DIH use of the ?command=full-import entity= command option
Hello, Can anybody describe the intended purpose, or provide a few examples, of how the DIH entity= command option works? Am I supposed to build a data-conf.xml file which contains many different alternate entities, or... Regards -- === Fergus McMenemie Email:fer...@twig.me.uk Techmore Ltd Phone:(UK) 07721 376021 Unix/Mac/Intranets Analyst Programmer ===
Re: Programmatic access to other handlers
Thanks ryantxu for your answer. I implemented the interface and it returns the current core. But how is it different from doing request.getCore() from handleRequestBody()? And I don't see how this can give me access to other cores. I think what I need is access to an instance of CoreContainer, so I can call getCore(name) and getAdminCore to manage the different cores. So I'm wondering if this is a good way to get that instance:
CoreContainer.Initializer initializer = new CoreContainer.Initializer();
CoreContainer cores = initializer.initialize();
ryantxu wrote: > [previous messages quoted in full above; trimmed]
Re: Programmatic access to other handlers
: I implement the interface and it returns me the current core. But how is it : different from doing request.getCore() from handleRequestBody()? And I don't I think Ryan misunderstood your goal... that's just a way for you to get access to your core prior to handling requests. : see how this can give me access to other cores. I think that what I need is : to get access to an instance of CoreContainer, so I can call getCore(name) : and getAdminCore to manage the different cores. So I'm wondering if this is : a good way to get that instance: I'm not positive, but I think the code you listed will actually reconstruct new copies of all of the cores. The simplest way to get access to the CoreContainer is via the CoreDescriptor: yourCore.getCoreDescriptor().getCoreContainer().getCore("otherCoreName"); (note I've never actually done this, it's just what I remember off the top of my head from the past multicore design discussions ... the class/method names may be slightly wrong) -Hoss
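A sketch of that path (method names as in Solr 1.3; otherReq and otherRsp are placeholder request/response objects, which must themselves be built against the borrowed core). One caveat worth noting: getCore() increments the core's reference count, so close it when done:

SolrCore other = req.getCore().getCoreDescriptor()
    .getCoreContainer().getCore("otherCore");
try {
  SolrRequestHandler mlt = other.getRequestHandler("/mlt");
  mlt.handleRequest(otherReq, otherRsp);
} finally {
  other.close(); // releases the reference taken by getCore()
}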
Issues with stale searchers.
I have Solr 1.3 running on Apache Tomcat 5.5.27. I'm running into an issue where searchers are opened up right away when tomcat starts, and never goes away. This is causing read locks on the Lucene index holding open deleted files during merges. This causes our server to run out of disk space in our index. Wondering what is causing this issue as I have been searching for two days without any real answers. Thanks, LSOF Output java 7322 tomcat 70r REG 253,0 2569538 2883610 /opt/solr/data/index/_m5n.cfs (deleted) java 7322 tomcat 71r REG 253,0 2338291 2883609 /opt/solr/data/index/_m5m.cfs (deleted) java 7322 tomcat 72r REG 253,0 13398930 2883608 /opt/solr/data/index/_m5l.cfs (deleted) java 7322 tomcat 73r REG 253,0 2692917 2883598 /opt/solr/data/index/_m5k.cfs (deleted) java 7322 tomcat 74r REG 253,0 32324600 2883592 /opt/solr/data/index/_m5j.cfx (deleted) java 7322 tomcat 75r REG 253,0 6767344 2883603 /opt/solr/data/index/_m5j.cfs (deleted) java 7322 tomcat 76r REG 253,0 32324600 2883592 /opt/solr/data/index/_m5j.cfx (deleted) java 7322 tomcat 77r REG 253,0 15937346 2883600 /opt/solr/data/index/_m5i.cfs (deleted) Stats page on Solr Admin searc...@66952905 main class: org.apache.solr.search.SolrIndexSearcher version:1.0 description:index searcher stats: searcherName : searc...@66952905 main caching : true numDocs : 187169908 maxDoc : 187169908 readerImpl : ReadOnlyMultiSegmentReader readerDir : org.apache.lucene.store.FSDirectory@/opt/solr/data/index indexVersion : 1224609883675 openedAt : Thu Mar 12 17:13:15 CDT 2009 registeredAt : Thu Mar 12 17:13:23 CDT 2009 warmupTime : 0 name: core class: version:1.0 description:SolrCore stats: coreName : startTime : Thu Mar 12 17:13:15 CDT 2009 refCount : 2 aliases : [] name: searcher class: org.apache.solr.search.SolrIndexSearcher version:1.0 description:index searcher stats: searcherName : searc...@66952905 main caching : true numDocs : 187169908 maxDoc : 187169908 readerImpl : ReadOnlyMultiSegmentReader readerDir : org.apache.lucene.store.FSDirectory@/opt/solr/data/index indexVersion : 1224609883675 openedAt : Thu Mar 12 17:13:15 CDT 2009 registeredAt : Thu Mar 12 17:13:23 CDT 2009 warmupTime : 0 Jeremy Carroll Sr. Network Engineer Networked Insights
Re: DIH use of the ?command=full-import entity= command option
Wouldn't an entity be something such as a stream, a DB, or a manifest-channel? The name "source" would be better to me, but... there are the SQL data-sources. paul On 12 Mar 2009 at 22:47, Fergus McMenemie wrote: > [question quoted in full above; trimmed]
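For what it's worth, the usual shape is this (a sketch; entity and table names are illustrative): data-config.xml may define several root entities under <document>, and the entity parameter selects which of them a full-import runs; with no entity parameter, all of them run:

<document>
  <entity name="articles" query="select * from articles"/>
  <entity name="products" query="select * from products"/>
</document>

http://localhost:8983/solr/dataimport?command=full-import&entity=products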
Re: OR/NOT query syntax
I might be wrong on this, but since you can't do a query that's just a NOT statement, this wouldn't work either. I believe the NOT must negate results of a query, not the entire dataset. On Wed, Mar 11, 2009 at 6:56 PM, Andrew Wall wrote: > I'm attempting to write a solr query that ensures that if one field has a > particular value that another field also have a particular value. I've > arrived at this syntax, but it doesn't seem to work correctly. > > ((myField:superneat AND myOtherField:somethingElse) OR NOT myField:superneat) > > either operand functions correctly on its own - but not when joined together > with the "or not" condition. I don't understand why this syntax doesn't > work - can someone shed some light on this? > > Thanks! > Andrew Wall > -- Jonathan Haddad http://www.rustyrazorblade.com
Re: Issues with stale searchers.
On Thu, Mar 12, 2009 at 6:29 PM, Jeremy Carroll wrote:
> I have Solr 1.3 running on Apache Tomcat 5.5.27. I'm running into an issue
> where searchers are opened right away when Tomcat starts, and never go
> away. This is causing read locks on the Lucene index, holding open deleted
> files during merges.

Deleted files being held open can be normal - that's the current IndexSearcher serving requests (even though those files may have been deleted by the IndexWriter already).

Looking at your Stats, I only see one Searcher, so things look fine there too.

-Yonik
http://www.lucidimagination.com

> This causes our server to run out of disk space on the index volume.
> Wondering what is causing this issue, as I have been searching for two
> days without any real answers.
>
> Thanks,
>
> LSOF Output
> java 7322 tomcat 70r REG 253,0 2569538 2883610 /opt/solr/data/index/_m5n.cfs (deleted)
> java 7322 tomcat 71r REG 253,0 2338291 2883609 /opt/solr/data/index/_m5m.cfs (deleted)
> java 7322 tomcat 72r REG 253,0 13398930 2883608 /opt/solr/data/index/_m5l.cfs (deleted)
> java 7322 tomcat 73r REG 253,0 2692917 2883598 /opt/solr/data/index/_m5k.cfs (deleted)
> java 7322 tomcat 74r REG 253,0 32324600 2883592 /opt/solr/data/index/_m5j.cfx (deleted)
> java 7322 tomcat 75r REG 253,0 6767344 2883603 /opt/solr/data/index/_m5j.cfs (deleted)
> java 7322 tomcat 76r REG 253,0 32324600 2883592 /opt/solr/data/index/_m5j.cfx (deleted)
> java 7322 tomcat 77r REG 253,0 15937346 2883600 /opt/solr/data/index/_m5i.cfs (deleted)
>
> Stats page on Solr Admin
>
> searc...@66952905 main
> class: org.apache.solr.search.SolrIndexSearcher
> version: 1.0
> description: index searcher
> stats: searcherName : searc...@66952905 main
> caching : true
> numDocs : 187169908
> maxDoc : 187169908
> readerImpl : ReadOnlyMultiSegmentReader
> readerDir : org.apache.lucene.store.FSDirectory@/opt/solr/data/index
> indexVersion : 1224609883675
> openedAt : Thu Mar 12 17:13:15 CDT 2009
> registeredAt : Thu Mar 12 17:13:23 CDT 2009
> warmupTime : 0
>
> name: core
> class:
> version: 1.0
> description: SolrCore
> stats: coreName :
> startTime : Thu Mar 12 17:13:15 CDT 2009
> refCount : 2
> aliases : []
>
> name: searcher
> class: org.apache.solr.search.SolrIndexSearcher
> version: 1.0
> description: index searcher
> stats: searcherName : searc...@66952905 main
> caching : true
> numDocs : 187169908
> maxDoc : 187169908
> readerImpl : ReadOnlyMultiSegmentReader
> readerDir : org.apache.lucene.store.FSDirectory@/opt/solr/data/index
> indexVersion : 1224609883675
> openedAt : Thu Mar 12 17:13:15 CDT 2009
> registeredAt : Thu Mar 12 17:13:23 CDT 2009
> warmupTime : 0
>
> Jeremy Carroll
> Sr. Network Engineer
> Networked Insights
Re: OR/NOT query syntax
On Wed, Mar 11, 2009 at 9:56 PM, Andrew Wall wrote: > I'm attempting to write a solr query that ensures that if one field has a > particular value that another field also have a particular value. I've > arrived at this syntax, but it doesn't seem to work correctly. > > ((myField:superneat AND myOtherField:somethingElse) OR NOT myField:superneat) Try (myField:superneat AND myOtherField:somethingElse) OR (*:* -myField:superneat) -Yonik http://www.lucidimagination.com
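The reason the rewritten query works: a clause that is only a NOT has no result set to subtract from, while *:* (match all documents) supplies the universe for the negation to operate on. The same idiom applies anywhere a standalone negative query is needed, for example as a filter query (field name reused from the example above):

  fq=*:* -myField:superneat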
RE: Issues with stale searchers.
If that's the case, it is causing out-of-disk issues with Solr. We have a 187M-document index which is about ~200GB in size. Over a period of about a week after optimizations, etc., the open-but-deleted file count grows very large, causing the system to not be able to optimize due to lack of disk space. Also, new documents that are indexed are not showing up in search results.

-----Original Message-----
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: Thursday, March 12, 2009 7:43 PM
To: solr-user@lucene.apache.org
Subject: Re: Issues with stale searchers.

On Thu, Mar 12, 2009 at 6:29 PM, Jeremy Carroll wrote:
> I have Solr 1.3 running on Apache Tomcat 5.5.27. I'm running into an issue
> where searchers are opened right away when Tomcat starts, and never go
> away. This is causing read locks on the Lucene index, holding open deleted
> files during merges.

Deleted files being held open can be normal - that's the current IndexSearcher serving requests (even though those files may have been deleted by the IndexWriter already).

Looking at your Stats, I only see one Searcher, so things look fine there too.

-Yonik
http://www.lucidimagination.com

> This causes our server to run out of disk space on the index volume.
> Wondering what is causing this issue, as I have been searching for two
> days without any real answers.
>
> Thanks,
>
> LSOF Output
> java 7322 tomcat 70r REG 253,0 2569538 2883610 /opt/solr/data/index/_m5n.cfs (deleted)
> java 7322 tomcat 71r REG 253,0 2338291 2883609 /opt/solr/data/index/_m5m.cfs (deleted)
> java 7322 tomcat 72r REG 253,0 13398930 2883608 /opt/solr/data/index/_m5l.cfs (deleted)
> java 7322 tomcat 73r REG 253,0 2692917 2883598 /opt/solr/data/index/_m5k.cfs (deleted)
> java 7322 tomcat 74r REG 253,0 32324600 2883592 /opt/solr/data/index/_m5j.cfx (deleted)
> java 7322 tomcat 75r REG 253,0 6767344 2883603 /opt/solr/data/index/_m5j.cfs (deleted)
> java 7322 tomcat 76r REG 253,0 32324600 2883592 /opt/solr/data/index/_m5j.cfx (deleted)
> java 7322 tomcat 77r REG 253,0 15937346 2883600 /opt/solr/data/index/_m5i.cfs (deleted)
>
> Stats page on Solr Admin
>
> searc...@66952905 main
> class: org.apache.solr.search.SolrIndexSearcher
> version: 1.0
> description: index searcher
> stats: searcherName : searc...@66952905 main
> caching : true
> numDocs : 187169908
> maxDoc : 187169908
> readerImpl : ReadOnlyMultiSegmentReader
> readerDir : org.apache.lucene.store.FSDirectory@/opt/solr/data/index
> indexVersion : 1224609883675
> openedAt : Thu Mar 12 17:13:15 CDT 2009
> registeredAt : Thu Mar 12 17:13:23 CDT 2009
> warmupTime : 0
>
> name: core
> class:
> version: 1.0
> description: SolrCore
> stats: coreName :
> startTime : Thu Mar 12 17:13:15 CDT 2009
> refCount : 2
> aliases : []
>
> name: searcher
> class: org.apache.solr.search.SolrIndexSearcher
> version: 1.0
> description: index searcher
> stats: searcherName : searc...@66952905 main
> caching : true
> numDocs : 187169908
> maxDoc : 187169908
> readerImpl : ReadOnlyMultiSegmentReader
> readerDir : org.apache.lucene.store.FSDirectory@/opt/solr/data/index
> indexVersion : 1224609883675
> openedAt : Thu Mar 12 17:13:15 CDT 2009
> registeredAt : Thu Mar 12 17:13:23 CDT 2009
> warmupTime : 0
>
> Jeremy Carroll
> Sr. Network Engineer
> Networked Insights
SolrJ : EmbeddedSolrServer and database data indexing
Is it possible to index DB data directly into Solr using EmbeddedSolrServer? I tried using a data-config file and the full-import command, and it works, so I assume using CommonsHttpSolrServer will also work. But can I do it with EmbeddedSolrServer?

Thanks in advance...
Ashish
--
View this message in context: http://www.nabble.com/SolrJ-%3A-EmbeddedSolrServer-and-database-data-indexing-tp22488697p22488697.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Issues with stale searchers.
On Thu, Mar 12, 2009 at 9:38 PM, Jeremy Carroll wrote:
> If that's the case, it is causing out-of-disk issues with Solr. We have a
> 187M-document index which is about ~200GB in size. Over a period of about a
> week after optimizations, etc., the open-but-deleted file count grows very
> large, causing the system to not be able to optimize due to lack of disk
> space. Also, new documents that are indexed are not showing up in search
> results.

Multiply the index size by 3 to get the max disk space:
- 1 for the index currently open for searching
- up to 1 for new segments written by the index writer (including merges)
- up to 1 when the index writer does major merges or optimizes (the index
  writer can't delete the old segment files until it's sure that the new
  index has been written successfully)

That said, what you are seeing could be normal, or could be a bug.

-Yonik
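As a rough worked example against the numbers above: with a ~200GB index, the worst case is about 3 × 200GB = 600GB, i.e. the 200GB live index plus up to ~400GB of transient segment data while an optimize or major merge completes.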
Re: SolrJ : EmbeddedSolrServer and database data indexing
Is there any API in SolrJ that calls the DataImportHandler to execute commands like full-import and delta-import?

Please help..

Ashish P wrote:
>
> Is it possible to index DB data directly into Solr using EmbeddedSolrServer?
> I tried using a data-config file and the full-import command, and it works,
> so I assume using CommonsHttpSolrServer will also work. But can I do it with
> EmbeddedSolrServer?
>
> Thanks in advance...
> Ashish
>
--
View this message in context: http://www.nabble.com/SolrJ-%3A-EmbeddedSolrServer-and-database-data-indexing-tp22488697p22489420.html
Sent from the Solr - User mailing list archive at Nabble.com.
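As far as I know there is no dedicated DIH API in SolrJ; the usual workaround is to issue a plain request against the handler's path. A minimal sketch, assuming a handler registered at /dataimport in solrconfig.xml (EmbeddedSolrServer should route the same request in-process, though that is untested here):

  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.client.solrj.request.QueryRequest;
  import org.apache.solr.common.params.ModifiableSolrParams;

  public class DihTrigger {
    public static void main(String[] args) throws Exception {
      SolrServer server =
          new CommonsHttpSolrServer("http://localhost:8983/solr");
      ModifiableSolrParams params = new ModifiableSolrParams();
      params.set("command", "full-import");  // or "delta-import"
      QueryRequest req = new QueryRequest(params);
      req.setPath("/dataimport");            // path of the DIH request handler
      System.out.println(server.request(req));  // prints DIH's status response
    }
  }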
Re: DIH use of the ?command=full-import entity= command option
On Fri, Mar 13, 2009 at 3:17 AM, Fergus McMenemie wrote:

> Hello,
>
> Can anybody describe the intended purpose, or provide a
> few examples, of how the DIH entity= command option works.
>
> Am I supposed to build a data-conf.xml file which contains
> many different alternate entities.. or
>

With the entity parameter you can specify the name of any root entity and import only that one. You can specify multiple entity parameters too. For example:
/dataimport?command=full-import&entity=x&entity=y

You may need to specify preImportDeleteQuery separately on each entity so that a full-import of one entity does not delete the documents indexed by the others.
--
Regards,
Shalin Shekhar Mangar.
Re: DIH use of the ?command=full-import entity= command option
If my data-config.xml contains multiple root-level entities, what is the expected action if I call full-import without an entity=XXX sub-command?

Does it process all entities one after the other, or only the first? (It would be useful IMHO if it only did the first.)

>On Fri, Mar 13, 2009 at 3:17 AM, Fergus McMenemie wrote:
>
>> Hello,
>>
>> Can anybody describe the intended purpose, or provide a
>> few examples, of how the DIH entity= command option works.
>>
>> Am I supposed to build a data-conf.xml file which contains
>> many different alternate entities.. or
>>
>
>With the entity parameter you can specify the name of any root entity and
>import only that one. You can specify multiple entity parameters too. For
>example:
>/dataimport?command=full-import&entity=x&entity=y
>
>You may need to specify preImportDeleteQuery separately on each entity so
>that a full-import of one entity does not delete the documents indexed by
>the others.
>--
>Regards,
>Shalin Shekhar Mangar.

--
===============================
Fergus McMenemie               Email: fer...@twig.me.uk
Techmore Ltd                   Phone: (UK) 07721 376021
Unix/Mac/Intranets             Analyst Programmer
===============================
Re: CJKAnalyzer and Chinese Text sort
Thanks Hoss for your comments! I don't mind submitting it as a patch; shall I create an issue in Jira and submit the patch with that?

Also, I didn't modify core Solr for locale-based sorting; I just created a jar file with the class file and copied it over to the lib folder. As part of the patch, shall I add it to the core Solr code-base (users who want to use this don't need to do anything extra) or add it as a contrib module (they need to compile it as a jar and copy it over to the lib folder)?

Thanks!

-- Original Message --
From: Chris Hostetter
To: solr-user@lucene.apache.org
Subject: Re: CJKAnalyzer and Chinese Text sort
Date: Wed, 11 Mar 2009 15:50:40 -0700 (PDT)

First off: you can't sort on a field where any doc has more than one token -- that's why sorting on a TextField doesn't work unless you use something like the KeywordTokenizer.

Second...

: I found out that the reason the strings are not getting sorted is because
: there is no way to pass the locale information to StrField, I ended up
: extending StrField to take an additional attribute in schema.xml and
: then had to override the getSortString method wherein I create a new
: Locale based on the schema attribute and pass it to the StrField. I put
: this newly created jar file in the lib folder and everything seems to be
: working fine after that. Since my java knowledge is almost zilch, I was
: wondering is this the right way to solve this problem or is there any
: other recommended approach for this?

I don't remember what the state of Locale-based sorting is, but the modifications you describe sound right based on what i remember ... would you be interested in submitting them back as a patch?

http://wiki.apache.org/solr/HowToContribute

-Hoss
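The locale-sensitive comparison at the heart of such a patch is plain JDK; a minimal standalone sketch of the idea (terms invented; the Solr hook points are left out since, as discussed above, the exact method names are uncertain):

  import java.text.Collator;
  import java.util.Arrays;
  import java.util.Locale;

  public class LocaleSortSketch {
    public static void main(String[] args) {
      String[] terms = { "苹果", "手机", "购买" };  // hypothetical field values
      // A Collator orders strings by the locale's collation rules
      // rather than by raw code-point order.
      Collator collator = Collator.getInstance(Locale.SIMPLIFIED_CHINESE);
      Arrays.sort(terms, collator);
      System.out.println(Arrays.toString(terms));
    }
  }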
Re: DIH use of the ?command=full-import entity= command option
On Fri, Mar 13, 2009 at 10:44 AM, Fergus McMenemie wrote: > If my data-config.xml contains multiple root level entities > what is the expected action if I call full-import without an > entity=XXX sub-command? > > Does it process all entities one after the other or only the > first? (It would be useful IMHO if it only did the first.) > It processes all entities one after the other. If you want to import only one, use the entity parameter. -- Regards, Shalin Shekhar Mangar.
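For illustration, a hypothetical data-config.xml matching the x/y example above, with two root entities directly under <document> (data source, table, and column names invented):

  <dataConfig>
    <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/db" />
    <document>
      <!-- root entity "x": full-import&entity=x imports only this one -->
      <entity name="x" query="select id, title from table_x">
        <field column="id" name="id" />
        <field column="title" name="title" />
      </entity>
      <!-- root entity "y": imported by entity=y, or together with x
           by a bare full-import -->
      <entity name="y" query="select id, title from table_y">
        <field column="id" name="id" />
        <field column="title" name="title" />
      </entity>
    </document>
  </dataConfig>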
Re: input XSLT
There is a fundamental problem with using the 'pull' approach of DIH. Normally people want delta imports, which are done using a timestamp field. Now it may not always be possible for application servers to sync their timestamps (given protocol restrictions due to security reasons). Because of this, the Solr application is likely to miss a few records occasionally. Such a problem does not arise if applications themselves identify their records and post them. Should we not have such a feature in Solr, which would allow users to push data onto the index in whichever format they wish? This would also facilitate plugging Solr seamlessly into all kinds of applications.

Regards,
CI

On Wed, Mar 11, 2009 at 11:52 PM, Noble Paul നോബിള് नोब्ळ् < noble.p...@gmail.com> wrote:

> On Tue, Mar 10, 2009 at 12:17 PM, CIF Search wrote:
> > Just as you have an xslt response writer to convert Solr xml response to
> > make it compatible with any application, on the input side do you have an
> > xslt module that will parse xml documents to solr format before posting them
> > to solr indexer. I have gone through dataimporthandler, but it works in data
> > 'pull' mode i.e. solr pulls data from the given location. I would still want
> > to work with applications 'posting' documents to solr indexer as and when
> > they want.
> it is a limitation of DIH, but if you can put your xml in a file
> behind an http server then you can fire a command to DIH to pull data
> from the url quite easily.
> >
> > Regards,
> > CI
> >
>
> --
> --Noble Paul
>
Re: How to correctly boost results in Solr Dismax query
Hi Pete,

The bq parameter works with the q.alt query parameter. If you are passing the search criteria using the q.alt query parameter, then this bq parameter comes into the picture. Also, q.alt doesn't support field boosting. If you want to boost records by their field values, then you must use the q query parameter instead of q.alt. The q parameter actually uses the qf parameters from solrconfig.xml for field boosting.

Let me know if you have any questions.

Thanks,
Amit Garg

Pete Smith-3 wrote:
>
> Hi,
>
> I have managed to build an index in Solr which I can search on keyword,
> produce facets, query facets etc. This is all working great. I have
> implemented my search using a dismax query so it searches predetermined
> fields.
>
> However, my results are coming back sorted by score which appears to be
> calculated by keyword relevancy only. I would like to adjust the score
> where fields have pre-determined values. I think I can do this with
> boost query and boost functions but the documentation here:
>
> http://wiki.apache.org/solr/DisMaxRequestHandler#head-6862070cf279d9a09bdab971309135c7aea22fb3
>
> Is not particularly helpful. I tried adding a bq argument to my
> search:
>
> &bq=media:DVD^2
>
> (yes, this is an index of films!) but I find when I start adding more
> and more:
>
> &bq=media:DVD^2&bq=media:BLU-RAY^1.5
>
> I find the negative results - e.g. films that are DVD but are not
> BLU-RAY get negatively affected in their score. In the end it all seems
> to even out and my score is as it was before I started boosting.
>
> I must be doing this wrong and I wonder whether "boost function" comes
> in somewhere. Any ideas on how to correctly use boost?
>
> Cheers,
> Pete
>
> --
> Pete Smith
> Developer
>
> No.9 | 6 Portal Way | London | W3 6RU |
> T: +44 (0)20 8896 8070 | F: +44 (0)20 8896 8111
>
> LOVEFiLM.com
>
>
--
View this message in context: http://www.nabble.com/How-to-correctly-boost-results-in-Solr-Dismax-query-tp22476204p22490850.html
Sent from the Solr - User mailing list archive at Nabble.com.
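A hypothetical request along the lines of the advice above, assuming a dismax handler registered as qt=dismax in solrconfig.xml (field names and boost values invented):

  http://localhost:8983/solr/select?qt=dismax&q=matrix&qf=title^2.0+media^1.5+description^0.5

Here qf applies per-field boosts at query time; q.alt only supplies a fallback query (parsed by the standard parser, so without qf boosting) when q is absent.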
Re: input XSLT
On Fri, Mar 13, 2009 at 11:36 AM, CIF Search wrote: > There is a fundamental problem with using 'pull' approach using DIH. > Normally people want a delta imports which are done using a timestamp > field. > Now it may not always be possible for application servers to sync their > timestamps (given protocol restrictions due to security reasons). Due to > this Solr application is likely to miss a few records occasionally. Such a > problem does not arise if applications themseleves identify their records > and post. Should we not have such a feature in Solr, which will allow users > to push data onto the index in whichever format they wish to? This will > also > facilitate plugging in solr seamlessly with all kinds of applications. > You can of course push your documents to Solr using the XML/CSV update (or using the solrj client). It's just that you can't push documents with DIH. http://wiki.apache.org/solr/#head-98c3ee61c5fc837b09e3dfe3fb420491c9071be3 -- Regards, Shalin Shekhar Mangar.
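For completeness, a sketch of the push route mentioned above: the document must already be in Solr's update XML format, which any HTTP client (or the post.jar tool shipped with the Solr example) can POST to the /update handler, followed by a <commit/>. Field names here are hypothetical:

  <add>
    <doc>
      <field name="id">doc-1</field>
      <field name="title">pushed directly to the update handler</field>
    </doc>
  </add>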
Re: input XSLT
But these documents have to be converted to Solr's update format before being posted. An arbitrary XML document cannot be posted to Solr (there is no XSLT transformation handled by Solr internally on the input side). DIH handles any XML format, but it operates in pull mode.

On Fri, Mar 13, 2009 at 11:45 AM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote:

> On Fri, Mar 13, 2009 at 11:36 AM, CIF Search wrote:
>
> > There is a fundamental problem with using the 'pull' approach of DIH.
> > Normally people want delta imports, which are done using a timestamp
> > field.
> > Now it may not always be possible for application servers to sync their
> > timestamps (given protocol restrictions due to security reasons). Because
> > of this, the Solr application is likely to miss a few records occasionally.
> > Such a problem does not arise if applications themselves identify their
> > records and post them. Should we not have such a feature in Solr, which
> > would allow users to push data onto the index in whichever format they
> > wish? This would also facilitate plugging Solr seamlessly into all kinds
> > of applications.
> >
>
> You can of course push your documents to Solr using the XML/CSV update (or
> using the solrj client). It's just that you can't push documents with DIH.
>
> http://wiki.apache.org/solr/#head-98c3ee61c5fc837b09e3dfe3fb420491c9071be3
>
> --
> Regards,
> Shalin Shekhar Mangar.
>