Re: very slow add/commit time
How many MB of cache have you set in your solrconfig.xml?

On Tue, Nov 3, 2009 at 12:24 PM, Marc Des Garets wrote:
> Hi,
>
> I am experiencing a problem with an index of about 80 million documents
> (41 GB). I am trying to update documents in this index using SolrJ.
>
> When I do:
>
> solrServer.add(docs); // docs is a List that contains 1000 SolrInputDocument (takes 36 sec)
> solrServer.commit(false, false); // either never ends, with an OutOfMemory error, or takes forever
>
> I have -Xms4g -Xmx4g
>
> Any idea what could be the problem?
>
> Thanks for your help.
>
> --
> This transmission is strictly confidential, possibly legally privileged, and intended solely for the addressee. Any views or opinions expressed within it are those of the author and do not necessarily represent those of 192.com, i-CD Publishing (UK) Ltd or any of its subsidiary companies. If you are not the intended recipient then you must not disclose, copy or take any action in reliance on this transmission. If you have received this transmission in error, please notify the sender as soon as possible. No employee or agent is authorised to conclude any binding agreement on behalf of i-CD Publishing (UK) Ltd with another party by email without express written confirmation by an authorised employee of the Company. http://www.192.com (Tel: 08000 192 192). i-CD Publishing (UK) Ltd is incorporated in England and Wales, company number 3148549, VAT No. GB 673128728.

--
Bruno Morelli Vargas
Mail: brun...@gmail.com
Msn: brun...@hotmail.com
Icq: 165055101
Skype: morellibmv
Re: very slow add/commit time
Try raising your ramBufferSizeMB (it helped a lot when my team had performance issues), and also try checking this link (it helps a lot): http://wiki.apache.org/solr/SolrPerformanceFactors

Regards

On Tue, Nov 3, 2009 at 12:38 PM, Marc Des Garets wrote:
> If you mean ramBufferSizeMB, I have it set to 512. maxBufferedDocs is commented out. If you mean queryResultMaxDocsCached, it is set to 200, but is it used when indexing?
>
> -Original Message-
> From: Bruno [mailto:brun...@gmail.com]
> Sent: 03 November 2009 14:27
> To: solr-user@lucene.apache.org
> Subject: Re: very slow add/commit time
>
> How many MB of cache have you set in your solrconfig.xml?
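The ramBufferSizeMB setting discussed in this exchange lives in solrconfig.xml. A minimal illustrative fragment for the Solr 1.x-era layout (the 512 value is the one Marc reports, not a recommendation; the element's placement varies between Solr versions):

```xml
<!-- solrconfig.xml (Solr 1.x layout); illustrative fragment only -->
<indexDefaults>
  <!-- MB of RAM Lucene buffers before flushing a segment to disk;
       larger values mean fewer flushes during heavy indexing -->
  <ramBufferSizeMB>512</ramBufferSizeMB>
</indexDefaults>
```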
Grouping
Is there a way to make a group-by or distinct query?
Re: SolrJ: Highlighting not Working
I've tried with the default values and it didn't work either.

On Thu, Jun 18, 2009 at 2:31 PM, Mark Miller wrote:
> Why do you have:
> query.set("hl.maxAnalyzedChars", -1);
>
> Have you tried using the default? Unless -1 is an undocumented feature, this means you wouldn't get anything back! This should normally be a fairly hefty value; it defaults to 51200, according to the wiki.
>
> And why:
> query.set("hl.fragsize", 1);
>
> That means a fragment could only be 1 char - again, I'd try the default (take out the param) and adjust from there. (The wiki says the default is 100.)
>
> Let us know how it goes.
>
> --
> - Mark
> http://www.lucidimagination.com
>
> Bruno wrote:
>> Hi guys,
>> I'm new to using highlighting, so I'm probably making some stupid mistake, but I can't find anything wrong.
>> I use highlighting from a query within an EmbeddedSolrServer, and in the query I've set the parameters necessary to enable highlighting. Attached are my schema and solrconfig.xml; the Java code follows below. The content of the SolrDocumentList is not highlighted.
>>
>> EmbeddedSolrServer server = SolrServerManager.getServerEv();
>> String queryString = filter;
>> SolrQuery query = new SolrQuery();
>> query.setQuery(queryString);
>> query.setHighlight(true);
>> query.addHighlightField(LOG_FIELD);
>> query.setHighlightSimplePost("");
>> query.setHighlightSimplePre("");
>> query.set("hl.usePhraseHighlighter", true);
>> query.set("hl.highlightMultiTerm", true);
>> query.set("hl.snippets", 100);
>> query.set("hl.fragsize", 1);
>> query.set("hl.mergeContiguous", false);
>> query.set("hl.requireFieldMatch", false);
>> query.set("hl.maxAnalyzedChars", -1);
>> query.addSortField(DATE_FIELD, SolrQuery.ORDER.asc);
>> query.setFacetLimit(LogUtilProperties.getInstance().getProperty(LogUtilProperties.LOGEVENT_SEARCH_RESULT_SIZE, 1));
>> query.setRows(LogUtilProperties.getInstance().getProperty(LogUtilProperties.LOGEVENT_SEARCH_RESULT_SIZE, 1));
>> query.setIncludeScore(true);
>> QueryResponse rsp = server.query(query);
>> SolrDocumentList docs = rsp.getResults();
>>
>> --
>> Bruno Morelli Vargas
Re: SolrJ: Highlighting not Working
A couple of things I forgot to mention:

Solr version: 1.3
Environment: WebSphere

On Thu, Jun 18, 2009 at 2:34 PM, Bruno wrote:
> I've tried with the default values and it didn't work either.
Re: SolrJ: Highlighting not Working
Here is the query, searching for the term "ipod" in the "log" field:

q=log%3Aipod+AND+requestid%3A1029+AND+logfilename%3Apayxdev-1245272062125-USS.log.zip&hl=true&hl.fl=log&hl.fl=message&hl.simple.post=%3Ci%3E&hl.simple.pre=%3C%2Fi%3E&hl.usePhraseHighlighter=true&hl.highlightMultiTerm=true&hl.snippets=100&hl.fragsize=100&hl.mergeContiguous=false&hl.requireFieldMatch=false&hl.maxAnalyzedChars=-1&sort=timestamp+asc&facet.limit=6000&rows=6000&fl=score

On Thu, Jun 18, 2009 at 2:51 PM, Mark Miller wrote:
> Nothing off the top of my head...
>
> I can play around with some of the SolrJ unit tests a bit later and perhaps see if I can dig anything up.
>
> Note: if you expect wildcard/prefix/etc. queries to highlight, they will not with Solr 1.3.
>
> query.set("hl.highlightMultiTerm", true);
>
> The above only applies to Solr 1.4. So if your query is just a wildcard...
>
> What is your query, by the way?
>
> --
> - Mark
> http://www.lucidimagination.com
Re: SolrJ: Highlighting not Working
I've checked the NamedList you told me about, but it contains only one highlighted doc, when I have more docs that should be highlighted.

On Thu, Jun 18, 2009 at 3:03 PM, Erik Hatcher wrote:
> Note that highlighting is NOT part of the document list returned. It's in an additional NamedList section of the response (with name="highlighting").
>
> Erik
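Erik's point is the key API detail: with SolrJ the snippets are not on the SolrDocumentList at all; QueryResponse.getHighlighting() returns a Map<String, Map<String, List<String>>> keyed first by each document's uniqueKey, then by field name. The lookup shape can be shown with a self-contained stand-in for the response (no server needed; the "doc-42"/"log" names and the <em> markers are illustrative, not from the thread):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HighlightLookup {
    public static void main(String[] args) {
        // Stand-in for QueryResponse.getHighlighting():
        // uniqueKey -> (field name -> snippets)
        Map<String, Map<String, List<String>>> highlighting = new HashMap<>();
        highlighting.put("doc-42",
                Map.of("log", List.of("found <em>ipod</em> in cart")));

        // For each hit, look up its snippets by the document's uniqueKey,
        // then by the highlighted field.
        List<String> snippets = highlighting.get("doc-42").get("log");
        System.out.println(snippets.get(0)); // prints: found <em>ipod</em> in cart
    }
}
```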
Re: SolrJ: Highlighting not Working
Just figured out what happened... The schema must have a uniqueKey set; otherwise highlighting will have one entry or fewer, because the map's key is the doc's uniqueKey. While debugging I found that the QueryResponse tries to put all the highlighted results into a map with a null key; in the end, putting tons of entries all with a null key results in a one-entry map.

Thanks for the help, guys.

On Thu, Jun 18, 2009 at 3:17 PM, Bruno wrote:
> I've checked the NamedList you told me about, but it contains only one highlighted doc, when I have more docs that should be highlighted.
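The map collapse described above can be reproduced with a plain java.util.HashMap (a self-contained illustration of the behaviour, not SolrJ code):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NullKeyCollapse {
    public static void main(String[] args) {
        // Simulate building the highlighting map when the schema has no
        // uniqueKey: every document's key is null, so each put()
        // overwrites the previous entry.
        Map<String, List<String>> highlighting = new HashMap<>();
        highlighting.put(null, List.of("snippet for doc 1"));
        highlighting.put(null, List.of("snippet for doc 2"));
        highlighting.put(null, List.of("snippet for doc 3"));

        // Tons of entries, all with a null key, collapse to one.
        System.out.println(highlighting.size()); // prints 1
    }
}
```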
Re: Slowness during submit the index
> ...d your email correctly, but I think you are saying you are indexing your DB content into a Solr index. If this is correct, here are things to look at:
> * is the Java version the same on both machines (QA vs. PROD)
> * are the same Java parameters being used on both machines
> * is the connection to the DB the same on both machines
> * are both the PROD and QA DB servers the same, and are both DB instances the same
> ...
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> - Original Message
>> From: Francis Yakin
>> To: "solr-user@lucene.apache.org"
>> Sent: Friday, June 19, 2009 5:27:59 PM
>> Subject: Slowness during submit the index
>>
>> We are experiencing slowness when reloading/resubmitting the index from the database to the master.
>>
>> We have two environments: QA and Prod. The slowness happens only in Production, not in QA. It takes only one hour to reload 2.5 million documents in QA, compared to 5-6 hours to load the same size of index in Prod.
>>
>> I checked the config files in QA and Prod; they are all identical, except:
>>
>> In QA: false
>> In Prod: true
>>
>> I believe that we use the "http" protocol to reload/submit the index from the database to the Solr master. I did test copying big files over the network from the database to the Solr box, and I don't see any issue.
>>
>> We are running Solr 1.2.
>>
>> Any input will be much appreciated.

--
Sent from my mobile
Reading a parameter from a String.
I need to change a parameter within a query string:

:* AND requestid:100 AND timestamp:[2010-04-13T20:30:00.000Z TO 2010-04-13T21:00:00.000Z] AND source:"LogCollector-risidev3was2.201002020100._opt_ISI_logs.FNM.stdout_ISIREG_10.02.01_02.00.00.txt.tar.gz-stdout_ISIREG_10.02.01_02.00.00.txt.FNM.risidev3was2_opt_ISI_logs.201002020100"

In this case I have to change the timestamp parameters. Is there a way?
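One way to do this client-side, assuming the goal is to swap the range before resubmitting the query, is to rewrite the timestamp:[... TO ...] clause with a regular expression. A sketch (the class and method names are mine, not from any Solr API):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RewriteTimestamp {
    // Matches a Solr range clause on the timestamp field, e.g.
    // timestamp:[2010-04-13T20:30:00.000Z TO 2010-04-13T21:00:00.000Z]
    private static final Pattern TS_RANGE =
            Pattern.compile("timestamp:\\[[^\\]]+\\]");

    public static String replaceTimestampRange(String query, String from, String to) {
        return TS_RANGE.matcher(query)
                       .replaceFirst(Matcher.quoteReplacement(
                               "timestamp:[" + from + " TO " + to + "]"));
    }

    public static void main(String[] args) {
        String q = "*:* AND requestid:100 AND "
                 + "timestamp:[2010-04-13T20:30:00.000Z TO 2010-04-13T21:00:00.000Z]";
        // Replaces the old range with a new one, leaving the rest intact.
        System.out.println(replaceTimestampRange(q,
                "2010-04-14T08:00:00.000Z", "2010-04-14T09:00:00.000Z"));
    }
}
```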
document support for file system crawling
Hi there, browsing through the message threads I tried to find one addressing file system crawls. I want to implement an enterprise search over a networked filesystem, crawling all sorts of documents, such as HTML, DOC, PPT and PDF. Nutch provides plugins enabling it to read proprietary formats. Is there support for the same functionality in Solr?

Bruno

--
View this message in context: http://www.nabble.com/document-support-for-file-system-crawling-tf2188066.html#a6053318
Sent from the Solr - User forum at Nabble.com.
Newbie: Searching across 2 collections ?
Hi All,

Solr 5.4, Ubuntu

I thought it would be simple to query across two collections with the same schema, but apparently not. I have one Solr instance running, with 300,000 records in each collection. I try to use this request, but do not get results from both collections:

http://my_adress:my_port/solr/C1/select?collection=C1,C2&q=fid:34520196&wt=json

This request returns only C1 results, and if I do:

http://my_adress:my_port/solr/C2/select?collection=C1,C2&q=fid:34520196&wt=json

it returns only C2 results.

I have 5 identical fields in both collections (id, fid, st, cc, timestamp), where id is the unique key field.

Can someone explain why it doesn't work?

Thanks a lot!

Bruno

---
This email has been checked for viruses by Avast antivirus software.
http://www.avast.com
Re: Newbie: Searching across 2 collections ?
Yes, the id values are unique in C1 and unique in C2:
an id in C1 is never present in C2;
an id in C2 is never present in C1.

On 06/01/2016 11:12, Binoy Dalal wrote:
> Are the id values for docs in both collections exactly the same? To get proper results, the ids should be unique across both the cores.
>
> --
> Regards,
> Binoy Dalal
Re: Newbie: Searching across 2 collections ?
Hi Susheel, Emir,

Yes, I checked, and I have one result in c1 and one in c2 with the same query fid:34520196.

http://xxx.xxx.xxx.xxx:/solr/c1/select?q=fid:34520196&wt=json&indent=true&fl=id,fid,cc*,st&collection=c1,c2

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "fl":"fid,cc*,st",
      "indent":"true",
      "q":"fid:34520196",
      "collection":"c1,c2",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"EP1680447",
        "st":"LAPSED",
        "fid":"34520196"}]
  }}

http://xxx.xxx.xxx.xxx:/solr/c2/select?q=fid:34520196&wt=json&indent=true&fl=id,fid,cc*,st&collection=c1,c2

{
  "responseHeader":{
    "status":0,
    "QTime":0,
    "params":{
      "fl":"id,fid,cc*,st",
      "indent":"true",
      "q":"fid:34520196",
      "collection":"c1,c2",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"WO2005040212",
        "st":"PENDING",
        "cc_CA":"LAPSED",
        "cc_EP":"LAPSED",
        "cc_JP":"PENDING",
        "cc_US":"LAPSED",
        "fid":"34520196"}]
  }}

I have the same xxx.xxx.xxx.xxx: (server:port) for both. The unique key field in C1 and C2 is id, and the id data in C1 is different from the id data in C2. Must I configure something in Solr?

Thanks,
Bruno

On 06/01/2016 14:56, Emir Arnautovic wrote:
> Hi Bruno,
>
> Can you check the counts? Is it possible that the first page contains only results from the collection you sent the request to, so you assumed it returns only results from a single collection?
>
> Thanks,
> Emir
>
> On 06.01.2016 14:33, Susheel Kumar wrote:
>> Hi Bruno,
>>
>> I just tested this scenario on my local Solr 5.3.1 and it returned results from two identical collections. I doubt it is broken in 5.4; just double-check that you are not missing anything else.
>>
>> Thanks,
>> Susheel
>>
>> http://localhost:8983/solr/c1/select?q=id_type%3Ahello&wt=json&indent=true&collection=c1,c2
>>
>> "responseHeader": {"status": 0, "QTime": 98, "params": {"q": "id_type:hello", "indent": "true", "collection": "c1,c2", "wt": "json"}},
>> "response": {"numFound": 2, "start": 0, "maxScore": 1, "docs": [{"id": "1", "id_type": "hello", "_version_": 1522623395043213300}, {"id": "3", "id_type": "hello", "_version_": 1522623422397415400}]}
Re: Newbie: Searching across 2 collections ?
I have a dev server; I will do some tests on it...

On 06/01/2016 17:31, Susheel Kumar wrote:
> I'd suggest you set up some test data locally and try this out. That will confirm your understanding.
>
> Thanks,
> Susheel
Re: Newbie: Searching across 2 collections ?
Same result on my dev server; it seems the collection param has no effect on the query... Q: I don't see the "collection" param for the select handler in the Solr 5.4 doc. Is it still present in version 5.4?

On 06/01/2016 17:38, Bruno Mannina wrote:
> I have a dev server, I will do some tests on it...

On 06/01/2016 17:31, Susheel Kumar wrote:
> I suggest you set up some test data locally and try this out. That will confirm your understanding. Thanks, Susheel

On Wed, Jan 6, 2016 at 10:39 AM, Bruno Mannina wrote:
> Hi Susheel, Emir,
> Yes, I checked, and I have one result in c1 and one in c2 with the same query fid:34520196:
>
> http://xxx.xxx.xxx.xxx:/solr/c1/select?q=fid:34520196&wt=json&indent=true&fl=id,fid,cc*,st&collection=c1,c2
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":1,
>     "params":{
>       "fl":"fid,cc*,st",
>       "indent":"true",
>       "q":"fid:34520196",
>       "collection":"c1,c2",
>       "wt":"json"}},
>   "response":{"numFound":1,"start":0,"docs":[
>     {
>       "id":"EP1680447",
>       "st":"LAPSED",
>       "fid":"34520196"}]
>   }
> }
>
> http://xxx.xxx.xxx.xxx:/solr/c2/select?q=fid:34520196&wt=json&indent=true&fl=id,fid,cc*,st&collection=c1,c2
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":0,
>     "params":{
>       "fl":"id,fid,cc*,st",
>       "indent":"true",
>       "q":"fid:34520196",
>       "collection":"c1,c2",
>       "wt":"json"}},
>   "response":{"numFound":1,"start":0,"docs":[
>     {
>       "id":"WO2005040212",
>       "st":"PENDING",
>       "cc_CA":"LAPSED",
>       "cc_EP":"LAPSED",
>       "cc_JP":"PENDING",
>       "cc_US":"LAPSED",
>       "fid":"34520196"}]
> }}
>
> I have the same xxx.xxx.xxx.xxx: (server:port) for both. The unique key field in C1 and C2 is id, and the id data in C1 is different from the id data in C2. Must I configure something in Solr? Thanks, Bruno

On 06/01/2016 14:56, Emir Arnautovic wrote:
> Hi Bruno,
> Can you check the counts? Is it possible that the first page contains only results from the collection you sent the request to, so you assumed it returns only results from a single collection?
Thanks, Emir

On 06.01.2016 14:33, Susheel Kumar wrote:
> Hi Bruno, I just tested this scenario in my local Solr 5.3.1 and it returned results from two identical collections. I doubt it is broken in 5.4; just double-check that you are not missing anything else. Thanks, Susheel
Re: Newbie: Searching across 2 collections ?
Hi Esther,

Yes, I saw it, but if I use q={!join from=fid to=fid}fid:34520196 (with or without &collection=c1,c2), I get only the result from the collection used in the select (c1).

On 06/01/2016 17:52, esther.quan...@lucidworks.com wrote:
> Hi Bruno,
> You might consider using the JoinQueryParser. Details here:
> https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-JoinQueryParser
> Best, Esther
Re: Newbie: Searching across 2 collections ?
:( It doesn't work for me:

http://my_adress:my_port/solr/c1/select?q={!join from=fid to=fid fromIndex=c2}fid:34520196&wt=json

The result is always the same: it answers only for c1, yet 34520196 has a result in both collections.

On 06/01/2016 18:16, Binoy Dalal wrote:
> Bruno, use join like so: {!join from=f1 to=f2 fromIndex=c2} on c1.
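A note on why the join above still returns only c1 documents: the {!join} parser with fromIndex filters documents in the core you query by values found in the other core; it never returns documents from the fromIndex core, so it cannot merge two cores' results. A minimal sketch (Python; host, core and field names are placeholders, not Bruno's real setup) of how such a join request URL is assembled:

```python
from urllib.parse import urlencode

def join_query_url(base, to_core, from_core, field, value):
    """Build a cross-core join query URL.

    {!join fromIndex=...} only FILTERS documents in `to_core` by
    matching values found in `from_core`; it never returns documents
    from `from_core` itself, so it is not a way to search both cores.
    """
    params = {
        "q": "{!join from=%s to=%s fromIndex=%s}%s:%s"
             % (field, field, from_core, field, value),
        "wt": "json",
    }
    return "%s/solr/%s/select?%s" % (base, to_core, urlencode(params))

url = join_query_url("http://localhost:8983", "c1", "c2", "fid", "34520196")
print(url)
```

Note also that this form of join generally requires both cores to live in the same Solr instance, which matches Bruno's single-instance setup.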
Re: Newbie: Searching across 2 collections ?
Yeah! It works with your method! Thanks a lot, Esther!

On 06/01/2016 19:15, Esther-Melaine Quansah wrote:
> OK, so join won't work. Distributed search is your answer. This worked for me:
> http://localhost:8983/solr/temp/select?shards=localhost:8983/solr/job,localhost:8983/solr/temp&q=*:*
> So for you it would look something like:
> http://localhost:8983/solr/c1/select?shards=localhost:8983/solr/c1,localhost:8983/solr/c2&q=fid:34520196
> Obviously, you'll just choose the ports that correspond to your configuration.
> Esther
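Esther's shards approach can be scripted. A minimal sketch (Python; host, port and core names are illustrative placeholders) that builds the old-style distributed-search URL:

```python
from urllib.parse import urlencode

def distributed_search_url(host, port, cores, query):
    """Build an old-style distributed search URL across several cores.

    Each entry in the shards parameter is host:port/solr/<core>;
    Solr fans the query out to every listed shard and merges results.
    The request itself can be sent to any one of the cores.
    """
    shards = ",".join("%s:%d/solr/%s" % (host, port, c) for c in cores)
    params = {"shards": shards, "q": query, "wt": "json"}
    return "http://%s:%d/solr/%s/select?%s" % (
        host, port, cores[0], urlencode(params))

url = distributed_search_url("localhost", 8983, ["c1", "c2"], "fid:34520196")
print(url)
```

One caveat worth remembering with this approach: the unique key must be unique across all shards (as confirmed earlier in the thread), or merged results can be wrong.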
Re: Newbie: Searching across 2 collections ?
Hi Shawn, thanks for this info. I use Solr standalone on my own server.

On 06/01/2016 20:13, Shawn Heisey wrote:
> Are you running in SolrCloud mode (with ZooKeeper)? If you're not, then the collection parameter doesn't do anything, and old-style distributed search (with the shards parameter) will be your only option.
> Thanks, Shawn
Re: Newbie: Searching across 2 collections ?
Hi, is it possible that the problem was the one Shawn described, i.e. that you are running in SolrCloud mode (with ZooKeeper) and I am not? The solution given by Esther works fine, so it's OK for me :)

*** Are you running in SolrCloud mode (with ZooKeeper)? If you're not, then the collection parameter doesn't do anything, and old-style distributed search (with the shards parameter) will be your only option. Thanks, Shawn ***

On 06/01/2016 19:17, Susheel Kumar wrote:
> Hi Bruno, I just tested on 5.4 for your sake and it works fine. You are goofing up somewhere. Please create a new simple schema, different from your use case, with 2-3 fields and 2-3 documents, and test this out independently of your current problem. That's the suggestion I can make, and I did the same to confirm this.
Wildcard "?" ?
Dear Solr users,

I'm surprised to see in my Solr 5.0 that the wildcard ? necessarily replaces exactly 1 character. My request is:

title:magnet? AND tire?

Solr finds only titles with a character after "magnet" and "tire", but doesn't find titles with only "magnet" AND "tire". Do you know how I can tell Solr that the ? wildcard means [0,1] characters and not exactly [1] character? Is it possible? Thanks a lot!

My field in my schema is defined like this:

Field: title
Field-Type: org.apache.solr.schema.TextField
Position Increment Gap: 100

Flags:       Indexed  Tokenized  Stored  Multivalued
Properties      y         y         y        y
Schema          y         y         y        y
Index           y         y         y

* org.apache.solr.analysis.TokenizerChain
* org.apache.solr.analysis.TokenizerChain
Re: Wildcard "?" ?
title:/magnet.?/ doesn't work for me, because Solr answers:

title = "Magnetic folding system"

But thanks for giving me the idea to use regexp!

On 21/10/2015 18:46, Upayavira wrote:
> No, you cannot tell Solr to handle wildcards differently. However, you can use regular expressions for searching: title:/magnet.?/ should do it.
> Upayavira
Re: Wildcard "?" ?
Upayavira,

Thanks a lot for this information.

Regards,
Bruno

On 21/10/2015 19:24, Upayavira wrote:
> A regexp will match the whole term. So, if you have stemming on, "magnetic" may well stem to "magnet", and that is the term against which the regexp is executed. If you want to run the regexp against the whole field, then you need to run it against a string version of that field.
> The process of using a regexp (and a wildcard, for that matter) is:
> * search through the list of terms in your field for terms that match your regexp (uses an FST for speed)
> * search for documents that contain those resulting terms
> Upayavira
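Upayavira's point can be illustrated outside Solr: a Lucene regex query is implicitly anchored to the whole term, and .? matches zero or one character, while the ? wildcard always consumes exactly one. A rough illustration with Python's re module (the term list is invented; real index terms depend on your analyzer and stemmer):

```python
import re

# Terms as they might sit in the index after tokenizing/stemming.
terms = ["magnet", "magnets", "magnetic", "tire", "tired"]

# Lucene regex queries match the WHOLE term, so /magnet.?/ behaves
# like fullmatch: zero or one extra character after "magnet".
regex_hits = [t for t in terms if re.fullmatch(r"magnet.?", t)]

# The wildcard "magnet?" requires EXACTLY one extra character,
# which is why the bare term "magnet" itself is not matched.
wildcard_hits = [t for t in terms if re.fullmatch(r"magnet.", t)]

print(regex_hits)     # "magnet" and "magnets" qualify
print(wildcard_hits)  # only "magnets" qualifies
```

This also shows why Bruno still saw "Magnetic folding system": if stemming reduces "magnetic" to the index term "magnet", the regex matches that stemmed term, not the original word.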
Solr 3.6, Highlight and multi words?
Dear Solr users,

I'm trying to use highlighting. It works well, but only if I have a single keyword in my query. If my request is plastic AND bicycle, then only plastic is highlighted.

My request is:

./select/?q=ab%3A%28plastic+and+bicycle%29&version=2.2&start=0&rows=10&indent=on&hl=true&hl.fl=tien,aben&fl=pn&f.aben.hl.snippets=5

Could you help me understand? I've read the doc and googled without success, so I'm posting here. My result is:

(EP2423092A1) #CMT# #/CMT# The bicycle pedal has a pedal body (10) made from <em>plastic</em> material, particularly for touring bike. #CMT# ADVANTAGE: #/CMT# The bicycle pedal has a pedal body made from <em>plastic</em> between <em>plastic</em> tapes 3 and 3 having two heat fusion layers, and the two <em>plastic</em> tapes 3 and 3 are stuck elements. A connecting element is formed as a hinge, a flexible foil or a flexible <em>plastic</em> part. #CMT# USE A bicycle handlebar grip includes an inner fiber layer and an outer <em>plastic</em> layer. Thus, the fiber handlebar grip, while the <em>plastic</em> layer is soft and has an adjustable thickness to provide a comfortable sensation to a user. In addition, the <em>plastic</em> layer includes a holding portion coated on the outer surface layer to enhance the combination strength between the fiber layer and the <em>plastic</em> layer and to enhance

---
This email contains no viruses or malware because avast! Antivirus protection is active. http://www.avast.com
Re: Solr 3.6, Highlight and multi words?
Additional information: in my schema.xml, my field is defined like this:

Maybe it is missing something, like termVectors?
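The actual field definition did not survive the archive, so purely as a hedged illustration of what Bruno is asking about: a field used for highlighting is typically stored, and term vectors with positions and offsets can be enabled on top of that. A hypothetical schema.xml fragment (field name and type are placeholders, not Bruno's real schema):

```xml
<!-- Hypothetical example only; adjust name/type to the real schema. -->
<field name="aben" type="text_general" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
```

For what it's worth, the standard highlighter works on any stored field without term vectors; termVectors mainly matter for performance and for the FastVectorHighlighter, so a missing termVectors attribute alone would not explain only one term being highlighted.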
Re: Solr 3.6, Highlight and multi words?
Sorry to disturb you with this renewal, but does nobody use, or have problems with, multi-term highlighting?

Regards,
Re: Solr 3.6, Highlight and multi words?
Dear Charles,

Thanks for your answer; please find my answers below.

OK, it works if I use "aben" as the field in my query, as you say in answer 1. It doesn't work if I use "ab", maybe because the "ab" field is a copyField for abfr, aben, abit, abpt.

Concerning point 2, yes, you are right: it's not "and" but "AND". I now have this result:

<em>Bicycle</em> frame comprises holder, particularly for water bottle, where holder is connected #CMT# #/CMT# The <em>bicycle</em> frame (7) comprises a holder (1), particularly for a water bottle. The holder is connected with the <em>bicycle</em> frame by a screw (5), where a mounting element has a compensation section which is made of an elastic material, particularly a <em>plastic</em> material. The compensation section

So my last question: why do I get <em> tags instead of colored highlights? How can I tell Solr to use the colors?

Thanks a lot,
Bruno

On 01/04/2015 17:15, Reitzel, Charles wrote:
> I haven't used Solr 3.x in a long time, but with 4.10.x I haven't had any trouble with multiple terms. I'd look at a few things:
> 1. Do you have a typo in your query? Shouldn't it be q=aben:(plastic and bicycle)?
> 2. Try removing the word "and" from the query. There may be some interaction with a stop word filter. If you want a phrase query, wrap it in quotes.
> 3. Also, be sure that the query and indexing analyzers for the aben field are compatible with each other.

*This e-mail may contain confidential or privileged information. If you are not the intended recipient, please notify the sender immediately and then delete it. TIAA-CREF*
Re: Solr 3.6, Highlight and multi words?
OK for qf (I can't test right now), but concerning hl.simple.pre and hl.simple.post: I can define only one color, no? In the sample solrconfig.xml there are several colors. How can I tell Solr to use these colors instead of hl.simple.pre/post?

On 01/04/2015 20:58, Reitzel, Charles wrote:
If you want to query on the field ab, you'll probably need to add it to the qf parameter.

To control the highlighting markup, with the standard highlighter, use hl.simple.pre and hl.simple.post.
https://cwiki.apache.org/confluence/display/solr/Standard+Highlighter
Re: Solr 3.6, Highlight and multi words?
Of course, no problem, Charles, you have already helped me!

On 01/04/2015 21:54, Reitzel, Charles wrote:
Sorry, I've never tried highlighting in multiple colors...
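With the standard highlighter you pick one pre/post wrapper pair per request via hl.simple.pre/hl.simple.post; a colored wrapper simply replaces the default <em>. A sketch of building such a request in Python (host, port, core path, and the chosen color are assumptions, not from the thread):

```python
from urllib.parse import urlencode

# Hedged sketch: build a select request that asks the standard highlighter
# to wrap matches in a colored <b> tag instead of the default <em>.
params = {
    "q": "aben:(plastic AND bicycle)",
    "fl": "pn",
    "hl": "true",
    "hl.fl": "tien,aben",
    "f.aben.hl.snippets": "5",
    # One colored pre/post pair per request; values get URL-encoded below.
    "hl.simple.pre": '<b style="background:#ffff66">',
    "hl.simple.post": "</b>",
}
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```

Sending several colors in one request isn't something the simple pre/post parameters support; the multi-color values in the sample solrconfig.xml are per-handler defaults, so choosing among them still means picking one pair per request.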
Solr 5.0, defaultSearchField, defaultOperator ?
Dear Solr users,

As of today I'm using Solr 5.0 (I was on Solr 3.6), so I'm trying to adapt my old schema for Solr 5.0. I have two questions:

- How can I set the defaultSearchField? I don't want to pass the df parameter in every query, because that would require a lot of modifications in my web project.
- How can I set the defaultOperator (AND|OR)?

It seems these options are now deprecated in the Solr 5.0 schema.

Thanks a lot for your comments,
Regards,
Bruno

---
This email contains no viruses or malware because avast! Antivirus protection is active. http://www.avast.com
Re: Solr 5.0, defaultSearchField, defaultOperator ?
Thx Chris & Ahmet!

On 17/04/2015 23:56, Chris Hostetter wrote:
: df and q.op are the ones you are looking for.
: You can define them in the defaults section.

specifically... https://cwiki.apache.org/confluence/display/solr/InitParams+in+SolrConfig

: Ahmet

-Hoss
http://www.lucidworks.com/
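Per the InitParams page referenced above, the old schema.xml defaults can be moved into solrconfig.xml as request-handler defaults. A minimal sketch (the field name "text" and the handler paths are assumptions; adapt to your own schema):

```xml
<!-- solrconfig.xml sketch: replaces <defaultSearchField> and
     <solrQueryParser defaultOperator="AND"/> from the pre-5.0 schema.xml. -->
<initParams path="/select,/query">
  <lst name="defaults">
    <str name="df">text</str>   <!-- default search field; "text" is assumed -->
    <str name="q.op">AND</str>  <!-- default operator -->
  </lst>
</initParams>
```

Defining these once in initParams avoids touching every query in the client, which was Bruno's concern.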
Correspondence table?
Dear Solr Users,

Solr 5.0.0. I currently have around 90,000,000 docs in my Solr, and I have a field with one char which represents a category, e.g. value = a, definition: nature and health, etc. I have a few categories, around 15. These category definitions can change over the years.

Can I use a file like:

a\tNature and Health
b\tComputer science
etc...

and, instead of having the code letter in my Solr JSON result, have the definition? Only in the result; the query will still be done with the code letter. I'm sure it's possible!

Additional question: is it possible to do that with a big correspondence file, around 5000 definitions?

Thanks for your help,
Bruno
Re: Correspondence table?
Hi Alex,

Well, OK, but what if I have a big table, more than 10,000 entries? Is it safe to do that client side?

Note: I have one little table, but I also have 2 big tables for 2 other fields.

On 20/04/2015 10:57, Alexandre Rafalovitch wrote:
The best place to do so is in the client software, since you are not using it for search in any way. So, wherever you get your Solr response (JSON/XML/etc.), map it there.

Regards,
Alex.
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/
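The client-side mapping Alexandre suggests can be sketched in a few lines; a dict lookup stays O(1) regardless of table size, so even a 5000-entry (or 10,000-entry) correspondence file is cheap. Field names, codes, and the response shape below are illustrative assumptions:

```python
import csv
import io

# Hypothetical correspondence file: one "code<TAB>definition" line per category.
TSV = "a\tNature and Health\nb\tComputer science\n"
mapping = dict(csv.reader(io.StringIO(TSV), delimiter="\t"))

# Hypothetical fragment of a Solr JSON response (field names are assumptions).
response = {"response": {"docs": [
    {"pn": "EP0000001", "cat": "a"},
    {"pn": "EP0000002", "cat": "b"},
]}}

# Replace the one-letter code by its definition, purely client side.
# The query itself still uses the code letter; unknown codes pass through.
for doc in response["response"]["docs"]:
    doc["cat"] = mapping.get(doc["cat"], doc["cat"])
```

In a real client the TSV would be loaded once at startup (from a file) and reused across responses.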
Re: Correspondence table?
Hi Jack,

OK, it's not for many millions of users, just max 100 per day. It will be used on traditional "PC" and also on mobile clients. Then I need to run tests to verify the possibility. Thx

On 20/04/2015 14:20, Jack Krupansky wrote:
It depends on the specific nature of your clients. Are they in-house users, like only dozens or hundreds, or is this a large web app with many millions of users, with mobile clients as well as traditional "PC" clients?

If it feels like too much to do in the client, then a middleware API service layer could be the way to go. In any case, don't try to load too much work onto the Solr server itself.

-- Jack Krupansky
Solr 5.0.0, do a commit alone?
Dear Solr Users,

With Solr 3.6, when I wanted to force a commit without sending data, I did:

java -jar post.jar

Now with Solr 5.0.0 I use bin/post, but it does not accept doing a commit if I don't give a data directory, i.e.:

bin/post -c mydb -commit yes

I want to do that because I have a file with delete actions; each line in this file contains one ref to delete:

bin/post -c mydb -commit no -d "..."

So I would like to do the commit only after running my file, with a command line, but bin/post -c mydb -commit yes (without data) is not accepted by post.

Thanks,
Sincerely,
Bruno
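A possible workaround, not confirmed in this thread: the update handler itself accepts a commit=true parameter, so a data-less commit can be issued over plain HTTP without going through bin/post at all. A sketch, where host, port, and core name are assumptions:

```python
# Sketch: trigger a commit with no data by calling the update handler
# directly. Host, port and core name ("mydb") are assumed values.
from urllib.request import urlopen

commit_url = "http://localhost:8983/solr/mydb/update?commit=true"
# urlopen(commit_url)  # uncomment on a machine where Solr is actually running
print(commit_url)
```

The same URL works from curl or wget, which may be more convenient at the end of a batch script.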
Solr 5.0, Ubuntu 14.04, SOLR_JAVA_MEM problem
Dear Solr Community,

I have a recent computer with 8 GB RAM; I installed Ubuntu 14.04, Solr 5.0, and Java 7. This is a brand new installation. All works fine, but I would like to increase SOLR_JAVA_MEM (to 40% of the total RAM available). So I edited bin/solr.in.sh:

# Increase Java Min/Max Heap as needed to support your indexing / query needs
SOLR_JAVA_MEM="-Xms3g –Xmx3g -XX:MaxPermSize=512m -XX:PermSize=512m"

But with this param the Solr server can't start. I use: bin/solr start

Do you have an idea of the problem?

Thanks a lot for your comments,
Bruno
Delete document stop my solr 5.0 ?!
Dear Solr Users,

I have a brand new computer where I installed Ubuntu 14.04, 8 GB RAM, Solr 5.0, Java 7. I indexed 92,000,000 docs (little text files, ~2 KB each); I have around 30 fields.

All works fine, but each Tuesday I need to delete some docs, so I create a batch file with lines like this:

/home/solr/solr-5.0.0/bin/post -c docdb -commit no -d "f1:58644"
/home/solr/solr-5.0.0/bin/post -c docdb -commit no -d "f1:162882"
...
/home/solr/solr-5.0.0/bin/post -c docdb -commit yes -d "f1:2868668"

My f1 field is my key field; it is unique. But if my file contains more than one or two hundred lines, my Solr shuts down. Two hundred lines always shut down Solr 5.0. I have no error in my console; Solr just can't be reached on port 8983 any more.

Is there a variable I must increase to avoid this error?

On my old Solr 3.6 I didn't delete documents the same way; I used:

java -jar -Ddata=args -Dcommit=no post.jar "113422"

[the delete XML tags were stripped from the archived message] You can see that there I passed the delete document directly, and my schema between Solr 3.6 and Solr 5.0 is almost the same; I just have some more fields. Why does this method not work now?

Thanks a lot,
Bruno
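One way to avoid launching bin/post once per key is to batch the deletes into a single XML update document and send that in one request; a sketch of building such a payload (the keys are illustrative, and whether your bin/post invocation accepts the payload exactly this way should be verified):

```python
# Sketch: build one batched delete-by-query document for the f1 keys,
# instead of one JVM launch per key. The resulting XML could then be sent
# in a single request, e.g. something like:
#   bin/post -c docdb -commit yes -d "$PAYLOAD"
ids = ["58644", "162882", "2868668"]  # hypothetical keys from the weekly file
payload = ("<delete>"
           + "".join(f"<query>f1:{i}</query>" for i in ids)
           + "</delete>")
print(payload)
```

Besides being faster, one request keeps the per-request memory pressure predictable, which matters once the thread's OOM diagnosis below comes into play.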
Re: Delete document stop my solr 5.0 ?!
OK, I have this OOM error in the log file:

# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="/home/solr/solr-5.0.0/bin/oom_solr.sh 8983 /home/solr/solr-5.0.0/server/logs"
# Executing /bin/sh -c "/home/solr/solr-5.0.0/bin/oom_solr.sh 8983 /home/solr/solr-5.0.0/server/logs"...
Running OOM killer script for process 28233 for Solr on port 8983
Killed process 28233

I'll try increasing formdataUploadLimitInKB in a few minutes and will tell you the result.

On 04/05/2015 14:58, Shawn Heisey wrote:
As far as I know, the only limit that can affect that is the maximum POST size. Current versions of Solr default to a 2MB max POST size, using the formdataUploadLimitInKB attribute on the requestParsers element in solrconfig.xml, which defaults to 2048.

Even if that limit is exceeded by a request, it should not crash Solr; it should simply log an error and ignore the request. It would be a bug if Solr did crash.

What happens if you increase that limit? Are you seeing any error messages in the Solr logfile when you send that delete request?

Thanks,
Shawn
Re: Delete document stop my solr 5.0 ?!
I increased formdataUploadLimitInKB to 2048000 and the problem is the same, same error. Any idea?
Re: Solr 5.0, Ubuntu 14.04, SOLR_JAVA_MEM problem
Yes! It works!!! Perfect, Scott. For my config 3g does not work, but 2g does! Thanks

On 04/05/2015 16:50, Scott Dawson wrote:
Bruno, you have the wrong kind of dash (a long dash) in front of the Xmx flag. Could that be causing a problem?

Regards,
Scott
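The corrected bin/solr.in.sh line, with plain ASCII hyphens and the 2g heap Bruno ended up using, might look like this (a sketch; note that -XX:MaxPermSize and -XX:PermSize apply to Java 7 and earlier only, as Java 8 removed the permanent generation):

```shell
# bin/solr.in.sh — note the plain "-" before Xmx, not the en-dash ("–")
# that slipped into the original line and broke JVM argument parsing.
SOLR_JAVA_MEM="-Xms2g -Xmx2g -XX:MaxPermSize=512m -XX:PermSize=512m"
```

Copy-pasting JVM flags from word processors or web pages is a common source of this smart-dash substitution.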
Re: Delete document stop my solr 5.0 ?!
Yes, it was that! I increased SOLR_JAVA_MEM to 2g (with 8 GB RAM I could do more, but 3g fails to run Solr on my brand new computer). Thanks!

On 04/05/2015 17:03, Shawn Heisey wrote:
Out Of Memory errors are a completely different problem. Solr behavior is completely unpredictable after an OutOfMemoryError exception, so the 5.0 install includes a script that runs on OOME and kills Solr. It's the only safe way to handle that problem.

Your Solr install is not being given enough Java heap memory for what it is being asked to do. You need to increase the heap size for Solr. If you look at the admin UI for Solr in a web browser, you can see what the max heap is set to... on a default 5.0 install running Solr with "bin/solr", the max heap will be 512m... which is VERY small.

Try using bin/solr with the -m option, set to something like 2g (for 2 gigabytes of heap).

Thanks,
Shawn
Re: Solr 5.0, Ubuntu 14.04, SOLR_JAVA_MEM problem
Shawn, thanks a lot for this comment. So I have this information; there is no explicit mention of 32 or 64 bits...

solr@linux:~$ java -version
java version "1.7.0_79"
OpenJDK Runtime Environment (IcedTea 2.5.5) (7u79-2.5.5-0ubuntu0.14.04.2)
OpenJDK Server VM (build 24.79-b02, mixed mode)
solr@linux:~$ uname -a
Linux linux 3.13.0-51-generic #84-Ubuntu SMP Wed Apr 15 12:11:46 UTC 2015 i686 i686 i686 GNU/Linux

Do I need to install a new version of Java? I installed my Ubuntu just one week ago :) and updates are up to date.

On 04/05/2015 17:23, Shawn Heisey wrote:
If you can't start Solr with a 3g heap, chances are that you are running a 32-bit version of Java. A 32-bit Java cannot go above a 2GB heap.

A 64-bit JVM requires a 64-bit operating system, which requires a 64-bit CPU. Since 2006, Intel has only been providing 64-bit chips to the consumer market, and getting a 32-bit chip in a new computer has gotten extremely difficult. The server market has had only 64-bit chips from Intel since 2005. I am not sure what those dates look like for AMD chips, but it is probably similar.

Running "java -version" should give you enough information to determine whether your Java is 32-bit or 64-bit. This is the output from that command on a Linux machine that is running a 64-bit JVM from Oracle:

root@idxa4:~# java -version
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)

If you are running Solr on Linux, then the output of "uname -a" should tell you whether your operating system is 32 or 64 bit.

Thanks,
Shawn
Re: Solr 5.0, Ubuntu 14.04, SOLR_JAVA_MEM problem
OK, I note all this information, thanks! I will upgrade if needed; 2g seems to be OK.

On 04/05/2015 18:46, Shawn Heisey wrote:
Both Linux and Java are 32-bit. For Linux, I know this because your arch is "i686", which means it is built for a newer-generation 32-bit CPU. You can't be running a 64-bit Java, and the Java version output confirms that, because it doesn't contain "64-Bit".

Run this command: cat /proc/cpuinfo

If the "flags" on the CPU contain the string "lm" (long mode), then your CPU is capable of running a 64-bit (sometimes known as amd64 or x86_64) version of Linux, and a 64-bit Java. You would need to re-install both Linux and Java to get this capability. Here's "uname -a" from a 64-bit version of Ubuntu:

Linux lb1 3.13.0-51-generic #84-Ubuntu SMP Wed Apr 15 12:08:34 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Since you are running 5.0, I would recommend Oracle Java 8.
http://www.webupd8.org/2012/09/install-oracle-java-8-in-ubuntu-via-ppa.html

Thanks,
Shawn
Solr 5.0 - uniqueKey case insensitive ?
Dear Solr users,

I have a problem with Solr 5.0 (and not with Solr 3.6). What kind of field can I use for my uniqueKey field, named "code", if I want it to be case insensitive?

On Solr 3.6 I defined a string_ci field type [the definition was stripped from the archived message], and it works fine:
- If I add a document with the same code, the doc is updated.
- If I search a document with lower or upper case, the doc is found.

But in Solr 5.0, if I use this definition, then:
- I can search in lower/upper case, that's OK,
- BUT if I add a doc with the same code, the doc is added, not updated!?

I read that the problem could be that the field type is tokenized instead of being a plain string. If I change from string_ci to string, then:
- I lose the possibility to search in lower/upper case,
- but updating the doc works fine.

So could you help me find the right field type so that:
- searching is case insensitive,
- and adding a document with the same code updates the old doc?

Thanks a lot!
Re: Solr 5.0 - uniqueKey case insensitive ?
Hello Chris,

Yes, I confirm that on my Solr 3.6 it has worked fine for several years: each doc added with the same code is updated, not added.

To be clearer: I receive docs with a field named "pn", it's the uniqueKey, and it is always uppercase, so I must define it in my schema.xml [the field definitions were stripped from the archived message; visible fragments show a required, stored field, an indexed non-stored field, and id as the uniqueKey].

But the application that uses Solr already exists; it queries on the pn field, not id, and I cannot change that. And in each doc I receive there is no id field, just a pn field, and I cannot change that either. So there is a problem, no? I must import an id field and query a pn field, but I only have a pn field for import...

On 05/05/2015 01:00, Chris Hostetter wrote:
: On SOLR3.6, I defined a string_ci field like this:
[quoted field definition stripped from the archive]

I'm really surprised that field would have worked for you (reliably) as a uniqueKey field even in Solr 3.6. The best practice for something like what you describe has always (going back to Solr 1.x) been to use a copyField to create a case-insensitive copy of your uniqueKey for searching.

If, for some reason, you really want case-insensitive *updates* (so a doc with id "foo" overwrites a doc with id "FOO"), then the only reliable way to make something like that work is to do the lowercasing in an UpdateProcessor, to ensure it happens *before* the docs are distributed to the correct shard, so the correct existing doc is overwritten (even if you aren't using SolrCloud).

-Hoss
http://www.lucidworks.com/
Re: Solr 5.0 - uniqueKey case insensitive ?
Yes thanks it's now for me too. Daniel, my pn is always in uppercase and I index them always in uppercase. the problem (solved now after all your answers, thanks) was the request, if users requests with lowercase then solr reply no result and it was not good. but now the problem is solved, I changed in my source file the name pn field to id and in my schema I use a copy field named pn and it works perfectly. Thanks a lot !!! Le 06/05/2015 09:44, Daniel Collins a écrit : Ah, I remember seeing this when we first started using Solr (which was 4.0 because we needed Solr Cloud), I never got around to filing an issue for it (oops!), but we have a note in our schema to leave the key field a normal string (like Bruno we had tried to lowercase it which failed). We didn't really know Solr in those days, and hadn't really thought about it since then, but Hoss' and Erick's explanations make perfect sense now! Since shard routing is (basically) done on hashes of the unique key, if I have 2 documents which are the "same", but have values "HELLO" and "hello", they might well hash to completely different shards, so the update logistics would be horrible. Bruno, why do you need to lowercase at all then? You said in your example, that your client application always supplies "pn" and it is always uppercase, so presumably all adds/updates could be done directly on that field (as a normal string with no lowercasing). Where does the case insensitivity come in, is that only for searching? If so couldn't you add a search field (called id), and update your app to search using that (or make that your default search field, I guess it depends if your calling app explicitly uses the pn field name in its searches). On 6 May 2015 at 01:55, Erick Erickson wrote: Well, "working fine" may be a bit of an overstatement. That has never been officially supported, so it "just happened" to work in 3.6. 
As Chris points out, if you're using SolrCloud then this will _not_ work, since routing happens early in the process, i.e. before the analysis chain gets the token, so various copies of the doc will exist on different shards. Best, Erick
How to index 20 000 files with a command line ?
Dear Solr Users, Habitually I use this command line to index my files: >bin/post -c hbl /data/hbl-201522/*.xml but today I have a big update: there are 20 000 xml files (each file is about 1 KB) [...]
Re: How to index 20 000 files with a command line ?
Oh yes, like this: find /data/hbl-201522/ -name "*.xml" -exec bin/post -c hbl {} \; ?

On 29/05/2015 14:15, Sergey Shvets wrote: Hello Bruno, you can use the find command with its exec option. Regards, Sergey
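Note that `find -exec ... \;` launches one bin/post (one JVM) per file, which is slow for 20 000 files; piping through xargs batches many files per invocation instead. A sketch, simulated with `echo` standing in for bin/post (drop the `echo` to actually index; paths and counts are examples):

```shell
# Create a throwaway directory with 1200 empty "documents".
dir=$(mktemp -d)
for i in $(seq 1 1200); do : > "$dir/doc$i.xml"; done

# Post 500 files per bin/post invocation instead of one at a time:
# 1200 files / 500 per batch -> 3 invocations (echo prints one line each).
batches=$(find "$dir" -name '*.xml' -print0 \
  | xargs -0 -n 500 echo bin/post -c hbl \
  | wc -l | tr -d ' ')
echo "batches: $batches"   # -> batches: 3

rm -rf "$dir"
```

`-print0`/`-0` keeps filenames with spaces safe; `-n 500` also keeps each command line well under the OS argument-length limit that makes the plain `*.xml` glob fail for 20 000 files.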
Help for a field in my schema ?
Dear Solr Users, (Solr 5.0, Ubuntu) I have xml files with tags named claimXXYYY, where XX is a language code like FR, EN, DE, PT, etc. (I don't know how many language codes I can have) and YYY is a number [1..999], e.g.: claimen1 claimen2 claimen3 claimfr1 claimfr2 claimfr3. I would like to define fields named: *claimen*, equal to the claimenYYY tags (EN language, all numbers, indexed=true, stored=true: search needed and must be displayed); *claim*, equal to all claimXXYYY tags (all languages, all numbers, indexed=true, stored=false: search not needed but must be displayed). Is it possible to have these 2 fields? Could you help me declare them in my schema.xml? Thanks a lot for your help! Bruno
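One way to cover an open-ended set of claimXXYYY tags is a dynamicField plus copyFields. A sketch, assuming the stock text_general/text_en field types; the aggregate names (claim_en, claim_all) are illustrative and deliberately avoid matching the claimen*/claim* source globs:

```xml
<!-- catch every incoming claimXXYYY field; stored so it can be displayed -->
<dynamicField name="claim*" type="text_general" indexed="false" stored="true" multiValued="true"/>

<!-- searchable aggregate of the English claims only -->
<field name="claim_en" type="text_en" indexed="true" stored="true" multiValued="true"/>
<copyField source="claimen*" dest="claim_en"/>

<!-- displayable aggregate of all languages (not searched) -->
<field name="claim_all" type="text_general" indexed="false" stored="true" multiValued="true"/>
<copyField source="claim*" dest="claim_all"/>
```

Note the message asks for the all-language field to be displayed, which requires stored="true" (a stored="false" field cannot be returned), so the sketch stores it and skips indexing it instead.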
Possible or not ?
Dear Solr Users, I would like to post 1 000 000 records (1 record = 1 file) in one shot, and do the commit at the end. Is it possible to do that? I have several directories with 20 000 files each. I would like to do: bin/post -c mydb /DATA where under DATA I have /DATA/1/*.xml (20 000 files) /DATA/2/*.xml (20 000 files) /DATA/3/*.xml (20 000 files) ... /DATA/50/*.xml (20 000 files). Currently I post 5 directories at a time (it takes around 1h30 for 100 000 records/files), but it's Friday and I would like to let it run over the weekend on its own. Thanks for your comments, Bruno
Re: Possible or not ?
Hi Alessandro, I'm actually on my dev computer, so I would like to post 1 000 000 xml files (with a structure defined in my schema.xml). I have already imported 1 000 000 xml files using bin/post -c mydb /DATA0/1 /DATA0/2 /DATA0/3 /DATA0/4 /DATA0/5, where /DATA0/X contains 20 000 xml files (I do it 20 times, just changing X from 1 to 50). I would now like to do bin/post -c mydb /DATA1. I would like to know whether my Solr 5 will run fine and not produce a memory error because there are too many files in one post without doing a commit. The commit will be done at the end of the 1 000 000. Is that ok?

On 05/06/2015 16:59, Alessandro Benedetti wrote: Hi Bruno, I cannot see what your challenge is. Of course you can index your data in the flavour you want and do a commit whenever you want… Are those xml files Solr xml? If not, you would need to use the DIH, the extract update handler, or any custom indexer application. Maybe I missed your point… Give me more details please! Cheers
Re: Possible or not ?
Ok, thanks for this information!

On 05/06/2015 17:37, Erick Erickson wrote: Picking up on Alessandro's point. While you can post all these docs and commit at the end, unless you do a hard commit (openSearcher=true or false doesn't matter), if your server should abnormally terminate for _any_ reason, all these docs will be replayed on startup from the transaction log. I'll also echo Alessandro's point that I don't see the advantage of this. Personally I'd set my hard commit interval with openSearcher=false to something like 60000 (60 seconds; it's in milliseconds) and forget about it. You're not imposing much extra load on the system, you're durably saving your progress, and you're avoiding really, really, really long restarts if your server should stop for some reason. If you don't want the docs to be _visible_ for searches, be sure your autocommit has openSearcher set to false and disable soft commits (set the interval to -1 or remove it from your solrconfig). Best, Erick

On Fri, Jun 5, 2015 at 8:21 AM, Alessandro Benedetti wrote: I cannot see any problem in that, but talking about commits I would like to distinguish between "hard" and "soft": hard commit -> durability; soft commit -> visibility. I suggest this interesting reading: https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ It's an old, interesting post by Erick; it explains the differences between the commit types. I would put you in this scenario: Heavy (bulk) indexing. The assumption here is that you're interested in getting lots of data into the index as quickly as possible for search sometime in the future. I'm thinking original loads of a data source, etc. - Set your soft commit interval quite long. As in 10 minutes or even longer (-1 for no soft commits at all).
Soft commit is about visibility, and my assumption here is that bulk indexing isn't about near-real-time searching, so don't do the extra work of opening any kind of searcher. - Set your hard commit interval to 15 seconds, openSearcher=false. Again the assumption is that you're going to be just blasting data at Solr. The worst case here is that you restart your system and have to replay 15 seconds or so of data from your tlog. If your system is bouncing up and down more often than that, fix the reason for that first. - Only after you've tried the simple things should you consider refinements; they're usually only required in unusual circumstances. They include: - Turning off the tlog completely for the bulk-load operation - Indexing offline with some kind of map-reduce process - Only having a leader per shard, no replicas for the load, then turning on replicas later and letting them do old-style replication to catch up. Note that this is automatic: if the node discovers it is "too far" out of sync with the leader, it initiates an old-style replication. After it has caught up, it'll get documents as they're indexed to the leader and keep its own tlog. - etc. Actually you could do the commit only at the end, but I cannot see any advantage in that. I suggest you play with the auto hard/soft commit config and get a better idea of the situation! Cheers
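Erick's advice above (hard commit every 60 seconds with openSearcher=false, soft commits disabled) corresponds to an updateHandler block like this in solrconfig.xml:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxTime>60000</maxTime>            <!-- hard commit every 60 s (value is in ms) -->
    <openSearcher>false</openSearcher>  <!-- durability only: no new searcher, docs stay invisible -->
  </autoCommit>
  <autoSoftCommit>
    <maxTime>-1</maxTime>               <!-- soft commits disabled during the bulk load -->
  </autoSoftCommit>
</updateHandler>
```

With this in place the tlog is truncated every minute, so a crash mid-load replays at most ~60 s of updates instead of the whole run.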
Re: Possible or not ?
Thanks for the link! So, I've launched the post; I will see on Monday whether it's ok :)
How to index text field with html entities ?
Dear Solr Users, (Solr 5.0.1) I have several xml files that contain html entities in some fields. I have an author field (English text) with this kind of text: Brown & Gammon. If I set my field like this: Brown & Gammon [markup lost in the archive], Solr generates the error "Undeclared general entity". If I add CDATA like this: [...], it seems that I can't search with the &: au:"brown & gammon". Could you help me find the right syntax? Thanks a lot, Bruno
Re: How to index text field with html entities ?
Hi Chris, thanks for your answer, and one more detail: after checking my log, it seems to concern only some html entities. No problem with &amp;, but I have problems with &uuml;, &ldquo;, etc. I will work through your answer to find a solution. Thanks!

On 29/07/2016 23:58, Chris Hostetter wrote:
: I have several xml files that contains html entities in some fields. ...
: If I set my field like this: Brown & Gammon
: Solr generates error "Undeclared general entity"
...because that's not valid XML...
: if I add CDATA like this: [...]
: it seems that I can't search with the &
...because that is valid xml, and tells solr you want the literal string "Brown & Gammon" to be indexed -- given a typical analyzer you are probably getting either "&" or "amp" as a term in your index.
: Could you help me to find the right syntax ?
The client code you are using for indexing can either "parse" these HTML snippets using an HTML parser and then send solr the *real* string you want to index, or you can configure solr with something like HTMLStripFieldUpdateProcessorFactory (if you want both the indexed form and the stored form to be plain text) or HTMLStripCharFilterFactory (if you want to preserve the html markup in the stored value, but strip it as part of the analysis chain for indexing). http://lucene.apache.org/solr/6_1_0/solr-core/org/apache/solr/update/processor/HTMLStripFieldUpdateProcessorFactory.html http://lucene.apache.org/core/6_1_0/analyzers-common/org/apache/lucene/analysis/charfilter/HTMLStripCharFilterFactory.html -Hoss http://www.lucidworks.com/
Re: How to index text field with html entities ?
Thanks Shawn for these precisions.

On 30/07/2016 00:43, Shawn Heisey wrote: On 7/29/2016 4:05 PM, Bruno Mannina wrote: > after checking my log it seems that it concerns only some html entities. No problem with &amp; but I have problems with &uuml;, &ldquo;, etc... Those are valid *HTML* entities, but they are not valid *XML* entities. The list of entities that are valid in XML is quite short -- there are only five of them. https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#Predefined_entities_in_XML When Solr processes XML, it is only going to convert entities that are valid for XML -- the five already mentioned. It will fail on the other 247 entities that are only valid for HTML. If you are seeing the problem with &amp; (which is one of the five valid XML entities), then we'll need the Solr version and the full error message/stacktrace from the solr logfile. Thanks, Shawn
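Following Hoss's first option (fix the data client-side before posting), one pre-processing approach is to rewrite the HTML-only entities as numeric character references, which every XML parser accepts. A minimal sed sketch covering just the two entities from Bruno's log; extend the substitution list as more entities show up:

```shell
# Rewrite HTML-only entities as XML-safe numeric references:
#   &uuml;  (ü, U+00FC) -> &#252;
#   &ldquo; (“, U+201C) -> &#8220;
# The five predefined XML entities (&amp; &lt; &gt; &quot; &apos;) pass through untouched.
fix_entities() {
  sed -e 's/&uuml;/\&#252;/g' -e 's/&ldquo;/\&#8220;/g'
}

printf 'M&uuml;ller said &ldquo;hello&amp;bye' | fix_entities
# -> M&#252;ller said &#8220;hello&amp;bye
```

This would run over each file before bin/post, e.g. `fix_entities < in.xml > out.xml`.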
Strange error when I try to copy....
Dear Solr Users, I have used Solr for several years, and for two weeks I have had a problem when I try to copy my Solr index. My index is around 180 GB (~100 000 000 docs, 1 doc ~ 3 KB). My method to back up my index every Sunday: - I stop Solr 5.4 (Ubuntu 14.04 LTS, 16 GB RAM, i3-2120 CPU @ 3.30 GHz) - I do a simple directory copy of /data to my backup HDD (from 2 TB SATA to 2 TB SATA, directly connected to the motherboard). All files are copied fine except one: the biggest (~65 GB) fails. I get the message: "Error splicing file: Input/output error". I also tried on Windows (I have a dual boot); there I get a "redundancy error". I checked my HDD: no error. I checked the file "_k46.fdt": no error; I can delete docs, add docs, and my database can be reached and works fine. Does someone have an idea how to back up my database, or why I get this error? Many thanks for your help. Sincerely, Bruno
Re: Strange error when I try to copy....
On 09/09/2016 17:57, Shawn Heisey wrote: On 9/8/2016 9:41 AM, Bruno Mannina wrote: > - I stop Solr 5.4 (Ubuntu 14.04 LTS, 16 GB RAM, i3-2120 CPU @ 3.30 GHz) > - I do a simple directory copy of /data to my backup HDD (from 2 TB SATA to 2 TB SATA directly connected to the motherboard). All files are copied fine except one: the biggest (~65 GB) fails, with the message "Error splicing file: Input/output error". This isn't a Solr issue, which is easy to determine by the fact that you've stopped Solr and it's not even running. It's a problem with the filesystem, probably the destination filesystem. The most common reason that I have found for this error is a destination filesystem that is incapable of holding a large file -- which can happen when the disk is formatted fat32 instead of ntfs or a Linux filesystem. You can have a 2TB filesystem with fat32, but no files larger than 4GB -- so your 65GB file won't fit. I think you're going to need to reformat that external drive with another filesystem. If you choose NTFS, you'll be able to use the disk on either Linux or Windows. Thanks, Shawn

Hi Shawn, first, thanks for your answer; indeed, that makes it clearer. Tonight I will check the filesystem of my backup HDD. And sorry for this question being off the Solr subject. Regards, Bruno
Solr 5.4.0: Colored Highlight and multi-value field ?
Dear all, Is it possible to have a colored highlight in a multi-valued field? I succeeded in doing it on a text field, but not in a multi-valued field; there Solr uses hl.simple.pre / hl.simple.post as the tags. Thanks a lot for your help. Best Regards, Bruno Mannina www.matheo-software.com www.patent-pulse.com
RE: Solr 5.4.0: Colored Highlight and multi-value field ?
Hi Erick, sorry for the late reply, I wasn't in my office this week... So, here is more information:

* ic is a multi-valued field defined like this: [field definition lost in the archive]
* The request I use (e.g.): http://my_host/solr/collection/select?q=ic:(A63C10* OR G06F22/086)&start=0&rows=10&wt=json&indent=true&sort=pd+desc&fl=* with the highlighting parameters &hl=true&hl.fl=ti,ab,ic,inc,cpc,apc&hl.simple.pre=&hl.simple.post=&hl.fragmentsBuilder=colored&hl.useFastVectorHighlighter=true&hl.highlightMultiTerm=true&hl.usePhraseHighlighter=true&hl.fragsize=999&hl.preserveMulti=true
* Result: I get only one color (in my case yellow) for all the different values found.
* BUT if I use a non-multi-valued field like ti (title) with a query containing several keywords (e.g. ti:(foo OR merge)), I get different colors for each different term found.

Questions: - Is it because the ic field is not defined with all the term*="true" options? - How can I get different colors and avoid the pre/post tags?

Many thanks for your help!

-----Original Message----- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Wednesday, October 4, 2017 15:48 To: solr-user Subject: Re: Solr 5.4.0: Colored Highlight and multi-value field ? How does it not work for you? Details matter; an example set of values and the response from Solr are good bits of info for us to have.
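On the first question in the thread above: the FastVectorHighlighter (which hl.useFastVectorHighlighter=true and hl.fragmentsBuilder=colored rely on) only works on fields indexed with full term vectors, so the multi-valued field would indeed need the term* options. A sketch (field type shown is illustrative):

```xml
<field name="ic" type="string" indexed="true" stored="true" multiValued="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
```

The colors themselves come from the <fragmentsBuilder name="colored"> entry in solrconfig.xml, whose hl.tag.pre default holds a comma-separated list of tags (the sample solrconfig.xml ships one); note the field must be reindexed after adding term vectors.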
Get docs with same value in one other field ?
Hello all, I'm facing a problem and I would like to know if it's possible to solve it with one request in Solr. I have Solr 5. My docs have several fields, of which two matter here: Field 1: id (unique key); Field 2: fid (family id). E.g.: id:XXX fid:1254; id:YYY fid:1254; id:ZZZ fid:3698; id:QQQ fid:3698; ... In my project I query only by id, and I would like my result to also contain all docs that have the same fid. I.e., if I request ..q=id:ZZZ&... I get the doc ZZZ of course, but also QQQ, because QQQ's fid = ZZZ's fid. MoreLikeThis, Group, etc. don't answer my question (but maybe I don't know how to use them for that). Thanks for your help, Bruno
RE: Get docs with same value in one other field ?
Yes, it's perfect!!! It works. Thanks David & Alexandre!

-----Original Message----- From: David Hastings [mailto:hastings.recurs...@gmail.com] Sent: Wednesday, February 22, 2017 23:00 To: solr-user@lucene.apache.org Subject: Re: Get docs with same value in one other field ? Sorry, embedded link: q={!join from=fid to=fid}id:ZZZ

On Wed, Feb 22, 2017 at 4:58 PM, David Hastings wrote: > For a reference to some examples: https://wiki.apache.org/solr/Join > So you'd want something like: q={!join from=fid to=fid}id:ZZZ > I don't have much experience with this function however.

On Wed, Feb 22, 2017 at 4:40 PM, Alexandre Rafalovitch wrote: >> Sounds like two clauses, with the second clause being a JOIN search where you match by ID and then join on FID. Would that work? Regards, Alex. http://www.solr-start.com/ - Resources for Solr users, new and experienced
RE: Get docs with same value in one other field ?
One more thing: I need to request up to 1000 ids. Currently I test with 2 or 3 and it already takes time (my db is around 100 000 000 docs, 128 GB RAM). Do you think I could get an OOM error if I try with up to 1000 ids?
RE: Get docs with same value in one other field ?
Ok Alex, I will look for a better solution; I'm afraid of an OOM with a huge number of ids. And yes, I already use a POST query; the GET was just to show my problem. Anyway, thanks for pointing that out as well.

-----Original Message----- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Thursday, February 23, 2017 00:08 To: solr-user Subject: Re: Get docs with same value in one other field ? A thousand IDs could be painful to send and perhaps to run against. At minimum, look into splitting your query into multiple variables (so you could reuse the list in both the direct and the join query). Look also at the terms query parser, which specializes in lists of IDs. You may also need to send your ID list as a POST, not a GET request, to avoid blowing the URL length. Regards, Alex. http://www.solr-start.com/ - Resources for Solr users, new and experienced
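Alexandre's two suggestions (put the id list in its own parameter and use the terms query parser) can be combined by referencing the list from the join via a `v=$param` local-param dereference. A sketch that just builds the raw query string; the field names id/fid come from the thread, the ids are examples:

```shell
# Build "all docs sharing a fid with any of the listed ids":
# the join query dereferences the idq parameter, which holds a
# {!terms} query over the (possibly very long) id list.
build_query() {
  printf 'q={!join from=fid to=fid v=$idq}&idq={!terms f=id}%s' "$1"
}

build_query 'XXX,YYY,ZZZ'
# -> q={!join from=fid to=fid v=$idq}&idq={!terms f=id}XXX,YYY,ZZZ
```

The resulting parameters would then be URL-encoded and sent as a POST body to /solr/<collection>/select (e.g. one curl --data-urlencode per parameter), keeping a 1000-id list out of the URL.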
> > sorry embedded link: > > q={!join+from=fid=fid}id:ZZZ > > On Wed, Feb 22, 2017 at 4:58 PM, David Hastings < > hastings.recurs...@gmail.com> wrote: > >> for a reference to some examples: >> >> https://wiki.apache.org/solr/Join >> >> sor youd want something like: >> >> q={!join+from=fid=fid}i >> <http://localhost:8983/solr/select?q=%7B!join+from=manu_id_s+to=id%7D >> i >> pod> >> d:ZZZ >> >> i dont have much experience with this function however >> >> >> >> On Wed, Feb 22, 2017 at 4:40 PM, Alexandre Rafalovitch >> > > wrote: >> >>> Sounds like two clauses with the second clause being a JOINT search >>> where you match by ID and then join on FID. >>> >>> Would that work? >>> >>> Regards, >>>Alex. >>> >>> http://www.solr-start.com/ - Resources for Solr users, new and >>> experienced >>> >>> >>> On 22 February 2017 at 16:27, Bruno Mannina wrote: >>> > >>> > >>> > Hello all, >>> > >>> > >>> > >>> > I'm facing a problem that I would like to know if it's possible to >>> > do it with one request in SOLR. >>> > >>> > I have SOLR 5. >>> > >>> > >>> > >>> > I have docs with several fields but here two are useful for us. >>> > >>> > Field 1 : id (unique key) >>> > >>> > Field 2 : fid (family Id) >>> > >>> > >>> > >>> > i.e: >>> > >>> > >>> > >>> > id:XXX >>> > >>> > fid: 1254 >>> > >>> > >>> > >>> > id: YYY >>> > >>> > fid: 1254 >>> > >>> > >>> > >>> > id: ZZZ >>> > >>> > fid:3698 >>> > >>> > >>> > >>> > id: QQQ >>> > >>> > fid: 3698 >>> > >>> > . >>> > >>> > >>> > >>> > I request only by id in my project, and I would like in my result >>> > have >>> also >>> > all docs that have the same fid . >>> > >>> > i.e. if I request : >>> > >>> > ..q=id:ZZZ&. >>> > >>> > >>> > >>> > I get the docs ZZZ of course but also QQQ because QQQ_fid = >>> > ZZZ_fid >>> > >>> > >>> > >>> > MoreLikeThis, Group, etc. 
don't answer to my question (but may I >>> > don't >>> know >>> > how to use it to do that) >>> > >>> > >>> > >>> > Thanks for your help, >>> > >>> > >>> > >>> > Bruno >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > --- >>> > L'absence de virus dans ce courrier électronique a été vérifiée >>> > par le >>> logiciel antivirus Avast. >>> > https://www.avast.com/antivirus >>> >> >> > > > --- > L'absence de virus dans ce courrier électronique a été vérifiée par le > logiciel antivirus Avast. > https://www.avast.com/antivirus > --- L'absence de virus dans ce courrier électronique a été vérifiée par le logiciel antivirus Avast. https://www.avast.com/antivirus
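The archive seems to have eaten the `to=` parameter in the quoted examples; the join query parser form David and Alexandre are pointing at is `{!join from=fid to=fid}`. A minimal sketch of building such a request (field names `id`/`fid` come from the thread; the helper name is made up):

```python
def family_join_query(doc_id):
    """Solr query params that fetch the whole 'family': every doc whose
    fid equals the fid of the doc with this id.  The matched doc itself
    is returned too, since its own fid is in the joined set."""
    return {
        "q": "{!join from=fid to=fid}id:%s" % doc_id,
        "wt": "json",
    }

params = family_join_query("ZZZ")
print(params["q"])  # {!join from=fid to=fid}id:ZZZ
```

So for `id:ZZZ` the response contains both ZZZ and QQQ, because both carry `fid:3698`.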
Get docs with same value in one other field ?
Hello all, I'm facing a problem and I would like to know if it's possible to solve it with one request in Solr. I have Solr 5. My docs have several fields, but two matter here. Field 1: id (unique key). Field 2: fid (family id). i.e.: id: XXX, fid: 1254; id: YYY, fid: 1254; id: ZZZ, fid: 3698; id: QQQ, fid: 3698. I query only by id in my project, and I would like my result to also contain all docs that have the same fid. i.e. if I request ...q=id:ZZZ&... I get the doc ZZZ of course, but also QQQ, because QQQ_fid = ZZZ_fid. MoreLikeThis, Group, etc. don't answer my question (but maybe I don't know how to use them for that). Thanks for your help, Bruno -- Bruno Mannina www.matheo-software.com www.patent-pulse.com Tél. +33 0 430 650 788 Fax. +33 0 430 650 728
Solr5, Clustering & exact phrase problem
Dear Solr-User, I'm trying to use Solr clustering (Lingo algorithm) on my database (notices with id, title, abstract fields). All works fine when my query is simple (with or without Boolean operators), but if I try an exact phrase like ...&q=ti:"snowboard binding"&... then Solr generates only one cluster named "Other" and puts all notices inside it. As I have only tested a few times, I have in my solrconfig the sample that the example gives; of course, I changed the field names. Do you know if I made a mistake, am missing something, or maybe exact phrases are not supported by clustering? Just one other question: I want to generate clusters using the fields abstract and title; is this exactly what I did in my solrconfig: carrot.title = title carrot.snippet = abstract Thanks a lot for your help, Bruno Mannina www.matheo-software.com www.patent-pulse.com Tél. +33 0 430 650 788 Fax. +33 0 430 650 728
Shards, delete duplicates ?
Dear Solr users, I have two collections, C1 and C2. For both, the unique key is ID. IDs in C1 are normalized patent numbers, i.e. US + 12 digits + A1. IDs in C2 are patent numbers as I receive them: US + 13 digits + A1 (a leading 0 is added). My collection C2 has a field named ID12 which is not defined as a unique field. This ID12 is a copy of the ID field of C1 (US + 12 digits + A1). Values in ID12 are unique across the whole C2 collection, and the values in C1's ID and C2's ID12 are the same. I try to query both collections using shards in the URL. It works fine, but I get duplicate documents; that's normal, I know. Does a method, a parameter, or anything else exist that lets me tell Solr to compare ID in C1 with ID12 in C2 and remove the duplicates? Many thanks for your help, Bruno Mannina www.matheo-software.com www.patent-pulse.com Tél. +33 0 430 650 788 Fax. +33 0 430 650 728
How can I request a big list of values ?
Hi All, I'm currently using Solr 3.6 and I have around 91 000 000 docs inside. All works fine, it's great :) But now I would like to query a list of values in the same field (more than 2000 values). I know I can use ?q=x:(AAA BBB CCC ...) (my default operator is OR), but I have a list of 2000 values! I don't think this method is a good idea. Can someone help me find a good solution? Can I use a JSON structure with a POST method? Thanks a lot, Bruno --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com
Re: How can I request a big list of values ?
Hi Jack, ok but for 2000 values, it means that I must do 40 requests if I choose to have 50 values by requests :'( and in my case, user can choose about 8 topics, so it can generate 8 times 40 requests... humm... is it not possible to send a text, json, xml file ? Le 10/08/2014 17:38, Jack Krupansky a écrit : Generally, "large requests" are an anti-pattern in modern distributed systems. Better to have a number of smaller requests executing in parallel and then merge the results in the application layer. -- Jack Krupansky -Original Message----- From: Bruno Mannina Sent: Saturday, August 9, 2014 7:18 PM To: solr-user@lucene.apache.org Subject: How can I request a big list of values ? Hi All, I'm using actually SOLR 3.6 and I have around 91 000 000 docs inside. All work fine, it's great :) But now, I would like to request a list of values in the same field (more than 2000 values) I know I can use |?q=x:(AAA BBB CCC ...) (my default operator is OR) but I have a list of 2000 values ! I think it's not the good idea to use this method. Can someone help me to find the good solution ? Can I use a json structure by using a POST method ? Thanks a lot, Bruno | --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com
Re: How can I request a big list of values ?
Hi Anshum, I can do it with 3.6 release no ? my main problem, it's that I have around 2000 values, so I can't use one request with these values, it's too wide. :'( I will take a look to generate (like Jack proposes me) several requests, but even in this case it seems to be not safe... Le 10/08/2014 19:45, Anshum Gupta a écrit : Hi Bruno, If you would have been on a more recent release, https://issues.apache.org/jira/browse/SOLR-6318 would have come in handy perhaps. You might want to look at patching your version with this though (as a work around). On Sat, Aug 9, 2014 at 4:18 PM, Bruno Mannina wrote: Hi All, I'm using actually SOLR 3.6 and I have around 91 000 000 docs inside. All work fine, it's great :) But now, I would like to request a list of values in the same field (more than 2000 values) I know I can use |?q=x:(AAA BBB CCC ...) (my default operator is OR) but I have a list of 2000 values ! I think it's not the good idea to use this method. Can someone help me to find the good solution ? Can I use a json structure by using a POST method ? Thanks a lot, Bruno | --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com
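For the record, the terms query parser Alexandre recommends in the later thread was added in Solr 4.10 (SOLR-6318, the very issue Anshum links), so on 3.6 it would need an upgrade or a backport. Assuming that parser is available, a sketch of building the POSTed form body so the 2000 values never hit a URL length limit:

```python
def terms_post_body(field, values):
    """Form body for a POSTed {!terms} query: one comma-separated list
    instead of a 2000-clause boolean OR, sent via POST to keep the
    request out of the URL entirely."""
    return {
        "q": "{!terms f=%s}%s" % (field, ",".join(values)),
        "rows": "100",
    }

body = terms_post_body("x", ["AAA", "BBB", "CCC"])
# body["q"] == "{!terms f=x}AAA,BBB,CCC"
```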
statistic on a field?
Dear, I have a field named "Authors", is it possible to have the frequency of terms (first 2000 for i.e.) of this field ? Thanks, Bruno
Re: statistic on a field?
Le 20/10/2013 17:52, Bruno Mannina a écrit : Dear, I have a field named "Authors", is it possible to have the frequency of terms (e.g. the first 2000) of this field ? Thanks, Bruno By using the Schema Browser, I have information on my field Authors, but I have a problem: I get statistics on parts of the terms of this field... i.e. term freq: co 256875, ltd 235899, corp 195554, etc. The field has been split to do the stats?! FieldType: TEXT_GENERAL; Properties: Indexed, Tokenized, Stored, Multivalued; Schema: Indexed, Tokenized, Stored, Multivalued; Index: Indexed, Tokenized, Stored; Position Increment Gap: 100; Distinct: 1803034. I think it's because this field is Tokenized? No? Regards, Bruno
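The split is indeed the tokenizer's doing: a text_general field indexes each whitespace-separated, lowercased token, and the Schema Browser counts those tokens, not whole values. A toy illustration (made-up data) of the difference a string/keyword-tokenized field would make:

```python
from collections import Counter

authors = ["Int Business Machines Corp", "Sony Corp", "Acme Co Ltd"]

# text_general-style: every lowercased whitespace token is a term,
# which is why fragments like "co", "ltd", "corp" dominate the stats
token_counts = Counter(t.lower() for a in authors for t in a.split())

# string-style (untokenized): each whole value is a single term
value_counts = Counter(authors)

print(token_counts["corp"])       # 2 - "corp" appears in two values
print(value_counts["Sony Corp"])  # 1
```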
Is Solr can create temporary sub-index ?
Dear Solr User, We have to do a new web project which is : Connect our SOLR database to a web plateform. This Web Plateform will be used by several users at the same time. They do requests on our SOLR and they can apply filter on the result. i.e.: Our SOLR contains 87M docs An user do requests, result is around few hundreds to several thousands. On the Web Plateform, user will see first 20 results (or more by using Next Page button) But he will need also to filter the whole result by additional terms. (Terms that our plateform will propose him) Is SOLR can create temporary index (manage by SOLR himself during a web session) ? My goal is to not download the whole result on local computer to provide filter, or to re-send the same request several times added to the new criterias. Many thanks for your comment, Regards, Bruno
Re: Is Solr can create temporary sub-index ?
Hello Tim, Yes solr's facet could be a solution, but I need to re-send the q= each time. I'm asking me just if an another solution exists. Facet seems to be the good solution. Bruno Le 23/10/2013 17:03, Timothy Potter a écrit : Hi Bruno, Have you looked into Solr's facet support? If I'm reading your post correctly, this sounds like the classic case for facets. Each time the user selects a facet, you add a filter query (fq clause) to the original query. http://wiki.apache.org/solr/SolrFacetingOverview Tim On Wed, Oct 23, 2013 at 8:16 AM, Bruno Mannina wrote: Dear Solr User, We have to do a new web project which is : Connect our SOLR database to a web plateform. This Web Plateform will be used by several users at the same time. They do requests on our SOLR and they can apply filter on the result. i.e.: Our SOLR contains 87M docs An user do requests, result is around few hundreds to several thousands. On the Web Plateform, user will see first 20 results (or more by using Next Page button) But he will need also to filter the whole result by additional terms. (Terms that our plateform will propose him) Is SOLR can create temporary index (manage by SOLR himself during a web session) ? My goal is to not download the whole result on local computer to provide filter, or to re-send the same request several times added to the new criterias. Many thanks for your comment, Regards, Bruno
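The facet loop Tim describes can be sketched as a small parameter builder (the field name `ap` is borrowed from a later thread; the function name is invented): the base q stays fixed, and every facet value the user picks comes back as an fq clause.

```python
def search_params(user_query, facet_field="ap", selected=None):
    """Base query plus facet counts; each user-selected facet value
    narrows the result via an fq without changing q, so Solr's query
    and filter caches stay warm across refinements."""
    params = [
        ("q", user_query),
        ("rows", "20"),
        ("facet", "true"),
        ("facet.field", facet_field),
        ("facet.limit", "20"),
    ]
    for value in selected or []:
        params.append(("fq", '%s:"%s"' % (facet_field, value)))
    return params

params = search_params("ti:snowboard", selected=["ACME CORP"])
```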
Re: Is Solr can create temporary sub-index ?
I have a little question concerning statistics on a request: I have a field defined like that: multiValued="true"/> positionIncrementGap="100" autoGeneratePhraseQueries="true"> words="stopwords.txt" enablePositionIncrements="true"/> words="stopwords.txt" enablePositionIncrements="true"/> ignoreCase="true" expand="true"/> Date sample for this field: A23L1/22066 A23L1/227 A23L1/231 A23L1/2375 My question is: Is it possible to have frequency of terms for the whole result of the initial user's request? Thanks a lot, Bruno Le 23/10/2013 18:12, Timothy Potter a écrit : Yes, absolutely you resend the q= each time, optionally with any facets selected by the user using fq= On Wed, Oct 23, 2013 at 10:00 AM, Bruno Mannina wrote: Hello Tim, Yes solr's facet could be a solution, but I need to re-send the q= each time. I'm asking me just if an another solution exists. Facet seems to be the good solution. Bruno Le 23/10/2013 17:03, Timothy Potter a écrit : Hi Bruno, Have you looked into Solr's facet support? If I'm reading your post correctly, this sounds like the classic case for facets. Each time the user selects a facet, you add a filter query (fq clause) to the original query. http://wiki.apache.org/solr/**SolrFacetingOverview<http://wiki.apache.org/solr/SolrFacetingOverview> Tim On Wed, Oct 23, 2013 at 8:16 AM, Bruno Mannina wrote: Dear Solr User, We have to do a new web project which is : Connect our SOLR database to a web plateform. This Web Plateform will be used by several users at the same time. They do requests on our SOLR and they can apply filter on the result. i.e.: Our SOLR contains 87M docs An user do requests, result is around few hundreds to several thousands. On the Web Plateform, user will see first 20 results (or more by using Next Page button) But he will need also to filter the whole result by additional terms. (Terms that our plateform will propose him) Is SOLR can create temporary index (manage by SOLR himself during a web session) ? 
My goal is to not download the whole result on local computer to provide filter, or to re-send the same request several times added to the new criterias. Many thanks for your comment, Regards, Bruno
Re: Is Solr can create temporary sub-index ?
Hum I think my fieldType = "text_classification" is not appropriated for this kind of data... I don't need to use stopwords, synonym etc... IC field is a field that contains codes, and codes contains often the char "/" and if I use the Terms option, I get: ... 4563254 3763554 2263254 ... .. Le 23/10/2013 18:51, Bruno Mannina a écrit : positionIncrementGap="100" autoGeneratePhraseQueries="true"> words="stopwords.txt" enablePositionIncrements="true"/> words="stopwords.txt" enablePositionIncrements="true"/> ignoreCase="true" expand="true"/>
Re: Is Solr can create temporary sub-index ?
I need your help to define the right fieldType, please, this field must be indexed, stored and each value must be considered as one term. The char / don't be consider like a separator. Is String could be a good fieldType ? thanks Le 23/10/2013 18:51, Bruno Mannina a écrit : A23L1/22066 A23L1/227 A23L1/231 A23L1/2375
What is the right fieldType for this kind of field?
Dear, Data look likes: A23L1/22066 A23L1/227 A23L1/231 A23L1/2375 I tried: - String but I can't search with troncation (i.e. A23*) - Text_General but as my code contains / then data are splitted... What kind of field must choose to use truncation and consider code with / as one term? thanks a lot for your help, Bruno
Re: What is the right fieldType for this kind of field?
Hi Jack, Yes String works fine, I forgot to restart my solr server after changing my schema.xml...arrf.I'm so stupid sorry ! Le 23/10/2013 20:09, Jack Krupansky a écrit : Trailing wildcard should work fine for strings, but "a23*" will not match "A23*" due to case. You could use the keyword tokenizer plus the lower case filter. -- Jack Krupansky -Original Message- From: Bruno Mannina Sent: Wednesday, October 23, 2013 1:54 PM To: solr-user@lucene.apache.org Subject: What is the right fieldType for this kind of field? Dear, Data look likes: A23L1/22066 A23L1/227 A23L1/231 A23L1/2375 I tried: - String but I can't search with troncation (i.e. A23*) - Text_General but as my code contains / then data are splitted... What kind of field must choose to use truncation and consider code with / as one term? thanks a lot for your help, Bruno
Re: What is the right fieldType for this kind of field?
Le 23/10/2013 20:09, Jack Krupansky a écrit : You could use the keyword tokenizer plus the lower case filter. Jack, Could you help me to write the right fieldType please? (index and query) Another thing, I don't know if I must use the Keyword tokenizer because codes contain "/" char, and Tokenizer seems split code no ? Many thanks, Bruno
Re: What is the right fieldType for this kind of field?
Le 23/10/2013 22:44, Bruno Mannina a écrit : Le 23/10/2013 20:09, Jack Krupansky a écrit : You could use the keyword tokenizer plus the lower case filter. Jack, Could you help me to write the right fieldType please? (index and query) Another thing, I don't know if I must use the Keyword tokenizer because codes contain "/" char, and Tokenizer seems split code no ? Many thanks, Bruno may be an answer (i don't tested yet) http://pietervogelaar.nl/solr-3-5-search-case-insensitive-on-a-string-field-for-exact-match/
Re: What is the right fieldType for this kind of field?
Le 23/10/2013 22:49, Bruno Mannina a écrit : Le 23/10/2013 22:44, Bruno Mannina a écrit : Le 23/10/2013 20:09, Jack Krupansky a écrit : You could use the keyword tokenizer plus the lower case filter. Jack, Could you help me to write the right fieldType please? (index and query) Another thing, I don't know if I must use the Keyword tokenizer because codes contain "/" char, and Tokenizer seems split code no ? Many thanks, Bruno may be an answer (i don't tested yet) http://pietervogelaar.nl/solr-3-5-search-case-insensitive-on-a-string-field-for-exact-match/ ok it works fine !
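For reference, Jack's keyword-tokenizer-plus-lowercase suggestion (and the linked article) amounts to a type like the following; the names `code_exact` and `ic` are placeholders, not anything from the original schema:

```xml
<fieldType name="code_exact" class="solr.TextField" omitNorms="true">
  <analyzer>
    <!-- the whole value is one token, so "/" never splits the code -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- lowercasing at index and query time lets a23* match A23L1/2375 -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="ic" type="code_exact" indexed="true" stored="true" multiValued="true"/>
```

Unlike a plain string field, this still matches case-insensitive truncation queries such as a23*, because LowerCaseFilterFactory is multi-term aware from Solr 3.6 on.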
Terms function join with a Select function ?
Dear Solr users, I use the Terms function to see the frequency data in a field but it's for the whole database. I have 2 questions: - Is it possible to increase the number of statistic ? actually I have the 10 first frequency term. - Is it possible to limit this statistic to the result of a request ? PS: the second question is very important for me. Many thanks
Re: Terms function join with a Select function ?
Dear All, Ok I have an answer concerning the first question (limit) It's the terms.limit parameters. But I can't find how to apply a Terms request on a query result any idea ? Bruno Le 23/10/2013 23:19, Bruno Mannina a écrit : Dear Solr users, I use the Terms function to see the frequency data in a field but it's for the whole database. I have 2 questions: - Is it possible to increase the number of statistic ? actually I have the 10 first frequency term. - Is it possible to limit this statistic to the result of a request ? PS: the second question is very important for me. Many thanks --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com
Re: Terms function join with a Select function ?
Dear, humI don't know how can I use it..; I tried: my query: ti:snowboard (3095 results) I would like to have at the end of my XML, the Terms statistic for the field AP (applicant field (patent notice)) but I haven't that... Please help, Bruno /select?q=ti%Asnowboard&version=2.2&start=0&rows=10&indent=on&facet=true&f.ap.facet.limit=10 Le 24/10/2013 14:04, Erik Hatcher a écrit : That would be called faceting :) http://wiki.apache.org/solr/SimpleFacetParameters On Oct 24, 2013, at 5:23 AM, Bruno Mannina wrote: Dear All, Ok I have an answer concerning the first question (limit) It's the terms.limit parameters. But I can't find how to apply a Terms request on a query result.... any idea ? Bruno Le 23/10/2013 23:19, Bruno Mannina a écrit : Dear Solr users, I use the Terms function to see the frequency data in a field but it's for the whole database. I have 2 questions: - Is it possible to increase the number of statistic ? actually I have the 10 first frequency term. - Is it possible to limit this statistic to the result of a request ? PS: the second question is very important for me. Many thanks --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com
Re: Terms function join with a Select function ?
humm facet perfs are very bad (Solr 3.6.0) My index is around 87 000 000 docs. (4 * Proc double core, 24G Ram) I thought facets will work only on the result but it seems it's not the case. My request: http://localhost:2727/solr/select?q=ti:snowboard&rows=0&facet=true&facet.field=ap&facet.limit=5 Do you think my request is wrong ? Maybe it's not possible to have statistic on a field (like Terms function) on a query..... Thx for your help, Bruno Le 24/10/2013 19:40, Bruno Mannina a écrit : Dear, humI don't know how can I use it..; I tried: my query: ti:snowboard (3095 results) I would like to have at the end of my XML, the Terms statistic for the field AP (applicant field (patent notice)) but I haven't that... Please help, Bruno /select?q=ti%Asnowboard&version=2.2&start=0&rows=10&indent=on&facet=true&f.ap.facet.limit=10 Le 24/10/2013 14:04, Erik Hatcher a écrit : That would be called faceting :) http://wiki.apache.org/solr/SimpleFacetParameters On Oct 24, 2013, at 5:23 AM, Bruno Mannina wrote: Dear All, Ok I have an answer concerning the first question (limit) It's the terms.limit parameters. But I can't find how to apply a Terms request on a query result any idea ? Bruno Le 23/10/2013 23:19, Bruno Mannina a écrit : Dear Solr users, I use the Terms function to see the frequency data in a field but it's for the whole database. I have 2 questions: - Is it possible to increase the number of statistic ? actually I have the 10 first frequency term. - Is it possible to limit this statistic to the result of a request ? PS: the second question is very important for me. Many thanks --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. 
http://www.avast.com --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com
Re: Terms function join with a Select function ?
Just a little precision: solr down after running my URL :( so bad... Le 24/10/2013 22:04, Bruno Mannina a écrit : humm facet perfs are very bad (Solr 3.6.0) My index is around 87 000 000 docs. (4 * Proc double core, 24G Ram) I thought facets will work only on the result but it seems it's not the case. My request: http://localhost:2727/solr/select?q=ti:snowboard&rows=0&facet=true&facet.field=ap&facet.limit=5 Do you think my request is wrong ? Maybe it's not possible to have statistic on a field (like Terms function) on a query..... Thx for your help, Bruno Le 24/10/2013 19:40, Bruno Mannina a écrit : Dear, humI don't know how can I use it..; I tried: my query: ti:snowboard (3095 results) I would like to have at the end of my XML, the Terms statistic for the field AP (applicant field (patent notice)) but I haven't that... Please help, Bruno /select?q=ti%Asnowboard&version=2.2&start=0&rows=10&indent=on&facet=true&f.ap.facet.limit=10 Le 24/10/2013 14:04, Erik Hatcher a écrit : That would be called faceting :) http://wiki.apache.org/solr/SimpleFacetParameters On Oct 24, 2013, at 5:23 AM, Bruno Mannina wrote: Dear All, Ok I have an answer concerning the first question (limit) It's the terms.limit parameters. But I can't find how to apply a Terms request on a query result any idea ? Bruno Le 23/10/2013 23:19, Bruno Mannina a écrit : Dear Solr users, I use the Terms function to see the frequency data in a field but it's for the whole database. I have 2 questions: - Is it possible to increase the number of statistic ? actually I have the 10 first frequency term. - Is it possible to limit this statistic to the result of a request ? PS: the second question is very important for me. Many thanks --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. 
http://www.avast.com --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com
Normalized data during indexing ?
Dear, I would like to know if Solr can do this: I have a field named "Assignee" with values like: Int Business Machines Corp, Int Business Mach Inc. I would like a "result field" in the schema.xml named "Norm_Assignee" which contains the translation from a lexical file: Int Business Machines Corp > IBM; Int Business Mach Inc > IBM. So I will have: Int Business Machines Corp > IBM, Int Business Mach Inc > IBM, and if no correspondence exists, then don't create the data. I'm sure this idea is possible with Solr, but I couldn't find it on the Wiki, Google, or Solr support. Thanks for any idea, Bruno
Re: Terms function join with a Select function ?
Hi Erick, I think it's a memory problem, I do my test on a little computer at home (8Go Ram i3-2120 3.30Ghz 64bits) and my database is very big 87M docs for 200Go size. I thought SOLR could done statistic on only the query answer, so here on around 3000 docs (around 6000 terms) it's not so big I do analyze log yet, I will do in few hours when I comeback home Thanks, Bruno Le 25/10/2013 15:36, Erick Erickson a écrit : How many unique values are in the field? Solr has to create a counter for each and every one of them, you may be blowing memory up. What do the logs say? Best, Erick On Thu, Oct 24, 2013 at 4:07 PM, Bruno Mannina wrote: Just a little precision: solr down after running my URL :( so bad... Le 24/10/2013 22:04, Bruno Mannina a écrit : humm facet perfs are very bad (Solr 3.6.0) My index is around 87 000 000 docs. (4 * Proc double core, 24G Ram) I thought facets will work only on the result but it seems it's not the case. My request: http://localhost:2727/solr/**select?q=ti:snowboard&rows=0&** facet=true&facet.field=ap&**facet.limit=5<http://localhost:2727/solr/select?q=ti:snowboard&rows=0&facet=true&facet.field=ap&facet.limit=5> Do you think my request is wrong ? Maybe it's not possible to have statistic on a field (like Terms function) on a query..... Thx for your help, Bruno Le 24/10/2013 19:40, Bruno Mannina a écrit : Dear, humI don't know how can I use it..; I tried: my query: ti:snowboard (3095 results) I would like to have at the end of my XML, the Terms statistic for the field AP (applicant field (patent notice)) but I haven't that... 
Please help, Bruno /select?q=ti%Asnowboard&**version=2.2&start=0&rows=10&** indent=on&facet=true&f.ap.**facet.limit=10 Le 24/10/2013 14:04, Erik Hatcher a écrit : That would be called faceting :) http://wiki.apache.org/solr/**SimpleFacetParameters<http://wiki.apache.org/solr/SimpleFacetParameters> On Oct 24, 2013, at 5:23 AM, Bruno Mannina wrote: Dear All, Ok I have an answer concerning the first question (limit) It's the terms.limit parameters. But I can't find how to apply a Terms request on a query result any idea ? Bruno Le 23/10/2013 23:19, Bruno Mannina a écrit : Dear Solr users, I use the Terms function to see the frequency data in a field but it's for the whole database. I have 2 questions: - Is it possible to increase the number of statistic ? actually I have the 10 first frequency term. - Is it possible to limit this statistic to the result of a request ? PS: the second question is very important for me. Many thanks --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com
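For what it's worth, the usual suspect at this scale is the uninverted field structure that the default facet.method=fc builds over all 87M docs: faceting uninverts the whole field regardless of how few docs match q, so a small result set does not make it cheap. One workaround sometimes tried on 3.x is the enum method; a sketch of the request parameters (whether it actually helps depends on the field's cardinality and the filterCache size):

```python
# facet.method=enum walks the terms and intersects doc sets instead of
# building the large in-memory uninverted field; facet.enum.cache.minDf
# keeps rare terms out of the filterCache.  Often slower than fc, but
# sometimes the only way to avoid the OOM on a big multivalued field.
params = {
    "q": "ti:snowboard",
    "rows": "0",
    "facet": "true",
    "facet.field": "ap",
    "facet.limit": "5",
    "facet.method": "enum",
    "facet.enum.cache.minDf": "100",
}
query_string = "&".join("%s=%s" % item for item in sorted(params.items()))
```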
Re: Normalized data during indexing ?
Hi Michael, thanks it sounds like I'm looking for I need to investigate Thanks a lot ! Le 25/10/2013 14:46, michael.boom a écrit : Maybe this can help you: https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory - Thanks, Michael -- View this message in context: http://lucene.472066.n3.nabble.com/Normalized-data-during-indexing-tp4097750p4097752.html Sent from the Solr - User mailing list archive at Nabble.com. --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com
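Michael's pointer boils down to an index-time synonym mapping. A sketch of what that could look like; the type and field names here are invented, and `expand="false"` collapses each left-hand side to the single right-hand term. Two caveats: values with no mapping pass through unchanged (they are not dropped), and analysis changes only the indexed terms, not the stored value, so this makes "IBM" searchable and facetable on the new field, but a stored, human-readable Norm_Assignee value would need an UpdateRequestProcessor or client-side mapping instead.

```xml
<!-- synonyms.txt (explicit mappings, one per line):
     Int Business Machines Corp => IBM
     Int Business Mach Inc => IBM
-->
<fieldType name="normalized_name" class="solr.TextField">
  <analyzer type="index">
    <!-- keep the whole value as one token so multi-word names map -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="false"/>
  </analyzer>
</fieldType>

<field name="norm_assignee" type="normalized_name" indexed="true" stored="false"/>
<copyField source="assignee" dest="norm_assignee"/>
```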
How to request not directly my SOLR server ?
Dear All, I showed my Solr server to a friend and his first question was: "You can query your Solr database directly from your browser?! Isn't that a security problem? Anyone who has your request link can use your database directly?" So I ask the question here. I protect my admin panel, but is it possible to protect against direct requests? Using Google, lots of results concern admin panel security, but I can't find information about this. Thanks for your comment, Bruno
Re: How to request not directly my SOLR server ?
Le 26/11/2013 18:52, Shawn Heisey a écrit : On 11/26/2013 8:37 AM, Bruno Mannina wrote: I show my SOLR server to a friend and its first question was: "You can request directly your solr database from your internet explorer?! is it not a security problem? each person which has your request link can use your database directly?" So I ask the question here. I protect my admin panel but is it possible to protect a direct request ? Don't make your Solr server directly accessible from the Internet. Only make it accessible from the machines that serve your website and whoever needs to administer it. Solr has no security features. You can use the security features in whatever container is running Solr, but that is outside the scope of this mailing list. Thanks, Shawn Thanks a lot for this information, Bruno --- Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com
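As Shawn says, the fix is network-level, not Solr-level. A common pattern is to bind Solr to localhost (or a private interface) and let only the web tier reach it; purely as an illustration, an nginx fragment along these lines (the IP and ports are made up):

```nginx
# End users never reach Solr's port directly; only the app server may.
server {
    listen 80;

    location /solr/ {
        allow 10.0.0.5;   # hypothetical application server
        deny  all;
        proxy_pass http://127.0.0.1:8983/solr/;
    }
}
```

Even then, /update and /admin handlers should only ever be reachable from trusted hosts, since Solr itself enforces nothing.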
Indexing a new big database while the old is running?
Dear Solr Users, We currently have a Solr DB with around 88,000,000 docs. All works fine :) Each year we receive a new backfile with the same content (but improved). Indexing these docs takes several days on Solr, so is it possible to create a new collection (restart Solr) and index these new 88,000,000 docs without stopping the current collection? We have around 1 million connections per month. Do you think this new indexing may cause problems for the Solr instance in use? Note: the new database will not be used until the current collection is stopped. Thx for your comments, Bruno
Re: Indexing a new big database while the old is running?
Hi Shawn, Thanks for your answer. Actually we don't have performance problems because we only do select requests. We have 4 CPUs (8 cores) and 24GB of RAM. I know how to create an alias; my question only concerned performance, and you are right, it is impossible to answer this question without more information about my system, sorry. I will run a real test and check whether performance drops; if it does, I will stop the new indexing. If you have more information concerning indexing performance with my server config, please let me know. :) Have a nice day, Regards, Bruno On 18/02/2014 16:30, Shawn Heisey wrote: On 2/18/2014 5:28 AM, Bruno Mannina wrote: We have actually a SOLR db with around 88 000 000 docs. All work fine :) We receive each year a new backfile with the same content (but improved). Index these docs takes several days on SOLR, So is it possible to create a new collection (restart SOLR) and Index these new 88 000 000 docs without stopping the current collection ? We have around 1 million connections by month. Do you think that this new indexation may cause problem to SOLR using? Note: new database will not be used until the current collection will be stopped. You can instantly switch between collections by using the alias feature. To do this, you would have collections named something like test201302 and test201402, then you would create an alias named 'test' that points to one of these collections. Your code can use 'test' as the collection name. Without a lot more information, it's impossible to say whether building a new collection will cause performance problems for the existing collection. It does seem like a problem that rebuilding the index takes several days. You might already be having performance problems. It's also possible that there's an aspect to this that I am not seeing, and that several days is perfectly normal for YOUR index. 
Not enough RAM is the most common reason for performance issues on a large index: http://wiki.apache.org/solr/SolrPerformanceProblems Thanks, Shawn
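Shawn's alias trick maps to a single Collections API call. A minimal sketch of building that request (the collection and alias names are the hypothetical ones from Shawn's example; the endpoint path is the standard `/admin/collections` handler):

```python
from urllib.parse import urlencode

def collections_api_url(base, action, **params):
    """Build a Solr Collections API URL (names used here are illustrative)."""
    query = urlencode({"action": action, **params})
    return f"{base}/admin/collections?{query}"

# Point the alias 'test' at the current collection; after the yearly
# rebuild, issuing the same command with collections=test201402
# repoints all traffic in one atomic step.
url = collections_api_url("http://localhost:8983/solr", "CREATEALIAS",
                          name="test", collections="test201302")
print(url)
```

Because clients only ever query the alias `test`, swapping in the freshly indexed collection requires no client-side change at all.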
Help with SolrCloud exceptions while recovering
Hi, I am a newbie SolrCloud enthusiast. My goal is to implement an infrastructure to enable text analysis (clustering, classification, information extraction, sentiment analysis, etc.). My development environment consists of one machine: quad-core processor, 16GB RAM and a 1TB HD. I have started implementing Apache Flume, with Twitter as source and SolrCloud (within JBoss AS 7) as sink, using Zookeeper (5 servers) to upload configuration and manage the cluster. The pseudo-distributed cluster consists of one collection, three shards each with three replicas. Everything runs smoothly for a while. After 50,000 tweets have been committed (actually CloudSolrServer commits every batch of 500 documents), SolrCloud randomly starts logging exceptions: Lucene file not found, IndexWriter cannot be opened, replication unsuccessful and the like. Recovery starts with no success until the replica goes down. I have tried different Solr versions (4.10.2, 4.9.1 and lastly 4.8.1) with the same results. I have looked everywhere for help before writing this email. My guess right now is that the problem lies with the SolrCloud and Zookeeper connection, although I haven't seen any such exception. Any reference or help will be welcome. Cheers, B.
Re: Help with SolrCloud exceptions while recovering
Hi Erick, Thank you very much for your reply. I disabled client commits and set up commits in solrconfig.xml as follows:

<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:30}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:6}</maxTime>
</autoSoftCommit>

The picture changed for the better. No more index corruption, endless replication trials and, up till now, 16 hours since start-up and more than 142k tweets downloaded, shards and replicas are "active". One problem remains though. While auto committing, Solr logs the following stack trace:

00:00:40,383 ERROR [org.apache.solr.update.CommitTracker] (commitScheduler-25-thread-1) auto commit error...:org.apache.solr.common.SolrException: *Error opening new searcher*
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1550)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1662)
at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:603)
at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
*Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: _1.nvm*
at org.apache.lucene.index.TieredMergePolicy$SegmentByteSizeDescending.compare(TieredMergePolicy.java:252)
at org.apache.lucene.index.TieredMergePolicy$SegmentByteSizeDescending.compare(TieredMergePolicy.java:238)
at java.util.TimSort.countRunAndMakeAscending(TimSort.java:324)
at java.util.TimSort.sort(TimSort.java:203)
at java.util.TimSort.sort(TimSort.java:173)
at java.util.Arrays.sort(Arrays.java:659)
at java.util.Collections.sort(Collections.java:217)
at org.apache.lucene.index.TieredMergePolicy.findMerges(TieredMergePolicy.java:286)
at org.apache.lucene.index.IndexWriter.updatePendingMerges(IndexWriter.java:2017)
at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1986)
at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:407)
at org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:287)
at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:272)
at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:251)
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1461)
... 10 more
*Caused by: java.io.FileNotFoundException: _1.nvm*
at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:260)
at org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:177)
at org.apache.lucene.index.SegmentCommitInfo.sizeInBytes(SegmentCommitInfo.java:141)
at org.apache.lucene.index.MergePolicy.size(MergePolicy.java:513)
at org.apache.lucene.index.TieredMergePolicy$SegmentByteSizeDescending.compare(TieredMergePolicy.java:242)
... 24 more

This file "_1.nvm" once existed. It was deleted during one auto commit, but remains somewhere in a queue for deletion. I believe the consequence is that at SolrCloud Admin UI -> Core Admin -> Stats, the "Current" status is off for all shards' replica number 3. If I understand correctly, this means that changes to the index are not becoming visible. Once again I tried to find possible reasons for this situation, but none of the threads I found seems to reflect my case. My lock type is set to: ${solr.lock.type:single}. This is due to a lock.wait timeout error with both "native" and "simple" when trying to create the collection using the commands API. 
There is a thread discussing this issue: http://lucene.472066.n3.nabble.com/unable-to-load-core-after-cluster-restart-td4098731.html The only thing is that "single" should only be used if "there is no possibility of another process trying to modify the index" and I cannot guarantee that. Could that be the cause of the file not found exception? Thanks once again for your help. Regards, Bruno. 2014-11-08 18:36 GMT-02:00 Erick Erickson : > First. for tweets committing every 500 docs is much too frequent. > Especially from the client and super-especially if you have multiple > clients running. I'd recommend you just configure solrconfig this way > as a place to start and do NOT commit from any clients. > 1> a hard commit (openSearcher=false) every minute (or maybe 5 minutes) > 2> a soft commit every minute > > This latter governs how long it'll be between when a doc is indexed and > when > can be searched. > > Here'
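For reference, Erick's suggested commit policy maps to something like the following solrconfig.xml fragment (a sketch only; the intervals are the "every minute" values from his reply, expressed in milliseconds, and should be tuned to your latency needs):

```xml
<!-- Hard commit: flush and fsync the index, but do not open a new
     searcher, so it stays cheap even on a busy index -->
<autoCommit>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<!-- Soft commit: controls how soon newly indexed docs become searchable -->
<autoSoftCommit>
  <maxTime>60000</maxTime>
</autoSoftCommit>
```

With this in place, clients (Flume sinks, SolrJ batches) should send documents only and never call commit themselves.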
Re: Help with SolrCloud exceptions while recovering
Erick, Once again thank you very much for your attention. Now my pseudo-distributed SolrCloud is configured with no inconsistency. An additional problem was starting Jboss with "solr.data.dir" set to a path not expected by Solr (actually it was not even underneath solr.home directory). This thread ( http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201206.mbox/%3ccao8xr5zv8o-s6zn7ypaxpzpourqjknbsm59mbe6h3dpfykg...@mail.gmail.com%3E) explains the inconsistency. I found no need to change Solr data directory. After commenting this property at Jboss' standalone.xml and setting "${solr.lock.type:native}" everything started to work properly. Regards, Bruno 2014-11-09 14:35 GMT-02:00 Erick Erickson : > OK, we're _definitely_ in the speculative realm here, so don't think > I know more than I do ;)... > > The next thing I'd try is to go back to "native" as the lock type on the > theory that the lock type wasn't your problem, it was the too-frequent > commits. > > bq: This file "_1.nvm" once existed. Was deleted during one auto commit , > but > remains somewhere in a queue for deletion > > Assuming Unix, this is entirely expected. Searchers have all the files > open. Commits > do background merges, which may delete segments. So the current searcher > may > have the file open even though it's been "merged away". When the searcher > closes, the file will actually truly disappear. > > It's more complicated on Windows but eventually that's what happens > > Anyway, keep us posted. If this continues to occur, please open a new > thread, > that might catch the eye of people who are deep into Lucene file locking... > > Best, > Erick > > On Sun, Nov 9, 2014 at 6:45 AM, Bruno Osiek wrote: > > Hi Erick, > > > > Thank you very much for your reply. > > I disabled client commit while setting commits at solconfig.xml as > follows: > > > > > >${solr.autoCommit.maxTime:30} > >false > > > > > > > >${solr.autoSoftCommit.maxTime:6} > > > > > > The picture changed for the better. 
> > No more index corruption, endless replication trials and, up till now, 16 hours since start-up and more than 142k tweet downloaded, shards and replicas are "active".
> >
> > One problem remains though. While auto committing Solr logs the following stack-trace [...]
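For reference, the lock setting this thread converges on lives in solrconfig.xml's index configuration (a sketch; element placement follows the stock 4.x example config):

```xml
<indexConfig>
  <!-- "native" (OS-level file locking) is the default; the thread returns
       to it once the too-frequent client commits are removed -->
  <lockType>${solr.lock.type:native}</lockType>
</indexConfig>
```

"single" disables inter-process protection entirely, which is only safe when exactly one process can ever open the index for writing.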
Query two databases at the same time?
Dear All, I use Apache Solr 3.6, on Ubuntu (newbie user). I have a big database named BigDB1 with 90M documents; each document contains several fields (docid, title, author, date, etc.). Today I received, from another source, abstracts for some documents (this source also contains the same docid field). I don't want to modify BigDB1 to update documents with abstracts, because BigDB1 is updated twice a week. Do you think it's possible to create a new database named AbsDB1 and query both databases at the same time? If I do, for example: title:airplane AND abstract:plastic I would like to obtain documents from BigDB1 and AbsDB1. Many thanks for your help, information and anything else that can help me. Regards, Bruno
Re: Query two databases at the same time?
Dear Erick, thank you for your answer. My answers are below. On 09/01/2015 20:43, Erick Erickson wrote: bq: I don't want to modify my BigDB1 to update documents with abstract because BigDB1 is always updated twice by week. Why not? Solr/Lucene handle updating docs: if a doc in the index has the same <uniqueKey>, the old doc is deleted and the new one takes its place. So why not just put the new abstracts into BigDB1? If you re-index the docs later (your twice/week comment), then they'll be overwritten. This will be much simpler than trying to maintain two. I understand this process; I use it for other collections and twice a week for BigDB1. But, e.g., Doc1 is updated with an abstract on Monday; on Tuesday I must update it with new data, and then the abstract will be lost. I can't check/fetch the abstract before re-inserting it in the new doc, because I receive several thousand docs every week (new and amended); I think it would take a long time to do that. But if you cannot update BigDB1 just fire off two queries and combine them. Or specify the shards parameter on the URL pointing to both collections. Do note, though, that the relevance calculations may not be absolutely comparable, so mixing the results may show some surprises... Shards... I will take a look at this param, I didn't know it. Concerning relevance, I don't really use it, so it won't be a problem I think. Sincerely, Best, Erick On Fri, Jan 9, 2015 at 9:12 AM, Bruno Mannina wrote: Dear All, I use Apache-SOLR3.6, on Ubuntu (newbie user). I have a big database named BigDB1 with 90M documents, each document contains several fields (docid, title, author, date, etc...) I received today from another source, abstract of some documents (there are also the same docid field in this source). I don't want to modify my BigDB1 to update documents with abstract because BigDB1 is always updated twice by week. Do you think it's possible to create a new database named AbsDB1 and request the both database at the same time ? 
if I do for example: title:airplane AND abstract:plastic I would like to obtain documents from BigDB1 and AbsDB1. Many thanks for your help, information and others things that can help me. Regards, Bruno
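Erick's first option, firing two queries and combining them client-side, can be sketched like this. The docids and field values below are made up for illustration; in practice the two lists would come from parsing each core's response to q=title:airplane (BigDB1) and q=abstract:plastic (AbsDB1):

```python
def combine_and(title_hits, abstract_hits, key="docid"):
    """Client-side AND of two Solr result sets: keep the title hits whose
    docid also matched the abstract query, folding the abstract fields in."""
    by_id = {d[key]: d for d in abstract_hits}
    out = []
    for doc in title_hits:
        extra = by_id.get(doc[key])
        if extra is not None:
            merged = dict(doc)   # copy so the input lists stay untouched
            merged.update(extra)
            out.append(merged)
    return out

# Hypothetical parsed responses from the two cores
big = [{"docid": "EP1", "title": "airplane wing"},
       {"docid": "EP2", "title": "airplane seat"}]
abs_hits = [{"docid": "EP1", "abstract": "a plastic wing"}]

print(combine_and(big, abs_hits))
```

The second option is a single distributed request using the shards parameter pointing at both cores (e.g. shards=localhost:8983/solr/BigDB1,localhost:8983/solr/AbsDB1), subject to the relevance caveats Erick mentions.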