HTML entities being missed by DIH HTMLStripTransformer
Hi, I am using DIH to index some database fields that contain HTML-formatted text, and I use the HTMLStripTransformer to remove the markup. This works fine when the text contains literal tags, for example: <li>Item One</li> or <b>This is in Bold</b>. However, when the markup is written with HTML entity names, as in &lt;li&gt;Item One&lt;/li&gt; or &lt;b&gt;This is in Bold&lt;/b&gt;, nothing happens. Two questions: (1) Is this the expected behavior of the DIH HTMLStripTransformer? (2) If yes, is there another transformer that I can apply first to turn these HTML entities into their usual symbols, so that the HTMLStripTransformer can then remove them? Thanks - ashok -- View this message in context: http://lucene.472066.n3.nabble.com/HTML-entities-being-missed-by-DIH-HTMLStripTransformer-tp4053582.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: HTML entities being missed by DIH HTMLStripTransformer
Well, the database field has text, sometimes with HTML entities and at other times with html tags. I have no control over the process that populates the database tables with info.
Re: HTML entities being missed by DIH HTMLStripTransformer
Hi Steve, Fabulous suggestion! Yup, that is it! Using the HTMLStripTransformer twice did the trick. I am using Solr 4.1. Thank you very much! - ashok
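Why a second pass helps can be sketched outside Solr. The snippet below is only an illustration in Python, not the DIH implementation: `strip_once` is a hypothetical helper, and it assumes that a single stripping pass removes literal tags and decodes entities, which is consistent with the behavior reported in this thread.

```python
import html
import re

# Matches literal opening and closing tags such as <b> or </li>.
TAG_RE = re.compile(r"</?[A-Za-z][^>]*>")


def strip_once(text: str) -> str:
    """One pass (assumed behavior): drop literal tags, then decode entities."""
    return html.unescape(TAG_RE.sub("", text))


def strip_twice(text: str) -> str:
    """Second pass removes the tags that the first pass decoded from entities."""
    return strip_once(strip_once(text))


literal = "<b>This is in Bold</b>"
escaped = "&lt;b&gt;This is in Bold&lt;/b&gt;"

after_one_pass = strip_once(escaped)    # entities become literal tags
after_two_passes = strip_twice(escaped)  # literal tags are now removed
```

With literal tags a single pass is enough; with entity-escaped markup the first pass only produces the tags, so the second pass is what removes them.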
WordDelimiterFactory
Hi, why does the WordDelimiterFilter (WDF) swallow all 'words' that start with a digit? My config is: [filter definition not preserved in the archive] For text like 20x-30y, I expect (and want) '20x' and '30y' to be retained as tokens after WDF is done with it, but according to the analysis page I get nothing. Any idea why? I am using Solr 4.1. Thanks - ashok -- View this message in context: http://lucene.472066.n3.nabble.com/WordDelimiterFactory-tp4056529.html
Re: WordDelimiterFactory
Thank you Jack, yes it is tricky. If my text is x20-y30, I get two nice tokens, x20 and y30, that I need to keep. But the text 20x-30y is treated differently and I get nothing, and 20x-y30 gives me just 'y30'. The docs on LucidWorks say: generateNumberParts: (integer, default 1) If non-zero, splits numeric strings at delimiters: "1947-32" -> "1947", "32". It looks like any 'word' that starts with a digit is treated as a numeric string. Setting generateNumberParts="1" instead of "0" seems to generate the right tokens in this case, but I need to check whether it has any other impact on the final token list... Thanks - ashok
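The observations in this message can be reproduced with a toy model. This is a Python sketch, not Lucene's code: `toy_wdf` is a hypothetical function that assumes the filter splits only at non-alphanumeric delimiters (no splitting at letter/digit transitions) and classifies each part by its first character, with word parts enabled and number parts disabled as in the configuration described above.

```python
import re


def toy_wdf(text, generate_word_parts=True, generate_number_parts=False):
    """Toy model: split on delimiters and classify each part by its first character."""
    tokens = []
    for part in re.split(r"[^A-Za-z0-9]+", text):
        if not part:
            continue
        # Assumption: a part that starts with a digit counts as a "number part".
        starts_with_digit = part[0].isdigit()
        if starts_with_digit and generate_number_parts:
            tokens.append(part)
        elif not starts_with_digit and generate_word_parts:
            tokens.append(part)
    return tokens


words_only = {text: toy_wdf(text) for text in ("x20-y30", "20x-30y", "20x-y30")}
with_numbers = toy_wdf("20x-30y", generate_number_parts=True)
```

Under these assumptions the model yields both tokens for x20-y30, nothing for 20x-30y, only y30 for 20x-y30, and both 20x and 30y once number parts are enabled, matching what the analysis page showed.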
Re: WordDelimiterFactory
Yes, thank you Erick. The analysis/document handlers hold the key to deciding the type and order of the filters to employ, given one's document set and the subject matter at hand. The final terms they produce for Solr search, MLT, etc. are crucial to the quality of the results. - ashok
Loadbalance for SolrCloud using SolrNet
Hello, My application sends every request to Shard1, which makes it a single point of failure. Please suggest what I can use for load balancing in SolrNet. Is there something like SolrJ's CloudSolrClient? Or will I have to go with HAProxy or a physical load balancer? Thank you, Vrinda Davda This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely for the use of the addressee(s). If you are not the intended recipient, please notify so to the sender by e-mail and delete the original message. In such cases, please notify us immediately at i...@infinite.com . Further, you are not to copy, disclose, or distribute this e-mail or its contents to any unauthorized person(s) .Any such actions are considered unlawful. This e-mail may contain viruses. Infinite has taken every reasonable precaution to minimize this risk, but is not liable for any damage you may sustain as a result of any virus in this e-mail. You should carry out your own virus checks before opening the e-mail or attachments. Infinite reserves the right to monitor and review the content of all messages sent to or from this e-mail address. Messages sent to or from this e-mail address may be stored on the Infinite e-mail system. ***INFINITE End of DisclaimerINFINITE
RE: Loadbalance for SolrCloud using SolrNet
Thanks Shawn. So do you suggest having an external load balancer, something like HAProxy or a physical load balancer?

-----Original Message-----
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: Thursday, April 20, 2017 12:36 PM
To: solr-user@lucene.apache.org
Subject: Re: Loadbalance for SorCloud using SolrNet

SolrNet was not built by the Solr project. It was developed by somebody else. Unless SolrNet has the capability of using more than one base URL to access Solr and failing over if one of them becomes unusable, you will need a separate load balancer. I have no idea whether SolrNet has that capability.

Thanks,
Shawn
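The capability Shawn describes (more than one base URL, with failover when one becomes unusable) can be sketched generically. The Python below is not SolrNet or SolrJ code; `FailoverClient`, the node URLs, the collection path, and `demo_fetch` are all hypothetical, and the HTTP call is passed in as a function so the sketch runs without a network.

```python
class FailoverClient:
    """Try each base URL in turn, starting after the last one that answered."""

    def __init__(self, base_urls, fetch):
        self._urls = list(base_urls)
        self._fetch = fetch  # callable(url) -> response; raises OSError on failure
        self._next = 0

    def query(self, path):
        errors = []
        for offset in range(len(self._urls)):
            index = (self._next + offset) % len(self._urls)
            url = self._urls[index] + path
            try:
                result = self._fetch(url)
            except OSError as exc:  # ConnectionError and timeouts are OSErrors
                errors.append((url, exc))
                continue
            # Rotate so the next request starts at the following node.
            self._next = (index + 1) % len(self._urls)
            return result
        raise ConnectionError(f"all nodes failed: {errors}")


def demo_fetch(url):
    """Stand-in for an HTTP GET: pretends that node1 is down."""
    if "node1" in url:
        raise ConnectionError("node1 unreachable")
    return "response from " + url


client = FailoverClient(
    ["http://node1:8983/solr", "http://node2:8983/solr"], demo_fetch
)
result = client.query("/mycollection/select?q=*:*")
```

An external balancer such as HAProxy does the same job outside the application, which keeps the client code unchanged at the cost of one more component to operate.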
Getting many solrconfig.xml's in the example folder
Hi, I don't understand why there are so many solrconfig.xml files. Is a new solrconfig.xml file generated every time we upload a file to Solr?
Re: Getting many solrconfig.xml's in the example folder
Hi, thanks for the reply. I am new to Solr, so here are some more details. I upload files using post.jar. When I search for solrconfig.xml, there are around 5 such files, and also 5 solr.xml files, by default. I want to know whether that is the default configuration.

On 30-Nov-2014 9:55 PM, "Alexandre Rafalovitch" wrote:
> If you are talking about default distribution, it is because Solr
> comes with multiple examples actually. Look for the file solr.xml.
> That's the root of a full example each of which may have one or more
> collections.
>
> If you start Solr with java -Dsolr.solr.home=/x/y/z -jar start.jar you
> will see the rest of them. Where /x/y/z is the location of the
> solr.xml file.
>
> Regards,
>    Alex.
>
> Personal: http://www.outerthoughts.com/ and @arafalov
> Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
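One way to see the layout Alex describes is to list every directory that holds a solr.xml together with the solrconfig.xml files beneath it. This is a small Python sketch; `find_solr_homes` is a hypothetical helper and the path in the usage line is a placeholder for wherever the distribution was unpacked.

```python
from pathlib import Path


def find_solr_homes(root):
    """Map each directory holding a solr.xml to the solrconfig.xml files below it."""
    homes = {}
    for solr_xml in sorted(Path(root).rglob("solr.xml")):
        home = solr_xml.parent
        homes[str(home)] = sorted(
            str(config.relative_to(home)) for config in home.rglob("solrconfig.xml")
        )
    return homes


if __name__ == "__main__":
    # Placeholder path: point this at the unpacked Solr distribution.
    for home, configs in find_solr_homes("example").items():
        print(home, configs)
```

Each key is a candidate value for -Dsolr.solr.home, and none of these files are created by uploading documents with post.jar.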
Re: Getting many solrconfig.xml's in the example folder
Thanks for the reply. I have one more question. In solrconfig.xml I added spellchecker settings to the browse request handler. When I add the same spellchecker settings to another request handler, i.e. query, in the same solrconfig.xml, the spellchecker functionality does not show up.

On 01-Dec-2014 12:23 AM, "Alexandre Rafalovitch" wrote:
> It's a default configuration. Nothing to do with post.jar. Easy to
> check this by just looking at what's inside the zip file. Reread my
> other email for details again.
>
> Also try the tutorials, both the one online and the one in the docs folder.
>
> Regards,
>    Alex.
Contextual search
Hi all, I wanted to know how Solr performs contextual search. In my search I gave the query "three book" and got the suggestion "a book of three", which is what I wanted. But when I enter "thri book", it offers a spelling correction of "thri" to "three", which is fine. Why don't I also get the result "a book of three" in this case, as before?
Re: Contextual search
Hi Alex, thanks. I was able to get the suggestion "the book of three" for "thri book". But when I search for "threebook" (three and book now combined), I am not able to get the suggestion "a book of three". How do we solve this?

On 01-Dec-2014 9:34 PM, "Alexandre Rafalovitch" wrote:
> If you need Solr to treat 'thri' (invalid English) as 'three', you
> need to tell it to do so. Look at the synonym modules in the example's
> schema.xml.
>
> Or you could do phonetic matches. You have a couple of choices for
> those, but basically it's all about the specific analyzer chains to
> experiment with. So, start with that and come back if you still have
> troubles once you understand the way analyzers work.
>
> Regards,
>    Alex.
Re: Contextual search
Hi Alex, I have specified the following in my solrconfig.xml [XML tags not preserved in the archive; values only]: on true 5 2 5 true true 5 3 wordbreak 5. I added "wordbreak 5" to break words with a minimum length of 5, so it should break my word "threebook" into "three" and "book", right? Correct me if I am wrong. But I am not getting the required search results. Kindly suggest.

On Wed, Dec 3, 2014 at 12:08 AM, Alexandre Rafalovitch wrote:
> Well, how would you expect it to solve it - in non-technical terms.
> What's the high level description of "book of three" matching
> "threebook" and not say "threeof"? Random permutation of any two
> words? It's a bit of a strange requirement so far.
>
> Regards,
>    Alex.
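The idea of breaking "threebook" against known terms can be illustrated with a toy function. This Python sketch is not Solr's word-break spellchecker: `break_word`, its `min_part_length` parameter, and the small dictionary are hypothetical, and whether the value 5 in the configuration above really acts as a per-fragment minimum length cannot be confirmed from the thread.

```python
def break_word(word, dictionary, min_part_length=1):
    """Return every two-part split of `word` whose parts are both dictionary terms."""
    splits = []
    for i in range(1, len(word)):
        left, right = word[:i], word[i:]
        # Skip splits where either fragment is shorter than the allowed minimum.
        if min(len(left), len(right)) < min_part_length:
            continue
        if left in dictionary and right in dictionary:
            splits.append((left, right))
    return splits


terms = {"a", "book", "of", "three"}  # hypothetical indexed terms
unrestricted = break_word("threebook", terms)
restricted = break_word("threebook", terms, min_part_length=5)
```

In this toy, a minimum fragment length of 5 rules out the split because "book" has only four characters. If the real setting behaves the same way, a smaller value would be worth trying, but that is an assumption to verify against the Solr documentation.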
Re: Contextual search
Hi Alex, I have specified these in solrconfig.xml [XML tags not preserved in the archive; values only]: on true 5 2 5 true true 5 3 wordbreak 5. The "wordbreak 5" lines are meant to break the word "threebook" into "three" and "book". But even then it does not find the string "A book of three". Kindly suggest the ways in which this can be done.
Re: Using Solr Spatial in conjunction with HBASE/Hadoop
Have you looked at Oracle NoSQL Database http://www.oracle.com/us/products/database/nosql/overview/index.html, a scalable key-value store? Can Solr be integrated with it? Thanks and warm regards. ashok joshi oracle -- View this message in context: http://lucene.472066.n3.nabble.com/Using-Solr-Spatial-in-conjunction-with-HBASE-Hadoop-tp4034307p4034848.html
Solr Upgrade Issue
Hi Team, We are upgrading from Solr 7.5.0 to 8.5.2. Our web application implements custom /createcore functionality. In 7.5.0 we registered a filter for core creation in web.xml, and it works fine. In 8.5.2 that custom create-core filter is not called. Is there anything restricted in 8.5.2? Please confirm. Regards, Ashokkumar M [Aspire Systems] This e-mail message and any attachments are for the sole use of the intended recipient(s) and may contain proprietary, confidential, trade secret or privileged information. Any unauthorized review, use, disclosure or distribution is prohibited and may be a violation of law. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.
How to index doc file in solr?
Hi, I would like to know how to index documents other than XML in Solr. Any comments would be appreciated! Thanks, Rohan CAUTION - Disclaimer * This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely for the use of the addressee(s). If you are not the intended recipient, please notify the sender by e-mail and delete the original message. Further, you are not to copy, disclose, or distribute this e-mail or its contents to any other person and any such actions are unlawful. This e-mail may contain viruses. Infosys has taken every reasonable precaution to minimize this risk, but is not liable for any damage you may sustain as a result of any virus in this e-mail. You should carry out your own virus checks before opening the e-mail or attachment. Infosys reserves the right to monitor and review the content of all messages sent to or from this e-mail address. Messages sent to or from this e-mail address may be stored on the Infosys e-mail system. ***INFOSYS End of Disclaimer INFOSYS***