RE: File Descriptor/Memory Leak
Is there a firewall between the client and the server, by any chance?

CLOSE_WAIT is not a leak, but a standard TCP step at the end of a connection. So the question is why sockets are reopened that often, or why the other side does not acknowledge the TCP termination packet quickly.

I would run Ethereal to troubleshoot that. And truss/strace.

Regards,
Alex

On 8 Jul 2016 4:56 PM, "Mads Tomasgård Bjørgan" wrote:

FYI - we're using Solr-6.1.0, and the leak seems to be consistent (it occurs every single time when running with SSL).

-----Original Message-----
From: Anshum Gupta [mailto:ans...@anshumgupta.net]
Sent: Thursday, 7 July 2016 18:14
To: solr-user@lucene.apache.org
Subject: Re: File Descriptor/Memory Leak

I've created a JIRA to track this:
https://issues.apache.org/jira/browse/SOLR-9290

On Thu, Jul 7, 2016 at 8:00 AM, Shai Erera wrote:

> Shalin, we're seeing that issue too (and actually actively debugging
> it these days). So far I can confirm the following (on a 2-node cluster):
>
> 1) It consistently reproduces on 5.5.1, but *does not* reproduce on 5.4.1
> 2) It does not reproduce when SSL is disabled
> 3) After restarting the Solr process (sometimes both need to be restarted),
>    the count drops to 0, but if indexing continues, it climbs up again
>
> When it does happen, Solr seems stuck. The leader cannot talk to the
> replica, or vice versa; the replica is usually put in DOWN state and
> there's no way to fix it besides restarting the JVM.
>
> Reviewing the changes from 5.4.1 to 5.5.1, I tried reverting some that
> looked suspicious (SOLR-8451 and SOLR-8578), even though the changes
> look legit. That did not help, and honestly I did that before we
> suspected it might be the SSL. Therefore I think those are "safe", but just FYI.
>
> When it does happen, the number of CLOSE_WAITs climbs very high, to the
> order of 30K+ entries in 'netstat'.
>
> When I say it does not reproduce on 5.4.1, I really mean the numbers
> don't go as high as they do in 5.5.1. Meaning, when running without
> SSL, the number of CLOSE_WAITs is smallish, usually fewer than 10 (I
> would separately like to understand why we have any in that state at
> all). When running with SSL and 5.4.1, they stay low, on the order of
> hundreds at most.
>
> Unfortunately, running without SSL is not an option for us. We will
> likely roll back to 5.4.1, even if the problem exists there, but to a
> lesser degree.
>
> I will post back here when/if we have more info about this.
>
> Shai
>
> On Thu, Jul 7, 2016 at 5:32 PM Shalin Shekhar Mangar <shalinman...@gmail.com> wrote:
>
> > I have myself seen this CLOSE_WAIT issue at a customer. I am running
> > some tests with different versions, trying to pinpoint the cause of this leak.
> > Once I have some more information and a reproducible test, I'll open
> > a JIRA issue. I'll keep you posted.
> >
> > On Thu, Jul 7, 2016 at 5:13 PM, Mads Tomasgård Bjørgan wrote:
> >
> > > Hello there,
> > > Our SolrCloud is experiencing an FD leak while running with SSL.
> > > This is occurring on the one machine that our program is sending
> > > data to. We have a total of three servers running as an ensemble.
> > >
> > > When running without SSL, the FD count remains quite constant
> > > at around 180 while indexing. Performing a garbage collection also
> > > clears almost the entire JVM memory.
> > >
> > > However, when indexing with SSL, the FD count grows polynomially. The
> > > count increases by a few hundred every five seconds or so, and easily
> > > reaches 50 000 within three to four minutes. Performing a GC clears most
> > > of the memory on the two machines our program isn't transmitting
> > > the data directly to. The last machine is unaffected by the GC, and
> > > neither memory nor FD count resets before Solr is restarted on that machine.
> > >
> > > Performing a netstat reveals that the FD count mostly consists of
> > > TCP connections in the state "CLOSE_WAIT".
> >
> > --
> > Regards,
> > Shalin Shekhar Mangar.
>
--
Anshum Gupta
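For anyone who wants to watch the CLOSE_WAIT count mentioned in this thread over time, here is a minimal Python sketch. It assumes Linux-style `netstat -ant` output, where the state is the sixth whitespace-separated column; the sample text and the function name `count_tcp_states` are made up for illustration and are not taken from anyone's actual system.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connection states in Linux-style `netstat -ant` output."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected row layout (an assumption about the netstat flavour):
        # Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


# Hypothetical sample, only to show the expected input shape.
sample = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.1:8983           10.0.0.2:51234          CLOSE_WAIT
tcp        0      0 10.0.0.1:8983           10.0.0.2:51235          CLOSE_WAIT
tcp        0      0 10.0.0.1:8983           10.0.0.3:40000          ESTABLISHED
tcp6       0      0 :::8983                 :::*                    LISTEN
"""

states = count_tcp_states(sample)
print(states["CLOSE_WAIT"])
```

Feeding it real `netstat -ant` output every few seconds while indexing would show whether the count keeps climbing as described above.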
Re: CDCR (Solr6.x) does not start
Hi Renaud,

thank you for your response. You asked for some further information:

1. Log messages at the source cluster:
As mentioned in my addendum "CDCR (Solr6.x) does not start (logfile)", I changed the log level for all handlers to TRACE and got three messages for each shard, caused by "Action LASTPROCESSEDVERSION sent to non-leader replica ..". To me this looks like the blocker.

2.
> Replication should start even if no commit has been sent to the source cluster.
Thanks for the clarification. It helps me to understand.

3.
> The empty queue seems to indicate there is an issue, and that cdcr was
> unable to instantiate the replicator for the target cluster.
> Just to be sure, your source cluster has 4 shards, but not replica ?
> If it has replicas, can you ensure that you execute these command on the shard leader.
At the beginning I tried to replicate 4 shards with a replication factor of 3. Later on I simplified the environment by omitting the replicas (replication factor = 1).
Do you think having no replicas could be the reason for the log messages above?

Regards
Uwe

On 05.07.2016 at 14:55, Renaud Delbru wrote:
> Hi Uwe,
>
> At first look, your configuration seems correct, see my comments below.
>
> On 28/06/16 15:36, Uwe Reh wrote:
> > 9. Start CDCR
> > http://SOURCE:s_port/solr/scoll/cdcr?action=start&wt=json
> > {"responseHeader":{"status":0,"QTime":13},"status":["process","started","buffer","enabled"]}
> > ! (not even a single query to the target's zookeeper ??)
>
> Indeed, you should have observed a communication between the source
> cluster and the target zookeeper. Do you see any errors in the log of
> the source cluster ? Or a log message such as:
> "Unable to instantiate the log reader for target collection ..."
>
> > 10. Enter some test data into the SOURCE
> >
> > 11. Explicit commit in SOURCE
> > http://SOURCE:s_port/solr/scoll/update?commit=true&opensearcher=true
> > !! (at least now there should be some traffic, or?)
>
> Replication should start even if no commit has been sent to the source cluster.
>
> > 12. Check errors and queues
> > http://SOURCE:s_port/solr/scoll_shard1_replica1/cdcr?action=queues&wt=json
> > {"responseHeader":{"status":0,"QTime":0},"queues":[],"tlogTotalSize":135,"tlogTotalCount":1,"updateLogSynchronizer":"stopped"}
> > http://SOURCE:s_port/solr/scoll_shard1_replica1/cdcr?action=errors&wt=json
> > {"responseHeader":{"status":0,"QTime":0},"errors":[]}
> > ! Why is the element queues empty?
>
> The empty queue seems to indicate there is an issue, and that cdcr was
> unable to instantiate the replicator for the target cluster.
> Just to be sure, your source cluster has 4 shards, but not replica ?
> If it has replicas, can you ensure that you execute these command on the shard leader.
>
> Kind Regards
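The CDCR requests used in this thread all follow the same URL pattern, so a small helper can build them against whichever core turns out to be the shard leader. This is only a sketch: the helper name `cdcr_url`, the host name and the port 8983 are placeholders of mine (the thread only gives `SOURCE:s_port`), and the helper covers only the three actions that appear above (start, queues, errors).

```python
from urllib.parse import urlencode

# Only the actions that actually appear in this thread.
KNOWN_ACTIONS = {"start", "queues", "errors"}


def cdcr_url(base_url: str, core: str, action: str) -> str:
    """Build a CDCR request URL of the form used in the thread."""
    if action not in KNOWN_ACTIONS:
        raise ValueError(f"action not covered by this sketch: {action}")
    query = urlencode({"action": action, "wt": "json"})
    return f"{base_url}/solr/{core}/cdcr?{query}"


# Placeholder host and port; substitute the real source node.
base = "http://SOURCE:8983"
start_url = cdcr_url(base, "scoll", "start")
queues_url = cdcr_url(base, "scoll_shard1_replica1", "queues")
errors_url = cdcr_url(base, "scoll_shard1_replica1", "errors")
print(start_url)
print(queues_url)
print(errors_url)
```

Passing the leader's core name as `core` is one way to follow Renaud's advice about executing the commands on the shard leader.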
Disabling solr scoring
Hi,

Is there any way to completely disable scoring in SolrCloud, as I am always passing a sort parameter whenever I search?

And will disabling scoring improve performance?

Thanks & Regards,
Bhaumik Joshi
Query Elevation
We have a new requirement to return a particular document as the second result on the results page. For example, if the query is “coal”, this document (id: 222) should come as the second result. Please let me know if you have any solution.

--
View this message in context: http://lucene.472066.n3.nabble.com/Query-Elevation-tp4286332.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Disabling solr scoring
What about sort=_docid_ asc ?
Re: Suggester Issue
Looking at the code, the context field basically acts as the default field in a standard query parser.
Even though having multiple context fields seems to be supported when building the auxiliary suggester index, it doesn't seem to work at query time.
If you specify the context only at query time (through the suggest.cfq parameter), it is not going to work, as the context field is required to properly build the auxiliary suggester index.

Any additional insight is welcome.

Cheers

On Thu, Jul 7, 2016 at 1:13 PM, Rajesh Kapur wrote:

> Hi,
>
> Any update on this?
>
> Could you please let me know whether it is possible to pass the CFQ parameter on
> multiple fields?
>
> Thanks
>
> On Tue, Jul 5, 2016 at 9:40 AM, Rajesh Kapur wrote:
>
> > Hi,
> >
> > I tried to implement a suggester using SOLR 6.0.1 with a context field. PFB
> > the configuration we are using to implement the suggester:
> >
> >   class="solr.SuggestComponent">
> >     mySuggester
> >     AnalyzingInfixLookupFactory
> >     suggester_infixdata_dir
> >     DocumentDictionaryFactory
> >     SearchSuggestions
> >     BrandName
> >     suggest
> >     true
> >
> >     true
> >     10
> >     mySuggester
> >
> >     suggest_sitesearch
> >
> > But I am not able to get the desired output using the suggest.cfq parameter.
> > Could you please help me in getting the correct output?
> >
> > -Thanks,
> > Rajesh Kapur

--
--
Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
group.facet=true and facet on field of type int -> org.apache.solr.common.SolrException: Exception during facet.field
Hi all,

are there any limitations with regard to retrieving facet information when grouping?

When I send the following query, where the field to facet on ("m_pt_14_s_ns") is of type "string", everything works fine.

"q": "*:*",
"facet.field": "m_pt_14_s_ns",
"indent": "true",
"group.facet": "true",
"fq": "m_id_l:[* TO *]",
"wt": "json",
"facet": "true",
"group.field": "m_id_l",
"group": "true",
[..]
"facet_fields": {
  "m_pt_14_s_ns": [
    "zahlr. Ill.", 7,
    "Ill.", 4,
    "Ill., graph. Darst.", 2,
    "zahlr. Ill. (z.T. farb.))", 2,
    "überw. Ill.", 1,
    "zahlr. Cartoons jetzt durchgehend zweifarbig illustriert", 1,
    "zahlr. Ill., Kt.", 1,
    "zahlr. Ill., graph. Darst.", 1,
    "überw. Ill.", 1
  ]
},

When I try the same with a field of type “int” (m_pt_27_i_ns), I get the following exception:

"q": "*:*",
"facet.field": "m_pt_27_i_ns",
"indent": "true",
"group.facet": "true",
"fq": "m_id_l:[* TO *]",
"wt": "json",
"facet": "true",
"group.field": "m_id_l",
"group": "true",
[..]
"error": {
  "metadata": [
    "error-class", "org.apache.solr.common.SolrException",
    "root-error-class", "java.lang.IllegalStateException"
  ],
  "msg": "Exception during facet.field: m_pt_27_i_ns",
  "trace": "org.apache.solr.common.SolrException: Exception during facet.field: m_pt_27_i_ns\r\n\tat org.apache.solr.request.SimpleFacets$3.call(SimpleFacets.java:700)\r\n\tat org.apache.solr.request.SimpleFacets$3.call(SimpleFacets.java:685)\r\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\r\n\tat org.apache.solr.request.SimpleFacets$2.execute(SimpleFacets.java:639)\r\n\tat org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:710)\r\n\tat org.apache.solr.handler.component.FacetComponent.getFacetCounts(FacetComponent.java:294)\r\n\tat org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:256)\r\n\tat org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:272)\r\n\tat 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)\r\n\tat org.apache.solr.core.SolrCore.execute(SolrCore.java:2082)\r\n\tat org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:670)\r\n\tat org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:458)\r\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:225)\r\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:183)\r\n\tat org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)\r\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)\r\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\r\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)\r\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)\r\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)\r\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)\r\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\r\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)\r\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\r\n\tat org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)\r\n\tat org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)\r\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)\r\n\tat org.eclipse.jetty.server.Server.handle(Server.java:499)\r\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)\r\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)\r\n\tat org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)\r\n\tat 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)\r\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)\r\n\tat java.lang.Thread.run(Thread.java:745)\r\nCaused by: java.lang.IllegalStateException: unexpected docvalues type NUMERIC for field 'm_pt_27_i_ns' (expected=SORTED). Use UninvertingReader or index with docvalues.\r\n\tat org.apache.lucene.index.DocValues.checkField(DocValues.java:208)\r\n\tat org.apache.lucene.index.DocValues.getSorted(DocValues.java:264)\r\n\tat org.apache.lucene.search.grouping.term.TermGroupFacetCollector$SV.doSetNextReader(TermGroupFacetCollector.java:129)\r\n\tat org.apache.lucene.search.SimpleCollector.getLeafCollector(SimpleCollector.java:33)\r\n\tat org.apache.solr.request.SimpleFacets$1.getLeafCollector(SimpleFacets.java:601)\r\n\tat
Re: Integrating Stanford NLP or any other NLP for Natural Language Query
I've added multivalued fields within my SOLR schema for indexing entities extracted using NLP methods applied to the text I'm indexing, along with fields for other discrete data extracted from relational databases.

A Java application reads data out of multiple relational databases, uses NLP on the text and indexes each document (de-normalized) using SOLRJ.

I initially tried doing this with content handlers, but found it much easier to just write a Java application.

SOLRJ Java API reference:
https://cwiki.apache.org/confluence/display/solr/Using+SolrJ

Stanford NLP:
http://stanfordnlp.github.io/CoreNLP/

Best,
Jay

On Thu, Jul 7, 2016 at 9:52 PM, Puneet Pawaia wrote:

> Hi Jay
> Any place I can learn more on this method of integration?
> Thanks
> Puneet
>
> On 8 Jul 2016 02:58, "Jay Urbain" wrote:
>
> > I use Stanford NLP and cTakes (based on OpenNLP) while indexing with a
> > SOLRJ application.
> >
> > Best,
> > Jay
> >
> > On Thu, Jul 7, 2016 at 12:09 PM, Puneet Pawaia wrote:
> >
> > > Hi
> > >
> > > I am currently using Solr 5.5.x to test but can upgrade to Solr 6.x if
> > > required.
> > > I am working on a POC for natural language query using Solr. Should I use
> > > the Stanford libraries, or are there any other libraries having integration
> > > with Solr already available?
> > > Any direction in how to do this would be most appreciated. How should I
> > > process the query to give relevant results?
> > >
> > > Regards
> > > Puneet
Re: Disabling solr scoring
Can you please elaborate? I am passing a user-defined sort field and order whenever I search.

Thanks & Regards,
Bhaumik Joshi
SOLR 6: edismax search query with OR operator does not work as expected
Hello, after migrating my index from Solr 4.3 to Solr 6 I noticed that the OR logical operator in search query no longer works as expected. On Solr 4.3 query - Blue OR Red - brings all documents with Blue or Red or both tokens found. On Solr 6 the same query only brings documents with both the tokens, Blue and Red. I see some difference in the debug of the query but I cannot make much sense out of it. Was there any change between Solr 4 and 6 that would cause this? Thanks Ales Gregor
Re: Internode communication failed when enable basic authentication Solr 6.1.0
Hello,

could this be related to https://issues.apache.org/jira/browse/SOLR-9188 ?

Ales Gregor

2016-06-24 15:15 GMT+02:00 Shankar Ramalingam :

> Hi Team,
>
> Basic authentication is enabled on Solr cloud. Node1 is running on one
> machine and node2 is running on a second machine; ZooKeeper is installed on
> the second machine. I am getting an unauthorized error when basic auth is
> enabled. The error mostly occurs when one machine tries to access Solr on
> machine 2, and I can also see the error message while starting Solr.
>
> I would be grateful if you could help me resolve the issue. I saw a JIRA
> ticket stating that an internode communication issue was fixed in
> Solr 6, but I am using Solr 6 and still get the issue. Even though I am
> logged in as the admin user in the Solr UI, I sometimes get Error 401 Unauthorized
> request, mostly when a request goes from node1 to node2.
>
> [c:adm s:shard2 r:core_node2 x:adm_shard2_replica2 ]
> o.a.s.h.RequestHandlerBase
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Error from server at http://172.16.7.58:8983/solr/adm_shard2_replica1:
> Expected mime type application/octet-stream but got text/html.
>
> Error 401 Unauthorized request, Response code: 401
>
> HTTP ERROR 401
> Problem accessing /solr/adm_shard2_replica1/select. Reason:
>     *Unauthorized request, Response code: 401*
>
> *The configuration below is defined in security.json*
>
> -bash-3.2# curl --user solr:SolrRocks http://localhost:8983/solr/admin/authorization
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":0},
>   "authorization.enabled":true,
>   "authorization":{
>     "class":"solr.RuleBasedAuthorizationPlugin",
>     "user-role":{"solr":"admin"},
>     "permissions":[{
>         "name":"security-edit",
>         "role":"admin",
>         "index":1},
>       {
>         "name":"read",
>         "role":"admin",
>         "index":2},
>       {
>         "name":"collection-admin-read ",
>         "role":"admin",
>         "index":3}],
>     "":{"v":50}}}
>
> Thanks,
> Shankar.
> Contact: 91+9894546732
RE: SOLR 6: edismax search query with OR operator does not work as expected
This sounds like it might be of help:

<solrQueryParser defaultOperator="AND"/>

You can change it from AND to OR.

(If I understood you.)

Sas
query / joint
"docs": [ { "id": "...", "type_s": "ticket", "customerid_s": "100", ... ... ... ... }, { "id": "...", "type_s": "customer", "customerid_s": "100", "name_s": "FISHER", ... ... ... ... } ] Hello, I have two entitys : - tickets (type_s : ticket) - customers (type_s : customer) I want a query to find all tickets for name customer FISHER In doc ticket, I have id customer but not name customer... Any ideas ??? joint ??? Thanks Philippe
Re: SOLR 6: edismax search query with OR operator does not work as expected
Also take a look at: https://issues.apache.org/jira/browse/SOLR-8812
Searching Home's, Homes and Home
A user can type a search keyword in many ways. The following are a few examples.

If the user types any of the keywords homes, home or home's, the search should match all of the following:
1. Home
2. Home's
3. Homes

If the user types Americas, the results should include:
1. Americas
2. America's
3. America

Please suggest how to send the search query to Solr so that it includes all of these results.

--
View this message in context: http://lucene.472066.n3.nabble.com/Searching-Home-s-Homes-and-Home-tp4286341.html
Sent from the Solr - User mailing list archive at Nabble.com.
RE: Searching Home's, Homes and Home
I would start by looking at the stemming documentation - it might be of help.

Sas
Re: How to create highlight search component using Config API
If you already have highlighting defined from one of the default configsets, you can see an example of how the JSON is structured with a Config API request. I assume you already tried that, but pointing it out just in case.

Defining a highlighter with the Config API is a bit confusing to be honest, but I worked out something that works:

{"add-searchcomponent": {"highlight": {"name":"myHighlight", "class":"solr.HighlightComponent","": {"gap": {"default":"true", "name": "gap", "class":"solr.highlight.GapFragmenter", "defaults":{"hl.fragsize":100}}},"html":[{"default": "true","name": "html","class": "solr.highlight.HtmlFormatter","defaults": {"hl.simple.pre":"", "hl.simple.post":""}},{"name": "html","class": "solr.highlight.HtmlEncoder"}]}}}

Note there is an empty string after the initial class definition (shown as ""). That lets you then add the fragmenters.

(I tried to prettify that, but my mail client isn't cooperating. I'm going to add this example to the Solr Ref Guide, though, so it might be easier to see there in a few minutes.)

Hope it helps -
Cassandra

On Wed, Jun 29, 2016 at 8:00 AM, Alexandre Drouin wrote:

> Hi,
>
> I'm trying to create a highlight search component using the Config API of
> Solr 6.0.1; however, I cannot figure out how to include the elements
> fragmenter, formatter, encoder, etc...
>
> Let's say I have the following component:
>
>   name="myHighlightingComponent">
>     class="solr.highlight.GapFragmenter">
>       100
>     class="solr.highlight.HtmlFormatter">
>
> From what I can see from the documentation, my JSON should look a bit like
> this:
>
> {
>   "add-searchcomponent":{
>     "name":"myHighlightingComponent",
>     "class":"solr.HighlightComponent",
>     ??
>   }
> }
>
> However, I have no idea how to define the 2 fragmenters or the encoder. Any
> help is appreciated.
>
> Thanks
> Alex
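Since the one-line JSON in this message is hard to read, here is the same structure written as a Python dictionary, which also makes the empty-string key visible. This is a sketch only: the collection name `mycollection` and the host are placeholders, the helper `build_config_request` is my own, and the request object is built but deliberately not sent. I am assuming the usual `/solr/<collection>/config` endpoint for the Config API.

```python
import json
import urllib.request

# Same content as the JSON in the message above, just laid out.
payload = {
    "add-searchcomponent": {
        "highlight": {
            "name": "myHighlight",
            "class": "solr.HighlightComponent",
            # The empty-string key is what allows the fragmenter to be nested.
            "": {
                "gap": {
                    "default": "true",
                    "name": "gap",
                    "class": "solr.highlight.GapFragmenter",
                    "defaults": {"hl.fragsize": 100},
                }
            },
            "html": [
                {
                    "default": "true",
                    "name": "html",
                    "class": "solr.highlight.HtmlFormatter",
                    "defaults": {"hl.simple.pre": "", "hl.simple.post": ""},
                },
                {"name": "html", "class": "solr.highlight.HtmlEncoder"},
            ],
        }
    }
}


def build_config_request(base_url: str, collection: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the collection's Config API."""
    url = f"{base_url}/solr/{collection}/config"
    data = json.dumps(body).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


# Placeholder host and collection name.
req = build_config_request("http://localhost:8983", "mycollection", payload)
print(req.full_url)
print(json.dumps(payload, indent=2))
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen`, which I have left out so the sketch runs without a Solr instance.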
Re: Searching Home's, Homes and Home
I second Jamal: using a soft stemmer for your language should solve the problem.
Specifically for the English language and the cases you mentioned:

1) The minimal English stemmer should be a good solution [1]
2) The English Porter stemmer can be valid for your use case as well [2]
3) I am not sure whether the English possessive is handled by the previous filters; in case it is not, see [3]

Cheers

[1] https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-EnglishMinimalStemFilter
[2] https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-PorterStemFilter
[3] https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-ClassicFilter
Re: Integrating Stanford NLP or any other NLP for Natural Language Query
Hi Puneet,

your requirement:
"I would like users to be able to write queries in natural language rather than keyword based search."
is really vague :(
Can you help us with some specific examples, starting of course from the simplest use cases you initially have in mind?

Moving from keyword-based search to natural language is a really complex task. Proceeding step by step can help you.

Do you want, for example, to set up a basic Q&A system? In that case you should take care of query rewriting. You basically need to identify your base requirement and then build a specific parser for that.
You can use triple stores and knowledge bases to enrich both your query and your index, but let's start from the basics: what is your simplest requirement?
Re: Disabling solr scoring
: Can you please elaborate? I am passing user defined sort field and order whenever i search.

I think Mikhail just misunderstood your question -- he was giving an example of how to override the default sort (which uses score) with one that would ensure scores are not computed.

: > Is there any way to completely disable scoring in solr cloud as i am
: > always passing sort parameter whenever i search.

In general, you don't have to do anything special. Solr's internal code looks at the sort specified, and the fields requested (via the fl param), to determine if/when scores need to be computed while collecting documents. If scores aren't needed for any reason, then that info is passed down to the low-level Lucene document matching/collection code for optimizing the collection so scores aren't computed.

-Hoss
http://www.lucidworks.com/
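As a client-side illustration of the rule described above, the following Python sketch checks whether a set of request parameters mentions `score` in either `sort` or `fl`. It is not Solr's code, only a simplified check of my own: it assumes plain "field direction" sort clauses (no function sorts containing commas), and the field names `price_i`, `id` and `name_s` are hypothetical.

```python
from urllib.parse import urlencode


def mentions_score(params: dict) -> bool:
    """Return True if `score` is referenced by the sort or fl parameters.

    With no explicit sort, Solr's default sort is by score, so a missing
    `sort` parameter counts as referencing it.
    """
    sort = params.get("sort", "score desc")
    sort_fields = [clause.split()[0] for clause in sort.split(",") if clause.strip()]
    fl_fields = params.get("fl", "").replace(",", " ").split()
    return "score" in sort_fields or "score" in fl_fields


# Hypothetical field names, for illustration only.
sorted_query = {"q": "*:*", "sort": "price_i asc", "fl": "id,name_s"}
default_query = {"q": "*:*"}
score_in_fl = {"q": "*:*", "sort": "price_i asc", "fl": "*,score"}

print(mentions_score(sorted_query), urlencode(sorted_query))
print(mentions_score(default_query))
print(mentions_score(score_in_fl))
```

A request shaped like `sorted_query` is the kind for which, per the explanation above, nothing special should be needed.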
Re: Filter Query that matches all values of a field
: I have a single type field that can contain zero or more values (comma
: separated values). This field stores some sort of access value.
:
: In the filter, I am given a list of allowed values for the field and a
: document must be considered if all values contained in its field must be
: present in the allowed values specified in the filter.
: How can i write filter query for this?

My preferred solution is:

1) index the *unique* values as a multivalued StrField (ex: foo)
2) create a second field containing the *count* of unique values; CountFieldValuesUpdateProcessorFactory makes this trivial (ex: foo_count)
3) query/filter using the frange parser with l & h both = 0, and the function being foo_count minus the sum of the termfreq results for each value the user possesses.

Example query...

:1. Case #1) If the allowed values specified in the filter are (a1, a3,
:a4, a6) --> the document should not be considered since user doesn’t have

fq={!frange l=0 h=0}sub(foo_count, sum(termfreq(foo,'a1'), termfreq(foo,'a3'), termfreq(foo,'a4'), termfreq(foo,'a6')))

-Hoss
http://www.lucidworks.com/
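Since the list of allowed values changes per user, the filter above would normally be generated by the client. Here is a minimal Python sketch of that, using the field names `foo` and `foo_count` from the message; the function name `build_subset_fq` is mine, and the sketch assumes the values contain no quote or backslash characters (it rejects them rather than guessing at escaping rules).

```python
def build_subset_fq(value_field: str, count_field: str, allowed_values) -> str:
    """Build the frange filter: count of the doc's values minus the number
    of them found among the user's allowed values must equal 0."""
    unique = list(dict.fromkeys(allowed_values))  # drop duplicates, keep order
    if not unique:
        raise ValueError("at least one allowed value is required")
    terms = []
    for value in unique:
        if "'" in value or "\\" in value:
            raise ValueError(f"value not handled by this sketch: {value!r}")
        terms.append(f"termfreq({value_field},'{value}')")
    # Avoid wrapping a single term in sum(); just use it directly.
    matched = terms[0] if len(terms) == 1 else "sum(" + ",".join(terms) + ")"
    return f"{{!frange l=0 h=0}}sub({count_field},{matched})"


fq = build_subset_fq("foo", "foo_count", ["a1", "a3", "a4", "a6"])
print(fq)
```

Duplicates are removed first because a repeated allowed value would be counted twice in the sum and skew the difference.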
Re: Integrating Stanford NLP or any other NLP for Natural Language Query
Hi Alessandro

I am looking at being able to answer questions like "Can a non-compete clause in an employment agreement be enforced after the expiry of the agreement?"

On Sat, Jul 9, 2016 at 4:34 AM, Alessandro Benedetti wrote:

> Hi Puneet,
> your requirement :
> "I would like users to be able to write queries in natural language rather
> than keyword based search."
>
> Is really really vague :(
> Can you try to help us with some specific example, starting of course from
> the simplest use cases you have initially in mind ?
>
> Moving from keyword based search to natural language is a really complex
> task.
> Proceeding step by step can help you.
>
> Do you want for example to set up a Q&A basic system ?
> In that case you should take care of query rewriting.
> You need basically to identify your base requirement and then build a
> specific parser for that.
> You can use triple stores and knowledge bases to enrich both your query and
> your index, but let's start from the basis, what is your simplest
> requirement ?
>
> On Fri, Jul 8, 2016 at 1:56 PM, Jay Urbain wrote:
>
> > I've added multivalued fields within my SOLR schema for indexing entities
> > extracted using NLP methods applied to the text I'm indexing, along with
> > fields for other discrete data extracted from relational databases.
> >
> > A Java application reads data out of multiple relational databases, uses
> > NLP on the text and indexes each document (de-normalized) using SOLRJ.
> >
> > I initially tried doing this with content handlers, but found it much
> > easier to just write a Java application.
> >
> > SOLRJ Java API reference:
> > https://cwiki.apache.org/confluence/display/solr/Using+SolrJ
> >
> > Stanford NLP:
> > http://stanfordnlp.github.io/CoreNLP/
> >
> > Best,
> > Jay
> >
> > On Thu, Jul 7, 2016 at 9:52 PM, Puneet Pawaia
> > wrote:
> >
> > > Hi Jay
> > > Any place I can learn more on this method of integration?
> > > Thanks
> > > Puneet
> > >
> > > On 8 Jul 2016 02:58, "Jay Urbain" wrote:
> > >
> > > > I use Stanford NLP and cTakes (based on OpenNLP) while indexing with a
> > > > SOLRJ application.
> > > >
> > > > Best,
> > > > Jay
> > > >
> > > > On Thu, Jul 7, 2016 at 12:09 PM, Puneet Pawaia <puneet.paw...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi
> > > > >
> > > > > I am currently using Solr 5.5.x to test but can upgrade to Solr 6.x if
> > > > > required.
> > > > > I am working on a POC for natural language query using Solr. Should I use
> > > > > the Stanford libraries or are there any other libraries having integration
> > > > > with Solr already available.
> > > > > Any direction in how to do this would be most appreciated. How should I
> > > > > process the query to give relevant results.
> > > > >
> > > > > Regards
> > > > > Puneet
>
> --
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
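
For readers following the thread, the index-time approach Jay describes (run NLP over the text, store the extracted entities in multivalued fields, index one de-normalized document per record) might look roughly like the sketch below. It is written in Python for brevity rather than as the SolrJ application Jay mentions, and several things in it are assumptions: the regex-based `extract_entities` is only a hypothetical stand-in for a real NLP pipeline such as Stanford CoreNLP, and the field names (`source_s`, `entities_ss`) and the sample row are made up for illustration.

```python
import json
import re

# Hypothetical stand-in for a real NLP pipeline (e.g. Stanford CoreNLP):
# it simply treats runs of two or more capitalized words as "entities".
ENTITY_PATTERN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")


def extract_entities(text):
    """Return distinct capitalized multi-word phrases, in order of first appearance."""
    seen = []
    for match in ENTITY_PATTERN.finditer(text):
        phrase = match.group(0)
        if phrase not in seen:
            seen.append(phrase)
    return seen


def build_document(row):
    """Flatten one database row (a dict) into a single de-normalized document."""
    return {
        "id": str(row["id"]),
        "text": row["text"],
        # Discrete value taken straight from the relational source (assumed field name).
        "source_s": row["source"],
        # Multivalued field holding every extracted entity (assumed field name).
        "entities_ss": extract_entities(row["text"]),
    }


# Made-up sample row standing in for data read from a relational database.
rows = [
    {
        "id": 1,
        "source": "contracts_db",
        "text": "Acme Corp hired Jane Doe. Jane Doe signed a non-compete.",
    },
]

docs = [build_document(r) for r in rows]
payload = json.dumps(docs, indent=2)
print(payload)
```

The resulting list of documents is what an indexing client would send to Solr; how it is sent (SolrJ, HTTP, batching, commits) is left out on purpose, since the thread does not describe it.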
Re: Integrating Stanford NLP or any other NLP for Natural Language Query
Hi Alessandro

I am looking at being able to answer questions like "Can a non-compete clause in an employment agreement be enforced after the expiry of the agreement?"

We are doing some testing with IBM Watson and with a sample test data, we are able to get relevant replies to the above question. Since IBM Watson uses Solr at its backend, I was wondering if we can get the same working at the Solr level without having to use Watson.

Regards
Puneet

On Sat, Jul 9, 2016 at 11:34 AM, Puneet Pawaia wrote:

> Hi Alessandro
>
> I am looking at being able to answer questions like "Can a non-compete
> clause in an employment agreement be enforced after the expiry of the
> agreement?"
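
As a very rough illustration of the "query rewriting" step Alessandro suggests for a basic Q&A setup, the sketch below turns Puneet's example question into a keyword-style query string. Everything in it is an assumption made for illustration: the stopword list, the hand-written `DOMAIN_PHRASES`, and the output format are hypothetical, and this is neither what IBM Watson does nor a built-in Solr feature. A real system would need proper linguistic analysis rather than string matching.

```python
import re

# Hypothetical, hand-picked stopwords; a real list would be much longer.
STOPWORDS = {"can", "a", "an", "the", "in", "be", "of", "after", "is", "to"}

# Hypothetical domain phrases that should stay together as quoted phrases.
DOMAIN_PHRASES = ["non-compete clause", "employment agreement"]


def rewrite_question(question):
    """Rewrite a natural-language question into a keyword-style query string."""
    text = question.lower().strip().rstrip("?")
    clauses = []
    # Pull out known multi-word phrases first and keep them quoted.
    for phrase in DOMAIN_PHRASES:
        if phrase in text:
            clauses.append('"%s"' % phrase)
            text = text.replace(phrase, " ")
    # Keep the remaining non-stopword tokens, without repeats.
    for token in re.findall(r"[a-z0-9-]+", text):
        if token not in STOPWORDS and token not in clauses:
            clauses.append(token)
    return " ".join(clauses)


question = (
    "Can a non-compete clause in an employment agreement "
    "be enforced after the expiry of the agreement?"
)
print(rewrite_question(question))
# prints: "non-compete clause" "employment agreement" enforced expiry agreement
```

Note that dropping "after" as a stopword also drops the temporal meaning of the question, which is one reason keyword rewriting alone is unlikely to match the answers Puneet reports getting from Watson.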