"Path must not end with / character" error during performance tests

2013-08-20 Thread Tanya
Hi,

I have integrated SolrCloud search into a system with a single shard, and
it works fine in single-request tests.

Recently we started running performance tests, and after a while we get
the following exception.

Help is really appreciated,

Thanks
Tanya

2013-08-20 10:45:56,128 [cTaskExecutor-1] ERROR
LoggingAspect  - Exception

java.lang.RuntimeException: java.lang.IllegalArgumentException: Path must
not end with / character

at
org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:123)

at
org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:88)

at
org.apache.solr.common.cloud.ZkStateReader.<init>(ZkStateReader.java:148)

at
org.apache.solr.client.solrj.impl.CloudSolrServer.connect(CloudSolrServer.java:147)

at
org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:173)

at
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)

at
com.mcm.search.solr.SolrDefaultIndexingAdapter.updateEntities(SolrDefaultIndexingAdapter.java:109)

at
com.mcm.search.AbstractSearchEngineMediator.updateDocuments(AbstractSearchEngineMediator.java:89)

at
com.alu.dmsp.search.mediator.wrapper.SearchMediatorWrapper.updateDocuments(SearchMediatorWrapper.java:255)

at
com.alu.dmsp.business.module.beans.DiscoverySolrIndexingRequestHandler.executeLogic(DiscoverySolrIndexingRequestHandler.java:122)

at
com.alu.dmsp.business.module.beans.DiscoverySolrIndexingRequestHandler.executeLogic(DiscoverySolrIndexingRequestHandler.java:36)

at
com.alu.dmsp.common.business.BasicBusinessServiceExecutionRequestHandler.executeRequest(BasicBusinessServiceExecutionRequestHandler.java:107)

at
com.alu.dmsp.common.business.BasicBusinessServiceExecutionRequestHandler.execute(BasicBusinessServiceExecutionRequestHandler.java:84)

at
com.alu.dmsp.common.business.beans.BasicBusinessService.executeRequest(BasicBusinessService.java:92)

at
com.alu.dmsp.common.business.beans.BasicBusinessService.execute(BasicBusinessService.java:79)

at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)

at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at
org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:319)

at
org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)

at
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)

at
org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:80)

at
com.alu.dmsp.common.log.LoggingAspect.logWebServiceMethodCall(LoggingAspect.java:143)

at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)

at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at
org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:621)

at
org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:610)

at
org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:65)

at
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)

at
org.springframework.aop.aspectj.AspectJAfterThrowingAdvice.invoke(AspectJAfterThrowingAdvice.java:55)

at
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)

at
org.springframework.aop.framework.adapter.MethodBeforeAdviceInterceptor.invoke(MethodBeforeAdviceInterceptor.java:50)

at
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)

at
org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:90)

at
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)

at
org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)

at com.sun.proxy.$Proxy44.execute(Unknown Source)

   at
com.alu.dmsp.business.indexing.beans.DiscoveryIndexingRequestHandler.indexEntity(DiscoveryIndexingRequestHandler.java:54)

at
com.alu.dmsp.business.indexing.beans.DiscoveryIndexingRequestHandler.executeLogic(DiscoveryIndexingRequestHandler.java:42)

at
com.alu.dmsp.business.indexing.beans.DiscoveryIndexingRequestHandler.executeLogic(DiscoveryIndexingRequestHandler.java:26)

at

Re: "Path must not end with / character" error during performance tests

2013-08-21 Thread Tanya
Erick,

The problem does not happen on every call, so I assume it is not a
configuration issue.

BR

Tanya



>It looks like you've specified your zkHost (?) as something like
>machine:port/solr/
>
>rather than
>machine:port/solr
>
>Is that possible?
>
>Best,
>Erick
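Erick's hypothesis can be guarded against in client code by normalizing the
ZooKeeper host string before handing it to the Solr client. A minimal
illustrative sketch (the helper name is mine, not part of SolrJ or any Solr
API):

```python
def normalize_zk_host(zk_host: str) -> str:
    """Strip a trailing '/' from each entry in a zkHost string.

    SolrZkClient rejects chroot paths that end with '/': a value like
    'machine:2181/solr/' must be passed as 'machine:2181/solr'.
    """
    # zkHost may be a comma-separated ensemble list, optionally
    # with a chroot suffix on the last entry.
    hosts = [h.strip().rstrip("/") for h in zk_host.split(",")]
    return ",".join(hosts)
```

For example, normalize_zk_host("machine:2181/solr/") returns
"machine:2181/solr", which SolrZkClient accepts.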




On Tue, Aug 20, 2013 at 7:10 PM, Tanya  wrote:

> Hi,
>
> I have integrated SolrCloud search in some system with single shard and
> works fine on single tests.
>
> Recently we started to run performance test and we are getting following
> exception after a while.
>
> Help is really appreciated,
>
> Thanks
> Tanya
>

SOLR 7.0 DIH out of memory issue with sqlserver

2018-09-18 Thread Tanya Bompi
Hi,
  I have a Solr 7.0 setup with the DataImportHandler connecting to a SQL
Server database. I keep getting OutOfMemoryError: Java heap space when doing
a full import. The data set is around 3 million records, so not very large.
I tried the following steps, and nothing has helped so far.

1. Setting "responseBuffering=adaptive;selectMethod=Cursor" in the JDBC
connection string.
2. Setting batchSize="-1", which hasn't helped.
3. Increasing the heap size at Solr startup by issuing the command \solr
start -m 1024m -p 8983
Increasing the heap size further prevents the Solr instance from starting at
all.

I am wondering what could be causing the issue and how to resolve it.
Below is the data-config:


  
  

Thanks,
Tanya
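The data-config XML above did not survive the archive. For reference, a
minimal sketch of a DIH data-config carrying the settings Tanya describes
(the driver class is the standard Microsoft one; the host, database, table,
and field names are illustrative assumptions, not the original file):

```xml
<dataConfig>
  <!-- batchSize="-1" asks the driver for a streaming result set;
       responseBuffering=adaptive is set on the JDBC URL itself -->
  <dataSource type="JdbcDataSource"
              driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://dbhost;databaseName=mydb;responseBuffering=adaptive"
              user="solr" password="***"
              batchSize="-1"/>
  <document>
    <entity name="record" query="SELECT Id, Title, Description FROM Records">
      <field column="Id" name="id"/>
      <field column="Title" name="title"/>
      <field column="Description" name="description"/>
    </entity>
  </document>
</dataConfig>
```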


Re: SOLR 7.0 DIH out of memory issue with sqlserver

2018-09-18 Thread Tanya Bompi
Hi,
  I am using version 6.4 of the Microsoft JDBC driver with Solr 7.4.0. I have
tried removing selectMethod=Cursor, and it still runs out of heap space.
Has anyone faced a similar issue?

Thanks
Tanya


On Tue, Sep 18, 2018 at 6:38 PM Shawn Heisey  wrote:

> On 9/18/2018 4:48 PM, Tanya Bompi wrote:
> >I have the SOLR 7.0 setup with the DataImportHandler connecting to the
> > sql server db. I keep getting OutOfMemory: Java Heap Space when doing a
> > full import. The size of the records is around 3 million so not very
> huge.
> > I tried the following steps and nothing helped thus far.
>
> See this wiki page:
>
> https://wiki.apache.org/solr/DataImportHandlerFaq
>
> You already have the suggested fix -- setting responseBuffering to
> adaptive.  You might try upgrading the driver.  If that doesn't work,
> you're probably going to need to talk to Microsoft about what you need
> to do differently on the JDBC url.
>
> I did find this page:
>
>
> https://docs.microsoft.com/en-us/sql/connect/jdbc/using-adaptive-buffering?view=sql-server-2017
>
> This says that when using adaptive buffering, you should avoid using
> selectMethod=cursor.  So you should try removing that parameter.
>
> Thanks,
> Shawn
>
>


Re: SOLR 7.0 DIH out of memory issue with sqlserver

2018-09-19 Thread Tanya Bompi
Hi Erick,
  Thank you for the follow-up. I have resolved the issue by increasing the
heap size: with the Solr JVM initialized with a 3 GB heap, a subset of 1
million records was fetched successfully, although it still fails with the
full 3 million records. So something is off with the adaptive buffering
setting, as I can see it is not helping. I also set the autoSoftCommit
parameter. I might have to increase the heap size further to see if it
helps. I will post again if the issue is not resolved.

Thanks,
Tanya

On Wed, Sep 19, 2018 at 8:22 AM Erick Erickson 
wrote:

> Has this ever worked? IOW, is this something that's changed or has
> just never worked?
>
> The obvious first step is to start Solr with more than 1G of memory.
> Solr _likes_ memory and a 1G heap is quite small. But you say:
> "Increasing the heap size further doesnt start SOLR instance itself.".
> How much RAM do you have on your machine? What other programs are
> running? You should be able to increase the heap and start Solr if you
> have the RAM on your machine so I'd figure out what's behind that
> issue first. I regularly start Solr with 16 or 32G of memory on my
> local machines, I know of installations running Solr with 60G heaps so
> this points to something really odd about your environment.
>
> When you "increase it further", exactly _how_ does Solr fail to start?
> What appears in the Solr logs? etc. Really, troubleshoot that issue
> first I'd recommend.
>
> If DIH still needs a ridiculous amount of memory, it's usually the
> JDBC driver trying to read all the rows into memory at once and you'll
> have to explore the jdbc driver settings in detail.
>
> Best,
> Erick


Nested entity unrolled record

2018-09-21 Thread Tanya Bompi
Hi,
  I am wondering whether, with a nested entity using the DataImportHandler,
it is possible to unroll the parent document so that each child document
becomes a separate entry in the index.
I am using Solr 7.4.0.

For example:




  



What I would like is for the index to contain a separate entry for each
Id, Name, Country combination.
Currently, with the above config, all the country entries are combined into
a single field on each parent record.

What would the change look like? In my scenario the child table has multiple
fields, and combining all the data into one field per parent is not what I
want.

Kindly let me know.
Thanks,
Tanya
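The entity definitions above were stripped from the archive. One common way
to get one index entry per child row (a hedged sketch only; the table and
field names are invented for illustration) is to skip the nested entity and
flatten parent and child in a single SQL join, so each joined row becomes
its own document:

```xml
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://dbhost;databaseName=mydb"/>
  <document>
    <!-- One document per (Id, Name, Country) row instead of one per parent -->
    <entity name="flat"
            query="SELECT p.Id, p.Name, c.Country
                   FROM Parent p JOIN Child c ON c.ParentId = p.Id">
      <field column="Id" name="parentId"/>
      <field column="Name" name="name"/>
      <field column="Country" name="country"/>
    </entity>
  </document>
</dataConfig>
```

Note that the schema's uniqueKey must then be unique per joined row (for
example a composite of parent id and country), otherwise later rows
overwrite earlier ones.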


Boosting score based off a match in a particular field

2018-11-28 Thread Tanya Bompi
Hi,
  I have an index that is built from a combination of fields (Title,
Description, Phone, Email, etc.). I have indexed all the fields individually
as well as a combined copy field.
The query I issue is a combination of all the fields as input
(Title + Description + Phone + Email).
There are samples where, even when the Email or Phone matches, the resulting
Solr score is still lower. I have tried boosting the fields, e.g. Email^2,
but that causes any token in the input query to be matched against the email
field, which produces erroneous results.

How can I formulate a query that boosts an Email match against the Email
field while still matching the combined input against the combined field
index?

Thanks,
Tanya


Re: Boosting score based off a match in a particular field

2018-11-28 Thread Tanya Bompi
Hi Doug,
  Thank you for your response. I tried the above boost syntax, but I get the
following error about going into an infinite loop. From the wiki page I
couldn't figure out what the 'v' parameter is (
https://lucene.apache.org/solr/guide/7_0/the-extended-dismax-query-parser.html).
I will try the analysis tool as well.

"bq":"{!edismax mm=80% qf=ContactEmail^100 v=$q}"}},
"error":{ "metadata":[ "error-class","org.apache.solr.common.SolrException",
"root-error-class","org.apache.solr.search.SyntaxError"],
"msg":"org.apache.solr.search.SyntaxError:
Infinite Recursion detected parsing query

Thank you,
Tanya

On Wed, Nov 28, 2018 at 12:36 PM Doug Turnbull <
dturnb...@opensourceconnections.com> wrote:

> The terminology we use at my company is you want to *gate* the effect of
> boost to only very precise scenarios. A lot of this depends on how your
> Email and Phone numbers are being tokenized/analyzed (ie what analyzer is
> on the field type), because you really only want to boost when you have
> high confidence email/phone number matches. You may actually have more of a
> matching problem than a relevance problem. You can debug this in the Solr
> analysis screen.
>
> Another tool you can use is putting a mm on just the boost query. This
> gates that specific boost based on how many query terms match that field.
> It's good for doing a kind of poor-man's entity recognition (how much does
> the query correspond to one kind of entity)
>
> Something like
>
> bq={!edismax mm=80% qf=Email^100 v=$q} <--Boost emails only when there's a
> strong match, 80% of query terms match the email
>
> alongside your main qf with the combined field
>
> qf=text_all
>
> There's a lot of strategies, and it usually involves a combination of query
> and analysis work (and lots of good test data to prove your approach works)
>
> (shameless plug is we cover a lot of this in Solr relevance training
> https://opensourceconnections.com/events/training/)
>
> Hope that helps
> -Doug
>
>
> --
> CTO, OpenSource Connections
> Author, Relevant Search
> http://o19s.com/doug
>


Similarity plugins which are normalized

2018-11-29 Thread Tanya Bompi
Hi,
  As I am tuning the relevancy of my query parser, I see that two different
queries with phrase matches get very different scores, primarily influenced
by the term-frequency component. Since I am using a threshold on the Solr
score to filter the results for a matched record, a somewhat normalized
score is needed.
Are there any similarity classes more suitable to my needs?

Thanks,
Tanu


Re: Similarity plugins which are normalized

2018-11-29 Thread Tanya Bompi
Thanks a lot, Doug. Maybe giving more weight to certain fields, in
conjunction with the overall match, is the way to go.

Tanu

On Thu, Nov 29, 2018 at 1:52 PM Doug Turnbull <
dturnb...@opensourceconnections.com> wrote:

> The usual advice is relevance scores don’t exist on a scale where a
> threshold is useful. As these are just heuristics used for ranking , not a
> confidence level.
>
> I would instead focus on what attributes of a document consider it relevant
> or not (strong match in certain fields).
>
> A couple of things prevent field scores from being comparable:
> - doc freq differs per field
> - field length/ avg field length differs per field
> - typical term frequency of a term in a field differs
>
> You might find this article useful:
>
>
> https://opensourceconnections.com/blog/2013/07/02/getting-dissed-by-dismax-why-your-incorrect-assumptions-about-dismax-are-hurting-search-relevancy/
>
> Doug
>
> --
> *Doug Turnbull **| *Search Relevance Consultant | OpenSource Connections
> <http://opensourceconnections.com>, LLC | 240.476.9983
> Author: Relevant Search <http://manning.com/turnbull>
> This e-mail and all contents, including attachments, is considered to be
> Company Confidential unless explicitly stated otherwise, regardless
> of whether attachments are marked as such.
>


Post/Get API call for Solr results differ from web query

2018-12-13 Thread Tanya Bompi
Hi,
  I have a Python scraper that queries the Solr index, gets the top 'n'
results, and processes them further. What I see is that with the bq
parameter set (it contains a lot of special characters), the output from the
Python call differs from the results I get when the same query is issued in
the Solr web portal. I am trying to understand what could be causing the
difference. Without the bq parameter, the results from the Python request
match the web output.

The Python code for the request is as follows; the params should be
URL-encoded, and from what I understand no additional processing is needed.
Below is the snippet of code being issued:

import requests
payload = {'defType':'edismax', 'fl':'*,score','rows':'3',
'fq':'Status:Active', 'bq':'{!edismax bq='' mm=50% qf="ContactEmail^2
ContactName^2 URL^2 CompanyName^2"  v=$q}', 'qf':'searchfield', 'q':input}
r = requests.post(SolrIndexUrl, data=payload)
response = r.json()


The responseHeaders from the python call are:
"responseHeader":{
"status":0,
"QTime":0,
"params":{
  "q":"sample input",
  "defType":"edismax",
  "qf":"searchfield",
  "fl":"*,score",
  "fq":"Status:Active",
  "rows":"3",
  "bq":"{!edismax bq= mm=50% qf=\"ContactEmail^2 ContactName^2 URL^2
CompanyName^2\"  v=$q}"}
  },

The response headers from the Json output on the Solr Web portal are:
"responseHeader":{ "status":0, "QTime":0, "params":{ "q":"sample input", "
defType":"edismax", "qf":"searchfield", "fl":"*,score", "fq":"Status:active",
"_":"1544661362690", "bq":"{!edismax bq='' mm=50% qf=\"ContactEmail^2
ContactName^2 URL^2 CompanyName^2\" v=$q}"}},

The response headers seem to match, but not the results. Could someone let
me know what the issue could be?

Thanks,
Tanya


Re: Post/Get API call for Solr results differ from web query

2018-12-13 Thread Tanya Bompi
I found the issue. The single quotes in the request params were being read
as an empty pair of adjacent literals and need to be escaped with '\'.
Python, instead of throwing an error, was simply concatenating the strings,
which is a poor design. The payload below yields the same results as the web
request in the Solr portal.

payload = {'defType':'edismax', 'fl':'*,score','rows':'3',
'fq':'Status:Active', 'bq':'{!edismax bq=\'\' mm=50% qf="ContactEmail^2
ContactName^2 URL^2 CompanyName^2"  v=$q}', 'qf':'searchfield', 'q':input}

Tanya
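The root cause is Python's implicit concatenation of adjacent string
literals: '' inside a single-quoted string does not produce an empty quoted
value, it ends one literal and starts the next. A small sketch of the
difference:

```python
# Adjacent string literals are silently concatenated, so the
# intended bq='' (empty sub-query) disappears entirely:
broken = 'bq='' mm=50%'    # parsed as 'bq=' + ' mm=50%'
fixed = 'bq=\'\' mm=50%'   # escaped quotes survive into the string

print(broken)  # bq= mm=50%
print(fixed)   # bq='' mm=50%
```

This is why the request succeeded but returned different results: the
server received a different bq string than intended, with no error raised
on the client side.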

On Thu, Dec 13, 2018 at 10:02 AM Jan Høydahl  wrote:

> Hi
>
> I don't see what the actual problem is here.
> What were you expecting to see in the response, and what do you see?
>
> Please try to reproduce the issue you think you are seeing with a tool
> like cURL, you can issue both POST and GET requests with cURL and many
> other tools. The result should be exactly the same whether you POST or GET
> your query. My guess is that you have some mismatch in encoding, escaping
> or similar in the Python Solr client you are using. So that is why I
> encourage you to reproduce the issue with cURL or in the browser.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>


terms not to match in a search query

2018-12-13 Thread Tanya Bompi
Hi,
  If there are certain terms in the query, like "pvt" and "ltd", that I do
not want matched against the index, is there a way to specify a list of such
words in the configuration so they are not made part of the query?

For instance, is it possible to add the terms to stopwords.txt, or to some
other file treated as a blacklist that is applied at query time?

Also, is there a configuration setting for a minimum length of the words
used in matching when retrieving documents? Basically, any word of length
< 3 after tokenization should be ignored.

Kindly let me know.

Thanks,
Tanya
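Both behaviors can be handled at analysis time in the field type (a hedged
sketch; the field-type name and tokenizer choice are illustrative):
solr.StopFilterFactory drops words listed in stopwords.txt, and
solr.LengthFilterFactory drops tokens outside a length range. Applying them
in the analyzer chain keeps terms like "pvt"/"ltd" and words shorter than
three characters out of matching:

```xml
<fieldType name="text_filtered" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- drop blacklisted words such as pvt, ltd -->
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <!-- drop tokens shorter than 3 characters -->
    <filter class="solr.LengthFilterFactory" min="3" max="255"/>
  </analyzer>
</fieldType>
```

If the words should still be indexed but ignored only at query time, the
fieldType can instead declare separate analyzer type="index" and
type="query" chains and put the filters only in the query chain.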