Re: Dismax query special characters

2017-01-29 Thread Ahmet Arslan
Hi,

I don't think dismax recognizes AND and OR.
The special characters for dismax are +, - and quotes.

In your example, the ampersand may be causing you trouble because of URL encoding.
Ahmet

On Sunday, January 29, 2017 12:17 AM, Jarosław Grązka wrote:



Hi,

Reading the Solr documentation about the dismax query parser
https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser I
understood that dismax can interpret the following special characters:
AND, OR, +, -, and quotes (for phrases), and should ignore all others like
||, NOT, &&, ~, ^ etc. and treat them as simple strings.

But when I try a query as follows:
{
  "limit" : 10,
  params:{
  defType:"dismax",
  q:"Difference && Java",
  q.op:"OR",
  qf:"body",
  indent: "on"
  }
}
the && operator works as AND.

I also got exceptions for this query:
{
  "limit" : 10,
  params:{
  defType:"dismax",
  q:"Difference && Java NOT",
  q.op:"OR",
  qf:"body",
  indent: "on"
  }
}

Did I misunderstand something? Shouldn't it treat 'NOT' as just a string?


Re: Dismax query special characters

2017-01-29 Thread Jarosław Grązka
I ended up using the simple query parser, which probably fits my
requirements better. I can set the param
q.operators="" and it ignores all Lucene special syntax and takes the
query phrase as it is. But I still think something is wrong with dismax: I
use JSON instead of URL params, so special characters should not cause a
problem.

My final query is:
{
  limit : 10,
  params:{
  q.operators:"",
  defType:"simple",
  q:"${query}",
  q.op:"${operator}",
  qf:"${fields}",
  indent:"off"
  }
}
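For illustration, here is a minimal Python sketch that fills in the template above. The function name and its defaults are hypothetical; it only assembles the JSON body and does not send it to Solr.

```python
import json


def build_simple_query(query, operator="OR", fields="body", limit=10):
    """Build a JSON request body for the simple query parser.

    q.operators is set to an empty string so that no operators are
    enabled and the query text is taken as it is.
    """
    return {
        "limit": limit,
        "params": {
            "q.operators": "",
            "defType": "simple",
            "q": query,
            "q.op": operator,
            "qf": fields,
            "indent": "off",
        },
    }


body = build_simple_query("Difference && Java NOT")
print(json.dumps(body, indent=2))
```

Serializing with json.dumps also takes care of quoting the keys and escaping the query text.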



Re: Upgrade SOLR version - facets perfomance regression

2017-01-29 Thread SOLR4189
Method uif: we tried it too, but it didn't help
Cardinality: high
Field Type: string, tdate
DocValues: yes, for all facet fields
Facet Method: fc (but tried fcs and enum)
Facet Params:
  1. Mincount = 1
  2. Limit = 11
  3. Threads = -1
  4. Query (on tdate field for each query)

My question: is the JSON Facet API good enough, and is there a converter
from the old facet API to the new facet API?
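As a rough idea of what such a conversion involves, here is a hand-rolled Python sketch that maps a few old-style parameters (facet.field, facet.limit, facet.mincount) onto JSON Facet API terms facets. The function name and the field names are hypothetical, and per-field overrides, facet.query, facet.range and facet.threads are not handled.

```python
import json


def to_json_facet(old_params):
    """Translate a small subset of old-style facet params to json.facet."""
    fields = old_params.get("facet.field", [])
    if isinstance(fields, str):
        fields = [fields]

    facets = {}
    for field in fields:
        facet = {"type": "terms", "field": field}
        if "facet.limit" in old_params:
            facet["limit"] = int(old_params["facet.limit"])
        if "facet.mincount" in old_params:
            facet["mincount"] = int(old_params["facet.mincount"])
        facets[field] = facet  # facet name: reuse the field name
    return facets


# Hypothetical field names; limit and mincount taken from the list above.
old = {"facet.field": ["category", "created"], "facet.limit": 11, "facet.mincount": 1}
json_facet = to_json_facet(old)
print(json.dumps(json_facet, indent=2))
```

The resulting object would be sent as the json.facet parameter, or as the "facet" key of a JSON request body.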



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Upgrade-SOLR-version-facets-perfomance-regression-tp4315027p4317716.html
Sent from the Solr - User mailing list archive at Nabble.com.


Advanced Document Routing Questions

2017-01-29 Thread GW
Hi folks,

1: Can someone point me to some good documentation on how this works? Or is
it so simple that I'm overthinking it?

My understanding of document routing is that I might be able to compare the
hash range of a shard with the hash of the document id, determine the exact
node for a document, and know exactly where to send the request, reducing
Zookeeper traffic.

I'm getting ready to deploy and I have used the recommended format in my
doc id

All my work is REST/curl -> Solrcloud

I plan to watch cluster status through the admin console REST API and build
a list of OK servers to do the reads for the website.
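A minimal Python sketch of building such a list, assuming the input is the parsed JSON from the Collections API CLUSTERSTATUS action. The collection name and host names below are made up, and the function only inspects the response, it does not fetch it. A replica is kept only if its state is "active" and its node is listed in live_nodes, since the recorded state can be stale when a node goes down.

```python
def active_replica_urls(cluster_status, collection):
    """Return base URLs of replicas that are active and on a live node."""
    cluster = cluster_status["cluster"]
    live_nodes = set(cluster.get("live_nodes", []))
    shards = cluster["collections"][collection]["shards"]

    urls = set()
    for shard in shards.values():
        for replica in shard["replicas"].values():
            is_active = replica.get("state") == "active"
            is_live = replica.get("node_name") in live_nodes
            if is_active and is_live:
                urls.add(replica["base_url"])
    return sorted(urls)


# Hypothetical sample response: host2 is marked active but is not live.
sample = {
    "cluster": {
        "live_nodes": ["host1:8983_solr"],
        "collections": {
            "products": {
                "shards": {
                    "shard1": {
                        "range": "80000000-ffffffff",
                        "replicas": {
                            "core_node1": {
                                "state": "active",
                                "base_url": "http://host1:8983/solr",
                                "node_name": "host1:8983_solr",
                            },
                            "core_node2": {
                                "state": "active",
                                "base_url": "http://host2:8983/solr",
                                "node_name": "host2:8983_solr",
                            },
                        },
                    }
                }
            }
        },
    }
}

ok_servers = active_replica_urls(sample, "products")
print(ok_servers)
```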

I have a crawler that will be running mostly from 3 am Eastern to 3 am Pacific,
outside the bulk of read activity. I plan to do all posts to Who Has
Zookeeper according to the admin REST API.

Can I get some reassurance? Be gentle, this is my very first SolrCloud
deployment and it's going to production. I'm about to write a script for
something whose concept I still feel weak on.

When I'm done and I totally understand, I promise to publish a nice A - Z
REST deployment HowTo for HA with class examples in (PHP,Perl,Python)/curl.


Best regards,

GW


Arabic words search in solr

2017-01-29 Thread mohan sundaram
Hi,

In Solr search I want to search by product name using Arabic letters.
Arabic users may find it a little difficult to search for some product
names, because some characters have to be typed exactly.

Ex: إ أ آ


The characters above can only be typed with a Shift-key combination.
Usually Arabic users will type the plain “ا” character and expect to get
words like the one below.

Ex: إبرا


In my Solr schema.xml I defined the Arabic product name field as below:





  

  











  





What changes do I have to make in schema.xml? Please help me with this.



 --
Regards,
Mohan.N
096896429683


Re: Arabic words search in solr

2017-01-29 Thread Steve Rowe
Hi Mohan,

The analyzer in your text_ar field type looks like an expanded version of the 
one suggested in the Solr Reference Guide[1].

Can you give an example of a query and the indexed text you expect to match but 
doesn't?

ArabicNormalizationFilterFactory, which uses Lucene’s ArabicNormalizer[2] 
should convert alefs with hamza to plain alef, among several other 
normalizations.
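To illustrate just the alef part of that normalization, here is a small Python sketch. It is not the Lucene implementation, only a stand-in that maps the three alef variants from the question to plain alef; the function name is hypothetical.

```python
# Map alef variants to plain alef (U+0627).
ALEF_VARIANTS = {
    "\u0622": "\u0627",  # alef with madda above
    "\u0623": "\u0627",  # alef with hamza above
    "\u0625": "\u0627",  # alef with hamza below
}
ALEF_TABLE = str.maketrans(ALEF_VARIANTS)


def normalize_alef(text):
    """Replace alef-with-hamza/madda forms by plain alef."""
    return text.translate(ALEF_TABLE)


indexed = normalize_alef("\u0625\u0628\u0631\u0627")  # word typed with hamza
queried = normalize_alef("\u0627\u0628\u0631\u0627")  # word typed with plain alef
print(indexed == queried)
```

When the same normalization runs at both index and query time, the two spellings reduce to the same term and therefore match.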

The Light 10 stemming algorithm implemented by ArabicNormalizer and 
ArabicStemmer[3] is described here: 
.

[1] Solr Ref Guide: Language Analysis: Arabic 

[2] ArabicNormalizer javadocs 

[3] ArabicStemmer javadocs 


--
Steve
www.lucidworks.com

> On Jan 29, 2017, at 2:12 PM, mohan sundaram  wrote:
> 
> Hi,
> 
> In solr search I want to search with product name using Arabic letters.
> While searching, Arabic user can feel little default to search some product
> name. Because some characters need to mention while searching.
> 
> Ex: إ أ آ
> 
> 
> In the above mentioned characters, user can get combination of shift key.
> Usually if Arabic people will mention “ ا “  character and will get the
> below combined words.
> 
> Ex: إبرا
> 
> 
> In my solr schema.xml I defined product arabic name field as below
> 
> 
>  stored="true"/>
> 
> 
>   positionIncrementGap="100">
> 
>  
> 
>
> 
>
> 
> words="lang/stopwords_ar.txt" />
> 
>
> 
>
> 
>  
> 
>
> 
> 
> 
> What changes I have do in schame.xml. Please help me on this.
> 
> 
> 
> --
> Regards,
> Mohan.N
> 096896429683



Re: Advanced Document Routing Questions

2017-01-29 Thread Anshum Gupta
SolrCloud auto routes the documents to the correct shard leader, however
you would be able to reduce the extra hop by sending the document to the
correct shard. Here are a few posts that explain how the document routing
in SolrCloud works:

https://lucidworks.com/2013/06/13/solr-cloud-document-routing/
https://lucidworks.com/2014/01/06/multi-level-composite-id-routing-solrcloud/

If the extra hop isn't something you are much bothered about, I wouldn't
suggest adding the complexity to your client code.

The Java client that Solr ships with, SolrJ has a 'smart' client i.e.
CloudSolrClient, that tracks the clusterstate in zk by keeping a watch. It
also contains caching logic, and more to optimize sending requests to a
SolrCloud cluster. You might want to explore that and possibly use that
instead.
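If you do go down the client-side route, the last step of routing is a range lookup. Below is a minimal Python sketch, assuming the shard ranges are the hex strings reported in the cluster state (for example "80000000-ffffffff") and that you already have the signed 32-bit hash of the document id. Computing that hash (MurmurHash3 in the compositeId router) is deliberately left out, and the function names are hypothetical.

```python
def to_signed32(hex_text):
    """Parse a hex string as a signed 32-bit integer."""
    value = int(hex_text, 16)
    return value - (1 << 32) if value >= (1 << 31) else value


def shard_for_hash(shards, doc_hash):
    """Return the name of the shard whose range contains doc_hash."""
    for name, shard in shards.items():
        low_hex, high_hex = shard["range"].split("-")
        if to_signed32(low_hex) <= doc_hash <= to_signed32(high_hex):
            return name
    return None


# Two shards splitting the 32-bit hash space in half.
shards = {
    "shard1": {"range": "80000000-ffffffff"},
    "shard2": {"range": "0-7fffffff"},
}
print(shard_for_hash(shards, -5))
print(shard_for_hash(shards, 12345))
```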

-Anshum




Re: Commit/callbacks doesn't happen on core close

2017-01-29 Thread saiks
Hi All,

We are a big public company and we are evaluating Solr to store hundreds of
terabytes of data.
Post commit listeners getting called on core close is a must for us.

It would be great if anyone can help us fix the issue or suggest a
workaround :)

Thank you



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Commit-callbacks-doesn-t-happen-on-core-close-tp4316015p4317762.html


DocValues and facet searches

2017-01-29 Thread Stanislav Sandalnikov
Hello everyone, 

Recently we moved to DocValues fields, and now some facet
queries don't work at all with a facet.method other than enum. I know that
the enum method is more efficient for such fields than the default one anyway, but I'm
curious to find out the reason. Here are some examples:

1) This query works fine with both fc and enum methods:
select?facet.field=datasource&facet=on&indent=on&q=emotions:*&rows=0

2) This query works only with enum method: 
select?fl=taskid,docid&q=*:*&fq=((*:*+NOT+(datasource:Test))+AND+(*:*+NOT+(datasource:wikileaks.org))+AND+(IndexDate:[2014-12-31T16\:00\:00.000Z+TO+2018-12-31T16\:59\:59.999Z]))&facet=true&facet.field=emotions&facet.limit=2147483647&facet.mincount=1&facet.sort=count+desc

a) FC, debug output here: http://pastebin.com/TJqdJukg

b) ENUM, debug output here: http://pastebin.com/XJLmdxxG 
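To compare the two methods on an otherwise identical request, the URLs can be generated from one parameter list. This is a simplified Python sketch: it keeps the facet field and mincount from query 2 but drops the fq clause, and the function name is hypothetical.

```python
from urllib.parse import urlencode


def facet_query(method):
    """Build a select URL that differs only in facet.method."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),
        ("facet", "true"),
        ("facet.field", "emotions"),
        ("facet.mincount", "1"),
        ("facet.method", method),
    ]
    return "select?" + urlencode(params)


for method in ("fc", "enum"):
    print(facet_query(method))
```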


  
Here is the schema fragment with description of related fields:


…




Is this somehow related to emotions field being multivalued?

Regards
Stanislav





How to create solr custom filter

2017-01-29 Thread Mugeesh Husain
Hi,

I am looking for how to create a custom filter or tokenizer. I checked this blog post
http://solr.pl/en/2012/05/14/developing-your-own-solr-filter/ but it does
not describe how to set up your IDE, how to compile your code, or how to add libs,
so I don't know how to compile the code after following the article. It
is also a very old post, written for Solr 3.

Please suggest where I should start when writing a filter and how
to compile it, or share any suitable article if one exists on the web.

 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-create-solr-custom-filter-tp4317767.html