How to set NOT clause on Date range query in Solr

2016-09-20 Thread Sandeep Khanzode
Have been trying to understand this for a while ... How can I specify a NOT
clause in the following query?

{!field f=schedule op=Intersects}[2016-08-26T12:30:00Z TO 2016-08-26T18:30:00Z]
{!field f=schedule op=Contains}[2016-08-26T12:30:00Z TO 2016-08-26T18:30:00Z]

Without LocalParams, we can specify -DateField:[2016-08-26T12:30:00Z TO
2016-08-26T18:30:00Z] to get an equivalent NOT clause. But I need a NOT
Contains date range query. I have tried a few options but I end up getting
parsing errors. Surely there must be some obvious way I am missing.

SRK

Re: I cannot get phrases highlighted correctly without using the Fast Vector highlighter

2016-09-20 Thread Koji Sekiguchi

Hello Panagiotis,

I'm sorry, but it's a feature. As for the hl.usePhraseHighlighter parameter,
when you turn it off, you may get only foo or bar highlighted in your snippets.

Koji

On 2016/09/18 15:55, Panagiotis T wrote:

I'm using Solr 6.2 (tried with 6.1 also)

I created a new core and the only change I made is adding the
following line in my schema.xml



I've indexed two simple xml files. Here's a sample:



foo bar
foo bar



I'm executing a simple query:
http://localhost:8983/solr/test/select?hl.fl=body_text_en&hl=on&indent=on&q=%22foo%20bar%22&wt=json

And here is the response:

  "response":{"numFound":2,"start":0,"docs":[
  {
"id":"foo bar",
"body_text_en":["foo bar"],
"_version_":1545790848171507712},
  {
"id":"foo bar2",
"body_text_en":["I strongly suspect that foo bar"],
"_version_":1545790848184090624}]
  },
  "highlighting":{
"foo bar":{
  "body_text_en":["foo bar"]},
"foo bar2":{
  "body_text_en":["I strongly suspect that foo bar"]}}}

If I append hl.useFastVectorHighlighter=true to my query the
highlighter correctly highlights the phrase as foo bar. Of
course I've tried explicitly appending hl.usePhraseHighlighter=true to
my query but I get the same result. I would like to get the same
result with the standard highlighter if possible.
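For reference, a hedged note not stated in the thread: the FastVectorHighlighter only works on fields indexed with term vectors, positions, and offsets, so enabling it usually requires a field definition along these lines (the type name here is an assumption):

```xml
<!-- Sketch of a schema.xml field usable by the FastVectorHighlighter:
     termVectors, termPositions and termOffsets must all be enabled. -->
<field name="body_text_en" type="text_en" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
```

Without these attributes, hl.useFastVectorHighlighter=true falls back to (or errors out of) the standard highlighter, which may explain differing behavior between the two.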


Regards





Index Size in String Field vs Text Field

2016-09-20 Thread Zheng Lin Edwin Yeo
Hi,

Would like to check: will the index size for fields defined as String
generally be smaller than for fields defined as a Text Field (e.g. with
KeywordTokenizerFactory)?

Assuming that both of them contain the same value, and there are no
additional filters for KeywordTokenizerFactory.

I'm using Solr 6.2.0.

Regards,
Edwin
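For concreteness, a hedged sketch of the two definitions being compared (field and type names are invented for illustration). Both index the whole value as a single term, so the indexed term data should be broadly comparable; the text field simply routes the value through an analyzer that emits one token:

```xml
<!-- Option 1: a plain string field (no analysis at all) -->
<field name="code_str" type="string" indexed="true" stored="true"/>

<!-- Option 2: a TextField whose only analysis step is KeywordTokenizer,
     which emits the entire input as a single token -->
<fieldType name="keyword_text" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
<field name="code_txt" type="keyword_text" indexed="true" stored="true"/>
```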


Re: [Rerank Query] Distributed search + pagination

2016-09-20 Thread Alessandro Benedetti
Perfect Joel,
keep me updated !

Cheers

On Mon, Sep 19, 2016 at 10:26 PM, Joel Bernstein  wrote:

> Alessandro, I'll be doing some testing with the re-ranker as part of
> SOLR-9403 for Solr 6.3. I'll see if I can better understand the issue
> you're bringing up during the testing. I'll report back to this thread
> after I've done some testing.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Fri, Sep 16, 2016 at 11:17 AM, Alessandro Benedetti <
> abenede...@apache.org> wrote:
>
> > In addition to that, I think the only way to solve this is to rely on the
> > aggregator node to actually re-rank after having aggregated.
> >
> > Cheers
> >
> > On Fri, Sep 9, 2016 at 11:48 PM, Alessandro Benedetti <
> > abenede...@apache.org
> > > wrote:
> >
> > > Let me explain further,
> > > let's assume a simple case when we have 2 shards.
> > > ReRankDocs =10 , rows=10 .
> > >
> > > Correct me if I am wrong Joel,
> > > What we would like :
> > > 1 page : top 10 re-scored
> > > 2 page: remaining 10 re-scored
> > > From page 3 the original scored docs.
> > > This is what happens in a single Solr instance if we set reRankDocs to 20.
> > >
> > > Let's see with sharding:
> > > To get the first page we get top 10 ( re-scored) from shard1 and top 10
> > > reranked for shard 2.
> > > Then the merged top 10 ( re-scored) will be calculated, and that is the
> > > page 1.
> > >
> > > But when we require the page 2, which means we additionally ask now :
> > > 20 docs to shard1, 10 re-scored and 10 not.
> > > 20 docs to shard2, 10 re-scored and 10 not.
> > > At this point we have 40 docs to merge and rank..
> > > The docs with the original score can go to any position (not necessarily
> > > the last 20).
> > > On page 2 we can potentially find docs with the original score.
> > > This is even more likely if the scores are on different scales (e.g. the
> > > re-scored 0-100).
> > >
> > > Am I right ?
> > > Did I make any wrong assumption so far ?
> > >
> > > Cheers
> > >
> > >
> > > On Fri, Sep 9, 2016 at 7:47 PM, Joel Bernstein 
> > wrote:
> > >
> > >> I'm not understanding where the inconsistency comes into play.
> > >>
> > >> The re-ranking occurs on the shards. The aggregator node will be sent
> > some
> > >> docs that have been re-scored and others that are not. But the sorting
> > >> should be the same as someone pages through the result set.
> > >>
> > >>
> > >>
> > >> Joel Bernstein
> > >> http://joelsolr.blogspot.com/
> > >>
> > >> On Fri, Sep 9, 2016 at 9:28 AM, Alessandro Benedetti <
> > >> abenede...@apache.org>
> > >> wrote:
> > >>
> > >> > Hi guys,
> > >> > was just experimenting with some rerankers with a really low number of
> > >> > rerank docs (10 = pageSize).
> > >> > Let's focus on the distributed environment and the manual sharding
> > >> > approach.
> > >> >
> > >> > Currently what happens is that the reranking task is delivered by
> the
> > >> > shards, they rescore the docs and then send them back to the
> > aggregator
> > >> > node.
> > >> >
> > >> > If you want to rerank only a few docs (leaving the others following
> > >> > with the original score), this can be done in a single Solr instance
> > >> > (the howMany logic manages that in the reranker).
> > >> >
> > >> > What happens when you move to a distributed environment ?
> > >> > The aggregator will aggregate both rescored and original scored
> > >> documents,
> > >> > making the final ranking inconsistent.
> > >> > On the other hand, if we make the reRankDocs threshold dynamic (to
> > >> > adapt to start+rows) we can run into the very annoying issue of a
> > >> > document sliding through the pages (visible on the first page, then
> > >> > appearing again on the third, etc.).
> > >> >
> > >> > Any thought ?
> > >> >
> > >> > Cheers
> > >> >
> > >> > --
> > >> > --
> > >> >
> > >> > Benedetti Alessandro
> > >> > Visiting card : http://about.me/alessandro_benedetti
> > >> >
> > >> > "Tyger, tyger burning bright
> > >> > In the forests of the night,
> > >> > What immortal hand or eye
> > >> > Could frame thy fearful symmetry?"
> > >> >
> > >> > William Blake - Songs of Experience -1794 England
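The merge behavior discussed in this thread can be sketched with a toy simulation (purely illustrative Python, not Solr code). Two shards each return 20 docs for page 2: the top 10 re-scored on a small scale, and the next 10 carrying original scores on a much larger scale, as in Alessandro's example. The aggregator simply sorts the union by score, so original-score docs can surface on page 2 ahead of re-scored ones:

```python
# Toy model of the distributed re-rank merge: reRankDocs=10, rows=10,
# two shards, fetching page 2 (i.e. 20 docs per shard).
def shard_results(shard, rerank=10, rows=20):
    docs = []
    for i in range(rows):
        if i < rerank:
            score = 1.0 - i * 0.05      # re-ranked scale: roughly 0..1
        else:
            score = 100.0 - i * 2.0     # original scale: roughly 0..100
        docs.append((f"s{shard}-d{i}", score, i < rerank))
    return docs

# The aggregator merges by score alone, unaware of which docs were re-ranked.
merged = sorted(shard_results(1) + shard_results(2),
                key=lambda d: d[1], reverse=True)
page2 = merged[10:20]

# Because the original 0..100 scores dominate the 0..1 re-ranked scores,
# page 2 contains docs that were never re-ranked, mixed into the ranking.
print(any(not rescored for _, _, rescored in page2))
```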

Negative Date Query for Local Params in Solr

2016-09-20 Thread Sandeep Khanzode
For Solr 6.1.0:

This works .. -{!field f=schedule op=Intersects}2016-08-26T12:00:56Z

This works .. {!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
2016-08-26T15:00:12Z]

Why does this not work?

-{!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]

SRK

Solr Special Character Search

2016-09-20 Thread Cheatham, Kevin
Hello - Has anyone out there had success with anything similar to our issue 
below and be kind enough to share?

We posted several files as text and we're able to search for alphanumeric
characters, but not for special characters such as @ or © through the
SolrCloud 5.2 Admin UI.
We've searched through lots of documentation but haven't had success yet.

We also tried posting the files not as text, but it seems we're not able to
search for any special characters below hexadecimal 20.

Any assistance would be greatly appreciated!

Thanks!

Kevin Cheatham | Office (314) 573-5534 | kevin.cheat...@graybar.com 
www.graybar.com - Graybar Works to Your Advantage 
  


Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread David Smiley
It should, I think... what happens? Can you ascertain the nature of the
results?
~ David


-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread Sandeep Khanzode
This is what I get:

{
  "responseHeader": {
    "status": 400,
    "QTime": 1,
    "params": {
      "q": "-{!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]",
      "indent": "true",
      "wt": "json",
      "_": "1474373612202"
    }
  },
  "error": {
    "msg": "Invalid Date in Date Math String:'[2016-08-26T12:00:12Z'",
    "code": 400
  }
}

SRK


Re: Solr Special Character Search

2016-09-20 Thread Alexandre Rafalovitch
What's your field definition? What happens when the text goes through the
analysis chain as you can test in Admin UI?

Regards,
   Alex
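As a rough, hedged illustration of why Alexandre's analysis-chain question matters (this is my analogy, not Solr code): word-oriented tokenizers such as StandardTokenizer discard standalone punctuation and symbols at index time, so a character like "@" or "©" never becomes a searchable term at all:

```python
import re

# Crude stand-in for a word-oriented tokenizer: keeps alphanumeric runs
# and drops standalone symbols, loosely like Solr's StandardTokenizer.
def toy_tokenize(text):
    return re.findall(r"\w+", text)

tokens = toy_tokenize("contact us @ sales ©2016 Graybar")
print(tokens)         # symbols themselves never become terms
print("@" in tokens)  # False: there is no "@" term for a query to match
```

If symbols must be searchable, a string field or an analysis chain that preserves them (verifiable in the Admin UI Analysis screen, as Alexandre suggests) is the usual direction to explore.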



Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread David Smiley
OH! Ok, the moment the query no longer starts with "{!", the query is parsed
by defType (for 'q') and will default to the lucene QParser. So then it
appears we have a clause with a NOT operator. In this parsing mode, an
embedded "{!" terminates at the "}". This means you can't put the sub-query
text after the "}"; you instead need to put it in the special "v"
local-param, e.g.:

-{!field f=schedule op=Contains v='[2016-08-26T12:00:12Z TO
2016-08-26T15:00:12Z]'}
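As a hedged sketch (the client code is illustrative, not from the thread), the URL-encoded form of this corrected query can be built like so:

```python
from urllib.parse import urlencode

# Negated Contains range, with the sub-query text moved into the special
# "v" local-param so the lucene QParser's "}" termination rule is avoided.
q = ("-{!field f=schedule op=Contains "
     "v='[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]'}")
params = urlencode({"q": q, "wt": "json"})
print(params)  # ready to append after .../select?
```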


-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


problem setting default params for /browse (velocity) queries

2016-09-20 Thread Matt Work Coarr
Good morning. I'm attempting to set some default query parameters via
solrconfig.xml but they are not taking effect.  I'm using the /browse
interface (e.g. velocity templates).

To keep it simple, let's start with "fl".  When I specify fl as a url
parameter, it does take effect.  But when I put it in solrconfig.xml, it's
not being used.

Right now, I'm putting this in a new init params element:


  
field1,field2,field3
  



Then I tell the /browse request handler to add it to the list of useParams
(I've tried the beginning and end of the list):

<requestHandler name="/browse" class="solr.SearchHandler"
useParams="query,facets,velocity,browse,myParams">
I have also tried adding "useParams=myParams" in the url.

FYI, I have been including wt=xml in my url so that I eliminate the
possibility that the velocity templates are causing a problem.

Any ideas?

Thanks!
Matt


Re: problem setting default params for /browse (velocity) queries

2016-09-20 Thread Erick Erickson
I'm not entirely sure about this, but I think you're mixing up
useParams and initParams.

If I'm reading the docs correctly, useParams is used in conjunction
with the "Request
Parameters API" and whatever you're setting is stored in a separate
file (params.json),
not solrconfig.xml.

This is separate from initParams, which is all internal to
solrconfig.xml. I _think_ you'll be fine if you substitute "initParams"
for "useParams" in your requestHandler definition.

WARNING: This isn't really code I've used much, so this may be totally off base.

Best
Erick
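A minimal sketch of the initParams route Erick describes, inside solrconfig.xml (the path and field names here are illustrative, taken from the question):

```xml
<!-- solrconfig.xml: initParams applies these defaults to any request
     handler whose path matches the "path" attribute -->
<initParams path="/browse">
  <lst name="defaults">
    <str name="fl">field1,field2,field3</str>
  </lst>
</initParams>
```

useParams, by contrast, references named param sets stored via the Request Parameters API (params.json), which is the distinction at the root of the confusion here.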



Re: problem setting default params for /browse (velocity) queries

2016-09-20 Thread Matt Work Coarr
Thanks Erick.

The pre-existing request handler for /browse (the velocity template driven
interface) already had this:

<requestHandler name="/browse" class="solr.SearchHandler"
useParams="query,facets,velocity,browse">
I just added an entry for "myParams" and added the initParams element in
solrconfig.xml.

I also tried adding an initParams with a path of /browse (similar to how the
existing initParams elements were set up).

I was wondering where these param sets on the /browse request handler were
coming from. Now that I know to look for params.json, I see a copy in my
core's conf directory and it has "query", "facets", and "velocity" defined.

I'm going to try setting via the parameters api and see what happens...

Thanks for the pointers.

Matt


Re: problem setting default params for /browse (velocity) queries

2016-09-20 Thread Erik Hatcher
Matt -

Those params (query, facets, and velocity) are defined in params.json as Erick 
mentioned.  See here:


https://github.com/apache/lucene-solr/blob/master/solr/server/solr/configsets/data_driven_schema_configs/conf/params.json
 


I did a bit of explaining of this here:
https://lucidworks.com/blog/2015/12/08/browse-new-improved-solr-5/

Hope that helps.

Erik





Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread Sandeep Khanzode
Wow. Simply awesome!

Where can I read more about this? I am not sure I understand what is going on
behind the scenes ... like which parser is invoked for !field, how we can
know which special local params exist, whether we should prefer edismax over
others, when the LuceneQParser is invoked in other conditions, etc. I would
appreciate it if you could point me to some references to catch up.

Thanks a lot ... SRK


How to limit resources in multi-tenant systems

2016-09-20 Thread Georg Sorst
Hi list!

I am running a multi-tenant system where the tenants can upload and import
their own data into their respective cores. Fortunately, Solr makes it easy
to make sure that the search indices don't mix and that clients can only
access their "cores".

However, isolating the resource consumption seems a little trickier. Of
course it's fairly easy to limit the number of documents and queries per
second for each tenant, but what if they add a few GBs of text to their
documents? What if they use millions of different filter values? This may
quickly fill up the VM heap and negatively impact the other tenants (I'm
totally fine if the search for that one tenant goes down).

Of course I can check their input data and apply a seemingly endless number
of limits for all kinds of cases but that smells. Is there a more general
solution to limit resource consumption per core? Something along the lines
of "each core may use up to 5% of the heap".

One suggestion I found on the mailing list was to run a separate Solr
instance for each tenant. While this is certainly possible there is a
significant administrative and resource overhead.

Another way may be to go full on SolrCloud and add shards and replicas as
required, but I have to limit the resources I can use.

Thanks!
Georg
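There is no built-in per-core heap quota in Solr, so the "check their input data" approach Georg mentions typically lands on the ingest path. A hedged sketch of such a client-side guard (all limits, names, and exceptions are made up for illustration):

```python
# Hypothetical ingest-side quota guard: reject a tenant's batch when it
# would exceed per-tenant document or per-document text budgets.
# Nothing here is a Solr feature; it runs in the client before indexing.
MAX_DOCS_PER_TENANT = 1_000_000
MAX_TEXT_BYTES_PER_DOC = 64 * 1024

class QuotaExceeded(Exception):
    pass

def check_batch(current_doc_count, docs):
    if current_doc_count + len(docs) > MAX_DOCS_PER_TENANT:
        raise QuotaExceeded("per-tenant document limit reached")
    for doc in docs:
        text = doc.get("body", "")
        if len(text.encode("utf-8")) > MAX_TEXT_BYTES_PER_DOC:
            raise QuotaExceeded("document text too large")
    return True

print(check_batch(10, [{"body": "small doc"}]))
```

This only bounds input volume, not query-time memory (e.g. filter-cache growth from millions of distinct filter values), which is the harder half of the problem.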


Re: problem setting default params for /browse (velocity) queries

2016-09-20 Thread Matt Work Coarr
Awesome! Thanks Erik and Erick!!

To close the loop on this, I was able to create a paramset via the REST API
and then use it in a query via ?useParams=myParams and it's working!!

Hopefully this information will help someone else...

My dataset has some text fields that should be used in more-like-this, and it
has some machine-learning classifier score fields that vary from 0..1, over
which I want to be able to facet at different score ranges.

Here's the rest call to create my paramset:

export SOLR_BASE=http://myserver.mycompany.com:8983/solr
export CORE=mycore

curl "$SOLR_BASE/$CORE/config/params" -H 'Content-type:application/json' \
  -d '{
  "update":{
"myParams":{
  "rows":"5",
  "facet": "on",
  "facet.range":
["classificationfield1","classificationfield2","classificationfield3"],
  "facet.range.start": "0.5",
  "facet.range.end": "1.0",
  "facet.range.gap": "0.1",
  "facet.range.other" : "all",
  "fl":
"title,textfield1,textfield2,classificationfield1,classificationfield2,classificationfield3,score",
  "mlt": "on",
  "mlt.fl": "textfield1,textfield2",
  "df":"_text_"}}
}'


Then I needed to add this new paramset to the *END* of the list in the
requestHandler's useParams attribute:

<requestHandler name="/browse" class="solr.SearchHandler"
useParams="query,facets,velocity,browse,myParams">
A few wiki pages that I found useful...

   - "Request Parameters API":
     https://cwiki.apache.org/confluence/display/solr/Request+Parameters+API
   - "InitParams in SolrConfig":
     https://cwiki.apache.org/confluence/display/solr/InitParams+in+SolrConfig
   - "Config API":
     https://cwiki.apache.org/confluence/display/solr/Config+API

Matt


[ANNOUNCE] Apache Solr 6.2.1 released

2016-09-20 Thread Shalin Shekhar Mangar
20 September 2016, Apache Solr™ 6.2.1 available

The Lucene PMC is pleased to announce the release of Apache Solr 6.2.1

Solr is the popular, blazing fast, open source NoSQL search platform from
the Apache Lucene project. Its major features include powerful full-text
search, hit highlighting, faceted search and analytics, rich document
parsing, geospatial search, extensive REST APIs as well as parallel SQL.
Solr is enterprise grade, secure and highly scalable, providing fault
tolerant distributed search and indexing, and powers the search and
navigation features of many of the world's largest internet sites.

This release includes 11 bug fixes since the 6.2.0 release. Some of the
major fixes are:

   - SOLR-9490: BoolField always returning false for non-DV fields when
     javabin involved (via SolrJ, or intra-node communication)
   - SOLR-9188: blockUnknown property makes inter-node communication
     impossible
   - SOLR-9389: HDFS transaction logs stay open for writes which leaks
     Xceivers
   - SOLR-9438: Shard split can fail to write commit data on shutdown,
     leading to data loss

Furthermore, this release includes Apache Lucene 6.2.1 which includes 3 bug
fixes since the 6.2.0 release.

The release is available for immediate download at:

   - http://www.apache.org/dyn/closer.lua/lucene/solr/6.2.1

Please read CHANGES.txt for a detailed list of changes:

   - https://lucene.apache.org/solr/6_2_1/changes/Changes.html

Please report any feedback to the mailing lists
(http://lucene.apache.org/solr/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring network
for distributing releases. It is possible that the mirror you are using may
not have replicated the release yet. If that is the case, please try
another mirror. This also goes for Maven access.


-- 
Regards,
Shalin Shekhar Mangar.


Re: problem setting default params for /browse (velocity) queries

2016-09-20 Thread Erik Hatcher
And for the curious, there are examples of this sort of thing here:
https://github.com/apache/lucene-solr/tree/master/solr/example/films

This is one for Alexandre to take note of ;)  #solr_examples_protips (that
admittedly could be documented better somewhere, no doubt)

Erik





Re: problem setting default params for /browse (velocity) queries

2016-09-20 Thread Matt Work Coarr
It would be nice to have a link to that films example in the cwiki "Request
Parameters API" page.

Erik H, in your Lucidworks blog post, what is the meaning of the empty
string keyed entries in each of the param sets?

"":{"v":0}


Matt


IOException errors in SolrJ client

2016-09-20 Thread Brent
I'm getting periodic errors when adding documents from a Java client app.
It's always a sequence of an error message logged in CloudSolrClient, then
an exception thrown from 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143):

Error message:
[ERROR][impl.CloudSolrClient][pool-6-thread-27][CloudSolrClient.java@904] -
Request to collection test_collection failed due to (0)
org.apache.http.NoHttpResponseException: 10.112.7.3:8983 failed to respond,
retry? 0

Exception with stack:
org.apache.solr.client.solrj.SolrServerException: IOException occured when
talking to server at: http://10.112.7.3:8983/solr/test_collection
at
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:589)
at
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:241)
at
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:230)
at
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:372)
at
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:325)
at
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1100)
at
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:871)
at
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:807)
at
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:150)
...
Caused by: org.apache.http.NoHttpResponseException: 10.112.7.3:8983 failed
to respond
at
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143)
at
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
at
org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
at
org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165)
at
org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167)
at
org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
at
org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
at
org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
at
org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at
org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
at
org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at
org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
at
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:480)
... 22 more

I'm guessing it's due to a timeout, but maybe not my client's timeout
setting, but instead a timeout between two Solr Cloud servers, perhaps when
the document is being sent from one to the other. This app is running on
machine 10.112.7.4, and test_collection has a single shard, replicated on
both 10.112.7.4 and 10.112.7.3, with .3 being the leader. Is this saying
that .4 got the add request from my app, and tried to tell .3 to add the
doc, but .3 didn't respond? If so, is it a timeout, and if so, can I
increase the timeout value?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/IOException-errors-in-SolrJ-client-tp4297015.html
Sent from the Solr - User mailing list archive at Nabble.com.


Month facet - possible bucket values are Jan, Feb, Mar,…. Nov, Dec

2016-09-20 Thread Aswath Srinivasan (TMS)
Hello,

How to build a Month facet from a date field? The facet that I’m looking for 
should have a maximum of only 12 buckets. The possible bucket values are Jan, 
Feb, Mar,…. Nov, Dec.

http://localhost:8983/solr/collection1/select?facet=on&rows=0&indent=on&q=*:*&wt=json&json.facet.category={type:range,field:cdate,start:"2000-01-01T00:00:00Z",end:NOW,gap:"+1MONTH"}

This is the query that I have so far but this doesn’t group the facet by Month, 
obviously, because of the gap:"+1MONTH"

Really appreciate the help.

Aswath NS


Very Slow Commits After Solr Index Optimization

2016-09-20 Thread vsolakhian
We are using Solr Cloud 4.10.3-cdh5.4.5 that is part of Cloudera CDH 5.4.5.
Our collection (one shard with three replicas) became really big and we
decided to delete some old records to improve performance (tests in staging
environment have shown that after reaching 500 million records the index
becomes very slow and Solr is less responsive). After deleting about 100
million records (out of 260 million), they were still shown as "Deleted Docs"
on the Solr Admin Statistics page.  This page was showing 'Optimized: No (red)'
and 'Current: No (red)'. Theoretically, having 100 million deleted (but not
removed) records would be a performance issue, and also, people tend to want a
clean picture.

Information found in Solr forums was that the only way to remove deleted
records is to optimize the index.

We knew that optimization is not a good idea and it was discussed in forums
that it should be completely removed from API and Solr Admin, but discussing
is one thing and doing it is another. To make a long story short, we tried to
optimize through the Solr API to remove deleted records:

URL=http://:8983/solr//update
curl "$URL?optimize=true&maxSegments=18&waitFlush=true"

and all three replicas of the collection were merged to 18 segments and Solr
Admin was showing "Optimized: Yes (green)", but the deleted records were not
removed (which is an inconsistency with Solr Admin or a bug in the API).
Finally, because people usually trust features found in the UI (even if official
documentation is not found, see
https://cwiki.apache.org/confluence/display/solr/Using+the+Solr+Administration+User+Interface),
the "Optimize Now" button in Solr Admin was pressed and it removed all
deleted records and made the collection look very good (in UI). Here is the
problem:

1. The index was reduced to one large (60 GB) segment (some people's opinion
is that this is good, but I doubt it).
2. Our use case includes batch updates and then a soft commit (after which
the user sees results). Commit operation that was taking about 1.5 minutes
now takes from 12 to 25 minutes.

Overall performance of our application is severely degraded.

I am not going to talk about how confusing Solr optimization is,  but I am
asking if anyone knows *what caused slowness of the commit operation after
optimization*. If the issue is having a large segment, then how is it
possible to split this segment into smaller ones (without sharding)?

Thanks,

Victor



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Very-Slow-Commits-After-Solr-Index-Optimization-tp4297022.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: problem setting default params for /browse (velocity) queries

2016-09-20 Thread Erik Hatcher

> On Sep 20, 2016, at 5:04 PM, Matt Work Coarr  wrote:
> 
> It would be nice to have a link to that films example in the cwiki "Request
> Parameters API" page.

Done (put at the end).

> 
> Erik H, in your Lucidworks blog post, what is the meaning of the empty
> string keyed entries in each of the param sets?
> 
> "":{"v":0}

It’s an internal detail to the params serialization - some kind of version 
number for internal use.   The idea being that all operations and visibility of 
the params should be through the API rather than looking at that JSON file. 

Erik



Re: Month facet - possible bucket values are Jan, Feb, Mar,…. Nov, Dec

2016-09-20 Thread Erik Hatcher
Two options come to mind -

  * index a field for just the month names
  * leverage facet.query…

  &facet.query={!key=Jan}cdate:[2003-01-01 TO 2003-01-31] OR 
cdate:[2004-01-01 TO 2004-01-31]…. 

I don’t know a way to select just “January’s” from a date field any more 
elegantly than that.  

I’d really go with indexing the month names (in addition to the full date too).

Erik


> On Sep 20, 2016, at 5:47 PM, Aswath Srinivasan (TMS) 
>  wrote:
> 
> Hello,
> 
> How to build a Month facet from a date field? The facet that I’m looking for 
> should have a maximum of only 12 buckets. The possible bucket values are Jan, 
> Feb, Mar,…. Nov, Dec.
> 
> http://localhost:8983/solr/collection1/select?facet=on&rows=0&indent=on&q=*:*&wt=json&json.facet.category={type:range,field:cdate,start:"2000-01-01T00:00:00Z",end:NOW,gap:"+1MONTH"}}
> 
> This is the query that I have so far but this doesn’t group the facet by 
> Month, obviously, because of the gap:"+1MONTH"
> 
> Really appreciate the help.
> 
> Aswath NS
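Erik's facet.query suggestion needs one range clause per indexed year for each month, so generating the twelve parameters programmatically keeps it manageable. A minimal sketch (the field name "cdate", the 2000-2016 year span, and the helper name are assumptions):

```python
from calendar import monthrange

# One facet.query per month name, OR-ing that month's date range across
# every indexed year. Field name "cdate", the 2000-2016 year span, and
# the helper name are assumptions.
def month_facet_queries(field="cdate", years=range(2000, 2017)):
    months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
    queries = []
    for i, name in enumerate(months, start=1):
        clauses = []
        for y in years:
            last_day = monthrange(y, i)[1]  # handles leap years
            clauses.append("%s:[%d-%02d-01T00:00:00Z TO %d-%02d-%02dT23:59:59Z]"
                           % (field, y, i, y, i, last_day))
        queries.append(("facet.query", "{!key=%s}%s" % (name, " OR ".join(clauses))))
    return queries

# Each tuple becomes one &facet.query=... parameter on the request.
params = month_facet_queries()
```

The {!key=...} local-param labels each bucket with the month name in the facet response, which is what makes the result read as Jan/Feb/... rather than raw ranges.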



Create collection PeerSync "no frame of reference" warnings

2016-09-20 Thread Brent
Whenever I create a collection that's replicated, I get these warnings in the
follower Solr log:

WARN  [c:test_collection s:shard1 r:core_node1
x:test_collection_shard1_replica1] o.a.s.u.PeerSync no frame of reference to
tell if we've missed updates

Are these harmless?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Create-collection-PeerSync-no-frame-of-reference-warnings-tp4297032.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Create collection PeerSync "no frame of reference" warnings

2016-09-20 Thread Pushkar Raste
If you are creating a collection these warnings are harmless. There is a
patch being worked on under SOLR-9446 (although for a different scenario);
it would help suppress these warnings.


Re: How to limit resources in multi-tenant systems

2016-09-20 Thread Erick Erickson
There's really no OOB way that I know of to do what you're asking
about. I'm not even sure what the "right thing to do" would be if such
a limit was encountered. Fail the query? Try to execute it really slowly
within the constraints? (Actually I doubt this latter is possible).

The way Lucene sorts, for instance: simply sorting requires an int array
maxDoc entries long.

The transient core stuff can help by limiting the total number of open
cores (NOTE: only stand-alone Solr, not SolrCloud). That doesn't
really address the question of one of the active cores firing a horribly
expensive query though.

What I've seen usually in this situation is that the number of docs in the
cumulative cores is monitored and tenants moved around when
the total number of docs per JVM approaches some limit (that you have
to determine empirically). Usually this follows a "long tail" pattern with
a few clients having their own dedicated JVMs down to 100s of clients
in the same JVM.

But if you allow free-form queries to come in then there's no effective way
to limit it. There is some work being done on estimating query costs and
doing something reasonable, but I don't have the JIRAs at hand and don't
know the current progress there. So often people will restrict the kinds of
queries that _can_ be performed at the app layer. After all, if you allow me
unrestricted access to Solr I can delete everything.

Not much help I know,
Erick

On Tue, Sep 20, 2016 at 11:48 AM, Georg Sorst  wrote:
> Hi list!
>
> I am running a multi-tenant system where the tenants can upload and import
> their own data into their respective cores. Fortunately, Solr makes it easy
> to make sure that the search indices don't mix and that clients can only
> access their "cores".
>
> However, isolating the resource consumption seems a little trickier. Of
> course it's fairly easy to limit the number of documents and queries per
> second for each tenant, but what if they add a few GBs of text to their
> documents? What if they use millions of different filter values? This may
> quickly fill up the VM heap and negatively impact the other tenants (I'm
> totally fine if the search for that one tenant goes down).
>
> Of course I can check their input data and apply a seemingly endless number
> of limits for all kinds of cases but that smells. Is there a more general
> solution to limit resource consumption per core? Something along the lines
> of "each core may use up to 5% of the heap".
>
> One suggestion I found on the mailing list was to run a separate Solr
> instance for each tenant. While this is certainly possible there is a
> significant administrative and resource overhead.
>
> Another way may be to go full on SolrCloud and add shards and replicas as
> required, but I have to limit the resources I can use.
>
> Thanks!
> Georg


Re: Distributing nodes with the collections API RESTORE command

2016-09-20 Thread Stephen Lewis
Hello Again,

I've just submitted a patch on Jira for this issue against the branch
"branch_6_2". This is my first time submitting a patch (or even building
solr!), so please let me know if there is anything I should change to be
more helpful.

Thanks!

On Mon, Sep 19, 2016 at 4:47 PM, Stephen Lewis  wrote:

> Thanks Hrishikesh! Looking forward to hearing from you.
>
> On Fri, Sep 16, 2016 at 9:30 PM, Hrishikesh Gadre 
> wrote:
>
>> Hi Stephen,
>>
>> Thanks for the update. I filed SOLR-9527
>>  for tracking purpose. I
>> will take a look and get back to you.
>>
>> Thanks
>> Hrishikesh
>>
>> On Fri, Sep 16, 2016 at 2:56 PM, Stephen Lewis 
>> wrote:
>>
>> > Hello,
>> >
>> > I've tried this on both solr 6.1 and 6.2, with the same result. You are
>> > right that the collections API offering collection level backup/restore
>> > from remote server is a new feature.
>> >
>> > After some more experimentation, I am fairly sure that this is a bug
>> which
>> > is specific to the leaders in backup restore. After I ran a command to
>> > restore a backup of the collection "foo" (which has maxShardsPerNode
>> set to
>> > 1 as well) with a replication factor of 2, I see consistently that the
>> > followers (replica > 1) are correctly distributed, but all of the
>> > leaders are brought up on one node.
>> >
>> > *Repro*
>> >
>> > *create *
>> > http://solr.test:8983/solr/admin/collections?action=
>> > CREATE&name=foo&numShards=3&maxShardsPerNode=1&collection.
>> > configName=test-one
>> > (after creation, all shards are on different nodes as expected)
>> >
>> > *backup*
>> > http://solr.test:8983/solr/admin/collections?action=
>> > BACKUP&name=foo-2&collection=foo&async=foo-2
>> >
>> > *delete*
>> > http://solr.test:8983/solr/admin/collections?action=DELETE&name=foo
>> >
>> > *restore*
>> > Result: All leaders are hosted on one node; followers are spread about.
>> >
>> >  {
>> >   "responseHeader" : { "status" : 0,"QTime" : 7},
>> >   "cluster" : {
>> > "collections" : {
>> >   "foo" : {
>> > "replicationFactor" : "2",
>> > "shards" : {
>> >   "shard2" : {
>> > "range" : "d555-2aa9",
>> > "state" : "active",
>> > "replicas" : {
>> >   "core_node1" : {
>> > "core" : "foo_shard2_replica0",
>> > "base_url" : "http://IP1:8983/solr";,
>> > "node_name" : "IP1:8983_solr",
>> > "state" : "active",
>> > "leader" : "true"
>> >   },
>> >   "core_node4" : {
>> > "core" : "foo_shard2_replica1",
>> > "base_url" : "http://IP2:8983/solr";,
>> > "node_name" : "IP2:8983_solr",
>> > "state" : "recovering"
>> >   }
>> > }
>> >   },
>> >   "shard3" : {
>> > "range" : "2aaa-7fff",
>> > "state" : "active",
>> > "replicas" : {
>> >   "core_node2" : {
>> > "core" : "foo_shard3_replica0",
>> > "base_url" : "http://IP1:8983/solr";,
>> > "node_name" : "IP1:8983_solr",
>> > "state" : "active",
>> > "leader" : "true"
>> >   },
>> >   "core_node5" : {
>> > "core" : "foo_shard3_replica1",
>> > "base_url" : "http://IP3:8983/solr";,
>> > "node_name" : "IP3:8983_solr",
>> > "state" : "recovering"
>> >   }
>> > }
>> >   },
>> >   "shard1" : {
>> > "range" : "8000-d554",
>> > "state" : "active",
>> > "replicas" : {
>> >   "core_node3" : {
>> > "core" : "foo_shard1_replica0",
>> > "base_url" : "http://IP1:8983/solr";,
>> > "node_name" : "IP1:8983_solr",
>> > "state" : "active",
>> > "leader" : "true"
>> >   },
>> >   "core_node6" : {
>> > "core" : "foo_shard1_replica1",
>> > "base_url" : "http://IP4:8983/solr";,
>> > "node_name" : "IP4:8983_solr",
>> > "state" : "recovering"
>> >   }
>> > }
>> >   }
>> > },
>> > "router" : {
>> >   "name" : "compositeId"
>> > },
>> > "maxShardsPerNode" : "1",
>> > "autoAddReplicas" : "false",
>> > "znodeVersion" : 204,
>> > "configName" : "test-one"
>> >   }
>> > },
>> > "properties" : {
>> >   "location" : "/mnt/solr_backups"
>> > },
>> > "live_nodes" : [
>> >   "IP5:8983_solr",
>> >   "IP3:8983_solr",
>> >   "IP6:8983_solr",
>> >   "IP4:8983_solr",
>> >   "IP7:8983_solr",
>> >   "IP1:8983_solr",
>> >   "IP8:8983_solr",
>> >   "IP9:8983_solr",
>> >   "IP

Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread David Smiley
Personally I learned this by poring over Solr's source code some time
ago.  I suppose the only official reference to this stuff is:
https://cwiki.apache.org/confluence/display/solr/Local+Parameters+in+Queries
But that page doesn't address the implications for when the syntax is a
clause of a larger query instead of being the whole query (i.e. has "{!"...
but not at the first char).

On Tue, Sep 20, 2016 at 2:06 PM Sandeep Khanzode
 wrote:

> Wow. Simply awesome!
> Where can I read more about this? I am not sure whether I understand what
> is going on behind the scenes ... like which parser is invoked for !field,
> how can we know which all special local params exist, whether we should
> prefer edismax over others, when is the LuceneQParser invoked in other
> conditions, etc? Would appreciate if you could indicate some references to
> catch up.
> Thanks a lot ...  SRK
>
> On Tuesday, September 20, 2016 5:54 PM, David
> Smiley  wrote:
>
>
>  OH!  Ok the moment the query no longer starts with "{!", the query is
> parsed by defType (for 'q') and will default to lucene QParser.  So then it
> appears we have a clause with a NOT operator.  In this parsing mode,
> embedded "{!" terminates at the "}".  This means you can't put the
> sub-query text after the "}", you instead need to put it in the special "v"
> local-param.  e.g.:
> -{!field f=schedule op=Contains v='[2016-08-26T12:00:12Z TO
> 2016-08-26T15:00:12Z]'}
>
> On Tue, Sep 20, 2016 at 8:15 AM Sandeep Khanzode
>  wrote:
>
> > This is what I get ...
> > { "responseHeader": { "status": 400, "QTime": 1, "params": { "q":
> > "-{!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
> > 2016-08-26T15:00:12Z]", "indent": "true", "wt": "json", "_":
> > "1474373612202" } }, "error": { "msg": "Invalid Date in Date Math
> > String:'[2016-08-26T12:00:12Z'", "code": 400 }}
> >  SRK
> >
> >On Tuesday, September 20, 2016 5:34 PM, David Smiley <
> > david.w.smi...@gmail.com> wrote:
> >
> >
> >  It should, I think... what happens? Can you ascertain the nature of the
> > results?
> > ~ David
> >
> > On Tue, Sep 20, 2016 at 5:35 AM Sandeep Khanzode
> >  wrote:
> >
> > > For Solr 6.1.0
> > > This works .. -{!field f=schedule op=Intersects}2016-08-26T12:00:56Z
> > >
> > > This works .. {!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
> > > 2016-08-26T15:00:12Z]
> > >
> > >
> > > Why does this not work?-{!field f=schedule
> > > op=Contains}[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]
> > >  SRK
> >
> > --
> > Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> > LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> > http://www.solrenterprisesearchserver.com
> >
> >
> >
>
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>
>
>

-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
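The fix David describes -- moving the range into the special "v" local-param so the embedded "{!...}" clause doesn't terminate at the "}" -- can be sketched as a small query-building helper (the helper name is ours; the field name and dates come from the thread):

```python
# Helper assembling the negated Contains clause; the helper name is ours,
# the field name and dates come from the thread. Putting the range in the
# "v" local-param avoids the "Invalid Date in Date Math String" parse error.
def not_contains(field, start, end):
    return "-{!field f=%s op=Contains v='[%s TO %s]'}" % (field, start, end)

q = not_contains("schedule", "2016-08-26T12:00:12Z", "2016-08-26T15:00:12Z")
# q == "-{!field f=schedule op=Contains v='[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]'}"
```

The resulting string is a single clause, so it can be combined freely with other clauses in a larger lucene-parser query.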


Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread David Smiley
So that page referenced describes local-params, and describes the special
"v" local-param.  But first, see a list of all query parsers (which lists
"field"): https://cwiki.apache.org/confluence/display/solr/Other+Parsers
and
https://cwiki.apache.org/confluence/display/solr/The+Standard+Query+Parser for
the "lucene" one.

The "op" param is rather unique... it's not defined by any query parser.  A
trick is done in which a custom field type (DateRangeField in this case) is
able to inspect the local-params, and thus define and use params it needs.
https://cwiki.apache.org/confluence/display/solr/Working+with+Dates "More
DateRangeField Details" mentions "op".  {!lucene df=dateRange
op=Contains}... would also work.  I don't know of any other local-param
used in this way.

On Tue, Sep 20, 2016 at 11:21 PM David Smiley 
wrote:

> Personally I learned this by poring over Solr's source code some time
> ago.  I suppose the only official reference to this stuff is:
>
> https://cwiki.apache.org/confluence/display/solr/Local+Parameters+in+Queries
> But that page doesn't address the implications for when the syntax is a
> clause of a larger query instead of being the whole query (i.e. has "{!"...
> but not at the first char).
>
> On Tue, Sep 20, 2016 at 2:06 PM Sandeep Khanzode
>  wrote:
>
>> Wow. Simply awesome!
>> Where can I read more about this? I am not sure whether I understand what
>> is going on behind the scenes ... like which parser is invoked for !field,
>> how can we know which all special local params exist, whether we should
>> prefer edismax over others, when is the LuceneQParser invoked in other
>> conditions, etc? Would appreciate if you could indicate some references to
>> catch up.
>> Thanks a lot ...  SRK
>>
>> On Tuesday, September 20, 2016 5:54 PM, David
>> Smiley  wrote:
>>
>>
>>  OH!  Ok the moment the query no longer starts with "{!", the query is
>> parsed by defType (for 'q') and will default to lucene QParser.  So then
>> it
>> appears we have a clause with a NOT operator.  In this parsing mode,
>> embedded "{!" terminates at the "}".  This means you can't put the
>> sub-query text after the "}", you instead need to put it in the special
>> "v"
>> local-param.  e.g.:
>> -{!field f=schedule op=Contains v='[2016-08-26T12:00:12Z TO
>> 2016-08-26T15:00:12Z]'}
>>
>> On Tue, Sep 20, 2016 at 8:15 AM Sandeep Khanzode
>>  wrote:
>>
>> > This is what I get ...
>> > { "responseHeader": { "status": 400, "QTime": 1, "params": { "q":
>> > "-{!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
>> > 2016-08-26T15:00:12Z]", "indent": "true", "wt": "json", "_":
>> > "1474373612202" } }, "error": { "msg": "Invalid Date in Date Math
>> > String:'[2016-08-26T12:00:12Z'", "code": 400 }}
>> >  SRK
>> >
>> >On Tuesday, September 20, 2016 5:34 PM, David Smiley <
>> > david.w.smi...@gmail.com> wrote:
>> >
>> >
>> >  It should, I think... what happens? Can you ascertain the nature of the
>> > results?
>> > ~ David
>> >
>> > On Tue, Sep 20, 2016 at 5:35 AM Sandeep Khanzode
>> >  wrote:
>> >
>> > > For Solr 6.1.0
>> > > This works .. -{!field f=schedule op=Intersects}2016-08-26T12:00:56Z
>> > >
>> > > This works .. {!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
>> > > 2016-08-26T15:00:12Z]
>> > >
>> > >
>> > > Why does this not work?-{!field f=schedule
>> > > op=Contains}[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]
>> > >  SRK
>> >
>> > --
>> > Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>> > LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>> > http://www.solrenterprisesearchserver.com
>> >
>> >
>> >
>>
>> --
>> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>> http://www.solrenterprisesearchserver.com
>>
>>
>>
>
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


leader/replica update error + client "expected mime type/stale state" error

2016-09-20 Thread Brent P
I'm running Solr Cloud 6.1.0, with a Java client using SolrJ 5.4.1.

Every once in a while, during a query, I get a pair of messages logged in
the client from CloudSolrClient -- an error about a request failing, then a
warning saying that it's retrying after a stale state error.

For this test, the collection (test_collection) has one shard, with RF=2.
There are two machines, 10.112.7.2 (replica) and 10.112.7.4 (leader). The
client is on 10.112.7.4. Note that the system time on 10.112.7.4 is about 1
minute, 5-6 seconds ahead of the other machine.

---
Leader (10.112.7.4) Solr log:
---
19:27:16.583 ERROR [c:test_collection s:shard1 r:core_node2
x:test_collection_shard1_replica2] o.a.s.u.StreamingSolrClients error
org.apache.http.NoHttpResponseException: 10.112.7.2:8983 failed to respond
at
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143)
at
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
at
org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
at
org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
at
org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251)
at
org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197)
at
org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
at
org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
at
org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:685)
at
org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:487)
at
org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:882)
at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
at
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:311)
at
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:185)
at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$22(ExecutorUtil.java:229)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

19:27:16.587 WARN  [c:test_collection s:shard1 r:core_node2
x:test_collection_shard1_replica2] o.a.s.u.p.DistributedUpdateProcessor
Error sending update to http://10.112.7.2:8983/solr
org.apache.http.NoHttpResponseException: 10.112.7.2:8983 failed to respond
at
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143)
at
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
at
org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
at
org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
at
org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251)
at
org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197)
at
org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
at
org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
at
org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:685)
at
org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:487)
at
org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:882)
at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
at
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:311)
at
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:185)
at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambd

JSON Facet API

2016-09-20 Thread Sandeep Khanzode
Hello,
How can I specify JSON Facets in SolrJ? The below facet query for example ... 
&json.facet={
  facet1: {
    type: query,
    q: "field1:value1 AND field2:value2",
    facet: {
      facet1sub1: {
        type: query,
        q: "{!field f=mydate op=Intersects}2016-09-08T08:00:00",
        facet: {
          id: { type: terms, field: id }
        }
      },
      facet1sub2: {
        type: query,
        q: "-{!field f=myseconddate op=Intersects}2016-09-08T08:00:00 AND -{!field f=mydate op=Intersects}2016-05-08T08:00:00",
        facet: {
          id: { type: terms, field: id }
        }
      }
    }
  }
}

 SRK

Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread Sandeep Khanzode
Thanks, David! Perhaps browsing the Solr sources may be a necessity at some 
point in time. :) SRK 

On Wednesday, September 21, 2016 9:08 AM, David Smiley 
 wrote:
 

 So that page referenced describes local-params, and describes the special
"v" local-param.  But first, see a list of all query parsers (which lists
"field"): https://cwiki.apache.org/confluence/display/solr/Other+Parsers
and
https://cwiki.apache.org/confluence/display/solr/The+Standard+Query+Parser for
the "lucene" one.

The "op" param is rather unique... it's not defined by any query parser.  A
trick is done in which a custom field type (DateRangeField in this case) is
able to inspect the local-params, and thus define and use params it needs.
https://cwiki.apache.org/confluence/display/solr/Working+with+Dates "More
DateRangeField Details" mentions "op".  {!lucene df=dateRange
op=Contains}... would also work.  I don't know of any other local-param
used in this way.

On Tue, Sep 20, 2016 at 11:21 PM David Smiley 
wrote:

> Personally I learned this by poring over Solr's source code some time
> ago.  I suppose the only official reference to this stuff is:
>
> https://cwiki.apache.org/confluence/display/solr/Local+Parameters+in+Queries
> But that page doesn't address the implications for when the syntax is a
> clause of a larger query instead of being the whole query (i.e. has "{!"...
> but not at the first char).
>
> On Tue, Sep 20, 2016 at 2:06 PM Sandeep Khanzode
>  wrote:
>
>> Wow. Simply awesome!
>> Where can I read more about this? I am not sure whether I understand what
>> is going on behind the scenes ... like which parser is invoked for !field,
>> how can we know which all special local params exist, whether we should
>> prefer edismax over others, when is the LuceneQParser invoked in other
>> conditions, etc? Would appreciate if you could indicate some references to
>> catch up.
>> Thanks a lot ...  SRK
>>
>> On Tuesday, September 20, 2016 5:54 PM, David
>> Smiley  wrote:
>>
>>
>>  OH!  Ok the moment the query no longer starts with "{!", the query is
>> parsed by defType (for 'q') and will default to lucene QParser.  So then
>> it
>> appears we have a clause with a NOT operator.  In this parsing mode,
>> embedded "{!" terminates at the "}".  This means you can't put the
>> sub-query text after the "}", you instead need to put it in the special
>> "v"
>> local-param.  e.g.:
>> -{!field f=schedule op=Contains v='[2016-08-26T12:00:12Z TO
>> 2016-08-26T15:00:12Z]'}
>>
>> On Tue, Sep 20, 2016 at 8:15 AM Sandeep Khanzode
>>  wrote:
>>
>> > This is what I get ...
>> > { "responseHeader": { "status": 400, "QTime": 1, "params": { "q":
>> > "-{!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
>> > 2016-08-26T15:00:12Z]", "indent": "true", "wt": "json", "_":
>> > "1474373612202" } }, "error": { "msg": "Invalid Date in Date Math
>> > String:'[2016-08-26T12:00:12Z'", "code": 400 }}
>> >  SRK
>> >
>> >    On Tuesday, September 20, 2016 5:34 PM, David Smiley <
>> > david.w.smi...@gmail.com> wrote:
>> >
>> >
>> >  It should, I think... what happens? Can you ascertain the nature of the
>> > results?
>> > ~ David
>> >
>> > On Tue, Sep 20, 2016 at 5:35 AM Sandeep Khanzode
>> >  wrote:
>> >
>> > > For Solr 6.1.0
>> > > This works .. -{!field f=schedule op=Intersects}2016-08-26T12:00:56Z
>> > >
>> > > This works .. {!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
>> > > 2016-08-26T15:00:12Z]
>> > >
>> > >
>> > > Why does this not work?-{!field f=schedule
>> > > op=Contains}[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]
>> > >  SRK
>> >
>> > --
>> > Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>> > LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>> > http://www.solrenterprisesearchserver.com
>> >
>> >
>> >
>>
>> --
>> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>> http://www.solrenterprisesearchserver.com
>>
>>
>>
>
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


   

Re: [Result Query Solr] How to retrieve the content of pdfs

2016-09-20 Thread Dmitry Kan
Hi Alexandre,

Could you add fl=* to your query and check the output? Alternatively, have
a look at your schema file and check which field might hold the content:
"text" or similar.

Dmitry

On 14 Sep 2016, 1:27 AM, "Alexandre Martins" <
alexandremart...@gmail.com> wrote:

> Hi Guys,
>
> I'm trying to use the latest version of Solr and I have used the post tool
> to upload 28 pdf files and it works fine. However, I don't know how to show
> the content of the files in the resulted json. Anybody know how to include
> this field?
>
> "responseHeader":{ "zkConnected":true, "status":0, "QTime":43, "params":{
> "q":"ABC", "indent":"on", "wt":"json", "_":"1473804420750"}}, "response":
> {"numFound":40,"start":0,"maxScore":9.1066065,"docs":[ { "id":
> "/home/alexandre/desenvolvimento/workspace/solr-6.2.0/pdfs_hack/abc.pdf",
> "date":["2016-09-13T14:44:17Z"], "pdf_pdfversion":[1.5], "xmp_creatortool
> ":["PDFCreator Version 1.7.3"], "stream_content_type":["application/pdf"],
> "access_permission_modify_annotations":[false], "
> access_permission_can_print_degraded":[false], "dc_creator":["abc"], "
> dcterms_created":["2016-09-13T14:44:17Z"], "last_modified":["2016-09-
> 13T14:44:17Z"], "dcterms_modified":["2016-09-13T14:44:17Z"], 
> "dc_format":["application/pdf;
> version=1.5"], "title":["ABC tittle"], "xmpmm_documentid":["uuid:
> 100ccff2-7c1c-11e6--ab7b62fc46ae"], "last_save_date":["2016-09-
> 13T14:44:17Z"], "access_permission_fill_in_form":[false], "meta_save_date
> ":["2016-09-13T14:44:17Z"], "pdf_encrypted":[false], "dc_title":["Tittle
> abc"], "modified":["2016-09-13T14:44:17Z"], "content_type":["application/
> pdf"], "stream_size":[101948], "x_parsed_by":["org.apache.
> tika.parser.DefaultParser", "org.apache.tika.parser.pdf.PDFParser"], "
> creator":["mauricio.tostes"], "meta_author":["mauricio.tostes"], "
> meta_creation_date":["2016-09-13T14:44:17Z"], "created":["Tue Sep 13
> 14:44:17 UTC 2016"], "access_permission_extract_for_accessibility":[false],
> "access_permission_assemble_document":[false], "xmptpg_npages":[3], "
> creation_date":["2016-09-13T14:44:17Z"], "resourcename":["/home/
> alexandre/desenvolvimento/workspace/solr-6.2.0/pdfs_hack/abc.pdf"], "
> access_permission_extract_content":[false], "access_permission_can_print":
> [false], "author":["abc.add"], "producer":["GPL Ghostscript 9.10"], "
> access_permission_can_modify":[false], "_version_":1545395897488113664},
>
> Alexandre Costa Martins
> DATAPREV - IT Analyst
> Software Reuse Researcher
> MSc Federal University of Pernambuco
> RiSE Member - http://www.rise.com.br
> Sun Certified Programmer for Java 5.0 (SCPJ5.0)
>
> MSN: xandecmart...@hotmail.com
> GTalk: alexandremart...@gmail.com
> Skype: xandecmartins
> Mobile: +55 (85) 9626-3631
>
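Dmitry's fl=* suggestion amounts to adding one request parameter. A minimal sketch of building such a request URL (the host, the core name "mycore", and the query term are assumptions; the extracted text only comes back if the content field is stored):

```python
from urllib.parse import urlencode

# Build a select URL that asks Solr to return every stored field (fl=*).
# Host, core name ("mycore"), and the query term are assumptions; the
# extracted PDF text only appears if the schema stores the content field.
params = urlencode({"q": "ABC", "wt": "json", "indent": "on", "fl": "*"})
url = "http://localhost:8983/solr/mycore/select?" + params
```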


Search with the start of field

2016-09-20 Thread Mahmoud Almokadem
Hello,

What is the best way to search with the start token of a field?

For example: the field contains these values 

Document1: ABC  DEF GHI
Document2: DEF GHI JKL

when I search with DEF, I want to get Document2 only. Is that possible?

Thanks,
Mahmoud 



Re: Search with the start of field

2016-09-20 Thread William Bell
Show us the FieldType and Field definitions.

On Wed, Sep 21, 2016 at 12:06 AM, Mahmoud Almokadem 
wrote:

> Hello,
>
> What is the best way to search with the start token of field?
>
> For example: the field contains these values
>
> Document1: ABC  DEF GHI
> Document2: DEF GHI JKL
>
> when I search with DEF, I want to get Document2 only. Is that possible?
>
> Thanks,
> Mahmoud
>
>


-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076
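The thread leaves the question open, but a common workaround (an assumption here, not taken from the replies) is to index a sentinel token at the start of the field and search it as a phrase, so only values whose first token matches are hit. The marker name "_bof_" and both helpers below are hypothetical:

```python
# Sentinel-token approach: prepend a marker at index time (e.g. via a
# charFilter or in the indexing client) and query it as a phrase. The
# marker name "_bof_" and both helper names are hypothetical.
SENTINEL = "_bof_"

def index_value(text):
    # Value as it would be sent to Solr at index time.
    return f"{SENTINEL} {text}"

def starts_with_query(field, first_token):
    # Phrase query: the marker must be immediately followed by the token.
    return f'{field}:"{SENTINEL} {first_token}"'

docs = {
    "Document1": index_value("ABC DEF GHI"),
    "Document2": index_value("DEF GHI JKL"),
}
# Only Document2's value begins with the marker followed by DEF.
```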


Re: JSON Facet API

2016-09-20 Thread Bram Van Dam
On 21/09/16 05:40, Sandeep Khanzode wrote:
> How can I specify JSON Facets in SolrJ? The below facet query for example ... 

SolrQuery query = new SolrQuery();
query.add("json.facet", jsonStringGoesHere);

 - Bram
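Building on Bram's answer, the nested facet from the question can be assembled as a data structure and serialized, which avoids hand-writing the JSON string (the structure mirrors the question; in SolrJ the serialized result would be passed via query.add("json.facet", ...)):

```python
import json

# Mirror of the nested facet from the question, built as a dict and then
# serialized. The serialized string is what goes into the "json.facet"
# request parameter.
facets = {
    "facet1": {
        "type": "query",
        "q": "field1:value1 AND field2:value2",
        "facet": {
            "facet1sub1": {
                "type": "query",
                "q": "{!field f=mydate op=Intersects}2016-09-08T08:00:00Z",
                "facet": {"id": {"type": "terms", "field": "id"}},
            }
        },
    }
}
json_facet_param = json.dumps(facets)
```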