Re: Use case for the Shingle Filter

2017-03-06 Thread Ryan Yacyshyn
The query parser will split on whitespace. I'm not sure how I can use the
shingle filter in my query, or what its use cases are. For example, if my
fieldType looks like this:

  [fieldType definition stripped by the list archive]

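Since the archive stripped the XML, here is a guess at a fieldType that would produce the behavior described in this thread (every name and parameter below is illustrative, not the poster's actual definition):

```xml
<!-- A whitespace tokenizer followed by a shingle filter with an empty
     token separator, which would produce "thebaby babysitter sitterwas"
     style tokens alongside the unigrams. -->
<fieldType name="text_shingle" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" minShingleSize="2"
            maxShingleSize="2" outputUnigrams="true" tokenSeparator=""/>
  </analyzer>
</fieldType>
```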
and I have a document that has "my babysitter is terrific" in the content_t
field, a query such as:

http://localhost:8983/solr/collection_name/select?q={!lucene}content_t:(the baby sitter was here)

won't return the document. I was hoping I'd get tokens like "the
thebaby baby babysitter sitter sitterwas ..." when querying.
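The expected token stream can be modeled with a short sketch (illustrative only; Solr's actual ShingleFilterFactory also tracks positions and gaps, this just shows the token order):

```python
def shingles(tokens, min_size=2, max_size=2, sep="", unigrams=True):
    """Rough model of shingle output: at each position, emit the unigram
    (if enabled) followed by the shingles that start there."""
    out = []
    for i, tok in enumerate(tokens):
        if unigrams:
            out.append(tok)
        for n in range(min_size, max_size + 1):
            if i + n <= len(tokens):
                out.append(sep.join(tokens[i:i + n]))
    return out

print(shingles(["the", "baby", "sitter", "was", "here"]))
# -> ['the', 'thebaby', 'baby', 'babysitter', 'sitter', 'sitterwas', 'was', 'washere', 'here']
```

The whitespace-splitting query parser defeats this: each query word is analyzed separately, so no cross-word shingles are ever built at query time.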





On Sun, 5 Mar 2017 at 23:59 Ryan Josal  wrote:

> I thought new versions of solr didn't split on whitespace at the query
> parser anymore, so this should work?
>
> That being said, I think I remember it having a problem coming after a
> synonym filter.  IIRC, if your input is "Foo Bar" and you have a synonym
> "foo <=> baz" you would get foobaz bazbar instead of foobar and bazbar.  I
> wrote a custom shingler to account for that.
>
> Ryan
>
> On Sun, Mar 5, 2017 at 02:48 Markus Jelsma 
> wrote:
>
> > Hello - we use it for text classification and online near-duplicate
> > document detection/filtering. Using shingles means you want to consider
> > order in the text. It is analogous to using bigrams and trigrams when
> doing
> > language detection, you cannot distinguish between Danish and Norwegian
> > solely on single characters.
> >
> > Markus
> >
> >
> >
> > -Original message-
> > > From:Ryan Yacyshyn 
> > > Sent: Sunday 5th March 2017 5:57
> > > To: solr-user@lucene.apache.org
> > > Subject: Use case for the Shingle Filter
> > >
> > > Hi everyone,
> > >
> > > I was thinking of using the Shingle Filter to help solve an issue I'm
> > > facing. I can see this working in the analysis panel in the Solr admin,
> > but
> > > not when I make my queries.
> > >
> > > I found out it's because the query parser splits up the tokens on
> > > white space before passing them along.
> > >
> > > This made me wonder: what is a practical use case for the shingle
> > > filter?
> > >
> > > Any enlightenment on this would be much appreciated!
> > >
> > > Thanks,
> > > Ryan
> > >
> >
>


Custom DelegatingCollector : collect sorted docs by score

2017-03-06 Thread Jamel ESSOUSSI
Hi,

I developed a custom DelegatingCollector in which I need to receive the
documents (in the collect method) sorted by score.

I am using Solr 5.5.3.

In older versions of Solr, there was a method called
acceptsDocsOutOfOrder(). What replaces it?

Best Regards

--Jamel





Learning to rank - Bad Request

2017-03-06 Thread Vincent

Hi all,

I've been trying to get learning to rank working on our own search 
index. Following the LTR-readme 
(https://github.com/bloomberg/lucene-solr/blob/master-ltr/solr/contrib/ltr/example/README.md) 
I ran the example python script to train and upload the model, but I 
already get an error during the uploading of the features:


Bad Request (400) - Expected Map to create a new ManagedResource but 
received a java.util.ArrayList
at 
org.apache.solr.rest.RestManager$RestManagerManagedResource.doPut(RestManager.java:523)
at 
org.apache.solr.rest.ManagedResource.doPost(ManagedResource.java:355)
at 
org.apache.solr.rest.RestManager$ManagedEndpoint.post(RestManager.java:351)
at 
org.restlet.resource.ServerResource.doHandle(ServerResource.java:454)

...

This makes sense: the json feature file is an array, and the RestManager 
needs a Map in doPut.


Using the curl command from the cwiki 
(https://cwiki.apache.org/confluence/display/solr/Learning+To+Rank) 
yields the same error, except that it says "received a
java.lang.String" instead of "received a java.util.ArrayList".


I wonder how this actually is supposed to work, and what's going wrong 
in this case. I have tried the LTR with the default techproducts 
example, and that worked just fine. Does anyone have an idea of what's 
going wrong here?


Thanks in advance!
Vincent


Conditions for replication to copy full index

2017-03-06 Thread Chris Ulicny
Hi all,

We've recently had some issues with a 5.1.0 core copying the whole index
when it was set to replicate from a master core.

I've read that if there are documents that have been added to the slave
core by mistake, it will do a full copy. Though we are still investigating,
this is probably not the cause of it.

Are there any other conditions in which the slave core will do a full copy
of an index instead of only the necessary files?

Thanks,
Chris


Re: Conditions for replication to copy full index

2017-03-06 Thread Erick Erickson
We need to be pretty nit-picky here.

bq: do a full copy of an index instead of only the necessary files

It's all about "necessary files". "necessary" here means all
changed segments. Since segments are not changed
after a commit, replication can safely skip any segment
files it already has and copy only new segments.

The rub is that "new" includes merged segments. And it's
possible that _all_ current segments are merged into a new
segment. At that point, technically, a full copy is done.

You can force this with an optimize (not recommended) or
perhaps the expungeDeletes option.

Here's a great video of segment merging; the third one down
shows the TieredMergePolicy, which has been the default for some
time.

http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

And, if you want to force a full replication, shut down the slave,
"rm -rf data". (data should be the parent of the "index" dir) and
restart solr.

Best,
Erick

On Mon, Mar 6, 2017 at 8:06 AM, Chris Ulicny  wrote:
> Hi all,
>
> We've recently had some issues with a 5.1.0 core copying the whole index
> when it was set to replicate from a master core.
>
> I've read that if there are documents that have been added to the slave
> core by mistake, it will do a full copy. Though we are still investigating,
> this is probably not the cause of it.
>
> Are there any other conditions in which the slave core will do a full copy
> of an index instead of only the necessary files?
>
> Thanks,
> Chris


Re: Conditions for replication to copy full index

2017-03-06 Thread Chris Ulicny
Thanks Erick. I love Mike's video on segment merging.

However I do not believe a large number of merged segments or accidental
optimization is the issue. The data in the core is mostly static and there
is no evidence so far of a large number of merges that took place. Usually
the only updates the index receives are deletes.

The other reason I assume it was a copy of the entire data directory is
that the log lines for the IndexFetcher threads have the fullCopy flag set
to true, where the usual replication seems to have it set to false. This
fullCopy for the core in question is preceded by a failure to fetch the
index on the previous replication attempt, but the subsequent check yields
matching generations between the slave and master. I've included the logs
for the indexFetcher thread for the core.


11:13:00,138 ERROR [org.apache.solr.handler.IndexFetcher]
(indexFetcher-23-thread-1) Master at:  is not available. Index
fetch failed. Exception: IOException occured when talking to server at:

11:14:00,036 INFO  [org.apache.solr.handler.IndexFetcher]
(indexFetcher-23-thread-1) Master's generation: 182823
11:14:00,044 INFO  [org.apache.solr.handler.IndexFetcher]
(indexFetcher-23-thread-1) Slave's generation: 182823
11:14:00,081 INFO  [org.apache.solr.handler.IndexFetcher]
(indexFetcher-23-thread-1) Starting replication process
11:14:00,422 INFO  [org.apache.solr.handler.IndexFetcher]
(indexFetcher-23-thread-1) Number of files in latest index in master: 404
11:14:00,435 INFO  [org.apache.solr.core.CachingDirectoryFactory]
(indexFetcher-23-thread-1) return new directory for
//data/index.20170306111400434
11:14:00,555 INFO  [org.apache.solr.handler.IndexFetcher]
(indexFetcher-23-thread-1) Starting download to
NRTCachingDirectory(MMapDirectory@//data/index.20170306111400434
lockFactory=org.apache.lucene.store.NativeFSLockFactory@6a453731;
maxCacheMB=48.0 maxMergeSizeMB=4.0) fullCopy=true

Thanks



On Mon, Mar 6, 2017 at 11:30 AM Erick Erickson 
wrote:

> We need to be pretty nit-picky here.
>
> bq: do a full copy of an index instead of only the necessary files
>
> It's all about "necessary files". "necessary" here means all
> changed segments. Since segments are not changed
> after a commit, replication can safely skip any segment
> files it already has and copy only new segments.
>
> The rub is that "new" includes merged segments. And it's
> possible that _all_ current segments are merged into a new
> segment. At that point, technically, a full copy is done.
>
> You can force this with an optimize (not recommended) or
> perhaps the expungeDeletes option.
>
> Here's a great video of segment merging; the third one down
> shows the TieredMergePolicy, which has been the default for some
> time.
>
>
> http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
>
> And, if you want to force a full replication, shut down the slave,
> "rm -rf data". (data should be the parent of the "index" dir) and
> restart solr.
>
> Best,
> Erick
>
> On Mon, Mar 6, 2017 at 8:06 AM, Chris Ulicny  wrote:
> > Hi all,
> >
> > We've recently had some issues with a 5.1.0 core copying the whole index
> > when it was set to replicate from a master core.
> >
> > I've read that if there are documents that have been added to the slave
> > core by mistake, it will do a full copy. Though we are still
> investigating,
> > this is probably not the cause of it.
> >
> > Are there any other conditions in which the slave core will do a full
> copy
> > of an index instead of only the necessary files?
> >
> > Thanks,
> > Chris
>


Re: Does {!child} query support nested Queries ("v=")

2017-03-06 Thread Kelly, Frank
Hi Mikhail,
  Sorry I didn’t reply sooner
 
Here are some example docs - each document for a userAccount object has 1
or more nested documents for our userLinkedAccount object

SolrInputDocument(fields: [type=userAccount,
typeId=userAccount/HERE-8ce41333-7c08-40d3-9b2c-REDACTED,
id=userAccount/HERE-8ce41333-7c08-40d3-9b2c-REDACTED,
emailAddress=[redac...@here.com, REDACTED here.com], nameSort=�,
emailType=Primary, familyName=REDACTED, allText=[REDACTED, REDACTED ,
untokenized=[REDACTED, REDACTED , isEnabled=1,
createdTimeNumeric=1406972278682,
haAccountId=HERE-8ce41333-7c08-40d3-9b2c-REDACTED, givenName=REDACTED,
readAccess=application, indexTime=1488828050933])
SolrInputDocument(fields: [type=userLinkedAccount,
typeId=userLinkedAccount/5926990ea0708fa82c9ddca5d1bda6ed3331a450,
id=userLinkedAccount/5926990ea0708fa82c9ddca5d1bda6ed3331a450,
haAccountId=HERE-8ce41333-7c08-40d3-9b2c-REDACTED, nameSort=�,
hereRealm=HERE, haAccountType=password, haUserId= redac...@here.com,
readAccess=application, createdTimeNumeric=1406972278646,
indexTime=1488828050933])

SolrInputDocument(fields: [type=userAccount,
typeId=userAccount/HERE-4797487f-7659-4c58-80b5-REDACTED,
id=userAccount/HERE-4797487f-7659-4c58-80b5-REDACTED,
emailAddress=[redac...@live.de, redac...@live.de], nameSort=�,
emailType=Primary, familyName= REDACTED, allText=[REDACTED, REDACTED],
untokenized=[REDACTED, REDACTED], isEnabled=1,
createdTimeNumeric=1447141199050,
haAccountId=HERE-4797487f-7659-4c58-80b5-REDACTED, givenName=Krzysztof,
readAccess=application, indexTime=1488828050941])
SolrInputDocument(fields: [type=userLinkedAccount,
typeId=userLinkedAccount/02d11e8096dc4727ee7c2c4f6cc4723190620088,
id=userLinkedAccount/02d11e8096dc4727ee7c2c4f6cc4723190620088,
haAccountId=HERE-4797487f-7659-4c58-80b5-REDACTED, nameSort=�,
hereRealm=HERE, haAccountType=password, haUserId=redac...@live.de,
readAccess=application, createdTimeNumeric=1447141199009,
indexTime=1488828050941])

SolrInputDocument(fields: [type=userAccount,
typeId=userAccount/HERE-8ce41333-7c08-40d3-9b2c-REDACTED,
id=userAccount/HERE-8ce41333-7c08-40d3-9b2c-REDACTED,
emailAddress=[redac...@here.com, REDACTED here.com], nameSort=�,
emailType=Primary, familyName= REDACTED, allText=[REDACTED, REDACTED],
untokenized=[REDACTED, REDACTED], isEnabled=1,
createdTimeNumeric=1406972278682,
haAccountId=HERE-8ce41333-7c08-40d3-9b2c-REDACTED, givenName= REDACTED,
readAccess=application, indexTime=1488828051697])
SolrInputDocument(fields: [type=userLinkedAccount,
typeId=userLinkedAccount/5926990ea0708fa82c9ddca5d1bda6ed3331a450,
id=userLinkedAccount/5926990ea0708fa82c9ddca5d1bda6ed3331a450,
haAccountId=HERE-8ce41333-7c08-40d3-9b2c-REDACTED, nameSort=�,
hereRealm=HERE, haAccountType=password, haUserId= redac...@here.com,
readAccess=application, createdTimeNumeric=1406972278646,
indexTime=1488828051697])


So we often want to
FIND userLinkedAccount documents WHERE the parentDocument has some filter
properties, e.g. name / email address.
E.g.

+type:userLinkedAccount +{!child of="type:userAccount"
v="givenName:frank*"}

The results appear to come back fine but the numFound often has a small
delta we cannot explain

Here is the output of the debugQuery

"rawquerystring": "+type:userLinkedAccount +{!child
of=\"type:userAccount\" v=\"givenName:frank*\"}",
"querystring": "+type:userLinkedAccount +{!child
of=\"type:userAccount\" v=\"givenName:frank*\"}",
"parsedquery": "+type:userLinkedAccount
+ToChildBlockJoinQuery(ToChildBlockJoinQuery (givenName:frank*))",
"parsedquery_toString": "+type:userLinkedAccount
+ToChildBlockJoinQuery (givenName:frank*)",
"QParser": "LuceneQParser",
"explain": {
  "userLinkedAccount/eb86bc13944094ce16f684a7f58e2294c84ca956":
"\n1.9348345 = sum of:\n  1.4179944 = weight(type:userLinkedAccount in
84623) [DefaultSimilarity], result of:\n1.4179944 =
score(doc=84623,freq=1.0), product of:\n  0.85608196 = queryWeight,
product of:\n1.6563768 = idf(docFreq=14190942, maxDocs=27357228)\n
   0.5168401 = queryNorm\n  1.6563768 = fieldWeight in 84623,
product of:\n1.0 = tf(freq=1.0), with freq of:\n  1.0 =
termFreq=1.0\n1.6563768 = idf(docFreq=14190942,
maxDocs=27357228)\n1.0 = fieldNorm(doc=84623)\n  0.5168401 = Score
based on parent document 84624\n0.5168401 = givenName:frank*, product
of:\n  1.0 = boost\n  0.5168401 = queryNorm\n",
  "userLinkedAccount/78498d9d7d5c1a52de0f61d90df138ac7381d37f":
"\n1.9348345 = sum of:\n  1.4179944 = weight(type:userLinkedAccount in
113884) [DefaultSimilarity], result of:\n1.4179944 =
score(doc=113884,freq=1.0), product of:\n  0.85608196 = queryWeight,
product of:\n1.6563768 = idf(docFreq=14190942, maxDocs=27357228)\n
   0.5168401 = queryNorm\n  1.6563768 = fieldWeight in 113884,
product of:\n1.0 = tf(freq=1.0), with freq of:\n  1.0 =
termFreq=1.0\n1.6563768 = idf(docFreq=14190942,
maxDocs=27357228)\n1.0 = fieldNorm(doc=113

Recommendation for production SOLR

2017-03-06 Thread Phil Scadden
Given the known issues with 6.4.1 and no release date for 6.4.2, is Solr
6.3.0 the best recommendation for production? Hoping to take it to
production in the first week of April.
Notice: This email and any attachments are confidential and may not be used, 
published or redistributed without the prior written consent of the Institute 
of Geological and Nuclear Sciences Limited (GNS Science). If received in error 
please destroy and immediately notify GNS Science. Do not copy or disclose the 
contents.


Re: Recommendation for production SOLR

2017-03-06 Thread Walter Underwood
We are going to production this week using 6.3.0. We don’t have time to re-run 
all the load benchmarks on 6.4.2.

We’ll qualify 6.4.2 in a couple of weeks, then upgrade prod if it passes.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 6, 2017, at 11:48 AM, Phil Scadden  wrote:
> 
> Given the known issues with 6.4.1 and no release date for  6.4.2, is the best 
> recommendation for a production version of SOLR 6.3.0? Hoping to take to 
> production in first week of April.
> Notice: This email and any attachments are confidential and may not be used, 
> published or redistributed without the prior written consent of the Institute 
> of Geological and Nuclear Sciences Limited (GNS Science). If received in 
> error please destroy and immediately notify GNS Science. Do not copy or 
> disclose the contents.



Re:Learning to rank - Bad Request

2017-03-06 Thread Christine Poerschke (BLOOMBERG/ LONDON)
Hi Vincent,

Would you be comfortable sharing (redacted) details of the exact upload command 
you used and (redacted) extracts of the features json file that gave the upload 
error?

Two things I have encountered commonly myself:
* uploading features to the model endpoint or model to the feature endpoint
* forgotten double-quotes around the numbers in MultipleAdditiveTreesModel json
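The first point can be sketched as follows (the paths follow the Ref Guide's Learning To Rank page; the host and collection name are assumptions):

```python
# Where each LTR artifact belongs. Feature lists are JSON arrays and go to
# the feature-store; a model is a single JSON object (a Map on the Solr
# side) and goes to the model-store. Sending an array to the model-store
# is exactly what produces "Expected Map ... received java.util.ArrayList".
BASE = "http://localhost:8983/solr/mycollection/schema"

def upload_target(payload):
    """Pick the upload endpoint from the payload's JSON shape."""
    if isinstance(payload, list):
        return BASE + "/feature-store"
    return BASE + "/model-store"

print(upload_target([{"name": "documentRecency"}]))   # a features array
print(upload_target({"class": "...", "features": []}))  # a model object
```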

Regards,
Christine

- Original Message -
From: solr-user@lucene.apache.org
To: solr-user@lucene.apache.org
At: 03/06/17 13:22:40

Hi all,

I've been trying to get learning to rank working on our own search 
index. Following the LTR-readme 
(https://github.com/bloomberg/lucene-solr/blob/master-ltr/solr/contrib/ltr/example/README.md)
 
I ran the example python script to train and upload the model, but I 
already get an error during the uploading of the features:

Bad Request (400) - Expected Map to create a new ManagedResource but 
received a java.util.ArrayList
 at 
org.apache.solr.rest.RestManager$RestManagerManagedResource.doPut(RestManager.java:523)
 at 
org.apache.solr.rest.ManagedResource.doPost(ManagedResource.java:355)
 at 
org.apache.solr.rest.RestManager$ManagedEndpoint.post(RestManager.java:351)
 at 
org.restlet.resource.ServerResource.doHandle(ServerResource.java:454)
 ...

This makes sense: the json feature file is an array, and the RestManager 
needs a Map in doPut.

Using the curl command from the cwiki 
(https://cwiki.apache.org/confluence/display/solr/Learning+To+Rank) 
yields the same error, except that it says "received a
java.lang.String" instead of "received a java.util.ArrayList".

I wonder how this actually is supposed to work, and what's going wrong 
in this case. I have tried the LTR with the default techproducts 
example, and that worked just fine. Does anyone have an idea of what's 
going wrong here?

Thanks in advance!
Vincent



question related to solr LTR plugin

2017-03-06 Thread Saurabh Agarwal (BLOOMBERG/ 731 LEX)
Hi, 

I have a question about the Solr LTR plugin. I have a personalization use
case and am wondering whether you can help me with it. I would like to
rerank my query results based on the relationship of the searcher with the
author of the returned documents. I have relationship scores in an external
datastore in the form user1 (searcher), user2 (author), relationship score.
In my query, I can pass the searcher id as an external feature. My question
is: during querying, how do I retrieve the relationship score for each
document as a feature and rerank the documents? Would I need to implement a
custom feature to do so, and if so, how?

Thanks,
Saurabh

Re: Recommendation for production SOLR

2017-03-06 Thread Erick Erickson
6.4.2 has passed the vote to release, so it should be hitting the
mirrors in a few days at most.

On Mon, Mar 6, 2017 at 11:50 AM, Walter Underwood  wrote:
> We are going to production this week using 6.3.0. We don’t have time to 
> re-run all the load benchmarks on 6.4.2.
>
> We’ll qualify 6.4.2 in a couple of weeks, then upgrade prod if it passes.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>> On Mar 6, 2017, at 11:48 AM, Phil Scadden  wrote:
>>
>> Given the known issues with 6.4.1 and no release date for  6.4.2, is the 
>> best recommendation for a production version of SOLR 6.3.0? Hoping to take 
>> to production in first week of April.
>> Notice: This email and any attachments are confidential and may not be used, 
>> published or redistributed without the prior written consent of the 
>> Institute of Geological and Nuclear Sciences Limited (GNS Science). If 
>> received in error please destroy and immediately notify GNS Science. Do not 
>> copy or disclose the contents.
>


Getting an error: was indexed without position data; cannot run PhraseQuery

2017-03-06 Thread Pouliot, Scott
We keep getting this in our Tomcat/SOLR Logs and I was wondering if a simple 
schema change will alleviate this issue:

INFO  - 2017-03-06 07:26:58.751; org.apache.solr.core.SolrCore; 
[Client_AdvanceAutoParts] webapp=/solr path=/select 
params={fl=candprofileid,+candid&start=0&q=*:*&wt=json&fq=issearchable:1+AND+cpentitymodifiedon:[2017-01-20T00:00:00.000Z+TO+*]+AND+clientreqid:17672+AND+folderid:132+AND+(engagedid_s:(0)+AND+atleast21_s:(1))+AND+(preferredlocations_s:(3799H))&rows=1000}
 status=500 QTime=1480
ERROR - 2017-03-06 07:26:58.766; org.apache.solr.common.SolrException; 
null:java.lang.IllegalStateException: field "preferredlocations_s" was indexed 
without position data; cannot run PhraseQuery (term=3799)
at 
org.apache.lucene.search.PhraseQuery$PhraseWeight.scorer(PhraseQuery.java:277)
at 
org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:351)
at org.apache.lucene.search.Weight.bulkScorer(Weight.java:131)
at 
org.apache.lucene.search.BooleanQuery$BooleanWeight.bulkScorer(BooleanQuery.java:313)
at 
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
at 
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
at 
org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:1158)
at 
org.apache.solr.search.SolrIndexSearcher.getPositiveDocSet(SolrIndexSearcher.java:846)
at 
org.apache.solr.search.SolrIndexSearcher.getProcessedFilter(SolrIndexSearcher.java:1004)
at 
org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1517)
at 
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1397)
at 
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:478)
at 
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:461)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
at 
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1023)
at 
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
at 
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown 
Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown 
Source)
at java.lang.Thread.run(Unknown Source)


The field in question "preferredlocations_s" is not defined in schema.xml
explicitly, but we have a dynamicField schema entry that covers it:

  [dynamicField definition stripped by the list archive]

Would adding omitTermFreqAndPositions="false" to this schema line help out 
here?  Should I explicitly define this "preferredlocations_s" field in the 
schema instead and add it there?  We do have a handful of dynamic fields that 
all get covered by this rule, but it seems the "preferredlocations_s" field is 
the only one throwing errors.  All it stores is a CSV string with location IDs 
in it.
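Two possible schema-side fixes, sketched below (the field and type names come from this thread; treat the attribute choices as assumptions to verify against the actual schema):

```xml
<!-- Option A (sketch): define the field explicitly as a plain string, so
     no tokenization happens and no PhraseQuery is ever generated. -->
<field name="preferredlocations_s" type="string" indexed="true" stored="true"/>

<!-- Option B (sketch): keep the dynamicField on its text type but retain
     term positions, so phrase queries over split tokens can run. -->
<dynamicField name="*_s" type="text_en_splitting" indexed="true"
              stored="true" omitTermFreqAndPositions="false"/>
```

Either change requires a full reindex of the affected field to take effect.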



negative array size exception

2017-03-06 Thread Walker, Darren
After migrating from solr to a load balanced solrcloud with 3 ZKs on the same 
machines and solr has 3 shards (one per node) We see this logged in the UI on 
one of our solrs.
Does anyone know what this is symptomatic of?

java.lang.NegativeArraySizeException
 at org.apache.lucene.util.PriorityQueue.(PriorityQueue.java:63)
 at org.apache.lucene.util.PriorityQueue.(PriorityQueue.java:44)
 at 
org.apache.solr.handler.component.ShardFieldSortedHitQueue.(ShardFieldSortedHitQueue.java:45)
 at 
org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:979)
 at 
org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:763)
 at 
org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:742)
 at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:428)
 at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:166)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:2306)
 at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:658)
 at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)
 at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
 at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:296)
 at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
 at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
 at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
 at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
 at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
 at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
 at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
 at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
 at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
 at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
 at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
 at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
 at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
 at org.eclipse.jetty.server.Server.handle(Server.java:534)
 at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
 at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
 at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
 at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
 at 
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
 at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
 at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
 at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
 at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
 at 
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
 at java.lang.Thread.run(Thread.java:745)



Re: Getting an error: was indexed without position data; cannot run PhraseQuery

2017-03-06 Thread Erick Erickson
Usually an _s field is a "string" type, so be sure you didn't change
the definition without completely re-indexing. In fact I generally
either index to a new collection or remove the data directory
entirely.

Right, the field isn't indexed with position information. That,
combined with (probably) the WordDelimiterFilterFactory in
text_en_splitting, is generating multiple tokens for inputs like 3799H.
See the admin/analysis page for how that gets broken up. Term
positions are usually enabled by default, so I'm not quite sure why
they're gone unless you disabled them.

But you're on the right track regardless. You have to
1> include term positions for anything that generates phrase queries
or
2> make sure you don't generate phrase queries. edismax can do this if
you have it configured to, and there's also autoGeneratePhraseQueries,
which you may find useful.
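A sketch of option 2> at the schema level (the analyzer chain here is illustrative, not the poster's actual one):

```xml
<fieldType name="text_en_splitting" class="solr.TextField"
           autoGeneratePhraseQueries="false">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- WordDelimiterFilterFactory is what splits inputs like 3799H into
         multiple tokens; with autoGeneratePhraseQueries="false" the query
         parser ORs those tokens instead of building a PhraseQuery. -->
    <filter class="solr.WordDelimiterFilterFactory"/>
  </analyzer>
</fieldType>
```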

And do reindex completely from scratch if you change the definitions.

Best,
Erick

On Mon, Mar 6, 2017 at 1:41 PM, Pouliot, Scott
 wrote:
> We keep getting this in our Tomcat/SOLR Logs and I was wondering if a simple 
> schema change will alleviate this issue:
>
> INFO  - 2017-03-06 07:26:58.751; org.apache.solr.core.SolrCore; 
> [Client_AdvanceAutoParts] webapp=/solr path=/select 
> params={fl=candprofileid,+candid&start=0&q=*:*&wt=json&fq=issearchable:1+AND+cpentitymodifiedon:[2017-01-20T00:00:00.000Z+TO+*]+AND+clientreqid:17672+AND+folderid:132+AND+(engagedid_s:(0)+AND+atleast21_s:(1))+AND+(preferredlocations_s:(3799H))&rows=1000}
>  status=500 QTime=1480
> ERROR - 2017-03-06 07:26:58.766; org.apache.solr.common.SolrException; 
> null:java.lang.IllegalStateException: field "preferredlocations_s" was 
> indexed without position data; cannot run PhraseQuery (term=3799)
> at 
> org.apache.lucene.search.PhraseQuery$PhraseWeight.scorer(PhraseQuery.java:277)
> at 
> org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:351)
> at org.apache.lucene.search.Weight.bulkScorer(Weight.java:131)
> at 

RE: Getting an error: was indexed without position data; cannot run PhraseQuery

2017-03-06 Thread Pouliot, Scott
Hmm.  We haven’t changed data or the definition in YEARS now.  I'll have to do 
some more digging I guess.  Not sure re-indexing is a great thing to do though 
since this is a production setup and the database for this user is @ 50GB.  It 
would take quite a long time to reindex all that data from scratch.  Hmm.

Thanks for the quick reply Erick!

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Monday, March 6, 2017 5:33 PM
To: solr-user 
Subject: Re: Getting an error:  was indexed without position data; 
cannot run PhraseQuery

Usually an _s field is a "string" type, so be sure you didn't change the 
definition without completely re-indexing. In fact I generally either index to 
a new collection or remove the data directory entirely.

Right, the field isn't indexed with position information. That combined with 
(probably) the WordDelimiterFilterFactory in text_en_splitting is generating 
multiple tokens for inputs like 3799H.
See the admin/analysis page for how that gets broken up. Term positions are 
usually enabled by default, so I'm not quite sure why they're gone unless you 
disabled them.
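As a rough, Solr-independent illustration (the exact splits depend on the WordDelimiterFilterFactory options in use), a token like 3799H is broken at the letter/digit boundary into multiple tokens at consecutive positions, which is what turns the term into a phrase query:

```python
import re

# Toy approximation of WordDelimiterFilterFactory's default splitting on
# letter/digit transitions. This is NOT Solr's actual implementation.
def word_delimiter_split(token):
    return re.findall(r"\d+|[A-Za-z]+", token)

tokens = word_delimiter_split("3799H")
print(tokens)  # ['3799', 'H'] -> two positions, so the parser builds a phrase query
```

With two tokens in sequence, a parser configured to generate phrase queries needs position data to match them, hence the IllegalStateException on a field indexed without positions.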

But you're on the right track regardless. You have to
1> include term positions for anything that generates phrase queries
or
2> make sure you don't generate phrase queries. edismax can do this if
you have it configured to, and there's also the autoGeneratePhraseQueries 
attribute that you may find useful.

And do reindex completely from scratch if you change the definitions.
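A field type sketch for the second option (the names here are illustrative, not from this thread's schema): autoGeneratePhraseQueries on solr.TextField controls whether multi-token analysis output becomes a phrase query.

```xml
<!-- Hypothetical field type. autoGeneratePhraseQueries="false" keeps the
     query parser from turning multi-token output (e.g. 3799H -> 3799, H)
     into a phrase query. Since it only affects query parsing, changing it
     should not by itself require reindexing - but analysis changes do. -->
<fieldType name="text_no_phrase" class="solr.TextField"
           autoGeneratePhraseQueries="false" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```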

Best,
Erick

On Mon, Mar 6, 2017 at 1:41 PM, Pouliot, Scott  
wrote:
> We keep getting this in our Tomcat/SOLR Logs and I was wondering if a simple 
> schema change will alleviate this issue:
>
> INFO  - 2017-03-06 07:26:58.751; org.apache.solr.core.SolrCore; 
> [Client_AdvanceAutoParts] webapp=/solr path=/select 
> params={fl=candprofileid,+candid&start=0&q=*:*&wt=json&fq=issearchable:1+AND+cpentitymodifiedon:[2017-01-20T00:00:00.000Z+TO+*]+AND+clientreqid:17672+AND+folderid:132+AND+(engagedid_s:(0)+AND+atleast21_s:(1))+AND+(preferredlocations_s:(3799H))&rows=1000}
>  status=500 QTime=1480 ERROR - 2017-03-06 07:26:58.766; 
> org.apache.solr.common.SolrException; null:java.lang.IllegalStateException: 
> field "preferredlocations_s" was indexed without position data; cannot run 
> PhraseQuery (term=3799)
> at 
> org.apache.lucene.search.PhraseQuery$PhraseWeight.scorer(PhraseQuery.java:277)
> at 
> org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:351)
> at org.apache.lucene.search.Weight.bulkScorer(Weight.java:131)
> at 
> org.apache.lucene.search.BooleanQuery$BooleanWeight.bulkScorer(BooleanQuery.java:313)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
> at 
> org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:1158)
> at 
> org.apache.solr.search.SolrIndexSearcher.getPositiveDocSet(SolrIndexSearcher.java:846)
> at 
> org.apache.solr.search.SolrIndexSearcher.getProcessedFilter(SolrIndexSearcher.java:1004)
> at 
> org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1517)
> at 
> org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1397)
> at 
> org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:478)
> at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:461)
> at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
> at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
> at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
> at 
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
> at 
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
> at 
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
> at 
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
> at 
> org.apache.catalina.core.StandardEngineV

Re: Getting an error: was indexed without position data; cannot run PhraseQuery

2017-03-06 Thread Erick Erickson
You're in a pickle then. If you change the definition you need to re-index.

But you claim you haven't changed anything in years as far as the
schema is concerned so maybe you're going to get lucky ;).

The error you reported is because somehow there's a phrase search
going on against this field. You could have changed something in the
query parsers or eDismax definitions or the query generated on the app
side to let a phrase query get through. I'm not quite sure if you'll
get information back when the query fails, but try adding &debug=query
to the URL and look at the parsed_query and parsed_query_toString()
output to see where phrases are getting generated.
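For example (core name and query are placeholders, not the original setup), the failing filter clause can be replayed with debug=query to inspect the parsed form:

```python
from urllib.parse import urlencode

# Hypothetical core and query; substitute your own. debug=query asks Solr
# to include parsed_query / parsed_query_toString() in the response's
# "debug" section without re-running scoring-level debugging.
params = {
    "q": "preferredlocations_s:(3799H)",
    "debug": "query",
    "wt": "json",
}
url = "http://localhost:8983/solr/mycore/select?" + urlencode(params)
print(url)
```

In the JSON response, the debug section shows whether a PhraseQuery was generated against preferredlocations_s.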

Best,
Erick

On Mon, Mar 6, 2017 at 5:26 PM, Pouliot, Scott
 wrote:
> Hmm.  We haven’t changed data or the definition in YEARS now.  I'll have to 
> do some more digging I guess.  Not sure re-indexing is a great thing to do 
> though since this is a production setup and the database for this user is @ 
> 50GB.  It would take quite a long time to reindex all that data from scratch. 
>  Hmm.
>
> Thanks for the quick reply Erick!
>
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Monday, March 6, 2017 5:33 PM
> To: solr-user 
> Subject: Re: Getting an error:  was indexed without position data; 
> cannot run PhraseQuery
>
> Usually an _s field is a "string" type, so be sure you didn't change the 
> definition without completely re-indexing. In fact I generally either index 
> to a new collection or remove the data directory entirely.
>
> Right, the field isn't indexed with position information. That combined with 
> (probably) the WordDelimiterFilterFactory in text_en_splitting is generating 
> multiple tokens for inputs like 3799H.
> See the admin/analysis page for how that gets broken up. Term positions are 
> usually enabled by default, so I'm not quite sure why they're gone unless you 
> disabled them.
>
> But you're on the right track regardless. You have to
> 1> include term positions for anything that generates phrase queries
> or
> 2> make sure you don't generate phrase queries. edismax can do this if
> you have it configured to, and there's also the autoGeneratePhraseQueries 
> attribute that you may find useful.
>
> And do reindex completely from scratch if you change the definitions.
>
> Best,
> Erick
>
> On Mon, Mar 6, 2017 at 1:41 PM, Pouliot, Scott 
>  wrote:
>> We keep getting this in our Tomcat/SOLR Logs and I was wondering if a simple 
>> schema change will alleviate this issue:
>>
>> INFO  - 2017-03-06 07:26:58.751; org.apache.solr.core.SolrCore;
>> [Client_AdvanceAutoParts] webapp=/solr path=/select 
>> params={fl=candprofileid,+candid&start=0&q=*:*&wt=json&fq=issearchable:1+AND+cpentitymodifiedon:[2017-01-20T00:00:00.000Z+TO+*]+AND+clientreqid:17672+AND+folderid:132+AND+(engagedid_s:(0)+AND+atleast21_s:(1))+AND+(preferredlocations_s:(3799H))&rows=1000}
>>  status=500 QTime=1480 ERROR - 2017-03-06 07:26:58.766; 
>> org.apache.solr.common.SolrException; null:java.lang.IllegalStateException: 
>> field "preferredlocations_s" was indexed without position data; cannot run 
>> PhraseQuery (term=3799)

RE: Solrcloud after restore collection, when index new documents into restored collection, leader not write to index.

2017-03-06 Thread Marquiss, John
I couldn't find an issue for this in JIRA so I thought I would add some of our 
own findings here... We are seeing the same problem with the Solr 6 Restore 
functionality. While I do not think it is important, it happens on both our 
Linux environments and our local Windows development environments. Also, from 
our testing, I do not think it has anything to do with actual indexing (notice 
in the order of my test steps that documents appear in replicas after replica 
creation, without re-indexing).

Test Environment:
•   Windows 10 (we see the same behavior on Linux as well)
•   Java 1.8.0_121
•   Solr 6.3.0 with patch for SOLR-9527 (To fix RESTORE shard distribution 
and add createNodeSet to RESTORE)
•   1 Zookeeper node running on localhost:2181
•   3 Solr nodes running on localhost:8171, localhost:8181 and 
localhost:8191 (hostname NY07LP521696)

Test and observations:
1)  Create a 2 shard collection 'test'

http://localhost:8181/solr/admin/collections?action=CREATE&name=test&numShards=2&replicationFactor=1&maxShardsPerNode=1&collection.configName=testconf&createNodeSet=NY07LP521696:8171_solr,NY07LP521696:8181_solr

2)  Index 7 documents to 'test'
3)  Search 'test' - result count 7
4)  Backup collection 'test'

http://localhost:8181/solr/admin/collections?action=BACKUP&collection=test&name=copy&location=%2FData%2Fsolr%2Fbkp&async=1234

5)  Restore 'test' to collection 'test2'

http://localhost:8191/solr/admin/collections?action=RESTORE&name=copy&location=%2FData%2Fsolr%2Fbkp&collection=test2&async=1234&maxShardsPerNode=1&createNodeSet=NY07LP521696:8181_solr,NY07LP521696:8191_solr

6)  Search 'test2' - result count 7
7)  Index 2 new documents to 'test2'
8)  Search 'test2' - result count 7 (new documents do not appear in results)
9)  Create a replica for each of the shards of 'test2'

http://localhost:8191/solr/admin/collections?action=ADDREPLICA&collection=test2&shard=shard1&node=NY07LP521696:8181_solr

http://localhost:8191/solr/admin/collections?action=ADDREPLICA&collection=test2&shard=shard2&node=NY07LP521696:8171_solr

*** Note that it is not necessary to try to re-index the 2 new documents before 
this step, just create replicas and query ***
10) Repeatedly query 'test2' - result count randomly changes between 7, 8 
and 9. This is because Solr randomly selects replicas of 'test2', and one of 
the two new docs was added to each of the shards in the collection: if 
replica0 of both shards is selected the result is 7, if replica0 of one shard 
and replica1 of the other is selected the result is 8, and if replica1 is 
selected for both shards the result is 9. This is random behavior because we do 
not know ahead of time which shards the new documents will be added to and 
whether they will be split evenly.

Query 'test2' with shards parameter of original restored shards - 
result count 7

http://localhost:8181/solr/test2/select?q=*:*&shards=localhost:8181/solr/test2_shard1_replica0,localhost:8181/solr/test2_shard2_replica0

Query 'test2' with shard parameter of one original restored shard and 
one replica shard - result count 8

http://localhost:8181/solr/test2/select?q=*:*&shards=localhost:8181/solr/test2_shard1_replica0,localhost:8181/solr/test2_shard2_replica1

http://localhost:8181/solr/test2/select?q=*:*&shards=localhost:8181/solr/test2_shard1_replica1,localhost:8181/solr/test2_shard2_replica0

Query 'test2' with shards parameter of replica shards - result count 9

http://localhost:8181/solr/test2/select?q=*:*&shards=localhost:8181/solr/test2_shard1_replica1,localhost:8181/solr/test2_shard2_replica1

13) Note that in the Solr admin, Core statistics show the restored cores as 
not current: the Searching master is Gen 2 and the Replicable master is Gen 3, 
while on the replicated core both the Searching and Replicable masters are Gen 3
14) Restarting Solr corrects the issue
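The count arithmetic in step 10 can be sketched with assumed per-replica totals (7 original docs split 4/3 across the two shards, one new doc landing on each shard and visible only to the replicas added afterward):

```python
from itertools import product

# Assumed doc counts: the restored replica0 searchers never see the 2 new
# docs, while the replicas added in step 9 do. Actual splits may differ.
shard1 = {"replica0": 4, "replica1": 5}   # 4 original docs + 1 new
shard2 = {"replica0": 3, "replica1": 4}   # 3 original docs + 1 new

# Each query picks one replica per shard at random; enumerate the outcomes.
counts = sorted(shard1[a] + shard2[b] for a, b in product(shard1, shard2))
print(counts)  # -> [7, 8, 8, 9]: the result counts observed at random
```

This matches the shards-parameter queries above: both replica0 copies give 7, a mixed pair gives 8, and both replica1 copies give 9.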

Thoughts:
•   Solr is backing up and restoring correctly
•   The restored collection data is stored under a path like: 
…/node8181/test2_shard1_replica0/restore.20170307005909295 instead of 
…/node8181/test2_shard1_replica0/index
•   Indexing is actually behaving correctly (documents are available in 
replicas even without re-indexing)
•   When asked about the state of the searcher through the admin page 
core details, Solr does know that the searcher is not current

I was looking in the source but haven’t found the root cause yet. My gut 
feeling is that because the index data dir is …/restore.20170307005909295 
instead of …/index, Solr isn't seeing the index changes and recycling the 
searcher for the restored cores. Neither committing the collection nor forcing 
an optimize fixes the issue; restarting Solr does fix it, but that will not be 
viable for us in production.

John Marquiss

-Original Message-
>From: Jerome Yang [mailto:jey...@pivotal.io]