Re: Query fields with data of certain length

2018-01-04 Thread Emir Arnautović
Hi Edwin,
I don’t have enough knowledge of East Asian languages to know what the expected 
number is when you ask for string length. Maybe you can try some of the regex 
Unicode settings and see if you get what you need: try setting the Unicode flag 
with (?U), or try using regex groups and ranges. If you provide an example string 
and the expected length, maybe we can suggest a regex.

Thanks,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 4 Jan 2018, at 04:37, Zheng Lin Edwin Yeo  wrote:
> 
> Hi Emir,
> 
> So this would likely be different from what the operating system counts, as
> the operating system may count each Chinese character as 3 to 4 bytes.
> Which is probably why I could not find any record with subject:/.{255,}.*/
> 
> Are there other tools that we can use to query the length of data that is
> already indexed and is not in English? (e.g.
> Chinese, Japanese, etc.)
> 
> Regards,
> Edwin
> 
> On 3 January 2018 at 23:51, Emir Arnautović 
> wrote:
> 
>> Hi Edwin,
>> I do not know, but my guess would be that each character is counted as 1
>> in regex regardless of how many bytes it takes in the encoding used.
>> 
>> Regards,
>> Emir
>> --
>> Monitoring - Log Management - Alerting - Anomaly Detection
>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>> 
>> 
>> 
>>> On 3 Jan 2018, at 16:43, Zheng Lin Edwin Yeo 
>> wrote:
>>> 
>>> Thanks for the reply.
>>> 
>>> I am doing the search on existing data that has already been indexed, and
>>> it is likely to be a one time thing.
>>> 
>>> This  subject:/.{255,}.*/  works for English characters. However, there
>> are
>>> Chinese characters in some of the records. The length seems to be more
>> than
>>> 255, but they do not show up in the results.
>>> 
>>> Do you know how the length for Chinese characters and other languages is
>>> determined?
>>> 
>>> Regards,
>>> Edwin
>>> 
>>> 
>>> On 3 January 2018 at 23:01, Alexandre Rafalovitch 
>>> wrote:
>>> 
 Do that during indexing as Emir suggested. Specifically, use an
 UpdateRequestProcessor chain, probably with the Clone and FieldLength
 processors: http://www.solr-start.com/javadoc/solr-lucene/org/
 apache/solr/update/processor/FieldLengthUpdateProcessorFactory.html
 
 Regards,
  Alex.
 
 On 31 December 2017 at 22:00, Zheng Lin Edwin Yeo >> 
 wrote:
> Hi,
> 
> Would like to check if it is possible to query a field which has data
>> of
> more than a certain length?
> 
> For example, I want to query the field subject that has more than
 255
> bytes. Is it possible?
> 
> I am currently using Solr 6.5.1.
> 
> Regards,
> Edwin
 
>> 
>> 
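
[Editor's sketch] For reference, the chain Alexandre describes above would look roughly like this: CloneFieldUpdateProcessorFactory copies the field, and FieldLengthUpdateProcessorFactory replaces the copy with its character count. The chain name and the subject_length field are illustrative assumptions, not from the thread:

    <updateRequestProcessorChain name="store-subject-length">
      <!-- copy the subject into a side field... -->
      <processor class="solr.CloneFieldUpdateProcessorFactory">
        <str name="source">subject</str>
        <str name="dest">subject_length</str>
      </processor>
      <!-- ...then replace the copy with its character count -->
      <processor class="solr.FieldLengthUpdateProcessorFactory">
        <str name="fieldName">subject_length</str>
      </processor>
      <processor class="solr.LogUpdateProcessorFactory"/>
      <processor class="solr.RunUpdateProcessorFactory"/>
    </updateRequestProcessorChain>

With subject_length defined as a numeric field, the length check becomes a plain range query such as subject_length:[255 TO *] instead of a regex.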



Replacing \n with <br> using RegexReplaceProcessorFactory

2018-01-04 Thread Zheng Lin Edwin Yeo
Hi,

I'm using Solr 7.2.0, and I'm trying to replace \n with <br> by using
RegexReplaceProcessorFactory.

However, I could not get the below configuration in solrconfig.xml to be
loaded.

<processor class="solr.RegexReplaceProcessorFactory">
  <str name="fieldName">content</str>
  <str name="pattern">\n</str>
  <str name="replacement"><br></str>
</processor>

Understand that <br> is a special character. Can we do some escape sequence
for it? I have tried \<br>, but it does not work.

Below is the error message which I got.

Exception during parsing file:
solrconfig.xml:org.xml.sax.SAXParseException; systemId:
solrres:/solrconfig.xml; lineNumber: 1508; columnNumber: 36; The
element type "br" must be terminated by the matching end-tag "</br>".


Regards,
Edwin


Re: Replacing \n with <br> using RegexReplaceProcessorFactory

2018-01-04 Thread Emir Arnautović
Hi Edwin,
You need to encode <br> as &lt;br&gt;

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/

> On 4 Jan 2018, at 10:59, Zheng Lin Edwin Yeo wrote:
> 
> Hi,
> 
> I'm using Solr 7.2.0, and I'm trying to replace \n with <br> by using
> RegexReplaceProcessorFactory.
> 
> However, I could not get the below configuration in solrconfig.xml to be
> loaded.
> 
> <processor class="solr.RegexReplaceProcessorFactory">
>   <str name="fieldName">content</str>
>   <str name="pattern">\n</str>
>   <str name="replacement"><br></str>
> </processor>
> 
> Understand that <br> is a special character. Can we do some escape sequence
> for it? I have tried \<br>, but it does not work.
> 
> Below is the error message which I got.
> 
> Exception during parsing file:
> solrconfig.xml:org.xml.sax.SAXParseException; systemId:
> solrres:/solrconfig.xml; lineNumber: 1508; columnNumber: 36; The
> element type "br" must be terminated by the matching end-tag "</br>".
> 
> 
> Regards,
> Edwin
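
[Editor's sketch] For completeness, the processor that parses cleanly looks like this (same field and pattern as above; the angle brackets in the replacement are XML-escaped so the XML parser is satisfied while Solr still receives the literal <br>):

    <processor class="solr.RegexReplaceProcessorFactory">
      <str name="fieldName">content</str>
      <str name="pattern">\n</str>
      <str name="replacement">&lt;br&gt;</str>
    </processor>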

Re: Query fields with data of certain length

2018-01-04 Thread Zheng Lin Edwin Yeo
Hi Emir,

An example of the string in Chinese is 预支款管理及账务处理办法

The number of characters is 12, but the expected length should be 36.

Regards,
Edwin

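[Editor's sketch] The 12-versus-36 gap is characters versus UTF-8 bytes: each of these characters is one character to a regex but three bytes in UTF-8. A minimal plain-Java sketch (editorial, not from the thread) that shows why a byte-based expectation never matches:

    import java.nio.charset.StandardCharsets;

    public class LengthDemo {
        public static void main(String[] args) {
            String s = "预支款管理及账务处理办法";                      // 12 Chinese characters
            System.out.println(s.length());                                // 12 (UTF-16 code units)
            System.out.println(s.getBytes(StandardCharsets.UTF_8).length); // 36 (UTF-8 bytes)
            System.out.println(s.matches(".{12,}"));                       // true: regex counts characters
            System.out.println(s.matches(".{36,}"));                       // false: it never counts bytes
        }
    }

So subject:/.{255,}.*/ only matches subjects with at least 255 characters, regardless of how many bytes they occupy.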

On 4 January 2018 at 16:21, Emir Arnautović 
wrote:

> Hi Edwin,
> I don’t have enough knowledge of East Asian languages to know what the
> expected number is when you ask for string length. Maybe you can try some of
> the regex Unicode settings and see if you get what you need: try setting the
> Unicode flag with (?U), or try using regex groups and ranges. If you provide
> an example string and the expected length, maybe we can suggest a regex.
>
> Thanks,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 4 Jan 2018, at 04:37, Zheng Lin Edwin Yeo 
> wrote:
> >
> > Hi Emir,
> >
> > So this would likely be different from what the operating system counts,
> as
> > the operating system may count each Chinese character as 3 to 4
> bytes.
> > Which is probably why I could not find any record with
> subject:/.{255,}.*/
> >
> > Are there other tools that we can use to query the length of data that
> is
> > already indexed and is not in English? (e.g.
> > Chinese, Japanese, etc.)
> >
> > Regards,
> > Edwin
> >
> > On 3 January 2018 at 23:51, Emir Arnautović <
> emir.arnauto...@sematext.com>
> > wrote:
> >
> >> Hi Edwin,
> >> I do not know, but my guess would be that each character is counted as 1
> >> in regex regardless of how many bytes it takes in the encoding used.
> >>
> >> Regards,
> >> Emir
> >> --
> >> Monitoring - Log Management - Alerting - Anomaly Detection
> >> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >>
> >>
> >>
> >>> On 3 Jan 2018, at 16:43, Zheng Lin Edwin Yeo 
> >> wrote:
> >>>
> >>> Thanks for the reply.
> >>>
> >>> I am doing the search on existing data that has already been indexed,
> and
> >>> it is likely to be a one time thing.
> >>>
> >>> This  subject:/.{255,}.*/  works for English characters. However, there
> >> are
> >>> Chinese characters in some of the records. The length seems to be more
> >> than
> >>> 255, but they do not show up in the results.
> >>>
> >>> Do you know how the length for Chinese characters and other languages
> is
> >>> determined?
> >>>
> >>> Regards,
> >>> Edwin
> >>>
> >>>
> >>> On 3 January 2018 at 23:01, Alexandre Rafalovitch 
> >>> wrote:
> >>>
>  Do that during indexing as Emir suggested. Specifically, use an
>  UpdateRequestProcessor chain, probably with the Clone and FieldLength
>  processors: http://www.solr-start.com/javadoc/solr-lucene/org/
>  apache/solr/update/processor/FieldLengthUpdateProcessorFactory.html
> 
>  Regards,
>   Alex.
> 
>  On 31 December 2017 at 22:00, Zheng Lin Edwin Yeo <
> edwinye...@gmail.com
> >>>
>  wrote:
> > Hi,
> >
> > Would like to check if it is possible to query a field which has
> data
> >> of
> > more than a certain length?
> >
> > For example, I want to query the field subject that has more
> than
>  255
> > bytes. Is it possible?
> >
> > I am currently using Solr 6.5.1.
> >
> > Regards,
> > Edwin
> 
> >>
> >>
>
>


Re: Replacing \n with <br> using RegexReplaceProcessorFactory

2018-01-04 Thread Zheng Lin Edwin Yeo
Thanks Emir.

It is working now.

Regards,
Edwin

On 4 January 2018 at 18:02, Emir Arnautović 
wrote:

> Hi Edwin,
> You need to encode <br> as &lt;br&gt;
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
> > On 4 Jan 2018, at 10:59, Zheng Lin Edwin Yeo wrote:
> >
> > Hi,
> >
> > I'm using Solr 7.2.0, and I'm trying to replace \n with <br> by using
> > RegexReplaceProcessorFactory.
> >
> > However, I could not get the below configuration in solrconfig.xml to be
> > loaded.
> >
> > <processor class="solr.RegexReplaceProcessorFactory">
> >   <str name="fieldName">content</str>
> >   <str name="pattern">\n</str>
> >   <str name="replacement"><br></str>
> > </processor>
> >
> > Understand that <br> is a special character. Can we do some escape
> sequence
> > for it? I have tried \<br>, but it does not work.
> >
> > Below is the error message which I got.
> >
> > Exception during parsing file:
> > solrconfig.xml:org.xml.sax.SAXParseException; systemId:
> > solrres:/solrconfig.xml; lineNumber: 1508; columnNumber: 36; The
> > element type "br" must be terminated by the matching end-tag "</br>".
> >
> >
> > Regards,
> > Edwin

Re: SolrCloud Nodes going to recovery state during indexing

2018-01-04 Thread Sravan Kumar
Emir,
   'delete_by_query' is the cause of the replicas going into recovery state.
   I replaced it with delete_by_id as you suggested. Everything works fine
after that. The cluster held for nearly 3 hours without any failures.
  Thanks Emir.

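[Editor's sketch] A hedged SolrJ sketch (editorial; the collection, ZooKeeper hosts, and field names are assumptions) of the workaround Emir describes below: collect the matching IDs first, then delete by ID in bulk.

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.common.SolrDocument;

    public class DeleteByIdInsteadOfDbq {
        public static void main(String[] args) throws Exception {
            try (CloudSolrClient client = new CloudSolrClient.Builder()
                    .withZkHost("zk1:2181,zk2:2181,zk3:2181").build()) {
                client.setDefaultCollection("collection1");
                // Same criterion as the thread's DBQ: documents last updated
                // more than a day ago (the field name is an assumption).
                SolrQuery q = new SolrQuery("updated_at:[* TO NOW-1DAY]");
                q.setFields("id");
                q.setRows(1000); // a real job would page with cursorMark until no hits remain
                List<String> ids = new ArrayList<>();
                for (SolrDocument doc : client.query(q).getResults()) {
                    ids.add((String) doc.getFieldValue("id"));
                }
                if (!ids.isEmpty()) {
                    client.deleteById(ids); // bulk delete by ID, no DBQ involved
                    client.commit();
                }
            }
        }
    }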

On Wed, Jan 3, 2018 at 8:41 PM, Emir Arnautović <
emir.arnauto...@sematext.com> wrote:

> Hi Sravan,
> DBQ does not play well with indexing - it causes indexing to be completely
> blocked on replicas while it is running. It is highly likely that it is the
> root cause of your issues. If you can change indexing logic to avoid it,
> you can quickly test it. What you can do as a workaround is to query for
> IDs that need to be deleted and execute a bulk delete by ID - that will not
> cause the issues DBQ does.
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 3 Jan 2018, at 16:04, Sravan Kumar  wrote:
> >
> > Emir,
> >Yes there is a delete_by_query on every bulk insert.
> >This delete_by_query deletes all the documents which were last updated
> earlier
> > than a day before the current time.
> >Is bulk delete_by_query the reason?
> >
> > On Wed, Jan 3, 2018 at 7:58 PM, Emir Arnautović <
> > emir.arnauto...@sematext.com> wrote:
> >
> >> Do you have deletes by query while indexing or it is append only index?
> >>
> >> Regards,
> >> Emir
> >> --
> >> Monitoring - Log Management - Alerting - Anomaly Detection
> >> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >>
> >>
> >>
> >>> On 3 Jan 2018, at 12:16, sravan  wrote:
> >>>
> >>> SolrCloud Nodes going to recovery state during indexing
> >>>
> >>>
> >>> We have solr cloud setup with the settings shared below. We have a
> >> collection with 3 shards and a replica for each of them.
> >>>
> >>> Normal State(As soon as the whole cluster is restarted):
> >>>- Status of all the shards is UP.
> >>>- a bulk update request of 50 documents each takes < 100ms.
> >>>- 6-10 simultaneous bulk updates.
> >>>
> >>> Nodes go into recovery state after 15-30 mins of updates.
> >>>- Some shards start giving the following ERRORs:
> >>>- o.a.s.h.RequestHandlerBase org.apache.solr.update.processor.
> >> DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async
> >> exception during distributed update: Read timed out
> >>>- o.a.s.u.StreamingSolrClients error java.net.
> SocketTimeoutException:
> >> Read timed out
> >>>- the following error is seen on the shard which goes to recovery
> >> state.
> >>>- too many updates received since start - startingUpdates no
> >> longer overlaps with our currentUpdates.
> >>>- Sometimes, the same shard even goes to DOWN state and needs a node
> >> restart to come back.
> >>>- a bulk update request of 50 documents takes more than 5 seconds.
> >> Sometimes even >120 secs. This is seen for all the requests if at least
> one
> >> node is in recovery state in the whole cluster.
> >>>
> >>> We have a standalone setup with the same collection schema which is
> able
> >> to take update & query load without any errors.
> >>>
> >>>
> >>> We have the following solrcloud setup.
> >>>- setup in AWS.
> >>>
> >>>- Zookeeper Setup:
> >>>- number of nodes: 3
> >>>- aws instance type: t2.small
> >>>- instance memory: 2gb
> >>>
> >>>- Solr Setup:
> >>>- Solr version: 6.6.0
> >>>- number of nodes: 3
> >>>- aws instance type: m5.xlarge
> >>>- instance memory: 16gb
> >>>- number of cores: 4
> >>>- JAVA HEAP: 8gb
> >>>- JAVA VERSION: oracle java version "1.8.0_151"
> >>>- GC settings: default CMS.
> >>>
> >>>collection settings:
> >>>- number of shards: 3
> >>>- replication factor: 2
> >>>- total 6 replicas.
> >>>- total number of documents in the collection: 12 million
> >>>- total number of documents in each shard: 4 million
> >>>- Each document has around 25 fields with 12 of them
> >> containing textual analysers & filters.
> >>>- Commit Strategy:
> >>>- No explicit commits from application code.
> >>>- Hard commit of 15 secs with OpenSearcher as false.
> >>>- Soft commit of 10 mins.
> >>>- Cache Strategy:
> >>>- filter queries
> >>>- number: 512
> >>>- autowarmCount: 100
> >>>- all other caches
> >>>- number: 512
> >>>- autowarmCount: 0
> >>>- maxWarmingSearchers: 2
> >>>
> >>>
> >>> - We tried the following
> >>>- commit strategy
> >>>- hard commit - 150 secs
> >>>- soft commit - 5 mins
> >>>- with GCG1 garbage collector based on
> https://wiki.apache.org/solr/
> >> ShawnHeisey#Java_8_recommendation_for_Solr:
> >>>- the nodes go

Re: SolrCloud Nodes going to recovery state during indexing

2018-01-04 Thread Emir Arnautović
Hi Sravan,
Glad to hear it helped!

Regards,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 4 Jan 2018, at 13:36, Sravan Kumar  wrote:
> 
> Emir,
>   'delete_by_query' is the cause of the replicas going into recovery state.
>   I replaced it with delete_by_id as you suggested. Everything works fine
> after that. The cluster held for nearly 3 hours without any failures.
>  Thanks Emir.
> 
> 
> On Wed, Jan 3, 2018 at 8:41 PM, Emir Arnautović <
> emir.arnauto...@sematext.com> wrote:
> 
>> Hi Sravan,
>> DBQ does not play well with indexing - it causes indexing to be completely
>> blocked on replicas while it is running. It is highly likely that it is the
>> root cause of your issues. If you can change indexing logic to avoid it,
>> you can quickly test it. What you can do as a workaround is to query for
>> IDs that need to be deleted and execute a bulk delete by ID - that will not
>> cause the issues DBQ does.
>> 
>> HTH,
>> Emir
>> --
>> Monitoring - Log Management - Alerting - Anomaly Detection
>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>> 
>> 
>> 
>>> On 3 Jan 2018, at 16:04, Sravan Kumar  wrote:
>>> 
>>> Emir,
>>>   Yes there is a delete_by_query on every bulk insert.
>>>   This delete_by_query deletes all the documents which were last updated
>> earlier
>>> than a day before the current time.
>>>   Is bulk delete_by_query the reason?
>>> 
>>> On Wed, Jan 3, 2018 at 7:58 PM, Emir Arnautović <
>>> emir.arnauto...@sematext.com> wrote:
>>> 
 Do you have deletes by query while indexing or it is append only index?
 
 Regards,
 Emir
 --
 Monitoring - Log Management - Alerting - Anomaly Detection
 Solr & Elasticsearch Consulting Support Training - http://sematext.com/
 
 
 
> On 3 Jan 2018, at 12:16, sravan  wrote:
> 
> SolrCloud Nodes going to recovery state during indexing
> 
> 
> We have solr cloud setup with the settings shared below. We have a
 collection with 3 shards and a replica for each of them.
> 
> Normal State(As soon as the whole cluster is restarted):
>   - Status of all the shards is UP.
>   - a bulk update request of 50 documents each takes < 100ms.
>   - 6-10 simultaneous bulk updates.
> 
> Nodes go into recovery state after 15-30 mins of updates.
>   - Some shards start giving the following ERRORs:
>   - o.a.s.h.RequestHandlerBase org.apache.solr.update.processor.
 DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async
 exception during distributed update: Read timed out
>   - o.a.s.u.StreamingSolrClients error java.net.
>> SocketTimeoutException:
 Read timed out
>   - the following error is seen on the shard which goes to recovery
 state.
>   - too many updates received since start - startingUpdates no
 longer overlaps with our currentUpdates.
>   - Sometimes, the same shard even goes to DOWN state and needs a node
 restart to come back.
>   - a bulk update request of 50 documents takes more than 5 seconds.
 Sometimes even >120 secs. This is seen for all the requests if at least
>> one
 node is in recovery state in the whole cluster.
> 
> We have a standalone setup with the same collection schema which is
>> able
 to take update & query load without any errors.
> 
> 
> We have the following solrcloud setup.
>   - setup in AWS.
> 
>   - Zookeeper Setup:
>   - number of nodes: 3
>   - aws instance type: t2.small
>   - instance memory: 2gb
> 
>   - Solr Setup:
>   - Solr version: 6.6.0
>   - number of nodes: 3
>   - aws instance type: m5.xlarge
>   - instance memory: 16gb
>   - number of cores: 4
>   - JAVA HEAP: 8gb
>   - JAVA VERSION: oracle java version "1.8.0_151"
>   - GC settings: default CMS.
> 
>   collection settings:
>   - number of shards: 3
>   - replication factor: 2
>   - total 6 replicas.
>   - total number of documents in the collection: 12 million
>   - total number of documents in each shard: 4 million
>   - Each document has around 25 fields with 12 of them
 containing textual analysers & filters.
>   - Commit Strategy:
>   - No explicit commits from application code.
>   - Hard commit of 15 secs with OpenSearcher as false.
>   - Soft commit of 10 mins.
>   - Cache Strategy:
>   - filter queries
>   - number: 512
>   - autowarmCount: 100
>   - all other caches
>   - number: 512
>   - autowarmCount: 0
>   - maxWarmingSearchers: 2
> 
> 
> - We tried the following

Re: problem with Solr Sorting by score and distance together

2018-01-04 Thread Shawn Heisey

On 1/3/2018 6:16 PM, Deepak Udapudi wrote:

Assume that I am searching for car care centers. The Solr collection has the data 
for all the major car care centers. As an example, I search for Firestone car 
care centers in a 5-mile radius. In the search results I am supposed to 
receive the list of Firestone car care centers within 5 miles of the specified 
location, and the centers should be sorted in distance order.

In the solr query handler I have specified the following.

i) I have specified the query condition (q) to be based on a 
distance parameter (basically, search for records within a certain distance in 
miles).

ii) I have specified the filter query conditions (fq) where 
fields accepting general text are matched against free text input (for example, Firestone 
Carcare).


These should probably be the other way around -- so the q parameter 
contains the user provided query and the distance is a filter.  It's 
more in line with the way I expect things to be done, and it would 
probably result in more efficient performance of the filterCache.  Also, 
filter queries do not affect the score -- at all.  So you're not getting 
any relevance information from the query that your user is typing.


I don't know very much about spatial, but there is a lot of 
documentation I can read.



iii) I have specified the sort condition (sort) to be based on 
score (calculated based on the filter query (fq) conditions applied in the 2nd 
item in the list) and distance. If there are duplicate records matching by 
score, then distance should be used to order the duplicate records.


Can you provide the entry from solr.log that shows the full query 
request?   Below is the kind of log entry I am talking about.  This was 
obtained by going to the admin UI query tab, typing in "example query", 
and then executing that query.


2018-01-04 13:10:03.042 INFO  (qtp1394336709-1102402) [   x:ncmain] 
o.a.s.c.S.Request [ncmain]  webapp=/solr path=/select 
params={q=example+query&indent=on&wt=json&_=1515071388516} hits=2163 
status=0 QTime=4473


Thanks,
Shawn

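[Editor's sketch] For reference, a sketch of the arrangement Shawn suggests: the user's text in q, the radius as a filter, distance as the tiebreaker. The field name, coordinates, and the 5-mile (~8.05 km) radius are illustrative; {!geofilt} and geodist() are standard Solr spatial syntax.

    q=firestone car care
    fq={!geofilt}
    sfield=location
    pt=37.7749,-122.4194
    d=8.05
    sort=score desc, geodist() asc

Here the user's query drives the score, the distance limit is a cacheable filter that does not affect the score, and geodist() orders documents with equal scores by distance.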

Json Facet Query Stripping Field Name with Hyphen

2018-01-04 Thread RAUNAK AGRAWAL
Hi Guys,

I am facing an issue while trying to follow the JSON Facet API. I have
data in my collection with field names like "week_0" and "week_-1", which
mean the current week and the previous week, respectively.

When I am querying for week_0 summation using the following query I am able
to get the result.

http://localhost:8983/solr/collection1/query?q=*:*&json.facet={week_0_sum:'sum(week_0)'}&rows=0


But when I am trying to do the same for any field "week_-*", it breaks.

For example when I am trying:
http://localhost:8983/solr/collection1/query?q=*:*&json.facet={week_-91_sum:%27sum(week_-91)%27}&rows=0


I am getting the exception: "msg": "undefined field: \"week_\""


That means solr is stripping the field name after the hyphen (-). Is there a
workaround to fix this? I tried adding an escape character (\), but it is of no
help.

With escape:
http://localhost:8983/solr/collection1/query?q=*:*&json.facet={week_-91_sum:%27sum(week_\-91)%27}&rows=0


Please help me regarding this.

Thanks


Replication Factor Bug in Collections Restore API?

2018-01-04 Thread Ansgar Wiechers
Hi all.

I'm running Solr 7.1 in SolrCloud mode on a 3-node cluster and tried
using the backup/restore API for the first time. Backup worked fine, but
when trying to restore the backed-up collection I ran into an unexpected
problem with the replication factor setting.

Below command attempts to restore a backup of the collection "demo" with
3 shards, creating 2 replicas per shard:

# curl -s -k 
'https://localhost:8983/solr/admin/collections?action=restore&name=demo&location=/srv/backup/solr/solr-dev&collection=demo&maxShardsPerNode=2&replicationFactor=2'
{
  "error": {
"code": 400,
"msg": "Solr cloud with available number of nodes:3 is insufficient for 
restoring a collection with 3 shards, total replicas per shard 6 and 
maxShardsPerNode 2. Consider increasing maxShardsPerNode value OR number of 
available nodes.",
"metadata": [
  "error-class",
  "org.apache.solr.common.SolrException",
  "root-error-class",
  "org.apache.solr.common.SolrException"
]
  },
  "exception": {
"rspCode": 400,
"msg": "Solr cloud with available number of nodes:3 is insufficient for 
restoring a collection with 3 shards, total replicas per shard 6 and 
maxShardsPerNode 2. Consider increasing maxShardsPerNode value OR number of 
available nodes."
  },
  "Operation restore caused exception:": 
"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: 
Solr cloud with available number of nodes:3 is insufficient for restoring a 
collection with 3 shards, total replicas per shard 6 and maxShardsPerNode 2. 
Consider increasing maxShardsPerNode value OR number of available nodes.",
  "responseHeader": {
"QTime": 28,
"status": 400
  }
}

It looks to me like the restore API multiplies the replication factor
by the number of nodes (the 2 I requested times 3 gives the "total
replicas per shard 6" in the error), which is not how the replication
factor behaves in other contexts. The documentation[1] also didn't lead
me to expect this behavior:

> replicationFactor
>
>The number of replicas to be created for each shard.

Is this expected behavior (by anyone but me)?
Should I report it as a bug?

[1]: https://lucene.apache.org/solr/guide/7_1/collections-api.html

Regards
Ansgar Wiechers
-- 
"Abstractions save us time working, but they don't save us time learning."
--Joel Spolsky


Re: Learning to Rank (LTR) with grouping

2018-01-04 Thread Roopa Rao
Hi,

Any guidance on this would be helpful.

Thank you,
Roopa

On Tue, Dec 19, 2017 at 8:47 PM, Roopa Rao  wrote:

> Hi Diego,
>
> Thank you for looking into it further.
> We recently ported over to version 6.6 solely to use the LTR feature, as it is
> critical for us.
>
> Since it's not working with grouping in the base version, I am trying to
> evaluate if there is any alternative way to make it work in 6.6 versus
> upgrading to 7.0.
>
> Any guidance you could provide on what can be done to use 6.6 with
> grouping + LTR or any alternatives would be helpful. Do I read your
> response as needing to go to 7.0 when you say upstream?
>
> Thank you,
> Roopa
>
>
> On Tue, Dec 19, 2017 at 1:37 PM, Diego Ceccarelli <
> diego.ceccare...@gmail.com> wrote:
>
>> Hi Roopa, unfortunately I can't port the patch to the branch_6_6, but
>> soon I'll update to upstream. Sorry about that.
>>
>> On Mon, Dec 18, 2017 at 7:52 PM, Roopa Rao  wrote:
>> > Hi -
>> >
>> > I merged the code from the bloomberg master-solr-8776 branch to
>> branch_6_6
>> > on Solr.
>> >
>> > When I tried to compile the solr source code, I am getting multiple
>> > compilation errors (Attached), which seems to be due to the fact that
>> the
>> > branch master-solr-8776 may not be compatible with branch_6_6.
>> >
>> > Could you please provide your input if master-solr-8776 is compatible
>> with
>> > branch_6_6?
>> >
>> > If this is not the case then how to proceed with using fix in
>> > master-solr-8776 with branch_6_6 can a new patch be created for this?
>> >
>> > Thank you,
>> > Roopa
>> >
>> > On Mon, Dec 11, 2017 at 9:54 AM, Roopa Rao  wrote:
>> >>
>> >> Hi Diego,
>> >>
>> >> Thank you,
>> >>
>> >> I am interested in reranking the documents inside one of the groups.
>> >>
>> >> I will try the options you mentioned here.
>> >>
>> >> Thank you,
>> >> Roopa
>> >>
>> >> On Mon, Dec 11, 2017 at 6:57 AM, Diego Ceccarelli (BLOOMBERG/ LONDON)
>> >>  wrote:
>> >>>
>> >>> Hi Roopa,
>> >>>
>> >>> If you look at the diff:
>> >>>
>> >>> https://github.com/apache/lucene-solr/pull/162/files
>> >>>
>> >>> I didn't change much in SolrIndexSearcher, you can try to skip the
>> file
>> >>> when applying the patch and redo the changes after.
>> >>>
>> >>> Alternatively, the feature branch is available here:
>> >>>
>> >>> https://github.com/bloomberg/lucene-solr/commits/master-solr-8776
>> >>>
>> >>> you could try to merge with that or cherry-pick my changes.
>> >>>
>> >>> Are you interested in reranking the groups or also in reranking the
>> >>> documents inside each group?
>> >>>
>> >>> Cheers,
>> >>> Diego
>> >>>
>> >>>
>> >>> From: solr-user@lucene.apache.org At: 12/09/17 19:07:25To:
>> >>> solr-user@lucene.apache.org
>> >>> Subject: Re: Learning to Rank (LTR) with grouping
>> >>>
>> >>> Hi I tried to apply this JIRA SOLR-8776 as a patch as this feature is
>> >>> critical.
>> >>>
>> >>> Here are the steps I took on my mac:
>> >>>
>> >>> On branch branch_6_5
>> >>>
>> >>> Your branch is up-to-date with 'origin/branch_6_5'
>> >>>
>> >>> patch -p1 -i 162.patch --dry-run
>> >>>
>> >>>
>> >>> I am getting Failures for certain Hunks
>> >>>
>> >>> Example:
>> >>>
>> >>> patching file
>> >>> solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java
>> >>>
>> >>> Hunk #1 FAILED at 1471.
>> >>>
>> >>>
>> >>> Could you please give your input on how to apply this ticket as a
>> patch
>> >>> for
>> >>> branch_6_5 ?
>> >>>
>> >>>
>> >>> Thank you,
>> >>>
>> >>> Roopa
>> >>>
>> >>> On Fri, Dec 8, 2017 at 6:52 PM, Roopa Rao  wrote:
>> >>>
>> >>> > Hi Diego,
>> >>> >
>> >>> > Thank you, I will look into this and see how I could patch this.
>> >>> >
>> >>> > Thank you for your quick response,
>> >>> > Roopa
>> >>> >
>> >>> >
>> >>> > On Fri, Dec 8, 2017 at 5:44 PM, Diego Ceccarelli <
>> >>> > diego.ceccare...@gmail.com> wrote:
>> >>> >
>> >>> >> Hi Roopa,
>> >>> >>
>> >>> >> LTR is implemented using RankQuery, and at the moment grouping
>> doesn't
>> >>> >> support RankQuery.
>> >>> >> I opened a jira item time ago
>> >>> >> (https://issues.apache.org/jira/browse/SOLR-8776) and I would be
>> happy
>> >>> >> to receive feedback on that.  You can find the code here
>> >>> >> https://github.com/apache/lucene-solr/pull/162.
>> >>> >>
>> >>> >> Cheers,
>> >>> >> diego
>> >>> >>
>> >>> >> On Fri, Dec 8, 2017 at 9:15 PM, Roopa Rao 
>> wrote:
>> >>> >> > Hi,
>> >>> >> >
>> >>> >> > I am using grouping and LTR together and the results are not
>> getting
>> >>> >> > re-rank as it does without grouping.
>> >>> >> >
>> >>> >> > I am passing &rq parameter.
>> >>> >> >
>> >>> >> > Does LTR work with grouping on?
>> >>> >> > Solr version 6.5
>> >>> >> >
>> >>> >> > Thank you,
>> >>> >> > Roopa
>> >>> >>
>> >>> >
>> >>> >
>> >>>
>> >>>
>> >>
>> >
>>
>
>


Re: Json Facet Query Stripping Field Name with Hyphen

2018-01-04 Thread Erick Erickson
From the ref guide:

"Field names should consist of alphanumeric or underscore characters
only and not start with a digit."

While field naming isn't strictly enforced, having field names like
week_-1 is also not guaranteed to be supported. You should change your
field name.

I raised SOLR-11819 for one place I see in the ref guide where a
hyphen is used; if you see any others, please add the location to the
JIRA (SOLR-11819).

Best,
Erick

On Thu, Jan 4, 2018 at 7:02 AM, RAUNAK AGRAWAL  wrote:
> Hi Guys,
>
> I am facing an issue while trying to follow the JSON Facet API. I have
> data in my collection with field names like "week_0" and "week_-1", which
> mean the current week and the previous week, respectively.
>
> When I am querying for week_0 summation using the following query I am able
> to get the result.
>
> http://localhost:8983/solr/collection1/query?q=*:*&json.facet={week_0_sum:'sum(week_0)'}&rows=0
>
>
> But when I am trying to do the same for any field "week_-*", it breaks.
>
> For example when I am trying:
> http://localhost:8983/solr/collection1/query?q=*:*&json.facet={week_-91_sum:%27sum(week_-91)%27}&rows=0
>
>
> I am getting the exception: "msg": "undefined field: \"week_\""
>
>
> That means solr is stripping the field name after the hyphen (-). Is there a
> workaround to fix this? I tried adding an escape character (\), but it is of no
> help.
>
> With escape:
> http://localhost:8983/solr/collection1/query?q=*:*&json.facet={week_-91_sum:%27sum(week_\-91)%27}&rows=0
>
>
> Please help me regarding this.
>
> Thanks


RE: Solrcloud with Master/Slave

2018-01-04 Thread Sundaram, Dinesh
Thanks Shawn for your prompt response. Assume I have a solrcloud A server with 
one node running on port 8983 and a solrcloud B server with one node running on 
8983. I want to sync the collection between solrcloud A and B using the below 
replication handler. Is this advisable to use on solrcloud B?



<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://solrcloudA:8983/solr/${solr.core.name}/replication</str>
    <str name="pollInterval">00:00:20</str>
  </lst>
</requestHandler>





Dinesh Sundaram
MBS Platform Engineering

Mastercard



-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: Tuesday, January 2, 2018 5:33 PM
To: solr-user@lucene.apache.org
Subject: Re: Solrcloud with Master/Slave

On 1/2/2018 3:32 PM, Sundaram, Dinesh wrote:
> I have spun up single solrcloud node on 2 servers.

This makes no sense.  If you have two servers, then you probably have more than 
a single node.

> tried to synch up the data b/w those servers via zookeeper

This is not done with zookeeper.  SolrCloud should handle it automatically.  
SolrCloud uses the zookeeper database to *coordinate* keeping machines in sync, 
but it's Solr that does the work, not zookeeper.

This makes even less sense when taken in context with the previous sentence.  
If you only have a single node, then you can't possibly sync between them.

> but didn’t work well due to out of memory issues, ensemble issues with
> multiple ports connectivity. So had to move to Master slave
> replication b/w those 2 solrcloud nodes. I couldn’t find any issues so
> far. Is this advisable? Because I’m wondering that looks like mixing
> up solrcloud and master/slave replication.

If you're getting OOME problems, then whatever program threw the OOME most 
likely needs more heap.  Or you need to take steps to reduce the amount of heap 
that's required.  Note that this second option might not actually be possible 
... increasing the heap is probably the only option you have.  Since version 
5.0, Solr has shipped with the default heap set to 512MB, which is extremely 
small.  Most users need to increase it.

You can't mix master-slave replication and SolrCloud.  SolrCloud takes over the 
replication feature for its own purposes.  Trying to mix these is going to 
cause you problems.  You may not run into the problems immediately, but it is 
likely that you would run into a problem eventually.  Data loss would be 
possible.

The latest versions of Solr have new SolrCloud replication types that closely 
mimic the old master-slave replication.

Perhaps you should start over and describe what you've actually seen -- exactly 
what you've done and configured, and how the results differed from your 
expectations.  Precise commands entered will be helpful.

Thanks,
Shawn




Re: Solrcloud with Master/Slave

2018-01-04 Thread Erick Erickson
Whoa. I don't think you should be doing this at all. This really
appears to be an XY problem. You're asking "how to do X" without
telling us what the problem you're trying to solve is (the Y). _Why_
do you want to set things up this way? A one-time synchronization or
to keep both collections in sync?


Cross Data Center Replication (CDCR) is designed to keep two separate
collections in sync on an ongoing basis.

If this is a one-time deal, you can manually issue a replication API
"fetchindex" command. What I'd do in that case is set up your
collection B with each shard having exactly one replica (i.e. a leader
and no followers). Do the fetch and verify that your new collection is
as you want it then ADDREPLICA to build out your redundancy.

Best,
Erick

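[Editor's sketch] For reference, a one-time fetchindex call looks roughly like this (the hosts and core name follow the thread's example and would need adjusting):

    http://solrcloudB:8983/solr/collection1/replication?command=fetchindex&masterUrl=http://solrcloudA:8983/solr/collection1/replication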
On Thu, Jan 4, 2018 at 8:01 AM, Sundaram, Dinesh
 wrote:
> Thanks Shawn for your prompt response. Assume I have a solrcloud A server with
> one node running on port 8983 and a solrcloud B server with one node running on
> 8983. I want to sync the collection between solrcloud A and B using the
> below replication handler. Is this advisable to use on solrcloud B?
>
> <requestHandler name="/replication" class="solr.ReplicationHandler">
>   <lst name="slave">
>     <str name="masterUrl">http://solrcloudA:8983/solr/${solr.core.name}/replication</str>
>     <str name="pollInterval">00:00:20</str>
>   </lst>
> </requestHandler>
>
>
>
> Dinesh Sundaram
> MBS Platform Engineering
>
> Mastercard
>
>
>
> -Original Message-
> From: Shawn Heisey [mailto:apa...@elyograg.org]
> Sent: Tuesday, January 2, 2018 5:33 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solrcloud with Master/Slave
>
> On 1/2/2018 3:32 PM, Sundaram, Dinesh wrote:
>> I have spun up single solrcloud node on 2 servers.
>
> This makes no sense.  If you have two servers, then you probably have more 
> than a single node.
>
>> tried to synch up the data b/w those servers via zookeeper
>
> This is not done with zookeeper.  SolrCloud should handle it automatically.  
> SolrCloud uses the zookeeper database to *coordinate* keeping machines in 
> sync, but it's Solr that does the work, not zookeeper.
>
> This makes even less sense when taken in context with the previous sentence.  
> If you only have a single node, then you can't possibly sync between them.
>
>> but didn’t work well due to out of memory issues, ensemble issues with
>> multiple ports connectivity. So had to move to Master slave
>> replication b/w those 2 solrcloud nodes. I couldn’t find any issues so
>> far. Is this advisable? Because I’m wondering that looks like mixing
>> up solrcloud and master/slave replication.
>
> If you're getting OOME problems, then whatever program threw the OOME most 
> likely needs more heap.  Or you need to take steps to reduce the amount of 
> heap that's required.  Note that this second option might not actually be 
> possible ... increasing the heap is probably the only option you have.  Since 
> version 5.0, Solr has shipped with the default heap set to 512MB, which is 
> extremely small.  Most users need to increase it.
>
> You can't mix master-slave replication and SolrCloud.  SolrCloud takes over 
> the replication feature for its own purposes.  Trying to mix these is going 
> to cause you problems.  You may not run into the problems immediately, but it 
> is likely that you would run into a problem eventually.  Data loss would be 
> possible.
>
> The latest versions of Solr have new SolrCloud replication types that closely 
> mimic the old master-slave replication.
>
> Perhaps you should start over and describe what you've actually seen -- 
> exactly what you've done and configured, and how the results differed from 
> your expectations.  Precise commands entered will be helpful.
>
> Thanks,
> Shawn
>
>


Deliver static html content via solr

2018-01-04 Thread Matthias Geiger
Hello,
I have a web application that delivers static HTML content to the user.

I have been thinking about the possibility of delivering this content from
Solr instead of delivering it from the filesystem.
This would avoid storing the content twice (HTML files on the file
system + additional Solr cores).

Is this a viable approach or a no-go?
In case of a no-go, why do you think it is wrong?

In case you suggest a NoSQL database, what makes NoSQL superior to
Solr?

Regards and Thanks for your time


Re: Deliver static html content via solr

2018-01-04 Thread Walter Underwood
Why would you even consider putting static HTML in a search engine? You don’t 
want to search it.

1. Filesystems are very fast, and operating systems are very good at caching 
them.
2. Files can be pre-compressed for some web servers (Apache, at least) saving 
CPU for compression
3. Solr is not a repository, so you need a copy of the files somewhere, maybe 
in the file system. You cannot get around the “double” copy by keeping the 
originals in Solr.
4. Filesystems are much, much more reliable than Solr. Solr is very good, but 
much more complicated than filesystems.

If you really want to fetch blobs by ID and don’t want to use a filesystem, use 
a database designed for that. That was the original focus of MySQL, for example.

Solr is not a database. Solr is not a repository. A design using Solr for 
primary storage of data is a broken design.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Jan 4, 2018, at 8:19 AM, Matthias Geiger  wrote:
> 
> Hello,
> i have a web application that delivers static html content to the user.
> 
> I have been thinking about the possibility to deliver this content from
> solr instead of delivering it from the filesystem.
> This would prevent the "double" stored content (html files on file
> systems + additional solr cores)
> 
> Is this a viable approach or a no go?
> In case of a no go why do you think it is wrong
> 
> In case of the suggestion of a nosql database, what makes noSql superior to
> solr?
> 
> Regards and Thanks for your time



Re: Deliver static html content via solr

2018-01-04 Thread David Hastings
It's really easy, I find, for people to start going down this road.  I have to
always remind myself of the hammer-and-nail analogy.  Use each tool for its
purpose.

On Thu, Jan 4, 2018 at 11:27 AM, Walter Underwood 
wrote:

> Why would you even consider putting static HTML in a search engine? You
> don’t want to search it.
>
> 1. Filesystems are very fast, and operating systems are very good at
> caching them.
> 2. Files can be pre-compressed for some web servers (Apache, at least)
> saving CPU for compression
> 3. Solr is not a repository, so you need a copy of the files somewhere,
> maybe in the file system. You cannot get around the “double” copy by
> keeping the originals in Solr.
> 4. Filesystems are much, much more reliable than Solr. Solr is very good,
> but much more complicated than filesystems.
>
> If you really want to fetch blobs by ID and don’t want to use a
> filesystem, use a database designed for that. That was the original focus
> of MySQL, for example.
>
> Solr is not a database. Solr is not a repository. A design using Solr for
> primary storage of data is a broken design.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> > On Jan 4, 2018, at 8:19 AM, Matthias Geiger 
> wrote:
> >
> > Hello,
> > i have a web application that delivers static html content to the user.
> >
> > I have been thinking about the possibility to deliver this content from
> > solr instead of delivering it from the filesystem.
> > This would prevent the "double" stored content (html files on file
> > systems + additional solr cores)
> >
> > Is this a viable approach or a no go?
> > In case of a no go why do you think it is wrong
> >
> > In case of the suggestion of a nosql database, what makes noSql superior
> to
> > solr?
> >
> > Regards and Thanks for your time
>
>


Re: Json Facet Query Stripping Field Name with Hyphen

2018-01-04 Thread Yonik Seeley
The JSON Facet API uses the function query parser for something like
sum(week_-91) so you'll probably have problems with any function that
uses these fields as well.
As Erick says, you're better off renaming the fields.  There is a
workaround for wonky field names via the "field" function:
sum(field(week_-91))

-Yonik


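[Editor's sketch] Applied to the failing query above, the workaround would look like this (an untested sketch mirroring Yonik's expression):

    http://localhost:8983/solr/collection1/query?q=*:*&json.facet={week_-91_sum:'sum(field(week_-91))'}&rows=0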
On Thu, Jan 4, 2018 at 10:02 AM, RAUNAK AGRAWAL
 wrote:
> Hi Guys,
>
> I am facing an issue while trying to follow the JSON Facet API. I have
> data in my collection with field names like "week_0" and "week_-1", which
> mean the current week and the previous week, respectively.
>
> When I am querying for week_0 summation using the following query I am able
> to get the result.
>
> http://localhost:8983/solr/collection1/query?q=*:*&json.facet={week_0_sum:'sum(week_0)'}&rows=0
>
>
> But when I am trying to do the same for any field "week_-*", it breaks.
>
> For example when I am trying:
> http://localhost:8983/solr/collection1/query?q=*:*&json.facet={week_-91_sum:%27sum(week_-91)%27}&rows=0
>
>
> I am getting the exception: "msg": "undefined field: \"week_\""
>
>
> That means solr is stripping the field name after the hyphen (-). Is there a
> workaround to fix this? I tried adding an escape character (\), but it is of no
> help.
>
> With escape:
> http://localhost:8983/solr/collection1/query?q=*:*&json.facet={week_-91_sum:%27sum(week_\-91)%27}&rows=0
>
>
> Please help me regarding this.
>
> Thanks


Re: Json Facet Query Stripping Field Name with Hyphen

2018-01-04 Thread RAUNAK AGRAWAL
Hi Erick/Yonik,

Thank you guys. I am going to rename the fields.

On Thu, Jan 4, 2018 at 10:04 PM, Yonik Seeley  wrote:

> The JSON Facet API uses the function query parser for something like
> sum(week_-91) so you'll probably have problems with any function that
> uses these fields as well.
> As Erick says, you're better off renaming the fields.  There is a
> workaround for wonky field names via the "field" function:
> sum(field(week_-91))
>
> -Yonik
>
>
> On Thu, Jan 4, 2018 at 10:02 AM, RAUNAK AGRAWAL
>  wrote:
> > Hi Guys,
> >
> > I am facing an issue while trying to follow the JSON Facet API. I have
> > data in my collection with field names like "week_0" and "week_-1", which
> > mean the current week and the previous week, respectively.
> >
> > When I am querying for week_0 summation using the following query I am
> able
> > to get the result.
> >
> > http://localhost:8983/solr/collection1/query?q=*:*&json.
> facet={week_0_sum:'sum(week_0)'}&rows=0
> >
> >
> > But when I am trying to do the same for any field "week_-*", it breaks.
> >
> > For example when I am trying:
> > http://localhost:8983/solr/collection1/query?q=*:*&json.
> facet={week_-91_sum:%27sum(week_-91)%27}&rows=0
> >
> >
> > I am getting the exception: "msg": "undefined field: \"week_\""
> >
> >
> > That means solr is stripping the field name after the hyphen (-). Is there a
> > workaround to fix this? I tried adding an escape character (\), but it is of
> no
> > help.
> >
> > With escape:
> > http://localhost:8983/solr/collection1/query?q=*:*&json.
> facet={week_-91_sum:%27sum(week_\-91)%27}&rows=0
> >
> >
> > Please help me regarding this.
> >
> > Thanks
>


Re: Deliver static html content via solr

2018-01-04 Thread Erik Hatcher
All judgements aside on whether this is a preferred way to go, have a look at 
/browse and the VelocityResponseWriter (wt=velocity).  It can serve static 
resources.

I’ve built several prototypes this way that have been effective and
business-generating.

   Erik

> On Jan 4, 2018, at 11:19, Matthias Geiger  wrote:
> 
> Hello,
> i have a web application that delivers static html content to the user.
> 
> I have been thinking about the possibility to deliver this content from
> solr instead of delivering it from the filesystem.
> This would prevent the "double" stored content (html files on file
> systems + additional solr cores)
> 
> Is this a viable approach or a no go?
> In case of a no go why do you think it is wrong
> 
> In case of the suggestion of a nosql database, what makes noSql superior to
> solr?
> 
> Regards and Thanks for your time


SOLR SSL Java command line properties

2018-01-04 Thread Bob Feider
When I use the provided Apache SOLR startup script (version 6.6.0), the 
script creates and then executes a java command line that has two sets 
of SSL properties whose related elements are set to the same values. One 
set has property names like javax.net.ssl.* while the other set has 
names like solr.jetty.*. For example:


   java -server ... -Dsolr.jetty.keystore.password=secret
   ... -Djavax.net.ssl.keyStorePassword=secret ... -jar start.jar
   --module=https

Our security team does not allow passwords to be passed along on the 
command line or in environment variables but will allow them to be 
placed in a file provided the file has restricted access permissions. I 
noticed that there is a jetty-ssl.xml file in the solr/server/etc 
directory that can be used to provide default values for the SOLR SSL 
related properties including the solr.jetty.keystore.password. When I 
remove the javax.net.ssl.keyStorePassword and 
solr.jetty.keystore.password properties from the java command line and 
update the jetty-ssl.xml file with my default keystore password, SOLR 
appears to start properly with the default keystore password contained 
in that file. I can then connect with my browser to 
https://localhost:8983/solr/# and access the SOLR Admin page just fine.

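[Editor's sketch] For reference, a sketch of that jetty-ssl.xml edit (element and property names as in the stock Solr 6.x file; the password value is a placeholder, and your copy of the file should be checked rather than assumed to match this):

    <!-- inside the SslContextFactory definition in server/etc/jetty-ssl.xml -->
    <Set name="KeyStorePassword">
      <Property name="solr.jetty.keystore.password" default="my-keystore-password"/>
    </Set>
    <Set name="TrustStorePassword">
      <Property name="solr.jetty.truststore.password" default="my-keystore-password"/>
    </Set>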

Are the javax.net.ssl.* properties used at all in the SOLR standalone 
or SOLR cloud products?


Do I need to provide the javax.net.ssl.* properties on the command line 
for proper operation or can I get away with simply providing them in the 
jetty-ssl.xml file?


I am concerned that they are used behind the scenes, outside of the 
browser-to-SOLR-server connections, to connect to other processes like 
zookeeper, and that by doing this I will uncover some problem down the 
road that my simple testing has not revealed. The only direct reference 
to the properties I can see in the source code is in the solr embedded 
code that is part of the solrj client, inside the SSLConfig Java class.


Thanks for your help,

Bob



Re: Replication Factor Bug in Collections Restore API?

2018-01-04 Thread Shalin Shekhar Mangar
Sounds like a bug. Can you please open a Jira issue?

On Thu, Jan 4, 2018 at 8:37 PM, Ansgar Wiechers
 wrote:
> Hi all.
>
> I'm running Solr 7.1 in SolrCloud mode on a 3-node cluster and tried
> using the backup/restore API for the first time. Backup worked fine, but
> when trying to restore the backed-up collection I ran into an unexpected
> problem with the replication factor setting.
>
> Below command attempts to restore a backup of the collection "demo" with
> 3 shards, creating 2 replicas per shard:
>
> # curl -s -k 
> 'https://localhost:8983/solr/admin/collections?action=restore&name=demo&location=/srv/backup/solr/solr-dev&collection=demo&maxShardsPerNode=2&replicationFactor=2'
> {
>   "error": {
> "code": 400,
> "msg": "Solr cloud with available number of nodes:3 is insufficient for 
> restoring a collection with 3 shards, total replicas per shard 6 and 
> maxShardsPerNode 2. Consider increasing maxShardsPerNode value OR number of 
> available nodes.",
> "metadata": [
>   "error-class",
>   "org.apache.solr.common.SolrException",
>   "root-error-class",
>   "org.apache.solr.common.SolrException"
> ]
>   },
>   "exception": {
> "rspCode": 400,
> "msg": "Solr cloud with available number of nodes:3 is insufficient for 
> restoring a collection with 3 shards, total replicas per shard 6 and 
> maxShardsPerNode 2. Consider increasing maxShardsPerNode value OR number of 
> available nodes."
>   },
>   "Operation restore caused exception:": 
> "org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: 
> Solr cloud with available number of nodes:3 is insufficient for restoring a 
> collection with 3 shards, total replicas per shard 6 and maxShardsPerNode 2. 
> Consider increasing maxShardsPerNode value OR number of available nodes.",
>   "responseHeader": {
> "QTime": 28,
> "status": 400
>   }
> }
>
> It looks to me like the restore API multiplies the replication factor
> by the number of nodes (the 2 I requested times 3 gives the "total
> replicas per shard 6" in the error), which is not how the replication
> factor behaves in other contexts. The documentation[1] also didn't lead
> me to expect this behavior:
>
>> replicationFactor
>>
>>The number of replicas to be created for each shard.
>
> Is this expected behavior (by anyone but me)?
> Should I report it as a bug?
>
> [1]: https://lucene.apache.org/solr/guide/7_1/collections-api.html
>
> Regards
> Ansgar Wiechers
> --
> "Abstractions save us time working, but they don't save us time learning."
> --Joel Spolsky



-- 
Regards,
Shalin Shekhar Mangar.


RE: Solrcloud with Master/Slave

2018-01-04 Thread Sundaram, Dinesh
I want to keep both collections in sync always. This is really working fine 
without any issue so far.  My problem is pretty straightforward.

I'm starting two solr instances on two servers using the below command. I 
believe this command is for solrcloud mode. If so, then I have that shared 
replication handler config in my _default/solrconfig.xml on one 
instance so that the slave instance will sync with the master. I don’t use 
zookeeper at all, just the replication handler setting in solrconfig.xml. Is this 
good for the long term? If not, please help me understand the issues.

bin/solr start -cloud -p 8983 -noprompt



Dinesh Sundaram
MBS Platform Engineering

Mastercard



-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Thursday, January 4, 2018 10:10 AM
To: solr-user 
Subject: Re: Solrcloud with Master/Slave

Whoa. I don't think you should be doing this at all. This really appears to be 
an XY problem. You're asking "how to do X" without telling us what the problem 
you're trying to solve is (the Y). _Why_ do you want to set things up this way? 
A one-time synchronization or to keep both collections in sync?


Cross Data Center Replication (CDCR) is designed to keep two separate 
collections in sync on an ongoing basis.

If this is a one-time deal, you can manually issue a replication API 
"fetchindex" command. What I'd do in that case is set up your collection B with 
each shard having exactly one replica (i.e. a leader and no followers). Do the 
fetch and verify that your new collection is as you want it then ADDREPLICA to 
build out your redundancy.

Best,
Erick

On Thu, Jan 4, 2018 at 8:01 AM, Sundaram, Dinesh 
 wrote:
> Thanks Shawn for your prompt response. Assume I have a solrcloud A server with 
> one node running on port 8983 and a solrcloud B server with one node running on 
> 8983. I want to sync the collection between solrcloud A and B using the 
> below replication handler. Is this advisable to use on solrcloud B?
>
> <requestHandler name="/replication" class="solr.ReplicationHandler">
>   <lst name="slave">
>     <str name="masterUrl">http://solrcloudA:8983/solr/${solr.core.name}/replication</str>
>     <str name="pollInterval">00:00:20</str>
>   </lst>
> </requestHandler>
>
>
>
> Dinesh Sundaram
> MBS Platform Engineering
>
> Mastercard
>
>
>
> -Original Message-
> From: Shawn Heisey [mailto:apa...@elyograg.org]
> Sent: Tuesday, January 2, 2018 5:33 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solrcloud with Master/Slave
>
> On 1/2/2018 3:32 PM, Sundaram, Dinesh wrote:
>> I have spun up single solrcloud node on 2 servers.
>
> This makes no sense.  If you have two servers, then you probably have more 
> than a single node.
>
>> tried to synch up the data b/w those servers via zookeeper
>
> This is not done with zookeeper.  SolrCloud should handle it automatically.  
> SolrCloud uses the zookeeper database to *coordinate* keeping machines in 
> sync, but it's Solr that does the work, not zookeeper.
>
> This makes even less sense when taken in context with the previous sentence.  
> If you only have a single node, then you can't possibly sync between them.
>
>> but didn’t work well due to out of memory issues, ensemble issues 
>> with multiple ports connectivity. So had to move to Master slave 
>> replication b/w those 2 solrcloud nodes. I couldn’t find any issues 
>> so far. Is this advisable? Because I’m wondering that looks like 
>> mixing up solrcloud and master/slave replication.
>
> If you're getting OOME problems, then whatever program threw the OOME most 
> likely needs more heap.  Or you need to take steps to reduce the amount of 
> heap that's required.  Note that this second option might not actually be 
> possible ... increasing the heap is probably the only option you have.  Since 
> version 5.0, Solr has shipped with the default heap set to 512MB, which is 
> extremely small.  Most users need to increase it.
>
> You can't mix master-slave replication and SolrCloud.  SolrCloud takes over 
> the replication feature for its own purposes.  Trying to mix these is going 
> to cause you problems.  You may not run into the problems immediately, but it 
> is likely that you would run into a problem eventually.  Data loss would be 
> possible.
>
> The latest versions of Solr have new SolrCloud replication types that closely 
> mimic the old master-slave replication.
>
> Perhaps you should start over and describe what you've actually seen -- 
> exactly what you've done and configured, and how the results differed from 
> your expectations.  Precise commands entered will be helpful.
>
> Thanks,
> Shawn
>
>

Re: Solrcloud with Master/Slave

2018-01-04 Thread Erick Erickson
Yes you do use ZooKeeper. Starting Solr with the -cloud option but _without_
ZK_HOST defined (or the -z parameter) starts an internal ZooKeeper on port
9983 (by default). This is evidenced by the fact that the admin UI has a
"cloud" link along the left. In essence you have two separate clusters,
each cluster
just happens to exist on the same machine.

Why bother with SolrCloud? Just configure old-style master/slave.
SolrCloud is buying you nothing and running internal ZooKeepers is consuming
resources for no good purpose.

SolrCloud would help you if you set up a proper cluster with ZooKeeper
and just had both of your nodes in the same cluster, one with replicas. That
buys you HA/DR, NRT on both leader and follower etc.

Up to you of course, but it's really hard to see the purpose of
running the way you are.

Best,
Erick

On Thu, Jan 4, 2018 at 11:38 AM, Sundaram, Dinesh <
dinesh.sunda...@mastercard.com> wrote:

> I want to keep both collections in sync always. This is really working
> fine without any issue so far.  My problem is pretty straight forward.
>
> I'm starting two solr instances on two servers using the below command. I
> believe this command is for solrcloud mode. If so then I have that shared
> replication handler config also in in my _default/solrconfig.xml on one
> instance so that the slave instance will synch with master. I don’t use
> zookeeper at all. Just replication handler setting in solrconfig.xml. is
> this good for longtime? If not please help me understand the issues.
>
> bin/solr start -cloud -p 8983 -noprompt
>
>
>
> Dinesh Sundaram
> MBS Platform Engineering
>
> Mastercard
>
>
>
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Thursday, January 4, 2018 10:10 AM
> To: solr-user 
> Subject: Re: Solrcloud with Master/Slave
>
> Whoa. I don't think you should be doing this at all. This really appears
> to be an XY problem. You're asking "how to do X" without telling us what
> the problem you're trying to solve is (the Y). _Why_ do you want to set
> things up this way? A one-time synchronization or to keep both collections
> in sync?
>
>
> Cross Data Center Replication (CDCR) is designed to keep two separate
> collections in sync on an ongoing basis.
>
> If this is a one-time deal, you can manually issue a replication API
> "fetchindex" command. What I'd do in that case is set up your collection B
> with each shard having exactly one replica (i.e. a leader and no
> followers). Do the fetch and verify that your new collection is as you want
> it then ADDREPLICA to build out your redundancy.
>
> Best,
> Erick
>
> On Thu, Jan 4, 2018 at 8:01 AM, Sundaram, Dinesh <
> dinesh.sunda...@mastercard.com> wrote:
> > Thanks Shawn for your prompt response. Assume I have solrcloud A server
> > with one node running on port 8983 and solrcloud B server with one node
> > running on 8983; here I want to sync the collection between solrcloud A
> > and B using the below replication handler. Is this advisable to use on
> > solrcloud B?
> >
> > <requestHandler name="/replication" class="solr.ReplicationHandler">
> >   <lst name="slave">
> >     <str name="masterUrl">http://solrcloudA:8983/solr/${solr.core.name}/replication</str>
> >     <str name="pollInterval">00:00:20</str>
> >   </lst>
> > </requestHandler>
> >
> >
> >
> > Dinesh Sundaram
> > MBS Platform Engineering
> >
> > Mastercard
> >
> >
> >
> > -Original Message-
> > From: Shawn Heisey [mailto:apa...@elyograg.org]
> > Sent: Tuesday, January 2, 2018 5:33 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Solrcloud with Master/Slave
> >
> > On 1/2/2018 3:32 PM, Sundaram, Dinesh wrote:
> >> I have spun up single solrcloud node on 2 servers.
> >
> > This makes no sense.  If you have two servers, then you probably have
> more than a single node.
> >
> >> tried to synch up the data b/w those servers via zookeeper
> >
> > This is not done with zookeeper.  SolrCloud should handle it
> automatically.  SolrCloud uses the zookeeper database to *coordinate*
> keeping machines in sync, but it's Solr that does the work, not zookeeper.
> >
> > This makes even less sense when taken in context with the previous
> sentence.  If you only have a single node, then you can't possibly sync
> between them.
> >
> >> but didn’t work well due to out of memory issues, ensemble issues
> >> with multiple ports connectivity. So had to move to Master slave
> >> replication b/w those 2 solrcloud nodes. I couldn’t find any issues
> >> so far. Is this advisable? Because I’m wondering that looks like
> >> mixing up solrcloud and master/slave replication.
> >
> > If you're getting OOME problems, then whatever program threw the OOME
> most likely needs more heap.  Or you need to take steps to reduce the
> amount of heap that's required.  Note that this second option might not
> actually be possible ... increasing the

Negative Core Node Numbers

2018-01-04 Thread Chris Ulicny
Hi,

In 7.1, how does solr determine the numbers that are assigned to the
replicas? I'm familiar with the earlier naming conventions from 6.3, but I
wanted to know if there was supposed to be any connection between the
"_n##" suffix and the number assigned to the "core_node##" name since they
don't seem to follow the old convention. As an example node from
clusterstatus for a testcollection with replication factor 2.

"core_node91":{
  "core":"testcollection_shard22_replica_n84",
  "base_url":"http://host:8080/solr",
  "node_name":"host:8080_solr",
  "state":"active",
  "type":"NRT",
  "leader":"true"}

Along the same lines, when creating the testcollection with 200 shards and
replication factor of 2, I am also getting nodes that have negative numbers
assigned to them which looks a lot like an int overflow issue. From the
cluster status:

  "shard157":{
    "range":"47ae-48f4",
    "state":"active",
    "replicas":{
      "core_node1675945628":{
        "core":"testcollection_shard157_replica_n-1174535610",
        "base_url":"http://host1:8080/solr",
        "node_name":"host1:8080_solr",
        "state":"active",
        "type":"NRT"},
      "core_node1642259614":{
        "core":"testcollection_shard157_replica_n-1208090040",
        "base_url":"http://host2:8080/solr",
        "node_name":"host2:8080_solr",
        "state":"active",
        "type":"NRT",
        "leader":"true"}}}

This keeps happening even when the collection is successfully deleted (no
directories or files left on disk), the entire cluster is shutdown, and the
zookeeper chroot path cleared out of all content. The only thing that
happened prior to this cycle was a single failed collection creation which
seemed to clean itself up properly, after which everything was shutdown and
cleaned from zookeeper as well.

Is there something else that is keeping track of those values that wasn't
cleared out? Or is this now the expected behavior for the numerical
assignments to replicas?
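
(For reference, the cluster state snippets above come from the Collections
API, along the lines of:

curl "http://host:8080/solr/admin/collections?action=CLUSTERSTATUS&collection=testcollection&wt=json"

with the host and collection named as in the example.)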

Thanks,
Chris


RE: Solrcloud with Master/Slave

2018-01-04 Thread Sundaram, Dinesh
OK, thanks for your valuable reply. I want to see the admin console so that I
can monitor the collection details; that is the reason for going to cloud mode.
But here I need replication without ZooKeeper, so I had to choose regular
master/slave replication. Am I mixing two different sync-up procedures, or is
this also okay?


Dinesh Sundaram
MBS Platform Engineering

Mastercard



-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Thursday, January 4, 2018 2:06 PM
To: solr-user 
Subject: Re: Solrcloud with Master/Slave

Yes you do use ZooKeeper. Starting Solr with the -cloud option but _without_ 
ZK_HOST defined (or the -z parameter) starts an internal ZooKeeper on port
9983 (by default). This is evidenced by the fact that the admin UI has a 
"cloud" link along the left. In essence you have two separate clusters, each 
cluster just happens to exist on the same machine.

Why bother with SolrCloud? Just configure old-style master/slave.
SolrCloud is buying you nothing and running internal ZooKeepers is consuming 
resources for no good purpose.

SolrCloud would help you if you set up a proper cluster with ZooKeeper and just 
had both of your nodes in the same cluster, one with replicas. That buys you 
HA/DR, NRT on both leader and follower etc.

Up to you of course, but it's really hard to see the purpose of running the
way you are.

Best,
Erick

On Thu, Jan 4, 2018 at 11:38 AM, Sundaram, Dinesh < 
dinesh.sunda...@mastercard.com> wrote:

> I want to keep both collections in sync always. This is really working
> fine without any issue so far. My problem is pretty straightforward.
>
> I'm starting two solr instances on two servers using the below
> command. I believe this command is for solrcloud mode. If so then I
> have that shared replication handler config also in my
> _default/solrconfig.xml on one instance so that the slave instance
> will sync with the master. I don't use zookeeper at all. Just the
> replication handler setting in solrconfig.xml. Is this good for the long
> term? If not please help me understand the issues.
>
> bin/solr start -cloud -p 8983 -noprompt
>
>
>
> Dinesh Sundaram
> MBS Platform Engineering
>
> Mastercard
>
>
>
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Thursday, January 4, 2018 10:10 AM
> To: solr-user 
> Subject: Re: Solrcloud with Master/Slave
>
> Whoa. I don't think you should be doing this at all. This really
> appears to be an XY problem. You're asking "how to do X" without
> telling us what the problem you're trying to solve is (the Y). _Why_
> do you want to set things up this way? A one-time synchronization or
> to keep both collections in sync?
>
>
> Cross Data Center Replication (CDCR) is designed to keep two separate
> collections in sync on an ongoing basis.
>
> If this is a one-time deal, you can manually issue a replication API
> "fetchindex" command. What I'd do in that case is set up your
> collection B with each shard having exactly one replica (i.e. a leader
> and no followers). Do the fetch and verify that your new collection is
> as you want it then ADDREPLICA to build out your redundancy.
>
> Best,
> Erick
>
> On Thu, Jan 4, 2018 at 8:01 AM, Sundaram, Dinesh <
> dinesh.sunda...@mastercard.com> wrote:
> > Thanks Shawn for your prompt response. Assume I have solrcloud A server
> > with one node running on port 8983 and solrcloud B server with one node
> > running on 8983; here I want to sync the collection between solrcloud A
> > and B using the below replication handler. Is this advisable to use on
> > solrcloud B?
> >
> > <requestHandler name="/replication" class="solr.ReplicationHandler">
> >   <lst name="slave">
> >     <str name="masterUrl">http://solrcloudA:8983/solr/${solr.core.name}/replication</str>
> >     <str name="pollInterval">00:00:20</str>
> >   </lst>
> > </requestHandler>
> >
> >
> >
> > Dinesh Sundaram
> > MBS Platform Engineering
> >
> > Mastercard
> >
> >
> >
> > -Original Message-
> > From: Shawn Heisey [mailto:apa...@elyograg.org]
> > Sent: Tuesday, January 2, 2018 5:33 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Solrcloud with Master/Slave
> >
> > On 1/2/2018 3:32 PM, Sundaram, Dinesh wrote:
> >> I have spun up single solrcloud node on 2 servers.
> >
> > This makes no sense.  If you have two servers, then you probably
> > have
> more than a single node.
> >
> >> tried to synch up the data b/w those servers via zookeeper
> >
> > This is not done with zookeeper.  SolrCloud should handle it
> automatically.  SolrCloud uses the zookeeper database to *coordinate*
> keeping machines in sync, but it's Solr that does the work, not zookeeper.
> >
> > This makes even less sense when taken in context with the previous
> sentence.  If you only have a single node, then you can't possibly
> sync between them.
> >
> >> but didn’t work well due to out 

Re: Negative Core Node Numbers

2018-01-04 Thread Anshum Gupta
Hi Chris,

The core node numbers should be cleared out when the collection is deleted. Is 
that something you see consistently?

P.S: I just tried creating a collection with 1 shard and 200 replicas and saw 
the core node numbers as expected. On deleting and recreating the collection, I 
saw that the counter was reset. Just to be clear, I tried this on master.
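
For anyone who wants to reproduce this, a creation call along these lines
should do it (host and collection name are placeholders):

curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=testcollection&numShards=1&replicationFactor=200&maxShardsPerNode=200"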

-Anshum



> On Jan 4, 2018, at 12:16 PM, Chris Ulicny  wrote:
> 
> Hi,
> 
> In 7.1, how does solr determine the numbers that are assigned to the
> replicas? I'm familiar with the earlier naming conventions from 6.3, but I
> wanted to know if there was supposed to be any connection between the
> "_n##" suffix and the number assigned to the "core_node##" name since they
> don't seem to follow the old convention. As an example node from
> clusterstatus for a testcollection with replication factor 2.
> 
> "core_node91":{
>"core":"testcollection_shard22_replica_n84",
>    "base_url":"http://host:8080/solr",
>"node_name":"host:8080_solr",
>"state":"active",
>"type":"NRT",
>"leader":"true"}
> 
> Along the same lines, when creating the testcollection with 200 shards and
> replication factor of 2, I am also getting nodes that have negative numbers
> assigned to them which looks a lot like an int overflow issue. From the
> cluster status:
> 
>  "shard157":{
>"range":"47ae-48f4",
>"state":"active",
>"replicas":{
>  "core_node1675945628":{
>    "core":"testcollection_shard157_replica_n-1174535610",
>    "base_url":"http://host1:8080/solr",
>"node_name":"host1:8080_solr",
>"state":"active",
>"type":"NRT"},
>  "core_node1642259614":{
>    "core":"testcollection_shard157_replica_n-1208090040",
>    "base_url":"http://host2:8080/solr",
>"node_name":"host2:8080_solr",
>"state":"active",
>"type":"NRT",
>"leader":"true"}}}
> 
> This keeps happening even when the collection is successfully deleted (no
> directories or files left on disk), the entire cluster is shutdown, and the
> zookeeper chroot path cleared out of all content. The only thing that
> happened prior to this cycle was a single failed collection creation which
> seemed to clean itself up properly, after which everything was shutdown and
> cleaned from zookeeper as well.
> 
> Is there something else that is keeping track of those values that wasn't
> cleared out? Or is this now the expected behavior for the numerical
> assignments to replicas?
> 
> Thanks,
> Chris





trivia question: why q=*:* doesn't return same result as q.alt=*:*

2018-01-04 Thread Nawab Zada Asad Iqbal
Hi,

In my SearchHandler solrconfig, I have q.alt=*:*. This allows me to run
queries which only have `fq` filters and no `q`.

If I remove q.alt from the solrconfig and specify `q=*:*` in the query
parameters, it does not give any results. I also tried `q=*` but to no
avail.

Is there some good reason for this behavior? Since I already know a
workaround, this question is only for my curiosity.
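
For context, the kind of request I mean is along these lines (the collection
name is from my setup):

# with q.alt=*:* in the handler defaults, this fq-only request returns results
curl "http://localhost:8983/solr/filesearch/select?fq=id:1193"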


Thanks
Nawab


RE: problem with Solr Sorting by score and distance together

2018-01-04 Thread Deepak Udapudi
Hi Shawn,

Thanks for the response.

In the email below, I used a hypothetical example for my query.

Actually, we are trying to search for a name and specialty combination (for
example, "paul orthodontist") for a dentist, sorted by the highest score and
then by distance (when the same dentists match the free-text criteria).

Below are the Solr logs.

2018-01-05 00:13:05.835 INFO  (qtp1348949648-14) [   x:provider_collection] 
o.a.s.c.S.Request [provider_collection]  webapp=/solr path=/select 
params={q=distance:{!geofilt+sfield%3Dlocation+pt%3D37.564143,-122.004179+d%3D60.0}&fl=*,distance:mul(geodist(location,37.5641425,-122.004179),0.621371)&start=0&fq=facilityName:"orthodontist"+OR+facilityName:*orthodontist*+OR+facilityName:"paul"+OR+facilityName:*paul*+OR+facilityName:*paul+orthodontist*+OR+facilityName:"paul+orthodontist"+OR+firstName:"orthodontist"+OR+firstName:*orthodontist*+OR+firstName:"paul"+OR+firstName:*paul*+OR+firstName:*paul+orthodontist*+OR+firstName:"paul+orthodontist"+OR+fullName:"orthodontist"+OR+fullName:*orthodontist*+OR+fullName:"paul"+OR+fullName:*paul*+OR+fullName:*paul+orthodontist*+OR+fullName:"paul+orthodontist"+OR+groupPracticeNpi:"orthodontist"+OR+groupPracticeNpi:*orthodontist*+OR+groupPracticeNpi:"paul"+OR+groupPracticeNpi:*paul*+OR+groupPracticeNpi:*paul+orthodontist*+OR+groupPracticeNpi:"paul+orthodontist"+OR+keywords:"orthodontist"+OR+keywords:*orthodontist*+OR+keywords:"paul"+OR+keywords:*paul*+OR+keywords:*paul+orthodontist*+OR+keywords:"paul+orthodontist"+OR+lastName:"orthodontist"+OR+lastName:*orthodontist*+OR+lastName:"paul"+OR+lastName:*paul*+OR+lastName:*paul+orthodontist*+OR+lastName:"paul+orthodontist"+OR+licenseNumber:"orthodontist"+OR+licenseNumber:*orthodontist*+OR+licenseNumber:"paul"+OR+licenseNumber:*paul*+OR+licenseNumber:*paul+orthodontist*+OR+licenseNumber:"paul+orthodontist"+OR+npi:"orthodontist"+OR+npi:*orthodontist*+OR+npi:"paul"+OR+npi:*paul*+OR+npi:*paul+orthodontist*+OR+npi:"paul+orthodontist"+OR+officeName:"orthodontist"+OR+officeName:*orthodontist*+OR+officeName:"paul"+OR+officeName:*paul*+OR+officeName:*paul+orthodontist*+OR+officeName:"paul+orthodontist"+OR+practiceLocationLanguages:"orthodontist"+OR+practiceLocationLanguages:*orthodontist*+OR+practiceLocationLanguages:"paul"+OR+practiceLocationLanguages:*paul*+OR+practiceLocationLanguages:*paul+orthodontist*+OR+practiceLocationLanguages:"paul+orthodontist"+OR+practiceLocationNpi:"orthodontist"+OR+practiceLocationNpi:*orthodontist*+OR+practiceLocationNpi:"paul"+OR+practiceLocationNpi:*paul*+OR+practiceLocationNpi:*paul+orthodontist*+OR+practiceLocationNpi:"paul+orthodontist"+OR+providerLanguages:"orthodontist"+OR+providerLanguages:*orthodontist*+OR+providerLanguages:"paul"+OR+providerLanguages:*paul*+OR+providerLanguages:*paul+orthodontist*+OR+providerLanguages:"paul+orthodontist"+OR+specialty:"orthodontist"+OR+specialty:*orthodontist*+OR+specialty:"paul"+OR+specialty:*paul*+OR+specialty:*paul+orthodontist*+OR+specialty:"paul+orthodontist"&sort=geodist(location,37.564143,-122.004179)+asc,score+desc&rows=10&wt=javabin&version=2}
 hits=577 status=0 QTime=284

2018-01-05 00:13:06.886 INFO  (qtp1348949648-17) [   x:provider_collection] 
o.a.s.c.S.Request [provider_collection]  webapp=/solr path=/admin/ping 
params={wt=javabin&version=2} hits=304592 status=0 QTime=0
2018-01-05 00:13:06.886 INFO  (qtp1348949648-17) [   x:provider_collection] 
o.a.s.c.S.Request [provider_collection]  webapp=/solr path=/admin/ping 
params={wt=javabin&version=2} status=0 QTime=0
2018-01-05 00:13:06.888 INFO  (qtp1348949648-16) [   x:provider_collection] 
o.a.s.c.S.Request [provider_collection]  webapp=/solr path=/admin/ping 
params={wt=javabin&version=2} hits=304592 status=0 QTime=0
2018-01-05 00:13:06.888 INFO  (qtp1348949648-16) [   x:provider_collection] 
o.a.s.c.S.Request [provider_collection]  webapp=/solr path=/admin/ping 
params={wt=javabin&version=2} status=0 QTime=0
2018-01-05 00:13:06.891 INFO  (qtp1348949648-19) [   x:yelp_collection] 
o.a.s.c.S.Request [yelp_collection]  webapp=/solr path=/admin/ping 
params={wt=javabin&version=2} hits=13 status=0 QTime=0
2018-01-05 00:13:06.891 INFO  (qtp1348949648-19) [   x:yelp_collection] 
o.a.s.c.S.Request [yelp_collection]  webapp=/solr path=/admin/ping 
params={wt=javabin&version=2} status=0 QTime=0

Please go through the above logs and recommend how we should proceed.

Regards,
Deepak

-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: Thursday, January 04, 2018 5:17 AM
To: solr-user@lucene.apache.org
Subject: Re: problem with Solr Sorting by score and distance together

On 1/3/2018 6:16 PM, Deepak Udapudi wrote:
> Assume that,  I am searching for car care centers. Solr collection has the 
> data for all the major car care centers. As an example I search for Firestone 
> car care centers in a 5 miles radius. In the search results I am supposed to 
> receive the firestone car care centers list wi

Re: Negative Core Node Numbers

2018-01-04 Thread Chris Ulicny
Thanks Anshum,

They don't seem to be consistently numbered on any particular collection
creation, but the same numbers will be reused (eventually). After about 3
or 4 tries, I got the same numbered replica on the same machine, so
something is being cleared out. The numbers are never consecutive though,
they start around 1, seem to be relatively sequential with gaps until about
120 or so, and then are all over the place. One other thing that seems to
be consistent on each new collection: the numbers at the end of
"core_node#" never appear as the number at the end of
"testcollection_shard1_replica_n#". Parts of the cluster state are below.

 "shard1":{
    "range":"8000-8146", "state":"active", "replicas":{
      "core_node2":{"core":"testcollection_shard1_replica_n1",
        "base_url":"http://host5:8080/solr", "node_name":"host5:8080_solr",
        "state":"active", "type":"NRT", "leader":"true"},
      "core_node4":{"core":"testcollection_shard1_replica_n3",
        "base_url":"http://host3:8080/solr", "node_name":"host3:8080_solr",
        "state":"active", "type":"NRT"}}},
 "shard2":{
    "range":"8147-828e", "state":"active", "replicas":{
      "core_node6":{"core":"testcollection_shard2_replica_n5",
        "base_url":"http://host1:8080/solr", "node_name":"host1:8080_solr",
        "state":"active", "type":"NRT"},
      "core_node8":{"core":"testcollection_shard2_replica_n7",
        "base_url":"http://host2:8080/solr", "node_name":"host2:8080_solr",
        "state":"active", "type":"NRT", "leader":"true"}}}
...
 "shard170":{
    "range":"5851-5998", "state":"active", "replicas":{
      "core_node800109264":{"core":"testcollection_shard170_replica_n-2046950790",
        "base_url":"http://host2:8080/solr", "node_name":"host2:8080_solr",
        "state":"active", "type":"NRT", "leader":"true"},
      "core_node766423250":{"core":"testcollection_shard170_replica_n-2080505220",
        "base_url":"http://host4:8080/solr", "node_name":"host4:8080_solr",
        "state":"active", "type":"NRT"}}}
...

Is there a way to view the counter in a deployed environment, or is it only
accessible through debugging Solr?

The setup I've been trying was 200 shards with 2 replicas each, but trying
to create a collection with 1 shard and 200 replicas of it results in the
same situation with abnormal numbers.

A few other details on the setup: 5 solr nodes (v7.1.0), 3 zookeeper nodes
(v3.4.11), Ubuntu 16.04, all hosts (zk & solr) are machines in Google's
Cloud environment.


On Thu, Jan 4, 2018 at 5:53 PM Anshum Gupta  wrote:

> Hi Chris,
>
> The core node numbers should be cleared out when the collection is
> deleted. Is that something you see consistently ?
>
> P.S: I just tried creating a collection with 1 shard and 200 replicas and
> saw the core node numbers as expected. On deleting and recreating the
> collection, I saw that the counter was reset. Just to be clear, I tried
> this on master.
>
> -Anshum
>
>
>
> On Jan 4, 2018, at 12:16 PM, Chris Ulicny  wrote:
>
> Hi,
>
> In 7.1, how does solr determine the numbers that are assigned to the
> replicas? I'm familiar with the earlier naming conventions from 6.3, but I
> wanted to know if there was supposed to be any connection between the
> "_n##" suffix and the number assigned to the "core_node##" name since they
> don't seem to follow the old convention. As an example node from
> clusterstatus for a testcollection with replication factor 2.
>
> "core_node91":{
>"core":"testcollection_shard22_replica_n84",
>    "base_url":"http://host:8080/solr",
>"node_name":"host:8080_solr",
>"state":"active",
>"type":"NRT",
>"leader":"true"}
>
> Along the same lines, when creating the testcollection with 200 shards and
> replication factor of 2, I am also getting nodes that have negative numbers
> assigned to them which looks a lot like an int overflow issue. From the
> cluster status:
>
>  "shard157":{
>"range":"47ae-48f4",
>"state":"active",
>"replicas":{
>  "core_node1675945628":{
>    "core":"testcollection_shard157_replica_n-1174535610",
>    "base_url":"http://host1:8080/solr",
>"node_name":"host1:8080_solr",
>"state":"active",
>"type":"NRT"},
>  "core_node1642259614":{
>    "core":"testcollection_shard157_replica_n-1208090040",
>    "base_url":"http://host2:8080/solr",
>"node_name":"host2:8080_solr",
>"state":"active",
>"type":"NRT",
>"leader":"true"}}}
>
> This keeps happening even when the collection is successfully deleted (no
> directories or files left on disk), the entire cluster is shutdown, and the
> zookeeper chroot path cleared out of all content. The only thing that
> happened prior to this cycle was a single failed collection creation which
> s

Re: trivia question: why q=*:* doesn't return same result as q.alt=*:*

2018-01-04 Thread Erick Erickson
Hmm, seems odd. What happens when you attach &debug=query? I'm curious how
the parsed queries differ.

On Jan 4, 2018 15:14, "Nawab Zada Asad Iqbal"  wrote:

> Hi,
>
> In my SearchHandler solrconfig, I have q.alt=*:*. This allows me to run
> queries which only have `fq` filters and no `q`.
>
> If I remove q.alt from the solrconfig and specify `q=*:*` in the query
> parameters, it does not give any results. I also tried `q=*` but to no
> avail.
>
> Is there some good reason for this behavior? Since I already know a
> workaround, this question is only for my curiosity.
>
>
> Thanks
> Nawab
>


Re: Solrcloud with Master/Slave

2018-01-04 Thread Erick Erickson
As I said before, you _are_ using ZooKeeper by starting your servers with
the -cloud option. Just leave that off and you won't be, and then you can use
master/slave freely.

Best,
Erick

On Thu, Jan 4, 2018 at 12:21 PM, Sundaram, Dinesh <
dinesh.sunda...@mastercard.com> wrote:

> OK, thanks for your valuable reply. I want to see the admin console so that I
> can monitor the collection details; that is the reason for going to cloud mode.
> But here I need replication without ZooKeeper, so I had to choose regular
> master/slave replication. Am I mixing two different sync-up procedures, or is
> this also okay?
>
>
> Dinesh Sundaram
> MBS Platform Engineering
>
> Mastercard
>
>
>
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Thursday, January 4, 2018 2:06 PM
> To: solr-user 
> Subject: Re: Solrcloud with Master/Slave
>
> Yes you do use ZooKeeper. Starting Solr with the -cloud option but
> _without_ ZK_HOST defined (or the -z parameter) starts an internal
> ZooKeeper on port
> 9983 (by default). This is evidenced by the fact that the admin UI has a
> "cloud" link along the left. In essence you have two separate clusters,
> each cluster just happens to exist on the same machine.
>
> Why bother with SolrCloud? Just configure old-style master/slave.
> SolrCloud is buying you nothing and running internal ZooKeepers is
> consuming resources for no good purpose.
>
> SolrCloud would help you if you set up a proper cluster with ZooKeeper and
> just had both of your nodes in the same cluster, one with replicas. That
> buys you HA/DR, NRT on both leader and follower etc.
>
> Up to you of course, but it's really hard to see the purpose of running
> the way you are.
>
> Best,
> Erick
>
> On Thu, Jan 4, 2018 at 11:38 AM, Sundaram, Dinesh <
> dinesh.sunda...@mastercard.com> wrote:
>
> > I want to keep both collections in sync always. This is really working
> > fine without any issue so far. My problem is pretty straightforward.
> >
> > I'm starting two solr instances on two servers using the below
> > command. I believe this command is for solrcloud mode. If so then I
> > have that shared replication handler config also in my
> > _default/solrconfig.xml on one instance so that the slave instance
> > will sync with the master. I don't use zookeeper at all. Just the
> > replication handler setting in solrconfig.xml. Is this good for the long
> > term? If not please help me understand the issues.
> >
> > bin/solr start -cloud -p 8983 -noprompt
> >
> >
> >
> > Dinesh Sundaram
> > MBS Platform Engineering
> >
> > Mastercard
> >
> >
> >
> > -Original Message-
> > From: Erick Erickson [mailto:erickerick...@gmail.com]
> > Sent: Thursday, January 4, 2018 10:10 AM
> > To: solr-user 
> > Subject: Re: Solrcloud with Master/Slave
> >
> > Whoa. I don't think you should be doing this at all. This really
> > appears to be an XY problem. You're asking "how to do X" without
> > telling us what the problem you're trying to solve is (the Y). _Why_
> > do you want to set things up this way? A one-time synchronization or
> > to keep both collections in sync?
> >
> >
> > Cross Data Center Replication (CDCR) is designed to keep two separate
> > collections in sync on an ongoing basis.
> >
> > If this is a one-time deal, you can manually issue a replication API
> > "fetchindex" command. What I'd do in that case is set up your
> > collection B with each shard having exactly one replica (i.e. a leader
> > and no followers). Do the fetch and verify that your new collection is
> > as you want it then ADDREPLICA to build out your redundancy.
> >
> > Best,
> > Erick
> >
> > On Thu, Jan 4, 2018 at 8:01 AM, Sundaram, Dinesh <
> > dinesh.sunda...@mastercard.com> wrote:
> > > Thanks Shawn for your prompt response. Assume I have solrcloud A server
> > > with one node running on port 8983 and solrcloud B server with one node
> > > running on 8983; here I want to sync the collection between solrcloud A
> > > and B using the below replication handler. Is this advisable to use on
> > > solrcloud B?
> > >
> > > <requestHandler name="/replication" class="solr.ReplicationHandler">
> > >   <lst name="slave">
> > >     <str name="masterUrl">http://solrcloudA:8983/solr/${solr.core.name}/replication</str>
> > >     <str name="pollInterval">00:00:20</str>
> > >   </lst>
> > > </requestHandler>
> > >
> > >
> > >
> > > Dinesh Sundaram
> > > MBS Platform Engineering
> > >
> > > Mastercard
> > >
> > >
> > >
> > > -Original Message-
> > > From: Shawn Heisey [mailto:apa...@elyograg.org]
> > > Sent: Tuesday, January 2, 2018 5:33 PM
> > > To: solr-user@lucene.apache.org
> > > Subject: Re: Solrcloud with Master/Slave
> > >
> > > On 1/2/2018 3:32 PM, Sundaram, Dinesh wrote:
> > >> I have spun up single solrcloud node on 2 servers.
> > >
> > > This makes no sense.  If you have two servers, then you probably
> > > have
> > more than 

Re: trivia question: why q=*:* doesn't return same result as q.alt=*:*

2018-01-04 Thread Nawab Zada Asad Iqbal
Thanks Erik
Here is the output,

http://localhost:8983/solr/filesearch/select?fq=id:1193&q.alt=*:*&debugQuery=true


   - parsedquery: "+MatchAllDocsQuery(*:*)",



http://localhost:8983/solr/filesearch/select?fq=id:1193&q=*:*&debugQuery=true


   - parsedquery: "+DisjunctionMaxQuery((user_email:*:* | user_name:*:* |
   tags:*:* | (name_shingle_zh-cn:, , name_shingle_zh-cn:, ,) | id:*:*)~0.01)
   DisjunctionMaxQuery(((name_shingle_zh-cn:", , , ,"~100)^100.0 |
   tags:*:*)~0.01)",



I find it perplexing, as the default values for qf and pf are very different
from the above, so I am not sure where these fields are coming from (although
they are all valid fields). E.g., the following query uses my expected set of
pf and qf.

http://localhost:8983/solr/filesearch/select?fq=id:1193&q=hello&debugQuery=true



   - parsedquery: "+DisjunctionMaxQuery(((name_token:hello)^60.0 |
   user_email:hello | (name_combined:hello)^10.0 | (name_zh-cn:hello)^10.0 |
   name_shingle:hello | comments:hello | user_name:hello | description:hello |
   file_content_zh-cn:hello | file_content_de:hello | tags:hello |
   file_content_it:hell | file_content_fr:hello | file_content_es:hell |
   file_content_en:hello | id:hello)~0.01)
   DisjunctionMaxQuery((description:hello | (name_shingle:hello)^100.0 |
   comments:hello | tags:hello)~0.01)",





On Thu, Jan 4, 2018 at 5:22 PM, Erick Erickson 
wrote:

> Hmm, seems odd. What happens when you attach &debug=query? I'm curious how
> the parsed queries differ.
>
> On Jan 4, 2018 15:14, "Nawab Zada Asad Iqbal"  wrote:
>
> > Hi,
> >
> > In my SearchHandler solrconfig, I have q.alt=*:*. This allows me to run
> > queries which only have `fq` filters and no `q`.
> >
> > If I remove q.alt from the solrconfig and specify `q=*:*` in the query
> > parameters, it does not give any results. I also tried `q=*` but to no
> > avail.
> >
> > Is there some good reason for this behavior? Since I already know a
> > workaround, this question is only for my curiosity.
> >
> >
> > Thanks
> > Nawab
> >
>


Re: trivia question: why q=*:* doesn't return same result as q.alt=*:*

2018-01-04 Thread Erik Hatcher
defType=???  Probably dismax.  It doesn’t do *:* like edismax or lucene.  
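
A quick way to see the difference (paths as in the examples above):

# dismax treats *:* as literal terms spread across qf/pf -- no match-all
curl "http://localhost:8983/solr/filesearch/select?defType=dismax&q=*:*"

# the lucene parser (and edismax) understands *:* as match-all
curl "http://localhost:8983/solr/filesearch/select?defType=lucene&q=*:*"

q.alt, by contrast, is parsed with the standard parser regardless of defType,
which is why *:* works there.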

> On Jan 4, 2018, at 20:39, Nawab Zada Asad Iqbal  wrote:
> 
> Thanks Erik
> Here is the output,
> 
> http://localhost:8983/solr/filesearch/select?fq=id:1193&q.alt=*:*&debugQuery=true
> 
> 
>   - parsedquery: "+MatchAllDocsQuery(*:*)",
> 
> 
> 
> http://localhost:8983/solr/filesearch/select?fq=id:1193&q=*:*&debugQuery=true
> 
> 
>   - parsedquery: "+DisjunctionMaxQuery((user_email:*:* | user_name:*:* |
>   tags:*:* | (name_shingle_zh-cn:, , name_shingle_zh-cn:, ,) | id:*:*)~0.01)
>   DisjunctionMaxQuery(((name_shingle_zh-cn:", , , ,"~100)^100.0 |
>   tags:*:*)~0.01)",
> 
> 
> 
> I find it perplexing, as the default values for qf and pf are very different
> from the above, so I am not sure where these fields are coming from (although
> they are all valid fields). E.g., the following query uses my expected set of
> pf and qf.
> 
> http://localhost:8983/solr/filesearch/select?fq=id:1193&q=hello&debugQuery=true
> 
> 
> 
>   - parsedquery: "+DisjunctionMaxQuery(((name_token:hello)^60.0 |
>   user_email:hello | (name_combined:hello)^10.0 | (name_zh-cn:hello)^10.0 |
>   name_shingle:hello | comments:hello | user_name:hello | description:hello |
>   file_content_zh-cn:hello | file_content_de:hello | tags:hello |
>   file_content_it:hell | file_content_fr:hello | file_content_es:hell |
>   file_content_en:hello | id:hello)~0.01)
>   DisjunctionMaxQuery((description:hello | (name_shingle:hello)^100.0 |
>   comments:hello | tags:hello)~0.01)",
> 
> 
> 
> 
> 
> On Thu, Jan 4, 2018 at 5:22 PM, Erick Erickson 
> wrote:
> 
>> Hmm, seems odd. What happens when you attach &debug=query? I'm curious how
>> the parsed queries differ.
>> 
>>> On Jan 4, 2018 15:14, "Nawab Zada Asad Iqbal"  wrote:
>>> 
>>> Hi,
>>> 
>>> In my SearchHandler solrconfig, I have q.alt=*:*. This allows me to run
>>> queries which only have `fq` filters and no `q`.
>>> 
>>> If I remove q.alt from the solrconfig and specify `q=*:*` in the query
>>> parameters, it does not give any results. I also tried `q=*` but to no
>>> avail.
>>> 
>>> Is there some good reason for this behavior? Since I already know a
>>> workaround, this question is only for my curiosity.
>>> 
>>> 
>>> Thanks
>>> Nawab
>>> 
>> 


CommonGramsFilter

2018-01-04 Thread Nawab Zada Asad Iqbal
Hi,

I am looking at this documentation and wondering if it would be better to
optionally skip indexing of original stopwords.

https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html#FilterDescriptions-CommonGramsFilter

http://localhost:8983/solr/filesearch/select?q=not%20to%20or%20be&debugQuery=true


   - parsedquery: "+(-DisjunctionMaxQuery((commongram_field2:to)~0.01)
   DisjunctionMaxQuery((commongram_field2:be)~0.01))~1",



Other parameters are:


   - params: {
  - mm: " 1<-0% ",
  - q.alt: "*:*",
  - ps: "100",
  - echoParams: "all",
  - sort: "score desc",
  - rows: "35",
  - version: "2.2",
  - q: "not to or be",
  - tie: "0.01",
  - defType: "edismax",
  - qf: "commongram_field2",
  - sow: "false",
  - wt: "json",
  - debugQuery: "true"
  }


And it doesn't match my document, which has the following fields:


   - id: "9191",
   - commongram_field2: "not to or be",



Commongram is defined as:

<field name="commongram_field2" type="text_commongram" indexed="true"
stored="true" omitPositions="false"/>

<fieldType name="text_commongram" class="solr.TextField"
positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
generateNumberParts="1" catenateWords="0" catenateNumbers="0"
catenateAll="0" preserveOriginal="0" splitOnCaseChange="1"
splitOnNumerics="1" stemEnglishPossessive="0"/>
    <filter class="solr.PatternReplaceFilterFactory"
pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$" replacement="$2"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt"
ignoreCase="true"/>
    <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="1"
consumeAllTokens="false"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
generateNumberParts="1" catenateWords="0" catenateNumbers="0"
catenateAll="0" preserveOriginal="0" splitOnCaseChange="1"
splitOnNumerics="1" stemEnglishPossessive="0"/>
    <filter class="solr.PatternReplaceFilterFactory"
pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$" replacement="$2"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.CommonGramsQueryFilterFactory" words="stopwords.txt"
ignoreCase="true"/>
  </analyzer>
</fieldType>


I am not sure what I am missing. I have also set sow=false so that the
whole query string is sent to the field's analysis chain instead of being
sent word by word. But that didn't seem to help.
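
For what it's worth, the analysis chain can be inspected directly with the
field analysis handler, something like:

curl "http://localhost:8983/solr/filesearch/analysis/field?analysis.fieldname=commongram_field2&analysis.fieldvalue=not%20to%20or%20be&analysis.query=not%20to%20or%20be&wt=json"

which shows what CommonGramsFilter emits at index time versus query time.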

Thanks
Nawab


Re: CommonGramsFilter

2018-01-04 Thread Nawab Zada Asad Iqbal
After some debugging, it seems that the search works if the query is a
phrase search (i.e., enclosed in quotes):

http://localhost:8983/solr/filesearch/select?q=%22not%20to%20or%20be%22&debugQuery=true

This works in both cases, sow=true and sow=false.

Is it mandatory to use phrase search to properly pass the stopwords to the
CommonGramsFilter?





On Thu, Jan 4, 2018 at 6:08 PM, Nawab Zada Asad Iqbal 
wrote:

> Hi,
>
> I am looking at this documentation and wondering if it would be better to
> optionally skip indexing of original stopwords.
>
> https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html#FilterDescriptions-CommonGramsFilter
>
> http://localhost:8983/solr/filesearch/select?q=not%20to%20or%20be&debugQuery=true
>
>
>- parsedquery: "+(-DisjunctionMaxQuery((commongram_field2:to)~0.01)
>DisjunctionMaxQuery((commongram_field2:be)~0.01))~1",
>
>
>
> Other parameters are:
>
>
>- params: {
>   - mm: " 1<-0% ",
>   - q.alt: "*:*",
>   - ps: "100",
>   - echoParams: "all",
>   - sort: "score desc",
>   - rows: "35",
>   - version: "2.2",
>   - q: "not to or be",
>   - tie: "0.01",
>   - defType: "edismax",
>   - qf: "commongram_field2",
>   - sow: "false",
>   - wt: "json",
>   - debugQuery: "true"
>   }
>
>
> And it doesn't match my document, which has the following fields:
>
>
>- id: "9191",
>- commongram_field2: "not to or be",
>
>
>
> Commongram is defined as:
>
> <field name="commongram_field2" type="text_commongram" indexed="true"
> stored="true" omitPositions="false"/>
>
> <fieldType name="text_commongram" class="solr.TextField"
> positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0" preserveOriginal="0"
> splitOnCaseChange="1" splitOnNumerics="1" stemEnglishPossessive="0"/>
>     <filter class="solr.PatternReplaceFilterFactory"
> pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$" replacement="$2"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.CommonGramsFilterFactory"
> words="stopwords.txt" ignoreCase="true"/>
>     <filter class="solr.LimitTokenCountFilterFactory"
> maxTokenCount="1" consumeAllTokens="false"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0" preserveOriginal="0"
> splitOnCaseChange="1" splitOnNumerics="1" stemEnglishPossessive="0"/>
>     <filter class="solr.PatternReplaceFilterFactory"
> pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$" replacement="$2"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.CommonGramsQueryFilterFactory"
> words="stopwords.txt" ignoreCase="true"/>
>   </analyzer>
> </fieldType>
>
>
> I am not sure what I am missing. I have also set sow=false so that the
> whole query string is sent to the field's analysis chain instead of being
> sent word by word. But that didn't seem to help.
>
> Thanks
> Nawab
>


Re: problem with Solr Sorting by score and distance together

2018-01-04 Thread Susheel Kumar
Hi Deepak, as Shawn mentioned, switch your q and fq values above, like:

q=facilityName:"orthodontist"+OR+facilityName:*orthodontist*
+OR+facilityName:"paul"+OR+facilityName:*paul*+OR+facilityName:*paul+
orthodontist*+OR+facilityName:"paul+orthodontist"+OR+
firstName:"orthodontist"+OR+firstName:*orthodontist*+OR+
firstName:"paul"+OR+firstName:*paul*+OR+firstName:*paul+
orthodontist*+OR+firstName:..
...&fq={!geofilt+sfield%3Dlocation+pt%3D37.564143,-122.004179+d%3D60.0}
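
In curl form, a trimmed-down sketch of that swap (host, port, and just two of
the original clauses shown; the point and field names are from the log above):

curl "http://host:8983/solr/provider_collection/select" \
  --data-urlencode 'q=fullName:"paul orthodontist" OR specialty:*orthodontist*' \
  --data-urlencode 'fq={!geofilt sfield=location pt=37.564143,-122.004179 d=60.0}' \
  --data-urlencode 'sort=score desc,geodist(location,37.564143,-122.004179) asc'

with the sort flipped to score first, if score-then-distance is the intent.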

Also, looking at your query, you would be better off using a catch-all field
when you are trying to find the same text in multiple fields; a sketch follows.
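
Via the Schema API, that could look like this (the field and type names here
are placeholders):

curl -X POST -H 'Content-Type: application/json' \
  "http://host:8983/solr/provider_collection/schema" \
  --data-binary '{
    "add-field": {"name":"catch_all", "type":"text_general", "multiValued":true, "indexed":true, "stored":false},
    "add-copy-field": {"source":"*", "dest":"catch_all"}
  }'

after which the free-text part of the query collapses to something like
q=catch_all:(paul orthodontist).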

Thnx


On Thu, Jan 4, 2018 at 7:26 PM, Deepak Udapudi  wrote:

> Hi Shawn,
>
> Thanks for the response.
>
> In the email below, I used a hypothetical example for my query.
>
> Actually, we are trying to search for a name and specialty combination (for
> example, "paul orthodontist") for a dentist, sorted by the highest score and
> then by distance (when the same dentists match the free-text criteria).
>
> Below are the Solr logs.
>
> 2018-01-05 00:13:05.835 INFO  (qtp1348949648-14) [
>  x:provider_collection] o.a.s.c.S.Request [provider_collection]
> webapp=/solr path=/select params={q=distance:{!geofilt+
> sfield%3Dlocation+pt%3D37.564143,-122.004179+d%3D60.0}&
> fl=*,distance:mul(geodist(location,37.5641425,-122.
> 004179),0.621371)&start=0&fq=facilityName:"orthodontist"+
> OR+facilityName:*orthodontist*+OR+facilityName:"paul"+OR+
> facilityName:*paul*+OR+facilityName:*paul+orthodontist*+OR+facilityName:
> "paul+orthodontist"+OR+firstName:"orthodontist"+OR+
> firstName:*orthodontist*+OR+firstName:"paul"+OR+firstName:
> *paul*+OR+firstName:*paul+orthodontist*+OR+firstName:"
> paul+orthodontist"+OR+fullName:"orthodontist"+OR+
> fullName:*orthodontist*+OR+fullName:"paul"+OR+fullName:*
> paul*+OR+fullName:*paul+orthodontist*+OR+fullName:"paul+orthodontist"+OR+
> groupPracticeNpi:"orthodontist"+OR+groupPracticeNpi:*orthodontist*+OR+
> groupPracticeNpi:"paul"+OR+groupPracticeNpi:*paul*+OR+
> groupPracticeNpi:*paul+orthodontist*+OR+groupPracticeNpi:"paul+
> orthodontist"+OR+keywords:"orthodontist"+OR+keywords:*
> orthodontist*+OR+keywords:"paul"+OR+keywords:*paul*+OR+
> keywords:*paul+orthodontist*+OR+keywords:"paul+orthodontist"+OR+lastName:"
> orthodontist"+OR+lastName:*orthodontist*+OR+lastName:"
> paul"+OR+lastName:*paul*+OR+lastName:*paul+orthodontist*+
> OR+lastName:"paul+orthodontist"+OR+licenseNumber:"orthodontist"+
> OR+licenseNumber:*orthodontist*+OR+licenseNumber:"paul"+OR+
> licenseNumber:*paul*+OR+licenseNumber:*paul+orthodontist*+OR+
> licenseNumber:"paul+orthodontist"+OR+npi:"orthodontist"+OR+npi:*
> orthodontist*+OR+npi:"paul"+OR+npi:*paul*+OR+npi:*paul+
> orthodontist*+OR+npi:"paul+orthodontist"+OR+officeName:"
> orthodontist"+OR+officeName:*orthodontist*+OR+officeName:"
> paul"+OR+officeName:*paul*+OR+officeName:*paul+orthodontist*
> +OR+officeName:"paul+orthodontist"+OR+practiceLocationLanguages:"
> orthodontist"+OR+practiceLocationLanguages:*orthodontist*+OR+
> practiceLocationLanguages:"paul"+OR+practiceLocationLanguages:*paul*+OR+
> practiceLocationLanguages:*paul+orthodontist*+OR+
> practiceLocationLanguages:"paul+orthodontist"+OR+practiceLocationNpi:"
> orthodontist"+OR+practiceLocationNpi:*orthodontist*+OR+
> practiceLocationNpi:"paul"+OR+practiceLocationNpi:*paul*+OR+
> practiceLocationNpi:*paul+orthodontist*+OR+practiceLocationNpi:"paul+
> orthodontist"+OR+providerLanguages:"orthodontist"+OR+providerLanguages:*
> orthodontist*+OR+providerLanguages:"paul"+OR+providerLanguages:*paul*+OR+
> providerLanguages:*paul+orthodontist*+OR+providerLanguages:"paul+
> orthodontist"+OR+specialty:"orthodontist"+OR+specialty:*
> orthodontist*+OR+specialty:"paul"+OR+specialty:*paul*+OR+
> specialty:*paul+orthodontist*+OR+specialty:"paul+
> orthodontist"&sort=geodist(location,37.564143,-122.
> 004179)+asc,score+desc&rows=10&wt=javabin&version=2} hits=577 status=0
> QTime=284
>
> 2018-01-05 00:13:06.886 INFO  (qtp1348949648-17) [
>  x:provider_collection] o.a.s.c.S.Request [provider_collection]
> webapp=/solr path=/admin/ping params={wt=javabin&version=2} hits=304592
> status=0 QTime=0
> 2018-01-05 00:13:06.886 INFO  (qtp1348949648-17) [
>  x:provider_collection] o.a.s.c.S.Request [provider_collection]
> webapp=/solr path=/admin/ping params={wt=javabin&version=2} status=0 QTime=0
> 2018-01-05 00:13:06.888 INFO  (qtp1348949648-16) [
>  x:provider_collection] o.a.s.c.S.Request [provider_collection]
> webapp=/solr path=/admin/ping params={wt=javabin&version=2} hits=304592
> status=0 QTime=0
> 2018-01-05 00:13:06.888 INFO  (qtp1348949648-16) [
>  x:provider_collection] o.a.s.c.S.Request [provider_collection]
> webapp=/solr path=/admin/ping params={wt=javabin&version=2} status=0 QTime=0
> 2018-01-05 00:13:06.891 INFO  (qtp1348949648-19) [   x:yelp_collection]
> o.a.s.c.S.Request [yelp_collection]  webapp=/solr path=/admin/ping
> params={wt=javabin&version=2} hits=13 status=0 QTime=0
> 2018-01

Solr - custom ordering

2018-01-04 Thread Vineet Mangla
Hi,

We have a SolrCloud core where "jobid" is our primary key. We have a use
case where an external system holds a list of 15000 jobids in a particular
order. We are calling Solr with these 15000 jobids as a filter query, and in
the result we want all the matching jobids back in the same order as the
input. Is this possible in Solr?
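
For reference, a filter over a long id list is usually written with the terms
query parser, along these lines (the collection name is a placeholder):

curl "http://host:8983/solr/jobs/select" \
  --data-urlencode 'q=*:*' \
  --data-urlencode 'fq={!terms f=jobid}101,205,307'

This filters efficiently, but by itself it does not return results in the
order of the input list, so the re-ordering would normally happen on the
client side.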

Thanks & Regards

Vineet Mangla | Project Lead

BOLD Technology Systems Pvt. Ltd.
(formerly LiveCareer)
Aykon Tower, Plot No. 4, Sector - 135, Noida-201301
URL: www.bold.com | Cell: +91 (965) 088 0606