Re: No space left on device - When I execute suggester component.

2017-12-20 Thread Shawn Heisey

On 12/20/2017 12:21 AM, Fiz Newyorker wrote:

I tried df -h during the suggest.build command.

Size.   Used   Avail Use%  Mounted on

  63G   17G 44G  28% /ngs/app


That cannot be the entire output of that command.  Here's what I get 
when I do it:


root@smeagol:~# df -h
Filesystem  Size  Used Avail Use% Mounted on
udev 12G 0   12G   0% /dev
tmpfs   2.4G  251M  2.2G  11% /run
/dev/sda5   220G   15G  194G   8% /
tmpfs12G  412K   12G   1% /dev/shm
tmpfs   5.0M 0  5.0M   0% /run/lock
tmpfs12G 0   12G   0% /sys/fs/cgroup
/dev/sda147G  248M   45G   1% /boot
tmpfs   2.4G   84K  2.4G   1% /run/user/1000
tmpfs   2.4G 0  2.4G   0% /run/user/141
tmpfs   2.4G 0  2.4G   0% /run/user/0

If the disk has enough free space, then there is probably something else 
at work, like a filesystem quota for the user that is running Solr, or 
some other kind of limitation that has been configured.


Thanks,
Shawn


Keep indexed records

2017-12-20 Thread Shashi Roushan
Hello All,

I want to keep the indexed records live in Solr during data import when the
SQL query does not return any records. We also need clean=true, because when
the SQL query does return records, Solr should be reindexed. We only want to
avoid wiping the index when the SQL query returns no rows.

Please suggest.

Regards,
Shashi Roushan


Re: Keep indexed records

2017-12-20 Thread Emir Arnautović
Hi Shashi,
IMO it would be best if you put that logic in your controller, where you start 
the import. If you are doing that through the admin console, the only solution 
I am aware of is to write your own custom component.

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 20 Dec 2017, at 11:45, Shashi Roushan  wrote:
> 
> Hello All,
> 
> I want to keep the indexed records live in Solr during data import when the
> SQL query does not return any records. We also need clean=true, because when
> the SQL query does return records, Solr should be reindexed. We only want to
> avoid wiping the index when the SQL query returns no rows.
> 
> Please suggest.
> 
> Regards,
> Shashi Roushan
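
A minimal sketch of the controller-side gate Emir describes, in JDBC/SolrJ
terms (the JDBC URL, core name, count query and /dataimport handler path are
placeholders, not details from this thread):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.SolrRequest;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.request.GenericSolrRequest;
    import org.apache.solr.common.params.ModifiableSolrParams;

    public class GuardedImport {
        public static void main(String[] args) throws Exception {
            // Count the rows the import would read (placeholder query).
            long rows;
            try (Connection con = DriverManager.getConnection(
                         "jdbc:mysql://dbhost/app", "user", "pass");
                 Statement st = con.createStatement();
                 ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM source_table")) {
                rs.next();
                rows = rs.getLong(1);
            }
            if (rows == 0) {
                // Nothing to import: skip, so clean=true never wipes the index.
                System.out.println("Source empty; keeping existing index.");
                return;
            }
            // Rows exist, so a clean full-import is safe.
            try (SolrClient solr = new HttpSolrClient.Builder(
                         "http://localhost:8983/solr/mycore").build()) {
                ModifiableSolrParams p = new ModifiableSolrParams();
                p.set("command", "full-import");
                p.set("clean", "true");
                solr.request(new GenericSolrRequest(SolrRequest.METHOD.GET,
                        "/dataimport", p));
            }
        }
    }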



DocTransformer: Float cannot be cast to org.apache.lucene.document.StoredField

2017-12-20 Thread Markus Jelsma
Hello,

Recently I had to make yet another DocTransformer. It ran fine on my local 
machine, but this is what I get in production, on freshly reindexed data.

2017-12-20 12:12:58.987 ERROR (qtp329611835-17) [c:documents s:shard2 
r:core_node1 x:documents_shard2_replica2] o.a.s.s.HttpSolrCall 
null:java.lang.ClassCastException: java.lang.Float cannot be cast to 
org.apache.lucene.document.StoredField
at 
io.openindex.lunar.response.transform.ScoreNormalizingTransformer.transform(ScoreNormalizingTransformer.java:80)
at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:120)
at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:57)
at 
org.apache.solr.response.BinaryResponseWriter$Resolver.writeResultsBody(BinaryResponseWriter.java:126)
at 
org.apache.solr.response.BinaryResponseWriter$Resolver.writeResults(BinaryResponseWriter.java:145)
at 
org.apache.solr.response.BinaryResponseWriter$Resolver.resolve(BinaryResponseWriter.java:89)

It trips over this line. I need to get the float from the document.
 StoredField value = (StoredField)doc.get(field);

I have had this before with another DocTransformer; I never solved the 
problem, it just went away instead.

Any ideas?

Many many thanks,
Markus


Re: DocTransformer: Float cannot be cast to org.apache.lucene.document.StoredField

2017-12-20 Thread Emir Arnautović
Hi Markus,
You are trying to cast to StoredField without checking that the value actually 
is a StoredField. What you can do is first check whether it is a StoredField or 
a Float or… and cast to the appropriate type.

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



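A minimal sketch of the type check Emir describes, assuming a single-valued
numeric field (the helper name is made up for illustration):

    import org.apache.lucene.index.IndexableField;
    import org.apache.solr.common.SolrDocument;

    public class FieldValues {
        // doc.get(field) may return the raw value (e.g. a Float when it
        // comes from docValues) or a Lucene field object such as StoredField.
        static float floatValue(SolrDocument doc, String field) {
            Object raw = doc.get(field);
            if (raw instanceof Number) {            // plain Float/Double
                return ((Number) raw).floatValue();
            }
            if (raw instanceof IndexableField) {    // StoredField implements this
                return ((IndexableField) raw).numericValue().floatValue();
            }
            throw new IllegalStateException(
                    "Unexpected type for " + field + ": " + raw);
        }
    }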



Authentication Plugin

2017-12-20 Thread Chris Ulicny
Hi all,

We've got a solrcloud cluster set up on 6.3.0 with the BasicAuthentication
plugin enabled. All of the hosts are time synchronized using ntp and are on
the same network switch.

We're periodically experiencing issues where follower replicas are put into
down states by the leader in the case of requests that failed due to
invalid timestamps. To minimize the issue we've increased the pkiauth.ttl
value to 1, and that seems to have taken care of most of the
occurrences.

As vague as the question is, is there anything specific with solr that we
could look into that would affect the requests having invalid keys?

We are working on tracking ntp's performance in case there was some sort of
lapse, but everything we've seen puts the hosts within around 20
milliseconds of each other at worst.

Possibly related, but only noticed yesterday: a request for recovery was
sent from a leader to a follower replica, it didn't seem to have an
authorization header, and the wrong user was chosen.

2017-12-19 23:10:44.764 INFO  (qtp759156157-8224123) [   ]
o.a.s.s.RuleBasedAuthorizationPlugin This resource is configured to have a
permission {
  "name":"core-admin-edit",
  "role":"admin"}, The principal [principal: solrwriter] does not have the
right role
2017-12-19 23:10:44.765 INFO  (qtp759156157-8224123) [   ]
o.a.s.s.HttpSolrCall USER_REQUIRED auth header null context :
userPrincipal: [[principal: solrwriter]] type: [ADMIN], collections: [],
Path: [/admin/cores] path : /admin/cores params
:core=Feeds_shard11_replica2&action=REQUESTRECOVERY&wt=javabin&version=2

How does solr determine what user/authentication to use for inter-node
requests? Are there any of the predefined permissions that we shouldn't
have assigned to a user that are causing this?

Thanks,
Chris


RE: DocTransformer: Float cannot be cast to org.apache.lucene.document.StoredField

2017-12-20 Thread Markus Jelsma
Are you telling me that SolrDocument.get(key) can return either a StoredField or 
the actual class of the value?

The code ran fine locally. There I got a StoredField and had to use 
numericValue() to get my float.

Thanks,
Markus

 
 
-Original message-
> From:Emir Arnautović 
> Sent: Wednesday 20th December 2017 14:49
> To: solr-user@lucene.apache.org
> Subject: Re: DocTransformer: Float cannot be cast to 
> org.apache.lucene.document.StoredField
> 
> Hi Markus,
> You are trying to cast to StoredField without checking that the value actually 
> is a StoredField. What you can do is first check whether it is a StoredField or 
> a Float or… and cast to the appropriate type.
> 
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/


Re: DocTransformer: Float cannot be cast to org.apache.lucene.document.StoredField

2017-12-20 Thread Emir Arnautović
I did not check the code, but that is what the error suggests. Can you check 
whether the field definition is the same locally and on the other Solr? Since 
Solr can use docValues as stored, I would guess that it is not always a 
StoredField that is returned.

Regards,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 20 Dec 2017, at 15:52, Markus Jelsma  wrote:
> 
> Are you telling me that SolrDocument.get(key) can return either a StoredField or 
> the actual class of the value?
> 
> The code ran fine locally. There I got a StoredField and had to use 
> numericValue() to get my float.
> 
> Thanks,
> Markus



RE: DocTransformer: Float cannot be cast to org.apache.lucene.document.StoredField

2017-12-20 Thread Markus Jelsma
Ah of course, it worked before I enabled docValues for that field.

Got it working again!
Thanks!
 
-Original message-
> From:Emir Arnautović 
> Sent: Wednesday 20th December 2017 16:02
> To: solr-user@lucene.apache.org
> Subject: Re: DocTransformer: Float cannot be cast to 
> org.apache.lucene.document.StoredField
> 
> I did not check the code, but that is what the error suggests. Can you check 
> whether the field definition is the same locally and on the other Solr? Since 
> Solr can use docValues as stored, I would guess that it is not always a 
> StoredField that is returned.
> 
> Regards,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/


Re: Solr 7.1 Solrcloud dynamic/automatic replicas

2017-12-20 Thread Erick Erickson
The internal method is ZkController.generateNodeName(). Although it's
fairly simple, there are bunches of samples in ZkControllerTest.

But yeah, it requires that you know your hostname and port, and the
context is "solr".

On Tue, Dec 19, 2017 at 8:04 PM, Greg Roodt  wrote:
> Ok, thanks. I'll take a look into using the ADDREPLICA API.
>
> I've found a few examples of the znode format. It seems to be IP:PORT_solr
> (where I presume _solr is the name of the context or something?).
>
> Is there a way to discover what a znode is? i.e. Can my new node determine
> what its znode is? Or is my only option to use the IP:PORT_solr convention?
>
>
>
>
> On 20 December 2017 at 11:33, Erick Erickson 
> wrote:
>
>> Yes, ADDREPLICA is mostly equivalent, it's also supported going forward
>>
>> LegacyCloud should work temporarily, I'd change it going forward though.
>>
>> Finally, you'll want to add a "node" parameter to ensure your replica is
>> placed on the exact node you want, see the livenodes znode for the
>> format...
>>
>> On Dec 19, 2017 16:06, "Greg Roodt"  wrote:
>>
>> > Thanks for the reply. So it sounds like the method that I'm using to
>> > automatically add replicas on Solr 6.2 is not recommended and not going
>> > to be supported in future versions.
>> >
>> > A couple of follow up questions then:
>> > * Do you know if running with legacyCloud=true will make this behaviour
>> > work "for now" until I can find a better way of doing this?
>> > * Will it be enough for my newly added nodes to then startup solr (with
>> > correct ZK_HOST) and call the ADDREPLICA API as follows?
>> > ```
>> > curl http://localhost:port
>> > /solr/admin/collections?action=ADDREPLICA&collection=blah&shard=*shard1*
>> > ```
>> > That seems mostly equivalent to writing that core.properties file that I
>> > am using in 6.2
>> >
>> >
>> >
>> >
>> >
>> > On 20 December 2017 at 09:34, Shawn Heisey  wrote:
>> >
>> > > On 12/19/2017 3:06 PM, Greg Roodt wrote:
>> > > > Thanks for your reply Erick.
>> > > >
>> > > > This is what I'm doing at the moment with Solr 6.2 (I was mistaken,
>> > > > before I said 6.1).
>> > > >
>> > > > 1. A new instance comes online
>> > > > 2. Systemd starts solr with a custom start.sh script
>> > > > 3. This script creates a core.properties file that looks like this:
>> > > > ```
>> > > > name=blah
>> > > > shard=shard1
>> > > > ```
>> > > > 4. Script starts solr via the jar.
>> > > > ```
>> > > > java -DzkHost=... -jar start.jar
>> > > > ```
>> > >
>> > > The way that we would expect this to be normally done is a little
>> > > different.  Adding a node to the cloud normally will NOT copy any
>> > > indexes.  You have basically tricked SolrCloud into adding the replica
>> > > automatically by creating a core before Solr starts.  SolrCloud
>> > > incorporates the new core into the cluster according to the info that
>> > > you have put in core.properties, notices that it has no index, and
>> > > replicates it from the existing leader.
>> > >
>> > > Normally, what we would expect for adding a new node is this:
>> > >
>> > >  * Run the service installer script on the new machine
>> > >  * Add a ZK_HOST variable to /etc/default/solr.in.sh
>> > >  * Use "service solr restart" to get Solr to join the cloud
>> > >  * Call the ADDREPLICA action on the Collections API
>> > >
>> > > The reason that your method works is that currently, the "truth" about
>> > > the cluster is a mixture of what's in ZooKeeper and what's actually
>> > > present on each Solr instance.
>> > >
>> > > There is an effort to change this so that ZooKeeper is the sole source
>> > > of truth, and if a core is found that the ZK database doesn't know
>> > > about, it won't be started, because it's not a known part of the
>> > > cluster.  If this goal is realized in a future version of Solr, then
>> > > the method you're currently using is not going to work like it does at the
>> > > moment.  I do not know how much of this has been done, but I know that
>> > > there have been people working on it.
>> > >
>> > > Thanks,
>> > > Shawn
>> > >
>> > >
>> >
>>


Are the entries in managed-schema order dependent?

2017-12-20 Thread Michael Joyner

Hey all,

I'm wanting to update our managed-schemas to include the latest options 
available in the 6.6.2 branch. (point types for one)


I would like to be able to sort them and diff them (production vs dist 
supplied) to create a simple patch that can be reviewed, edited if 
necessary, and then applied to the production schemas.


I'm thinking this approach would be the least error-prone, but the 
schemas would need to be diffable, and I can only see that as doable if 
they are sorted so that common parts diff out. I only see this approach 
easily workable if the entries aren't order dependent. (Presuming I can 
get all the various schema settings to fit neatly on single lines...)


Or does there exist a list of schema entries added along different point 
releases?


-Mike/NewsRx



Re: Are the entries in managed-schema order dependent?

2017-12-20 Thread Erick Erickson
The schema is not order dependent, I freely mix-n-match the fieldType,
copyField and field definitions for instance.



On Wed, Dec 20, 2017 at 8:29 AM, Michael Joyner  wrote:
> Hey all,
>
> I'm wanting to update our managed-schemas to include the latest options
> available in the 6.6.2 branch. (point types for one)
>
> I would like to be able to sort them and diff them (production vs dist
> supplied) to create a simple patch that can be reviewed, edited if
> necessary, and then applied to the production schemas.
>
> I'm thinking this approach would be the least error-prone, but the
> schemas would need to be diffable, and I can only see that as doable if they
> are sorted so that common parts diff out. I only see this approach easily
> workable if the entries aren't order dependent. (Presuming I can get all the
> various schema settings to fit neatly on single lines...)
>
> Or does there exist a list of schema entries added along different point
> releases?
>
> -Mike/NewsRx
>


Build suggester in different directory (not /tmp).

2017-12-20 Thread Matthew Roth
Hi List,

I am building a few suggesters and I am receiving the error that I have no
space left on device.



No space left on device

java.io.IOException: No space left on device at
sun.nio.ch.FileDispatcherImpl.write0(Native Method) at
...



At first this threw me. df showed I had over 100 G free. The /data dir the
suggester is being constructed from is only 4G. On a subsequent run I
noticed that the suggester is first being built in /tmp. When setting up the
LVM I only allotted 2G to that directory and I prefer to keep it that
way. Is there a way to build the suggesters in an alternative dir? I am
not seeing anything in the documentation
(https://lucene.apache.org/solr/guide/6_6/suggester.html).

I should note that I am using solr 6.6.0

Best,
Matt


Re: No space left on device - When I execute suggester component.

2017-12-20 Thread Matthew Roth
Oh, this seems relevant to my recent post to the list. My problem is that
the suggesters are first being built in /tmp and moved to /var. /tmp has a
total of 2G free whereas /var has nearly 100G.

Perhaps you are running into the same problem I am in this regard? How does
your /tmp dir look when building?

Matt




Re: Build suggester in different directory (not /tmp).

2017-12-20 Thread Matthew Roth
I have an incomplete solution. I was trying to build three suggesters at
once. If I added the ?suggest.dictionary= parameter and built one at
a time it worked out fine. However, this means I will need to set
buildOnCommit and buildOnStartup to false. This is less than ideal.
Building in a different directory would still be preferable.


Best,
Matt



Re: Are the entries in managed-schema order dependent?

2017-12-20 Thread Michael Joyner

Thanks!


On 12/20/2017 11:37 AM, Erick Erickson wrote:

The schema is not order dependent, I freely mix-n-match the fieldType,
copyField and field definitions for instance.








Re: Are the entries in managed-schema order dependent?

2017-12-20 Thread Alexandre Rafalovitch
Actually, I think Solr does rearrange everything to its liking
(alphabetical?) when it rewrites managed-schema. So, if the
definitions are added via API, the order will be deterministic.

That's what I believe though, I can't remember testing it exhaustively
with physically rearranged types.

Regards,
   Alex

On 20 December 2017 at 11:37, Erick Erickson  wrote:
> The schema is not order dependent, I freely mix-n-match the fieldType,
> copyField and field definitions for instance.


Re: Build suggester in different directory (not /tmp).

2017-12-20 Thread Erick Erickson
bq: this means I will need to set buildOnCommit and buildOnStartup to false.

Be _very_ careful with these settings. Building your suggester can read the
stored field(s) from _every_ document in your index, which can
take a very long time (perhaps hours). You'd pay that penalty every time
you started Solr or committed docs. I almost guarantee that buildOnCommit
will be unsatisfactory.

This is one of those things that works fine for testing a small corpus but
can fall over when you scale up.

As for why the suggester gets built in /tmp, perhaps Mike McCandless has
magic to control that. Nice find, and thanks for sharing it!

Best,
Erick

On Wed, Dec 20, 2017 at 9:27 AM, Matthew Roth  wrote:
> I have an incomplete solution. I was trying to build three suggesters at
> once. If I added the ?suggest.dictionary= parameter and built one at
> a time it worked out fine. However, this means I will need to set
> buildOnCommit and buildOnStartup to false. This is less than ideal.
> Building in a different directory would still be preferable.
>
>
> Best,
> Matt


Re: No space left on device - When I execute suggester component.

2017-12-20 Thread Fiz Newyorker
Hi Shawn/Erick/Matt,

I agree with you.  When I execute the command df -h I am getting the
complete list of NFS mount info and size and available space.  I just
shared one line out of it.


One more thing I observed whenever I run suggest.build.

http://rn.com:8989/solr/LW_Data/suggest?suggest=true&suggest.build=true&suggest.dictionary=fuzzySuggester&wt=json&suggest.q=wills&&indent=on&suggest.cfq=memory


The following files are created in the tmp folder of the machine. This is
causing the issue.

2017-12-20 18:20:08.280 INFO  (qtp401424608-14) [   x:LW_Data]
o.a.s.h.c.SuggestComponent SuggestComponent prepare with :
suggest.build=true&indent=on&suggest.q=wills&suggest.count=10&suggest=true&suggest.dictionary=fuzzySuggester&wt=json

2017-12-20 18:20:08.280 INFO  (qtp401424608-14) [   x:LW_Data]
o.a.s.s.s.SolrSuggester SolrSuggester.build(fuzzySuggester)

2017-12-20 18:25:25.451 ERROR (qtp401424608-14) [   x:LW_Data]
o.a.s.h.RequestHandlerBase java.io.IOException: No space left on device

at sun.nio.ch.FileDispatcherImpl.write0(Native Method)

at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60)

since my tmp folder is just 2GB.


My question is: why does the AutoSuggest build command create files in the tmp
folder? Is this usual behavior?


-rw---. 1 razord   raz   554379579 Dec 20 18:40 suggester_input_0.tmp

-rw---. 1 razord   ran   107963828 Dec 20 18:40 suggester_sort_1.tmp

-rw---. 1 razord   raz   106162743 Dec 20 18:40 suggester_sort_2.tmp

-rw---. 1 razord   raz   105870752 Dec 20 18:40 suggester_sort_3.tmp

-rw---. 1 razord   raz   105851613 Dec 20 18:41 suggester_sort_4.tmp

-rw---. 1 razord   raw   87888769 Dec 20 18:41 suggester_sort_5.tmp



Thanks

FIZ.

On Wed, Dec 20, 2017 at 9:12 AM, Matthew Roth  wrote:

> Oh, this seems relevant to my recent post to the list. My problem is that
> the suggesters are first being built in /tmp and moved to /var. /tmp has a
> total of 2G free whereas /var has nearly 100G.
>
> Perhaps you are running into the same problem I am in this regard? How does
> your /tmp dir look when building?
>
> Matt


Re: Build suggester in different directory (not /tmp).

2017-12-20 Thread Shawn Heisey
On 12/20/2017 10:05 AM, Matthew Roth wrote:
> I am building a few suggesters and I am receiving the error that I have no
> space left on device.



> At first this threw me. df showed I had over 100 G free. the /data dir the
> suggester is being constructed from is only 4G. On a subsequent run I
> notice that the suggester is first being built in /tmp. When setting up the
> LVM I only allotted 2g's to that directory and I prefer to keep it that
> way.

The code is utilizing the "java.io.tmpdir" system property to determine
a temporary directory location to use for the build, before it is put in
the final location.  On POSIX platforms, this will default to /tmp.

If you are starting Solr manually, then you would just need to add the
following parameter to the bin/solr commandline (including the quotes)
to change this location:

-a "-Djava.io.tmpdir=/other/tmp/path"

If you've installed Solr as a service, then I do not think there's any
easy way to adjust this property, other than manually editing bin/solr
to add the -D option to the startup commandline.  We'll need an
enhancement issue in Jira to modify the script so it can set
java.io.tmpdir from an environment variable.
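
(One thing that may already work for a service install, though it is untested
here: bin/solr appends SOLR_OPTS to the JVM command line, so adding a line
like

    SOLR_OPTS="$SOLR_OPTS -Djava.io.tmpdir=/other/tmp/path"

to /etc/default/solr.in.sh is worth trying.)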

Note that adjusting this property may result in other things that Solr
creates being moved away from /tmp.

Since most POSIX operating systems will automatically delete old files
in /tmp, it's always possible that when you move Java's temp directory,
you'll end up with cruft in the new location that never gets deleted. 
Developers do generally try to clean up temporary files, but sometimes
things go wrong that weren't anticipated.  If that does happen and a
temporary file is created by Lucene/Solr that doesn't get deleted, then
I would consider that a bug that should be fixed.

On Windows systems, Java asks the OS where the temp directory is.  The
info I've found says that the TMP environment variable will override
this location for Windows, but not for other platforms.

Thanks,
Shawn



Re: No space left on device - When I execute suggester component.

2017-12-20 Thread Erick Erickson
It's kind of scary how often serendipity plays its part. See the thread titled:

"Build suggester in different directory (not /tmp)."

Which basically says that the suggester is being built in /tmp which
may be limited.

And yes, that's where it gets built by default, although the thread I
mentioned has some potential work-arounds and a suggestion to make this
configurable that would have to be implemented (you may see a Solr JIRA to
that effect generated soon).

Of course you could make your /tmp volume bigger, but that may not be easy.

Erick



Re: Filtering Solr pivot facet values

2017-12-20 Thread Arun Rangarajan
Hello Solr Gurus,

Sorry to bother you again on this. Is there no way in Solr to filter pivot
facets?
[Or did I attract the wrath of the group by posting the question first on
StackOverflow? :-)]

Thanks once again.

On Mon, Dec 18, 2017 at 10:59 AM, Arun Rangarajan 
wrote:

> Solr version: 6.6.0
>
> There are two multi-valued string fields in my schema:
> * interests
> * hierarchy.
>
> Goal is to run a pivot facet query on both these fields, but only for
> specific values of `interests` field. This query:
>
> ```
> /select
> ?wt=json
> &rows=0
> &q=interests:(hockey OR soccer)
> &facet=true
> &facet.pivot=interests,hierarchy
> ```
>
> selects the correct documents, but since `interests` is a multi-valued
> field, it gives the required counts for the values of interest (hockey,
> soccer), but also gives the counts for other values of `interests` in the
> matching documents.
>
> How to filter the pivot facet counts only for the values of `interests`
> field specified in the 'q' param i.e. hockey and soccer in the example.
> Essentially, is there an equivalent of https://lucene.apache.org/
> solr/guide/6_6/faceting.html#Faceting-Limitingfacetwithcertainterms for
> pivot facet query? Or are there alternate formats like JSON faceting that
> may help here?
>
> (Full disclosure: I asked the question on StackOverflow and got no
> response so far: https://stackoverflow.com/questions/47838619/
> filtering-solr-pivot-facet-values )
>
> Thanks.
>


RE: Trouble with mm and SynonymQuery and KeywordRepeatFilter

2017-12-20 Thread Markus Jelsma
Hello - any ideas to share on this topic?

Many thanks,
Markus

 
 
-Original message-
> From:Markus Jelsma 
> Sent: Tuesday 19th December 2017 12:38
> To: Solr-user 
> Subject: Trouble with mm and SynonymQuery and KeywordRepeatFilter
> 
> Hello,
> 
> I have an interesting issue with mm and SynonymQuery and KeywordRepeatFilter. 
> We do query time synonym expansion and use KeywordRepeat for not only finding 
> stemmed tokens. Our synonyms are already preprocessed and contain only 
> stemmed tokens. Synonym file contains: traject,verbind
> 
> So, any non-root stem that ends up in a synonym is actually a search for 
> three terms: +DisjunctionMaxQuery(((title_nl:trajecten 
> Synonym(title_nl:traject title_nl:verbind
> 
> But, our default mm requires that two terms must match if the input query 
> consists of two terms: 2<-1 5<-2 6<90%
> 
> So, a simple query looking for a plural (trajecten) will not match a document 
> where the title contains only its singular form: q=trajecten will not match 
> document with title_nl:"een traject"
> 
> Now, my question is, how to deal with this problem? I clearly do not want mm 
> to think I input two terms!
> 
> Many many thanks,
> Markus
> 
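
For anyone parsing that mm spec: per the standard (e)dismax conditional
syntax, each n<m clause applies when the query has more than n optional
clauses, so 1-2 clauses require all to match, 3-5 all but one, exactly 6 all
but two, and 7 or more require 90%. That is the trap described here: the
keyword-repeated original plus the synonym group count as two clauses, and
the all-must-match rule for two clauses then demands both.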


Re: Build suggester in different directory (not /tmp).

2017-12-20 Thread Matthew Roth
Thanks Erick,

I'll heed your warning. Ultimately, the index will be rather static so I do
not fear much from buildOnCommit. But I think building on startup would
likely be set to false regardless.

Shawn,

Thank you as well. That is very informative regarding java.io.tmpdir. I am
starting this as a service, but I think I can handle making the required
changes.

Best,
Matt



Re: Build suggester in different directory (not /tmp).

2017-12-20 Thread Erick Erickson
Matthew:

I think you'll be awfully unhappy with buildOnCommit. Say you're
bulk-indexing and committing every 15 seconds

buildOnStartup is problematical as well since it'd rebuild every time
you bounced Solr even if the index hadn't changed.

Personally I'd alter my indexing process to fire a build command when
it was done.

Or, if you can afford to optimize after _every_ set of updates (say
you only update every day or less often) then buildOnOptimize makes
sense.

Best,
Erick
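
For reference, firing that build from the indexing process is one request per
dictionary, in the same form as the suggest.build URL shown earlier in this
digest (core name and dictionary name are whatever your config uses):

    http://localhost:8983/solr/corename/suggest?suggest=true&suggest.dictionary=mySuggester&suggest.build=true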

On Wed, Dec 20, 2017 at 12:40 PM, Matthew Roth  wrote:
> Thanks Erick,
>
> I'll heed your warning. Ultimately, the index will be rather static so I do
> not fear much from buildOnCommit. But I think building on startup would
> likely be set to false regardless.
>
> Shawn,
>
> Thank you as well. That is very informative regarding java.io.tmpdir. I am
> starting this as a service, but I think I can handle making the required
> changes.
>
> Best,
> Matt


Re: Build suggester in different directory (not /tmp).

2017-12-20 Thread Matthew Roth
Erick,

Oh, yes, I think I was misunderstanding buildOnCommit. I presumed it would
run following the completion of my DIH. The behavior you described would be
very problematic!

Thank you for taking the time to point that out!

Best,
Matt

On Wed, Dec 20, 2017 at 3:47 PM, Erick Erickson 
wrote:

> Matthew:
>
> I think you'll be awfully unhappy with buildOnCommit. Say you're
> bulk-indexing and committing every 15 seconds
>
> buildOnStartup is problematical as well since it'd rebuild every time
> you bounced Solr even if the index hadn't changed.
>
> Personally I'd alter my indexing process to fire a build command when
> it was done.
>
> Or, if you can afford to optimize after _every_ set of updates (say
> you only update every day or less often) then buildOnOptimize makes
> sense.
>
> Best,
> Erick


Re: Filtering Solr pivot facet values

2017-12-20 Thread Shawn Heisey
On 12/20/2017 1:31 PM, Arun Rangarajan wrote:
> Sorry to bother you again on this. Is there no way in Solr to filter pivot
> facets?
> [Or did I attract the wrath of the group by posting the question first on
> StackOverflow? :-)]

StackOverflow and this list are pretty much unaware of each other unless
specific mention is made.  I don't care whether you ask on SO or not, or
which one you ask first.

You haven't provided actual output that you're seeing.  Can you provide
actual response output from your queries and describe what you'd rather
see instead?  With that information, we might be able to offer some ideas.

In general, facets should never count documents that are not in the
search results.

Multi-select faceting offers a way to change that general behavior,
though -- tagging specific fq parameters and asking the facet to exclude
those filters.

https://wiki.apache.org/solr/SimpleFacetParameters#Tagging_and_excluding_Filters
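
With the fields from this thread, that tag-and-exclude syntax would look
something like this (an untested sketch):

    q=*:*&fq={!tag=ints}interests:(hockey soccer)&facet=true&facet.pivot={!ex=ints}hierarchy,interests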

Thanks,
Shawn



Re: Trouble with mm and SynonymQuery and KeywordRepeatFilter

2017-12-20 Thread Shawn Heisey
On 12/19/2017 4:38 AM, Markus Jelsma wrote:
> I have an interesting issue with mm and SynonymQuery and KeywordRepeatFilter. 
> We do query time synonym expansion and use KeywordRepeat for not only finding 
> stemmed tokens. Our synonyms are already preprocessed and contain only 
> stemmed tokens. Synonym file contains: traject,verbind
>
> So, any non-root stem that ends up in a synonym is actually a search for 
> three terms: +DisjunctionMaxQuery(((title_nl:trajecten 
> Synonym(title_nl:traject title_nl:verbind
>
> But, our default mm requires that two terms must match if the input query 
> consists of two terms: 2<-1 5<-2 6<90%
>
> So, a simple query looking for a plural (trajecten) will not match a document 
> where the title contains only its singular form: q=trajecten will not match 
> document with title_nl:"een traject"

I would think that doing synonym expansion at index time would remove
any possible confusion about the number of terms at query time.  Queries
that involve synonyms will be slightly less complex, but the index would
be larger, so it's difficult to say whether those kinds of queries would
be any faster or not.

There is one clear disadvantage to index-time synonym expansion: If you
change your synonyms, you have to reindex.

Thanks,
Shawn



Re: Filtering Solr pivot facet values

2017-12-20 Thread Arun Rangarajan
Thanks for your reply, Shawn.

I think multi-select faceting does the opposite of what I want. I want the
facet to include the filters.

Example:

The following 8 documents are the only ones in my Solr core:

[
  {"id": "1", "hierarchy": ["1", "16", "169"], "interests": ["soccer",
"futbol"]},
  {"id": "2", "hierarchy": ["1", "16", "162"], "interests": ["cricket",
"futbol"]},
  {"id": "3", "hierarchy": ["1", "14", "141"], "interests": ["hockey",
"soccer"]},
  {"id": "4", "hierarchy": ["1", "16", "162"], "interests": ["hockey",
"soccer", "tennis"]},
  {"id": "5", "hierarchy": ["1", "14", "142"], "interests": ["badminton"]},
  {"id": "6", "hierarchy": ["1", "14", "147"], "interests": ["soccer"]},
  {"id": "7", "hierarchy": ["1", "16", "168"], "interests": ["hockey",
"soccer", "tennis"]},
  {"id": "8", "hierarchy": ["1", "14", "140"], "interests": ["badminton"]}
]

As you can see, hierarchy and interests are both multi-valued string fields.

I want pivot facet counts for the two fields: hierarchy and interests, but
filtered for only two values of interests field: hockey, soccer.

The query I am running is:

/select
?wt=json
&rows=0
&q=interests:(hockey soccer)
&facet=true
&facet.pivot=hierarchy,interests

This gives the following result for the pivot facets:

"facet_pivot": {
"hierarchy,interests": [
{
  "field": "hierarchy",
  "value": "1",
  "count": 5,
  "pivot": [
{"field": "interests", "value": "soccer", "count": 5},
{"field": "interests", "value": "hockey", "count": 3},
{"field": "interests", "value": "tennis", "count": 2},
{"field": "interests", "value": "futbol", "count": 1}
  ]
},
{
  "field": "hierarchy",
  "value": "16",
  "count": 3,
  "pivot": [
{"field": "interests", "value": "soccer", "count": 3},
{"field": "interests", "value": "hockey", "count": 2},
{"field": "interests", "value": "tennis", "count": 2},
{"field": "interests", "value": "futbol", "count": 1}
  ]
},
...
]
}

The counts for hockey and soccer are correct. But I am also getting the
facet counts for other values of interests (like tennis, futbol, etc.)
since these values match the query. I understand why this is happening.
This is why I said I want to do something like
https://lucene.apache.org/solr/guide/6_6/faceting.html#Faceting-Limitingfacetwithcertainterms
for facet pivots. Is there a way to do that?

Thanks.
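
One possible workaround, sketched with the JSON Facet API (untested; it
enumerates the wanted interests as query sub-facets under a terms facet on
hierarchy, so only those two values are counted):

    q=interests:(hockey soccer)&rows=0&json.facet={
      hierarchy: {
        type: terms, field: hierarchy,
        facet: {
          hockey: {type: query, q: "interests:hockey"},
          soccer: {type: query, q: "interests:soccer"}
        }
      }
    }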



On Wed, Dec 20, 2017 at 1:07 PM, Shawn Heisey  wrote:

> On 12/20/2017 1:31 PM, Arun Rangarajan wrote:
> > Sorry to bother you again on this. Is there no way in Solr to filter
> pivot
> > facets?
> > [Or did I attract the wrath of the group by posting the question first on
> > StackOverflow? :-)]
>
> StackOverflow and this list are pretty much unaware of each other unless
> specific mention is made.  I don't care whether you ask on SO or not, or
> which one you ask first.
>
> You haven't provided actual output that you're seeing.  Can you provide
> actual response output from your queries and describe what you'd rather
> see instead?  With that information, we might be able to offer some ideas.
>
> In general, facets should never count documents that are not in the
> search results.
>
> Multi-select faceting offers a way to change that general behavior,
> though -- tagging specific fq parameters and asking the facet to exclude
> those filters.
>
> https://wiki.apache.org/solr/SimpleFacetParameters#Tagging_
> and_excluding_Filters
>
> Thanks,
> Shawn
>
>


RE: Trouble with mm and SynonymQuery and KeywordRepeatFilter

2017-12-20 Thread Markus Jelsma
Hello,

Yes, of course, index-time synonyms lessen the query-time complexity and would 
solve the mm problem. But they also screw up IDF and remove the flexibility of 
adding synonyms on demand. The first we do not want; the second is impossible 
for us (very large main search index).

We are looking for a solution with mm that takes KeywordRepeat, stemming and 
synonym expansion into consideration. To me, the current behaviour of mm in this 
case is a bug: I input one term, so mm should treat it as one term, regardless 
of the number of expanded query terms.

Any query-time ideas to share? I am not well versed in the actual code dealing 
with this specific subject; the code doesn't like me. I am fine if someone 
points me to the code that tells mm about the number of original input terms, 
and what to do about it. If someone does, please also explain why the change I 
want to make is a bad one, and what to be aware of, beware of, or take into 
account.

Also, am I the only one who regards this behaviour as a bug, or, more subtly, 
as weird unexpected behaviour?

Many many thanks!
Markus

-Original message-
> From:Shawn Heisey 
> Sent: Wednesday 20th December 2017 22:39
> To: solr-user@lucene.apache.org
> Subject: Re: Trouble with mm and SynonymQuery and KeywordRepeatFilter
> 
> On 12/19/2017 4:38 AM, Markus Jelsma wrote:
> > I have an interesting issue with mm and SynonymQuery and 
> > KeywordRepeatFilter. We do query time synonym expansion and use 
> > KeywordRepeat for not only finding stemmed tokens. Our synonyms are already 
> > preprocessed and contain only stemmed tokens. Synonym file contains: 
> > traject,verbind
> >
> > So, any non-root stem that ends up in a synonym is actually a search for 
> > three terms: +DisjunctionMaxQuery(((title_nl:trajecten 
> > Synonym(title_nl:traject title_nl:verbind
> >
> > But, our default mm requires that two terms must match if the input query 
> > consists of two terms: 2<-1 5<-2 6<90%
> >
> > So, a simple query looking for a plural (trajecten) will not match a 
> > document where the title contains only its singular form: q=trajecten will 
> > not match document with title_nl:"een traject"
> 
> I would think that doing synonym expansion at index time would remove
> any possible confusion about the number of terms at query time.  Queries
> that involve synonyms will be slightly less complex, but the index would
> be larger, so it's difficult to say whether those kinds of queries would
> be any faster or not.
> 
> There is one clear disadvantage to index-time synonym expansion: If you
> change your synonyms, you have to reindex.
> 
> Thanks,
> Shawn
> 
> 


Re: Trouble with mm and SynonymQuery and KeywordRepeatFilter

2017-12-20 Thread Steve Rowe
Hi Markus,

My suggestion: rewrite your synonyms to include the triggering word in the 
expanded synonyms list.  That way you won’t need KeywordRepeat/RemoveDuplicates 
filters, and mm=100% will work as you expect.

I don’t think this situation is a bug, since mm applies to the built query, not 
to the original query terms.
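
With explicit mapping rules, that could look like the following (a sketch
using the stems from your example; repeating the trigger on the right-hand
side means the original stem still matches):

```
# synonyms_nl.txt: both sides are stems, since the filter runs after the stemmer
traject => traject,verbind
verbind => verbind,traject
```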

--
Steve
www.lucidworks.com

> On Dec 20, 2017, at 5:02 PM, Markus Jelsma  wrote:
> 
> Hello,
> 
> Yes, of course, index-time synonyms lessen the query-time complexity and would 
> solve the mm problem. But they also screw up IDF and remove the flexibility of 
> adding synonyms on demand. The first we do not want; the second is impossible 
> for us (very large main search index).
> 
> We are looking for a solution with mm that takes KeywordRepeat, stemming and 
> synonym expansion into consideration. To me, the current behaviour of mm in 
> this case is a bug: I input one term, so mm should treat it as one term, 
> regardless of the number of expanded query terms.
> 
> Any query-time ideas to share? I am not well versed in the actual code dealing 
> with this specific subject; the code doesn't like me. I am fine if someone 
> points me to the code that tells mm about the number of original input terms, 
> and what to do about it. If someone does, please also explain why the change I 
> want to make is a bad one, and what to be aware of, beware of, or take into 
> account.
> 
> Also, am I the only one who regards this behaviour as a bug, or, more subtly, 
> as weird unexpected behaviour?
> 
> Many many thanks!
> Markus
> 
> -Original message-
>> From:Shawn Heisey 
>> Sent: Wednesday 20th December 2017 22:39
>> To: solr-user@lucene.apache.org
>> Subject: Re: Trouble with mm and SynonymQuery and KeywordRepeatFilter
>> 
>> On 12/19/2017 4:38 AM, Markus Jelsma wrote:
>>> I have an interesting issue with mm and SynonymQuery and 
>>> KeywordRepeatFilter. We do query time synonym expansion and use 
>>> KeywordRepeat for not only finding stemmed tokens. Our synonyms are already 
>>> preprocessed and contain only stemmed tokens. Synonym file contains: 
>>> traject,verbind
>>> 
>>> So, any non-root stem that ends up in a synonym is actually a search for 
>>> three terms: +DisjunctionMaxQuery(((title_nl:trajecten 
>>> Synonym(title_nl:traject title_nl:verbind
>>> 
>>> But, our default mm requires that two terms must match if the input query 
>>> consists of two terms: 2<-1 5<-2 6<90%
>>> 
>>> So, a simple query looking for a plural (trajecten) will not match a 
>>> document where the title contains only its singular form: q=trajecten will 
>>> not match document with title_nl:"een traject"
>> 
>> I would think that doing synonym expansion at index time would remove
>> any possible confusion about the number of terms at query time.  Queries
>> that involve synonyms will be slightly less complex, but the index would
>> be larger, so it's difficult to say whether those kinds of queries would
>> be any faster or not.
>> 
>> There is one clear disadvantage to index-time synonym expansion: If you
>> change your synonyms, you have to reindex.
>> 
>> Thanks,
>> Shawn
>> 
>> 



Re: Solr 7.1 Solrcloud dynamic/automatic replicas

2017-12-20 Thread Greg Roodt
Thanks again Erick. It looks like I've got this working.

One final question I think:
Is there a way to prevent ADDREPLICA from adding another core if a core for
the collection already exists on the node?

I've noticed that if I call ADDREPLICA twice for the same IP:PORT_solr, I
get multiple cores. I can probably check `clusterstatus`, but I was
wondering if there is another way to make the ADDREPLICA call idempotent.



On 21 December 2017 at 03:27, Erick Erickson 
wrote:

> The internal method is ZkController.generateNodeName(), although it's
> fairly simple, there are bunches of samples in ZkControllerTest
>
> But yeah, it requires that you know your hostname and port, and the
> context is "solr".
>
> On Tue, Dec 19, 2017 at 8:04 PM, Greg Roodt  wrote:
> > Ok, thanks. I'll take a look into using the ADDREPLICA API.
> >
> > I've found a few examples of the znode format. It seems to be
> IP:PORT_solr
> > (where I presume _solr is the name of the context or something?).
> >
> > Is there a way to discover what a znode is? i.e. Can my new node
> determine
> > what it's znode is? Or is my only option to use the IP:PORT_solr
> convention?
> >
> >
> >
> >
> > On 20 December 2017 at 11:33, Erick Erickson 
> > wrote:
> >
> >> Yes, ADDREPLICA is mostly equivalent, it's also supported going
> forward
> >>
> >> LegacyCloud should work temporarily, I'd change it going forward though.
> >>
> >> Finally, you'll want to add a "node" parameter to ensure your replica is
> >> placed on the exact node you want; see the live_nodes znode for the
> >> format...
> >>
> >> On Dec 19, 2017 16:06, "Greg Roodt"  wrote:
> >>
> >> > Thanks for the reply. So it sounds like the method that I'm using to
> >> > automatically add replicas on Solr 6.2 is not recommended and not
> going
> >> to
> >> > be supported in future versions.
> >> >
> >> > A couple of follow up questions then:
> >> > * Do you know if running with legacyCloud=true will make this
> behaviour
> >> > work "for now" until I can find a better way of doing this?
> >> > * Will it be enough for my newly added nodes to then startup solr
> (with
> >> > correct ZK_HOST) and call the ADDREPLICA API as follows?
> >> > ```
> >> > curl 'http://localhost:port/solr/admin/collections?action=ADDREPLICA&collection=blah&shard=shard1'
> >> > ```
> >> > That seems mostly equivalent to writing that core.properties file
> that I
> >> am
> >> > using in 6.2
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > On 20 December 2017 at 09:34, Shawn Heisey 
> wrote:
> >> >
> >> > > On 12/19/2017 3:06 PM, Greg Roodt wrote:
> >> > > > Thanks for your reply Erick.
> >> > > >
> >> > > > This is what I'm doing at the moment with Solr 6.2 (I was
> mistaken,
> >> > > before
> >> > > > I said 6.1).
> >> > > >
> >> > > > 1. A new instance comes online
> >> > > > 2. Systemd starts solr with a custom start.sh script
> >> > > > 3. This script creates a core.properties file that looks like
> this:
> >> > > > ```
> >> > > > name=blah
> >> > > > shard=shard1
> >> > > > ```
> >> > > > 4. Script starts solr via the jar.
> >> > > > ```
> >> > > > java -DzkHost=... -jar start.jar
> >> > > > ```
> >> > >
> >> > > The way that we would expect this to be normally done is a little
> >> > > different.  Adding a node to the cloud normally will NOT copy any
> >> > > indexes.  You have basically tricked SolrCloud into adding the
> replica
> >> > > automatically by creating a core before Solr starts.  SolrCloud
> >> > > incorporates the new core into the cluster according to the info
> that
> >> > > you have put in core.properties, notices that it has no index, and
> >> > > replicates it from the existing leader.
> >> > >
> >> > > Normally, what we would expect for adding a new node is this:
> >> > >
> >> > >  * Run the service installer script on the new machine
> >> > >  * Add a ZK_HOST variable to /etc/default/solr.in.sh
> >> > >  * Use "service solr restart" to get Solr to join the cloud
> >> > >  * Call the ADDREPLICA action on the Collections API
> >> > >
> >> > > The reason that your method works is that currently, the "truth"
> about
> >> > > the cluster is a mixture of what's in ZooKeeper and what's actually
> >> > > present on each Solr instance.
> >> > >
> >> > > There is an effort to change this so that ZooKeeper is the sole
> source
> >> > > of truth, and if a core is found that the ZK database doesn't know
> >> > > about, it won't be started, because it's not a known part of the
> >> > > cluster.  If this goal is realized in a future version of Solr, then
> >> the
> >> > > method you're currently using is not going to work like it does at
> the
> >> > > moment.  I do not know how much of this has been done, but I know
> that
> >> > > there have been people working on it.
> >> > >
> >> > > Thanks,
> >> > > Shawn
> >> > >
> >> > >
> >> >
> >>
>


Re: Solr 7.1 Solrcloud dynamic/automatic replicas

2017-12-20 Thread Erick Erickson
If you specify the node parameter for ADDREPLICA, I don't think so; but as
you say, you have to understand the topology yourself, via CLUSTERSTATUS or
some such.
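
A rough sketch of that kind of guard (untested; assumes jq is installed, and
the node name, collection and shard are placeholders):

```
#!/bin/sh
NODE="10.0.0.5:8983_solr"   # IP:PORT_solr format, as listed under live_nodes
COLL="blah"

# Ask CLUSTERSTATUS whether this node already hosts a replica of the collection
if curl -s "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=${COLL}&wt=json" \
   | jq -e --arg node "$NODE" --arg coll "$COLL" \
       '[.cluster.collections[$coll].shards[].replicas[] | select(.node_name == $node)] | length > 0' \
   > /dev/null
then
  echo "Node ${NODE} already has a replica of ${COLL}; skipping ADDREPLICA."
else
  curl -s "http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=${COLL}&shard=shard1&node=${NODE}"
fi
```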

If you don't specify the "node" parameter, take a look at "Rule-based
Replica Placement" here:
https://lucene.apache.org/solr/guide/6_6/rule-based-replica-placement.html
I have to say you'll have to experiment, though; I haven't verified that
ADDREPLICA does the right thing in that case.

Do note that this functionality will be superseded by the "Policy"
framework but the transition is pretty straightforward.

Best,
Erick

On Wed, Dec 20, 2017 at 3:45 PM, Greg Roodt  wrote:
> Thanks again Erick. It looks like I've got this working.
>
> One final question I think:
> Is there a way to prevent ADDREPLICA from adding another core if a core for
> the collection already exists on the node?
>
> I've noticed that if I call ADDREPLICA twice for the same IP:PORT_solr, I
> get multiple cores. I can probably check `clusterstatus`, but I was
> wondering if there is another way to make the ADDREPLICA call idempotent.
>
>
>
> On 21 December 2017 at 03:27, Erick Erickson 
> wrote:
>
>> The internal method is ZkController.generateNodeName(), although it's
>> fairly simple, there are bunches of samples in ZkControllerTest
>>
>> But yeah, it requires that you know your hostname and port, and the
>> context is "solr".
>>
>> On Tue, Dec 19, 2017 at 8:04 PM, Greg Roodt  wrote:
>> > Ok, thanks. I'll take a look into using the ADDREPLICA API.
>> >
>> > I've found a few examples of the znode format. It seems to be
>> IP:PORT_solr
>> > (where I presume _solr is the name of the context or something?).
>> >
>> > Is there a way to discover what a znode is? i.e. Can my new node
>> determine
>> > what it's znode is? Or is my only option to use the IP:PORT_solr
>> convention?
>> >
>> >
>> >
>> >
>> > On 20 December 2017 at 11:33, Erick Erickson 
>> > wrote:
>> >
>> >> Yes, ADDREPLICA is mostly equivalent, it's also supported going
>> forward
>> >>
>> >> LegacyCloud should work temporarily, I'd change it going forward though.
>> >>
>> >> Finally, you'll want to add a "node" parameter to ensure your replica is
>> >> placed on the exact node you want; see the live_nodes znode for the
>> >> format...
>> >>
>> >> On Dec 19, 2017 16:06, "Greg Roodt"  wrote:
>> >>
>> >> > Thanks for the reply. So it sounds like the method that I'm using to
>> >> > automatically add replicas on Solr 6.2 is not recommended and not
>> going
>> >> to
>> >> > be supported in future versions.
>> >> >
>> >> > A couple of follow up questions then:
>> >> > * Do you know if running with legacyCloud=true will make this
>> behaviour
>> >> > work "for now" until I can find a better way of doing this?
>> >> > * Will it be enough for my newly added nodes to then startup solr
>> (with
>> >> > correct ZK_HOST) and call the ADDREPLICA API as follows?
>> >> > ```
>> >> > curl 'http://localhost:port/solr/admin/collections?action=ADDREPLICA&collection=blah&shard=shard1'
>> >> > ```
>> >> > That seems mostly equivalent to writing that core.properties file
>> that I
>> >> am
>> >> > using in 6.2
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On 20 December 2017 at 09:34, Shawn Heisey 
>> wrote:
>> >> >
>> >> > > On 12/19/2017 3:06 PM, Greg Roodt wrote:
>> >> > > > Thanks for your reply Erick.
>> >> > > >
>> >> > > > This is what I'm doing at the moment with Solr 6.2 (I was
>> mistaken,
>> >> > > before
>> >> > > > I said 6.1).
>> >> > > >
>> >> > > > 1. A new instance comes online
>> >> > > > 2. Systemd starts solr with a custom start.sh script
>> >> > > > 3. This script creates a core.properties file that looks like
>> this:
>> >> > > > ```
>> >> > > > name=blah
>> >> > > > shard=shard1
>> >> > > > ```
>> >> > > > 4. Script starts solr via the jar.
>> >> > > > ```
>> >> > > > java -DzkHost=... -jar start.jar
>> >> > > > ```
>> >> > >
>> >> > > The way that we would expect this to be normally done is a little
>> >> > > different.  Adding a node to the cloud normally will NOT copy any
>> >> > > indexes.  You have basically tricked SolrCloud into adding the
>> replica
>> >> > > automatically by creating a core before Solr starts.  SolrCloud
>> >> > > incorporates the new core into the cluster according to the info
>> that
>> >> > > you have put in core.properties, notices that it has no index, and
>> >> > > replicates it from the existing leader.
>> >> > >
>> >> > > Normally, what we would expect for adding a new node is this:
>> >> > >
>> >> > >  * Run the service installer script on the new machine
>> >> > >  * Add a ZK_HOST variable to /etc/default/solr.in.sh
>> >> > >  * Use "service solr restart" to get Solr to join the cloud
>> >> > >  * Call the ADDREPLICA action on the Collections API
>> >> > >
>> >> > > The reason that your method works is that currently, the "truth"
>> about
>> >> > > the cluster is a mixture of what's in ZooKeeper and what's actually
>> >> > > present on each Solr instance.


DocValues for multivalued strings and boolean fields

2017-12-20 Thread S G
Hi,

One of our Solr users is trying to set docValues="true" for multivalued
string fields and boolean-type fields.
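
For reference, the definitions in question look roughly like this in the
schema (field names are just examples):

```
<field name="interests" type="string"  indexed="true" stored="true"
       multiValued="true" docValues="true"/>
<field name="is_active" type="boolean" indexed="true" stored="true"
       docValues="true"/>
```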

I am not sure what the performance impact of that would be.
Can docValues negatively affect performance in any way?

We are using Solr 6.5.1 and also experimenting with 7.1.0

Thanks
SG


Re: recurring Solr warning messages

2017-12-20 Thread Ritesh
Hi,

Can someone respond on this please? Or, can you direct me to the right
contact who may know about these issues.

Regards,
Ritesh

From: "Ritesh"
Sent: Tue, 19 Dec 2017 18:06:13
To: 
Subject: Re: recurring Solr warning messages

Hello,

Can you help on the below issue please? My Solr box keeps giving warnings
about every 30 seconds:

WARN null ServletHandler /solr/sitecore/select
org.apache.solr.common.SolrException: application/x-www-form-urlencoded
invalid: missing key

Can you please provide more information/resolution on this?

Regards,
Ritesh

From: GitHub Staff
Sent: Thu, 14 Dec 2017 00:05:54
To: Ritesh
Subject: Re: recurring Solr warning messages

Hi Ritesh,

I'm sorry, but you've reached Support for GitHub. It sounds like you may
have been looking for support for a specific project that is hosted on
GitHub:

https://github.com/apache/lucene-solr

We didn't build that project, so we're not able to support it. There are a
few places we suggest you look when you need help with a project hosted on
GitHub:

 * its SUPPORT or README file, which may include contact information or an
   official website
 * its Issues tab, where you can report bugs
 * the owner's GitHub profile, which may include contact information

Of course, do let us know if you have any questions about GitHub itself!

Thanks,
GitHub Staff

> My Solr box keeps giving warnings about every 30 seconds:
>
> WARN null ServletHandler /solr/sitecore/select
> org.apache.solr.common.SolrException: application/x-www-form-urlencoded
> invalid: missing key
>
> Can you please provide more information/resolution on this?




Keep Solr Indexing live

2017-12-20 Thread shashiroushan
Hello All,

I am using DIH to import data from SQL into Solr via the URL
"/dataimport?command=full-import&clean=true".
My problem is: when the SQL query returns zero records, the clean wipes the
index and Solr ends up with zero records as well. But per my project
requirements, the index should only be cleaned when the SQL query actually
returns records, so I can't simply pass "clean=false" (when the query does
return records, Solr should be fully reindexed).
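
What I have in mind is a small wrapper that checks the source before
triggering the import; a rough sketch (untested; assumes a MySQL source, the
mysql CLI, and a core named "mycore", all placeholders):

```
#!/bin/sh
# Count the source rows first; only run a clean full-import if there are any
ROWS=$(mysql -N -B -e "SELECT COUNT(*) FROM my_table" mydb)

if [ "$ROWS" -gt 0 ]; then
  curl "http://localhost:8983/solr/mycore/dataimport?command=full-import&clean=true"
else
  echo "Source query returned no rows; keeping the existing index."
fi
```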

Please suggest.

Regards,
Shashi Roushan