Re: Streaming Expressions: Merge array values? Inverse of cartesianProduct()

2018-06-14 Thread Joel Bernstein
Actually, your second example is probably a straightforward reduce(select(...), group(...), by="k1") Joel Bernstein http://joelsolr.blogspot.com/ On Thu, Jun 14, 2018 at 7:33 PM, Joel Bernstein wrote: > Take a look at the reduce() function. You'll have to write a custom reduce > operation b

Re: Streaming Expressions: Merge array values? Inverse of cartesianProduct()

2018-06-14 Thread Joel Bernstein
Take a look at the reduce() function. You'll have to write a custom reduce operation but you can follow the example here: https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/io/ops/GroupOperation.java You can plug in your custom reduce operation in t

Re: Exception when processing streaming expression

2018-06-14 Thread Joel Bernstein
We have to check the behavior of the innerJoin. I suspect that it's closing the second stream when the first stream is finished. This would cause a broken pipe with the second stream. The export handler has specific code that eats the broken pipe exception so it doesn't end up in the logs. The sele

Re: Suggestions for debugging performance issue

2018-06-14 Thread Shawn Heisey
On 6/12/2018 12:06 PM, Chris Troullis wrote: > The issue we are seeing is with 1 collection in particular, after we set up > CDCR, we are getting extremely slow response times when retrieving > documents. Debugging the query shows QTime is almost nothing, but the > overall responseTime is like 5x w

Re: Changing Field Assignments

2018-06-14 Thread Shawn Heisey
On 6/14/2018 12:10 PM, Terry Steichen wrote: > I don't disagree at all, but have a basic question: How do you easily > transition from a system using a dynamic schema to one using a fixed one? Not sure you need to actually transition.  Just remove the config in solrconfig.xml that causes Solr to i

Solr basic auth

2018-06-14 Thread Dinesh Sundaram
Hi, I have configured basic auth for SolrCloud. It works well when I access the Solr URL directly. I have integrated this Solr with the test.com domain. Now if I access the Solr URL like test.com/solr it prompts for credentials, but I don't want it to ask this time since it is a known domain. Is there any wa

Re: Changing Field Assignments

2018-06-14 Thread Terry Steichen
Shawn, I don't disagree at all, but have a basic question: How do you easily transition from a system using a dynamic schema to one using a fixed one? I'm running 6.6.0 in cloud mode (only because it's necessary, as I understand it, to be in cloud mode for the authentication/authorization to wor

Re: Indexing to replica instead leader

2018-06-14 Thread Shawn Heisey
On 6/8/2018 3:56 AM, SOLR4189 wrote: > /When a document is sent to a Solr node for indexing, the system first > determines which Shard that document belongs to, and then which node is > currently hosting the leader for that shard. The document is then forwarded > to the current leader for indexing,

Re: Changing Field Assignments

2018-06-14 Thread Shawn Heisey
On 6/11/2018 2:02 PM, Terry Steichen wrote: > I am using Solr (6.6.0) in the automatic mode (where it discovers > fields).  It's working fine with one exception.  The problem is that > Solr maps the discovered "meta_creation_date" is assigned the type > TrieDateField.  > > Unfortunately, that type

Re: Exception when processing streaming expression

2018-06-14 Thread Christian Spitzlay
What does that mean exactly? If I set the rows parameter to 10 the exception still occurs. AFAICT all this happens internally during the processing of the streaming expression. Why wouldn't the select send the EOF tuple when it reaches the end of the documents? Or why wouldn't the rec

Re: Exception when processing streaming expression

2018-06-14 Thread Susmit
Hi, This may be expected if one of the streams is closed early and does not reach the EOF tuple. Sent from my iPhone > On Jun 14, 2018, at 9:53 AM, Christian Spitzlay > wrote: > > Here is one I stripped down as far as I could: > > innerJoin(sort(search(kmm, > q="sds_endpoint_uuid:(2f927a0b\-f

Streaming Expressions: Merge array values? Inverse of cartesianProduct()

2018-06-14 Thread Christian Spitzlay
Hi, is there a way to merge array values? Something that transforms { "k1": "1", "k2": ["a", "b"] }, { "k1": "2", "k2": ["c", "d"] }, { "k1": "2", "k2": ["e", "f"] } into { "k1": "1", "k2": ["a", "b"] }, { "k1": "2", "k2": ["c", "d", "e", "f"] } And an inverse of ca
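The merge asked about above can be sketched outside Solr as follows. This is only a minimal Python illustration of the desired input/output semantics, not a streaming expression and not how Solr implements reduce(); the function name merge_arrays_by_key and its parameters are hypothetical, chosen for this example.

```python
from collections import OrderedDict


def merge_arrays_by_key(tuples, key="k1", array_field="k2"):
    """Group tuples by `key` and concatenate their `array_field` lists.

    Hypothetical helper: it mirrors the transformation described in the
    question, preserving the order in which keys are first seen.
    """
    merged = OrderedDict()
    for t in tuples:
        # Create an empty bucket the first time a key value is seen.
        bucket = merged.setdefault(t[key], {key: t[key], array_field: []})
        # Append this tuple's array values to the bucket for its key.
        bucket[array_field].extend(t[array_field])
    return list(merged.values())


docs = [
    {"k1": "1", "k2": ["a", "b"]},
    {"k1": "2", "k2": ["c", "d"]},
    {"k1": "2", "k2": ["e", "f"]},
]
result = merge_arrays_by_key(docs)
```

With the three tuples from the question as input, `result` holds one tuple per distinct k1 value, with the k2 arrays of the two "2" tuples concatenated in input order.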

Re: Exception when processing streaming expression

2018-06-14 Thread Christian Spitzlay
Here is one I stripped down as far as I could: innerJoin(sort(search(kmm, q="sds_endpoint_uuid:(2f927a0b\-fe38\-451e\-9103\-580914a77e82)", fl="sds_endpoint_uuid,sds_to_endpoint_uuid", sort="sds_to_endpoint_uuid ASC", qt="/export"), by="sds_endpoint_uuid ASC"), search(kmm, q=ss_search_api_dat

Re: Cost of enabling doc values

2018-06-14 Thread Erick Erickson
My claim is it simply doesn't matter. You either have to have those bytes laying around on disk in the DV case and using OS memory or in the cumulative java heap in the non-dv case. If you're doing one of the three operations I know of no situation where I would _not_ enable docValues. The Lucene

Re: Cost of enabling doc values

2018-06-14 Thread root23
Thanks for the detailed explanation, Erick. I did a little math as you suggested. Just wanted to see if I am doing it right. So we have around 4 billion docs in production and around 70 nodes. To support the business use case we have around 18 fields on which we have to enable docvalues for sorting

Re: Logging Every document to particular core

2018-06-14 Thread Mikhail Khludnev
You can enable DEBUG level for LogUpdateProcessorFactory category https://github.com/apache/lucene-solr/blob/228a84fd6db3ef5fc1624d69e1c82a1f02c51352/solr/core/src/java/org/apache/solr/update/processor/LogUpdateProcessorFactory.java#L100 On Wed, Jun 13, 2018 at 5:00 PM, govind nitk wrote: > H

Re: Cost of enabling doc values

2018-06-14 Thread Jan Høydahl
Depending on what your documents look like, it could be that enabling docValues would allow you to save space by switching to stored="false" since Solr can fetch the stored value from docValues. I say it depends on your documents and use case since sometimes it may be slower to access a docValue

Re: Solr Suggest Component and OOM

2018-06-14 Thread Alessandro Benedetti
I didn't get any answer to my questions (unless you meant you have 25 million different values for those fields ...) Please read my answer again and elaborate further. Does your problem happen for the 2 different suggesters? Cheers - --- Alessandro Benedetti Search Consultant

Re: Hardware-Aware Solr Coud Sharding?

2018-06-14 Thread Jan Høydahl
You could also look into the Autoscaling stuff in 7.x which can be programmed to move shards around based on system load and HW specs on the various nodes, so in theory that framework (although still a bit unstable) will suggest moving some replicas from weak nodes over to more powerful ones. If

Re: Can replace the IP with the hostname or some unique identifier for each node in Solr

2018-06-14 Thread Jan Høydahl
See this FAQ https://github.com/docker-solr/docker-solr/blob/master/Docker-FAQ.md#can-i-run-zookeeper-and-solr-clusters-under-docker -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > 8. jun. 2018 kl. 14:52 skrev akshat : > > Hi, > > I have deployed Solr in docker swar

Re: Solr Suggest Component and OOM

2018-06-14 Thread Ratnadeep Rakshit
Anyone from the Solr team who can shed some more light? On Tue, Jun 12, 2018 at 8:13 PM, Ratnadeep Rakshit wrote: > I observed that the build works if the data size is below 25M. The moment > the records go beyond that, this OOM error shows up. Solr itself shows 56% > usage of 20GB space during

Re: A good KV store/plugins to go with Solr

2018-06-14 Thread Joel Bernstein
The approach that Alfresco/Solr takes with this is to store the original document in the filesystem when it indexes content. This way you can be frugal about which fields are stored in the index. Then Alfresco/Solr can retrieve the original document as part of the results using a doc transformer. This ma

Re: A good KV store/plugins to go with Solr

2018-06-14 Thread Jan Høydahl
You could fetch the data from your application directly ;) Also, Streaming Expressions has a jdbc() function, but then you will need to know what to query for. It also has a fetch() function which enriches documents with fields from another collection. It would probably be possible to write a

Re: Logging Every document to particular core

2018-06-14 Thread Alessandro Benedetti
Isn't the Transaction Log what you are looking for? Read this good blog post as a reference: https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ Cheers - --- Alessandro Benedetti Search Consultant, R&D Software Engineer, Direct

Solr 7.2.1 Master-slave replication Issue

2018-06-14 Thread Nitin Kumar
Hi, I am facing an issue with Solr 7.2.1 master-slave replication. Master-slave replication itself is working fine. But if I disable replication from the master, the slaves show no data (numFound=0). The slave is not serving the data it had before replication. I suspect the index generation is getting updated in the slave, which was