Re: Custom post filter with support for 'OR' queries

2019-05-05 Thread alexpusch
Thanks for the quick reply. The real data is a representation of an HTML element, e.g. "body div.class1 div.b.a". My goal is to match documents by CSS selector, e.g. ".class1 .a.b". The field I'm querying on is a tokenized text field. The post filter takes the doc value of the field (which is not tokeni
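To make the matching goal concrete, here is a minimal sketch of the comparison itself, outside of Solr. It is not the post filter from this thread: the class and method names (`SelectorMatchSketch`, `parts`, `matches`) are hypothetical, and it assumes the stored value is a space-separated element path and that selectors use only tag names, class names, and the descendant combinator.

```java
import java.util.HashSet;
import java.util.Set;

public class SelectorMatchSketch {
    // Splits a compound such as "div.b.a" or ".a.b" into its parts:
    // an optional tag name (stored as-is) and class names (stored with a leading dot).
    static Set<String> parts(String compound) {
        Set<String> out = new HashSet<>();
        String[] pieces = compound.split("\\.");
        for (int i = 0; i < pieces.length; i++) {
            if (pieces[i].isEmpty()) continue;
            out.add(i == 0 ? pieces[i] : "." + pieces[i]);
        }
        return out;
    }

    // True if the selector matches the last element of the path, with every
    // earlier selector compound matched by some ancestor, in order.
    static boolean matches(String elementPath, String selector) {
        String[] elems = elementPath.trim().split("\\s+");
        String[] sels = selector.trim().split("\\s+");
        int e = elems.length - 1;
        int s = sels.length - 1;
        // The rightmost compound must match the element itself.
        if (!parts(elems[e]).containsAll(parts(sels[s]))) return false;
        e--;
        s--;
        // Walk up the ancestors, consuming selector compounds right to left.
        while (s >= 0 && e >= 0) {
            if (parts(elems[e]).containsAll(parts(sels[s]))) s--;
            e--;
        }
        return s < 0;
    }

    public static void main(String[] args) {
        String path = "body div.class1 div.b.a";
        System.out.println(matches(path, ".class1 .a.b")); // true
        System.out.println(matches(path, ".a .class1"));   // false
    }
}
```

Class order inside a compound does not matter here (".a.b" matches "div.b.a"), which is the part a plain tokenized text query cannot express on its own.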

Custom post filter with support for 'OR' queries

2019-05-05 Thread alexpusch
Hi, I'm trying to write my own custom post filter. I'm following this guide: http://qaware.blogspot.com/2014/11/how-to-write-postfilter-for-solr-49.html My implementation works for a simple query: {!myFilter}query. But I need to perform OR queries in addition to my post filter: field:v

Re: Changing merge policy config on production

2017-12-16 Thread alexpusch
Thanks Erick, good point on maxMergedSegmentMB, many of my segments really are maxed out. My index isn't 800G, but it's not far from it: about 250G per server. I have high confidence in Solr and my EC2 i3-2xl instances, and so far I've gotten pretty good results. -- Sent from: http://lucene.472066.n3

Re: How to restart solr in docker?

2017-12-16 Thread alexpusch
While I don't know exactly which Solr image you use, I can tell you this: 1. The command of your Dockerfile probably starts Solr. A Docker container will automatically shut down if the process that was started by its command is killed. Meaning you should never 'restart' a process in a container, but r

Re: Changing merge policy config on production

2017-12-16 Thread alexpusch
To be clear - I'm talking about query performance, not indexing performance. -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Changing merge policy config on production

2017-12-16 Thread alexpusch
Thanks for the quick answer Erick, I'm hoping to improve performance by reducing the number of segments. Currently I have ~160 segments. Am I wrong in thinking it might improve performance?

Changing merge policy config on production

2017-12-15 Thread alexpusch
Hi, is it safe to change the mergePolicyFactory config on production servers? Specifically maxMergeAtOnce and segmentsPerTier. How will Solr reconcile the current state of the segments with the new config? In case of setting segmentsPerTier to a lower number - will subsequent merges be particularly

Performance issues with 'unique' function in json facets over a high cardinality field

2017-12-12 Thread alexpusch
Hi, I have a surprising performance issue with the 'unique' function in a JSON facet. My setup holds a large number of docs (~1B). Despite this, I only facet on the small result set of a query, only a few docs. The query itself returns as fast as expected, but when I try to do a unique cou

Re: Keeping the index naturally ordered by some field

2017-10-02 Thread alexpusch
The reason I'm interested in this is kind of unique. I'm writing a custom query parser and search component. These components go over the search results and perform some calculation over them. This calculation depends on input sorted by a certain value. In this scenario, regular Solr sorting is insuf

Keeping the index naturally ordered by some field

2017-10-01 Thread alexpusch
Hello, we've got a pretty big index (~1B small docs). I'm interested in managing the index so that the search results would be naturally sorted by a certain numeric field, without specifying the actual sort field at query time. My first attempt was using SortingMergePolicyFactory. I've found that

Re: Iterating sorted result docs in a custom search component

2017-03-14 Thread alexpusch
I ended up using ValueSource and FunctionValues (as used in statsComponent):

    FieldType fieldType = schemaField.getType();
    ValueSource valueSource = fieldType.getValueSource(schemaField, null);
    FunctionValues values = valueSource.getValues(Collections.emptyMap(), ctx);
    values.strVal(docId)

I hope

Re: Iterating sorted result docs in a custom search component

2017-03-14 Thread alexpusch
Single field. I'm iterating over the results once, and need each doc in memory only for that single iteration. I need different fields from each doc according to the algorithm state. -- View this message in context: http://lucene.472066.n3.nabble.com/Iterating-sorted-result-docs-in-a-custom-sea

Re: Iterating sorted result docs in a custom search component

2017-03-13 Thread alexpusch
As has been said, only the top N results are collected, but in order to find out which of the results are the top ones, all the results must be sorted, no? Can't the docs be somehow accessible at that stage? Anyway, I see SortingResponseWriter does its own manual sorting using a priority queue. So
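On the question of whether all results must be sorted to find the top N: in general they need not be. A bounded priority queue can keep only the N best candidates seen so far. The sketch below illustrates that general technique with plain Java collections; it is not Solr's or Lucene's actual collector code, and the names `TopNSketch` and `topN` are made up for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class TopNSketch {
    // Returns the n largest scores in descending order without sorting the whole input.
    static List<Double> topN(double[] scores, int n) {
        if (n <= 0) return new ArrayList<>();
        // Min-heap: the smallest of the current top-n candidates sits at the head.
        PriorityQueue<Double> heap = new PriorityQueue<>(n);
        for (double s : scores) {
            if (heap.size() < n) {
                heap.add(s);
            } else if (s > heap.peek()) {
                // The new score beats the weakest candidate, so replace it.
                heap.poll();
                heap.add(s);
            }
        }
        // Only the n survivors are sorted, not the full result set.
        List<Double> result = new ArrayList<>(heap);
        result.sort(Comparator.reverseOrder());
        return result;
    }

    public static void main(String[] args) {
        double[] scores = {0.3, 2.5, 1.1, 0.9, 4.2, 3.3};
        System.out.println(topN(scores, 3)); // [4.2, 3.3, 2.5]
    }
}
```

With this approach every result is visited once, but results that fall out of the queue are never ordered relative to each other, which is why a fully sorted list of all matches is not available at that stage.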

Iterating sorted result docs in a custom search component

2017-03-12 Thread alexpusch
I hope this is the right place to ask about custom search components. I'm writing a custom search component. My aim is to iterate over the entire result set and do some aggregate computation. In order to implement my algorithm I need to iterate over the result set in the order declared in the sear