Filter caching

2014-03-31 Thread youknow...@heroicefforts.net
Re-reading the documentation, it seems that Solr caches the results of the fq 
parameter, not lower level field constraints. This would imply that breaking a 
single complex boolean filter into multiple conjunctive fq parameters would 
improve the odds for cache hits.  Is this correct?

fq=(a:foo OR b:bar) AND c:bah
Vs.
fq=(a:foo OR b:bar)&fq=c:bah
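
To make the question concrete, here is a minimal sketch of why splitting the filter could help. It assumes the behaviour described in the documentation, where each fq value gets its own filterCache entry. ToyFilterCache is a hypothetical stand-in keyed on the raw fq string; real Solr keys the cache on the parsed query, so this only illustrates the counting.

```python
from urllib.parse import urlencode, parse_qs


def build_query(filters, q="*:*"):
    """Return a Solr query string with one fq parameter per filter clause."""
    params = [("q", q)] + [("fq", f) for f in filters]
    return urlencode(params)


class ToyFilterCache:
    """Hypothetical stand-in for the filterCache: one entry per distinct fq."""

    def __init__(self):
        self.entries = set()
        self.hits = 0
        self.misses = 0

    def lookup(self, query_string):
        # Each fq value is looked up (and cached) independently.
        for fq in parse_qs(query_string).get("fq", []):
            if fq in self.entries:
                self.hits += 1
            else:
                self.misses += 1
                self.entries.add(fq)


# Two requests that share only the c:bah constraint, sent as separate fqs.
split = ToyFilterCache()
split.lookup(build_query(["(a:foo OR b:bar)", "c:bah"]))
split.lookup(build_query(["(a:baz OR b:qux)", "c:bah"]))

# The same two requests with everything folded into a single fq.
combined = ToyFilterCache()
combined.lookup(build_query(["(a:foo OR b:bar) AND c:bah"]))
combined.lookup(build_query(["(a:baz OR b:qux) AND c:bah"]))
```

In the split form the second request reuses the c:bah entry, while the combined form never repeats a filter string and so never hits.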


Thanks,

-Jess
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

Delete by query with soft commit

2014-04-08 Thread youknow...@heroicefforts.net
It appears that UpdateRequest.setCommitWithin is not honored when executing a 
delete query against SolrCloud (SolrJ 4.6).  However, setting the hard commit 
parameter functions as expected.  Is this a known bug?
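
One way to check whether the behaviour comes from SolrJ or from the server is to send the same delete as a plain HTTP update. The sketch below only builds the request and does not send it. The host, the collection name, and the query are hypothetical, and it assumes the /update handler accepts a JSON body posted with Content-Type: application/json along with the commitWithin request parameter.

```python
import json
from urllib.parse import urlencode


def delete_by_query_request(base_url, collection, query, commit_within_ms):
    """Return (url, body) for a JSON delete-by-query that asks for commitWithin."""
    params = urlencode({"commitWithin": commit_within_ms, "wt": "json"})
    url = f"{base_url}/{collection}/update?{params}"
    body = json.dumps({"delete": {"query": query}})
    return url, body


# Hypothetical host, collection, and query, for illustration only.
url, body = delete_by_query_request(
    "http://localhost:8983/solr", "collection1", "type:stale", 5000
)
```

If the documents disappear within the window when the request is posted this way but not through SolrJ, that would point at the client rather than the cluster.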

Thanks, 

-Jess

solrconfig.xml carrot2 params

2013-10-17 Thread youknow...@heroicefforts.net
Would someone help me out with the syntax for setting Tokenizer.documentFields 
in the ClusteringComponent engine definition in solrconfig.xml?  Carrot2 is 
expecting a Collection of Strings.  There's no schema definition for this XML 
file and a big TODO on the Wiki wrt init params.  Every permutation I have 
tried results in an error stating:  Cannot set java.util.Collection field ... 
to java.lang.String.

Listing collection fields

2013-11-18 Thread youknow...@heroicefforts.net
I'd like to get the complete field list for a collection, including dynamic 
fields.  Is issuing a Luke request still the recommended way for retrieving 
this data?

Re: Listing collection fields

2013-11-19 Thread youknow...@heroicefforts.net
Thanks.  I have an Xtext DSL doing some config and code generation downstream 
of the data ingestion.  It probably wouldn't be that hard to generate a 
solrconfig.xml, but for now I just want to build in some runtime reconciliation 
to aid in dynamic query generation.  It sounds like Luke is still the best 
approach.
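
A minimal sketch of that runtime reconciliation, for illustration. The host and collection name are hypothetical. It assumes the Luke handler lives at admin/luke, accepts numTerms and wt=json, and returns a "fields" object keyed by field name in which fields matched by a dynamic rule carry a "dynamicBase" entry. The sample response is hand-written, not captured from a server.

```python
from urllib.parse import urlencode


def luke_url(base_url, collection):
    """Build a Luke request URL; numTerms=0 skips the per-field top-terms work."""
    query = urlencode({"numTerms": 0, "wt": "json"})
    return f"{base_url}/{collection}/admin/luke?{query}"


def field_names(luke_response):
    """Return every field name present in the index, sorted."""
    return sorted(luke_response.get("fields", {}))


def dynamic_fields(luke_response):
    """Map each concrete dynamic field name to the pattern that produced it."""
    return {
        name: info["dynamicBase"]
        for name, info in luke_response.get("fields", {}).items()
        if "dynamicBase" in info
    }


# Hand-written sample shaped like the assumed Luke JSON output.
sample = {
    "fields": {
        "id": {"type": "string"},
        "title": {"type": "text_general"},
        "color_s": {"type": "string", "dynamicBase": "*_s"},
    }
}
names = field_names(sample)
dynamic = dynamic_fields(sample)
```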

Regards,

-Jess

Shalin Shekhar Mangar  wrote:

>You can use the ListFields method in the new Schema API:
>
>https://cwiki.apache.org/confluence/display/solr/Schema+API#SchemaAPI-ListFields
>
>Note that this will return all configured fields but it doesn't tell
>you the actual dynamic field names in the index. I don't know if we
>have anything better than a luke request for that yet.
>
>On Tue, Nov 19, 2013 at 5:56 AM, youknow...@heroicefforts.net
> wrote:
>> I'd like to get the complete field list for a collection, including
>> dynamic fields.  Is issuing a Luke request still the recommended way
>> for retrieving this data?
>
>
>
>-- 
>Regards,
>Shalin Shekhar Mangar.

Re: Advise on an architecture with lot of cores

2014-10-07 Thread youknow...@heroicefforts.net
"On the other hand,
it [sic] most of the cores are idle most of the time, the 1 core/customer
setup would be give better utilization of the hardware."

This is an important point.  I've seen performance go to hell when 10M, 100M, 
and 1B cloud collections were consolidated in a hardware-constrained 
environment.  The data belonged to the same customer and there were good reasons 
for this approach.  In our case, we were able to reduce the number of queries by 
n-1 (where n is the number of collections consolidated), but the overall query 
was slower: many seconds vs. subsecond.  You won't have that option, but maybe 
you are in a better place wrt hardware.  The newer cloud routing may also play 
an important role here (maybe someone else could speak to that).  As you alluded 
to earlier, the query generation must be altered to generate an fq security 
clause (operator precedence is important here).
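
To illustrate the precedence concern, here is a minimal sketch of two ways to attach the restriction. The customer_id field name and the "acme" value are hypothetical. Putting the restriction in its own fq keeps it outside the user's boolean expression entirely. The inline variant relies on parentheses so that an OR in the user's query cannot bypass the restriction.

```python
from urllib.parse import urlencode, parse_qs


def security_clause(customer_id, field="customer_id"):
    """Build a quoted clause; the field name is a hypothetical example."""
    escaped = customer_id.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def secured_query(user_query, customer_id):
    """Leave the user's query in q and put the restriction in a separate fq."""
    return urlencode([("q", user_query), ("fq", security_clause(customer_id))])


def secured_inline(user_query, customer_id):
    """Inline alternative: the parentheses keep the user's OR inside the AND."""
    return f"({user_query}) AND {security_clause(customer_id)}"


params = parse_qs(secured_query("title:foo OR title:bar", "acme"))
inline = secured_inline("title:foo OR title:bar", "acme")
```

Without the parentheses, the inline form would read as title:foo OR (title:bar AND customer_id:"acme"), which lets the first clause match other customers' documents. The inline form also does not protect against unbalanced parentheses in user input, so the separate fq is probably the safer default.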

If search performance is a vital part of your company's service offering, then 
it's definitely worth the money to collect representative queries and test on 
alternate hardware before committing your production environment.

Cheers,

-Jess

On October 7, 2014 8:56:46 AM EST, Manoj Bharadwaj  wrote:
>Hi Toke,
>
>Thank you for your insights.
>
>
>> Why do you want to collapse the cores?
>>
>
>Most of the cores are small and a few big ones make up the bulk. Our
>thinking was that it would be as easy to just have one core. Monitoring
>becomes easy as well (we are using a monitoring tool in which there is a
>limit on the number of endpoints that can be monitored, and we are
>considering other monitoring solutions including Sematext).
>
>Regards
>Manoj
