from:"Andrew Harvey"

Re: Ngram autocompleter and term frequency boosting

2012-01-19 Thread Andrew Harvey

With Solr 4.0 you could use relevance functions to give a query-time boost if 
you don't have the information at index time. 

Alternatively you could do term-facet-based autocomplete, which would let you 
sort suggestions by document count rather than by any other input. 
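
A minimal sketch of the facet-based approach, in Python, that only builds the 
request URL. The endpoint http://localhost:8983/solr/select and the field name 
title_exact are assumptions for illustration (an untokenised string copy of 
"title", so that each facet value is a whole title); facet.field, facet.prefix, 
facet.sort and facet.limit are standard Solr faceting parameters. 

```python
from urllib.parse import urlencode

# Hypothetical endpoint and field name; adjust both to your own setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"
SUGGEST_FIELD = "title_exact"  # assumed: an untokenised copy of "title"


def build_suggest_url(prefix, limit=10):
    """Build a facet query that returns titles starting with `prefix`,
    ordered by how many documents carry each title."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),               # only the facet counts are needed
        ("wt", "json"),
        ("facet", "true"),
        ("facet.field", SUGGEST_FIELD),
        ("facet.prefix", prefix),    # matched against the indexed value, case-sensitive
        ("facet.sort", "count"),     # most frequent titles first
        ("facet.limit", str(limit)),
        ("facet.mincount", "1"),
    ]
    return SOLR_SELECT_URL + "?" + urlencode(params)


print(build_suggest_url("Java"))
```

With the three example documents from the original question, and assuming 
title_exact holds the whole title, a facet.prefix of "Java" should list 
"Java Developer" (count 2) ahead of "Java Programmer" (count 1). 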

Andrew

Sent on the run. 

On 20/01/2012, at 15:45, Otis Gospodnetic  wrote:

> Cuong,
> 
> If, when you index your AC suggestions, you already know that "Java Developer" 
> appears twice in the index, why not give it an appropriate index-time boost?  
> Wouldn't that work for you?
> 
> 
> Otis
> 
> 
> Performance Monitoring SaaS for Solr - 
> http://sematext.com/spm/solr-performance-monitoring/index.html
> 
> 
> 
> - Original Message -
>> From: Cuong Hoang 
>> To: solr-user@lucene.apache.org
>> Cc: 
>> Sent: Thursday, January 19, 2012 12:01 AM
>> Subject: Ngram autocompleter and term frequency boosting
>> 
>> Hi guys,
>> 
>> I'm trying to build a Ngram-based autocompleter that takes term frequency
>> into account.
>> 
>> Let's say I have the following documents:
>> 
>> D1: title => "Java Developer"
>> D2: title => "Java Programmer"
>> D3: title => "Java Developer"
>> 
>> When the user types in "Java", I want to display
>> 
>> 1. "Java Developer"
>> 2. "Java Programmer"
>> 
>> Basically "Java Developer" ranks first because it appears twice in the
>> index while "Java Programmer" only appears once. Is it possible?
>> 
>> I'm using the following config for "title" field:
>> 
>> > omitNorms="false">
>>   
>> 
>> 
>> > minGramSize="1"
>> maxGramSize="25" side="front"/>
>>   
>>   
>> 
>> 
>>   
>> 
>> 
>> Thanks
>>

Re: Solr Cluster - Is it wise to run optimize() on the master after each update

2012-01-23 Thread Andrew Harvey

We found that optimising too often killed our slave performance. An optimise 
will cause you to merge and ship the whole index rather than just the relevant 
portions when you replicate. 

The change on our slaves in terms of IO and CPU as well as RAM was marked. 
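
To illustrate why, here is a toy model in Python. It is not Solr's replication 
code: the segment names and sizes are invented, and the only assumption it 
encodes is that a slave fetches the index files it does not already have. 

```python
def files_to_ship(master_files, slave_files):
    """Return the master files (name -> size in MB) that the slave lacks."""
    return {name: size for name, size in master_files.items()
            if name not in slave_files}


# Hypothetical segments currently held by the slave (sizes in MB).
slave = {"_1": 400, "_2": 350, "_3": 50}

# Ordinary commit: one small new segment appears next to the existing ones.
master_after_commit = {"_1": 400, "_2": 350, "_3": 50, "_4": 5}

# Optimise: every segment is merged into a single brand-new one.
master_after_optimise = {"_5": 805}

commit_mb = sum(files_to_ship(master_after_commit, slave).values())
optimise_mb = sum(files_to_ship(master_after_optimise, slave).values())
print(commit_mb, optimise_mb)  # 5 805
```

In this model the ordinary commit ships only the new 5 MB segment, while the 
optimise ships the full 805 MB because none of the old files survive the merge. 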

Andrew

Sent on the run. 

On 23/01/2012, at 19:03, Maxim Veksler  wrote:

> I'm planning on having 1 Master and multiple slaves (cloud based, slaves
> are going up / down randomly).
> 
> The slaves should be constantly available, meaning searching performance
> should optimally not be affected by the updates at all.
> It's unclear to me how the cluster-based replication works: does it copy
> the files from the master and update in place? In which case, am I correct
> to assume that, except for the cache being emptied, search performance is not
> affected?
> 
> Does optimize on the master somehow affect the performance of the slaves?
> Is it recommended to run optimize after each update, assuming I'm not
> concerned about locking the master for updates and it's OK if the optimize
> finishes in under 20min?
> 
> Thank you,
> Maxim.

Re: how to correctly facet clothing multiple sizes and colors?

2012-04-09 Thread Andrew Harvey

What we do in our application is exactly what Robert described. We index 
Products, not variants. The variant data (colour, size etc.) is denormalised 
into the product document at index time. We then facet on the variant 
attributes and get product count instead of variant count. 

What you're seeing are correct results. You are indexing 6 documents, as you 
said before. You actually only want to index one document with multi-valued 
fields. 
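
A rough sketch of that denormalisation in Python, using the six rows from the 
original question. The helper names (denormalise, facet_counts) and the "id" 
key are made up for illustration, and facet_counts only mimics how a facet 
counts documents per value; it is not a Solr API. 

```python
from collections import Counter

# The six variant rows from the question: product, color, size, category.
variants = [
    ("shirt A", "red", "small", "valentines day"),
    ("shirt A", "red", "large", "valentines day"),
    ("shirt A", "blue", "small", "valentines day"),
    ("shirt A", "blue", "large", "valentines day"),
    ("shirt A", "green", "small", "valentines day"),
    ("shirt A", "green", "large", "valentines day"),
]


def denormalise(rows):
    """Collapse variant rows into one document per product,
    with multi-valued color, size and category fields."""
    products = {}
    for product, color, size, category in rows:
        doc = products.setdefault(
            product, {"id": product, "color": [], "size": [], "category": []})
        for field, value in (("color", color), ("size", size),
                             ("category", category)):
            if value not in doc[field]:
                doc[field].append(value)
    return list(products.values())


def facet_counts(docs, field):
    """Count documents per value, the way a facet on `field` would."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc[field]))  # each document counts once per value
    return dict(counts)


docs = denormalise(variants)
for field in ("color", "size", "category"):
    print(field, facet_counts(docs, field))
```

This yields a single document and a count of 1 for every colour, size and 
category. One trade-off to keep in mind: the single document no longer records 
which colour goes with which size. 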

Hope that's somehow helpful,

Andrew

On 10/04/2012, at 3:01, "Robert Petersen"  wrote:

> You *could* do it by making one and only one solr document for each
> clothing item, then just have the front end render all the sizes and
> colors available for that item as size/color pickers on the product
> page.  You can add all the colors and sizes to the one document in the
> index so they are searchable also, but the caveat is that they won't
> show up as a facet.  This is just one simple approach.
> 
> -Original Message-
> From: danjfoley [mailto:d...@micamedia.com] 
> Sent: Saturday, April 07, 2012 7:04 PM
> To: solr-user@lucene.apache.org
> Subject: how to correctly facet clothing multiple sizes and colors?
> 
> I've been searching for a solution to my issue, and this seems to come
> closest to it. But not exactly. 
> 
> I am indexing clothing. Each article of clothing comes in many sizes and
> colors, and can belong to any number of categories. 
> 
> For example take the following: I add 6 documents to solr as follows: 
> 
> product, color, size, category 
> 
> shirt A, red, small, valentines day 
> shirt A, red, large, valentines day 
> shirt A, blue, small, valentines day 
> shirt A, blue, large, valentines day 
> shirt A, green, small, valentines day 
> shirt A, green, large, valentines day 
> 
> I'd like my facet counts to return as follows: 
> 
> color 
> 
> red (1) 
> blue (1) 
> green (1) 
> 
> size 
> 
> small (1) 
> large (1) 
> 
> category 
> 
> valentines day (1) 
> 
> But they come back like this: 
> 
> color: 
> red (2) 
> blue (2) 
> green (2) 
> 
> size: 
> small (2) 
> large (2) 
> 
> category 
> valentines day (6) 
> 
> I see the group.facet parameter in version 4.0 does exactly this. However,
> how can I make this happen now? There are all sorts of ecommerce systems out
> there that facet exactly how I'm asking. I thought Solr is supposed to be
> the very best, fastest search system, yet it doesn't seem to be able to facet
> correctly for items with multiple values? 
> 
> Am I indexing my data wrong? 
> 
> How can I make this happen?
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/how-to-correctly-facet-clothing-multiple-sizes-and-colors-tp3893747p3893747.html
> Sent from the Solr - User mailing list archive at Nabble.com.
