A few random questions about solr queries.

2012-05-29 Thread santamaria2
*1)* With faceting, how does facet.query perform in comparison to
facet.field? I'm just wondering this as in my use case, I need to facet over
a field -- which would get me the top n facets for that field, but I also
need to show the count for a "selected filter" which might have a relatively
low count so it doesn't appear in the top n returned facets. So the solution
would be to 'ensure' its presence by adding a 'facet.query=cat:val' in
addition to my facet.field=cat.

I want to do this to quite a few fields.

Related/example-based question:
When I facet over a field, and something gets returned, eg: John Smith (83),
and I also 'ensure' this facet's presence by having it in
facet.query=author:"John Smith", are two different calculations performed?
Or is the facet returned by facet.field also used by facet.query to obtain
the count?
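
For concreteness, the request described above could be assembled like this. It is only a sketch of the parameters: the helper name `facet_params` and the limit of 10 are my own assumptions, and it says nothing about how Solr computes the two counts internally.

```python
from urllib.parse import urlencode

def facet_params(field, selected_value, limit=10):
    """Ask for the top-n values of `field` and also pin the count for one
    selected value via facet.query (hypothetical helper, not a Solr API)."""
    return [
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.field", field),
        # Per-field override of facet.limit.
        (f"f.{field}.facet.limit", str(limit)),
        # Guarantees a count for the selected value even when it is not
        # among the top-n values returned for the field.
        ("facet.query", f'{field}:"{selected_value}"'),
    ]

params = facet_params("author", "John Smith")
query_string = urlencode(params)
print(query_string)
```

Repeating the last tuple with a different field and value would cover the "quite a few fields" case.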



*2)* Is there a performance issue if I have around, say, 20 facet.query
conditions along with 10 facet.fields? 3 of those 10 fields have around
100,000 possible values. The remaining ones have a few hundred each.



*3)* I've rummaged around a bit, looking for info on when to use q vs fq. I
want to clear my doubts for a certain use case.

Where should my date range queries go? In q or fq? The default settings in
my site show results from the past 90 days with buttons to show stuff from
the last month and week as well. But the user is allowed to use a slider to
apply any date range... this is allowed, but it's not /that/ common. 
I definitely use fq for filtering various tags. Choosing a tag is a common
activity.

Should the date range query go in fq? As I mentioned, the default view shows
stuff from the past 90 days. So does each new day invalidate the cached
entries for that range? Or are entries stored in the filter cache in some way
that makes it easy to fetch stuff from the past 89 days when a query is
performed the next day?
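
One way to phrase such a filter is with Solr date math rounded to the day, so that the fq string stays identical for every request made on the same day (an unrounded NOW changes with every request). This is a sketch under assumptions: `published_date` is a hypothetical field name and `date_range_fq` is a made-up helper. As far as I understand, an identical fq string is what lets a filter cache entry be reused.

```python
def date_range_fq(field, days):
    """Return an fq clause covering the last `days` days, with both ends
    rounded to day boundaries via Solr date math (NOW/DAY)."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return f"{field}:[NOW/DAY-{days}DAYS TO NOW/DAY+1DAY]"

# The three default buttons: past 90 days, last month, last week.
for days in (90, 30, 7):
    print(date_range_fq("published_date", days))
```

The arbitrary slider ranges would presumably produce many distinct filter strings, which is worth keeping in mind when sizing the cache.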



--
View this message in context: 
http://lucene.472066.n3.nabble.com/A-few-random-questions-about-solr-queries-tp3986562.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: A few random questions about solr queries.

2012-05-30 Thread santamaria2
A wee bit of clarification on the 2nd question. I meant relative performance,
ie. would it be much slower to facet over 20 facet.queries & 10 facet.fields
compared to say, 4 facet.queries & facet.fields. I wonder if this makes
sense...

So... is a bump improper etiquette here? >_>



Is it faster to search over many different fields or one field that combines the values of all those other fields?

2012-06-05 Thread santamaria2
Say I have various categories of 'tags'. I want a keyword search to search
through my index of articles. So I search over:
1) the title.
2) the body
3) about 10 of these tag-categories. Each tag category is multivalued with a
few words per value.

Without considering the effect on 'relevance', and using the standard lucene
query parser, would it be faster to specify each of these 10 fields in q (q
= cat1:keyword OR cat2:keyword OR ... ), or to copyfield the stuff in those
10 fields into one combined field?
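
To make the two alternatives concrete, here is how the query strings would differ. The field names `cat1`..`cat10` follow the example above; `all_tags` is a hypothetical name for the copyField destination. The sketch only builds strings and makes no claim about which one is faster.

```python
def multi_field_query(fields, keyword):
    """Build 'f1:kw OR f2:kw OR ...' for the standard lucene query parser."""
    return " OR ".join(f"{field}:{keyword}" for field in fields)

# The ten tag-category fields from the example.
tag_fields = [f"cat{i}" for i in range(1, 11)]

# Alternative 1: one clause per field.
per_field_q = multi_field_query(tag_fields, "keyword")

# Alternative 2: a single clause against the combined copyField target.
combined_q = multi_field_query(["all_tags"], "keyword")

print(per_field_q)
print(combined_q)
```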

Or is it such that I should be slapped in the face for even thinking about
performance in this scenario?



Wildcard query vs facet.prefix for autocomplete?

2012-07-15 Thread santamaria2
I'm about to implement an autocomplete mechanism for my search box. I've read
about some of the common approaches, but I have a question about wildcard
query vs facet.prefix.

Say I want autocomplete for a title: 'Shadows of the Damned'. I want this to
appear as a suggestion if I type 'sha' or 'dam' or 'the'. I don't care that
it won't appear if I type 'hadows'. 

While indexing, I'd use a whitespace tokenizer and a lowercase filter to
store that title in the index.
Now I'm thinking of two approaches for 'dam' typed in the search box:

1) q=title:dam*

2) q=*:*&facet=on&facet.field=title&facet.prefix=dam


So any reason that I should favour one over the other? Speed a factor? The
index has around 200,000 items.
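
A toy model of what each request operates on may help when comparing them. This is plain Python, not Solr's implementation: `analyze` only mimics a whitespace tokenizer plus lowercase filter, and every title other than 'Shadows of the Damned' is made-up sample data.

```python
from collections import Counter

# Sample data: only the first title comes from the post, the rest are invented.
titles = ["Shadows of the Damned", "Dam Busters", "Shadow Complex"]

def analyze(title):
    """Mimic a whitespace tokenizer followed by a lowercase filter."""
    return title.lower().split()

def wildcard_matches(prefix):
    """Titles with at least one indexed term starting with `prefix`,
    roughly what q=title:dam* would match."""
    return [t for t in titles
            if any(term.startswith(prefix) for term in analyze(t))]

def facet_prefix_counts(prefix):
    """Document counts per indexed term starting with `prefix`,
    roughly what facet.field=title&facet.prefix=dam would count."""
    counts = Counter()
    for t in titles:
        for term in set(analyze(t)):
            if term.startswith(prefix):
                counts[term] += 1
    return dict(counts)

print(wildcard_matches("dam"))
print(facet_prefix_counts("dam"))
```

In this model the wildcard query yields whole titles, while the facet yields individual terms with counts, because faceting works on the indexed tokens of the field.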



Re: Wildcard query vs facet.prefix for autocomplete?

2012-07-17 Thread santamaria2
I'll consider using the other methods, but I'd like to know which would be
faster among the two approaches mentioned in my opening post.



Re: Wildcard query vs facet.prefix for autocomplete?

2012-07-18 Thread santamaria2
Well silly me... you're right.

On Wed, Jul 18, 2012 at 6:44 PM, Erick Erickson [via Lucene] <
ml-node+s472066n399570...@n3.nabble.com> wrote:

> Well, option 2 won't do you any good, so speed doesn't really matter.
> Your response would have a facet count for "dam", all by itself, something
> like
>
> 2
> 1
>
> etc.
>
> which does not contain anything that lets you reconstruct the title
> for autosuggest.
>
> Best
> Erick
>



Re: Wildcard query vs facet.prefix for autocomplete?

2012-07-18 Thread santamaria2
Very interesting! Thanks for sharing, I'll ponder on it.



Designing an index with multiple entity types, sharing field names across entity-types.

2012-08-08 Thread santamaria2
My question stems from a vague memory of reading somewhere that Solr's search
performance depends on the total number of 'terms' in a field that is
searched upon.

I'm setting up an index core for some autocomplete boxes on my site. There
is a search box for each facet group in my results page (suggestions for a
single entity-type), and a 'generic' search box on my header that will
display suggestions for multiple entity-types.

The entity types are: Books, Authors, Categories, Publishers.

Books, Authors --> over 100,000 of each type right now. Will grow larger.
Categories, Publishers --> around 500 of each type. Will grow slowly.

Books & Categories have 'descriptions' which I also want searchable -- with
lower boosts.

In my per-entity search boxes, for autocomplete suggestions for user input
"man", I'd do:
q=(name:man* OR description:man*^0.5)&fq=type:


For my generic search box on top of my page, I would not have fq, but
instead I'd use &group=true&group.field=type.
(type --> {'book', 'author', 'category', 'publisher'})
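
The two request shapes described above could be built by one helper, roughly as follows. This is a sketch under assumptions: `suggest_params` is a made-up name, the `entity_type` argument stands in for whatever value the per-entity box would pass to fq, and the user input is not escaped for query-parser special characters.

```python
from urllib.parse import urlencode

ENTITY_TYPES = ("book", "author", "category", "publisher")

def suggest_params(user_input, entity_type=None):
    """Build autocomplete parameters: filtered by type for a per-entity
    box, grouped by type for the generic box (hypothetical helper)."""
    prefix = user_input.strip()
    params = [("q", f"(name:{prefix}* OR description:{prefix}*^0.5)")]
    if entity_type is not None:
        if entity_type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {entity_type}")
        # Per-entity box: restrict suggestions to one type.
        params.append(("fq", f"type:{entity_type}"))
    else:
        # Generic box: one group of suggestions per type.
        params.append(("group", "true"))
        params.append(("group.field", "type"))
    return params

print(urlencode(suggest_params("man", "author")))
print(urlencode(suggest_params("man")))
```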

This seems okay, but I'm just wondering about what I said in my first
paragraph. The number of total terms of a field.

For a large index, would it be better to use more specific fields?
eg. Instead of a common field 'name', what if I do 'author_name',
'book_name', 'publisher_name', 'category_name', 'book_description',
'category_description'?

Would this be 'faster' to search on?
For my per-entity search boxes, the query changes in an obvious manner. But
this would complicate stuff for my generic-search-box query... for which I
haven't decided on how I'd go about designing a query, yet.

What say thee?





Re: Designing an index with multiple entity types, sharing field names across entity-types.

2012-08-08 Thread santamaria2
To clarify a wee bit more: I'm wondering about the performance impact on
single-entity queries if I use common field names,
eg. a 'name' field for all entity types. 'Author' & 'Book' together account
for 200,000+ 'name' values. Will this affect anything if I search over
'Category'? Will using fq=type:category save me?





Re: Designing an index with multiple entity types, sharing field names across entity-types.

2012-08-08 Thread santamaria2
*civilized bump*


