Re: Facet search concept problem

2010-02-14 Thread Erik Hatcher
You didn't mention what your faceting parameters were, but what you
want to do is add a field to every document that specifies its
source. So add a "source" field ("type" may be a better field name)
specifying "news", "article", or "blog", and then facet on that new field.


Erik



On Feb 13, 2010, at 11:45 PM, Ranveer Kumar wrote:


Hi All,

I am still not clear on the concept of facet search.

I am trying to search using a facet query. I am indexing data from three
tables; the details of the tables are as follows:

table name: news
news_id
news_details

table name : article
article_id
article_details

table name: blog
blog_id
blog_details

I am indexing the above tables into the following fields:
id
news_id
news_details
article_id
article_details
blog_id
blog_details

Now, when a user searches for "soccer game" and the search matches
news (5), article (4), and blog (2),
I want the listing to be:
news(5)
article(4)
blog(2)

Currently the facet listing looks like:
soccer(5)
game(6)

Please help me.
Thanks




Re: problem with edgengramtokenfilter and highlighter

2010-02-14 Thread Joe Calderon

LUCENE-2266 filed and patch posted.
On 02/13/2010 09:14 PM, Robert Muir wrote:

Joe, can you open a Lucene JIRA issue for this?

I just glanced at the code and it looks like a bug to me.

On Sun, Feb 14, 2010 at 12:07 AM, Joe Calderon wrote:

I ran into a problem while using the EdgeNGramTokenFilter: it seems to
report incorrect offsets when generating tokens. More specifically, all
the tokens have offset 0 and the term length as start and end, which leads
to goofy highlighting behavior when creating edge grams for tokens
beyond the first one. I created a small patch that takes into account
the start of the original token and adds that to the reported
start/end offsets.




Re: problem with edgengramtokenfilter and highlighter

2010-02-14 Thread Robert Muir
thanks Joe, good catch!

On Sun, Feb 14, 2010 at 2:43 PM, Joe Calderon wrote:

> LUCENE-2266 filed and patch posted.


-- 
Robert Muir
rcm...@gmail.com
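
For readers running into the same thing, the setup in question is an index-time analysis chain that produces edge n-grams on a field that is also highlighted. An illustrative field type (a sketch of such a setup, not the original poster's configuration):

  <fieldType name="text_edge" class="solr.TextField">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

With the unpatched filter, the grams of every token reported offsets starting at 0, which is why the highlights land in the wrong place for tokens beyond the first.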


too often delta imports performance effect

2010-02-14 Thread adeelmahmood

We are trying to set up Solr for a website where data gets updated pretty
frequently, and I want those changes reflected in the Solr indexes sooner
than nightly delta-imports, so I am thinking we will probably want to set it up
to have delta imports running every 15 minutes or so. Solr search will obviously
be in use while this is going on.

First of all, does Solr work well with adding new data or updating existing data
while people are doing searches in it?
Secondly, are these delta imports going to cause any significant performance
degradation in Solr search?

Any help is appreciated.
-- 
View this message in context: 
http://old.nabble.com/too-often-delta-imports-performance-effect-tp27587778p27587778.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: too often delta imports performance effect

2010-02-14 Thread Jan Høydahl / Cominvent
Hi,

This all depends on actual volumes, HW, architecture etc.
What exactly is "pretty frequently", how many document updates/adds per 15 
minutes?

Solr is designed to be able to do indexing and search in parallel, so you don't 
need to fear this, unless you are already pushing the limits of what your setup 
can handle. The best way to go is to start out and then optimize when you see 
bottlenecks.

Here is a pointer to the wiki page about indexing performance:
http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
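
For a 15-minute schedule, a common setup is to define delta queries in the DataImportHandler config and have cron (or another scheduler) hit /dataimport?command=delta-import. A minimal sketch, with placeholder entity, table, and column names:

  <entity name="item" pk="id"
          query="SELECT id, title, body FROM item"
          deltaQuery="SELECT id FROM item
                      WHERE last_modified &gt; '${dataimporter.last_index_time}'"
          deltaImportQuery="SELECT id, title, body FROM item
                            WHERE id = '${dataimporter.delta.id}'"/>

The delta-import command commits when it finishes (commit=true is the default), and that commit is what makes the new documents visible to searchers.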

--
Jan Høydahl  - search architect
Cominvent AS - www.cominvent.com




Re: too often delta imports performance effect

2010-02-14 Thread adeelmahmood

Thank you, that helps. Actually it's not that many updates: close to 10
fields, and maybe 50 document updates per 15 minutes. So am I right that by
handling indexing and searching in parallel you mean that while it is updating
some data, it will continue to show the old data until the new data has been
finalized (committed), or something like that?
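
For context: searchers keep serving the last committed view of the index, so updates become visible only after a commit (which a delta-import issues by default). If you would rather control this centrally, solrconfig.xml has an autoCommit section; a minimal sketch with placeholder thresholds:

  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <maxDocs>1000</maxDocs>
      <maxTime>60000</maxTime> <!-- milliseconds -->
    </autoCommit>
  </updateHandler>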



-- 
View this message in context: 
http://old.nabble.com/too-often-delta-imports-performance-effect-tp27587778p27588472.html
Sent from the Solr - User mailing list archive at Nabble.com.



schema design - catch all field question

2010-02-14 Thread adeelmahmood

If this is my schema:

  [field definitions stripped by the list archive]

with this one being the catch-all field:

  [content field definition stripped by the list archive]

and I am copying all the fields into the content field.

My question is: what if, instead of that, I change the title field to be
text as well and don't copy it into the content field, but still copy everything
else (all the string fields) to the content field? Exactly what difference will
that make?
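
Since the schema snippet was stripped by the archive, here is an illustrative reconstruction of the layout being described (only title and content are named above; the other field names and the exact types are assumptions):

  <field name="id" type="string" indexed="true" stored="true" required="true"/>
  <field name="title" type="string" indexed="true" stored="true"/>
  <field name="author" type="string" indexed="true" stored="true"/>
  <field name="category" type="string" indexed="true" stored="true"/>
  <field name="content" type="text" indexed="true" stored="false" multiValued="true"/>

  <copyField source="title" dest="content"/>
  <copyField source="author" dest="content"/>
  <copyField source="category" dest="content"/>

In the variant described, title would become type="text" and its copyField line would be dropped: title words would then no longer match through the content catch-all and would have to be searched (or boosted) as their own field, while the remaining string fields stay exact-match on their own but tokenized through the content copy.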

-- 
View this message in context: 
http://old.nabble.com/schema-design---catch-all-field-question-tp27588936p27588936.html
Sent from the Solr - User mailing list archive at Nabble.com.



Question on Index Replication

2010-02-14 Thread abhishes

Hello All,

Upon reading the article 

http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr

I have a question about index replication.

If the query load is very high and I want multiple servers to be able to
search the index, can multiple servers share one read-only copy of the
index?

That is, one server (the master) builds the index and stores it on a SAN, and
multiple slave servers point to that same copy of the data and answer user
queries.

In the replication diagram, I see that the index is copied to each of
the slave servers.

That is not desirable here, because the index is read-only for the slave servers
(only the master updates it), copying indexes can take a very long time
depending on index size, and it unnecessarily wastes disk space.
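
What the poster describes, in contrast to the replication in the article, would be pointing every slave's solrconfig.xml at the same index directory on the SAN, roughly (path is a placeholder):

  <dataDir>/mnt/san/solr/data</dataDir>

Treat that as a sketch of the idea rather than a recommended setup: the slaves only see newly indexed data after they reopen their searchers, and the master must not delete old segment files while a slave still has them open, issues that the copy-based replication in the article side-steps by giving each slave its own files.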
-- 
View this message in context: 
http://old.nabble.com/Question-on-Index-Replication-tp27590418p27590418.html
Sent from the Solr - User mailing list archive at Nabble.com.