Re: Facet search concept problem
You didn't mention what your faceting parameters were, but what you want to do is add a field to every document that specifies its source. So add a "source" field (or "type" may be a better field name) specifying "news", "article", or "blog", then facet on that new field.

Erik

On Feb 13, 2010, at 11:45 PM, Ranveer Kumar wrote:
> Hi All,
> My concept of facet search is still not clear. I am trying to search using a facet query. I am indexing data from three tables; here are the details:
>
> table name: news
>   news_id
>   news_details
> table name: article
>   article_id
>   article_details
> table name: blog
>   blog_id
>   blog_details
>
> I am indexing the above tables with the fields: id, news_id, news_details, article_id, article_details, blog_id, blog_details.
>
> Now, when a user searches for "soccer game" and the search matches news (5), article (4), and blog (2), the facet list should look like:
>   news (5)
>   article (4)
>   blog (2)
>
> Currently the facet listing looks like:
>   soccer (5)
>   game (6)
>
> Please help me. Thanks.
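A minimal sketch of what Erik describes, assuming a string field named "source" (the field name, type name, and host in the URL are illustrative):

```xml
<!-- schema.xml: record which table each document came from -->
<field name="source" type="string" indexed="true" stored="true"/>
```

Each document then gets source=news, source=article, or source=blog at index time, and the query enables faceting on that field, e.g. http://localhost:8983/solr/select?q=soccer+game&facet=true&facet.field=source which returns one facet count per source value rather than per term.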
Re: problem with edgengramtokenfilter and highlighter
LUCENE-2266 filed and patch posted.

On 02/13/2010 09:14 PM, Robert Muir wrote:
> Joe, can you open a Lucene JIRA issue for this? I just glanced at the code and it looks like a bug to me.
>
> On Sun, Feb 14, 2010 at 12:07 AM, Joe Calderon wrote:
>> I ran into a problem while using the EdgeNGramTokenFilter: it seems to report incorrect offsets when generating tokens. More specifically, all the tokens have start offset 0 and the term length as the end offset, which leads to goofy highlighting behavior when creating edge grams for tokens beyond the first one. I created a small patch that takes the start of the original token into account and adds it to the reported start/end offsets.
Re: problem with edgengramtokenfilter and highlighter
Thanks Joe, good catch!

On Sun, Feb 14, 2010 at 2:43 PM, Joe Calderon wrote:
> LUCENE-2266 filed and patch posted.

--
Robert Muir
rcm...@gmail.com
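The offset fix Joe describes can be sketched outside Lucene like this; a small illustrative Python model of edge n-gram offsets, not the actual patch (the function name and parameters are made up):

```python
def edge_ngrams_with_offsets(token, token_start, min_gram=1, max_gram=5):
    """Return (gram, start_offset, end_offset) tuples for the edge
    n-grams of `token`, which begins at `token_start` in the text."""
    grams = []
    for n in range(min_gram, min(max_gram, len(token)) + 1):
        # The buggy behavior reported offsets relative to the token
        # itself, i.e. (0, n); the fix adds the token's own start
        # offset so the highlighter marks the right characters.
        grams.append((token[:n], token_start, token_start + n))
    return grams

# For the second token of "soccer game" ("game" starts at offset 7):
print(edge_ngrams_with_offsets("game", 7, 1, 3))
# → [('g', 7, 8), ('ga', 7, 9), ('gam', 7, 10)]
```

Without the token-start correction, every gram of "game" would claim offsets starting at 0 and the highlighter would mark characters inside "soccer" instead.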
too often delta imports performance effect
We are trying to set up Solr for a website where data gets updated pretty frequently, and I want those changes reflected in the Solr indexes sooner than nightly delta-imports, so I am thinking we will probably want to have delta imports running every 15 minutes or so. Solr search will obviously be in use while this is going on. First of all, does Solr work well with adding new data or updating existing data while people are doing searches in it? Secondly, are these delta imports going to cause any significant performance degradation in Solr search? Any help is appreciated.
--
View this message in context: http://old.nabble.com/too-often-delta-imports-performance-effect-tp27587778p27587778.html
Sent from the Solr - User mailing list archive at Nabble.com.
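One common way to get the 15-minute schedule described above is a cron job that hits the DataImportHandler; this sketch assumes DIH is registered at a /dataimport request handler on localhost (host, port, and handler path are illustrative and depend on your solrconfig.xml):

```
# crontab entry: trigger a DIH delta-import every 15 minutes
*/15 * * * * curl -s 'http://localhost:8983/solr/dataimport?command=delta-import' > /dev/null
```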
Re: too often delta imports performance effect
Hi,

This all depends on actual volumes, HW, architecture etc. What exactly is "pretty frequently", i.e. how many document updates/adds per 15 minutes?

Solr is designed to be able to do indexing and search in parallel, so you don't need to fear this, unless you are already pushing the limits of what your setup can handle. The best way to go is to start out and then optimize when you see bottlenecks.

Here is a pointer to the wiki about indexing performance:
http://wiki.apache.org/lucene-java/ImproveIndexingSpeed

--
Jan Høydahl - search architect
Cominvent AS - www.cominvent.com

On 14. feb. 2010, at 23.56, adeelmahmood wrote:
> we are trying to setup solr for a website where data gets updated pretty frequently [...] are these delta imports gonna cause any significant performance degradation in solr search
Re: too often delta imports performance effect
Thank you, that helps. Actually it's not that many updates: close to 10 fields, and maybe 50 document updates per 15 minutes. So by handling indexing and searching in parallel, do you mean that while it is updating some data, it will continue to show the old data until the new data has been finalized (committed), or something like that?

Jan Høydahl / Cominvent wrote:
> Solr is designed to be able to do indexing and search in parallel, so you don't need to fear this, unless you are already pushing the limits of what your setup can handle.
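The behavior the poster is guessing at is indeed how Solr works: searchers keep serving the last committed view of the index until a commit opens a new searcher with the new data. A hedged solrconfig.xml sketch of automatic commits (the thresholds are illustrative placeholders, not recommendations):

```xml
<!-- solrconfig.xml: commit pending documents automatically -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>1000</maxDocs>    <!-- commit after this many added docs -->
    <maxTime>60000</maxTime>   <!-- or after this many milliseconds -->
  </autoCommit>
</updateHandler>
```

Until one of these thresholds triggers a commit (or a commit is issued explicitly), queries continue to run against the previously committed index.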
schema design - catch all field question
If this is my schema, with this one being the catch-all field, and I am copying all fields into the content field: my question is, what if instead of that I change the title field to be text as well, and don't copy it into the content field, but still copy everything else (all string fields) into the content field? Exactly what difference will that make?
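The poster's actual schema appears to have been stripped from the archive, but a typical catch-all arrangement looks something like this (all field and type names here are illustrative, not recovered from the original message):

```xml
<field name="title" type="string" indexed="true" stored="true"/>
<field name="content" type="text" indexed="true" stored="false" multiValued="true"/>
<copyField source="title" dest="content"/>
```

Making title its own text (analyzed) field means queries can match and score against the title directly, typically with a field-length boost, whereas dropping its copyField means searches that hit only the content catch-all will no longer match title terms unless the query also targets the title field.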
Question on Index Replication
Hello All,

Upon reading the article http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr I have a question about index replication. If the query load is very high and I want multiple servers to be able to search the index, can multiple servers share one read-only copy of the index? That is, one server (the master) builds the index, which is stored on a SAN, and multiple slave servers point to the same copy of the data and answer user queries. In the replication diagram, I see that the index is copied onto each of the slave servers. This is not desirable, because the index is read-only (for the slave servers, since only the master updates it), and copying indexes can take very long (depending on index size) and can unnecessarily waste disk space.