date:20110420

Re: How could each core share configuration files

2011-04-20 Thread lboutros

Perhaps this could help : http://lucene.472066.n3.nabble.com/Shared-conf-td2787771.html#a2789447 Ludovic. 2011/4/20 kun xiong [via Lucene] < ml-node+2841801-1701787156-383...@n3.nabble.com> > Hi all, > > Currently in my project , most of the core configurations are > same(solrconfig.xml, dataim

RE: Custom Sorting

2011-04-20 Thread Michael Owen

Ok, thank you for the discussion. As I thought, it is not possible within performance limits. I think the way to go is to document some more stats at index time, and use them in boost queries. :) Thanks Mike > Date: Tue, 19 Apr 2011 15:12:00 -0400 > Subject: Re: Custom Sorting > From: ericker

Re: TikaEntityProcessor

2011-04-20 Thread firdous_kind86

hi, i asked that :) didnt get that.. what dependencies? i am using solr 1.4 and tika 0.9 i replaced tika-core 0.9 and tika-parsers 0.9 at /contrib/extraction/lib also replaced old version of dataimporthandler-extras by apache-solr-dataimporthandler-extras-3.1.0.jar but still same problem.. som

Selecting (and sorting!) by the min/max value from multiple fields

2011-04-20 Thread jmaslac

Hello, short question is this - is there a way for a search to return a field that is not defined in the schema but is a minimal/maximum value of several (int/float) fields in solrDocument? (and how would that search look like?) Longer explanation. I have products and each of them can have a seve

Re: Selecting (and sorting!) by the min/max value from multiple fields

2011-04-20 Thread Tanguy Moal

Hello, Have you tried reading : http://wiki.apache.org/solr/FunctionQuery#Sort_By_Function From that page I would try something like : http://host:port/solr/select?q=sony&sort=min(min(priceCash,priceCreditCard),priceCoupon)+asc&rows=10&indent=on&debugQuery=on Is that of any help ? -- Tanguy
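The request suggested in this reply can also be assembled programmatically. What follows is only a minimal sketch: the host and port are placeholders, `nested_min` and `build_min_sort_url` are hypothetical helper names rather than part of any Solr client, and the whole approach only helps on a Solr version that actually provides the `min()` function.

```python
from functools import reduce
from urllib.parse import urlencode


def nested_min(fields):
    """Fold field names into nested two-argument min() calls."""
    return reduce(lambda acc, field: f"min({acc},{field})", fields)


def build_min_sort_url(base, query, fields, rows=10):
    """Build a select URL sorted ascending by the smallest of `fields`."""
    params = {
        "q": query,
        "sort": f"{nested_min(fields)} asc",
        "rows": rows,
        "indent": "on",
        "debugQuery": "on",
    }
    return f"{base}/select?{urlencode(params)}"


url = build_min_sort_url(
    "http://localhost:8983/solr",  # hypothetical host and port
    "sony",
    ["priceCash", "priceCreditCard", "priceCoupon"],
)
print(url)
```

Nesting two-argument calls mirrors the URL in the reply, so adding a fourth price field only means appending it to the list.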

Saravanan Chinnadurai/Actionimages is out of the office.

2011-04-20 Thread Saravanan . Chinnadurai

I will be out of the office starting 20/04/2011 and will not return until 21/04/2011. Please email to itsta...@actionimages.com for any urgent issues. Action Images is a division of Reuters Limited and your data will therefore be protected in accordance with the Reuters Group Privacy / Data P

RE: How could each core share configuration files

2011-04-20 Thread Ephraim Ofir

I just use soft-links... Ephraim Ofir -Original Message- From: lboutros [mailto:boutr...@gmail.com] Sent: Wednesday, April 20, 2011 10:09 AM To: solr-user@lucene.apache.org Subject: Re: How could each core share configuration files Perhaps this could help : http://lucene.472066.n3.nabb
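The soft-link idea can be sketched in a few lines. This is an illustration under assumptions, not the poster's actual setup: the directory names (`shared_conf`, `core0`, `core1`) are made up, and linking the whole `conf` directory rather than individual files is one possible choice among several.

```python
import os
import tempfile


def link_shared_conf(shared_conf, core_dirs):
    """Point each core's conf directory at one shared directory via a symlink."""
    for core_dir in core_dirs:
        os.makedirs(core_dir, exist_ok=True)
        link = os.path.join(core_dir, "conf")
        if not os.path.lexists(link):
            os.symlink(shared_conf, link, target_is_directory=True)


# Demo in a throwaway directory; real paths depend on your Solr home.
root = tempfile.mkdtemp()
shared = os.path.join(root, "shared_conf")
os.makedirs(shared)
with open(os.path.join(shared, "solrconfig.xml"), "w") as fh:
    fh.write("<config/>")

cores = [os.path.join(root, name) for name in ("core0", "core1")]
link_shared_conf(shared, cores)
print([os.path.islink(os.path.join(c, "conf")) for c in cores])
```

With this layout an edit to the shared `solrconfig.xml` is visible to every core, which is the point of the approach; cores that need their own variant would need a real directory instead of a link.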

Re: Selecting (and sorting!) by the min/max value from multiple fields

2011-04-20 Thread jmaslac

Tanguy, thanks for the answer. Yes I have already tried that but the problem is that min() function is not yet available (it is set for Solr 3.2). :( Btw. in my original post I've asked if the query could in the results return a new field with this computed minimal value - that is redundant, I'm

Re: KStemmer for Solr 3.x +

2011-04-20 Thread Ofer Fort

Seems like it isn't. In my installation (1.4.1) i used LucidKStemFilterFactory, and when switching the solr.war file to the 3.1 one i get: 14:42:31.664 ERROR [pool-1-thread-1]: java.lang.AbstractMethodError: org.apache.lucene.analysis.TokenStream.incrementToken()Z at org.apache.lucene.analy

Re: old searchers not closing after optimize or replication

2011-04-20 Thread Erick Erickson

Does this persist? In other words, if you just watch it for some time, does the disk usage go back to normal? Because it's typical that your index size will temporarily spike after the operations you describe as new searchers are warmed up. During that interval, both the old and new searchers are

Re: old searchers not closing after optimize or replication

2011-04-20 Thread Bernd Fehling

Hi Erik, On 20.04.2011 13:56, Erick Erickson wrote: Does this persist? In other words, if you just watch it for some time, does the disk usage go back to normal? Only after restarting the whole Solr does the disk usage go back to normal. Because it's typical that your index size will tempora

Re: old searchers not closing after optimize or replication

2011-04-20 Thread Erick Erickson

H, this isn't right. You've pretty much eliminated the obvious things. What does lsof show? I'm assuming it shows the files are being held open by your Solr instance, but it's worth checking. I'm not getting the same behavior, admittedly on a Windows box. The only other thing I can think of is

Solr - Multi Term highlighting issue

2011-04-20 Thread Ramanathapuram, Rajesh

Hello, I am dealing with a highlighting issue in SOLR, I will try to explain the issue. When I search for a single term in solr, it wraps tag around the words I want to highlight, all works well. But if I search multiple term, for most part highlighting works good and then for some of the terms,

Re: old searchers not closing after optimize or replication

2011-04-20 Thread Bernd Fehling

Hi Erik, On 20.04.2011 15:42, Erick Erickson wrote: H, this isn't right. You've pretty much eliminated the obvious things. What does lsof show? I'm assuming it shows the files are being held open by your Solr instance, but it's worth checking. Just committed new content 3 times and finall

Re: TikaEntityProcessor

2011-04-20 Thread Andreas Kemkes

I went unsuccessfully down this path - too many incompatibilities among versions - some code changes and recompiling required. See also thread "Solr 1.4.1 and Tika 0.9 - some tests not passing" for remaining issues. You'll have better luck with the newer Solr 3.1 release, which already uses T

Re: TikaEntityProcessor

2011-04-20 Thread firdous_kind86

after reading this post i hoped that i could achieve.. but couldnt find any success in almost a week http://lucene.472066.n3.nabble.com/TikaEntityProcessor-not-working-td856965.html#a867572 -- View this message in context: http://lucene.472066.n3.nabble.com/TikaEntityProcessor-tp2839188p2843084.

Multiple Tags and Facets

2011-04-20 Thread Em

Hello, I watched an online video with Chris Hostetter from Lucid Imagination. He showed the possibility of having some Facets that exclude *all* filters while also having some Facets that take care of some of the set filters while ignoring other filters. Unfortunately the Webinar did not explain h

Re: old searchers not closing after optimize or replication

2011-04-20 Thread Erick Erickson

It looks OK, but still doesn't explain keeping the old files around. What is your in your solrconfig.xml look like? It's possible that you're seeing Solr attempt to keep around several optimized copies of the index, but that still doesn't explain why restarting Solr removes them unless the deletio

Re: Solr - Multi Term highlighting issue

2011-04-20 Thread Erick Erickson

Does your configuration have "hl.mergeContiguous" set to true by any chance? And what happens if you explicitly set this to "false" on your query? Best Erick On Wed, Apr 20, 2011 at 9:43 AM, Ramanathapuram, Rajesh wrote: > Hello, > > I am dealing with a highlighting issue in SOLR, I will try to

HTMLStripCharFilterFactory, highlighting and InvalidTokenOffsetsException

2011-04-20 Thread Robert Gründler

Hi all, i'm getting the following exception when using highlighting for a field containing HTMLStripCharFilterFactory: org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token ... exceeds length of provided text sized 21 It seems this is a known issue: https://issues.apache.or

Re: Creating a TrieDateField (and other Trie fields) from Lucene Java

2011-04-20 Thread Yonik Seeley

On Tue, Apr 19, 2011 at 11:17 PM, Craig Stires wrote: > The barrier I have is that I need to build this offline (without using a > solr server, solrconfig.xml, or schema.xml) This is pretty unusual... can you share your use case? Solr can also be run in embedded mode if you can't run a stand-alon

Re: HTMLStripCharFilterFactory, highlighting and InvalidTokenOffsetsException

2011-04-20 Thread Robert Muir

Hi, there is a proposed patch uploaded to the issue. Maybe you can help by reviewing/testing it? 2011/4/20 Robert Gründler : > Hi all, > > i'm getting the following exception when using highlighting for a field > containing HTMLStripCharFilterFactory: > > org.apache.lucene.search.highlight.Invalid

stemming filter analyzers, any favorites?

2011-04-20 Thread Robert Petersen

Stemming filter analyzers... anyone have any favorites for particular search domains? Just wondering what people are using. I'm using Lucid K Stemmer and having issues. Seems like it misses a lot of common stems. We went to that because of excessively loose matches on the solr.PorterStemFilter

Re: stemming filter analyzers, any favorites?

2011-04-20 Thread Erick Erickson

You can get a better sense of exactly what transformations occur when if you look at the analysis page (be sure to check the "verbose" checkbox). I'm surprised that "bags" doesn't match "bag", what does the analysis page say? Best Erick On Wed, Apr 20, 2011 at 1:44 PM, Robert Petersen wrote: > S

Bug in solr.KeywordMarkerFilterFactory?

2011-04-20 Thread Demian Katz

I've just started experimenting with the solr.KeywordMarkerFilterFactory in Solr 3.1, and I'm seeing some strange behavior. It seems that every word subsequent to a protected word is also treated as being protected. For testing purposes, I have put the word "spelling" in my protwords.txt. If I

Re: Bug in solr.KeywordMarkerFilterFactory?

2011-04-20 Thread Yonik Seeley

On Wed, Apr 20, 2011 at 2:01 PM, Demian Katz wrote: > I've just started experimenting with the solr.KeywordMarkerFilterFactory in > Solr 3.1, and I'm seeing some strange behavior. It seems that every word > subsequent to a protected word is also treated as being protected. You're right! This

RE: Solr - Multi Term highlighting issue

2011-04-20 Thread Ramanathapuram, Rajesh

Thanks Erick. I tried your suggestion, the issue still exists. http://localhost:8983/searchsolr/mainCore/select?indent=on&version=2.2&q=mec+us+chile&fq=storyid%3DXXX%22&start=0&rows=10&fl=*&qt=standard&wt=standard&explainOther=&hl=on&hl.fl=story%2C+slug&hl.fragsize=10&hl.highlightMultiTe

Re: Bug in solr.KeywordMarkerFilterFactory?

2011-04-20 Thread Robert Muir

No, this is only a bug in analysis.jsp. you can see this by comparing analysis.jsp's "dontstems bees" to using the query debug interface: "dontstems bees" "dontstems bees" PhraseQuery(text:"dontstems bee") text:"dontstems bee" On Wed, Apr 20, 2011 at 2:43 PM, Yonik Seeley wrote: > On We

RE: Bug in solr.KeywordMarkerFilterFactory?

2011-04-20 Thread Demian Katz

That's good news -- thanks for the help (not to mention the reassurance that Solr itself is actually working right)! Hopefully 3.1.1 won't be too far off, though; when the analysis tool lies, life can get very confusing! :-) - Demian > -Original Message- > From: Robert Muir [mailto:rcm

Re: ConcurrentLRUCache$Stats error

2011-04-20 Thread Chris Hostetter

: https://issues.apache.org/jira/browse/SOLR-1797 that issue doesn't seem to have anything to do with the stack trace reported... : > SEVERE: java.util.concurrent.ExecutionException: : > java.lang.NoSuchMethodError: : > org.apache.solr.common.util.ConcurrentLRUCache$Stats.add(Lorg/apache/solr/c

RE: stemming filter analyzers, any favorites?

2011-04-20 Thread Robert Petersen

I have been doing that, and for Bags example the trailing 's' is not being removed by the Kstemmer so if indexing the word bags and searching on bag you get no matches. Why wouldn't the trailing 's' get stemmed off? Kstemmer is dictionary based so bags isn't in the dictionary? That trailing

entity name issue

2011-04-20 Thread tjtong

Hi guys, I have encountered a problem with entity name, see the data config code below. the variable '${ea.a_aid}' was always empty. I suspect it is a namespace issue. Anyone knows how to bypass it? This is on oracle database. I had to use the prefix "myschema.", otherwise, the table name was no

Highest frequency terms for a subset of documents

2011-04-20 Thread Ofer Fort

Hi, I am looking for the best way to find the terms with the highest frequency for a given subset of documents. (terms in the text field) My first thought was to do a count facet search , where the query defines the subset of documents and the facet.field is the text field, this gives me the result
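The facet approach described here amounts to a query that selects the subset and a facet on the text field. A rough sketch, assuming the JSON response writer with its default flat `[term, count, ...]` layout for facet fields; the helper names, the example query `user:42`, and the sample counts are invented for illustration.

```python
from urllib.parse import urlencode


def top_terms_params(subset_query, field="text", limit=20):
    """Query string asking only for facet counts on `field` within the subset."""
    return urlencode({
        "q": subset_query,      # defines the subset of documents
        "rows": 0,              # no documents needed, only the counts
        "wt": "json",
        "facet": "true",
        "facet.field": field,
        "facet.limit": limit,
        "facet.mincount": 1,
    })


def pair_facet_counts(flat):
    """Turn [term, count, term, count, ...] into [(term, count), ...]."""
    return list(zip(flat[0::2], flat[1::2]))


print(top_terms_params("user:42"))
# Made-up sample of the flat list found under facet_counts/facet_fields/text.
print(pair_facet_counts(["solr", 12, "lucene", 7]))
```

Setting `rows` to 0 keeps the response small; it does not change how much work the facet computation itself has to do.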

RE: Highest frequency terms for a subset of documents

2011-04-20 Thread Jonathan Rochkind

I think faceting is probably the best way to do that, indeed. It might be slow, but it's kind of set up for exactly that case, I can't imagine any other technique being faster -- there's stuff that has to be done to look up the info you want. BUT, I see your problem: don't use facet.method=en

Re: How to index MS SQL Server column with image type

2011-04-20 Thread Chris Hostetter

: Subject: How to index MS SQL Server column with image type : : Hi all, : : When I index a column(image type) of a table via * : http://localhost:8080/solr/dataimport?command=full-import* : *There is a error like this: String length must be a multiple of four.* For future reference: full error

Re: Highest frequency terms for a subset of documents

2011-04-20 Thread Ofer Fort

thanks, but that's what i started with, but it took an even longer time and threw this: Approaching too many values for UnInvertedField faceting on field 'text' : bucket size=15560140 Approaching too many values for UnInvertedField faceting on field 'text : bucket size=15619075 Exception during fac

Re: Highest frequency terms for a subset of documents

2011-04-20 Thread Ofer Fort

seems like the facet search is not all that suited for a full text field. ( http://search.lucidimagination.com/search/document/178f1a82ff19070c/solr_severe_error_when_doing_a_faceted_search#16562790cda76197 ) Maybe i should go another direction. I think that the HighFreqTerms approach, just not su

Re: Highest frequency terms for a subset of documents

2011-04-20 Thread Chris Hostetter

: thanks, but that's what i started with, but it took an even longer time and : threw this: : Approaching too many values for UnInvertedField faceting on field 'text' : : bucket size=15560140 : Approaching too many values for UnInvertedField faceting on field 'text : : bucket size=15619075 : Excep

Re: Highest frequency terms for a subset of documents

2011-04-20 Thread Yonik Seeley

On Wed, Apr 20, 2011 at 7:34 PM, Chris Hostetter wrote: > > : thanks, but that's what i started with, but it took an even longer time and > : threw this: > : Approaching too many values for UnInvertedField faceting on field 'text' : > : bucket size=15560140 > : Approaching too many values for UnIn

Re: Highest frequency terms for a subset of documents

2011-04-20 Thread Ofer Fort

Thanks but i've disabled the cache already, since my concern is speed and i'm willing to pay the price (memory), and my subset are not fixed. Does the facet search do any extra work that i don't need, that i might be able to disable (either by a flag or by a code change), Somehow i feel, or rather

Re: How to return score without using _val_

2011-04-20 Thread Yonik Seeley

On Tue, Apr 19, 2011 at 11:41 PM, Bill Bell wrote: > I would like to influence the score but I would rather not mess with the q= > field since I want the query to dismax for Q. > > Something like: > > fq={!type=dismax qf=$qqf v=$qspec}& > fq={!type=dismax qt=dismaxname v=$qname}& > q=_val_:"{!type

Re: Highest frequency terms for a subset of documents

2011-04-20 Thread Ofer Fort

BTW, i'm using solr 1.4.1, does 3.1 or 4.0 contain any performance improvements that will make a difference as far as facet search? thanks again Ofer On Thu, Apr 21, 2011 at 2:45 AM, Ofer Fort wrote: > Thanks > but i've disabled the cache already, since my concern is speed and i'm > willing to p

Re: Highest frequency terms for a subset of documents

2011-04-20 Thread Yonik Seeley

On Wed, Apr 20, 2011 at 7:45 PM, Ofer Fort wrote: > Thanks > but i've disabled the cache already, since my concern is speed and i'm > willing to pay the price (memory) Then you should not disable the cache. >, and my subset are not fixed. > Does the facet search do any extra work that i don't ne

Re: Highest frequency terms for a subset of documents

2011-04-20 Thread Ofer Fort

my documents are user entries, so i'm guessing they vary a lot. Tomorrow i'll try 3.1 and also 4.0, and see if they have an improvement. thanks guys! On Thu, Apr 21, 2011 at 3:02 AM, Yonik Seeley wrote: > On Wed, Apr 20, 2011 at 7:45 PM, Ofer Fort wrote: > > Thanks > > but i've disabled the cach

Solr - upgrade from 1.4.1 to 3.1 - finding AbstractSolrTestCase binaries - help please?

2011-04-20 Thread Bob Sandiford

HI, all. I'm working on upgrading from 1.4.1 to 3.1, and I'm having some troubles with some of the unit test code for our custom Filters. We wrote the tests to extend AbstractSolrTestCase, and I've been reading the thread about the test-harness elements not being present in the 3.1 distributab

RE: Creating a TrieDateField (and other Trie fields) from Lucene Java

2011-04-20 Thread Craig Stires

Hi Yonik, The limitations I need to work within, have to do with the index already being built as part of an existing process. Currently, the Solr server is in read-only mode and receives new indexes daily from a Java application. The Java app runs Lucene/Tika and is indexing resources within t

The issue of import data from database using Solr DIH

2011-04-20 Thread Kevin Xiang

Hi all, I am new to Solr. I am importing data from a database using DIH (Solr 1.4). One document is made up of two entities; every entity is a table in the database. For example: Table1 has 3 fields; Table2 has 4 fields. If all went well, the document would have 7 fields. But it has only 4 fields; it seems that Solr doesn't merge

Apache Spam Filter Blocking Messages

2011-04-20 Thread Trey Grainger

Hey (solr-user) Mailing list admins, I've tried replying to a thread multiple times tonight, and keep getting a bounce-back with this response: Technical details of permanent failure: Google tried to deliver your message, but it was rejected by the recipient domain. We recommend contacting the ot

Re: Apache Spam Filter Blocking Messages

2011-04-20 Thread Marvin Humphrey

On Thu, Apr 21, 2011 at 12:30:29AM -0400, Trey Grainger wrote: > (FREEMAIL_FROM,FS_REPLICA,HTML_MESSAGE,RCVD_IN_DNSWL_LOW,RFC_ABUSE_POST,SPF_PASS,T_TO_NO_BRKTS_FREEMAIL Note the "HTML_MESSAGE" in the list of things SpamAssassin didn't like. > Apparently I s

Need to create dynamic indexes based on different document workspaces

2011-04-20 Thread Gaurav Shingala

Hi, Is there a way to create different solr indexes for different categories? We have different document workspaces and ideally want each workspace to have its own solr index. Thanks, Gaurav

Re: Need to create dynamic indexes based on different document workspaces

2011-04-20 Thread Chandan Tamrakar

It depends on your application design how you want your index. There is a feature called Solr core: http://wiki.apache.org/solr/CoreAdmin You could still have a single index but a field to differentiate the items in the index. Thanks On Thu, Apr 21, 2011 at 10:55 AM, Gaurav Shingala < gaurav.shing
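The two options mentioned in this reply can be compared side by side. A minimal sketch with assumptions stated up front: the base URL is a placeholder, the `workspace` field is a hypothetical schema field, and the CoreAdmin call assumes an instance directory with its configuration already exists for the new core.

```python
from urllib.parse import urlencode

SOLR = "http://localhost:8983/solr"  # hypothetical base URL


def create_core_url(workspace):
    """CoreAdmin request that registers one core per workspace."""
    params = {"action": "CREATE", "name": workspace, "instanceDir": workspace}
    return f"{SOLR}/admin/cores?{urlencode(params)}"


def workspace_filter_url(workspace, query):
    """Single-index alternative: restrict results with a filter query."""
    params = {"q": query, "fq": f"workspace:{workspace}"}
    return f"{SOLR}/select?{urlencode(params)}"


print(create_core_url("legal"))
print(workspace_filter_url("legal", "contract"))
```

Separate cores keep each workspace's index isolated, while the single-index variant avoids managing many cores at the cost of tagging every document with its workspace.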

Re: old searchers not closing after optimize or replication

2011-04-20 Thread Bernd Fehling

Hi Erik, 1 0 Due to 44 minutes optimization time we do an optimization once a day during the night. I will try with a smaller index on my development system. Best regards, Bernd On 20.04.2011 17:50, Erick Erickson wrote: It looks OK, but still doesn't explain keeping the old files aroun
