Re: Question on replication

2010-11-22 Thread Shawn Heisey
On 11/22/2010 5:45 PM, Mark wrote: After I perform a delta-import on my master the slave replicates the whole index which can be quite time consuming. Is there any way for the slave to replicate only partials that have changed? Do I need to change some setting on master not to commit/optimize t

solr response xsd

2010-11-22 Thread Tri Nguyen
Hi, I'm trying to find the Solr response XSD. Is this it? https://issues.apache.org/jira/browse/SOLR-17 I basically want to know whether the data import passed or failed. I can get the XML string and search for "completed", but was wondering if I can use an XSD to parse the resp

Re:Re:Re: SnapPuller error : Unable to move index file

2010-11-22 Thread kafka0102
Does anyone care about the bug? At 2010-11-22 22:28:39, kafka0102 wrote: > Sorry for my unclear question. > My Solr version is 1.4.1, and I may have hit a Solr bug. > In my case, the index directory my slave is using is index.20101122031000. It was > generated at 2010-11-22 03:10:00 because of some r

Question on replication

2010-11-22 Thread Mark
After I perform a delta-import on my master the slave replicates the whole index which can be quite time consuming. Is there any way for the slave to replicate only partials that have changed? Do I need to change some setting on master not to commit/optimize to get this to work? Thanks

Re: SOLR and secure content

2010-11-22 Thread Savvas-Andreas Moysidis
maybe this older thread on Modeling Access Control might help: http://lucene.472066.n3.nabble.com/Modelling-Access-Control-td1756817.html#a1761482 Regards, -- Savvas On 22 November 2010 18:53, Jos Janssen wrote: > > Hi, > > We plan to make an application layer in PHP which will communicate to

Re: What tokenizer is good for breaking host names

2010-11-22 Thread Ahmet Arslan
> I have a "host" field in my documents which keeps the host > from which the page > was crawled, for example yahoo.com or sports.yahoo.com. I > want this field to > be searchable so if I search yahoo, I can find > sports.yahoo.com. > > I have used these tokenizers and they do not work: > >

Re: Shingles and Delimiter Help

2010-11-22 Thread Jessy Kate
fantastic, thanks! i'll update the release and keep my fingers crossed. many thanks for the speedy response. jessy On Mon, Nov 22, 2010 at 4:53 PM, Steven A Rowe wrote: > Hi Jessy, > > Several ShingleFilter(Factory) improvements, including the ability to > specify minShingleSize, were introduce

What tokenizer is good for breaking host names

2010-11-22 Thread sara motahari
Hello Solr community, I have a "host" field in my documents which keeps the host from which the page was crawled, for example yahoo.com or sports.yahoo.com. I want this field to be searchable so if I search yahoo, I can find sports.yahoo.com. I have used these tokenizers and they do not work:

RE: Shingles and Delimiter Help

2010-11-22 Thread Steven A Rowe
Hi Jessy, Several ShingleFilter(Factory) improvements, including the ability to specify minShingleSize, were introduced in Solr/Lucene 3.x, and so are not available in Solr 1.4.X/Lucene 2.9.X. (This is your #1 issue.) For details about the changes and when they were introduced: http://wiki

Shingles and Delimiter Help

2010-11-22 Thread Jessy Kate
Hello Solr community, I'm using Solr for an app to index documents, with shingles to index n-grams (right now 2- 3- and 4-grams). this is solr 1.4.1 with lucene 2.9.3. i'm having two challenges: 1. the shingles configuration is not respecting the lower limit set in the config file: I still see

Using WhitespaceTokenizer but still wanting to match when all fields are concatenated

2010-11-22 Thread Eric Caron
Problem: Indexed phrase: JetBlue Airlines. Ideal matching queries: jetblue, "jet blue", "jetblue airway", "jetblue company". I'd like to be able to use synonyms (to convert airway to airline), stopwords (to drop "company"), strip periods and use ASCII folding, and split on case. I'm close with the f

git repo for branch_3x + SOLR-1873 (Solr Cloud)

2010-11-22 Thread Jeremy Hinegardner
Hi all, I've done an initial backport of SOLR-1873 (Solr Cloud) to branch_3x. I will do merges from branch_3x periodically. Currently this passes all tests. https://github.com/collectiveintellect/lucene-solr/tree/branch_3x-cloud We need a stable Solr Cloud system and this was our best gu

Re: Facet - Range Query issue

2010-11-22 Thread Solr User
Erick, I solved the issue by adding an fq parameter to the query. Thank you so much for your reply. Thanks, Murali On Mon, Nov 22, 2010 at 1:51 PM, Erick Erickson wrote: > Well, without seeing the changes you made to the schema, it's hard to tell > much. > Also, could you define "not work"? What, e

Re: SOLR and secure content

2010-11-22 Thread Jos Janssen
Hi, We plan to make an application layer in PHP which will communicate with the Solr server. Direct calls will be made for administration purposes only. regards, jos -- View this message in context: http://lucene.472066.n3.nabble.com/SOLR-and-secure-content-tp1945028p1947970.html Sent fro

Re: Facet - Range Query issue

2010-11-22 Thread Erick Erickson
Well, without seeing the changes you made to the schema, it's hard to tell much. Also, could you define "not work"? What, exactly, fails to do what you expect? But the first question I have is "did you reindex after changing your schema?". And have you checked your index to verify that there valu

Facet - Range Query issue

2010-11-22 Thread Solr User
Hi, I am having an issue with querying and faceting. This was working fine earlier: /spell/?q=(sun) AND (pubyear:[1991 TO 2011])&rows=9&facet=true&facet.limit=-1&facet.mincount=1&facet.field=author&facet.field=pubyear&facet.field=format&facet.field=series&facet.field=season&facet.field=imprint&f

Re: Special Characters

2010-11-22 Thread Erick Erickson
Hmmm, good point on WordDelimiterFilterFactory. You're right, that should work. Although there'd still be a problem with J. R. R. never matching jrr. But that wouldn't be solved by Pattern either. I'd try to define the problem away ... good catch Erick On Mon, Nov 22, 2010 at 12:15 PM, Shawn

Re: Can a URL based datasource in DIH return non xml

2010-11-22 Thread lee carroll
Hi Erick, Thank you for the response. Just for completeness of the thread: I'm going to process the XHTML off-line. Another approach could be to set up a web service which DIH could call and which returned XML from an HTML parser. However, for my purposes it's just as easy to use curl and perl and then use

Re: SOLR and secure content

2010-11-22 Thread Savvas-Andreas Moysidis
Hi, Could you elaborate a bit more on how you access Solr? Are you making direct Solr calls or is the communication directed through an application layer? On 22 November 2010 11:05, Jos Janssen wrote: > > Hi, > > We are currently investigating how to set up a correct Solr server for our > goals.

Re: Special Characters

2010-11-22 Thread Shawn Heisey
On 11/22/2010 7:40 AM, Erick Erickson wrote: As I remember, PatternReplace... isn't in 1.4, so you'd have to move to 3.x or trunk. You could always write a custom class that did what you wanted, it's actually pretty easy. PatternReplaceCharFilterFactory isn't in 1.4, but PatternReplaceFilterFa

DisMaxQParserPlugin and Tokenization

2010-11-22 Thread jan.kurella
Hi, Using the SearchHandler with the defType="dismax" option enables the DisMaxQParserPlugin. From my investigation, it seems to tokenize just by whitespace, although looking in the code I could not find the place where this behavior is enforced. I only found that for each field the

SOLR and secure content

2010-11-22 Thread Jos Janssen
Hi, We are currently investigating how to set up a correct Solr server for our goals. The problem I'm running into is how to design the Solr setup so that we can check if a user is authenticated for viewing the document. Let me explain the situation. We have a website with some pages and documen

RE: passing arguments to analyzer/filter at runtime

2010-11-22 Thread jan.kurella
Hi, yes, this is one of the four options I am going to evaluate. Why your suggestion might be problematic: we have ca. 12 language-sensitive fields and support ca. 200 distinct languages = 2400 fields. A multifield/dismax query spanning 2400 fields might become problematic. We will go for this ap

Re: Problem with synonyms

2010-11-22 Thread Yonik Seeley
On Mon, Nov 22, 2010 at 10:29 AM, Yonik Seeley wrote: > On Sat, Nov 20, 2010 at 5:59 AM, sivaprasad > wrote: >> Even after expanding the synonyms I am unable to get the same results. > > What you are trying to do should work with index-time synonym expansion. > Just make sure to remove the syno

RE: Empty value/string matching

2010-11-22 Thread Bob Sandiford
One possibility to consider - if you really need documents with specifically empty or non-defined values (if that's not an oxymoron :)), and you have control over the values you send into the indexing, you could set a special value that means 'no value'. We've done that in a similar vein, using
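The "special value that means 'no value'" idea above could be sketched roughly as follows. This is only an illustration of the technique: the sentinel string `__NO_VALUE__` and the helper name `fill_missing` are hypothetical and do not come from the message, whose actual marker value is cut off.

```python
# Hypothetical sentinel; the value actually used in the message is not shown.
NO_VALUE = "__NO_VALUE__"


def fill_missing(doc, fields):
    """Return a copy of doc where empty or absent fields carry the sentinel.

    The copy is what would be sent to the indexer, so that documents
    without a value can later be found by querying field:__NO_VALUE__.
    """
    out = dict(doc)
    for name in fields:
        value = out.get(name)
        # Treat a missing key, None, and blank strings as "no value".
        if value is None or (isinstance(value, str) and value.strip() == ""):
            out[name] = NO_VALUE
    return out


indexed = fill_missing({"title": "Dune", "author": ""}, ["title", "author", "series"])
print(indexed)
```

The trade-off is that the sentinel must be filtered out again wherever field values are displayed or faceted.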

Re: Problem with synonyms

2010-11-22 Thread Yonik Seeley
On Sat, Nov 20, 2010 at 5:59 AM, sivaprasad wrote: > Even after expanding the synonyms I am unable to get the same results. What you are trying to do should work with index-time synonym expansion. Just make sure to remove the synonym filter at query time (or use a synonym filter w/o multi-word s

Re: Dismax - Boosting

2010-11-22 Thread Ahmet Arslan
> In the past we used /spell, and if there was no match > we used to get a > list of suggestions and then make another call > with the first > suggestion to get search results. After that we showed the user > both suggestions > for the spelling mistake and results of the first > suggestion.

Re: Problem with synonyms

2010-11-22 Thread sivaprasad
In the synonyms.txt file I have the synonyms below. ipod, i-pod, i pod If expand==false at index time, is it going to replace all the occurrences of "i-pod", "i pod" with "ipod"? -- View this message in context: http://lucene.472066.n3.nabble.com/Problem-with-synonyms-tp1905051p1946336
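As a rough model of the rule being asked about: for a comma-separated synonyms line, expand=false reduces every listed term to the first one, while expand=true maps each term to the whole group. The sketch below only simulates that mapping rule in plain Python; `parse_synonym_line` is a hypothetical helper, not a Solr API, and it ignores how the tokenizer in front of the filter splits "i pod" or "i-pod".

```python
def parse_synonym_line(line, expand):
    """Simulate the mapping produced by one comma-separated synonyms line."""
    terms = [term.strip() for term in line.split(",")]
    if expand:
        # Every term expands to all terms in the group.
        return {term: list(terms) for term in terms}
    # Every term is reduced to the first term in the group.
    return {term: [terms[0]] for term in terms}


reduced = parse_synonym_line("ipod, i-pod, i pod", expand=False)
expanded = parse_synonym_line("ipod, i-pod, i pod", expand=True)
print(reduced["i pod"], expanded["i pod"])
```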

Re: passing arguments to analyzer/filter at runtime

2010-11-22 Thread Markus Jelsma
Hi, I wouldn't use a multiValued field for this because then you would have the same analyzers (and possibly stemmers) for different languages. The usual method is to have fieldTypes for each language (en_text, de_text etc) and then create specific fields that map to them (en_content, de_co
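On the client side, the per-language field convention described above might look like the sketch below. It is an illustration only: the helpers `field_for` and `build_query` are hypothetical, and the `<lang>_content` naming is assumed from the en_content example in the message.

```python
def field_for(lang, base="content"):
    """Map a language code to its language-specific field name, e.g. en_content."""
    return f"{lang}_{base}"


def build_query(lang, terms):
    """Build a query string restricted to one language's field.

    Searching only that field means only that language's analyzer
    chain (stemmer, stopwords) is applied to the query terms.
    """
    return f"{field_for(lang)}:({terms})"


print(build_query("de", "haus garten"))
```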

passing arguments to analyzer/filter at runtime

2010-11-22 Thread jan.kurella
Hi, I’m trying to find a solution to search only in a given language. At index time the language is known for each string to be tokenized, so I would like to write a filter that prefixes each token according to its language. First question: how best to pass the language argument to the filter? I’m go

Re: Special Characters

2010-11-22 Thread Erick Erickson
As I remember, PatternReplace... isn't in 1.4, so you'd have to move to 3.x or trunk. You could always write a custom class that did what you wanted, it's actually pretty easy. Best Erick On Mon, Nov 22, 2010 at 8:37 AM, Solr User wrote: > Hi Eric, > > I use solr version 1.4.0 and below is my

Re:Re: SnapPuller error : Unable to move index file

2010-11-22 Thread kafka0102
Sorry for my unclear question. My Solr version is 1.4.1, and I may have hit a Solr bug. In my case, the index directory my slave is using is index.20101122031000. It was generated at 2010-11-22 03:10:00 for some reason (not important here). And at 2010-11-22 15:10:00, the slave got a replicatio

Jetwick Twitter Search now Open Source

2010-11-22 Thread Peter Karich
Jetwick is now available under the Apache 2 license: http://www.pannous.info/2010/11/jetwick-is-now-open-source/ Regards, Peter. PS: features http://www.pannous.info/products/jetwick-twitter-search/ installation https://github.com/karussell/Jetwick/wiki for devs http://karussell.wordpress.com/

Re: Spell-Check Component Functionality

2010-11-22 Thread Grant Ingersoll
On Nov 21, 2010, at 7:14 AM, rajini maski wrote: > If any one know articles or blog on solr spell-check component configuration > type..please let me know..solr-wiki not helping me solve maze.. Might be helpful: http://www.lucidimagination.com/blog/2010/08/31/getting-started-spell-checking-with

Re: How to write custom component

2010-11-22 Thread Grant Ingersoll
On Nov 22, 2010, at 6:21 AM, sivaprasad wrote: > > Hi, > > I want to write a custom component which will be invoked before the query > parser. The output of this component should go to the query parser. Probably best to start with http://wiki.apache.org/solr/SolrPlugins. Also, have a look at

Re: Special Characters

2010-11-22 Thread Solr User
Hi Eric, I use solr version 1.4.0 and below is my schema.xml It creates 3 tokens j r r tolkien works fine but not jrr tolkien. I will read about PatternReplaceCharFilterFactory and try it. Please let me know if I need to do anything differently. Thanks, Solr User On Mon,

Re: Special Characters

2010-11-22 Thread Erick Erickson
What version of Solr are you using? You can think about PatternReplaceCharFilterFactory if you're using the right version of Solr. But you have other problems than that. Let's claim you get the periods removed. Do you tokenize three tokens or one? I.e. jrr or j r r? In the latter case your sea

Re: sort desc and out of memory exception

2010-11-22 Thread Erick Erickson
Needmorecoffee That link should have been: http://wiki.apache.org/solr/UsingMailingLists Erick On Mon, Nov 22, 2010 at 8:03 AM, Erick Erickson wrote: > Peter's point is that sorting on a tokenized field is meaningless. Say you > index "e

Re: SnapPuller error : Unable to move index file

2010-11-22 Thread Erick Erickson
what op system are you on? what version of Solr? what filesystem? It's really hard to help without more information, you might want to review: http://wiki.apache.org/solr/UsingMailingLists Best Erick 2010/11/22 kafka0102 > my replication got errors like : > Unable to move index file from: > /h

Re: Phrase Search & Multiple Keywords with Double quotes

2010-11-22 Thread Erick Erickson
In general, just escape things. See: http://lucene.apache.org/java/2_4_0/queryparsersyntax.html#Escaping Special Characters But I have to say that you might want to consider carefully whether this is a good ide

Re: sort desc and out of memory exception

2010-11-22 Thread Erick Erickson
Peter's point is that sorting on a tokenized field is meaningless. Say you index "erick xu peter" and it's tokenized. You have three tokens: "erick", "xu", and "peter". What does sorting mean now? Should the document be in the e's? x's? p's? So if you're sorting on a tokenized field, trying to und
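The ambiguity described above can be seen in a few lines of Python. This is a sketch only: plain whitespace splitting stands in for a Solr analyzer, and the variable names are made up for the illustration.

```python
value = "erick xu peter"

# Whitespace splitting as a stand-in for a tokenizing analyzer.
tokens = value.split()

# Each token is an equally plausible sort key for the same document,
# so the document could sort under e, p, or x.
candidate_keys = sorted(tokens)

# An untokenized (string-like) field keeps the whole value as one key,
# which gives the document exactly one position in the sort order.
untokenized_key = value

print(candidate_keys, untokenized_key)
```

This is why a separate untokenized copy of the field is the usual choice for sorting.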

Re: Can a URL based datasource in DIH return non xml

2010-11-22 Thread Erick Erickson
DIH does some good stuff, but it doesn't handle bad input very robustly (actually, how could it intuit what "the right thing" is?). I'd consider SolrJ coupled with a "forgiving" HTML parser, e.g. http://sourceforge.net/projects/nekohtml/ Best Erick On Su

Special Characters

2010-11-22 Thread Solr User
Hi, I am searching for j.r.r. tolkien and getting results back but if I search for jrr I am not getting any results. Also not getting any results if I am searching for jrr tolkien. I am using AND as the default operator. The search results should work for both j.r.r. tolkien and jrr tolkien. Wha

Re: Dismax - Boosting

2010-11-22 Thread Solr User
Hi Ahmet, In the past we used /spell, and if there was no match we used to get a list of suggestions and then make another call with the first suggestion to get search results. After that we showed the user both the suggestions for the spelling mistake and the results of the first suggestion. I th

How to write custom component

2010-11-22 Thread sivaprasad
Hi, I want to write a custom component which will be invoked before the query parser. The output of this component should go to the query parser. How can I configure it in solrConfig.xml? How can I get a SynonymFilterFactory object programmatically? Please share your ideas. Regards, Siva -- Vie

SnapPuller error : Unable to move index file

2010-11-22 Thread kafka0102
My replication got errors like: Unable to move index file from: /home/data/tuba/search-index/eshequn.post.db_post/index.20101122034500/_21.frq to: /home/data/tuba/search-index/eshequn.post.db_post/index.20101122031000/_21.frq I looked at the log and found the last slave replication commit before t