Re: foreign characters equivalent in solr search

2009-02-18 Thread radarghost
It may take too long to wait for Solr 1.4. Is there any other solution for Solr 1.2? Anyway, thanks for the reply. Koji Sekiguchi-2 wrote: > > CharFilter will solve the problem, but it comes with Solr 1.4. > > https://issues.apache.org/jira/browse/SOLR-822 > > Koji > > AHMET ARSLAN wrote: >> I think best wa

Re: foreign characters equivalent in solr search

2009-02-18 Thread radarghost
Thanks, we will try that and post the results here, but it seems we may run into problems with the highlight function. Ahmet Arslan wrote: > > I think best way to do this is to modify > org.apache.lucene.index.memory.SynonymTokenFilter and employ this filter > index time. > > if token.termBuffer() has o

Re: Unified search of relational data on Solr?

2009-02-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
Do you wish to search on the image names, or do you only wish to read the image details? --Noble On Thu, Feb 19, 2009 at 12:31 PM, Kalidoss MM wrote: > Even in my case, we can't flatten it, because we are managing total image > gallery information in Solr, so an image gallery contains around

Re: Unified search of relational data on Solr?

2009-02-18 Thread Kalidoss MM
Even in my case, we can't flatten it, because we are managing total image gallery information in Solr. An image gallery contains around 20 images, each with an image description, thumbnail info, width, height, etc. We also want to store/update the stats along with the image gallery. If we flatten the xml,

Re: Unified search of relational data on Solr?

2009-02-18 Thread Otis Gospodnetic
Hi, Just flatten it - create a single Person + Address entity (document) and index it. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch From: Senthil Kumar To: solr-user@lucene.apache.org Sent: Thursday, February 19, 2009 1:20:23 PM Subject:
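A minimal sketch of the "flatten it" suggestion, done on the client side before indexing. The record layout and every field name other than the `1_persona` / `1_addr` / `washington` values from the question are assumptions for illustration only; the thread does not show the real schema.

```python
# Hypothetical input records; "name", "addr_id" and "city" are assumed field names.
persons = [{"id": "1_persona", "name": "Alice", "addr_id": "1_addr"}]
addresses = [{"id": "1_addr", "city": "washington"}]


def flatten(persons, addresses):
    """Merge each person with its address into one flat document."""
    by_id = {a["id"]: a for a in addresses}
    docs = []
    for p in persons:
        doc = {"id": p["id"], "name": p["name"]}
        addr = by_id.get(p["addr_id"], {})
        for key, value in addr.items():
            if key != "id":
                # Prefix address fields so they cannot collide with person fields.
                doc["addr_" + key] = value
        docs.append(doc)
    return docs


docs = flatten(persons, addresses)
print(docs)
```

Each resulting dictionary could then be serialized as one Solr document, so a query on the (assumed) `addr_city` field would return persons directly.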

Unified search of relational data on Solr?

2009-02-18 Thread Senthil Kumar
Hi, How can we index relational data in Solr that cannot be merged into a single file for some reason? We have two kinds of XML documents indexed in Solr, 1_persona 1_addr washington Our aim is to get a list of persons living in Washing

Re: utf 8 issue

2009-02-18 Thread revathy arun
Hi Erik, $post_string is XML data. I don't see any content for those files when I search for *:*. What would that mean? On 2/19/09, Erik Hatcher wrote: > > > On Feb 18, 2009, at 1:53 PM, revathy arun wrote: > >> I am using php curl to post data to solr >> >> container tomcat >> i have uriencoding s

Re: why don't we have a forum for discussion?

2009-02-18 Thread Shashi Kant
Steve - could you not just subscribe to the list from another (off-mobile device) email (Gmail or Yahoo) for example? We discourage using corporate email for subscribing mailing lists precisely for such reasons : volume, spam, malware risks etc. Shashi - Original Message From: Steph

Re: utf 8 issue

2009-02-18 Thread Erik Hatcher
On Feb 18, 2009, at 1:53 PM, revathy arun wrote: I am using php curl to post data to solr container tomcat i have uriencoding set to utf8 in tomcats server.xml file this is how its indexed $header[] = "Content-Type: text/xml; charset=utf-8"; curl_setopt($ch, CURLOPT_URL,$url); curl_set

Re: why don't we have a forum for discussion?

2009-02-18 Thread Peter Wolanin
If some stuff is asked over and over again, it would be great to grab some reasonable responses and add them to the wiki. I've edited it a few times when I've struggled with what's there and found something that wasn't covered or was out of date - even the best forum or mailing list will not repli

Re: why don't we have a forum for discussion?

2009-02-18 Thread Erik Hatcher
On Feb 18, 2009, at 7:31 PM, Chris Hostetter wrote: 2) There is nothing preventing people who want to start alternate online forums for discussing Solr from doing so. (I recently learned there is even a #solr IRC channel that gets moderate use by some members of the community). I lurk in

Re: why don't we have a forum for discussion?

2009-02-18 Thread Stephen Weiss
Like an earlier poster, my issue isn't on the laptop, it's with my mobile device. The sheer volume of e-mail overwhelms the thing sometimes (right now, for instance). There's really no option for moving the e-mail off to some other folder, it just all goes to one place. Perhaps that mea

Re: why don't we have a forum for discussion?

2009-02-18 Thread Chris Hostetter
: I am just curious why we don't have a forum for discussion or you guys think : it's really necessary to receive lots of crap information about Solr and : nutch in email? I can offer you a forum for discussion anyway. leaving out my personal opinions on SMTP based mailing lists vs HTTP based fo

RE: why don't we have a forum for discussion?

2009-02-18 Thread Smiley, David W.
I definitely agree with the sentiments of others that Nabble et al. replace the need for a separate forum given that we have an active mailing list already. Martin, you claimed a forum would meet the need of people asking questions over and over. In my opinion, neither email nor forums are ideal

Re: why don't we have a forum for discussion?

2009-02-18 Thread Mike Klaas
On 18-Feb-09, at 2:09 PM, Stephen Weiss wrote: I third the motion. SOLR is the second largest contributor to my e-mail glut (my company's marketing is #1). I often have no idea what area of Solr I'm actually asking about when I have a question, so I would disagree and say a general for

LocalSolr distributed search

2009-02-18 Thread Rajiv2
Hello, I'm currently using LocalSolr in my project and coming across some issues with making the LocalSolrQueryComponent work w/ distributed search. I'm using version LocalSolr 2.0 and Solr 1.3. Can someone point me in the right direction on how to modify this component to work with Distribu

Re: foreign characters equivalent in solr search

2009-02-18 Thread Koji Sekiguchi
CharFilter will solve the problem, but it comes with Solr 1.4. https://issues.apache.org/jira/browse/SOLR-822 Koji AHMET ARSLAN wrote: I think best way to do this is to modify org.apache.lucene.index.memory.SynonymTokenFilter and employ this filter index time. if token.termBuffer() has one

Re: why don't we have a forum for discussion?

2009-02-18 Thread Martin Lamothe
E-mails wouldn't go away with a discussion forum, as they have e-mail notifications too. It could complement this mailing list... some stuff is asked over and over and over ... isn't it? With a forum, it would be possible to say... go see this post... or that thread... etc... Multi-core could use

Re: why don't we have a forum for discussion?

2009-02-18 Thread Shashi Kant
one man's "crap" is another man's treasure. :-P So how would you decide what is worth posting? If you feel the list is overwhelming your email, set some filters. Shashi - Original Message From: Tony Wang To: solr-user@lucene.apache.org Sent: Wednesday, February 18, 2009 2:06:57 PM S

Re: why don't we have a forum for discussion?

2009-02-18 Thread Matthew Runo
At the risk of sounding "me too"... me too! Email is something I already use throughout the day - it's easy to pop over into the folder I send all the solr-user mail to and quickly scan the subject lines. Nabble is great for searching though.. I only have 12,126 of the solr-user messages

Re: why don't we have a forum for discussion?

2009-02-18 Thread Erik Hatcher
On Feb 18, 2009, at 5:09 PM, Stephen Weiss wrote: But almost anything would be better than the current situation. This list is SOLR's best documentation so I wouldn't want to just stop getting it (and stuff just goes unnoticed in digests), but it could be presented better. A forum with a

Re: why don't we have a forum for discussion?

2009-02-18 Thread Walter Underwood
I really prefer a mailing list. If I had to visit a website to contribute, my participation would go to zero. I might not be typical -- I've been handling a few hundred messages a day for the past twenty five years. wunder (e-mail is the killer app) On 2/18/09 2:09 PM, "Stephen Weiss" wrote: >

Re: why don't we have a forum for discussion?

2009-02-18 Thread Stephen Weiss
I third the motion. SOLR is the second largest contributor to my e-mail glut (my company's marketing is #1). I often have no idea what area of Solr I'm actually asking about when I have a question, so I would disagree and say a general forum provides a place to post when you don't reall

Re: Solr training at ApacheCon Europe

2009-02-18 Thread Erik Hatcher
Thanks for the interest from Vernon, Markus, and Jon. Note that beyond ApacheCon's, our company, Lucid Imagination, offers Solr (and Lucene) training. We can customize training to suit your needs. If you'd like more information or to request a training in your organization, e-mail us at

Re: why don't we have a forum for discussion?

2009-02-18 Thread Jon Baer
I don't think "general" discussion forums really help ... it would be great if every major page in the Solr wiki had a discuss link off to somewhere though +1 for that ... Ie: http://wiki.apache.org/solr/SolrRequestHandler http://wiki.apache.org/solr/SolrReplication etc. For me even panning

Re: Reading Core-Specific Config File in a Row Transformer

2009-02-18 Thread wojtekpia
Thanks Shalin. I think you missed the call to .getResourceLoader(), so it should be: context.getSolrCore().getResourceLoader().getInstanceDir() Works great, thanks! Shalin Shekhar Mangar wrote: > > > You can use Context.getSolrCore().getInstanceDir() > > -- View this message in context:

Re: why don't we have a forum for discussion?

2009-02-18 Thread Martin Lamothe
Yep, I second the motion. This mailing list overloads my poor BB curve. -M 2009/2/18 Tony Wang > I am just curious why we don't have a forum for discussion or you guys > think > it's really necessary to receive lots of crap information about Solr and > nutch in email? I can offer you a forum fo

Re: why don't we have a forum for discussion?

2009-02-18 Thread Mike Klaas
On 18-Feb-09, at 11:06 AM, Tony Wang wrote: I am just curious why we don't have a forum for discussion or you guys think it's really necessary to receive lots of crap information about Solr and nutch in email? I can offer you a forum for discussion anyway. If you want to follow solr-user

why don't we have a forum for discussion?

2009-02-18 Thread Tony Wang
I am just curious why we don't have a forum for discussion or you guys think it's really necessary to receive lots of crap information about Solr and nutch in email? I can offer you a forum for discussion anyway. -- Are you RCholic? www.RCholic.com 温 良 恭 俭 让 仁 义 礼 智 信

Good strategy for news in Solr?

2009-02-18 Thread Jon Baer
I've spent a few months trying different techniques w/ regards to searching just news articles w/ players and can't seem to find the perfect setup. Normally I take into consideration date (frequency + recently published), title (which boosts on relevancy) and general mm in body text (and s

Re: utf 8 issue

2009-02-18 Thread revathy arun
I am using php curl to post data to solr container tomcat i have uriencoding set to utf8 in tomcats server.xml file this is how its indexed $header[] = "Content-Type: text/xml; charset=utf-8"; curl_setopt($ch, CURLOPT_URL,$url); curl_setopt( $ch, CURLOPT_HTTPHEADER, $header ); curl_se

Re: Snowball and protected words

2009-02-18 Thread Walter Underwood
You can define exceptions in the Snowball language and generate a new stemmer. See the examples here: http://snowball.tartarus.org/algorithms/english/stemmer.html wunder On 2/18/09 9:56 AM, "Erik Hatcher" wrote: > > On Feb 18, 2009, at 12:40 PM, Leonardo Dias wrote: >> Is there a way to make

Re: Snowball and protected words

2009-02-18 Thread Erik Hatcher
On Feb 18, 2009, at 12:40 PM, Leonardo Dias wrote: Is there a way to make the snowball algorithm work with a protwords.txt file? Currently, and unfortunately, no - the protected words feature is not available in the SnowballPorterFilterFactory. It wouldn't take much effort to bring that c

Snowball and protected words

2009-02-18 Thread Leonardo Dias
Hi there! Is there a way to make the snowball algorithm work with a protwords.txt file? EnglishPorter works fine. It would be great if the snowball algorithm could do the same to avoid searches with irrelevant results. Best, Leonardo

Re: Solr training at ApacheCon Europe

2009-02-18 Thread Markus Jelsma - Buyways B.V.
This is very interesting! Sadly, however, we ought to have finished our Solr implementation by then. I'd prefer the conference to take place a month earlier! At least I'll try to attend and persuade my boss to pay the bill! Cheers On Wed, 2009-02-18 at 11:33 -0500, Erik Hatcher wrote: > Dea

bq type_:true for two types doesn't come up books.

2009-02-18 Thread sunnyfr
Hi, I don't get it: I added a bq boost. The point is that I have some books which are normal, some which are type_roman or type_comedy, and others of another type, but I would like to boost both of these types for every book indexed. So if I do: &bq=type_roman:true^1,5+type_comedy:true^1,5 no video comes up but if I do

Re: Solr training at ApacheCon Europe

2009-02-18 Thread Vernon Chapman
Erik Hatcher wrote: Dear Solr Users - I am offering a one day Solr training (titled Solr Boot Camp) at ApacheCon Europe on March 24. The class is designed to cover Solr from start to finish - installing, indexing your content numerous ways (XML, CSV, database, client API, DataImportHandler,

Solr training at ApacheCon Europe

2009-02-18 Thread Erik Hatcher
Dear Solr Users - I am offering a one day Solr training (titled Solr Boot Camp) at ApacheCon Europe on March 24. The class is designed to cover Solr from start to finish - installing, indexing your content numerous ways (XML, CSV, database, client API, DataImportHandler, etc), how to use

Re: foreign characters equivalent in solr search

2009-02-18 Thread AHMET ARSLAN
I think the best way to do this is to modify org.apache.lucene.index.memory.SynonymTokenFilter and employ this filter at index time. If token.termBuffer() has one of those (á, à, â, ä, ã, å) characters, you replace it with its equivalent ASCII character (a). Then you inject this new Token as a S
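The folding step described above (á, à, â, ä, ã, å to a) can be illustrated outside Solr. This is only a sketch of the character mapping, not the SynonymTokenFilter modification itself: it swaps in Unicode NFD decomposition from Python's standard library, and it assumes the same function is applied to both the indexed text and the user's query.

```python
import unicodedata


def fold_accents(text):
    """Replace accented letters with their unaccented base letters."""
    # NFD splits a character such as "ë" into "e" plus a combining diaeresis.
    decomposed = unicodedata.normalize("NFD", text)
    # Dropping the combining marks leaves only the base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold_accents("Tiësto"))
print(fold_accents("áàâäãå"))
```

Characters that have no decomposition (for example ø or ß) pass through unchanged, so a real deployment would likely need an explicit mapping table for those.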

Re: boost qf weight between 0 and 10

2009-02-18 Thread sunnyfr
Obviously it should be qb and not bf it looks better. Is there everything in the wiki because I read it but I'm still a bit confused about it. sunnyfr wrote: > > Hi, > > I don't get really, I try to boost a field according to another one but > I've a huge weight when I'm using qf boost l

Re: semicolon causing Missing sort order exception

2009-02-18 Thread Yonik Seeley
On Wed, Feb 18, 2009 at 10:07 AM, AHMET ARSLAN wrote: >> What version of Solr are you using? > I am using latest release (apache-solr-1.3.0) As Erik points out, it's the legacy sort syntax. set defType to "lucene" as a default parameter to fix that. -Yonik http://www.lucidimagination.com

boost qf weight between 0 and 10

2009-02-18 Thread sunnyfr
Hi, I don't really get it. I am trying to boost a field according to another one, but I get a huge weight when I use a qf boost like: /select?qt=dismax&fl=*&q="obama meeting"&debugQuery=true&qf=title&bf=product(title,stat_views) I will have : 5803681.0 = (MATCH) sum of: 4.9400806 = weight(title:"obam

Re: Query regarding setTimeAllowed(Integer) and setRows(Integer)

2009-02-18 Thread Walter Underwood
Solr and Lucene are very efficient at basic ranking and retrieval. Sorting and faceted search take more CPU. Most of your speed improvement will come from caching, so set aside some time for cache tuning. You need real query logs for that. wunder On 2/18/09 7:31 AM, "Sean Timm" wrote: > This p

Re: IndexMergeTool produces empty index

2009-02-18 Thread Stuart Sierra
I'm using lucene-core-2.4-dev.jar from the Solr 1.3.0 distribution. Solr doesn't include lucene-misc, so I used lucene-misc-2.4.jar from the Lucene 2.4.0 distribution. But I had the exact same problem when I wrote my own index merge tool using just the Solr distribution jars. -Stuart Sierra On

Re: Query regarding setTimeAllowed(Integer) and setRows(Integer)

2009-02-18 Thread Sean Timm
This page gives lots of performance pointers. http://wiki.apache.org/solr/SolrPerformanceFactors -Sean Jana, Kumar Raja wrote: Thanks Sean. That clears up the timer concept. Is there any other way through which I can make sure that the server time is not wasted? -Original Message- Fr

Re: IndexMergeTool produces empty index

2009-02-18 Thread Erik Hatcher
Are you sure you're using the Lucene JARs (both core and misc) that came with Solr 1.3 and not, as you said, the ones from a Lucene 2.4 distribution? Erik On Feb 18, 2009, at 10:17 AM, Stuart Sierra wrote: Hello, I'm having trouble merging indexes with IndexMergeTool. I use S

IndexMergeTool produces empty index

2009-02-18 Thread Stuart Sierra
Hello, I'm having trouble merging indexes with IndexMergeTool. I use Solr 1.3 to build two separate indexes. Then I shut down Solr. The indexes generated by Solr look ok. I can read them with a Lucene IndexSearcher, and even open up the index files and see the text of my documents. Next I

Re: semicolon causing Missing sort order exception

2009-02-18 Thread AHMET ARSLAN
> What version of Solr are you using? I am using latest release (apache-solr-1.3.0)

Re: Input XML duplicate fields uniqueness

2009-02-18 Thread Adi_Jinx
Shalin Shekhar Mangar wrote: > > How about creating a Solr document for each account and adding the recid > and > updt attributes from the record tag? > > -- > Regards, > Shalin Shekhar Mangar. > > Done... In fact I used this idea and its working fine.. Thanks a ton -- View this message

Re: solr 1.4 - boost query from where it finds the word(s)

2009-02-18 Thread sunnyfr
Hi Grant, It doesn't seem to work. What's wrong with what I've done? &bf=product(title^2,stat_views) Thanks Grant Ingersoll-6 wrote: > > You might be able to with FunctionQueries, especially the relatively > new and underpromoted ability that Yonik added to use them to multiply > in scori

Re: semicolon causing Missing sort order exception

2009-02-18 Thread Erik Hatcher
semicolon is legacy syntax in Solr only for specifying a sort. It is not part of the actual query parser syntax, but rather parsed separately. What version of Solr are you using? The semicolon sort support in the query string is supposed to be deprecated/removed, I believe, in the defau

Re: solr 1.4 - boost query from where it finds the word(s)

2009-02-18 Thread sunnyfr
Sorry, which function is it ?? thanks, Grant Ingersoll-6 wrote: > > You might be able to with FunctionQueries, especially the relatively > new and underpromoted ability that Yonik added to use them to multiply > in scoring instead of adding. > > See http://wiki.apache.org/solr/FunctionQue

foreign characters equivalent in solr search

2009-02-18 Thread radarghost
We are using Solr 1.2 and don't want to upgrade to 1.3 until there is an official release for Debian. I want Solr to search for the equivalent of a foreign character to get better results. For example: a user searches for Tiesto, which is indexed in our Solr as Tiësto. We want Solr to also return res

semicolon causing Missing sort order exception

2009-02-18 Thread AHMET ARSLAN
Today I found something very interesting. When I search for a word (solr; lucene) ending with a semicolon plus any character (from the Solr admin page), Solr gives an exception (HTTP Status 400 - Missing sort order). When I escaped the semicolon (solr\;lucene), the exception was gone. I checked Lucene Special Charac
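The workaround reported above (writing `solr\;lucene` instead of `solr;lucene`) can be applied on the client before the query is sent. A minimal sketch, assuming the query string is built in application code; the function name is made up for illustration, and it does not handle a semicolon that follows an already escaped backslash.

```python
import re


def escape_semicolons(query):
    """Backslash-escape every semicolon that is not already escaped."""
    # The lookbehind skips semicolons that already have a backslash in front.
    return re.sub(r"(?<!\\);", lambda match: "\\;", query)


print(escape_semicolons("solr;lucene"))
```

Running the function twice gives the same result as running it once, so it is safe to apply to input that may already be escaped.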

Re: making changes to solr schema after deployed to production

2009-02-18 Thread Grant Ingersoll
It really depends on the change. Typically, adding fields is fine, but of course, it means that you will only be able to search those fields in the new documents. Other changes often require re-indexing. Changing the semantics of a field (i.e. changing FieldType) will require re-indexing.

Re: solr 1.3 analyzers

2009-02-18 Thread AHMET ARSLAN
> i see filterfactories for other languages like dutch, french, brazilian etc but no tokenizer. in this scenario are we supposed > to use the standard tokenizer and the corresponding language filters. Yes. Exactly the same as what Lucene Analyzers do. > Lucene has the analyzers for the same

Re: utf 8 issue

2009-02-18 Thread Erik Hatcher
On Feb 18, 2009, at 7:34 AM, revathy arun wrote: I am trying to index documents in various languages (foroyo, chinese, japanese). These have been converted from pdf to text using xpdf. I am using the standard analyzer for content analysis, but I am not able to search anything from some of the files

Re: utf 8 issue

2009-02-18 Thread Gert Brinkmann
revathy arun wrote: > Is there any way to check the encoding of a text/pdf document or convert > them to utf -8 encoding? If you are using pdftotext you could set the enc parameter: pdftotext -enc UTF-8 filename How can you convert PDFs to text via xpdf programmatically? Greetings, Gert
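For the first half of the question (checking whether extracted text is UTF-8), one option is to try decoding the bytes strictly. This is a sketch using only the Python standard library; it only tells you whether the bytes are well-formed UTF-8, not which other encoding they might be in, and plain ASCII also passes.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")  # strict decoding raises on malformed sequences
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8("Tiësto".encode("utf-8")))
print(is_valid_utf8("Tiësto".encode("latin-1")))
```

To check a file produced by pdftotext, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to the function.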

utf 8 issue

2009-02-18 Thread revathy arun
Hi, I am trying to index documents in various languages (foroyo, chinese, japanese). These have been converted from pdf to text using xpdf. I am using the standard analyzer for content analysis, but I am not able to search anything from some of the files. My guess is that these documents are not in utf-

make the suggested ignored field multi-valued?

2009-02-18 Thread Peter Wolanin
In the example schema.xml, there is a field type 'ignored' which it is suggested can be used with the wildcard * to prevent errors when a document contains fields that don't match any in the schema. My experience recently in using this is that it does not work as desired if the unmatched field

solr 1.3 analyzers

2009-02-18 Thread revathy arun
Hi, In Solr 1.3 under src/classes/java/analyzers I see only the following language-specific tokenizers: chinesetokenizer cjktokenizer russiantokenizer but I see filterfactories for other languages like dutch, french, brazilian etc but no tokenizer. In this scenario are we supposed to use the s

RE: Updating the solr index

2009-02-18 Thread Sagar Khetkade
thanks a lot Noble. `Sagar > Date: Wed, 18 Feb 2009 14:45:13 +0530> Subject: Re: Updating the solr index> > From: noble.p...@gmail.com> To: solr-user@lucene.apache.org> > The patch > currently does not work . SOLR-828 is supposed to duplicate> this. But this > is a huge change and a patch is

Re: multicore

2009-02-18 Thread revathy arun
I am sorry, but I did not get what you meant by Integer.MAX_VALUE. Does a core take up more RAM than a regular webapp? That is, if I were to have 3 webapps, would the RAM requirement be higher than if I were to have 3 cores in a single webapp, or would it be the same? Regards

Re: multicore

2009-02-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
There are no limits. It must be Integer.MAX_VALUE. The limits are usually decided by the number of file handles the system can open and the amount of RAM and CPU you have. On Wed, Feb 18, 2009 at 2:15 PM, revathy arun wrote: > Is there any known limit to number of cores that can be created on a si

Re: Data Directory Sync.

2009-02-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
You will have to shutdown your solr before you can do that. On Wed, Feb 18, 2009 at 1:39 PM, Kalidoss MM wrote: > Hi, > > I think i can use http://wiki.apache.org/solr/MergingSolrIndexes - to > index two different solr index directory?? > Thanks, > kalidoss.m, > > On Thu, Jan 29, 2009 at 8:57 P

Re: Updating the solr index

2009-02-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
The patch currently does not work . SOLR-828 is supposed to duplicate this. But this is a huge change and a patch is still not ready ,so it is pushed to 1.5. On Wed, Feb 18, 2009 at 2:32 PM, Sagar Khetkade wrote: > > Hi, > > The question is about the SOLR path 139 in jira. There the issue is open

multicore

2009-02-18 Thread revathy arun
Is there any known limit to the number of cores that can be created on a single webapp? What are the possible limiting factors? Regards

Re: Data Directory Sync.

2009-02-18 Thread Kalidoss MM
Hi, I think i can use http://wiki.apache.org/solr/MergingSolrIndexes - to index two different solr index directory?? Thanks, kalidoss.m, On Thu, Jan 29, 2009 at 8:57 PM, Noble Paul നോബിള്‍ नोब्ळ् < noble.p...@gmail.com> wrote: > On Thu, Jan 29, 2009 at 7:27 PM, Kalidoss MM > wrote: > > Hi, >