Re: Data Directory Sync.

2009-02-18 Thread Kalidoss MM
Hi,

I think I can use http://wiki.apache.org/solr/MergingSolrIndexes to
merge two different Solr index directories?
Thanks,
kalidoss.m,

On Thu, Jan 29, 2009 at 8:57 PM, Noble Paul നോബിള്‍ नोब्ळ् <
noble.p...@gmail.com> wrote:

> On Thu, Jan 29, 2009 at 7:27 PM, Kalidoss MM 
> wrote:
> > Hi,
> >
> >   I have a requirement like, There is a running solr and having
> around
> > 10K records indexed in it. Now i have to index another set of 30K
> records?
> >
> >   The 10K data already in live, And i dont have an option to insert
> > that 30K records in live,
> you can index the 30K data to the live Solr .
> >
> >   Is there any way to run the solr in local system and get the 30K
> > records in data directory, and Update/Upgrade the local solr data
> directoy
> > INTO live data directory?
> >
> >   Is there any tools available? Or is there any other method to
> > Sync/combine 2 different data directory and make it to 1 data directory.
> >
> > Thanks,
> > Kalidoss.m,
> >
>
>
>
> --
> --Noble Paul
>


multicore

2009-02-18 Thread revathy arun
Is there any known limit to the number of cores that can be created in a
single webapp?
What are the possible limiting factors?

Regards


Re: Updating the solr index

2009-02-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
The patch currently does not work. SOLR-828 is supposed to duplicate
this, but it is a huge change and a patch is still not ready, so it
has been pushed to 1.5.

On Wed, Feb 18, 2009 at 2:32 PM, Sagar Khetkade
 wrote:
>
> Hi,
>
> The question is about the SOLR path 139 in jira. There the issue is open and 
> marked for SOLR 1.5 release. Is the patch available updates the index file 
> for the particular id for solr 1.3?
>
> Regards,
> Sagar Khetkade
> _
> For the freshest Indian Jobs Visit MSN Jobs
> http://www.in.msn.com/jobs



-- 
--Noble Paul


Re: Data Directory Sync.

2009-02-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
You will have to shut down your Solr before you can do that.

On Wed, Feb 18, 2009 at 1:39 PM, Kalidoss MM  wrote:
> Hi,
>
> I think i can use http://wiki.apache.org/solr/MergingSolrIndexes   - to
> index two different solr index directory??
> Thanks,
> kalidoss.m,
>
> On Thu, Jan 29, 2009 at 8:57 PM, Noble Paul നോബിള്‍ नोब्ळ् <
> noble.p...@gmail.com> wrote:
>
>> On Thu, Jan 29, 2009 at 7:27 PM, Kalidoss MM 
>> wrote:
>> > Hi,
>> >
>> >   I have a requirement like, There is a running solr and having
>> around
>> > 10K records indexed in it. Now i have to index another set of 30K
>> records?
>> >
>> >   The 10K data already in live, And i dont have an option to insert
>> > that 30K records in live,
>> you can index the 30K data to the live Solr .
>> >
>> >   Is there any way to run the solr in local system and get the 30K
>> > records in data directory, and Update/Upgrade the local solr data
>> directoy
>> > INTO live data directory?
>> >
>> >   Is there any tools available? Or is there any other method to
>> > Sync/combine 2 different data directory and make it to 1 data directory.
>> >
>> > Thanks,
>> > Kalidoss.m,
>> >
>>
>>
>>
>> --
>> --Noble Paul
>>
>



-- 
--Noble Paul


Re: multicore

2009-02-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
There is no hard limit. In principle it would be Integer.MAX_VALUE.

In practice, the limits are usually decided by the number of file handles
the system can open and the amount of RAM and CPU you have.

On Wed, Feb 18, 2009 at 2:15 PM, revathy arun  wrote:
> Is there any known limit to number of cores that can be create on a single
> webapp.
> What are possible  limiting factors?
>
> Regards
>



-- 
--Noble Paul


Re: multicore

2009-02-18 Thread revathy arun
I am sorry, but I did not get what you meant by Integer.MAX_VALUE.

Does a core take up more RAM than a regular webapp?

That is, if I were to have 3 webapps, would the RAM requirement be higher
than if I were to have 3 cores in a single webapp, or would it be the
same?

Regards

On 2/18/09, Noble Paul നോബിള്‍ नोब्ळ्  wrote:
>
> there are no limits . It must be Integer.MAX_VALUE
>
> the limits are usually decided by the number of file handles the
> system can open and the amount of RAM cpu you may have
>
> On Wed, Feb 18, 2009 at 2:15 PM, revathy arun  wrote:
> > Is there any known limit to number of cores that can be create on a
> single
> > webapp.
> > What are possible  limiting factors?
> >
> > Regards
> >
>
>
>
> --
> --Noble Paul
>


RE: Updating the solr index

2009-02-18 Thread Sagar Khetkade

thanks a lot Noble.
 
Sagar
> Date: Wed, 18 Feb 2009 14:45:13 +0530
> Subject: Re: Updating the solr index
> From: noble.p...@gmail.com
> To: solr-user@lucene.apache.org
>
> The patch currently does not work . SOLR-828 is supposed to duplicate
> this. But this is a huge change and a patch is still not ready ,so it
> is pushed to 1.5.
>
> On Wed, Feb 18, 2009 at 2:32 PM, Sagar Khetkade wrote:
> >
> > Hi,
> >
> > The question is about the SOLR path 139 in jira. There the issue is
> > open and marked for SOLR 1.5 release. Is the patch available updates
> > the index file for the particular id for solr 1.3?
> >
> > Regards,
> > Sagar Khetkade
>
> --
> --Noble Paul

solr 1.3 analyzers

2009-02-18 Thread revathy arun
Hi,

In Solr 1.3, under src/classes/java/analyzers,

I see only the following language-specific tokenizers:
chinesetokenizer
cjktokenizer
russiantokenizer

But I see filter factories for other languages such as Dutch, French,
Brazilian, etc., with no tokenizer.
In this scenario, are we supposed to use the standard tokenizer and the
corresponding language filters? Lucene has analyzers for these languages.
How do we incorporate them into Solr?

Will this be available in future versions?

What is the difference between a normal filter factory and a stem filter
factory?

Regards


make the suggested ignored field multi-valued?

2009-02-18 Thread Peter Wolanin
In the example schema.xml, there is a field type 'ignored' which it is
suggested can be used with the wildcard * to prevent errors when a
document contains fields that don't match any in the schema.   My
experience recently in using this is that it does not work as
desired if the unmatched field is multiValued, and that the suggested
* field should be designated multiValued:

https://issues.apache.org/jira/browse/SOLR-1022

Obviously this has no effect out of the box, since the field is commented out.

-Peter

-- 
Peter M. Wolanin, Ph.D.
Momentum Specialist,  Acquia. Inc.
peter.wola...@acquia.com


utf 8 issue

2009-02-18 Thread revathy arun
Hi ,

I am trying to index documents in various languages (foroyo, Chinese,
Japanese). These have been converted from PDF to text using xpdf.
I am using the standard analyzer for content analysis, but I am not able to
find anything from some of the files.

My guess is that these documents are not in UTF-8 encoding and hence Solr
does not return results.


Is there any way to check the encoding of a text/PDF document or convert
it to UTF-8 encoding?

While indexing I am sending the charset header as utf-8.

Any pointers?

Thanks


Re: utf 8 issue

2009-02-18 Thread Gert Brinkmann
revathy arun wrote:

> Is there any way to check the encoding of a text/pdf document or convert
> them to utf -8 encoding?

If you are using pdftotext you could set the enc parameter:

pdftotext -enc UTF-8 filename

How can you convert PDFs to text via xpdf programmatically?

Greetings,
Gert
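
For the "how do I check the encoding" part of the question, a minimal Python sketch (not part of the original reply) is shown below. It assumes the extracted text is available as bytes and, for the conversion step, that the source encoding is already known or guessed; the function names and the latin-1 sample are illustrative assumptions only. Note that a clean UTF-8 decode is a necessary check, not proof that the file was meant to be UTF-8.

```python
def is_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from an assumed source encoding to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    # Hypothetical input: text saved as latin-1 rather than UTF-8.
    sample = "Tiësto".encode("latin-1")
    print(is_utf8(sample))
    print(is_utf8(to_utf8(sample, "latin-1")))
```

In practice the bytes would come from reading the extracted file in binary mode before it is posted to Solr.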


Re: utf 8 issue

2009-02-18 Thread Erik Hatcher


On Feb 18, 2009, at 7:34 AM, revathy arun wrote:
> I am trying to index various langauge documents (foroyo,chinese,japanese)
> .These have been converted from pdf to text using xpdf
> I am using the standard anlyzer for content analysis ,but i am not able to
> search anything from some of the files.

Please provide us an example of how you are indexing... what requests
are you sending to Solr?  What client API are you using to interface
with Solr?

What container are you using?  Jetty?  Tomcat?

> My guess is that these documents are not in utf-8 encoding and hence solr
> does not return result.

Certainly whatever reads in the text from your data source needs to
know the encoding and use it appropriately.

> Is there any way to check the encoding of a text/pdf document or convert
> them to utf -8 encoding?

I would imagine the conversion could be made to go to UTF8

> while indexing i am sending the header for charset as utf-8 .

How are you doing this?

> Any pointers?


If you're using Tomcat, you'll need to set the URIEncoding, as  
described here:


  


Erik




Re: solr 1.3 analyzers

2009-02-18 Thread AHMET ARSLAN
> i see filterfactories for other languages like dutch
> ,french,barzialian etc but no tokenizer.  in this scenario are we
> supposed to use the standard tokenizer and the corresponding language
> filters.

Yes. Exactly the same as what Lucene Analyzers do.

> Lucene has the analyzers for the same. how do we incorporate the same
> to solr
> Will this be available in future versions?

One can also specify an existing Lucene Analyzer class that has a 
default constructor via the class attribute on the analyzer element

   


> what is the difference netween normal filter factory and stem filter
> factory?

TokenFilters can delete (StopFilter), inject (SynonymFilter), or
modify (StemFilter) a token, according to their purpose. There is no
distinction such as normal filter factory versus stem filter factory.


  


Re: making changes to solr schema after deployed to production

2009-02-18 Thread Grant Ingersoll
It really depends on the change.  Typically, adding fields is fine,
but of course, it means that you will only be able to search those
fields in the new documents.  Other changes often require
re-indexing.  Changing the semantics of a field (i.e. changing its
FieldType) will require re-indexing.  Think of it like compiling a
program.  If you change a variable from an int to a float, you will
need to recompile.



On Feb 17, 2009, at 5:09 PM, Jonathan Haddad wrote:

> Preface: This is my first attempt at using solr.
>
> What happens if I need to do a change to a solr schema that's already
> in production?  Can fields be added or removed?
>
> Can a type change from an integer to a float?
>
> Thanks in advance,
> Jon


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



semicolon causing Missing sort order exception

2009-02-18 Thread AHMET ARSLAN
Today I found something very interesting. When I search (from the Solr admin
page) for a term containing a semicolon followed by any character, e.g.
(solr; lucene), Solr gives an exception (HTTP Status 400 - Missing sort order).
When I escaped the semicolon (solr\;lucene), the exception went away.

I checked Lucene Special Characters that are part of the query syntax. 
Semicolon (;) is not in the list.

I wanted to share this with you.


  


foreign characters equivalent in solr search

2009-02-18 Thread radarghost

We are using Solr 1.2 and don't want to upgrade to 1.3 until there is an
official release for Debian.
I want Solr to match the unaccented equivalent of a foreign character, to
get better results.

For example:

If a user searches for Tiesto, which is indexed as Tiësto in our Solr, we
want Solr to return that result as well. That is, return results
containing á, à, â, ä, ã, å when the word was searched with a plain a,
e for ë, i for ï, o for ö, and so on.

Is there any solution?

hope i could tell what i need with my poor English

thanks


-- 
View this message in context: 
http://www.nabble.com/foreign-characters-equivalent-in-solr-search-tp22079912p22079912.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: solr 1.4 - boost query from where it finds the word(s)

2009-02-18 Thread sunnyfr

Sorry,
which function is it ?? 
thanks,


Grant Ingersoll-6 wrote:
> 
> You might be able to with FunctionQueries, especially the relatively  
> new and underpromoted ability that Yonik added to use them to multiply  
> in scoring instead of adding.
> 
> See http://wiki.apache.org/solr/FunctionQuery
> 
> 
> 
> On Feb 12, 2009, at 10:17 AM, sunnyfr wrote:
> 
>>
>> Hi Grant,
>>
>> Thanks for your quick answer.
>>
>> So there is not a real quick way to increase one field in particular
>> according to another one if the text is find there, otherwise how  
>> can I do
>> that in two queries ?
>>
>> thanks a lot,
>>
>>
>>
>> Grant Ingersoll-6 wrote:
>>>
>>> Hi Sunny,
>>>
>>> As with any relevance issue, one of the first thing I ask before
>>> getting to a solution, is what is the problem you are seeing that
>>> makes you want to change the way things work?
>>>
>>> That being said, the only way you would be able to do this is through
>>> some custom components and I'm pretty sure it would involve having to
>>> run at least two queries (the first which involves SpanQueries),  
>>> but I
>>> might be missing something.
>>>
>>>
>>> -Grant
>>>
>>> On Feb 12, 2009, at 5:00 AM, sunnyfr wrote:
>>>

>>>> Hi everybody,
>>>> Wish you a nice day,
>>>>
>>>> I've a question, I would like to know if it's possible to boost
>>>> differently
>>>> some field according to where it find the word.
>>>>
>>>> Will try to make it more clear;
>>>>
>>>> I've a book core with title, description and tags.
>>>>
>>>> If word looked for is found in the title, I would like to boost
>>>> differntly
>>>> another field like number of view
>>>>
>>>> > found in the title then : nb_views^2 and rating^1
>>>> > found in the description then : nb_views^0.5 and rating^0.2
>>>> > found in the tag then : nb_views^1 and rating^0.5
>>>>
>>>> How can I do that ?
>>>>
>>>> Even I would love to make something like if after :
>>>> if nb_views between 0 and 50 then nb_views^1.3  if nb_views>100
>>>> then
>>>> nb_views^2
>>>>
>>>> Do you have an idea ? What would you reckon ?
>>>>
>>>> THANKS A LOT GUYS,
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/solr-1.4---boost-query-from-where-it-finds-the-word%28s%29-tp21972988p21972988.html
>>>> Sent from the Solr - User mailing list archive at Nabble.com.

>>>
>>>
>>>
>>
>> -- 
>> View this message in context:
>> http://www.nabble.com/solr-1.4---boost-query-from-where-it-finds-the-word%28s%29-tp21973015p21978064.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/solr-1.4---boost-query-from-where-it-finds-the-word%28s%29-tp21973015p22080195.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: semicolon causing Missing sort order exception

2009-02-18 Thread Erik Hatcher
semicolon is legacy syntax in Solr only for specifying a sort.  It is  
not part of the actual query parser syntax, but rather parsed  
separately.


What version of Solr are you using?  The semicolon sort support in the  
query string is supposed to be deprecated/removed, I believe, in the  
default query parser but looks like it still is checked under certain  
conditions.


Erik

On Feb 18, 2009, at 9:11 AM, AHMET ARSLAN wrote:

> Today I found something very interesting. When i search a word
> (solr; lucene) ending with a semicolon plus any character (from solr
> admin page), solr gives an exception (HTTP Status 400 - Missing sort
> order).
>
> When I escaped semicolon (solr\;lucene) exception gone.
>
> I checked Lucene Special Characters that are part of the query
> syntax. Semicolon (;) is not in the list.
>
> I wanted to share this with you.







Re: solr 1.4 - boost query from where it finds the word(s)

2009-02-18 Thread sunnyfr

Hi Grant,

It doesn't seem to work. What's wrong with what I did?

&bf=product(title^2,stat_views)
Thanks


Grant Ingersoll-6 wrote:
> 
> You might be able to with FunctionQueries, especially the relatively  
> new and underpromoted ability that Yonik added to use them to multiply  
> in scoring instead of adding.
> 
> See http://wiki.apache.org/solr/FunctionQuery
> 
> 
> 
> On Feb 12, 2009, at 10:17 AM, sunnyfr wrote:
> 
>>
>> Hi Grant,
>>
>> Thanks for your quick answer.
>>
>> So there is not a real quick way to increase one field in particular
>> according to another one if the text is find there, otherwise how  
>> can I do
>> that in two queries ?
>>
>> thanks a lot,
>>
>>
>>
>> Grant Ingersoll-6 wrote:
>>>
>>> Hi Sunny,
>>>
>>> As with any relevance issue, one of the first thing I ask before
>>> getting to a solution, is what is the problem you are seeing that
>>> makes you want to change the way things work?
>>>
>>> That being said, the only way you would be able to do this is through
>>> some custom components and I'm pretty sure it would involve having to
>>> run at least two queries (the first which involves SpanQueries),  
>>> but I
>>> might be missing something.
>>>
>>>
>>> -Grant
>>>
>>> On Feb 12, 2009, at 5:00 AM, sunnyfr wrote:
>>>

>>>> Hi everybody,
>>>> Wish you a nice day,
>>>>
>>>> I've a question, I would like to know if it's possible to boost
>>>> differently
>>>> some field according to where it find the word.
>>>>
>>>> Will try to make it more clear;
>>>>
>>>> I've a book core with title, description and tags.
>>>>
>>>> If word looked for is found in the title, I would like to boost
>>>> differntly
>>>> another field like number of view
>>>>
>>>> > found in the title then : nb_views^2 and rating^1
>>>> > found in the description then : nb_views^0.5 and rating^0.2
>>>> > found in the tag then : nb_views^1 and rating^0.5
>>>>
>>>> How can I do that ?
>>>>
>>>> Even I would love to make something like if after :
>>>> if nb_views between 0 and 50 then nb_views^1.3  if nb_views>100
>>>> then
>>>> nb_views^2
>>>>
>>>> Do you have an idea ? What would you reckon ?
>>>>
>>>> THANKS A LOT GUYS,
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/solr-1.4---boost-query-from-where-it-finds-the-word%28s%29-tp21972988p21972988.html
>>>> Sent from the Solr - User mailing list archive at Nabble.com.

>>>
>>>
>>>
>>
>> -- 
>> View this message in context:
>> http://www.nabble.com/solr-1.4---boost-query-from-where-it-finds-the-word%28s%29-tp21973015p21978064.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/solr-1.4---boost-query-from-where-it-finds-the-word%28s%29-tp21973015p22080404.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Input XML duplicate fields uniqueness

2009-02-18 Thread Adi_Jinx




Shalin Shekhar Mangar wrote:
> 
> How about creating a Solr document for each account and adding the recid
> and
> updt attributes from the record tag?
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

Done... In fact I used this idea and its working fine.. Thanks a ton
-- 
View this message in context: 
http://www.nabble.com/Input-XML-duplicate-fields-uniqueness-tp22042765p22080507.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: semicolon causing Missing sort order exception

2009-02-18 Thread AHMET ARSLAN
> What version of Solr are you using?
I am using latest release (apache-solr-1.3.0)


  


IndexMergeTool produces empty index

2009-02-18 Thread Stuart Sierra
Hello,
I'm having trouble merging indexes with IndexMergeTool.

I use Solr 1.3 to build two separate indexes.  Then I shut down Solr.
The indexes generated by Solr look ok.  I can read them with a Lucene
IndexSearcher, and even open up the index files and see the text of my
documents.

Next I run IndexMergeTool from Lucene 2.4, following the instructions at


After I run IndexMergeTool, the "merged" index is empty.  The
directory is created, it has a segments file like a normal Lucene
index, the index just doesn't have any documents in it.

Any suggestions?

-Stuart Sierra


Re: IndexMergeTool produces empty index

2009-02-18 Thread Erik Hatcher
Are you sure you're using the Lucene JARs (both core and misc) that  
came with Solr 1.3 and not, as you said, the ones from a Lucene 2.4  
distribution?


Erik

On Feb 18, 2009, at 10:17 AM, Stuart Sierra wrote:

> Hello,
> I'm having trouble merging indexes with with IndexMergeTool.
>
> I use Solr 1.3 to build two separate indexes.  Then I shut down Solr.
> The indexes generated by Solr look ok.  I can read them with a Lucene
> IndexSearcher, and even open up the index files and see the text of my
> documents.
>
> Next I run IndexMergeTool from Lucene 2.4, following the
> instructions at
>
>
> After I run IndexMergeTool, the "merged" index is empty.  The
> directory is created, it has a segments file like a normal Lucene
> index, the index just doesn't have any documents in it.
>
> Any suggestions?
>
> -Stuart Sierra




Re: Query regarding setTimeAllowed(Integer) and setRows(Integer)

2009-02-18 Thread Sean Timm

This page gives lots of performance pointers.

http://wiki.apache.org/solr/SolrPerformanceFactors

-Sean

Jana, Kumar Raja wrote:

> Thanks Sean. That clears up the timer concept.
>
> Is there any other way through which I can make sure that the server
> time is not wasted?
>
> -Original Message-
> From: Sean Timm [mailto:tim...@aol.com]
> Sent: Wednesday, February 18, 2009 1:00 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Query regarding setTimeAllowed(Integer) and
> setRows(Integer)
>
> Jana, Kumar Raja wrote:
>> 2.   If I set SolrQuery.setTimeAllowed(2000) Will this kill query
>> processing after 2 secs? (I know this question sounds silly but I just
>> want a confirmation from the experts :)
>>
> That is the idea, but only some of the code is within the timer.  So,
> there are cases where a query could exceed the timeAllowed specified
> because the bulk of the work for that particular query is not in the
> actual collect, for example, an expensive range query.
>
> -Sean
  


Re: IndexMergeTool produces empty index

2009-02-18 Thread Stuart Sierra
I'm using lucene-core-2.4-dev.jar from the Solr 1.3.0 distribution.
Solr doesn't include lucene-misc, so I used lucene-misc-2.4.jar from
the Lucene 2.4.0 distribution.

But I had the exact same problem when I wrote my own index merge tool
using just the Solr distribution jars.

-Stuart Sierra


On Wed, Feb 18, 2009 at 10:20 AM, Erik Hatcher
 wrote:
> Are you sure you're using the Lucene JARs (both core and misc) that came
> with Solr 1.3 and not, as you said, the ones from a Lucene 2.4 distribution?
>
>Erik
>
> On Feb 18, 2009, at 10:17 AM, Stuart Sierra wrote:
>
>> Hello,
>> I'm having trouble merging indexes with with IndexMergeTool.
>>
>> I use Solr 1.3 to build two separate indexes.  Then I shut down Solr.
>> The indexes generated by Solr look ok.  I can read them with a Lucene
>> IndexSearcher, and even open up the index files and see the text of my
>> documents.
>>
>> Next I run IndexMergeTool from Lucene 2.4, following the instructions at
>> 
>>
>> After I run IndexMergeTool, the "merged" index is empty.  The
>> directory is created, it has a segments file like a normal Lucene
>> index, the index just doesn't have any documents in it.
>>
>> Any suggestions?
>>
>> -Stuart Sierra
>
>


Re: Query regarding setTimeAllowed(Integer) and setRows(Integer)

2009-02-18 Thread Walter Underwood
Solr and Lucene are very efficient at basic ranking and retrieval.
Sorting and faceted search take more CPU.

Most of your speed improvement will come from caching, so set
aside some time for cache tuning. You need real query logs for
that.

wunder

On 2/18/09 7:31 AM, "Sean Timm"  wrote:

> This page gives lots of performance pointers.
> 
> http://wiki.apache.org/solr/SolrPerformanceFactors
> 
> -Sean
> 
> Jana, Kumar Raja wrote:
>> Thanks Sean. That clears up the timer concept.
>> 
>> Is there any other way through which I can make sure that the server
>> time is not wasted?
>> 
>> -Original Message-
>> From: Sean Timm [mailto:tim...@aol.com]
>> Sent: Wednesday, February 18, 2009 1:00 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Query regarding setTimeAllowed(Integer) and
>> setRows(Integer)
>> 
>> Jana, Kumar Raja wrote:
>>   
>>> 2.   If I set SolrQuery.setTimeAllowed(2000) Will this kill query
>>> processing after 2 secs? (I know this question sounds silly but I just
>>> want a confirmation from the experts J
>>> 
>> That is the idea, but only some of the code is within the timer.  So,
>> there are cases where a query could exceed the timeAllowed specified
>> because the bulk of the work for that particular query is not in the
>> actual collect, for example, an expensive range query.
>> 
>> -Sean
>>   



boost qf weight between 0 and 10

2009-02-18 Thread sunnyfr

Hi,

I don't really get it: I am trying to boost a field according to another one,
but I get a huge weight when I use a qf boost like this:

/select?qt=dismax&fl=*&q="obama
meeting"&debugQuery=true&qf=title&bf=product(title,stat_views)

I will have :
5803681.0 = (MATCH) sum of:
  4.9400806 = weight(title:"obama meet" in 8216294), product of:
0.98198587 = queryWeight(title:"obama meet"), product of:
  16.098255 = idf(title: obama=7654 meet=7344)
  0.06099952 = queryNorm
5.0307045 = fieldWeight(title:"obama meet" in 8216294), product of:
  1.0 = tf(phraseFreq=1.0)
  16.098255 = idf(title: obama=7654 meet=7344)
  0.3125 = fieldNorm(field=title, doc=8216294)
  0.40961993 = weight(text:"obama meet"~100^0.2 in 8216294), product of:
0.17883755 = queryWeight(text:"obama meet"~100^0.2), product of:
  0.2 = boost
  14.658932 = idf(text: obama=12446 meet=19052)
  0.06099952 = queryNorm
2.2904582 = fieldWeight(text:"obama meet" in 8216294), product of:
  1.0 = tf(phraseFreq=1.0)
  14.658932 = idf(text: obama=12446 meet=19052)
  0.15625 = fieldNorm(field=text, doc=8216294)
  5803675.5 = (MATCH) FunctionQuery(product(ord(title),sint(stat_views))),
product of:
9.5142968E7 = product(ord(title)=1119329,sint(stat_views)=85)
1.0 = boost
0.06099952 = queryNorm



But the boosts from qf and bf are not balanced. How can I fix this?

Thanks a lot
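
Reading the explain output above, the imbalance can be reproduced with plain arithmetic. The sketch below is an editorial illustration, not part of the original post: it only reuses the numbers printed by debugQuery, and the variable names are made up for readability. It suggests that bf=product(title,stat_views) is evaluated as ord(title) times stat_views, which is what makes the function term dwarf the text-match terms.

```python
# Values copied from the debugQuery explain output above.
ord_title = 1119329      # ord(title), as reported in the FunctionQuery line
stat_views = 85          # sint(stat_views)
boost = 1.0
query_norm = 0.06099952

# Contribution of the bf function query to the final score.
function_score = ord_title * stat_views * boost * query_norm

# Contributions of the two phrase matches (title and text).
title_score = 4.9400806
text_score = 0.40961993

total = title_score + text_score + function_score

print(function_score)          # close to the 5803675.5 shown in the explain
print(total)                   # close to the 5803681.0 shown in the explain
print(function_score / total)  # share of the score coming from bf
```

With these inputs nearly the whole score comes from the function term, so the title and text matches have almost no influence on ranking.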

-- 
View this message in context: 
http://www.nabble.com/boost-qf-weight-between-0-and-10-tp22081396p22081396.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: semicolon causing Missing sort order exception

2009-02-18 Thread Yonik Seeley
On Wed, Feb 18, 2009 at 10:07 AM, AHMET ARSLAN  wrote:
>> What version of Solr are you using?
> I am using latest release (apache-solr-1.3.0)

As Erik points out, it's the legacy sort syntax.
set defType to "lucene" as a default parameter to fix that.

-Yonik
http://www.lucidimagination.com
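
The two workarounds mentioned in this thread (escaping the semicolon, and setting defType to "lucene") could be sketched on the client side as follows. This is an illustrative Python sketch with hypothetical helper names; it passes defType as a request parameter, whereas the reply above suggests setting it as a default parameter in the handler configuration. Either workaround alone should be enough; both are shown only for illustration.

```python
from urllib.parse import urlencode


def escape_semicolons(query: str) -> str:
    """Backslash-escape semicolons so they are not read as the legacy sort separator."""
    return query.replace(";", "\\;")


def select_params(query: str) -> str:
    """Build a URL query string with the escaped query and defType=lucene."""
    return urlencode({"q": escape_semicolons(query), "defType": "lucene"})


if __name__ == "__main__":
    print(escape_semicolons("solr;lucene"))
    print(select_params("solr;lucene"))
```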


Re: boost qf weight between 0 and 10

2009-02-18 Thread sunnyfr

Obviously it should be qb and not bf; it looks better.
Is everything in the wiki? I have read it, but I'm still a bit
confused about it.



sunnyfr wrote:
> 
> Hi,
> 
> I don't get really, I try to boost a field according to another one but
> I've a huge weight when I'm using qf boost like :
> 
> /select?qt=dismax&fl=*&q="obama
> meeting"&debugQuery=true&qf=title&bf=product(title,stat_views)
> 
> I will have :
> 5803681.0 = (MATCH) sum of:
>   4.9400806 = weight(title:"obama meet" in 8216294), product of:
> 0.98198587 = queryWeight(title:"obama meet"), product of:
>   16.098255 = idf(title: obama=7654 meet=7344)
>   0.06099952 = queryNorm
> 5.0307045 = fieldWeight(title:"obama meet" in 8216294), product of:
>   1.0 = tf(phraseFreq=1.0)
>   16.098255 = idf(title: obama=7654 meet=7344)
>   0.3125 = fieldNorm(field=title, doc=8216294)
>   0.40961993 = weight(text:"obama meet"~100^0.2 in 8216294), product of:
> 0.17883755 = queryWeight(text:"obama meet"~100^0.2), product of:
>   0.2 = boost
>   14.658932 = idf(text: obama=12446 meet=19052)
>   0.06099952 = queryNorm
> 2.2904582 = fieldWeight(text:"obama meet" in 8216294), product of:
>   1.0 = tf(phraseFreq=1.0)
>   14.658932 = idf(text: obama=12446 meet=19052)
>   0.15625 = fieldNorm(field=text, doc=8216294)
>   5803675.5 = (MATCH) FunctionQuery(product(ord(title),sint(stat_views))),
> product of:
> 9.5142968E7 = product(ord(title)=1119329,sint(stat_views)=85)
> 1.0 = boost
> 0.06099952 = queryNorm
> 
> 
> 
> But this is not equilibrate between this boost in qf and bf, how can I do
> ?
> 
> Thanks a lot
> 
> 

-- 
View this message in context: 
http://www.nabble.com/boost-qf-weight-between-0-and-10-tp22081396p22081479.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: foreign characters equivalent in solr search

2009-02-18 Thread AHMET ARSLAN
I think the best way to do this is to modify
org.apache.lucene.index.memory.SynonymTokenFilter and apply this filter at
index time.

If token.termBuffer() contains one of those characters (á, à, â, ä, ã, å),
you replace it with its equivalent ASCII character (a). Then you inject this
new Token as a synonym.

I don't know whether it is the best way, but it will give you what you want.
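
The folding step itself (mapping an accented character to its plain equivalent) can be sketched outside of Lucene. The Python snippet below is an editorial illustration of the idea only, not the SynonymTokenFilter change described above; the function name is hypothetical. It relies on Unicode decomposition, so letters without a decomposition (for example ø or ß) are left unchanged.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip combining marks so that, for example, 'ë' becomes 'e'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(fold_accents("Tiësto"))   # Tiesto
    print(fold_accents("áàâäãå"))   # aaaaaa
```

For a search for Tiesto to match Tiësto, the same folding would have to be applied (or the folded form added as a synonym) at index time, and the query would have to go through a compatible analysis.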

--- On Wed, 2/18/09, radarghost  wrote:

> From: radarghost 
> Subject: foreign characters equivalent in solr search
> To: solr-user@lucene.apache.org
> Date: Wednesday, February 18, 2009, 4:28 PM
> we are using solr 1.2 and dont want to upgrade to 1.3 till
> official release
> for Debian.
> i want solr to search for equivalent of a foreign chracter
> for getting
> better results
> 
> in example:
> 
> if a user searches for Tiesto which is indexed in this
> format Tiësto in our
> solr. we want solr also return result
> return search result for á, à, â, ä, ã, å where they
> are in word but that
> word has been searched with normal a
> e for ë, i for ï, o for ö, and so on
> 
> any solution?
> 
> hope i could tell what i need with my poor English
> 
> thanks
> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/foreign-characters-equivalent-in-solr-search-tp22079912p22079912.html
> Sent from the Solr - User mailing list archive at
> Nabble.com.





Solr training at ApacheCon Europe

2009-02-18 Thread Erik Hatcher

Dear Solr Users -

I am offering a one day Solr training (titled Solr Boot Camp) at  
ApacheCon Europe on March 24.  The  class is designed to cover Solr  
from start to finish - installing, indexing your content numerous ways  
(XML, CSV, database, client API, DataImportHandler, etc), how to use  
the key bells and whistles including spell checking, highlighting,  
faceting, and all the way up to what it takes to build an end  
application on your platform.  I include examples for working with  
Solr in various environments, from JSON to XML, to Java, to PHP, to  
Ruby (on Rails).


Also, if you come to the training, stay for the conference, as there  
will be several talks on the Lucene ecosystem:   See http://lucene.apache.org/#09+February+2009+-+Lucene+at+ApacheCon+Europe+2009+in+Amsterdam 
   The ApacheCon conference is the official conference of the Apache  
Software Foundation and is a great place to learn and interact with  
the developers of your favorite Apache projects.


If you have questions or aren't sure if the course is appropriate for  
you, you can email me offlist.


For more info and to register, see 
http://www.eu.apachecon.com/c/aceu2009/sessions/201

Thanks,
Erik
www.lucidimagination.com


Re: Solr training at ApacheCon Europe

2009-02-18 Thread Vernon Chapman

Erik Hatcher wrote:

> Dear Solr Users -
>
> I am offering a one day Solr training (titled Solr Boot Camp) at
> ApacheCon Europe on March 24.  The  class is designed to cover Solr
> from start to finish - installing, indexing your content numerous ways
> (XML, CSV, database, client API, DataImportHandler, etc), how to use
> the key bells and whistles including spell checking, highlighting,
> faceting, and all the way up to what it takes to build an end
> application on your platform.  I include examples for working with
> Solr in various environments, from JSON to XML, to Java, to PHP, to
> Ruby (on Rails).
>
> Also, if you come to the training, stay for the conference, as there
> will be several talks on the Lucene ecosystem:   See
> http://lucene.apache.org/#09+February+2009+-+Lucene+at+ApacheCon+Europe+2009+in+Amsterdam
> The ApacheCon conference is the official conference of the Apache
> Software Foundation and is a great place to learn and interact with
> the developers of your favorite Apache projects.
>
> If you have questions or aren't sure if the course is appropriate for
> you, you can email me offlist.
>
> For more info and to register, see
> http://www.eu.apachecon.com/c/aceu2009/sessions/201
>
> Thanks,
> Erik
> www.lucidimagination.com


Erik,

   I was just wondering if you planned on doing something similar in 
the U.S.? I would love to attend ApacheCon, but my boss probably won't go 
for it.


Thanks

Vernon Chapman
g8tor


bq type_:true for two types doesn't come up books.

2009-02-18 Thread sunnyfr

Hi,

I don't understand this behavior. I added a bq boost.
Some of my books are normal, some are type_roman or type_comedy, and some
have other types,
and I would like to boost both of these types across all indexed books.

So if I do:
&bq=type_roman:true^1,5+type_comedy:true^1,5
no books come up,
but if I do:
&bq=type_roman:true^1,5+type_comedy:false^1,5 or use just one type, books
come up.

I would like to boost a book if one type or the other is set; clearly a
book can't have both types.

How can I manage this?

Thanks a lot,

-- 
View this message in context: 
http://www.nabble.com/bq-type_%3Atrue-for-two-types-doesn%27t-come-up-books.-tp22083323p22083323.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr training at ApacheCon Europe

2009-02-18 Thread Markus Jelsma - Buyways B.V.
This is very interesting!


Sadly, however, we ought to have finished our Solr implementation by
then. I'd prefer the conference to take place a month earlier! At least
I'll try to attend and persuade my boss to pay the bill!


Cheers


On Wed, 2009-02-18 at 11:33 -0500, Erik Hatcher wrote:

> Dear Solr Users -
> 
> I am offering a one day Solr training (titled Solr Boot Camp) at  
> ApacheCon Europe on March 24.  The  class is designed to cover Solr  
> from start to finish - installing, indexing your content numerous ways  
> (XML, CSV, database, client API, DataImportHandler, etc), how to use  
> the key bells and whistles including spell checking, highlighting,  
> faceting, and all the way up to what it takes to build an end  
> application on your platform.  I include examples for working with  
> Solr in various environments, from JSON to XML, to Java, to PHP, to  
> Ruby (on Rails).
> 
> Also, if you come to the training, stay for the conference, as there  
> will be several talks on the Lucene ecosystem:   See 
> http://lucene.apache.org/#09+February+2009+-+Lucene+at+ApacheCon+Europe+2009+in+Amsterdam
>  
> The ApacheCon conference is the official conference of the Apache  
> Software Foundation and is a great place to learn and interact with  
> the developers of your favorite Apache projects.
> 
> If you have questions or aren't sure if the course is appropriate for  
> you, you can email me offlist.
> 
> For more info and to register, see 
> http://www.eu.apachecon.com/c/aceu2009/sessions/201
> 
> Thanks,
>   Erik
>  www.lucidimagination.com


Snowball and protected words

2009-02-18 Thread Leonardo Dias

Hi there!

Is there a way to make the snowball algorithm work with a protwords.txt 
file?


EnglishPorter works fine. It would be great if the snowball algorithm 
could do the same to avoid searches with irrelevant results.


Best,

Leonardo


Re: Snowball and protected words

2009-02-18 Thread Erik Hatcher


On Feb 18, 2009, at 12:40 PM, Leonardo Dias wrote:
Is there a way to make the snowball algorithm work with a  
protwords.txt file?


Currently, and unfortunately, no - the protected words feature is not  
available in the SnowballPorterFilterFactory.  It wouldn't take much  
effort to bring that capability across though.


Erik




Re: Snowball and protected words

2009-02-18 Thread Walter Underwood
You can define exceptions in the Snowball language and generate
a new stemmer. See the examples here:

http://snowball.tartarus.org/algorithms/english/stemmer.html

wunder

On 2/18/09 9:56 AM, "Erik Hatcher"  wrote:

> 
> On Feb 18, 2009, at 12:40 PM, Leonardo Dias wrote:
>> Is there a way to make the snowball algorithm work with a
>> protwords.txt file?
> 
> Currently, and unfortunately, no - the protected words feature is not
> available in the SnowballPorterFilterFactory.  It wouldn't take much
> effort to bring that capability across though.
> 
> Erik




Re: utf 8 issue

2009-02-18 Thread revathy arun
I am using PHP curl to post data to Solr.

The container is Tomcat, and
I have URIEncoding set to UTF-8 in Tomcat's server.xml file.

This is how it's indexed:

$header[] = "Content-Type: text/xml; charset=utf-8";
  curl_setopt($ch, CURLOPT_URL,$url);
  curl_setopt( $ch, CURLOPT_HTTPHEADER, $header );
  curl_setopt($ch, CURLOPT_POST, 1);
  curl_setopt($ch, CURLOPT_POSTFIELDS,$post_string);
.$data = curl_exec($ch);
..
However, the document I am sending does not seem to be UTF-8 encoded.

regards

On 2/18/09, Erik Hatcher  wrote:
>
>
> On Feb 18, 2009, at 7:34 AM, revathy arun wrote:
>
>> I am trying to index various language documents (foroyo,chinese,japanese)
>> .These have been converted from pdf to text using xpdf
>> I am using the standard analyzer for content analysis ,but i am not able to
>> search anything from some of the files.
>
> Please provide us an example of how you are indexing... what requests are
> you sending to Solr?  What client API are you using to interface with Solr?
>
> What container are you using?  Jetty?  Tomcat?
>
>> My guess is that these documents are not in utf-8 encoding and hence solr
>> does not return result.
>
> Certainly whatever reads in the text from your data source needs to know
> the encoding and use it appropriately.
>
>> Is there any way to check the encoding of a text/pdf document or convert
>> them to utf -8 encoding?
>
> I would imagine the conversion could be made to go to UTF8
>
>> while indexing i am sending the header for charset as utf-8 .
>
> How are you doing this?
>
>> Any pointers?
>
> If you're using Tomcat, you'll need to set the URIEncoding, as described
> here:
>
> <http://wiki.apache.org/solr/SolrTomcat#head-20147ee4d9dd5ca83ed264898280ab60457847c4>
>
>    Erik
>
>
>
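
The question above about checking whether a text file is really UTF-8, and converting it if not, can be sketched as follows. This is only an illustration, written in Python rather than the PHP used in the thread; the function names and the Shift_JIS sample are assumptions, and the source encoding of a real file has to be known or detected separately.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def transcode_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known source encoding to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    # Hypothetical sample: Japanese text stored as Shift_JIS, not UTF-8.
    sample = "日本語のテキスト".encode("shift_jis")
    print(is_valid_utf8(sample))
    print(is_valid_utf8(transcode_to_utf8(sample, "shift_jis")))
```

A check like this only shows that the bytes are well-formed UTF-8; the Content-Type header sent with the post still has to match what the body actually contains.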


Good strategy for news in Solr?

2009-02-18 Thread Jon Baer
I've spent a few months trying different techniques w/ regards to  
searching just news articles w/ players and can't seem to find the  
perfect setup.


Normally I take into consideration date (frequency + recently  
published), title (which boosts on relevancy) and general mm in body  
text (and score)


Sometimes it's more of a preference on how to drill into news (one  
being most recently published) vs. historical where it can be more  
context based ...


Is anyone else toying w/ different setups for searching news in general?

- Jon


why don't we have a forum for discussion?

2009-02-18 Thread Tony Wang
I am just curious why we don't have a forum for discussion or you guys think
it's really necessary to receive lots of crap information about Solr and
nutch in email? I can offer you a forum for discussion anyway.

-- 
Are you RCholic? www.RCholic.com
温 良 恭 俭 让 仁 义 礼 智 信


Re: why don't we have a forum for discussion?

2009-02-18 Thread Mike Klaas


On 18-Feb-09, at 11:06 AM, Tony Wang wrote:

I am just curious why we don't have a forum for discussion or you guys
think it's really necessary to receive lots of crap information about
Solr and nutch in email? I can offer you a forum for discussion anyway.


If you want to follow solr-user using the web, try nabble:

http://www.nabble.com/Solr---User-f14480.html

-Mike


Re: why don't we have a forum for discussion?

2009-02-18 Thread Martin Lamothe
Yep, I second the motion.
This mailing list overloads my poor BB curve.

-M

2009/2/18 Tony Wang 

> I am just curious why we don't have a forum for discussion or you guys
> think
> it's really necessary to receive lots of crap information about Solr and
> nutch in email? I can offer you a forum for discussion anyway.
>
> --
> Are you RCholic? www.RCholic.com
> 温 良 恭 俭 让 仁 义 礼 智 信
>



-- 
Martin Lamothe
Business Development and Operations
Wiser Web Solutions Inc.
Direct: (613) 262-5558
Toll-free: 1-800-949-4737
E-mail: m.lamo...@wiserweb.com
http://www.wiserweb.com


Re: Reading Core-Specific Config File in a Row Transformer

2009-02-18 Thread wojtekpia

Thanks Shalin. I think you missed the call to .getResourceLoader(), so it
should be:

context.getSolrCore().getResourceLoader().getInstanceDir()

Works great, thanks!


Shalin Shekhar Mangar wrote:
> 
> 
> You can use Context.getSolrCore().getInstanceDir()
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Reading-Core-Specific-Config-File-in-a-Row-Transformer-tp22069449p22086846.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: why don't we have a forum for discussion?

2009-02-18 Thread Jon Baer
I don't think "general" discussion forums really help ... it would be  
great if every major page in the Solr wiki had a discuss link off to  
somewhere though +1 for that ...


Ie:
http://wiki.apache.org/solr/SolrRequestHandler
http://wiki.apache.org/solr/SolrReplication
etc.

For me even panning over discussion history on topics would be helpful.

- Jon

On Feb 18, 2009, at 2:56 PM, Martin Lamothe wrote:


Yep, I second the motion.
This mailing list overloads my poor BB curve.

-M

2009/2/18 Tony Wang 

I am just curious why we don't have a forum for discussion or you guys
think it's really necessary to receive lots of crap information about
Solr and nutch in email? I can offer you a forum for discussion anyway.

--
Are you RCholic? www.RCholic.com
温 良 恭 俭 让 仁 义 礼 智 信





--
Martin Lamothe
Business Development and Operations
Wiser Web Solutions Inc.
Direct: (613) 262-5558
Toll-free: 1-800-949-4737
E-mail: m.lamo...@wiserweb.com
http://www.wiserweb.com




Re: Solr training at ApacheCon Europe

2009-02-18 Thread Erik Hatcher

Thanks for the interest from Vernon, Markus, and Jon.


Note that beyond ApacheCon's, our company, Lucid Imagination, offers  
Solr (and Lucene) training.  We can customize training to suit your  
needs.  If you'd like more information or to request a training in  
your organization, e-mail us at train...@lucidimagination.com



And for the community at large, check out our Solr-powered search  
system to search this e-mail list, and all other lucene.apache.org  
lists, and wiki, website, JIRA issues/comments, Java code, and our own  
growing list of articles and blog entries:


  http://www.lucidimagination.com/search

Give it a few minutes, and this very e-mail will be there :)

Erik


On Feb 18, 2009, at 12:16 PM, Markus Jelsma - Buyways B.V. wrote:


This is very interesting!


Sadly, however, we ought to have finished our Solr implementation by
then. I'd prefer the conference to take place a month earlier! At least
I'll try to attend and persuade my boss to pay the bill!


Cheers


On Wed, 2009-02-18 at 11:33 -0500, Erik Hatcher wrote:


Dear Solr Users -

I am offering a one day Solr training (titled Solr Boot Camp) at
ApacheCon Europe on March 24.  The  class is designed to cover Solr
from start to finish - installing, indexing your content numerous ways
(XML, CSV, database, client API, DataImportHandler, etc), how to use
the key bells and whistles including spell checking, highlighting,
faceting, and all the way up to what it takes to build an end
application on your platform.  I include examples for working with
Solr in various environments, from JSON to XML, to Java, to PHP, to
Ruby (on Rails).

Also, if you come to the training, stay for the conference, as there
will be several talks on the Lucene ecosystem:   See 
http://lucene.apache.org/#09+February+2009+-+Lucene+at+ApacheCon+Europe+2009+in+Amsterdam
   The ApacheCon conference is the official conference of the Apache
Software Foundation and is a great place to learn and interact with
the developers of your favorite Apache projects.

If you have questions or aren't sure if the course is appropriate for
you, you can email me offlist.

For more info and to register, see 
http://www.eu.apachecon.com/c/aceu2009/sessions/201

Thanks,
Erik
www.lucidimagination.com




Re: why don't we have a forum for discussion?

2009-02-18 Thread Stephen Weiss
I third the motion.  SOLR is the second largest contributor to my  
e-mail glut (my company's marketing is #1).  I often have no idea what  
area of Solr I'm actually asking about when I have a question, so I  
would disagree and say a general forum provides a place to post when  
you don't really understand the internals so well.


But almost anything would be better than the current situation.  This  
list is SOLR's best documentation so I wouldn't want to just stop  
getting it (and stuff just goes unnoticed in digests), but it could be  
presented better.  A forum with a search function and notifications  
would be a big improvement, especially as the community grows.


--
Steve

On Feb 18, 2009, at 3:28 PM, Jon Baer wrote:

I don't think "general" discussion forums really help ... it would  
be great if every major page in the Solr wiki had a discuss link off  
to somewhere though +1 for that ...


Ie:
http://wiki.apache.org/solr/SolrRequestHandler
http://wiki.apache.org/solr/SolrReplication
etc.

For me even panning over discussion history on topics would be  
helpful.


- Jon

On Feb 18, 2009, at 2:56 PM, Martin Lamothe wrote:


Yep, I second the motion.
This mailing list overloads my poor BB curve.

-M

2009/2/18 Tony Wang 

I am just curious why we don't have a forum for discussion or you guys
think it's really necessary to receive lots of crap information about
Solr and nutch in email? I can offer you a forum for discussion anyway.

--
Are you RCholic? www.RCholic.com
温 良 恭 俭 让 仁 义 礼 智 信





--
Martin Lamothe
Business Development and Operations
Wiser Web Solutions Inc.
Direct: (613) 262-5558
Toll-free: 1-800-949-4737
E-mail: m.lamo...@wiserweb.com
http://www.wiserweb.com






Re: why don't we have a forum for discussion?

2009-02-18 Thread Walter Underwood
I really prefer a mailing list. If I had to visit a website to
contribute, my participation would go to zero.

I might not be typical -- I've been handling a few hundred
messages a day for the past twenty five years.

wunder (e-mail is the killer app)

On 2/18/09 2:09 PM, "Stephen Weiss"  wrote:

> I third the motion.  SOLR is the second largest contributor to my
> e-mail glut (my company's marketing is #1).  I often have no idea what
> area of Solr I'm actually asking about when I have a question, so I
> would disagree and say a general forum provides a place to post when
> you don't really understand the internals so well.
> 
> But almost anything would be better than the current situation.  This
> list is SOLR's best documentation so I wouldn't want to just stop
> getting it (and stuff just goes unnoticed in digests), but it could be
> presented better.  A forum with a search function and notifications
> would be a big improvement, especially as the community grows.
> 
> --
> Steve
> 
> On Feb 18, 2009, at 3:28 PM, Jon Baer wrote:
> 
>> I don't think "general" discussion forums really help ... it would
>> be great if every major page in the Solr wiki had a discuss link off
>> to somewhere though +1 for that ...
>> 
>> Ie:
>> http://wiki.apache.org/solr/SolrRequestHandler
>> http://wiki.apache.org/solr/SolrReplication
>> etc.
>> 
>> For me even panning over discussion history on topics would be
>> helpful.
>> 
>> - Jon
>> 
>> On Feb 18, 2009, at 2:56 PM, Martin Lamothe wrote:
>> 
>>> Yep, I second the motion.
>>> This mailing list overloads my poor BB curve.
>>> 
>>> -M
>>> 
>>> 2009/2/18 Tony Wang 
>>> 
>>>> I am just curious why we don't have a forum for discussion or you
>>>> guys think it's really necessary to receive lots of crap information
>>>> about Solr and nutch in email? I can offer you a forum for discussion
>>>> anyway.
>>>>
>>>> --
>>>> Are you RCholic? www.RCholic.com
>>>> 温 良 恭 俭 让 仁 义 礼 智 信
>>>>
>>> -- 
>>> Martin Lamothe
>>> Business Development and Operations
>>> Wiser Web Solutions Inc.
>>> Direct: (613) 262-5558
>>> Toll-free: 1-800-949-4737
>>> E-mail: m.lamo...@wiserweb.com
>>> http://www.wiserweb.com




Re: why don't we have a forum for discussion?

2009-02-18 Thread Erik Hatcher


On Feb 18, 2009, at 5:09 PM, Stephen Weiss wrote:
But almost anything would be better than the current situation.   
This list is SOLR's best documentation so I wouldn't want to just  
stop getting it (and stuff just goes unnoticed in digests), but it  
could be presented better.  A forum with a search function and  
notifications would be a big improvement, especially as the  
community grows.


This mailing list is a treasure trove of great stuff.  And it is quite  
searchable at a number of services, including MarkMail, Nabble, and so  
on: 


Nabble provides a forum-like interface to mailing lists.

For the best (IMNSHO) search for this list (and the Solr wiki, most  
importantly here, among other sources): 


And if you want to see the latest mails for this list (latest on top):
  

We've got notifications and syndication on our short term TODO list  
already.  Any other features you'd like to see in our search system,  
please let me know and we'll almost certainly add as we can.  Adding  
the ability to send mails back through a web forum, though, is  
something I'm not too keen on adding - it opens the door to lots of  
complexities and it'd need to be a strongly requested feature to be  
considered.


Erik



Re: why don't we have a forum for discussion?

2009-02-18 Thread Matthew Runo

At the risk of sounding "me too"... me too!

Email is something I already use throughout the day - it's easy to pop  
over into the folder I send all the solr-user mail to and quickly scan  
the subject lines.


Nabble is great for searching though.. I only have 12,126 of the  
solr-user messages archived locally so far..


Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
mr...@zappos.com - 702-943-7833

On Feb 18, 2009, at 2:16 PM, Walter Underwood wrote:


I really prefer a mailing list. If I had to visit a website to
contribute, my participation would go to zero.

I might not be typical -- I've been handling a few hundred
messages a day for the past twenty five years.

wunder (e-mail is the killer app)

On 2/18/09 2:09 PM, "Stephen Weiss"  wrote:

I third the motion.  SOLR is the second largest contributor to my
e-mail glut (my company's marketing is #1).  I often have no idea what
area of Solr I'm actually asking about when I have a question, so I
would disagree and say a general forum provides a place to post when
you don't really understand the internals so well.

But almost anything would be better than the current situation.  This
list is SOLR's best documentation so I wouldn't want to just stop
getting it (and stuff just goes unnoticed in digests), but it could be
presented better.  A forum with a search function and notifications
would be a big improvement, especially as the community grows.

--
Steve

On Feb 18, 2009, at 3:28 PM, Jon Baer wrote:


I don't think "general" discussion forums really help ... it would
be great if every major page in the Solr wiki had a discuss link off
to somewhere though +1 for that ...

Ie:
http://wiki.apache.org/solr/SolrRequestHandler
http://wiki.apache.org/solr/SolrReplication
etc.

For me even panning over discussion history on topics would be
helpful.

- Jon

On Feb 18, 2009, at 2:56 PM, Martin Lamothe wrote:


Yep, I second the motion.
This mailing list overloads my poor BB curve.

-M

2009/2/18 Tony Wang 


I am just curious why we don't have a forum for discussion or you
guys
think
it's really necessary to receive lots of crap information about
Solr and
nutch in email? I can offer you a forum for discussion anyway.

--
Are you RCholic? www.RCholic.com
温 良 恭 俭 让 仁 义 礼 智 信


--
Martin Lamothe
Business Development and Operations
Wiser Web Solutions Inc.
Direct: (613) 262-5558
Toll-free: 1-800-949-4737
E-mail: m.lamo...@wiserweb.com
http://www.wiserweb.com







Re: why don't we have a forum for discussion?

2009-02-18 Thread Shashi Kant
one man's "crap" is another man's treasure. :-P

So how would you decide what is worth posting? 
If you feel the list is overwhelming your email, set some filters.


Shashi


- Original Message 
From: Tony Wang 
To: solr-user@lucene.apache.org
Sent: Wednesday, February 18, 2009 2:06:57 PM
Subject: why don't we have a forum for discussion?

I am just curious why we don't have a forum for discussion or you guys think
it's really necessary to receive lots of crap information about Solr and
nutch in email? I can offer you a forum for discussion anyway.

-- 
Are you RCholic? www.RCholic.com
温 良 恭 俭 让 仁 义 礼 智 信



Re: why don't we have a forum for discussion?

2009-02-18 Thread Martin Lamothe
E-mails wouldn't go away with a discussion forum as they have e-mail
notifications too... it could complement this mailing list... some stuff is
asked over and over and over ... isn't it? With a forum, it would be
possible to say.. go see this post.. or that thread.. etc...

Multi-core could use its own Topic
Scaling could use its own too
Indexing
Optimizing Indexes
etc...

Surely some general topics would help organize this evolving body of
knowledge... IMO..

Folks that have graduated the entry levels of setting up Solr could choose
to subscribe to more advanced topics, this could help spare some mental
bandwidth for some... on the flip side, novice developers might feel
overwhelmed and shy away from participating much in this mailing list.. a
forum might help strike a better balance of capturing static information,
reduce the volume of e-mails, prevent users from having to set up filters to
deal with the solr mailing list, all without taking away from the e-mail
discussions..



On Wed, Feb 18, 2009 at 5:29 PM, Shashi Kant  wrote:

> one man's "crap" is another man's treasure. :-P
>
> So how would you decide what is worth posting?
> If you feel the list is overwhelming your email, set some filters.
>
>
> Shashi
>
>
> - Original Message 
> From: Tony Wang 
> To: solr-user@lucene.apache.org
> Sent: Wednesday, February 18, 2009 2:06:57 PM
> Subject: why don't we have a forum for discussion?
>
> I am just curious why we don't have a forum for discussion or you guys
> think
> it's really necessary to receive lots of crap information about Solr and
> nutch in email? I can offer you a forum for discussion anyway.
>
> --
> Are you RCholic? www.RCholic.com
> 温 良 恭 俭 让 仁 义 礼 智 信
>
>


Re: foreign characters equivalent in solr search

2009-02-18 Thread Koji Sekiguchi

CharFilter will solve the problem, but it comes with Solr 1.4.

https://issues.apache.org/jira/browse/SOLR-822

Koji

AHMET ARSLAN wrote:

I think the best way to do this is to modify 
org.apache.lucene.index.memory.SynonymTokenFilter and employ this filter at 
index time.

If token.termBuffer() has one of those (á, à, â, ä, ã, å) characters, you will 
replace it with its equivalent ASCII character (a). Then you will inject this 
new Token as a synonym.

I don't know if it is the best way, but it will give you what you want.

--- On Wed, 2/18/09, radarghost  wrote:

  

From: radarghost 
Subject: foreign characters equivalent in solr search
To: solr-user@lucene.apache.org
Date: Wednesday, February 18, 2009, 4:28 PM
We are using Solr 1.2 and don't want to upgrade to 1.3 until there is an
official release for Debian.
I want Solr to search for the plain equivalent of a foreign character, to
get better results.

For example:

If a user searches for Tiesto, which is indexed as Tiësto in our Solr, we
want Solr to return that result as well.
It should return results containing á, à, â, ä, ã, å when the word has
been searched with a normal a,
e for ë, i for ï, o for ö, and so on.

any solution?

hope i could tell what i need with my poor English

thanks


--
View this message in context:
http://www.nabble.com/foreign-characters-equivalent-in-solr-search-tp22079912p22079912.html
Sent from the Solr - User mailing list archive at
Nabble.com.
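
The character folding requested in this thread (á, à, â, ä, ã, å treated as a; ë as e; and so on) can be sketched outside Solr as follows. This is a minimal illustration in Python, not the CharFilter from SOLR-822 nor the SynonymTokenFilter change suggested above; the function name fold_accents is hypothetical. It assumes the same folding is applied both to the text before indexing and to the query string.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip combining marks so that accented letters match their base letter."""
    # NFD splits a letter such as "ë" into "e" plus a combining diaeresis.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep everything else unchanged.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(fold_accents("Tiësto"))
```

Letters that have no Unicode decomposition (for example ø or ß) pass through unchanged, so a real setup would still need an explicit mapping for those.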




  

  




LocalSolr distributed search

2009-02-18 Thread Rajiv2

Hello, 

I'm currently using LocalSolr in my project and coming across some
issues with making the LocalSolrQueryComponent work w/ distributed search.
I'm using version LocalSolr 2.0 and Solr 1.3. Can someone point me in the
right direction on how to modify this component to work with Distributed
Search - there seems to be very little documentation on the local solr
website.

Thanks,
Rajiv
-- 
View this message in context: 
http://www.nabble.com/LocalSolr-distributed-search-tp22091124p22091124.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: why don't we have a forum for discussion?

2009-02-18 Thread Mike Klaas


On 18-Feb-09, at 2:09 PM, Stephen Weiss wrote:

I third the motion.  SOLR is the second largest contributor to my  
e-mail glut (my company's marketing is #1).  I often have no idea  
what area of Solr I'm actually asking about when I have a question,  
so I would disagree and say a general forum provides a place to post  
when you don't really understand the internals so well.


But almost anything would be better than the current situation.   
This list is SOLR's best documentation so I wouldn't want to just  
stop getting it (and stuff just goes unnoticed in digests), but it  
could be presented better.  A forum with a search function and  
notifications would be a big improvement, especially as the  
community grows.


I, for one, do not understand the motivation for a forum.

1. For people who prefer fora, nabble/markmail/etc. provide a  
forum-like view to the discussion.  Posts can be permanently linked to  
using these sites.
2. Many people greatly prefer the mailing list format (obviously, it  
takes a little bit of effort to use mailing lists effectively, e.g.,  
directing the traffic to a folder/tag/etc.).
3. There isn't enough traffic to justify splitting the list into  
sub-lists (or sub-fora).


Fora have the same problems as do mailing lists in terms of people  
asking the same questions.


-Mike


RE: why don't we have a forum for discussion?

2009-02-18 Thread Smiley, David W.
I definitely agree with the sentiments of others that Nabble et al. replace the 
need for a separate forum given that we have an active mailing list already.

Martin, you claimed a forum would meet the need of people asking questions over 
and over.  In my opinion, neither email nor forums are ideal for solving that.  
A wiki on the other hand, is.  And Solr has a wiki and I refer to it often.  I 
certainly use Nabble's search too.

~ David Smiley


From: wiser...@gmail.com [wiser...@gmail.com] On Behalf Of Martin Lamothe 
[martin.lamo...@wiserweb.com]
Sent: Wednesday, February 18, 2009 6:38 PM
To: solr-user@lucene.apache.org
Subject: Re: why don't we have a forum for discussion?

E-mails wouldn't go away with a discussion forum as they have e-mail
notifications too... it could complement this mailing list... some stuff is
asked over and over and over ... isn't it? With a forum, it would be
possible to say.. go see this post.. or that thread.. etc...

Multi-core could use its own Topic
Scaling could use its own too
Indexing
Optimizing Indexes
etc...

Surely some general topics would help organize this evolving body of
knowledge... IMO..

Folks that have graduated the entry levels of setting up Solr could choose
to subscribe to more advanced topics, this could help spare some mental
bandwidth for some... on the flip side, novice developers might feel
overwhelmed and shy away from participating much in this mailing list.. a
forum might help strike a better balance of capturing static information,
reduce the volume of e-mails, prevent users from having to set up filters to
deal with the solr mailing list, all without taking away from the e-mail
discussions..



On Wed, Feb 18, 2009 at 5:29 PM, Shashi Kant  wrote:

> one man's "crap" is another man's treasure. :-P
>
> So how would you decide what is worth posting?
> If you feel the list is overwhelming your email, set some filters.
>
>
> Shashi
>
>
> - Original Message 
> From: Tony Wang 
> To: solr-user@lucene.apache.org
> Sent: Wednesday, February 18, 2009 2:06:57 PM
> Subject: why don't we have a forum for discussion?
>
> I am just curious why we don't have a forum for discussion or you guys
> think
> it's really necessary to receive lots of crap information about Solr and
> nutch in email? I can offer you a forum for discussion anyway.
>
> --
> Are you RCholic? www.RCholic.com
> 温 良 恭 俭 让 仁 义 礼 智 信
>
>

Re: why don't we have a forum for discussion?

2009-02-18 Thread Chris Hostetter

: I am just curious why we don't have a forum for discussion or you guys think
: it's really necessary to receive lots of crap information about Solr and
: nutch in email? I can offer you a forum for discussion anyway.

leaving out my personal opinions on SMTP based mailing lists vs HTTP based 
forums, there are some practical issues to consider...

1) it is possible to have a web based front end for 
reading/searching/posting to mailing lists (nabble.com demonstrates this 
quite well).  it is much harder to have a mailing list based front end for 
a web based forum.  so as long as there are people still using email, it 
makes the most sense for email to be the "core" system, and people that 
want a web based forum to use a web based forum UI that proxies to the 
mailing list.

2) There is nothing preventing people who want to start alternate online 
forums for discussing Solr from doing so.  (I recently learned there is 
even a #solr IRC channel that gets moderate use by some members of the 
community).

3) apache projects are required to have mailing lists.  this is the
"official" method of coordinating development, and where all binding votes 
must take place.  So even if 100% of the Solr community switched to 
using some web based forum software, the mailing list(s) would still need 
to exist.

4) the comments made so far seem to indicate several classes of reasons why 
people are suggesting a forum instead of a mailing list...

4a) inboxes too full -- this is why email apps support filtering

4b) searching the archives -- i'm not sure what to say about this, the 
mailing list archives are pretty easily searchable right now on dozens of 
sites.

4c) browser based posting -- see nabble.com

4d) setting alerts for specific keywords -- this is the other reason why 
email apps support filtering.
 
4e) linkability of past posts -- almost every web based archive of the 
mailing list supports permalinks for threads.

4f) better sub-classification of posts.  This is really an orthogonal 
issue of if/when it makes sense to create 
sub-specialized community discussion channels.  we could have micro-topic 
based email lists (ie: solr-user-multic...@lucene) just as easily as you 
can have micro-topic based forums -- the question is: does that improve 
the community.  This is one of the great holy wars of online community 
forums, dating back to early NNTP newsgroups: when does it make sense to 
create sub-groups.  Considering the current number of posts per day and 
the subscriber counts, i personally don't think we're anywhere close to 
worrying about splitting up solr-user into smaller community chunks 
(among other things: it makes it very hard for new community members to 
understand where to start, many conversations can easily evolve to 
encompass multiple "topics", etc...).  For now, i would suggest that 
people only interested in certain topics take advantage of filters in 
their email clients to help "flag" posts they might be interested in -- 
but that's going to be just as error prone as if a new user tried to 
decide whether solr-user-multicore or solr-user-scalability is the right 
place to ask their question about scaling on multicore CPU machines.



-Hoss



Re: why don't we have a forum for discussion?

2009-02-18 Thread Stephen Weiss
Like an earlier poster, my issue isn't on the laptop, it's with my  
mobile device.  The sheer volume of e-mail overwhelms the thing  
sometimes (right now, for instance).  There's really no option for  
moving the e-mail off to some other folder, it just all goes to one  
place.


Perhaps that means I need a better phone, it's just the obvious  
solutions aren't always practical.  Forums can conversely just as  
easily be set up to emulate mailing lists as well...  Our company's  
internal forum works this way.


--
Steve

On Feb 18, 2009, at 7:16 PM, Mike Klaas wrote:




> 2. Many people greatly prefer the mailing list format (obviously, it
> takes a little bit of effort to use mailing lists effectively, e.g.,
> directing the traffic to a folder/tag/etc.)


Re: why don't we have a forum for discussion?

2009-02-18 Thread Erik Hatcher


On Feb 18, 2009, at 7:31 PM, Chris Hostetter wrote:
> 2) There is nothing preventing people who want to start alternate online
> forums for discussing Solr from doing so.  (I recently learned there is
> even a #solr IRC channel that gets moderate use by some members of the
> community).


I lurk in #solr and #lucene (among others) when I'm online working at  
my desk pretty much most days.  I've directed several folks from there  
to here for specific trickier issues, and have helped several folks  
with easier things.  By all means, join us in #solr - no reason it  
can't be used to take the load off the list for some quick answers.


Erik



Re: why don't we have a forum for discussion?

2009-02-18 Thread Peter Wolanin
If some stuff is asked over and over again, it would be great to grab
some reasonable responses and add them to the wiki.

I've edited it a few times when I've struggled with what's there and
found something that wasn't covered or was out of date - even the best
forum or mailing list will not replicate an organized and maintained
doc site in terms of ready access to knowledge.

-Peter

2009/2/18 Martin Lamothe :
> E-mails wouldn't go away with a discussion forum, as forums have e-mail
> notifications too... it could complement this mailing list... some stuff is
> asked over and over and over... isn't it? With a forum, it would be
> possible to say: go see this post, or that thread, etc.
>
> Multi-core could use its own topic
> Scaling could use its own too
> Indexing
> Optimizing Indexes
> etc...


Re: utf 8 issue

2009-02-18 Thread Erik Hatcher


On Feb 18, 2009, at 1:53 PM, revathy arun wrote:

> I am using php  curl to post data to solr
>
> container tomcat
> i have uriencoding set to utf8 in tomcats server.xml file
>
> this is how its indexed
>
> $header[] = "Content-Type: text/xml; charset=utf-8";
>  curl_setopt($ch, CURLOPT_URL,$url);
>  curl_setopt( $ch, CURLOPT_HTTPHEADER, $header );
>  curl_setopt($ch, CURLOPT_POST, 1);
>  curl_setopt($ch, CURLOPT_POSTFIELDS,$post_string);
> .$data = curl_exec($ch);
> ..
> however the document i am sending does not seem to have the utf8
> encoding


What does Solr have stored for the documents?  If you haven't set your  
indexed fields to be stored, go ahead and do so (and restart/reindex)  
for troubleshooting and do a /select?q=*:* to see what got stored for  
the documents you're having trouble finding.  I imagine if you have  
encoding issues, that will show up as mangled stored text that  
couldn't be analyzed properly.


How are you getting $post_string in your code?
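
In case it helps narrow things down, here is a minimal Python sketch (not your PHP code) of the check I have in mind: build the add message as a string, encode it as UTF-8 exactly once, and verify the bytes before they are posted. The field names "id" and "text" and the helper names are only assumptions for illustration; use whatever your schema defines.

```python
from xml.sax.saxutils import escape


def build_add_xml(doc_id: str, text: str) -> bytes:
    """Build a Solr <add> message and encode it as UTF-8 bytes.

    The field names "id" and "text" are hypothetical placeholders.
    """
    xml = (
        "<add><doc>"
        f'<field name="id">{escape(doc_id)}</field>'
        f'<field name="text">{escape(text)}</field>'
        "</doc></add>"
    )
    return xml.encode("utf-8")


def is_valid_utf8(payload: bytes) -> bool:
    """Return True if the payload decodes cleanly as UTF-8."""
    try:
        payload.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


body = build_add_xml("doc1", "Tiësto")
print(is_valid_utf8(body))                        # payload built above
print(is_valid_utf8("Tiësto".encode("latin-1")))  # same text, wrong encoding
```

If the bytes you post fail a check like this, the problem is on the client side, before Tomcat or Solr ever see the request.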

Erik



Re: why don't we have a forum for discussion?

2009-02-18 Thread Shashi Kant
Steve - could you not just subscribe to the list from another (off-mobile 
device) email account, for example Gmail or Yahoo?
We discourage using corporate email for subscribing to mailing lists precisely 
for such reasons: volume, spam, malware risks etc.

Shashi




----- Original Message -----
From: Stephen Weiss 
To: solr-user@lucene.apache.org
Sent: Wednesday, February 18, 2009 7:34:30 PM
Subject: Re: why don't we have a forum for discussion?

Like an earlier poster, my issue isn't on the laptop, it's with my mobile 
device.  The sheer volume of e-mail overwhelms the thing sometimes (right now, 
for instance).  There's really no option for moving the e-mail off to some 
other folder, it just all goes to one place.

Perhaps that means I need a better phone, it's just the obvious solutions 
aren't always practical.  Forums can conversely just as easily be set up to 
emulate mailing lists as well...  Our company's internal forum works this way.

--
Steve

On Feb 18, 2009, at 7:16 PM, Mike Klaas wrote:

> 
> 
> 2. Many people greatly prefer the mailing list format (obviously, it takes a 
> little bit of effort to use mailing lists effectively, e.g., directing the 
> traffic to a folder/tag/etc.)



Re: utf 8 issue

2009-02-18 Thread revathy arun
Hi Erik,

$post_string is XML data.
I don't see any content for those files when I query *:*. What would that
mean?



On 2/19/09, Erik Hatcher  wrote:
>
>
> On Feb 18, 2009, at 1:53 PM, revathy arun wrote:
>
>> I am using php  curl to post data to solr
>>
>> container tomcat
>> i have uriencoding set to utf8 in tomcats server.xml file
>>
>> this is how its indexed
>> 
>> $header[] = "Content-Type: text/xml; charset=utf-8";
>>  curl_setopt($ch, CURLOPT_URL,$url);
>>  curl_setopt( $ch, CURLOPT_HTTPHEADER, $header );
>>  curl_setopt($ch, CURLOPT_POST, 1);
>>  curl_setopt($ch, CURLOPT_POSTFIELDS,$post_string);
>> .$data = curl_exec($ch);
>> ..
>> however the document i am sending does not seem to have the utf8 encoding
>>
>
> What does Solr have stored for the documents?  If you haven't set your
> indexed fields to be stored, go ahead and do so (and restart/reindex) for
> troubleshooting and do a /select?q=*:* to see what got stored for the
> documents you're having trouble finding.  I imagine if you have encoding
> issues, that will show up as mangled stored text that couldn't be analyzed
> properly.
>
> How are you getting $post_string in your code?
>
>Erik
>
>


Unified search of relational data on Solr?

2009-02-18 Thread Senthil Kumar
Hi,

  How can we index relational data in Solr that cannot be merged into a
single file for some reason?
  We have two kinds of XMLs indexed in Solr,

   1_persona
   
   
   



   1_addr
   washington


  Our aim is to get a list of persons living in Washington. Can anyone
suggest the best approach for this, and for indexing relational data in
general?


Senthil Kumar P


Re: Unified search of relational data on Solr?

2009-02-18 Thread Otis Gospodnetic
Hi,

Just flatten it - create a single Person + Address entity (document) and index 
it.
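
A rough Python sketch of what I mean by flattening, done on the client before indexing. The record layout and the field names ("id", "name", "person_id", "city") are made up for illustration, since I don't know your actual schema:

```python
def flatten(persons, addresses):
    """Join each person with its address into one flat document.

    All field names here are hypothetical; adapt them to the real schema.
    """
    addr_by_person = {a["person_id"]: a for a in addresses}
    docs = []
    for p in persons:
        addr = addr_by_person.get(p["id"], {})
        docs.append({
            "id": p["id"],
            "name": p["name"],
            "city": addr.get("city"),  # None when no address is known
        })
    return docs


persons = [{"id": "1", "name": "a"}, {"id": "2", "name": "b"}]
addresses = [
    {"person_id": "1", "city": "washington"},
    {"person_id": "2", "city": "boston"},
]

docs = flatten(persons, addresses)
in_washington = [d["name"] for d in docs if d["city"] == "washington"]
print(in_washington)
```

Once each document carries both the person and the city, "persons living in Washington" becomes a single field query (something like city:washington, assuming such a field) instead of a join.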

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch





From: Senthil Kumar 
To: solr-user@lucene.apache.org
Sent: Thursday, February 19, 2009 1:20:23 PM
Subject: Unified search of relational data on Solr?

Hi,

  How to index relational data in Solr which can not be merged as a
single file for some reasons?
  We have two kinds of XMLs indexed in Solr,

   1_persona
   
   
   



   1_addr
   washington


  Our aim to get a list of persons living in Washington. Can anyone
suggest what is the best approach for this and to index relational data in
general?


Senthil Kumar P


Re: Unified search of relational data on Solr?

2009-02-18 Thread Kalidoss MM
Even in our case, we can't flatten it, because we are managing complete image
gallery information in Solr. Each image gallery contains around 20 images,
along with image description, thumbnail info, width, height, etc., and we also
want to store/update the stats along with the image gallery.

If we flatten the XML, for every visit to the image gallery I need to update
the whole lengthy record again in Solr. We have around 30 lakh (3 million)
image galleries, and around 50K image gallery stats are supposed to be updated
per day.

So we are thinking of splitting the image gallery and (stats, comments) into
separate XML documents.

1) If anybody has used ParallelReader (Lucene), let me know how it would be
useful for us.
2) If anybody has used multicore, let me know how it would be useful for us.
3) Would "MultipleIndexes" be useful or not?
http://wiki.apache.org/solr/MultipleIndexes

Please advise,

Thanks,
kalidoss.m,

On Thu, Feb 19, 2009 at 11:24 AM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

> Hi,
>
> Just flatten it - create a single Person + Address entity (document) and
> index it.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
>
> 
> From: Senthil Kumar 
> To: solr-user@lucene.apache.org
> Sent: Thursday, February 19, 2009 1:20:23 PM
> Subject: Unified search of relational data on Solr?
>
> Hi,
>
>  How to index relational data in Solr which can not be merged as a
> single file for some reasons?
>  We have two kinds of XMLs indexed in Solr,
> 
>   1_persona
>   
>   
>   
> 
>
> 
>   1_addr
>   washington
> 
>
>  Our aim to get a list of persons living in Washington. Can anyone
> suggest what is the best approach for this and to index relational data in
> general?
>
>
> Senthil Kumar P
>


Re: Unified search of relational data on Solr?

2009-02-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
Do you wish to search on the image names, or do you only wish
to read the image details?
--Noble

On Thu, Feb 19, 2009 at 12:31 PM, Kalidoss MM  wrote:
> Even in my case, we cant make it flattern, Bcoz we are managing total image
> gallery information in Solr, So image gallery contains aroung 20 images also
> with image descrption, thumbnail info, width, height, etc also we want to
> store/update the stats along with image gallery,
>
> If we flatten the xml, for every visit to the image gallery i need to update
> the whole lengh record again into Solr, we have around 30lacs image gallery
> also per day around 50K imagegallery stats supposed to update,
>
> So we are thinking of spliting of Image gallery And (Stats, comments) as
> separate xml..
>
> 1) if any body used parallel Reader (lucene) let me know how this will be
> usefull for us,
> 2) If any body used multicore let me know how this will be useful for us.
> 3) Is "MultipleIndexes" will be useful or not?
> http://wiki.apache.org/solr/MultipleIndexes
>
> Please suggest us,
>
> Thanks,
> kalidoss.m,
>
> On Thu, Feb 19, 2009 at 11:24 AM, Otis Gospodnetic <
> otis_gospodne...@yahoo.com> wrote:
>
>> Hi,
>>
>> Just flatten it - create a single Person + Address entity (document) and
>> index it.
>>
>> Otis
>> --
>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>
>>
>>
>>
>> 
>> From: Senthil Kumar 
>> To: solr-user@lucene.apache.org
>> Sent: Thursday, February 19, 2009 1:20:23 PM
>> Subject: Unified search of relational data on Solr?
>>
>> Hi,
>>
>>  How to index relational data in Solr which can not be merged as a
>> single file for some reasons?
>>  We have two kinds of XMLs indexed in Solr,
>> 
>>   1_persona
>>   
>>   
>>   
>> 
>>
>> 
>>   1_addr
>>   washington
>> 
>>
>>  Our aim to get a list of persons living in Washington. Can anyone
>> suggest what is the best approach for this and to index relational data in
>> general?
>>
>>
>> Senthil Kumar P
>>
>



-- 
--Noble Paul


Re: foreign characters equivalent in solr search

2009-02-18 Thread radarghost

thanks

We will try that and post the results here, but it seems we may get problems
with the highlight function.
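
For reference, the folding step in the suggestion quoted below can be sketched in plain Python. This is not Solr or Lucene code, and the function names are only illustrative; it just shows the idea of keeping the original token and adding an unaccented variant next to it:

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip combining marks so that e.g. 'ë' becomes 'e' and 'å' becomes 'a'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def expand_token(token: str) -> list:
    """Return the token plus its folded form when they differ.

    This mimics injecting the folded form as a synonym at index time.
    """
    folded = fold_accents(token)
    return [token] if folded == token else [token, folded]


print(fold_accents("Tiësto"))
print(expand_token("Tiësto"))
```

Because the original token is kept alongside the folded one, a search for either spelling can match the same document.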



Ahmet Arslan wrote:
> 
> I think best way to do this is to modify
> org.apache.lucene.index.memory.SynonymTokenFilter and employ this filter
> index time.
> 
> if token.termBuffer() has one those (á, à, â, ä, ã, å) characters you will
> replace it with its equvalent ascii character (a). Then you will inject
> this new Token as a Synonym.
> 
> I don't know is it the best way but it will give you what you want.
> 
> --- On Wed, 2/18/09, radarghost  wrote:
> 
>> From: radarghost 
>> Subject: foreign characters equivalent in solr search
>> To: solr-user@lucene.apache.org
>> Date: Wednesday, February 18, 2009, 4:28 PM
>> we are using solr 1.2 and dont want to upgrade to 1.3 till
>> official release
>> for Debian.
>> i want solr to search for equivalent of a foreign chracter
>> for getting
>> better results
>> 
>> in example:
>> 
>> if a user searches for Tiesto which is indexed in this
>> format Tiësto in our
>> solr. we want solr also return result
>> return search result for á, à, â, ä, ã, å where they
>> are in word but that
>> word has been searched with normal a
>> e for ë, i for ï, o for ö, and so on
>> 
>> any solution?
>> 
>> hope i could tell what i need with my poor English
>> 
>> thanks
>> 
>> 
>> -- 
>> View this message in context:
>> http://www.nabble.com/foreign-characters-equivalent-in-solr-search-tp22079912p22079912.html
>> Sent from the Solr - User mailing list archive at
>> Nabble.com.
> 
> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/foreign-characters-equivalent-in-solr-search-tp22079912p22095325.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: foreign characters equivalent in solr search

2009-02-18 Thread radarghost

It may take too long for Solr 1.4 to be released.

Is there any other solution for Solr 1.2?

anyway thanks for the reply.


Koji Sekiguchi-2 wrote:
> 
> CharFilter will solve the problem, but it comes with Solr 1.4.
> 
> https://issues.apache.org/jira/browse/SOLR-822
> 
> Koji
> 
> AHMET ARSLAN wrote:
>> I think best way to do this is to modify
>> org.apache.lucene.index.memory.SynonymTokenFilter and employ this filter
>> index time.
>>
>> if token.termBuffer() has one those (á, à, â, ä, ã, å) characters you
>> will replace it with its equvalent ascii character (a). Then you will
>> inject this new Token as a Synonym.
>>
>> I don't know is it the best way but it will give you what you want.
>>
>> --- On Wed, 2/18/09, radarghost  wrote:
>>
>>   
>>> From: radarghost 
>>> Subject: foreign characters equivalent in solr search
>>> To: solr-user@lucene.apache.org
>>> Date: Wednesday, February 18, 2009, 4:28 PM
>>> we are using solr 1.2 and dont want to upgrade to 1.3 till
>>> official release
>>> for Debian.
>>> i want solr to search for equivalent of a foreign chracter
>>> for getting
>>> better results
>>>
>>> in example:
>>>
>>> if a user searches for Tiesto which is indexed in this
>>> format Tiësto in our
>>> solr. we want solr also return result
>>> return search result for á, à, â, ä, ã, å where they
>>> are in word but that
>>> word has been searched with normal a
>>> e for ë, i for ï, o for ö, and so on
>>>
>>> any solution?
>>>
>>> hope i could tell what i need with my poor English
>>>
>>> thanks
>>>
>>>
>>> -- 
>>> View this message in context:
>>> http://www.nabble.com/foreign-characters-equivalent-in-solr-search-tp22079912p22079912.html
>>> Sent from the Solr - User mailing list archive at
>>> Nabble.com.
>>> 
>>
>>
>>   
>>
>>   
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/foreign-characters-equivalent-in-solr-search-tp22079912p22095354.html
Sent from the Solr - User mailing list archive at Nabble.com.