More contextual information in analyzers

2010-03-08 Thread dbejean

Hello,

If I write a custom analyzer that accepts a specific attribute in its
constructor

public MyCustomAnalyzer(String myAttribute);

Is there a way to dynamically pass a value for this attribute from Solr at
index time, within the XML message?


  
[XML example not preserved in the archive]


Obviously, in Solr's schema.xml, the "content" field is associated with my
custom analyzer.

Thank you.

Dominique

-- 
View this message in context: 
http://old.nabble.com/More-contextual-information-in-anlyzers-tp27819298p27819298.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: More contextual information in analyser

2010-03-08 Thread dbejean

It is true that I also need this metadata at query time. For the moment, I
put this extra information at the beginning of the data to be indexed and
at the beginning of the query. It works, but I really don't like it. In my
case, I need the language of the data to be indexed and the language of the
query.

The goal is to dynamically use the correct chain of tokenizers and filters
according to the language and so use only one field in my index for all
languages.
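
The prefix workaround described above can be sketched roughly as follows.
This is only an illustration in plain Java, not Solr or Lucene code: the
"[xx]" marker syntax, the class names LanguagePrefixedText and
LanguageChains, and the toy whitespace "chain" are all assumptions made for
the example. A real implementation would select actual tokenizer and filter
chains instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of Solr or Lucene: it only illustrates the
// "language marker in front of the value" convention described above.
final class LanguagePrefixedText {
    final String language; // e.g. "fr"; "" when no marker is present
    final String text;     // the value with the marker removed

    private LanguagePrefixedText(String language, String text) {
        this.language = language;
        this.text = text;
    }

    // Assumed marker syntax: "[xx]" at the very start of the value.
    static LanguagePrefixedText parse(String raw) {
        if (raw.startsWith("[")) {
            int end = raw.indexOf(']');
            if (end > 1) {
                return new LanguagePrefixedText(
                        raw.substring(1, end).toLowerCase(Locale.ROOT),
                        raw.substring(end + 1).trim());
            }
        }
        return new LanguagePrefixedText("", raw);
    }
}

// Stand-in for "one chain of tokenizers and filters per language".
final class LanguageChains {
    private final Map<String, Function<String, List<String>>> chains = new HashMap<>();
    private final Function<String, List<String>> fallback;

    LanguageChains(Function<String, List<String>> fallback) {
        this.fallback = fallback;
    }

    void register(String language, Function<String, List<String>> chain) {
        chains.put(language, chain);
    }

    // Called the same way for indexed values and for queries.
    List<String> analyze(String raw) {
        LanguagePrefixedText parsed = LanguagePrefixedText.parse(raw);
        return chains.getOrDefault(parsed.language, fallback).apply(parsed.text);
    }

    // Toy chain: lowercase, split on whitespace, drop the given stop words.
    static Function<String, List<String>> simpleChain(String... stopWords) {
        List<String> stops = Arrays.asList(stopWords);
        return text -> {
            List<String> tokens = new ArrayList<>();
            for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!token.isEmpty() && !stops.contains(token)) {
                    tokens.add(token);
                }
            }
            return tokens;
        };
    }
}
```

With a chain registered for "fr" and another for "en", both the indexed
value "[fr]Le chat dort" and a query such as "[fr]chat" go through the
French chain, so a single field can hold every language.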



Lance Norskog-2 wrote:
> 
> This is an interesting idea. There are other projects to make the
> analyzer/filter chain more "porous", or open to outside interaction.
> 
> A big problem is that queries are analyzed, too. If you want to give
> the same metadata to the analyzer when doing a query against the
> field, things get tough. You would need a special query parser to
> implement your own syntax to do that. However, the analyzer chain in
> the query phase does not receive the parsed query, so you have to in
> some way change this.
> 
> On Mon, Mar 8, 2010 at 2:14 AM, dbejean wrote:
>>
>> Hello,
>>
>> If I write a custom analyzer that accepts a specific attribute in its
>> constructor
>>
>> public MyCustomAnalyzer(String myAttribute);
>>
>> Is there a way to dynamically pass a value for this attribute from Solr
>> at index time, within the XML message?
>>
>> [XML example not preserved in the archive]
>>
>> Obviously, in Solr's schema.xml, the "content" field is associated with
>> my custom analyzer.
>>
>> Thank you.
>>
>> Dominique
>>
> 
> 
> 
> -- 
> Lance Norskog
> goks...@gmail.com
> 
> 




Re: More contextual information in analyzers

2010-03-09 Thread dbejean

So the way I built my analyzer is the right one. Thank you.


hossman wrote:
> 
> 
> : If I write a custom analyzer that accepts a specific attribute in its
> : constructor
> :
> : public MyCustomAnalyzer(String myAttribute);
> :
> : Is there a way to dynamically pass a value for this attribute from Solr
> : at index time, within the XML message?
> :
> : [XML example not preserved in the archive]
> 
> Fundamentally, there are two problems with trying to add functionality
> like this to Solr...
>
> 1) The XML update syntax is just *one* of several different pathways by
> which data can make it into Solr, and well before it reaches your custom
> analyzer, it is converted into what is essentially just a list of triplets
> (fieldName, fieldValue, boost). So it would be hard to attach additional
> metadata attributes to field values in a way that works for every pathway.
>
> 2) In Solr (and in Lucene in general) you don't get a separate Analyzer
> instance per field/value pair: one Analyzer is reused over and over for
> every field=>value in a doc (and in fact, the same Analyzer is used over
> and over for every document as well).
>
> This is why people typically encode their "attributes" in the value, and
> then write their Tokenizers in such a way that they decode that info and
> store it as a Payload on the terms. Even if you bypassed Solr's pipeline
> by adding documents directly from some custom RequestHandler that knew
> about your extended XML syntax, there wouldn't be any way to pass that
> metadata to the (long-lived) Analyzer instance.
> 
> 
> 
> -Hoss
> 
> 
> 
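
The "encode the attribute in the value, decode it in the tokenizer"
approach described above might look roughly like the sketch below. It is
plain Java with no Lucene dependency, so TokenWithPayload and
AttributeDecodingTokenizer are hypothetical stand-ins rather than Lucene
classes, and the "attribute|text" encoding is an assumption chosen for the
example. In a real Tokenizer the decoded bytes would be attached to each
term through Lucene's payload mechanism.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not a Lucene class: a term plus the per-term
// bytes that a real Tokenizer would attach as a payload.
final class TokenWithPayload {
    final String term;
    final byte[] payload;

    TokenWithPayload(String term, byte[] payload) {
        this.term = term;
        this.payload = payload;
    }

    String payloadAsString() {
        return new String(payload, StandardCharsets.UTF_8);
    }
}

// One long-lived instance handles every value, so it keeps no per-value
// state in fields; everything it needs arrives inside the value itself.
final class AttributeDecodingTokenizer {
    // Assumed encoding: "<attribute>|<text>". The separator is arbitrary.
    private static final char SEPARATOR = '|';

    List<TokenWithPayload> tokenize(String fieldValue) {
        int cut = fieldValue.indexOf(SEPARATOR);
        String attribute = cut < 0 ? "" : fieldValue.substring(0, cut);
        String text = cut < 0 ? fieldValue : fieldValue.substring(cut + 1);
        byte[] payload = attribute.getBytes(StandardCharsets.UTF_8);

        List<TokenWithPayload> tokens = new ArrayList<>();
        for (String term : text.trim().split("\\s+")) {
            if (!term.isEmpty()) {
                // Each token gets its own copy of the decoded attribute.
                tokens.add(new TokenWithPayload(term, payload.clone()));
            }
        }
        return tokens;
    }
}
```

The same instance can tokenize "fr|Bonjour le monde" and then
"en|Hello world"; the terms of the first value carry "fr" and those of the
second carry "en", without the instance ever being reconfigured.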




Term Highlighting without storing text in the index

2010-03-15 Thread dbejean

Hello,

Just to be able to show term highlighting in my results list, I store all
the indexed data in the Lucene index, which makes it very large (108 GB).
Is there any other way to do this? Now or in the future, could Solr use a
third-party tool such as ehcache to store the content of the indexed
documents outside the Lucene index?

Thank you

Dominique


-- 
View this message in context: 
http://old.nabble.com/Term-Highlighting-without-store-text-in-index-tp27904022p27904022.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Term Highlighting without storing text in the index

2010-03-19 Thread dbejean

Thank you, I will test this.


Alexey-34 wrote:
> 
> Hey Dominique,
> 
> See
> http://www.lucidimagination.com/search/document/5ea8054ed8348e6f/highlight_arbitrary_text#3799814845ebf002
> 
> Although it might not be a good solution for huge texts or for
> wildcard/phrase queries.
> http://issues.apache.org/jira/browse/SOLR-1397
> 
> On Mon, Mar 15, 2010 at 4:09 PM, dbejean wrote:
>>
>> Hello,
>>
>> Just to be able to show term highlighting in my results list, I store all
>> the indexed data in the Lucene index, which makes it very large (108 GB).
>> Is there any other way to do this? Now or in the future, could Solr use a
>> third-party tool such as ehcache to store the content of the indexed
>> documents outside the Lucene index?
>>
>> Thank you
>>
>> Dominique
>>
>>
> 
> 
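
To make the idea of highlighting text that is kept outside the index more
concrete, here is a minimal sketch in plain Java. It is not the Solr
highlighter and does not use the patch referenced above:
ExternalTextHighlighter is a hypothetical class, the in-memory Map stands
in for an external store such as ehcache, and the matching is a naive
whole-word comparison. As noted in the reply, this kind of approach does
not cover wildcard or phrase queries and may be slow on huge texts.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch: document text lives in an external store (a plain
// Map here) and is marked up after the search has returned document ids.
final class ExternalTextHighlighter {
    private final Map<String, String> store = new HashMap<>();

    void put(String docId, String text) {
        store.put(docId, text);
    }

    // Wraps whole-word, case-insensitive matches of each term in <em> tags.
    // Returns null when the store has no text for the given id.
    String highlight(String docId, List<String> queryTerms) {
        String text = store.get(docId);
        if (text == null) {
            return null;
        }
        StringBuilder alternatives = new StringBuilder();
        for (String term : queryTerms) {
            if (term.isEmpty()) {
                continue; // an empty term would match everywhere
            }
            if (alternatives.length() > 0) {
                alternatives.append('|');
            }
            alternatives.append(Pattern.quote(term));
        }
        if (alternatives.length() == 0) {
            return text;
        }
        Pattern pattern = Pattern.compile(
                "\\b(" + alternatives + ")\\b",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CHARACTER_CLASS);
        return pattern.matcher(text).replaceAll("<em>$1</em>");
    }
}
```

The index then only needs to return document ids; the text for the few
documents shown on a results page is fetched from the store and marked up
at display time.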
