Can we append a field to the response that is not in the index but computed at runtime?

2008-03-28 Thread Umar Shah
Hi,

I wanted to know whether we can append a field (say, Fdyn) to each doc in the
returned set.
Fdyn is computed at runtime in Solr as some complex function of the fields
stored in the index.



-umar


RE: Highlight - get terms used by lucene

2008-03-28 Thread Tim Mahy
Hi,

Solr returns the max score and the score per document.
This means that, relative to the max score, the best hit is always 100%, which
is not always what you want because the article itself could still be quite
irrelevant...
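
To illustrate the point, here is a minimal Python sketch of what happens when each score is divided by the max score. The function name and the sample values are made up for illustration; only the idea of a max score plus a per-document score comes from the message above.

```python
def normalize_scores(max_score, docs):
    """Return (doc id, score as a percentage of max_score) pairs."""
    if max_score <= 0:
        return [(doc["id"], 0.0) for doc in docs]
    return [(doc["id"], 100.0 * doc["score"] / max_score) for doc in docs]


# Hypothetical values: both hits are weak matches in absolute terms.
weak_hits = [{"id": "a", "score": 0.02}, {"id": "b", "score": 0.01}]
percentages = normalize_scores(0.02, weak_hits)
print(percentages)
```

The top hit comes out at 100% no matter how low its raw score is, so the percentage says nothing about absolute relevance.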

Regards,
Tim


-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED]
Sent: Fri 28-3-2008 4:34
To: solr-user@lucene.apache.org
Subject: RE: Highlight - get terms used by lucene


: thanks for the answer, with that information I can pull out the term
: frequency. Reason for all this, is that we want to use this scoring
: algorithm:
: http://download-uk.oracle.com/docs/cd/B19306_01/text.102/b14218/ascore.htm

Uh, why?  Based on the description, this sounds exactly like the Lucene
scoring formula with some of the details glossed over ... why not just use
the score Solr computes for you?


-Hoss








Re: synonyms

2008-03-28 Thread Erick Erickson
Your problem might be solved (from memory, so check it) by using an indexing
filter that collapses flexed (accented, etc.?) characters.
See ISOLatin1AccentFilter.
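
As a rough illustration of what "collapsing accented characters" means, here is a Python sketch that strips combining marks via Unicode decomposition. It is an approximation for illustration only: the filter named above uses its own character mapping, and `fold_accents` is a hypothetical helper name.

```python
import unicodedata


def fold_accents(text):
    """Strip combining marks so accented letters match their base letters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold_accents("refrigeração"))  # refrigeracao
```

Folding only removes the accents. Different word forms such as "gelado" and "geladeira" are still distinct tokens afterwards.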

Best
Erick

On Tue, Mar 25, 2008 at 1:56 PM, Lucas F. A. Teixeira <
[EMAIL PROTECTED]> wrote:

> Hello all,
>
> We are having some problems using Solr synonyms. Suppose I define a
> synonym, for example:
>
> refrigerador,geladeira
>
> If I search for "refrigerador", I get all results for "refrigerador",
> for "geladeira", and for the flexed forms of the word I typed
> (refrigerador, refrigerado, refrigeração, etc.). But I don't get results
> for the flexed forms of the synonym I defined (geladeira), for example
> "gelado", "gelo", etc.
>
>
> Do you know how I can solve this issue?
>
> Thanks all!
>
> []s,
>
> Lucas
>


Re: synonyms

2008-03-28 Thread Lucas F. A. Teixeira

Thanks Erick,

But it's already being used :-(

Still looking for something :-)

Thank you!

[]s,

Lucas




  


Re: Solr commits automatically on appserver shutdown

2008-03-28 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi,
I am willing to work on this if you can give me some pointers on
where to start.


On Thu, Mar 27, 2008 at 9:48 AM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On Thu, Mar 27, 2008 at 12:11 AM, Noble Paul നോബിള്‍ नोब्ळ्
> <[EMAIL PROTECTED]> wrote:
> > Can I make an API call to remove the stale indexsearcher so that the
> > documents do not get committed?
> >
> > Basically what I need is a 'rollback' feature
>
> This should be possible when Solr starts using Lucene's update,
> delete, and deleteByQuery features on the IndexWriter.
>
> -Yonik
>



-- 
--Noble Paul


Re: synonyms

2008-03-28 Thread Erick Erickson
Hmm. Could you provide some more examples? I'm having a hard
time figuring out what's going into the index, what you're searching
on, and what you're getting...

In particular, what filters are you using for *both* indexing and queries?

Best
Erick



RE: synonyms

2008-03-28 Thread Lance Norskog
Lucas- 

Your examples are Portuguese and Spanish. You might find a Spanish-language
stemmer that follows the very rigid conjugation in Spanish (and I'm assuming
in Portuguese as well). Spanish follows conjugation rules that embed much
more semantics than English, so a huge number of synonyms can be stemmed to
the same word.

Lance




Re: synonyms

2008-03-28 Thread Leonardo Santagada


On 28/03/2008, at 16:28, Lance Norskog wrote:

> Lucas-
>
> Your examples are Portuguese and Spanish. You might find a
> Spanish-language stemmer that follows the very rigid conjugation in
> Spanish (and I'm assuming in Portuguese as well). Spanish follows
> conjugation rules that embed much more semantics than English, so a huge
> number of synonyms can be stemmed to the same word.


Well, his examples are in Brazilian Portuguese, not Spanish, and the
biggest problem is that a Spanish stemmer is not going to work. I
haven't found a pt_BR stemmer; have I overlooked something?


--
Leonardo Santagada






Re: Making stop-words optional with DisMax?

2008-03-28 Thread Chris Hostetter

: Operationally, I was thinking a tokenizer could use the stop-word list
: (or an optional-word list) to mark tokens as optional rather than
: removing them from the token stream.  DisMaxOptional would then
: generate appropriate queries with the non-optionals as the core and
: then permute the optionals around those as optional clauses.  I say
: this with no deep understanding of how DisMax does its thing, of
: course, so feel free to call me naive.

you're not naive ... the problem is just that *all* of the clauses are
already optional (unless the term had a "+" or "-" in front of it);
that's where the mm param comes in: it decides how many of those optional
clauses should be mandatory.
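
A simplified model of that idea, sketched in Python. It covers only plain integers and percentages; conditional mm expressions are left out, and clamping the result to the valid range is this sketch's own choice. The function name is hypothetical and this is not Solr's code.

```python
def min_should_match(mm, optional_clauses):
    """Resolve a simple mm value to the number of optional clauses that must match.

    Handles only plain integers ("2", "-1") and percentages ("75%", "-25%").
    """
    spec = mm.strip()
    if spec.endswith("%"):
        percent = int(spec[:-1])
        # Percentages are rounded down before use.
        amount = optional_clauses * abs(percent) // 100
    else:
        percent = int(spec)
        amount = abs(percent)
    # A negative value means "this many clauses may be missing".
    required = optional_clauses - amount if percent < 0 else amount
    return max(0, min(required, optional_clauses))


# "on mice and men" has four optional clauses.
print(min_should_match("75%", 4))
```

With four clauses, "75%" and "-25%" both resolve to three required matches, so any one clause may be missing regardless of whether it is a stop word.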

it sounds like what you want is for a new DisMaxOptional parser to look at 
this...

on mice and men

and because it knows "on" and "and" are stop words, treat it the same as 
if the current DisMax parsed this...

on +mice and +men

which is another interesting idea, but it changes the meaning of "mm"
significantly, in that dismax with a low mm would no longer be tolerant of
misspelled (or missing) words unless they were stop words.
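
The rewrite described above can be sketched in a few lines of Python. This is only an illustration of the proposed behaviour: DisMaxOptional does not exist, the function name is hypothetical, and whitespace splitting stands in for real query analysis.

```python
def require_non_stopwords(query, stopwords):
    """Prefix every term that is not a stop word with '+' (mandatory)."""
    marked = []
    for term in query.split():
        if term.lower() in stopwords or term[0] in "+-":
            marked.append(term)  # leave stop words and explicit +/- terms alone
        else:
            marked.append("+" + term)
    return " ".join(marked)


print(require_non_stopwords("on mice and men", {"on", "and"}))  # on +mice and +men
```

Every non-stop word becomes mandatory here, which is exactly why a misspelled "mice" would then match nothing.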

my gut tells me changing dismax so that having multiple qf params result 
in multiple dismax queries would address your problem more directly.

: I think I've so internalized list advice *not* to generate multiple
: queries that that didn't readily occur to me.  :-)   One problem I
: suppose is that query might return some results but not the desired
: one (perhaps there is a title On Men and Mice) and so I don't get to
: the second query ("mice men" once stopped) that would get me Of Mice
: and Men.  But an improvement in cases where no results come back from
: an overspecified query, I'd agree.

...which is why multiple dismax queries as clauses in the main query 
would be good ... the results from each would be blended together.

: The other thought I've had is to just do some query analysis up front
: prior to submission -- if the query is all stops, send it to a
...
: to boost up exact matches.  I hate the analysis step which would
: probably duplicate the tokenization done by solr, but might be worth
: it.  There'd still be some problematic queries, but this may be as
: close as it'll get.

you could probably skip the external analysis by swapping the order of
your queries and looking at the debugging output when hitting the "second"
query ... if your stopworded fields don't appear in the parsed query
structure, then it's all stopwords, so you do need your "first" query.
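
A client-side check along those lines might look like the Python sketch below. It assumes you already have the parsed query string from the debugging output; the field names and sample strings are invented for illustration, and real parsed queries may be formatted differently.

```python
import re


def mentions_any_field(parsed_query, field_names):
    """True if any of field_names appears as 'name:' in the parsed query string."""
    return any(
        re.search(r"(?<![\w.])" + re.escape(name) + ":", parsed_query)
        for name in field_names
    )


# Hypothetical parsed queries: "title" is stopworded, "title_exact" is not.
all_stops = "+DisjunctionMaxQuery((title_exact:of the))"
mixed = "+DisjunctionMaxQuery((title:mice | title_exact:of mice and men))"

print(mentions_any_field(all_stops, ["title"]))
print(mentions_any_field(mixed, ["title"]))
```

If the check comes back false for the stopworded fields, the query was all stop words and the exact-match query is the one to send.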


-Hoss