RE: Using MLT feature

2011-04-08 Thread Frederico Azeiteiro
2011 10:11 To: solr-user@lucene.apache.org Subject: Re: Using MLT feature Couldn't you extend the TextProfileSignature and modify the TokenComparator class to use lexical order when tokens have the same frequency? Ludovic. 2011/4/8 Frederico Azeiteiro [via Lucene] < ml-node+2794604-1

Re: Using MLT feature

2011-04-08 Thread lboutros
> Sent: Friday, 8 April 2011 09:49 > To: [hidden email] > Subject: Re: Using MLT feature > > It seems that tokens are sorted by frequency:

RE: Using MLT feature

2011-04-08 Thread Frederico Azeiteiro
order comes from the way they are inserted in the hashmap 'tokens' and not from the order in which the tokens appear in the original text. Frederico -Original Message- From: lboutros [mailto:boutr...@gmail.com] Sent: Friday, 8 April 2011 09:49 To: solr-user@lucene.apache.org Subject: Re:

Re: Using MLT feature

2011-04-08 Thread lboutros
It seems that tokens are sorted by frequency: ... Collections.sort(profile, new TokenComparator()); ... and private static class TokenComparator implements Comparator<Token> { public int compare(Token t1, Token t2) { return t2.cnt - t1.cnt; } } and cnt is the token count. Ludovic. 20
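
The comparator quoted above orders tokens by count only, so tokens with equal counts end up in whatever order they were collected, which may differ between a Java and a C# port. A minimal sketch of the lexical tie-break suggested elsewhere in this thread follows. The Token class here is a hypothetical stand-in (fields val and cnt are assumptions), and sortedValues is an illustrative helper, not part of Solr.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TokenOrderSketch {

    // Hypothetical stand-in for the signature's internal token type.
    static final class Token {
        final String val; // token text (assumed field name)
        final int cnt;    // token count, as in the quoted comparator

        Token(String val, int cnt) {
            this.val = val;
            this.cnt = cnt;
        }
    }

    // Higher counts first; equal counts fall back to lexical order of the text.
    static final class TokenComparator implements Comparator<Token> {
        @Override
        public int compare(Token t1, Token t2) {
            int byCount = Integer.compare(t2.cnt, t1.cnt);
            return byCount != 0 ? byCount : t1.val.compareTo(t2.val);
        }
    }

    // Illustrative helper: returns token texts in the deterministic order.
    static List<String> sortedValues(Map<String, Integer> counts) {
        List<Token> profile = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            profile.add(new Token(e.getKey(), e.getValue()));
        }
        Collections.sort(profile, new TokenComparator());
        List<String> out = new ArrayList<>();
        for (Token t : profile) {
            out.add(t.val);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("lucene", 2);
        counts.put("solr", 3);
        counts.put("index", 2);
        // Output no longer depends on the map's iteration order.
        System.out.println(sortedValues(counts));
    }
}
```

Java's String.compareTo compares UTF-16 code units, so a C# port would probably need an ordinal comparison (for example string.CompareOrdinal) rather than a culture-sensitive one to produce the same order.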

RE: Using MLT feature

2011-04-07 Thread Frederico Azeiteiro
okens by some hashmap internal sort method that I can't understand :), and so it is impossible to copy to the C# implementation. Thank you for all your help, Frederico -Original Message- From: Lance Norskog [mailto:goks...@gmail.com] Sent: Thursday, 7 April 2011 04:09 To: solr-use

Re: Using MLT feature

2011-04-06 Thread Lance Norskog
my apps (Java and C#) return the same signature but SOLR returns a > different one. > Can anyone see what I might be doing wrong? > > Thank you once again. > > Frederico > > -Original Message- > From: Markus Jelsma [mailto:markus.jel...@openindex.io] >

RE: Using MLT feature

2011-04-06 Thread Frederico Azeiteiro
Jelsma [mailto:markus.jel...@openindex.io] Sent: Tuesday, 5 April 2011 15:20 To: solr-user@lucene.apache.org Cc: Frederico Azeiteiro Subject: Re: Using MLT feature If you check the code for TextProfileSignature [1] you'll notice the init method reading params. You can set those pa

Re: Using MLT feature

2011-04-05 Thread Markus Jelsma
5 > > On the processor tag. > > Best regards, > Frederico > > > -Original Message- > From: Markus Jelsma [mailto:markus.jel...@openindex.io] > Sent: Tuesday, 5 April 2011 12:01 > To: solr-user@lucene.apache.org > Cc: Frederico Azeiteiro > S

RE: Using MLT feature

2011-04-05 Thread Frederico Azeiteiro
essor tag. Best regards, Frederico -Original Message- From: Markus Jelsma [mailto:markus.jel...@openindex.io] Sent: Tuesday, 5 April 2011 12:01 To: solr-user@lucene.apache.org Cc: Frederico Azeiteiro Subject: Re: Using MLT feature On Tuesday 05 April 2011 12:19:33 Fred

Re: Using MLT feature

2011-04-05 Thread Markus Jelsma
k you, > Frederico > > > -----Original Message- > From: Markus Jelsma [mailto:markus.jel...@openindex.io] > Sent: Monday, 4 April 2011 16:47 > To: solr-user@lucene.apache.org > Cc: Frederico Azeiteiro > Subject: Re: Using MLT feature > > > Hi

RE: Using MLT feature

2011-04-05 Thread Frederico Azeiteiro
at these parameters can help create the same signature for the above example? Is anyone using the TextProfileSignature successfully? Thank you, Frederico -Original Message- From: Markus Jelsma [mailto:markus.jel...@openindex.io] Sent: Monday, 4 April 2011 16:47 To: solr-user@

Re: Using MLT feature

2011-04-04 Thread Markus Jelsma
--Original Message- > From: Frederico Azeiteiro [mailto:frederico.azeite...@cision.com] > Sent: Monday, 4 April 2011 11:59 > To: solr-user@lucene.apache.org > Subject: RE: Using MLT feature > > Thank you Markus, it looks great. >

RE: Using MLT feature

2011-04-04 Thread Frederico Azeiteiro
Azeiteiro [mailto:frederico.azeite...@cision.com] Sent: Monday, 4 April 2011 11:59 To: solr-user@lucene.apache.org Subject: RE: Using MLT feature Thank you Markus, it looks great. But the wiki is not very detailed on this. Do you mean if I: 1. Create: true false

RE: Using MLT feature

2011-04-04 Thread Frederico Azeiteiro
t: Monday, 4 April 2011 10:48 To: solr-user@lucene.apache.org Subject: Re: Using MLT feature http://wiki.apache.org/solr/Deduplication On Monday 04 April 2011 11:34:52 Frederico Azeiteiro wrote: > Hi, > > The idea is to not index if something similar (headline+bodyte

Re: Using MLT feature

2011-04-04 Thread Markus Jelsma
in a temp index) > and then use the MLT feature to find similar docs before adding to final > index? > > Thanks, > Frederico > > > -Original Message- > From: Chris Fauerbach [mailto:chris.fauerb...@gmail.com] > Sent: Monday, 4 April 2011 10:22 >

RE: Using MLT feature

2011-04-04 Thread Frederico Azeiteiro
ssage- From: Chris Fauerbach [mailto:chris.fauerb...@gmail.com] Sent: Monday, 4 April 2011 10:22 To: solr-user@lucene.apache.org Subject: Re: Using MLT feature Do you want to skip indexing when something similar exists, or only when it is an exact duplicate? I would look into a hash code of the docum

Re: Using MLT feature

2011-04-04 Thread Chris Fauerbach
Do you want to skip indexing when something similar exists, or only when it is an exact duplicate? I would look into a hash code of the document if you don't want to index exact duplicates. Similar, though, I think has to be based off a document in the index. On Apr 4, 2011, at 5:16, Frederico Azeiteiro wrote: > Hi, > >
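
The hash-code idea for exact duplicates can be sketched roughly as below. This is only an illustration under assumptions: the field names headline and body, the choice of MD5, and the in-memory set of seen fingerprints are all hypothetical, and it catches byte-identical text only, not near-duplicates.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

public class ExactDuplicateSketch {

    // Fingerprints of documents already accepted (in memory, for illustration only).
    private final Set<String> seen = new HashSet<>();

    // Hex MD5 of headline and body, with a zero byte between the two fields
    // so that ("ab", "c") and ("a", "bc") hash differently.
    static String fingerprint(String headline, String body) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            md.update(headline.getBytes(StandardCharsets.UTF_8));
            md.update((byte) 0);
            md.update(body.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Returns true if the document is new and should be indexed,
    // false if an identical headline and body were seen before.
    boolean addIfNew(String headline, String body) {
        return seen.add(fingerprint(headline, body));
    }

    public static void main(String[] args) {
        ExactDuplicateSketch dedupe = new ExactDuplicateSketch();
        System.out.println(dedupe.addIfNew("Headline", "Body text"));
        System.out.println(dedupe.addIfNew("Headline", "Body text"));
    }
}
```

Any change in whitespace or case produces a different fingerprint, which is why the thread moves on to a fuzzier signature for the "similar" case.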