Hi guys,
is it possible to get any feedback?
Is there any process to speed up bug resolution / discussions?
I just want to understand whether the patch is not good enough, whether I
need to improve it, or whether simply no one has looked at it yet...

https://issues.apache.org/jira/browse/LUCENE-6954

Cheers

On 11 January 2016 at 15:25, Alessandro Benedetti <abenede...@apache.org>
wrote:

> Hi guys,
> the patch seems fine to me.
> I didn't spend much more time on the code but I checked the tests and the
> pre-commit checks.
> It seems fine to me.
> Let me know,
>
> Cheers
>
> On 31 December 2015 at 18:40, Alessandro Benedetti <abenede...@apache.org>
> wrote:
>
>> https://issues.apache.org/jira/browse/LUCENE-6954
>>
>> First draft patch available; I will check the tests more thoroughly in the new year!
>>
>> On 29 December 2015 at 13:43, Alessandro Benedetti <abenede...@apache.org>
>> wrote:
>>
>>> Sure, I will proceed tomorrow with the Jira and the simple patch + tests.
>>>
>>> In the meantime let's try to collect some additional feedback.
>>>
>>> Cheers
>>>
>>> On 29 December 2015 at 12:43, Anshum Gupta <ans...@anshumgupta.net>
>>> wrote:
>>>
>>>> Feel free to create a JIRA and put up a patch if you can.
>>>>
>>>> On Tue, Dec 29, 2015 at 4:26 PM, Alessandro Benedetti
>>>> <abenede...@apache.org> wrote:
>>>>
>>>> > Hi guys,
>>>> > While I was exploring the way we build the More Like This query, I
>>>> > found a part I am not convinced by:
>>>> >
>>>> >
>>>> >
>>>> > Let's see how we build the query :
>>>> > org.apache.lucene.queries.mlt.MoreLikeThis#retrieveTerms(int)
>>>> >
>>>> > 1) we extract the terms from the interesting fields, adding them to a
>>>> > map:
>>>> >
>>>> > Map<String, Int> termFreqMap = new HashMap<>();
>>>> >
>>>> > *(we lose the field -> term relation; we no longer know which field
>>>> > a term came from!)*
>>>> >
>>>> > org.apache.lucene.queries.mlt.MoreLikeThis#createQueue
>>>> >
>>>> > 2) we build the queue that will contain the query terms; at this
>>>> > point we connect these terms to a field again, but:
>>>> >
>>>> > ...
>>>> >> // go through all the fields and find the largest document frequency
>>>> >> String topField = fieldNames[0];
>>>> >> int docFreq = 0;
>>>> >> for (String fieldName : fieldNames) {
>>>> >>   int freq = ir.docFreq(new Term(fieldName, word));
>>>> >>   topField = (freq > docFreq) ? fieldName : topField;
>>>> >>   docFreq = (freq > docFreq) ? freq : docFreq;
>>>> >> }
>>>> >> ...
>>>> >
>>>> >
>>>> > We identify the topField as the field with the highest document
>>>> > frequency for the term t.
>>>> > Then we build the termQuery :
>>>> >
>>>> > queue.add(new ScoreTerm(word, *topField*, score, idf, docFreq, tf));
>>>> >
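To make the effect concrete, here is a small standalone sketch of the selection loop quoted above. It is not the Lucene code: the toy index, its numbers and the helper names (docFreq, topField, toyIndex) are assumptions made up for illustration, and the lookup only stands in for ir.docFreq(new Term(fieldName, word)).

```java
import java.util.HashMap;
import java.util.Map;

public class TopFieldSketch {

    // Hypothetical stand-in for ir.docFreq(new Term(fieldName, word)):
    // reads a toy document frequency instead of querying a real index.
    static int docFreq(Map<String, Map<String, Integer>> index, String fieldName, String word) {
        Map<String, Integer> perField = index.get(fieldName);
        if (perField == null) {
            return 0;
        }
        Integer freq = perField.get(word);
        return freq == null ? 0 : freq;
    }

    // Same selection rule as the quoted loop: keep the field with the
    // largest document frequency for the word.
    static String topField(Map<String, Map<String, Integer>> index, String[] fieldNames, String word) {
        String topField = fieldNames[0];
        int best = 0;
        for (String fieldName : fieldNames) {
            int freq = docFreq(index, fieldName, word);
            if (freq > best) {
                best = freq;
                topField = fieldName;
            }
        }
        return topField;
    }

    // Toy index with made-up numbers: field -> term -> document frequency.
    static Map<String, Map<String, Integer>> toyIndex() {
        Map<String, Integer> description = new HashMap<>();
        description.put("pool", 120);
        Map<String, Integer> facilities = new HashMap<>();
        facilities.put("pool", 15);
        Map<String, Map<String, Integer>> index = new HashMap<>();
        index.put("description", description);
        index.put("facilities", facilities);
        return index;
    }

    public static void main(String[] args) {
        String[] fields = {"description", "facilities"};
        // Even if "pool" was extracted from the facilities field of the input
        // document, the term ends up attached to description.
        System.out.println(topField(toyIndex(), fields, "pool")); // prints "description"
    }
}
```

In this toy setup the term is always attached to description, whichever field it was extracted from, simply because description has the higher document frequency for it.
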
>>>> > This way we lose a lot of precision.
>>>> > I am not sure why we do that.
>>>> > I would prefer to keep the relation between terms and fields;
>>>> > that could improve the quality of the MLT query a lot.
>>>> > For example, if I run MLT on 2 fields, *description* and *facilities*,
>>>> > I most likely want to find documents with similar terms in the
>>>> > description and similar terms in the facilities, without mixing
>>>> > things up and losing the semantics of the terms.
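
A rough sketch of what keeping the field -> term relation could look like, assuming the terms are counted per field instead of in one shared map. This is not the patch attached to the Jira issue; the class and method names (PerFieldTermsSketch, perFieldTermFreqs, clauses) and the sample tokens are hypothetical, and the "field:term" strings only stand in for real term queries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PerFieldTermsSketch {

    // Count term frequencies separately for each field, so every term stays
    // attached to the field it was extracted from.
    static Map<String, Map<String, Integer>> perFieldTermFreqs(Map<String, List<String>> tokensByField) {
        Map<String, Map<String, Integer>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : tokensByField.entrySet()) {
            Map<String, Integer> freqs = new LinkedHashMap<>();
            for (String token : entry.getValue()) {
                freqs.merge(token, 1, Integer::sum);
            }
            result.put(entry.getKey(), freqs);
        }
        return result;
    }

    // Flatten into "field:term" clauses. A real implementation would score
    // each (field, term) pair and build term queries instead of strings.
    static List<String> clauses(Map<String, Map<String, Integer>> freqsByField) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Map<String, Integer>> entry : freqsByField.entrySet()) {
            for (String term : entry.getValue().keySet()) {
                out.add(entry.getKey() + ":" + term);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        doc.put("description", Arrays.asList("quiet", "hotel", "quiet"));
        doc.put("facilities", Arrays.asList("pool", "spa"));
        System.out.println(clauses(perFieldTermFreqs(doc)));
        // prints [description:quiet, description:hotel, facilities:pool, facilities:spa]
    }
}
```

With this shape a term from facilities can only produce a clause on facilities, which is the behaviour argued for above.
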
>>>> >
>>>> > Let me know your opinion,
>>>> >
>>>> > Cheers
>>>> >
>>>> >
>>>> > --
>>>> > --------------------------
>>>> >
>>>> > Benedetti Alessandro
>>>> > Visiting card : http://about.me/alessandro_benedetti
>>>> >
>>>> > "Tyger, tyger burning bright
>>>> > In the forests of the night,
>>>> > What immortal hand or eye
>>>> > Could frame thy fearful symmetry?"
>>>> >
>>>> > William Blake - Songs of Experience -1794 England
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> Anshum Gupta
>>>>



-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
