Thank you Christine! I am still in the data gathering / model building phase 
and I have not yet re-ranked my results so that makes sense. It sounds like 
when I add re-ranking, the caching will start working. Thanks!

--Brian

-----Original Message-----
From: Christine Poerschke (BLOOMBERG/ LONDON) [mailto:cpoersc...@bloomberg.net] 
Sent: Tuesday, October 31, 2017 8:48 AM
To: solr-user@lucene.apache.org
Subject: RE: LTR feature extraction performance issues

Hi Brian,

I just tried to explore the scenario you describe with the techproducts example 
and am able to see what you see:

# step 1: start solr with techproducts example and ltr enabled
# step 2: upload one feature (originalScore) and one model using that feature
# step 3: examine cache stats via the Admin UI (all zero to start with)
# step 4: run a query which includes feature extraction e.g. [features] in fl
# step 5: examine cache stats to see lookups but no inserts
# step 6: run a query with feature extraction _and_ re-ranking using the model
# step 7: examine cache stats to see both lookups and inserts
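
For anyone wanting to repeat steps 4 and 6, the two requests could be built roughly as below. This is only a sketch: the host, port and collection assume a default local techproducts example, and the model name "myModel" is a placeholder for whatever model was uploaded in step 2. It only constructs the URLs and does not contact Solr.

```python
from urllib.parse import urlencode

# Assumed location of a local techproducts example; adjust to your setup.
SOLR_SELECT = "http://localhost:8983/solr/techproducts/select"


def extraction_only_url(q):
    """Step 4: feature extraction via the [features] transformer, no re-ranking."""
    params = {"q": q, "fl": "id,score,[features]"}
    return SOLR_SELECT + "?" + urlencode(params)


def extraction_with_rerank_url(q, model="myModel", rerank_docs=100):
    """Step 6: feature extraction plus re-ranking with an LTR model.

    "myModel" is a hypothetical model name, not one taken from this thread.
    """
    params = {
        "q": q,
        "rq": "{!ltr model=%s reRankDocs=%d}" % (model, rerank_docs),
        "fl": "id,score,[features]",
    }
    return SOLR_SELECT + "?" + urlencode(params)


if __name__ == "__main__":
    print(extraction_only_url("test"))
    print(extraction_with_rerank_url("test"))
```

Comparing the cache stats in the Admin UI after each request should show the difference described in steps 5 and 7.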

Looking around the code, the cache insert happens in FeatureLogger.java [1], 
which is called by the Rescorer [2]. This allows the 'fl' feature logging to 
reuse the feature values calculated as part of the 'rq' re-ranking.

However, if there was no feature value in the cache (because no 'rq' re-ranking 
happened) then the feature value is calculated by 
LTRFeatureLoggerTransformerFactory.java [3] and based on code inspection the 
results of that calculation are not added to the cache.
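
To make the behaviour concrete, here is a toy model of the two code paths as I understand them from reading the code. It is not the Solr implementation: CountingCache, rerank and log_features are invented names, and the feature computation is a stand-in. It only mirrors the observation that the 'rq' path inserts while the 'fl'-only path looks up and then recomputes without inserting.

```python
class CountingCache:
    """Minimal cache that counts lookups, hits and inserts like the Admin UI stats."""

    def __init__(self):
        self.store = {}
        self.lookups = 0
        self.hits = 0
        self.inserts = 0

    def get(self, key):
        self.lookups += 1
        if key in self.store:
            self.hits += 1
            return self.store[key]
        return None

    def put(self, key, value):
        self.inserts += 1
        self.store[key] = value


def compute_features(doc_id):
    # Stand-in for the real (expensive) feature calculation.
    return [float(doc_id)]


def rerank(cache, doc_ids):
    """'rq' path: features are calculated and inserted into the cache."""
    for doc_id in doc_ids:
        cache.put(doc_id, compute_features(doc_id))


def log_features(cache, doc_ids):
    """'fl' path: look up first; on a miss, recompute but do NOT insert."""
    logged = {}
    for doc_id in doc_ids:
        vector = cache.get(doc_id)
        if vector is None:
            vector = compute_features(doc_id)
        logged[doc_id] = vector
    return logged
```

Calling log_features alone only ever increases the lookup count, which matches the stats Brian posted; calling rerank first produces inserts and then hits.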

It might be interesting to explore if/how that logic [3] could be changed.

--Christine

[1] 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.1.0/solr/contrib/ltr/src/java/org/apache/solr/ltr/FeatureLogger.java#L51-L60
[2] 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.1.0/solr/contrib/ltr/src/java/org/apache/solr/ltr/LTRRescorer.java#L185-L205
[3] 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.1.0/solr/contrib/ltr/src/java/org/apache/solr/ltr/response/transform/LTRFeatureLoggerTransformerFactory.java#L267-L280

----- Original Message -----
From: solr-user@lucene.apache.org
To: solr-user@lucene.apache.org
At: 10/30/17 16:55:14

I'm still having this issue. Does anyone have LTR feature extraction 
successfully running and have cache inserts/hits?

--Brian

-----Original Message-----
From: Brian Yee [mailto:b...@wayfair.com]
Sent: Tuesday, October 24, 2017 12:14 PM
To: solr-user@lucene.apache.org
Subject: RE: LTR feature extraction performance issues

Hi Alessandro,

Unfortunately some of my most important features are query dependent. I think I 
found an issue though. I don't think my features are being inserted into the 
cache. Notice "cumulative_inserts:0". There are a lot of lookups, but since 
there appear to be no values in the cache, the hitratio is 0.

stats:
cumulative_evictions:0
cumulative_hitratio:0
cumulative_hits:0
cumulative_inserts:0
cumulative_lookups:215319
evictions:0
hitratio:0
hits:0
inserts:0
lookups:3303
size:0
warmupTime:0


My configs are as follows:

<cache name="QUERY_DOC_FV" class="solr.search.LRUCache" size="4096" 
initialSize="2048" autowarmCount="4096" 
regenerator="solr.search.NoOpRegenerator" />

  <queryParser name="ltr" class="org.apache.solr.ltr.search.LTRQParserPlugin"/>

  <transformer name="features" 
class="org.apache.solr.ltr.response.transform.LTRFeatureLoggerTransformerFactory">
    <str name="fvCacheName">QUERY_DOC_FV</str>
    <str name="defaultFormat">sparse</str>
  </transformer>

Would anyone have any idea why my features are not being inserted into the 
cache? Is there an additional config setting I need?


--Brian

-----Original Message-----
From: alessandro.benedetti [mailto:a.benede...@sease.io]
Sent: Monday, October 23, 2017 10:01 AM
To: solr-user@lucene.apache.org
Subject: Re: LTR feature extraction performance issues

It strictly depends on the kind of features you are using.
At the moment there is just one cache for all the features.
This means that even if you have 1 query-dependent feature and 100 
document-dependent features, a different value for the query-dependent one 
will invalidate the cache entry for the full vector [1].

You may want to look at optimising your features (where possible).

[1]  https://issues.apache.org/jira/browse/SOLR-10448
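
A simplified illustration of the point above, not the actual Solr key scheme: assume the whole feature vector is cached under one key that combines the document id with the query-side inputs. The function names and the "user_query" input are made up for the example.

```python
def cache_key(doc_id, query_inputs):
    # Hypothetical key: one entry per (document, query-side inputs) pair.
    return (doc_id, tuple(sorted(query_inputs.items())))


def feature_vector(cache, doc_id, query_inputs):
    """Return (vector, was_cache_hit) for a toy vector with 1 query-dependent
    feature followed by 3 document-dependent features."""
    key = cache_key(doc_id, query_inputs)
    if key in cache:
        return cache[key], True
    query_feature = float(len(query_inputs.get("user_query", "")))
    doc_features = [float(doc_id)] * 3
    vector = [query_feature] + doc_features
    cache[key] = vector
    return vector, False
```

With this scheme, asking for the same document with a different "user_query" is a miss and recomputes all four values, even though the three document-dependent ones are unchanged.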



-----
---------------
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
