Hi All - One of the really neat features of Solr 6 is the ability to
create machine learning models (information gain) and then use those
models as a query. If I want a user to be able to execute a query for
the text "Hawaii" and use a machine learning model related to weather
data, how can I correctly rank the results? It looks like I would need
to classify all the matching documents in some date range (assuming the
query is date restricted), sort by probability_d, and pick the top n
documents. Is there a better way to do this?
I'm using a stream like this:
classify(model(models, id="WeatherModel", cacheMillis=5000),
         search(COL1,
                df="FULL_DOCUMENT",
                q="Hawaii AND DocTimestamp:[2017-07-23T04:00:00Z TO 2017-08-23T03:59:00Z]",
                fl="ClusterText,id",
                sort="id asc",
                rows="10000"),
         field="ClusterText")
This sends the search to all the shards, each of which can return at
most 10,000 docs.
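
To make the "pick the top n" step concrete, here is a minimal
client-side sketch in Python. It assumes the tuples from the classify
stream have already been read into a list of dicts that carry a
probability_d field. The function name and the sample tuples are made
up for illustration and are not part of Solr.

```python
import heapq


def top_n_by_probability(tuples, n):
    """Return the n tuples with the highest probability_d, best first."""
    # Skip tuples that carry no score (for example an end-of-stream marker).
    scored = (t for t in tuples if "probability_d" in t)
    return heapq.nlargest(n, scored, key=lambda t: t["probability_d"])


# Hypothetical classified tuples; real ones would come from the stream.
sample = [
    {"id": "doc1", "probability_d": 0.12},
    {"id": "doc2", "probability_d": 0.97},
    {"id": "doc3", "probability_d": 0.55},
    {"EOF": True},
]

print([t["id"] for t in top_n_by_probability(sample, 2)])
# prints ['doc2', 'doc3']
```

This still requires every matching document to be classified first,
which is the cost I would like to avoid if there is a better way.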
Thanks!
-Joe