I'm not sure Solr is the right tool for this task. You probably need a machine learning library such as Mahout or Weka.
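To make the computation in question concrete, here is a minimal sketch in plain Python of the dot product and cosine similarity between two dense feature vectors, done outside Solr. The function names (`dot`, `cosine_similarity`, `rank`) and the in-memory dictionary of documents are assumptions for illustration only. This is not how Solr or Lucene scores documents, and it does not use the Mahout or Weka APIs.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length dense vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product normalised by both vector lengths; 0.0 if either is all zeros."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot(a, b) / (norm_a * norm_b)


def rank(query, documents):
    """Score every stored vector against the query, best match first.

    `documents` is a hypothetical {doc_id: vector} mapping standing in
    for whatever store actually holds the feature vectors.
    """
    scored = [(doc_id, cosine_similarity(query, vec))
              for doc_id, vec in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With 100 dense features per document this is a linear scan over all stored vectors, which is fine for small collections but would need an index or a dedicated library at larger scale.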
PS: Lucene doesn't really use cosine similarity; it uses a practical TF-IDF similarity.

Nicholas Ding

On Wed, Nov 26, 2014 at 3:05 PM, Upayavira <u...@odoko.co.uk> wrote:
> Hi,
>
> I've been asked how to use Solr as a component in a machine learning
> system, doing document comparison based upon feature vectors.
>
> If I have two vectors, one in the index (in some form) and one in the
> query (in some form), how can I do, for example, a vector multiplication
> of the two vectors in order to calculate a score?
>
> The feature space I am being given has 100 features, with numerical
> scores for each feature. In this case, it is not sparse - most features
> will have a value.
>
> I have ideas, but it seems they get me some of the way, but not all.
>
> Has anyone worked with Solr in this way?
>
> Thanks,
>
> Upayavira