Have you taken a look at Solr's TermVector component? It's probably what you want:
http://wiki.apache.org/solr/TermVectorComponent

didier

On Tue, Jun 15, 2010 at 8:38 AM, sarfaraz masood <sarfarazmasood2...@yahoo.com> wrote:

> I am Sarfaraz, working on a search engine project based on Nutch & Solr.
> I am trying to implement a new search algorithm for this engine.
>
> Our search engine crawls the web and stores the documents as large
> strings in the database, indexed by their URLs.
>
> To implement my algorithm I need tf-idf values (0 - 1) for each document
> given by the crawler, but I am unable to find any method in Solr or
> Lucene that serves my purpose.
>
> For my algorithm I need to maintain a relevance matrix of the following
> type:
>
> e.g.
>         term1   term2   term3   term4   ...
> url1    0.7     0.8     0.3     0.1
> url2    0.4     0.1     0.4     0.5
> url3
> .
> .
> .
>
> For this purpose I need a core Java method/function in Solr that returns
> the tf-idf values for all terms in all documents for the available
> document list.
>
> Please help.
>
> I will be highly grateful to you all.
>
> -Sarfaraz Masood
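
If it turns out to be easier to build the url-by-term matrix directly from the stored document strings, here is a minimal sketch in plain Java. It is only an illustration: `TfIdfMatrix` and `build` are made-up names, not part of Solr or Lucene, the tokenizer is a naive split on non-word characters, and the weighting is the textbook tf * ln(N/df) divided by the largest value so that everything falls in 0 - 1. That is an assumption about what "0 - 1" means in the quoted message, and it is not the formula Lucene uses for scoring.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper for illustration; not a Solr or Lucene class.
class TfIdfMatrix {
    final List<String> urls;   // row labels, in input order
    final List<String> terms;  // column labels, sorted alphabetically
    final double[][] weights;  // weights[row][col], scaled into [0, 1]

    TfIdfMatrix(List<String> urls, List<String> terms, double[][] weights) {
        this.urls = urls;
        this.terms = terms;
        this.weights = weights;
    }

    static TfIdfMatrix build(Map<String, String> docsByUrl) {
        List<String> urls = new ArrayList<>(docsByUrl.keySet());
        List<Map<String, Integer>> counts = new ArrayList<>();
        List<Integer> lengths = new ArrayList<>();
        Map<String, Integer> docFreq = new HashMap<>();

        // Pass 1: per-document term counts and corpus-wide document frequencies.
        for (String url : urls) {
            Map<String, Integer> c = new HashMap<>();
            int len = 0;
            for (String tok : docsByUrl.get(url).toLowerCase().split("\\W+")) {
                if (tok.isEmpty()) continue;
                c.merge(tok, 1, Integer::sum);
                len++;
            }
            for (String t : c.keySet()) docFreq.merge(t, 1, Integer::sum);
            counts.add(c);
            lengths.add(len);
        }

        // Pass 2: tf = count / document length, idf = ln(N / df).
        List<String> terms = new ArrayList<>(new TreeSet<>(docFreq.keySet()));
        int n = urls.size();
        double[][] w = new double[n][terms.size()];
        double max = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < terms.size(); j++) {
                Integer count = counts.get(i).get(terms.get(j));
                if (count == null) continue; // term absent: weight stays 0
                double tf = (double) count / lengths.get(i);
                double idf = Math.log((double) n / docFreq.get(terms.get(j)));
                w[i][j] = tf * idf;
                max = Math.max(max, w[i][j]);
            }
        }

        // Scale by the largest weight so every entry lies in [0, 1].
        if (max > 0.0) {
            for (double[] row : w) {
                for (int j = 0; j < row.length; j++) row[j] /= max;
            }
        }
        return new TfIdfMatrix(urls, terms, w);
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("url1", "apple apple banana");
        docs.put("url2", "banana cherry");
        TfIdfMatrix m = build(docs);
        System.out.println(m.terms);
        for (int i = 0; i < m.urls.size(); i++) {
            System.out.println(m.urls.get(i) + " " + java.util.Arrays.toString(m.weights[i]));
        }
    }
}
```

Note that with this weighting a term that occurs in every document gets weight 0, and the whole matrix is held in memory, so for a real crawl you would probably want a sparse representation instead of a `double[][]`.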