Have you taken a look at Solr's TermVector component? It's probably
what you want:

http://wiki.apache.org/solr/TermVectorComponent
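
As a rough illustration, here is a minimal sketch of how a request to that component could be assembled. It assumes a request handler named "tvrh" wired to the component (as in the example solrconfig.xml), a field indexed with termVectors="true", and a placeholder base URL; the class and method names are made up for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class TermVectorRequest {

    // Builds a URL that asks the TermVectorComponent for tf, df and tf-idf.
    // baseUrl is a placeholder for your own Solr select endpoint, and the
    // "tvrh" handler name is an assumption taken from the example config.
    public static String buildUrl(String baseUrl, String query) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", query);
        params.put("qt", "tvrh");        // handler that includes the component
        params.put("tv", "true");        // switch the component on
        params.put("tv.tf", "true");     // term frequency per document
        params.put("tv.df", "true");     // document frequency per term
        params.put("tv.tf_idf", "true"); // needs tv.tf and tv.df enabled

        StringBuilder sb = new StringBuilder(baseUrl).append('?');
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
            first = false;
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a compliant JVM
            throw new IllegalStateException(ex);
        }
    }
}
```

Calling TermVectorRequest.buildUrl("http://localhost:8983/solr/select", "*:*") gives a URL you can fetch with any HTTP client. Note that the tf-idf value the component reports is not guaranteed to fall between 0 and 1, so you may still need to normalize it yourself.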

didier

On Tue, Jun 15, 2010 at 8:38 AM, sarfaraz masood
<sarfarazmasood2...@yahoo.com> wrote:
> I am Sarfaraz, working on a Search Engine
> project which is based on Nutch & Solr. I am trying to implement a
> new Search Algorithm for this engine.
>
> Our search engine is crawling the web and storing the documents in the
> form of large strings in the database, indexed by their URLs.
>
> Now, to implement my algorithm I need tf-idf values (0 to 1) for each
> document given by the crawler, but I am unable to find any method in
> Solr or Lucene that serves my purpose.
>
> For my algorithm I need to maintain a relevance matrix of the following type:
>
> e.g.
>         term1   term2   term3   term4  ...
> url1    0.7     0.8     0.3     0.1
> url2    0.4     0.1     0.4     0.5
> url3
> ...
>
> For this purpose I need a core Java method/function in Solr that
> returns the tf-idf values for all terms in all documents in the
> available document list.
>
> Please help.
>
> I will be highly grateful to you all.
>
> -Sarfaraz Masood
>
>
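
For the url-by-term matrix described in the quoted message, the following is a minimal sketch of how such a table could be computed directly from the stored document strings. It is not Lucene's or Solr's scoring formula: it assumes raw term counts for tf, ln(N/df) for idf, a naive split on non-word characters as the tokenizer, and division by the largest weight to bring values into the 0 to 1 range. The class name TfIdfMatrix is made up for this example.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class TfIdfMatrix {

    // Returns url -> (term -> weight), with weights scaled into [0, 1].
    public static Map<String, Map<String, Double>> build(Map<String, String> docsByUrl) {
        Map<String, Map<String, Integer>> counts = new LinkedHashMap<>();
        Map<String, Integer> docFreq = new HashMap<>();

        // Pass 1: count terms per document and documents per term.
        for (Map.Entry<String, String> doc : docsByUrl.entrySet()) {
            Map<String, Integer> tf = new HashMap<>();
            for (String token : doc.getValue().toLowerCase().split("\\W+")) {
                if (!token.isEmpty()) {
                    tf.merge(token, 1, Integer::sum);
                }
            }
            for (String term : tf.keySet()) {
                docFreq.merge(term, 1, Integer::sum);
            }
            counts.put(doc.getKey(), tf);
        }

        // Pass 2: weight = tf * ln(N / df), tracking the largest weight seen.
        int n = docsByUrl.size();
        double max = 0.0;
        Map<String, Map<String, Double>> matrix = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, Integer>> doc : counts.entrySet()) {
            Map<String, Double> row = new TreeMap<>();
            for (Map.Entry<String, Integer> t : doc.getValue().entrySet()) {
                double w = t.getValue() * Math.log((double) n / docFreq.get(t.getKey()));
                row.put(t.getKey(), w);
                max = Math.max(max, w);
            }
            matrix.put(doc.getKey(), row);
        }

        // Pass 3: scale every weight by the maximum so values lie in [0, 1].
        if (max > 0.0) {
            final double scale = max;
            for (Map<String, Double> row : matrix.values()) {
                row.replaceAll((term, w) -> w / scale);
            }
        }
        return matrix;
    }
}
```

A term that appears in every document gets weight 0 under this scheme, and terms missing from a document are simply absent from that document's row. For a large crawl you would want to read the counts from the index rather than re-tokenizing the stored strings.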
