Re: How to retrieve the full corpus

2010-09-08 Thread Lance Norskog
If you want to do a mass scan of an index, the most scalable way is to write a variation of the Lucene CheckIndex program. Unfortunately, CheckIndex does not know any of the Solr types. But first, try the above techniques, because they are much, much easier.
On Mon, Sep 6, 2010 at 7:59 AM

Re: How to retrieve the full corpus

2010-09-06 Thread Markus Jelsma
You can use Luke to inspect a Lucene index. Check the schema browser in your Solr admin interface for an example.
On Monday 06 September 2010 16:52:03 Roland Villemoes wrote:
> Hi All,
>
> How can I retrieve all words from a Solr core?
> I need a list of all the words and how often they occur in

Re: How to retrieve the full corpus

2010-09-06 Thread Andrzej Bialecki
On 2010-09-06 17:15, Yonik Seeley wrote: On Mon, Sep 6, 2010 at 10:52 AM, Roland Villemoes wrote: How can I retrieve all words from a Solr core? I need a list of all the words and how often they occur in the index. http://wiki.apache.org/solr/TermsComponent It doesn't currently stream thoug

Re: How to retrieve the full corpus

2010-09-06 Thread Yonik Seeley
On Mon, Sep 6, 2010 at 10:52 AM, Roland Villemoes wrote:
> How can I retrieve all words from a Solr core?
> I need a list of all the words and how often they occur in the index.
http://wiki.apache.org/solr/TermsComponent
It doesn't currently stream though, so requesting *all* at once might take
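Since the TermsComponent does not stream, one way around a single huge request may be to page through the terms in index order, restarting each request just after the last term seen. The sketch below is only an illustration, not something posted in this thread. It assumes a `/terms` request handler is configured, that the JSON response lists terms as a flat `[term, count, term, count, ...]` array under `terms` and then the field name, and that the URL and the field name `text` are placeholders. `terms_url`, `http_fetch` and `iter_terms` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def terms_url(base_url, field, lower=None, limit=1000):
    """Build one TermsComponent request URL (base_url is an assumed /terms handler)."""
    params = {
        "terms": "true",
        "terms.fl": field,
        "terms.limit": limit,
        "terms.sort": "index",  # index order, so paging by terms.lower is stable
        "wt": "json",
    }
    if lower is not None:
        # Resume strictly after the last term returned by the previous page.
        params["terms.lower"] = lower
        params["terms.lower.incl"] = "false"
    return base_url + "?" + urlencode(params)


def http_fetch(url):
    """Fetch a URL and decode the JSON body."""
    with urlopen(url) as resp:
        return json.load(resp)


def iter_terms(base_url, field, limit=1000, fetch=http_fetch):
    """Yield (term, count) pairs page by page instead of in one request."""
    lower = None
    while True:
        data = fetch(terms_url(base_url, field, lower, limit))
        # Assumed layout: {"terms": {field: [term, count, term, count, ...]}}
        flat = data["terms"][field]
        pairs = list(zip(flat[0::2], flat[1::2]))
        if not pairs:
            return
        yield from pairs
        if len(pairs) < limit:
            return  # a short page means there are no more terms
        lower = pairs[-1][0]
```

A call such as `for term, count in iter_terms("http://localhost:8983/solr/terms", "text"): print(term, count)` would then walk the whole field in chunks of `limit` terms. The `fetch` parameter is there only so the paging logic can be tried without a running Solr instance.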

Re: How to retrieve the full corpus

2010-09-06 Thread mike anderson
You might check out Luke, the Lucene Index Toolbox. http://www.getopt.org/luke/ I know you can browse the index and get frequency counts, though I'm not sure if you can export the entire index as a list like what you're looking for. Hope this helps, Mike On Mon, Sep 6, 2010 at 10:52 AM, Roland

How to retrieve the full corpus

2010-09-06 Thread Roland Villemoes
Hi All,
How can I retrieve all words from a Solr core? I need a list of all the words and how often they occur in the index.
Best regards,
Roland Villemoes
Tel: (+45) 22 69 59 62
E-Mail: mailto:r...@alpha-solutions.dk
Alpha Solutions A/S
Borgergade 2, 3.sal, 1300 København K
Te