Re: Large Data Set Suggestions

Noble Paul നോബിള്‍ नोब्ळ् Wed, 05 Nov 2008 19:35:51 -0800

DIH is likely to be faster than SolrJ because it does not have the
overhead of an HTTP request.
What is your data source? I am assuming it is XML. SolrJ cannot
directly index XML; you may need to read the docs from the XML before
SolrJ can index them.
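
To illustrate the "read docs from XML first" step, here is a minimal sketch
using the JDK's StAX parser. It assumes the input looks like Solr's
<add><doc><field name="..."> update format, which may not match your data.
The class name XmlDocReader and the method readDocs are made up for this
example. Each resulting map would still have to be copied into a SolrJ
input document and sent to the server, which is not shown here.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: streams <doc> elements out of an XML source.
class XmlDocReader {

    // Assumes the layout <add><doc><field name="x">value</field>...</doc>...</add>.
    // A repeated field name overwrites the earlier value (no multi-valued support).
    static List<Map<String, String>> readDocs(Reader in) throws XMLStreamException {
        List<Map<String, String>> docs = new ArrayList<>();
        XMLStreamReader xml = XMLInputFactory.newInstance().createXMLStreamReader(in);
        Map<String, String> current = null;
        while (xml.hasNext()) {
            int event = xml.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                String tag = xml.getLocalName();
                if (tag.equals("doc")) {
                    current = new LinkedHashMap<>();
                } else if (tag.equals("field") && current != null) {
                    String name = xml.getAttributeValue(null, "name");
                    // getElementText() consumes the text up to the closing </field>.
                    current.put(name, xml.getElementText());
                }
            } else if (event == XMLStreamConstants.END_ELEMENT
                    && xml.getLocalName().equals("doc")) {
                docs.add(current);
                current = null;
            }
        }
        xml.close();
        return docs;
    }

    public static void main(String[] args) throws XMLStreamException {
        String sample = "<add><doc><field name=\"id\">1</field>"
                + "<field name=\"title\">Hello</field></doc></add>";
        for (Map<String, String> doc : readDocs(new StringReader(sample))) {
            System.out.println(doc);
        }
    }
}
```

For a data set in the 10M-60M range you would not want to hold every doc in
one list; handing the docs off in batches as they are parsed would keep
memory bounded.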




--Noble

On Wed, Nov 5, 2008 at 9:22 PM, Steven Anderson <[EMAIL PROTECTED]> wrote:
> Greetings!
>
> I've been asked to do some indexing performance testing on Solr 1.3
> using large XML document data sets (10M-60M docs) with DIH versus SolrJ.
>
>
> Does anyone have any suggestions where I might find a good data set this
> size?
>
> I saw the wikipedia dump reference in the DIH wiki, but that is only in
> the 7M+ doc range.
>
> Any suggestions would be greatly appreciated.
>
> Thanks,
>
> Steve
>
>
>



-- 
--Noble Paul
