Thanks Hoss. I agree that the way you restated the question is better
for getting results. BTW I think you've tipped me off to exactly what
I needed with this URL: http://bbyopen.com/
Thanks!
- Pulkit
On Fri, Sep 16, 2011 at 4:35 PM, Chris Hostetter wrote:
: Has anyone ever had to create large mock/dummy datasets for test
: environments or for POCs/Demos to convince folks that Solr was the
: wave of the future? Any tips would be greatly appreciated. I suppose
: it sounds a lot like crawling even though it started out as innocent
: DIH usage.
On Thu, 2011-09-15 at 22:54 +0200, Pulkit Singhal wrote:
> Has anyone ever had to create large mock/dummy datasets for test
> environments or for POCs/Demos to convince folks that Solr was the
> wave of the future?
Yes, but I did it badly. The problem is that real data are not random, so
any simple generator falls short. There are real datasets you can use instead:
http://aws.amazon.com/datasets
DBPedia might be the easiest to work with:
http://aws.amazon.com/datasets/2319
Amazon has a lot of these things.
Infochimps.com is a marketplace for free and paid datasets.
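The point that real data are not random shows up directly in term statistics: real text is roughly Zipf-distributed, while a naive uniform generator is flat, so relevance tuning against generated data can be misleading. A small Python sketch of the difference (purely illustrative, not tied to any dataset above):

```python
import random
from collections import Counter

random.seed(42)

# A toy vocabulary; real corpora have many thousands of terms.
vocab = [f"term{i}" for i in range(1, 101)]

# Zipf-like weights: probability of rank r proportional to 1/r.
zipf_weights = [1.0 / r for r in range(1, len(vocab) + 1)]

# Sample 10,000 tokens two ways: uniformly, and Zipf-weighted.
uniform_sample = random.choices(vocab, k=10_000)
zipf_sample = random.choices(vocab, weights=zipf_weights, k=10_000)

uniform_counts = Counter(uniform_sample)
zipf_counts = Counter(zipf_sample)

def top_to_mid_ratio(counts):
    """Frequency of the most common term divided by the 50th most common:
    near 1 for uniform data, much larger for Zipf-like (real-text-like) data."""
    ranked = [c for _, c in counts.most_common()]
    return ranked[0] / ranked[49]

print(top_to_mid_ratio(uniform_counts))  # close to 1
print(top_to_mid_ratio(zipf_counts))     # much larger
```

With flat data every term scores about the same, so ranking differences barely show up; with skewed data you see the head/tail behavior a real index has.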
Lance
On Thu, Sep 15, 2011 at 6:55 PM, Pulkit Singhal wrote:
> Ah missing } doh!
>
> BTW I still welcome any ideas on how to build an e-commerce test base.
Thanks for all the feedback thus far. Now to get a little technical about it :)
I was thinking of building a file with all the Amazon tags that each yield
roughly 5 results, and then running my RSS DIH off of that. I came up with
the following config, but something is amiss.
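For reference, a DIH setup for this kind of tag-file-plus-RSS pipeline generally looks something like the sketch below. The tag file path, the feed URL pattern, and the field names are all assumptions for illustration, not the original config:

```xml
<dataConfig>
  <!-- Two data sources: one to read the local tag file, one for the feeds. -->
  <dataSource name="file" type="FileDataSource" encoding="UTF-8" />
  <dataSource name="web"  type="URLDataSource"  encoding="UTF-8" />
  <document>
    <!-- Outer entity: LineEntityProcessor reads one tag per line from the
         file and exposes it as ${tags.rawLine}. -->
    <entity name="tags"
            processor="LineEntityProcessor"
            url="/path/to/amazon-tags.txt"
            dataSource="file"
            rootEntity="false">
      <!-- Inner entity: fetch the RSS feed for each tag and map one Solr
           document per <item>. -->
      <entity name="feed"
              processor="XPathEntityProcessor"
              dataSource="web"
              url="http://www.amazon.com/rss/tag/${tags.rawLine}/new"
              forEach="/rss/channel/item">
        <field column="title"       xpath="/rss/channel/item/title" />
        <field column="link"        xpath="/rss/channel/item/link" />
        <field column="description" xpath="/rss/channel/item/description" />
      </entity>
    </entity>
  </document>
</dataConfig>
```

A missing `}` in this kind of config would most likely be in a `${...}` variable reference like the one in the inner entity's url.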
Ah missing } doh!
BTW I still welcome any ideas on how to build an e-commerce test base.
It doesn't have to be Amazon; that was just my approach. Anyone?
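One way to build such a test base without crawling anything is to generate synthetic products from small vocabularies; unique IDs come from a counter, so repeated names don't matter. A minimal Python sketch (all field names, brands, and categories here are invented, and the output shape is just Solr-style JSON):

```python
import itertools
import json
import random

random.seed(0)

BRANDS = ["Acme", "Globex", "Initech", "Umbrella", "Soylent"]
ADJECTIVES = ["Compact", "Deluxe", "Wireless", "Ergonomic", "Portable",
              "Rugged", "Slim", "Classic", "Premium", "Eco"]
NOUNS = ["Headphones", "Keyboard", "Blender", "Backpack", "Lamp",
         "Speaker", "Camera", "Monitor", "Kettle", "Chair"]
CATEGORIES = ["electronics", "home", "kitchen", "office", "outdoors"]

def generate_products(n):
    """Yield n dummy products with unique ids as Solr-style field dicts."""
    combos = itertools.cycle(itertools.product(BRANDS, ADJECTIVES, NOUNS))
    for i, (brand, adj, noun) in enumerate(itertools.islice(combos, n)):
        yield {
            "id": f"SKU-{i:07d}",            # counter keeps ids unique
            "name": f"{brand} {adj} {noun}",
            "brand": brand,
            "category": random.choice(CATEGORIES),
            "price": round(random.uniform(5, 500), 2),
            "in_stock": random.random() > 0.1,
        }

# A small batch as JSON, suitable for posting to Solr's update handler.
batch = list(generate_products(1000))
print(json.dumps(batch[0]))
```

Scaling this to a million documents is just `generate_products(1_000_000)`; it stays a generator, so memory use is flat until you batch it up for indexing.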
- Pulkit
On Thu, Sep 15, 2011 at 8:52 PM, Pulkit Singhal wrote:
> Thanks for all the feedback thus far. Now to get a little technical about it
If we want to test with huge amounts of data, we feed in portions of the
internet. The problem is that it takes a lot of bandwidth and computing power
to get to a "reasonable" size. On the positive side, you deal with real text,
so it's easier to tune for relevance.
I think it's easier to create a synthetic dataset.
I've done it using SolrJ and a *lot* of parallel processes feeding dummy
data into the server.
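The same parallel-feeding idea, sketched in Python with a thread pool and a stand-in client. A real run would use SolrJ or pysolr against an actual server; `FakeSolrClient` here is just a thread-safe stub so the shape of the approach is visible:

```python
import concurrent.futures
import itertools
import queue

def make_docs(n):
    """Stand-in for real product data: n trivial unique documents."""
    return [{"id": str(i), "name": f"product {i}"} for i in range(n)]

class FakeSolrClient:
    """Stub for a Solr client; queue.Queue makes it safe to call
    from many worker threads at once."""
    def __init__(self):
        self._q = queue.Queue()

    def add(self, docs):
        for d in docs:
            self._q.put(d)

    def count(self):
        return self._q.qsize()

def batched(iterable, size):
    """Yield lists of up to `size` items from the iterable."""
    it = iter(iterable)
    while True:
        chunk = list(itertools.islice(it, size))
        if not chunk:
            return
        yield chunk

client = FakeSolrClient()
docs = make_docs(10_000)

# Feed batches from many workers in parallel, as the SolrJ approach does.
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(client.add, batched(docs, 500)))

print(client.count())  # 10000
```

Batching matters more than thread count in practice: one add per document round-trips the network for every document, while a few hundred per batch keeps the indexer busy.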
On Thu, Sep 15, 2011 at 4:54 PM, Pulkit Singhal wrote:
> Hello Everyone,
>
> I have a goal of populating Solr with a million unique products in
> order to create a test environment for a proof of concept.
Hello Everyone,
I have a goal of populating Solr with a million unique products in
order to create a test environment for a proof of concept. I started
out by using DIH with Amazon RSS feeds but I've quickly realized that
there's no way I can glean a million products from one RSS feed. And
I'd go