Hey AJ,
For simplicity's sake, I am using Solr as both storage and search for
http://researchwatch.net.
The dataset is 110K NSF grants from 1999 to 2009. The facets are all on
dynamic fields, and I use a catch-all copyField to copy every field into a
default text field. All fields are also stored and are used for the
individual grant view.
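To make that setup concrete, here is a rough sketch of the two kinds of
requests such a site would send to Solr: a faceted search against the
default text field, and a single-document lookup that returns all stored
fields. The endpoint URL and the field names (institution_s, year_i, id)
are assumptions for illustration, not the actual researchwatch.net schema.
Only standard Solr query parameters (q, rows, wt, fl, facet, facet.field,
facet.mincount) are used, and the code only builds URLs; it does not
contact a server.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host, port, and core for a real setup.
SOLR_SELECT = "http://localhost:8983/solr/select"


def search_url(text, facet_fields, rows=10):
    """Build a faceted search URL that queries the default (catch-all) field."""
    params = [
        ("q", text),
        ("rows", rows),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", 1),
    ]
    # One facet.field parameter per facet; names here are hypothetical.
    params += [("facet.field", name) for name in facet_fields]
    return SOLR_SELECT + "?" + urlencode(params)


def grant_url(grant_id):
    """Build a lookup URL that returns every stored field of one document."""
    # Escape backslashes and quotes so the ID is safe inside a quoted phrase.
    safe_id = str(grant_id).replace("\\", "\\\\").replace('"', '\\"')
    params = [
        ("q", 'id:"%s"' % safe_id),  # assumes the unique key field is "id"
        ("fl", "*"),                 # all stored fields, no second DB lookup
        ("rows", 1),
        ("wt", "json"),
    ]
    return SOLR_SELECT + "?" + urlencode(params)


if __name__ == "__main__":
    print(search_url("climate", ["institution_s", "year_i"]))
    print(grant_url("0912345"))
```

The second request is the part that replaces a separate database fetch:
because every field is stored, the grant view can be rendered from the
Solr response alone.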
The performance seems fine for my purposes, though I haven't done any
extensive benchmarking. The site is built with a light Ruby on Rails/rsolr
layer on a small EC2 instance.
Feel free to bang against the site with JMeter if you want to stress
test a sample server to failure. :)
--
Tommy Chheng
Developer & UC Irvine Graduate Student
http://tommy.chheng.com
On 2/3/10 5:41 PM, AJ Asver wrote:
Hi all,
I work on search at Scoopler.com, a real-time search engine which uses Solr.
We currently use Solr for indexing but then fetch data from our CouchDB
cluster using the IDs Solr returns. We are now considering storing a larger
portion of the data in Solr's index itself so we don't have to hit the DB too.
Assuming that we still store the data in the DB (for backend and backup
purposes), are there any significant disadvantages to using Solr as a data
store too?
We currently run a master-slave setup on EC2 using x-large slave instances
to allow for the disk cache to use as much memory as possible. I imagine we
would definitely have to add more slave instances to accommodate the extra
data we're storing (and make sure it stays in memory).
Any tips would be really helpful.
--
AJ Asver
Co-founder, Scoopler.com
+44 (0) 7834 609830 / +1 (415) 670 9152
a...@scoopler.com
Follow me on Twitter: http://www.twitter.com/_aj
Add me on Linkedin: http://www.linkedin.com/in/ajasver
or YouNoodle: http://younoodle.com/people/ajmal_asver
My Blog: http://ajasver.com