Hi. I think I need to be more specific. What I am trying to find out is if I should aim for:
CPU (2x4 cores, 2.0-3.0Ghz)? Fast disk IO: 8 disks, RAID1+0 ? RAM - if the index does not fit into RAM how much RAM should I then buy ? Please any hints would be appreciated since I am going to invest soon. //Marcus On Sat, Jun 27, 2009 at 12:02 AM, Marcus Herou <marcus.he...@tailsweep.com>wrote: > Hi. > > I currently have an index which is 16GB per machine (8 machines = 128GB) > (data is stored externally, not in index) and is growing like crazy (we are > indexing blogs which is crazy by nature) and have only allocated 2GB per > machine to the SOLR app since we are running some other stuff there in > parallell. > > Each doc should be roughly the size of a blog post, no more than 20k. > > We currently have about 90M documents and it is increasing rapidly so > getting into the G+ document range is not going to be too far away. > > Now due to search performance I think I need to move these instances to > dedicated index/search machines (or index on some machines and search on > others). > Anyway I would like to get some feedback about two things: > > 1. What is the most important hardware aspect when it comes to add document > to the index and optimize it. > 1.1 Is it disk I|O write throghput ? (sequential or random-io ?) > 1.2 Is it RAM ? > 1.3 Is is CPU ? > > My guess would be disk-io, right, wrong ? > > 2. What is the most important hardware aspect when it comes to searching > documents in my setup ? (result-set is limited to return only the top 10 > matches with page handling) > We facet and sort on the publishedDate of the entry (memory intensive I > presume) > > 2.1 Is it disk read throughput ? (sequential or random-io ?) > 2.2 Is it RAM ? > 2.3 Is is CPU ? > > I have no clue since the data might not fit into memory. What is then the > most important factor ? read-performance while scanning the index ? CPU > while comparing fields and collecting results ? > > What I'm trying to find out is what I can do to get most bang for the buck > with a limited (aren't we all limited?) budget. > > Kindly > > //Marcus > > > > > > -- > Marcus Herou CTO and co-founder Tailsweep AB > +46702561312 > marcus.he...@tailsweep.com > http://www.tailsweep.com/ > > > > -- > Marcus Herou CTO and co-founder Tailsweep AB > +46702561312 > marcus.he...@tailsweep.com > http://www.tailsweep.com/ > > -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 marcus.he...@tailsweep.com http://www.tailsweep.com/