No one has a suggestion? I must be missing something because, as I
understand it from Dennis' email, all of his queries are very quick
(cached-type response times) whereas mine are not. I can clearly see
time differences between queries that are cached (things that have
been autowarmed) and queries that are not. This seems odd, as my whole
index is loaded on a tmpfs memory-based file system. Thanks for the
help.
Matt
On Dec 4, 2007, at 3:55 PM, Matthew Phillips wrote:
Thanks for the suggestion, Dennis. I decided to implement this as
you described on my collection of about 400,000 documents, but I did
not receive the results I expected.
Prior to putting the indexes on a tmpfs, I did a bit of benchmarking
and found that it usually takes a little under two seconds for each
facet query. After moving my indexes from disk to a tmpfs file
system, I seem to get about the same result from facet queries:
about two seconds.
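For anyone wanting to repeat the before/after comparison, here is a
minimal timing sketch. The helper name mean_ms and the Solr URL in the
usage comment are assumptions for illustration, not taken from my setup;
date +%s%N requires GNU date.

```shell
# Hypothetical helper: run a command N times and print the mean
# wall-clock time per run in whole milliseconds.
mean_ms() {
    runs=$1
    shift
    start=$(date +%s%N)          # nanoseconds since epoch (GNU date)
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" > /dev/null 2>&1    # discard the command's output
        i=$(( i + 1 ))
    done
    end=$(date +%s%N)
    echo $(( (end - start) / runs / 1000000 ))
}

# Example usage (URL and field name are assumed placeholders):
#   mean_ms 10 curl -s 'http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=category'
```

Running it once with the index on disk and once on the tmpfs, with the
same query each time, gives directly comparable numbers.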
Does anyone have any insight into this? Doesn't it seem odd that my
response times are about the same? Thanks for the help.
Matt Phillips
Dennis Kubes wrote:
One way to do this, if you are running on Linux, is to create a
tmpfs (which is RAM) and mount it where your indexes live.
Then your index acts normally to the application but is essentially
served from RAM. This is how we serve the Nutch Lucene indexes on
our web search engine (www.visvo.com), which is ~100M pages. Below
is how you can achieve this, assuming your indexes are in
/path/to/indexes:
mv /path/to/indexes /path/to/indexes.dist
mkdir /path/to/indexes
cd /path/to
mount -t tmpfs -o size=2684354560 none /path/to/indexes
rsync --progress -aptv indexes.dist/* indexes/
chown -R user:group indexes
This would of course be limited by the amount of RAM you have on
the machine. But with this approach most searches are sub-second.
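As a rough sketch of how the size= value could be chosen: the helper
below adds headroom to a byte count, such as the one reported by
du -sb (GNU coreutils). The function name size_with_headroom and the
20% figure are illustrative assumptions, not part of the steps above.

```shell
# Hypothetical helper: print BYTES increased by PERCENT headroom,
# using integer arithmetic.
size_with_headroom() {
    bytes=$1
    percent=$2
    echo $(( bytes + bytes * percent / 100 ))
}

# Example usage (paths are the same placeholders as in the steps above):
#   index_bytes=$(du -sb /path/to/indexes.dist | cut -f1)
#   mount -t tmpfs -o size="$(size_with_headroom "$index_bytes" 20)" none /path/to/indexes
#   df -h /path/to/indexes    # confirm the mount point and its size
```

Leaving some headroom lets the index grow a little without the copy
failing when the tmpfs fills up.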
Dennis Kubes
Evgeniy Strokin wrote:
Hello,...
We have a 110M-record index under Solr. Some queries take a while,
but we need sub-second results. I guess the only solution is caching
(something else?)...
We use the standard LRUCache. The docs say (as far as I understood)
that it loads a view of the index into memory and next time works with
memory instead of the hard drive.
So, my question: hypothetically, we can have the whole index in memory
if we have enough memory, right? In this case the results
should come up very fast. We have very rare updates. So I think
this could be a solution.
How should I configure the cache to achieve such an approach?
Thanks for any advice.
Gene