Currently, I've got a Solr setup in which we're distributing searches
across two cores on one machine, say core1 and core2. I'm toying with
the idea of enabling Solr's HTTP caching on our system, but I noticed an
oddity when using it in combination with distributed searching. Say, for
example, I have this query:

http://localhost:8080/solr/core1/select/?q=google&start=0&rows=10&shards=localhost:8080/solr/core1,localhost:8080/solr/core2
Both cores have HTTP caching enabled, and it seems to be working. The
first time I run the query through Squid, it correctly sees that the
response isn't cached and requests it from Solr. The second time I
request it, it hits the Squid cache. That part works fine.

Here's the problem. If I commit to core1, it changes the ETag value of
the response, which invalidates the cache, as it should. But committing
to core2 doesn't, so I get the cached version back, even though core2
has changed and the cache is stale. I'm guessing this is because the
request goes to core1 and therefore uses core1's cache values, but in a
distributed search, it seems like the ETag should reflect all the cores
in the shards parameter. Is this a known issue, and if so, is there a
patch for it?
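For what it's worth, the behavior I'd expect could be sketched roughly as
below. This is just an illustrative Python sketch, not anything Solr
actually does today (that's exactly the gap): the coordinating core would
fold every shard's ETag into the one it returns, so a commit on any shard
changes the combined value and invalidates the Squid entry. The etag
strings here are made up for the example.

```python
import hashlib

def combined_etag(shard_etags):
    """Derive one ETag from the per-shard ETags.

    Hypothetical sketch: hash the shard ETags together (sorted, so
    shard order in the request doesn't matter). If any shard's ETag
    changes after a commit, the combined ETag changes too, and the
    cached distributed response is correctly invalidated.
    """
    h = hashlib.sha1()
    for etag in sorted(shard_etags):
        h.update(etag.encode("utf-8"))
    return '"%s"' % h.hexdigest()

# A commit on core2 changes its ETag, which changes the combined value:
before = combined_etag(["core1-v5", "core2-v9"])
after = combined_etag(["core1-v5", "core2-v10"])
assert before != after
```

In other words, the ETag for a distributed response would be a function of
all shards' states, not just the state of the core that received the
request.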

 

Thanks,

Charlie
