Hi Eric,
I think I understand what you are saying, but I'm not sure how it would work.
I think you are saying to have two different indexes, each with the
same documents, but one holding the hard-to-get fields and the other
the easy-to-get fields. Then I would run the same query twice, once
against each index.
So, let's say I'm looking for all documents that contain the word "poem"
and I want to initially display the 10 most relevant matches. I
think I'd have to ask each index for its 10 most relevant matches, then
merge them myself, and display the appropriate ones.
Well, the same document could appear in both lists, so I'd have to
get rid of duplicates. Also, wouldn't the relevancy of the duplicate
doc go up? But I wouldn't know by how much.
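Something like this rough sketch is what I'm picturing; the core names
"static" and "dynamic", the localhost URL, and the unique key "id" are
just placeholders for whatever we'd actually use:

import json
from urllib.parse import urlencode
from urllib.request import urlopen

SOLR = "http://localhost:8983/solr"

def top_docs(core, query, rows=10):
    # Ask one core for its most relevant matches, ids and scores only.
    params = urlencode({"q": query, "rows": rows, "fl": "id,score", "wt": "json"})
    with urlopen(f"{SOLR}/{core}/select?{params}") as resp:
        return json.load(resp)["response"]["docs"]

def merged_top(query, rows=10):
    best = {}  # id -> the copy of the doc with the higher score
    for core in ("static", "dynamic"):
        for doc in top_docs(core, query, rows):
            if doc["id"] not in best or doc["score"] > best[doc["id"]]["score"]:
                best[doc["id"]] = doc
    # Keeping the higher of the two scores is an arbitrary choice -- the
    # cores score independently, which is exactly the "by how much" problem.
    return sorted(best.values(), key=lambda d: d["score"], reverse=True)[:rows]

print(merged_top("poem"))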
That's the first problem, but then what if the user wants to see page 2?
I certainly couldn't just query for documents #10-19 on each server.
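The best I can see is over-fetching: for page N I'd have to ask each
core for everything up to the end of that page and then slice the
merged list, which gets more expensive the deeper the user goes.
Continuing the sketch above:

def merged_page(query, page, page_size=10):
    # Either core could contribute anywhere from 0 to page_size docs to
    # a given page, so each has to return everything up to the end of it.
    needed = (page + 1) * page_size
    best = {}
    for core in ("static", "dynamic"):
        for doc in top_docs(core, query, rows=needed):
            if doc["id"] not in best or doc["score"] > best[doc["id"]]["score"]:
                best[doc["id"]] = doc
    ranked = sorted(best.values(), key=lambda d: d["score"], reverse=True)
    return ranked[page * page_size:needed]

print(merged_page("poem", page=1))  # "page 2" as the user sees it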
Eric Pugh wrote:
Right... You know, if some of your data needs to be updated frequently,
but the rest is updated once a year and is a really massive dataset,
then maybe split it up into separate cores? Since you mentioned that
you can't get the raw data again, you could just duplicate your
existing index by doing a filesystem copy. Leave that copy alone so
you don't update it and lose your data, and start a new core that you
can update, ignoring the fact that it has all the website data in it.
And tie the two cores' data sets together outside of Solr.
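Roughly something like this -- the paths and core names here are made
up, just to show the shape of it:

import shutil
from urllib.parse import urlencode
from urllib.request import urlopen

# One-time copy of the existing index so the hard-to-get data can't be
# lost (done while Solr is stopped, so the copy is consistent).
shutil.copytree("/var/solr/main", "/var/solr/archive")

# Register the copy as a second core via the CoreAdmin handler; it is
# never written to again, only queried.
params = urlencode({"action": "CREATE", "name": "archive",
                    "instanceDir": "/var/solr/archive"})
urlopen(f"http://localhost:8983/solr/admin/cores?{params}")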
Eric
On Thu, Aug 27, 2009 at 1:46 PM, Paul Tomblin<ptomb...@xcski.com> wrote:
On Thu, Aug 27, 2009 at 1:27 PM, Eric
Pugh<ep...@opensourceconnections.com> wrote:
You can just query Solr, find the records that you want (including all
the website data), update them, and then send each entire record back.
Correct me if I'm wrong, but I think you'd end up losing the fields
that are indexed but not stored.
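For example, roughly (core and field names are made up, and I'm
assuming single-valued fields):

import json
from urllib.parse import urlencode
from urllib.request import urlopen, Request
from xml.sax.saxutils import escape

SOLR = "http://localhost:8983/solr/main"

# Fetch the record: only *stored* fields come back in the response.
params = urlencode({"q": "id:doc42", "wt": "json"})
doc = json.load(urlopen(f"{SOLR}/select?{params}"))["response"]["docs"][0]

doc["category"] = "poetry"  # update the easy-to-get field

# Re-add the document. Any field that was indexed but not stored was
# never in `doc`, so this overwrite silently throws it away.
xml_doc = "<add><doc>" + "".join(
    f'<field name="{k}">{escape(str(v))}</field>' for k, v in doc.items()
) + "</doc></add>"
req = Request(f"{SOLR}/update?commit=true", data=xml_doc.encode("utf-8"),
              headers={"Content-Type": "text/xml; charset=utf-8"})
urlopen(req)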
--
http://www.linkedin.com/in/paultomblin