Before going too far down that path, let's check something.

I assume you're storing *all* the fields for each document,
right? Because unless you are, you'll lose data if you're reading
the document from Solr and then updating it.

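When you fetch a document from Solr, only the *stored*
fields are returned, so the fetch-then-update cycle is lossy:
any field that is indexed but not stored is missing from the
document you write back.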
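
To make the lossy cycle concrete, here is a toy model in plain Java.
It is not Solr code: the class name, the field names, and the idea
that "email" is indexed but not stored are all assumptions made up
for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model only, not Solr code: a "fetch" returns just the stored fields.
final class LossyRoundTrip {

    // Simulates a fetch: keep only the fields whose names are marked stored.
    static Map<String, String> fetch(Map<String, String> original,
                                     Set<String> storedFields) {
        Map<String, String> fetched = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : original.entrySet()) {
            if (storedFields.contains(e.getKey())) {
                fetched.put(e.getKey(), e.getValue());
            }
        }
        return fetched;
    }

    public static void main(String[] args) {
        Map<String, String> original = new LinkedHashMap<>();
        original.put("username", "jdoe");
        original.put("email", "jdoe@example.com");

        // Hypothetical schema: "email" is indexed but not stored.
        Map<String, String> fetched = fetch(original, Set.of("username"));

        // Writing `fetched` back would replace the document without "email".
        System.out.println(fetched);
    }
}
```

If every field is stored, the fetched map equals the original and
nothing is lost.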

If you have access to all of the original information from
the system-of-record, a reasonable approach is to re-get
all the original data and simply replace the entire document.
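
As a rough sketch of that approach, the whole document can be rebuilt
from the system-of-record, de-duplicating the multi-value field on the
way. This is plain Java with no Solr dependency; the class name, the
field names, and the sample values are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: builds the complete field map for one user from
// system-of-record data, so the document can be replaced in one shot.
final class UserDocBuilder {

    static Map<String, List<String>> build(String username, String email,
                                           List<String> organizations) {
        // LinkedHashSet drops repeated org names and keeps first-seen order.
        Set<String> uniqueOrgs = new LinkedHashSet<>(organizations);

        Map<String, List<String>> doc = new LinkedHashMap<>();
        doc.put("username", List.of(username));
        doc.put("email", List.of(email));
        doc.put("organizations", new ArrayList<>(uniqueOrgs));
        return doc;
    }

    public static void main(String[] args) {
        Map<String, List<String>> doc = build(
                "jdoe", "jdoe@example.com",
                List.of("BigCorp", "SmallCo", "BigCorp"));
        System.out.println(doc.get("organizations")); // prints [BigCorp, SmallCo]
    }
}
```

With SolrJ, each entry would be copied into a SolrInputDocument via
addField and the document added again; a document with the same
uniqueKey replaces the old one, so no query (and no commit in between)
is needed.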

Best
Erick


On Fri, Oct 28, 2011 at 1:22 PM, Thibaut Colar <tco...@colar.net> wrote:
> A related question:
> Is there a way to update a doc to remove a specific value from a multi-value
> field (in my case, to remove a role)?
>
> I managed to do that by querying the doc, reading all the other values
> "manually", and then saving, but that has the same issues and is inefficient.
>
> On 10/28/11 10:04 AM, Thibaut Colar wrote:
>>
>> Sorry for the lengthy text, it's a bit difficult to explain:
>>
>> We are using Solr to index some user info like username, email (among
>> other things).
>>
>> I'm also trying to use facets for search, so for example, I added a
>> multi-value field to user called "organizations" where I store the
>> names of the organizations that user works for.
>>
>> So I can use that field for faceted search and filter a user
>> search query result by the organizations this user works for.
>>
>> So now, the issue I have is that my code does something like:
>>   1) Add user documents to Solr
>>   2) When a user is assigned an organization membership (role), update
>>      the user doc to set the organizations field
>>
>> Now I have the following issue with step 2: If I just do an
>> addField("organizations", "BigCorp") on the user doc, it will add that value
>> regardless of whether organizations already has that value ("BigCorp"), but
>> I want each org name to appear only once.
>>
>> So the only way I found to get that behavior is to query the user document,
>> get the values of "organizations", and only add the new value if it's not
>> already in there:
>> if (!userDoc.getValues("organizations").contains(value))
>> {... add the value to the doc and save it ...}
>>
>> Now that works well, but only if I commit all the time (between steps 1 & 2
>> at least), because the document query will not work unless the document has
>> been committed already. Obviously, in theory it's best not to commit all the
>> time performance-wise, and it's impractical since I process those inserts in
>> batches.
>>
>> *So I guess the main issue would be:*
>>
>>  * Is there a way to update a multi-value field, without allowing
>>    duplicates, that would not require querying the doc to manually
>>    prevent duplicates?
>>
>>  * Maybe some better way to do this?
>>
>> Thanks.
>>
>>
>
>
