Updating a document multi-value field (no dup values) without needing it to be already committed

2011-10-28 Thread Thibaut Colar

Sorry for the lengthy text, it's a bit difficult to explain:

We are using Solr to index some user info like username, email (among 
other things).


I'm also trying to use facets for search, so for example, I added a 
multi-value field called "organizations" to the user doc, where I store 
the names of the organizations that user works for.


So I can use that field for faceted search and filter a user search 
query result by the organizations the user works for.


So now, my code does something like:

1) Add user documents to Solr
2) When a user is assigned an organization membership (role), update 
   the user doc to set the organizations field


Now I have the following issue with step 2: if I just do 
addField("organizations", "BigCorp") on the user doc, it adds that 
value regardless of whether organizations already contains it 
("BigCorp") or not, but I want each org name to appear only once.


So the only way I found to get that behavior is to query the user 
document, get the values of "organizations", and only add the new value 
if it's not already in there: 
if (!userDoc.getValues("organizations").contains(value)) 
{... add the value to the doc and save it ...}
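
The check above can be sketched in plain Java with no Solr calls, 
assuming the current values of the field have already been read into a 
collection. The class and method names are hypothetical, not part of 
SolrJ.

```java
import java.util.Collection;

// Hypothetical helper (not a SolrJ class): works only on values
// that were already read from the user doc.
final class MultiValueAdd {
    // Adds value only if it is absent. Returns true when the collection
    // changed, i.e. when the doc would need to be saved again.
    static boolean addIfAbsent(Collection<Object> values, Object value) {
        if (values.contains(value)) {
            return false;
        }
        values.add(value);
        return true;
    }
}
```

The boolean result is there so the caller can skip the save when nothing 
changed.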


Now that works well, but only if I commit all the time (between steps 1 
and 2 at least), because the document query will not find the doc unless 
it has already been committed. Obviously it's best not to commit all the 
time performance-wise, and it's impractical since I process those 
inserts in batches.
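
One possible workaround, sketched here only as an assumption and not as 
something Solr provides, would be to track the organizations per user on 
the client side while the batch runs, using a set so duplicates are 
dropped, and to build the field values from that set when the doc is 
sent. The class below is hypothetical and only sees values added during 
the batch, not values already in the index.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical batch-local buffer: user id -> distinct organization names.
final class PendingOrganizations {
    private final Map<String, Set<String>> orgsByUser = new LinkedHashMap<>();

    // Records a membership; returns false if it was already recorded.
    boolean add(String userId, String org) {
        return orgsByUser
                .computeIfAbsent(userId, k -> new LinkedHashSet<>())
                .add(org);
    }

    // Distinct values, in insertion order, to put in the multi-value field.
    List<String> valuesFor(String userId) {
        Set<String> orgs = orgsByUser.get(userId);
        return orgs == null ? new ArrayList<>() : new ArrayList<>(orgs);
    }
}
```

With something like this, the values for "organizations" would come from 
valuesFor(userId) when the doc is built, so no query (and no commit) is 
needed inside the batch.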


*So I guess the main issue would be:*

 * Is there a way to update a multi-value field, without allowing
   duplicates, that would not require querying the doc to manually
   prevent duplicates?

 * Maybe some better way to do this?

Thanks.



Re: Updating a document multi-value field (no dup values) without needing it to be already committed

2011-10-28 Thread Thibaut Colar

A related question:
Is there a way to update a doc to remove a specific value from a 
multi-value field (in my case, remove a role)?


I managed to do that by querying the doc, reading all the other values 
"manually", and then saving, but that has the same issues and is 
inefficient.
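
For reference, the manual removal amounts to something like the sketch 
below, in plain Java and assuming the field values were already read 
from the doc. The names are hypothetical, not SolrJ calls.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Objects;

// Hypothetical helper (not a SolrJ class).
final class MultiValueRemove {
    // Returns a new list holding every value except the one to remove;
    // the input collection is left untouched.
    static List<Object> without(Collection<Object> values, Object value) {
        List<Object> kept = new ArrayList<>();
        for (Object v : values) {
            if (!Objects.equals(v, value)) {
                kept.add(v);
            }
        }
        return kept;
    }
}
```

The returned list is what gets written back to the field before saving 
the doc.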

