Since you don't have any "update" attribute specified, you are doing a
simple "add" - which deletes the old document with that key and replaces it
with the data from the "add" document.
Again: it is the presence of the "update" attribute on a field that turns the
<add> into an "update"; otherwise <add> simply replaces any existing document
with that key or adds a new document.
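To illustrate that rule, here is a minimal Python sketch (not Solr's actual
code) that inspects an <add> message and reports, for each <doc>, whether any
field carries an "update" attribute. The function names is_atomic_update and
classify are made up for this example; the two payloads are the ones from
steps 2 and 3 of the original message.

```python
import xml.etree.ElementTree as ET

# Payload from step 2: one field carries update="set".
STEP_2 = """<add><doc>
<field name="id">doc1</field>
<field name="keywords_ss" update="set">B</field>
</doc></add>"""

# Payload from step 3: only the id, no "update" attribute anywhere.
STEP_3 = """<add><doc>
<field name="id">doc1</field>
</doc></add>"""


def is_atomic_update(doc):
    """Return True if any <field> in this <doc> has an "update" attribute."""
    return any("update" in field.attrib for field in doc.iter("field"))


def classify(add_xml):
    """Label each <doc> in an <add> message as 'atomic update' or 'replace'."""
    root = ET.fromstring(add_xml)
    return [
        "atomic update" if is_atomic_update(doc) else "replace"
        for doc in root.iter("doc")
    ]


print(classify(STEP_2))  # ['atomic update']
print(classify(STEP_3))  # ['replace']
```

Under this simplified rule, the step 3 payload is classified as a plain
replace, which matches the observed loss of every field except id.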
-- Jack Krupansky
-----Original Message-----
From: Curtis Beattie
Sent: Friday, April 05, 2013 2:52 PM
To: solr-user@lucene.apache.org
Subject: Solr 4.2 - Unexpected behaviour when updating a document with only
id field specified in the update
I am experiencing some peculiar behavior when updating a document. I'm
curious whether this is "working as intended" or whether it is a
defect. Allow me to articulate the problem using an example (should be
easily reproducible with the "example" configuration data).
The workflow is as follows:
1) Create a document with fields: id, name_s and keywords_ss (works as
expected).
2) Update the document by specifying id and replacing keywords_ss
(works as expected).
3) Update the document by only specifying id (unusual behavior:
document is "wiped")
Step #1 - Create the document
curl "http://localhost:10000/solr/simple-collection/update?commit=true" \
-H "Content-Type: text/xml" -d '
<?xml version="1.0" encoding="UTF-8"?>
<add>
<doc>
<field name="id">doc1</field>
<field name="name_s">Document 1</field>
<field name="keywords_ss">A</field>
</doc>
</add>'
http://localhost:10000/solr/simple-collection_shard1_replica1/select?q=*%3A*&wt=xml&indent=true
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">14</int>
<lst name="params">
<str name="indent">true</str>
<str name="q">*:*</str>
<str name="wt">xml</str>
</lst>
</lst>
<result name="response" numFound="1" start="0" maxScore="1.0">
<doc>
<str name="id">doc1</str>
<str name="name_s">Document 1</str>
<arr name="keywords_ss">
<str>A</str>
</arr>
<long name="_version_">1431502565339561984</long>
</doc>
</result>
</response>
Step #2 - Update the document specifying id & keywords_ss
curl "http://localhost:10000/solr/simple-collection/update?commit=true" \
-H "Content-Type: text/xml" -d '
<?xml version="1.0" encoding="UTF-8"?>
<add>
<doc>
<field name="id">doc1</field>
<field name="keywords_ss" update="set">B</field>
</doc>
</add>'
http://localhost:10000/solr/simple-collection_shard1_replica1/select?q=*%3A*&wt=xml&indent=true
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">13</int>
<lst name="params">
<str name="indent">true</str>
<str name="q">*:*</str>
<str name="wt">xml</str>
</lst>
</lst>
<result name="response" numFound="1" start="0" maxScore="1.0">
<doc>
<str name="id">doc1</str>
<str name="name_s">Document 1</str>
<arr name="keywords_ss">
<str>B</str>
</arr>
<long name="_version_">1431502700990693376</long>
</doc>
</result>
</response>
Step #3 - Update the document specifying only 'id'
curl "http://localhost:10000/solr/simple-collection/update?commit=true" \
-H "Content-Type: text/xml" -d '
<?xml version="1.0" encoding="UTF-8"?>
<add>
<doc>
<field name="id">doc1</field>
</doc>
</add>'
http://localhost:10000/solr/simple-collection_shard1_replica1/select?q=*%3A*&wt=xml&indent=true
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">14</int>
<lst name="params">
<str name="indent">true</str>
<str name="q">*:*</str>
<str name="wt">xml</str>
</lst>
</lst>
<result name="response" numFound="1" start="0" maxScore="1.0">
<doc>
<str name="id">doc1</str>
<long name="_version_">1431502818264481792</long>
</doc>
</result>
</response>
---
Now I realize that "updating" a document while specifying only the 'id'
is pointless, but in that circumstance Solr seems to delete the 'name_s'
field. In fact, all fields except 'id' are lost. The unusual behaviour,
in my view, is that Solr performs an update when at least one field
(other than 'id') is specified, but when only 'id' is specified it seems
to delete and re-add the document without preserving the existing data.
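The three results above are consistent with a simple model in which an
<add> is merged into the stored document only when at least one field
carries an "update" attribute, and otherwise replaces the stored
document outright. The following is a minimal Python sketch of that
model, not Solr's implementation: the dict-based index and the
apply_add function are hypothetical, only the "set" operation is
handled, fields without an "update" attribute inside an update document
are assumed to overwrite the stored value, and _version_ is ignored.

```python
import xml.etree.ElementTree as ET

STEP_1 = """<add><doc>
<field name="id">doc1</field>
<field name="name_s">Document 1</field>
<field name="keywords_ss">A</field>
</doc></add>"""

STEP_2 = """<add><doc>
<field name="id">doc1</field>
<field name="keywords_ss" update="set">B</field>
</doc></add>"""

STEP_3 = """<add><doc>
<field name="id">doc1</field>
</doc></add>"""


def apply_add(index, add_xml):
    """Apply one <add> message to index, a dict of id -> {field: [values]}."""
    for doc in ET.fromstring(add_xml).iter("doc"):
        fields = list(doc.iter("field"))
        # The document counts as an update if any field has update="...".
        atomic = any("update" in f.attrib for f in fields)

        incoming = {}
        for f in fields:
            op = f.get("update")
            if op not in (None, "set"):
                raise ValueError("only 'set' is modeled in this sketch")
            incoming.setdefault(f.get("name"), []).append(f.text)

        doc_id = incoming["id"][0]
        # Update: start from the stored fields. Plain add: start empty.
        merged = dict(index.get(doc_id, {})) if atomic else {}
        merged.update(incoming)
        index[doc_id] = merged


index = {}
for step in (STEP_1, STEP_2, STEP_3):
    apply_add(index, step)
    print(index["doc1"])
```

In this model the third call starts from an empty document because no
field has an "update" attribute, so only 'id' survives.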
Can someone please comment on this behaviour and indicate whether or
not it is in fact correct or if it represents a defect?
Thanks,
--
Curt