Noble,
Thanks for continuing to help me come up with a config that works.
A couple of questions/clarifications:
1) I had to introduce the "artificial" comboId and the transformer because
of an "id" conflict between two parallel entities ("vets" and "owners").
2) I don't think there is a conflict with the petID because prior to the
introduction of "vets" I had "owners" with no "id" issues regarding "pets".
3) The conflict was introduced the moment I tried to add "vets".
Unfortunately, once I introduced the transformer for "owners", the "pets"
relationships stopped working.
Below are 3 specifications. The first 2 work in isolation; when they are
combined (the last one), it doesn't work.
* CASE 1 (Works -nested entities -no conflicts on ids -no transformer)
<document name="doc-1">
  <entity name="owners" pk="id"
          query="select id,first_name,last_name FROM owners">
    <field column="id" name="id"/>
    <field column="first_name" name="userName"/>
    <field column="last_name" name="userName"/>
    <entity name="pets" pk="id"
            query="SELECT id,name,birth_date,type_id FROM pets WHERE
                   owner_id='${owners.id}'"
            parentDeltaQuery="SELECT id FROM owners WHERE
                              id=${pets.owner_id}">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
      <field column="birth_date" name="birthDate"/>
    </entity>
  </entity>
</document>
* CASE 2 (Works -parallel independent entities -introduced transformer to
avoid id conflicts)
<document name="doc-1">
  <entity name="vets" pk="id"
          query="select id,first_name,last_name FROM vets"
          transformer="TemplateTransformer">
    <field column="id" name="comboId" template="vets-${vets.id}"/>
    <field column="first_name" name="userName"/>
    <field column="last_name" name="userName"/>
  </entity>
  <entity name="owners" pk="id"
          query="select id,first_name,last_name FROM owners"
          transformer="TemplateTransformer">
    <field column="id" name="comboId"
           template="owners-${owners.id}"/>
    <field column="first_name" name="userName"/>
    <field column="last_name" name="userName"/>
  </entity>
</document>
* CASE 3 (Commented out "vets" to simplify case. Nested entities don't work:
"Document [null] missing required field: id")
<document name="doc-1">
  <!--entity name="vets" pk="id"
          query="select id,first_name,last_name FROM vets"
          transformer="TemplateTransformer">
    <field column="id" name="comboId" template="vets-${vets.id}"/>
    <field column="first_name" name="userName"/>
    <field column="last_name" name="userName"/>
  </entity-->
  <entity name="owners" pk="id"
          query="select id,first_name,last_name FROM owners"
          transformer="TemplateTransformer">
    <field column="id" name="comboId"
           template="owners-${owners.id}"/>
    <field column="first_name" name="userName"/>
    <field column="last_name" name="userName"/>
    <entity name="pets" pk="id"
            query="SELECT id,name,birth_date,type_id FROM pets WHERE
                   owner_id='${owners.id}'"
            parentDeltaQuery="SELECT id FROM owners WHERE
                              id=${pets.owner_id}">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
      <field column="birth_date" name="birthDate"/>
    </entity>
  </entity>
</document>
The debug output from the dataImporter for the first owners row (id=1, which
the transformer turns into 'owners-1') shows the following query for the pets
entity, where owner_id is a foreign key to the owners id column:
"SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='owners-1'"
I believe the issue is that I somehow have to "untransform" the owners id
before it is compared with the pets foreign key owner_id.
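One possible way around the "untransform" step, as an untested sketch based
on Noble's suggestion quoted below: instead of overwriting the "id" column,
the template writes into a separate comboId column, so ${owners.id} should
keep its numeric value when the pets query is built. This assumes that
comboId is declared as the uniqueKey in schema.xml and that a petId field
(a hypothetical name) exists there as well.

```xml
<dataConfig>
  <document name="doc-1">
    <!-- vets: comboId is a new column built from the raw id, e.g. "vets-1" -->
    <entity name="vets" pk="id"
            query="select id,first_name,last_name FROM vets"
            transformer="TemplateTransformer">
      <field column="comboId" template="vets-${vets.id}"/>
      <field column="id"/>
      <field column="first_name" name="userName"/>
      <field column="last_name" name="userName"/>
    </entity>
    <!-- owners: same idea; the "id" column itself is left untouched -->
    <entity name="owners" pk="id"
            query="select id,first_name,last_name FROM owners"
            transformer="TemplateTransformer">
      <field column="comboId" template="owners-${owners.id}"/>
      <field column="id"/>
      <field column="first_name" name="userName"/>
      <field column="last_name" name="userName"/>
      <!-- pets: id is mapped to petId (assumed schema field) to avoid a clash -->
      <entity name="pets" pk="id"
              query="SELECT id,name,birth_date,type_id FROM pets WHERE
                     owner_id='${owners.id}'"
              parentDeltaQuery="SELECT id FROM owners WHERE
                                id=${pets.owner_id}">
        <field column="id" name="petId"/>
        <field column="name" name="name"/>
        <field column="birth_date" name="birthDate"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

If this behaves as intended, the debug output for the pets entity would be
expected to show owner_id='1' rather than owner_id='owners-1'.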
Thanks again
** julio
-----Original Message-----
From: Noble Paul [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 03, 2008 10:30 PM
To: [email protected]
Subject: Re: How to describe 2 entities in dataConfig for the DataImporter?
The id in pets should be aliased to 'petid', because id is coming from both
entities and there is a conflict:
<entity name="owners" pk="id"
        query="select id,first_name,last_name FROM owners"
        transformer="TemplateTransformer">
  <field column="comboId" template="owners-${owners.id}"/>
  <field column="id" />
  <field column="first_name" name="userName"/>
  <field column="last_name" name="userName"/>
  <entity name="pets"
          query="SELECT id,name,birth_date,type_id FROM pets WHERE
                 owner_id='${owners.id}'"
          parentDeltaQuery="SELECT id FROM owners WHERE
                            id=${pets.owner_id}">
    <field column="id" name="petid"/>
    <field column="name" name="name"/>
    <field column="birth_date" name="birthDate"/>
  </entity>
</entity>
On Wed, Jun 4, 2008 at 10:37 AM, Noble Paul
<[EMAIL PROTECTED]> wrote:
> hi julio,
> You must create an extra field for 'comboId' because you really need
> the 'id' for your sub-entities. Your data-config must look as follows.
> The pets entity also has a field called 'id'. That is not a good idea;
> call it 'petid' or something (both in data-config and schema.xml).
> Please make sure that the field names are unique.
>
>
> <entity name="owners" pk="id"
>         query="select id,first_name,last_name FROM owners"
>         transformer="TemplateTransformer">
>   <field column="comboId" template="owners-${owners.id}"/>
>   <field column="id" />
>   <field column="first_name" name="userName"/>
>   <field column="last_name" name="userName"/>
>
>   <entity name="pets" pk="id"
>           query="SELECT id,name,birth_date,type_id FROM pets WHERE
>                  owner_id='${owners.id}'"
>           parentDeltaQuery="SELECT id FROM owners WHERE
>                             id=${pets.owner_id}">
>     <field column="id" name="id"/>
>     <field column="name" name="name"/>
>     <field column="birth_date" name="birthDate"/>
>   </entity>
> </entity>
>
>
> On Wed, Jun 4, 2008 at 5:50 AM, Julio Castillo <[EMAIL PROTECTED]> wrote:
>> Hi Noble,
>> I had forgotten to also list comboId as a uniqueKey in the schema.xml
>> file. But that didn't make a difference.
>> It still complained about the "Document [null] missing required field:
>> id" for each row it ran into of the outer entity.
>>
>> If you look at the debug output of the entity:pets (see below on
>> original message).
>> The query looks like this:
>> "SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='owners-1'"
>>
>> This is where the problem lies, because the owner_id in the pets table
>> is currently a number and thus will not match the modified combo id
>> generated for the owners' id column.
>>
>> So, somehow, I need to be able to either remove the 'owners-' prefix
>> before comparing, or prepend the same prefix to the pets.owner_id
>> value prior to comparing.
>>
>> Thanks
>>
>> ** julio
>>
>> -----Original Message-----
>> From: Noble Paul [mailto:[EMAIL PROTECTED]
>> Sent: Monday, June 02, 2008 9:20 PM
>> To: [email protected]
>> Subject: Re: How to describe 2 entities in dataConfig for the DataImporter?
>>
>> hi Julio,
>> Disregard my previous response. In your schema, 'id' is the uniqueKey.
>> Make 'comboId' the uniqueKey, because that is the target field name
>> coming out of the entity 'owners'.
>>
>> --Noble
>>
>> On Tue, Jun 3, 2008 at 9:46 AM, Noble Paul
>> <[EMAIL PROTECTED]> wrote:
>>> The field 'id' is repeated in pets as well. Rename it to something
>>> else, say:
>>> <entity name="pets" pk="id"
>>>         query="SELECT id,name,birth_date,type_id FROM pets
>>>                WHERE owner_id='${owners.id}'"
>>>         parentDeltaQuery="SELECT id FROM owners WHERE
>>>                           id=${pets.owner_id}">
>>>   <field column="id" name="petid"/>
>>> </entity>
>>>
>>> --Noble
>>>
>>> On Tue, Jun 3, 2008 at 3:28 AM, Julio Castillo
>>> <[EMAIL PROTECTED]> wrote:
>>>> Shalin,
>>>> I experimented with it, and the null pointer exception has been
>>>> taken care of. Thank you.
>>>>
>>>> I have a different problem now. I believe it is a
>>>> syntax/specification problem.
>>>>
>>>> When importing data, I got the following exceptions:
>>>> SEVERE: Exception while adding:
>>>> SolrInputDocumnt[{comboId=comboId(1.0)={owners-9},
>>>> userName=userName(1.0)={[David, Schroeder]}}]
>>>>
>>>> org.apache.solr.common.SolrException: Document [null] missing required field: id
>>>>     at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:289)
>>>>     at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:263)
>>>> ...
>>>>
>>>> The problem arises the moment I try to include nested entities (e.g.
>>>> pets). The problem does not occur if I don't use the transformer,
>>>> but I have to use the transformer because other unrelated entities
>>>> also have ids.
>>>> My data config file looks as follows.
>>>>
>>>> <dataConfig>
>>>>   <document name="doc-1">
>>>>     <entity name="owners" pk="id"
>>>>             query="select id,first_name,last_name FROM owners"
>>>>             transformer="TemplateTransformer">
>>>>       <field column="id" name="comboId"
>>>>              template="owners-${owners.id}"/>
>>>>       <field column="first_name" name="userName"/>
>>>>       <field column="last_name" name="userName"/>
>>>>
>>>>       <entity name="pets" pk="id"
>>>>               query="SELECT id,name,birth_date,type_id FROM pets
>>>>                      WHERE owner_id='${owners.id}'"
>>>>               parentDeltaQuery="SELECT id FROM owners WHERE
>>>>                                 id=${pets.owner_id}">
>>>>         <field column="id" name="id"/>
>>>>         <field column="name" name="name"/>
>>>>         <field column="birth_date" name="birthDate"/>
>>>>       </entity>
>>>>     </entity>
>>>>   </document>
>>>> </dataConfig>
>>>>
>>>> The debug output of the data import looks as follows:
>>>>
>>>> ....
>>>> - <lst name="verbose-output">
>>>> - <lst name="entity:owners">
>>>> - <lst name="document#1">
>>>> <str name="query">select id,first_name,last_name FROM owners</str>
>>>> <str name="time-taken">0:0:0.15</str>
>>>> <str>----------- row #1-------------</str>
>>>> <int name="id">1</int>
>>>> <str name="first_name">George</str>
>>>> <str name="last_name">Franklin</str>
>>>> <str>---------------------------------------------</str>
>>>> - <lst name="transformer:TemplateTransformer">
>>>> <str>---------------------------------------------</str>
>>>> <str name="id">owners-1</str>
>>>> <str name="first_name">George</str>
>>>> <str name="last_name">Franklin</str>
>>>> <str>---------------------------------------------</str>
>>>> - <lst name="entity:pets">
>>>> <str name="query">SELECT id,name,birth_date,type_id FROM
>>>> pets WHERE owner_id='owners-1'</str>
>>>> <str name="time-taken">0:0:0.0</str>
>>>> </lst>
>>>> </lst>
>>>> </lst>
>>>> + <lst name="document#1">
>>>> ....
>>>>
>>>> Thanks again
>>>>
>>>> ** julio
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Shalin Shekhar Mangar [mailto:[EMAIL PROTECTED]
>>>> Sent: Saturday, May 31, 2008 10:26 AM
>>>> To: [email protected]
>>>> Subject: Re: How to describe 2 entities in dataConfig for the DataImporter?
>>>>
>>>> Hi Julio,
>>>>
>>>> I've fixed the bug. Can you please replace the existing
>>>> TemplateTransformer.java in the SOLR-469.patch with the attached
>>>> TemplateTransformer.java file? We'll add the changes to our next patch.
>>>> Sorry for all the trouble.
>>>>
>>>> On Sat, May 31, 2008 at 10:31 PM, Noble Paul
>>>> <[EMAIL PROTECTED]> wrote:
>>>>> julio,
>>>>> Looks like it is a bug.
>>>>> We can give u a new TemplateTransformer.java which we will
>>>>> incorporate in the next patch --Noble
>>>>>
>>>>> On Sat, May 31, 2008 at 12:24 AM, Julio Castillo
>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>> I'm sorry Shalin, but I still get the same Null Pointer exception.
>>>>>> This is my complete dataconfig.xml (I removed the parallel entity
>>>>>> to narrow down the scope of the problem).
>>>>>> <dataConfig>
>>>>>>   <document name="doc-1">
>>>>>>     <entity name="vets" pk="id"
>>>>>>             query="select id as idAlias,first_name,last_name FROM vets"
>>>>>>             deltaQuery="SELECT id as idAlias FROM vets WHERE
>>>>>>                         last_modified > '${dataimporter.last_index_time}'"
>>>>>>             transformer="TemplateTransformer">
>>>>>>       <field column="id" name="id"
>>>>>>              template="vets-${vets.idAlias}"/>
>>>>>>       <field column="first_name" name="userName"/>
>>>>>>       <field column="last_name" name="userName"/>
>>>>>>     </entity>
>>>>>>   </document>
>>>>>> </dataConfig>
>>>>>>
>>>>>> Thanks again.
>>>>>>
>>>>>> ** julio
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Shalin Shekhar Mangar [mailto:[EMAIL PROTECTED]
>>>>>> Sent: Friday, May 30, 2008 11:38 AM
>>>>>> To: [email protected]
>>>>>> Subject: Re: How to describe 2 entities in dataConfig for the DataImporter?
>>>>>>
>>>>>> The surname is used just as an example of a field.
>>>>>>
>>>>>> The NullPointerException is because the same field "id" tries to
>>>>>> use its own value in a template. The template cannot contain the
>>>>>> same field on which it is being applied. I'd suggest that you get
>>>>>> the id aliased to another name, for example using a query "select
>>>>>> id as idAlias from vets" and then use:
>>>>>> <field column="id" template="vets-${vets.idAlias}" />
>>>>>>
>>>>>> That should work, let me know if you face a problem.
>>>>>>
>>>>>> On Fri, May 30, 2008 at 10:40 PM, Julio Castillo
>>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>>> Thanks for all the leads.
>>>>>>> I did get however a null pointer exception while implementing it:
>>>>>>>
>>>>>>> May 30, 2008 9:57:50 AM org.apache.solr.handler.dataimport.EntityProcessorBase applyTransformer
>>>>>>> WARNING: transformer threw error java.lang.NullPointerException
>>>>>>>     at org.apache.solr.handler.dataimport.TemplateTransformer.transformRow(TemplateTransformer.java:55)
>>>>>>>     at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:186)
>>>>>>>
>>>>>>> Looking at the source code, it appears that the resolverMap is null.
>>>>>>> The resolver returned null given the entityName.
>>>>>>>
>>>>>>>
>>>>>>> ** julio
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Noble Paul [mailto:[EMAIL PROTECTED]
>>>>>>> Sent: Thursday, May 29, 2008 11:10 PM
>>>>>>> To: [email protected]
>>>>>>> Subject: Re: How to describe 2 entities in dataConfig for the DataImporter?
>>>>>>>
>>>>>>> Sorry I forgot to mention that.
>>>>>>> http://wiki.apache.org/solr/DataImportHandler#head-a6916b30b5d7605a990fb03c4ff461b3736496a9
>>>>>>> --Noble
>>>>>>>
>>>>>>> On Fri, May 30, 2008 at 11:37 AM, Shalin Shekhar Mangar
>>>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>>>> You need to enable TemplateTransformer for your entity. For example:
>>>>>>>> <entity name="owners" pk="id" query="...."
>>>>>>>>         transformer="TemplateTransformer">
>>>>>>>>
>>>>>>>> On Fri, May 30, 2008 at 11:31 AM, Julio Castillo
>>>>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>>>>> Noble,
>>>>>>>>> I tried the template setting for the "id" field, but I didn't
>>>>>>>>> notice any different behavior. I also didn't see where this
>>>>>>>>> would be reflected.
>>>>>>>>> I looked at the fields and the debug output for the
>>>>>>>>> dataImporter and couldn't see any reference to a modified id
>>>>>>>>> name (per the template instructions).
>>>>>>>>>
>>>>>>>>> The behavior in the end seemed to be the same. Did I miss anything?
>>>>>>>>> I assume that the <uniqueKey>id</uniqueKey> setting in the
>>>>>>>>> schema.xml remains the same?
>>>>>>>>>
>>>>>>>>> Thanks again
>>>>>>>>>
>>>>>>>>> ** julio
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Noble Paul [mailto:[EMAIL PROTECTED]
>>>>>>>>> Sent: Thursday, May 29, 2008 9:46 PM
>>>>>>>>> To: [email protected]
>>>>>>>>> Subject: Re: How to describe 2 entities in dataConfig for the DataImporter?
>>>>>>>>>
>>>>>>>>> Consider constructing the id by concatenating an extra string for
>>>>>>>>> each document. You can construct that field using the
>>>>>>>>> TemplateTransformer.
>>>>>>>>> In the entity owners keep the id as
>>>>>>>>>
>>>>>>>>> <field column="id" name="id" template="owners-${owners.id}"/>
>>>>>>>>> and in vets
>>>>>>>>> <field column="id" name="id" template="vets-${vets.id}"/>
>>>>>>>>>
>>>>>>>>> or anything else which can make it unique
>>>>>>>>>
>>>>>>>>> --Noble
>>>>>>>>>
>>>>>>>>> On Fri, May 30, 2008 at 10:05 AM, Shalin Shekhar Mangar
>>>>>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>>>>>> That will happen only if id is the uniqueKey in Solr and the
>>>>>>>>>> ids coming from both your tables have the same values. In that
>>>>>>>>>> case, they will overwrite each other. You will need a
>>>>>>>>>> separate uniqueKey (one other than the id field).
>>
>>
>
>
>
> --
> --Noble Paul
>
--
--Noble Paul