Solved it! The per-contact fields in my data config were set to commonField="true"; they should be commonField="false".
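
A minimal sketch of the corrected field mappings, assuming the same entity, URL, and XPaths as in the gcontacts-data-config.xml quoted further down. With commonField="true", DIH copies a value to later records once it has been seen, which would explain why entries without an email or postal address picked up the previous contact's values. Setting the attribute to "false" on the per-entry fields (or, as far as the DIH documentation suggests, leaving it out) should stop that carry-over:

```xml
<dataConfig>
  <dataSource type="URLDataSource" />
  <document>
    <entity name="gcontacts"
            pk="link"
            url="http://172.16.0.30/sayt2/contacts/testtim.xml"
            processor="XPathEntityProcessor"
            forEach="/feed/entry">

      <!-- Every XPath below points inside /feed/entry, so each value belongs
           to a single contact and must not be carried over to the next one. -->
      <field column="source"        xpath="/feed/entry/id" commonField="false" />
      <field column="source-link"   xpath="/feed/entry/link[@rel='edit']/@href" commonField="false" />

      <field column="title"         xpath="/feed/entry/title" commonField="false" />
      <field column="link"          xpath="/feed/entry/link[@rel='edit']/@href" />
      <field column="email"         xpath="/feed/entry/email/@address" commonField="false" />
      <field column="phoneNumber"   xpath="/feed/entry/phoneNumber" commonField="false" />
      <field column="organization"  xpath="/feed/entry/organization" commonField="false" />
      <field column="postalAddress" xpath="/feed/entry/postalAddress" commonField="false" />
    </entity>
  </document>
</dataConfig>
```

In the original RSS sample, commonField="true" appears on fields read from outside the forEach element (the channel-level source and link), where sharing one value across all records is the intent. That does not apply here, because every field is read from within each entry.
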
Mistakes that happen when copying source from a sample project...

Thanks for your help.

On Fri, Apr 1, 2011 at 10:29 AM, Marcelo Iturbe <marc...@santiago.cl> wrote:

>
> Hello,
> I was able to repeat this behaviour in Solr 3.1.0
>
> The procedure is
> - rename the directory example-DIH/rss to example-DIH/gcontacts
> - modify solrconfig.xml to only load gcontacts
> - rename rss-data-config.xml to gcontacts-data-config.xml and modify (see content below)
> - modify schema.xml
>
> This is from my schema.xml
>   <field name="source" type="text" indexed="true" stored="true" />
>   <field name="source-link" type="string" indexed="false" stored="true" />
>
>   <field name="title" type="string" indexed="true" stored="true" />
>   <field name="link" type="string" indexed="true" stored="true" />
>
>   <field name="email" type="string" indexed="true" stored="true" multiValued="true" default=" "/>
>   <field name="phoneNumber" type="string" indexed="true" stored="true" multiValued="true" default=" "/>
>   <field name="organization" type="string" indexed="true" stored="true" multiValued="true" default=" "/>
>
>   <field name="postalAddress" type="string" indexed="true" stored="true" multiValued="true" default=" "/>
>
>   <field name="all_text" type="text" indexed="true" stored="true" multiValued="true" />
>   <copyField source="title" dest="all_text" />
>   <copyField source="email" dest="all_text" />
>   <copyField source="phoneNumber" dest="all_text" />
>   <copyField source="organization" dest="all_text" />
>   <copyField source="postalAddress" dest="all_text" />
>
> This is my gcontacts-data-config.xml file
> <dataConfig>
>   <dataSource type="URLDataSource" />
>   <document>
>     <entity name="gcontacts"
>             pk="link"
>             url="http://172.16.0.30/sayt2/contacts/testtim.xml"
>             processor="XPathEntityProcessor"
>             forEach="/feed/entry"
>             >
>
>       <field column="source" xpath="/feed/entry/id" commonField="true" />
>       <field column="source-link" xpath="/feed/entry/link[@rel='edit']/@href" commonField="true" />
>
>       <field column="title" xpath="/feed/entry/title" commonField="true"/>
>       <field column="link" xpath="/feed/entry/link[@rel='edit']/@href" />
>       <field column="email" xpath="/feed/entry/email/@address" commonField="true"/>
>       <field column="phoneNumber" xpath="/feed/entry/phoneNumber" commonField="true"/>
>       <field column="organization" xpath="/feed/entry/organization" commonField="true"/>
>       <field column="postalAddress" xpath="/feed/entry/postalAddress" commonField="true"/>
>     </entity>
>   </document>
> </dataConfig>
>
> This is from my solrconfig.xml file
> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
> <solr sharedLib="lib" persistent="true">
>   <cores adminPath="/admin/cores">
>     <core default="false" instanceDir="gcontacts" name="gcontacts"/>
>   </cores>
> </solr>
>
> Thanks for your help.
>
> Regards
>
>
> On Fri, Apr 1, 2011 at 4:27 AM, Stefan Matheis <matheis.ste...@googlemail.com> wrote:
>
>> Marcelo,
>>
>> could you paste the relevant parts of your DIH config?
>>
>> Regards
>> Stefan
>>
>> On Thu, Mar 31, 2011 at 9:55 PM, Marcelo Iturbe <marc...@santiago.cl> wrote:
>> > Hello,
>> > I have an XML which contains personal contacts. Not all contacts have the
>> > same fields (email, phone, postal).
>> >
>> > The problem is that when certain fields are NOT present, SOLR is injecting
>> > the previous contact's data.
>> >
>> > For example, assume the following from the XML feed:
>> > <entry>
>> >   <title type='text'>Jane Doe</title>
>> >   <gd:email rel='http://schemas.google.com/g/2005#work' address='jane....@gmail.com' primary='true'/>
>> >   <gd:postalAddress rel='http://schemas.google.com/g/2005#home'>Santiago
>> > Region Metropolitana
>> > Chile</gd:postalAddress>
>> > </entry>
>> > <entry>
>> >   <title type='text'>Jeff Smith</title>
>> >   <gd:email rel='http://schemas.google.com/g/2005#work' address='jeff.sm...@gmail.com' primary='true'/>
>> > </entry>
>> > <entry>
>> >   <title type='text'>Ana Mercurio</title>
>> >   <gd:phoneNumber rel='http://schemas.google.com/g/2005#mobile' primary='true'>+56912345678</gd:phoneNumber>
>> > </entry>
>> >
>> > The second contact will have the first contact's postal address.
>> > The third contact will have Jane's postal address and Jeff's email address:
>> >
>> > <lst>
>> >   <arr name="title">
>> >     <str>Ana Mercurio</str>
>> >   </arr>
>> >   <arr name="phoneNumber">
>> >     <str>+5612345678</str>
>> >   </arr>
>> >   <arr name="email">
>> >     <str>jeff.sm...@gmail.com</str>
>> >   </arr>
>> >   <arr name="postalAddress">
>> >     <str>Santiago
>> > Region Metropolitana
>> > Chile</str>
>> >   </arr>
>> > </lst>
>> >
>> > This is how I have the fields specified in the schema.xml file:
>> >   <field name="email" type="string" indexed="true" stored="true" multiValued="true" default=" "/>
>> >   <field name="phoneNumber" type="string" indexed="true" stored="true" multiValued="true" default=" "/>
>> >   <field name="postalAddress" type="string" indexed="true" stored="true" multiValued="true" default=" "/>
>> >
>> > What did I miss?
>> >
>> > Thanks for your help.