Each document in SOLR corresponds to one db record, and since both
databases have the same schema, you can't index two records from two
databases into the same SOLR document.

So after indexing, you should have 7k different documents, each of which
holds data from a db record.

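Also, one problem I see here is that since the record id in each table is
unique only within the table and (most probably) not globally, there will
be collisions: a record from the second database will overwrite the record
with the same id from the first. To avoid this, I would prepend the record id
with some static value, like: concat("t1", CONVERT(id, CHAR(8))) in MySQL
syntax; on SQL Server the equivalent would be 't1' + CONVERT(VARCHAR(8), id).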
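
A minimal sketch of how the prefixed ids could look in your data-config (not
tried by me). It assumes SQL Server syntax for the concatenation, the made-up
prefixes "s_" and "p_", and an id field of a string type in schema.xml. Only
two columns are shown; the joins and remaining fields are left out. Note that,
as far as I know, DIH expects the entity attribute to be spelled dataSource
(camel case), so that is what the sketch uses.

```xml
<dataConfig>
  <!-- Connection parameters left empty, as in the posted config -->
  <dataSource name="s"
      driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
      url="" user="" password=""/>
  <dataSource name="p"
      driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
      url="" user="" password=""/>
  <document>
    <!-- "s_" is a hypothetical prefix marking rows from the first database -->
    <entity name="ms" dataSource="s"
        query="SELECT 's_' + CONVERT(VARCHAR(12), m.id) AS id,
                      m.serial AS m_machine_serial
               FROM Machine AS m">
      <field column="id"/>
      <field column="m_machine_serial"/>
    </entity>
    <!-- Same query against the second database, with a different prefix -->
    <entity name="mp" dataSource="p"
        query="SELECT 'p_' + CONVERT(VARCHAR(12), m.id) AS id,
                      m.serial AS m_machine_serial
               FROM Machine AS m">
      <field column="id"/>
      <field column="m_machine_serial"/>
    </entity>
  </document>
</dataConfig>
```

With this, the record with id 42 becomes s_42 when it comes from the first
database and p_42 when it comes from the second, so neither overwrites the
other.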

Dmitry

On Thu, Feb 16, 2012 at 4:47 PM, Radu Toev <radut...@gmail.com> wrote:

> I'm not sure I follow.
> The idea is to have only one document. Do the multiple documents have the
> same structure then (different datasources), and if so, how are they actually
> indexed?
>
> Thanks.
>
> On Thu, Feb 16, 2012 at 4:40 PM, Dmitry Kan <dmitry....@gmail.com> wrote:
>
> > I think the problem here is that you are trying to create separate
> > documents for two different tables, while your config is aiming to create
> > only one document. Here is one solution (not tried by me):
> >
> > ------
> > You can have multiple documents generated by the same data-config:
> >
> > <dataConfig>
> >  <dataSource name="ds1" .../>
> >  <dataSource name="ds2" .../>
> >  <dataSource name="ds3" .../>
> >  <document>
> >   <entity blah blah rootEntity="false">
> >       <entity blah blah this is a document>
> >          <entity sets unique id/>
> >       </entity>
> >       <entity blah blah this is another document>
> >          <entity sets unique id/>
> >       </entity>
> >   </entity>
> >  </document>
> > </dataConfig>
> >
> > It's the rootEntity="false" that makes the child entities documents.
> > ------
> >
> > Dmitry
> >
> > On Thu, Feb 16, 2012 at 2:37 PM, Radu Toev <radut...@gmail.com> wrote:
> >
> > > <dataConfig>
> > >  <dataSource
> > >     name="s"
> > >     driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
> > >     url=""
> > >     user=""
> > >     password=""/>
> > >  <dataSource
> > >     name="p"
> > >  driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
> > >     url=""
> > >     user=""
> > >     password=""/>
> > >  <document>
> > >    <entity name="ms"
> > >        datasource="s"
> > > query="SELECT m.id as id, m.serial as m_machine_serial, m.ivk as
> > > m_machine_ivk, m.sitename as m_sitename, m.deliveryDate as m_delivery_date,
> > > m.hotsite as m_hotsite, m.guardian as m_guardian, m.warranty as m_warranty,
> > > m.contract as m_contract,
> > >   st.name as m_st_name, pm.name as m_pm_name, p.name as m_p_name,
> > > sv.shortName as m_sv_name, c.clusterMajor as m_c_cluster_major,
> > > c.clusterMinor as m_c_cluster_minor, c.country as m_c_country, c.code as
> > > m_c_code
> > >   FROM Machine AS m
> > >   LEFT JOIN SystemType AS st ON m.fk_systemType=st.id
> > >   LEFT JOIN ProductModel AS pm ON fk_productModel = pm.id
> > >   LEFT JOIN Platform AS p ON m.fk_platform = p.id
> > >   LEFT JOIN SoftwareVersion AS sv ON fk_softwareVersion = sv.id
> > >   LEFT JOIN Country AS c ON fk_country = c.id"
> > > readOnly="true"
> > > transformer="DateFormatTransformer">
> > > <field column="id" />
> > > <field column="m_machine_serial"/>
> > > <field column="m_machine_ivk"/>
> > > <field column="m_sitename"/>
> > > <filed column="m_delivery_date" dateTimeFormat="yyyy-MM-dd"/>
> > > <field column="m_hotsite"/>
> > > <field column="m_guardian"/>
> > > <field column="m_warranty"/>
> > > <field column="m_contract"/>
> > > <field column="m_st_name"/>
> > > <field column="m_pm_name"/>
> > > <field column="m_p_name"/>
> > > <field column="m_sv_name"/>
> > > <field column="m_c_cluster_major"/>
> > > <field column="m_c_cluster_minor"/>
> > > <field column="m_c_country"/>
> > > <field column="m_c_code"/>
> > >   </entity>
> > >
> > >   <entity name="mp"
> > >        datasource="p"
> > > query="SELECT m.id as id, m.serial as m_machine_serial, m.ivk as
> > > m_machine_ivk, m.sitename as m_sitename, m.deliveryDate as m_delivery_date,
> > > m.hotsite as m_hotsite, m.guardian as m_guardian, m.warranty as m_warranty,
> > > m.contract as m_contract,
> > >   st.name as m_st_name, pm.name as m_pm_name, p.name as m_p_name,
> > > sv.shortName as m_sv_name, c.clusterMajor as m_c_cluster_major,
> > > c.clusterMinor as m_c_cluster_minor, c.country as m_c_country, c.code as
> > > m_c_code
> > >   FROM Machine AS m
> > >   LEFT JOIN SystemType AS st ON m.fk_systemType=st.id
> > >   LEFT JOIN ProductModel AS pm ON fk_productModel = pm.id
> > >   LEFT JOIN Platform AS p ON m.fk_platform = p.id
> > >   LEFT JOIN SoftwareVersion AS sv ON fk_softwareVersion = sv.id
> > >   LEFT JOIN Country AS c ON fk_country = c.id"
> > > readOnly="true"
> > > transformer="DateFormatTransformer">
> > > <field column="id" />
> > > <field column="m_machine_serial"/>
> > > <field column="m_machine_ivk"/>
> > > <field column="m_sitename"/>
> > > <filed column="m_delivery_date" dateTimeFormat="yyyy-MM-dd"/>
> > > <field column="m_hotsite"/>
> > > <field column="m_guardian"/>
> > > <field column="m_warranty"/>
> > > <field column="m_contract"/>
> > > <field column="m_st_name"/>
> > > <field column="m_pm_name"/>
> > > <field column="m_p_name"/>
> > > <field column="m_sv_name"/>
> > > <field column="m_c_cluster_major"/>
> > > <field column="m_c_cluster_minor"/>
> > > <field column="m_c_country"/>
> > > <field column="m_c_code"/>
> > >   </entity>
> > >  </document>
> > > </dataConfig>
> > >
> > > I've removed the connection params
> > > The unique key is id.
> > >
> > > > On Thu, Feb 16, 2012 at 2:27 PM, Dmitry Kan <dmitry....@gmail.com> wrote:
> > > >
> > > > > OK, maybe you can show the db-data-config.xml just in case?
> > > > > Also in schema.xml, does your <uniqueKey> correspond to the unique field
> > > > > in the db?
> > > >
> > > > > On Thu, Feb 16, 2012 at 2:13 PM, Radu Toev <radut...@gmail.com> wrote:
> > > > >
> > > > > > I tried running with just one datasource (the one that has 6k entries)
> > > > > > and it indexes them ok.
> > > > > > The same if I index the 1k database separately: it indexes ok.
> > > > >
> > > > > > On Thu, Feb 16, 2012 at 2:11 PM, Dmitry Kan <dmitry....@gmail.com> wrote:
> > > > > >
> > > > > > > It sounds a bit as if SOLR stopped processing data once it had queried
> > > > > > > everything from the smaller dataset. That's why you have 2000. If you
> > > > > > > just have a handler pointed to the bigger data set (6k), do you manage
> > > > > > > to get all 6k db entries into solr?
> > > > > >
> > > > > > > On Thu, Feb 16, 2012 at 1:46 PM, Radu Toev <radut...@gmail.com> wrote:
> > > > > >
> > > > > > > 1. Nothing in the logs
> > > > > > > 2. No.
> > > > > > >
> > > > > > > > On Thu, Feb 16, 2012 at 12:44 PM, Dmitry Kan <dmitry....@gmail.com> wrote:
> > > > > > >
> > > > > > > > 1. Do you see any errors / exceptions in the logs?
> > > > > > > > 2. Could you have duplicates?
> > > > > > > >
> > > > > > > > > On Thu, Feb 16, 2012 at 10:15 AM, Radu Toev <radut...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hello,
> > > > > > > > > >
> > > > > > > > > > I created a data-config.xml file where I define a datasource and an
> > > > > > > > > > entity with 12 fields.
> > > > > > > > > > In my use case I have 2 databases with the same schema, so I want to
> > > > > > > > > > combine the 2 databases in one index.
> > > > > > > > > > I defined a second dataSource tag and duplicated the entity with its
> > > > > > > > > > fields (changed the name and the datasource).
> > > > > > > > > > What I'm expecting is to get around 7k results (I have around 6k in
> > > > > > > > > > the first db and 1k in the second). However, I'm getting a total of 2k.
> > > > > > > > > > Where could the problem be?
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Regards,
> > > > > > > >
> > > > > > > > Dmitry Kan
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Regards,
> > > > > >
> > > > > > Dmitry Kan
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Regards,
> > > >
> > > > Dmitry Kan
> > > >
> > >
> >
> >
> >
> > --
> > Regards,
> >
> > Dmitry Kan
> >
>



-- 
Regards,

Dmitry Kan
