1) Shouldn't your "entity" elements go inside a "document" element, i.e.
<dataConfig>
  <dataSource ... />
  <dataSource ... />

  <document name="docs">
    <entity ...>...</entity>
    <entity ...>...</entity>
  </document>
</dataConfig>
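
A rough, untested sketch of that layout applied to the configuration quoted
below. The data source and entity definitions are copied from that message
(hosts and credentials are still placeholders), and the document name "docs"
is only an illustrative label:

```xml
<dataConfig>
  <!-- Data sources as quoted in the thread; url, user and password are placeholders -->
  <dataSource type="JdbcDataSource" name="ds_ora"
              driver="oracle.jdbc.OracleDriver"
              url="jdbc:oracle:thin:@xxx.xxx.xxx.xxx:1521:SID"
              user="user" password="password" />
  <dataSource type="JdbcDataSource" name="ds_pg"
              driver="org.postgresql.Driver"
              url="jdbc:postgresql://xxx.xxx.xxx.yyy:5432/sid"
              user="user" password="password" />

  <!-- Both entities sit inside one document element; "docs" is an assumed name -->
  <document name="docs">
    <entity name="carrers" dataSource="ds_ora"
            query="select 's_'||id as id_carrer, 'a' as tooltip from imi_carrers">
      <field column="id_carrer" name="identificador" />
      <field column="tooltip" name="Nom" />
    </entity>
    <entity name="hidrants" dataSource="ds_pg"
            query="select 'h_'||id as id_hidrant, parc as tooltip from hidrants">
      <field column="id_hidrant" name="identificador" />
      <field column="tooltip" name="Nom" />
    </entity>
  </document>
</dataConfig>
```

Each entity keeps its own dataSource attribute, so the Oracle and postgres
queries stay tied to the right connection.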

2) What happens if you run full-import with an explicitly
specified "entity" GET parameter?
command=full-import&entity=carrers
command=full-import&entity=hidrants


On Wed, Jul 7, 2010 at 11:15 AM, Xavier Rodriguez <xee...@gmail.com> wrote:
> Thanks for the quick reply!
>
> In fact it was a typo: the 200 rows I got were from postgres. I meant to say
> that the full-import was omitting the 100 oracle rows.
>
> When I run the full import, I run it as a single job, using the url
> command=full-import. I've tried to clear the index both using the clean
> command and manually deleting it, but when I run the full-import, the only
> documents indexed are those coming from postgres.
>
> To be sure that the id field is unique, I build the id by prepending a letter
> to the id value. When indexed, the id looks like s_123, which is id 123 for
> an entity identified as "s". Other entities use different prefixes, but
> never "s".
>
> I used DIH to index the data. My configuration is the following:
>
> File db-data-config.xml
>
>  <dataSource
>        type="JdbcDataSource"
>        name="ds_ora"
>        driver="oracle.jdbc.OracleDriver"
>        url="jdbc:oracle:thin:@xxx.xxx.xxx.xxx:1521:SID"
>        user="user"
>        password="password"
>    />
>
>  <dataSource
>        type="JdbcDataSource"
>        name="ds_pg"
>        driver="org.postgresql.Driver"
>        url="jdbc:postgresql://xxx.xxx.xxx.yyy:5432/sid"
>        user="user"
>        password="password"
>    />
>
> <entity name="carrers" dataSource="ds_ora"
>         query="select 's_'||id as id_carrer,'a' as tooltip from imi_carrers">
>            <field column="id_carrer" name="identificador" />
>            <field column="tooltip" name="Nom" />
> </entity>
>
>
> <entity name="hidrants" dataSource="ds_pg"
>         query="select 'h_'||id as id_hidrant, parc as tooltip from hidrants">
>            <field column="id_hidrant" name="identificador" />
>            <field column="tooltip" name="Nom" />
>  </entity>
>
> ----------
>
> In that configuration, all the fields coming from ds_pg are indexed, and the
> fields coming from ds_ora are not indexed. As I've said, the strange
> behaviour for me is that no error is logged in tomcat, the number of
> documents created is the number of rows returned by "hidrants", while the
> number of rows returned is the sum of the rows from "hidrants" and
> "carrers".
>
> Thanks in advance.
>
> Xavi.
>
> On 7 July 2010 02:46, Erick Erickson <erickerick...@gmail.com> wrote:
>
>> First, do you have a unique key defined in your schema.xml? If you
>> do, some of those 300 rows could be replacing earlier rows.
>>
>> You say: "if I have 200
>> rows indexed from postgres and 100 rows from Oracle, the full-import process
>> only indexes 200 documents from oracle, although it shows clearly that the
>> query returned 300 rows."
>>
>> Which really looks like a typo, if you have 100 rows from Oracle how
>> did you get 200 rows from Oracle?
>>
>> Are you perhaps doing this in two different jobs and deleting the
>> first import before running the second?
>>
>> And if this is irrelevant, could you provide more details like how you're
>> indexing things (I'm assuming DIH, but you don't state that anywhere).
>> If it *is* DIH, providing that configuration would help.
>>
>> Best
>> Erick
>>
>> On Tue, Jul 6, 2010 at 11:19 AM, Xavier Rodriguez <xee...@gmail.com>
>> wrote:
>>
>> > Hi,
>> >
>> > I have Solr installed on a Tomcat application server. This solr instance
>> > has some data indexed from a postgres database. Now I need to add some
>> > entities from an Oracle database. When I run the full-import command, the
>> > documents indexed are only documents from postgres. In fact, if I have 200
>> > rows indexed from postgres and 100 rows from Oracle, the full-import process
>> > only indexes 200 documents from oracle, although it shows clearly that the
>> > query returned 300 rows.
>> >
>> > I'm not doing a delta-import, simply a full import. I've tried to clean the
>> > index, reload the configuration, and manually remove dataimport.properties
>> > because it's the only metadata I found. Is there any other file to check or
>> > modify just to get all 300 rows indexed?
>> >
>> > Of course, I tried searching for one of those oracle fields, with no results.
>> >
>> > Thanks a lot,
>> >
>> > Xavier Rodriguez.
>> >
>>
>
