Hi,

Thanks for the replies. The info in my admin/stats is the following:

searcherName : Searcher@f4e40da main
caching : true
numDocs : 654
maxDoc : 654
reader : SolrIndexReader{this=6a6078e7,r=ReadOnlyDirectoryReader@6a6078e7,refCnt=1,segments=1}
readerDir : org.apache.lucene.store.MMapDirectory@/home/andre/workspace/test/3rd_party/solr/apache-solr-3.6.1/example/solr/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@51a422f6
indexVersion : 1343578710140
openedAt : Sun Aug 05 19:04:35 WEST 2012
registeredAt : Sun Aug 05 19:04:35 WEST 2012
warmupTime : 15

There are 654 docs.

Some more info, my solrconfig.xml:

  <!-- Request handler added by Andre Lopes to import data from database -->
  <requestHandler name="/dataimport"
                  class="org.apache.solr.handler.dataimport.DataImportHandler">
    <!-- default values for query parameters can be specified, these
         will be overridden by parameters in the request
      -->
    <lst name="defaults">
      <str name="config">db-data-config.xml</str>
    </lst>
  </requestHandler>


My db-data-config.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
        <dataSource driver="org.postgresql.Driver"
                    url="jdbc:postgresql://localhost:5432/euvoudebicicleta"
                    user="myuser" password="mypass" />
        <document>
                <entity name="bicyclebusinesses"
                        query="select * from table_text__single_occurrencies order by date_inserted">
                        <field column="uri" name="uri" />
                        <field column="business_name" name="name" />
                        <field column="business_address" name="address" />
                </entity>
        </document>
</dataConfig>


My schema.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="example" version="1.5">
  <types>
    <fieldType name="string" class="solr.StrField"/>
  </types>

  <fields>
    <dynamicField name="*" type="string" indexed="false" stored="false" />
    <field name="uri" type="string" indexed="true" stored="true" />
    <!--
    <field name="name" type="string" indexed="true" stored="true" />
    <field name="address" type="string" indexed="true" stored="true" />
    -->
  </fields>
  <uniqueKey>uri</uniqueKey>
  <!-- <defaultSearchField>catchall</defaultSearchField> -->
</schema>
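
For comparison, here is a minimal sketch of the same schema with "name" and "address" uncommented. It assumes the goal is to get those two fields back in query results. As far as I understand, while they stay commented out they fall through to the catch-all dynamicField, which is neither indexed nor stored, so only "uri" is kept.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!-- Sketch only: the schema above with "name" and "address" declared. -->
<schema name="example" version="1.5">
  <types>
    <fieldType name="string" class="solr.StrField"/>
  </types>

  <fields>
    <!-- Catch-all: any column not declared below is neither indexed nor stored -->
    <dynamicField name="*" type="string" indexed="false" stored="false" />
    <field name="uri" type="string" indexed="true" stored="true" />
    <!-- Targets of the business_name and business_address mappings
         in db-data-config.xml -->
    <field name="name" type="string" indexed="true" stored="true" />
    <field name="address" type="string" indexed="true" stored="true" />
  </fields>
  <uniqueKey>uri</uniqueKey>
</schema>
```

A schema change like this would need a Solr restart and a fresh full-import before it shows up in results.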


I've run the SELECT from db-data-config.xml directly against the
database, and it returns 654 rows. Any more clues?


Best Regards,




On Sun, Aug 5, 2012 at 6:59 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> A quick check here is to go to your admin/stats page and look at
> numDocs and maxDoc. numDocs is the number of documents that it's
> possible to find, i.e. docs that have not been deleted or replaced by
> an update. maxDoc is the number of documents that have been added,
> and that count includes ones with duplicate unique IDs.
>
> So I'm guessing that numDocs == 9 and maxDoc == 654, which as Jack
> says indicates that your uniqueKey is repeated for lots and lots of
> your data...
>
> Best
> Erick
>
> On Sun, Aug 5, 2012 at 1:40 PM, Jack Krupansky <j...@basetechnology.com> wrote:
>> Make sure the id is not duplicated. You might have inadvertently populated
>> the id field in your Solr schema with some non-key value that occurs with
>> high frequency (and may have roughly 9 unique values.)
>>
>> Examine the 9 results and their id fields. Then look at some of your input
>> data to verify that the values placed in the id field are what you expected.
>>
>> If possible, identify one input record that isn't in the 9 results but
>> should be and verify its id.
>>
>>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Andre Lopes
>> Sent: Sunday, August 05, 2012 1:31 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: How to configure schema.xml to take in account two database
>> tables?
>>
>>
>> Thanks for the replies,
>>
>> I've now successfully indexed the database using the DataImportHandler
>> but there is something weird. I've indexed 654 entries but I can't
>> output all the 654 results.
>>
>> After I ran
>> "http://localhost:8983/solr/dataimport?command=full-import" I got 654
>> adds:
>>
>> Aug 5, 2012 6:16:51 PM
>> org.apache.solr.update.processor.LogUpdateProcessor finish
>> INFO: {deleteByQuery=*:*,add=[http://1.com, http://2.com,
>> http://3.com, http://4.com, http://5.com, http://6a.com, http://7.vu,
>> http://8.com/, ... (654 adds)],commit=} 0 35
>>
>> But when I query Solr with
>> "http://localhost:8983/solr/select?q=*:*" I only get 9 results.
>>
>> I've used a very basic schema.xml:
>>
>> <?xml version="1.0" encoding="UTF-8" ?>
>> <schema name="example" version="1.5">
>>
>>  <types>
>>    <fieldType name="string" class="solr.StrField"/>
>>  </types>
>>
>>  <fields>
>>    <dynamicField name="*"       type="string" indexed="true" stored="true"
>> />
>>
>>    <field name="id" type="string" indexed="true" stored="true"
>> multiValued="false" />
>>    <field name="name" type="string" indexed="true" stored="true"
>> multiValued="false" />
>>    <field name="address" type="string" indexed="true" stored="true"
>> multiValued="false" />
>>
>>  </fields>
>>
>>    <uniqueKey>id</uniqueKey>
>>   <!-- <defaultSearchField>catchall</defaultSearchField> -->
>>
>> </schema>
>>
>>
>> Any clues on what I'm doing wrong?
>>
>> Best Regards,
>>
>>
>>
>>
>>
>>
>> On Sun, Aug 5, 2012 at 1:19 PM, Gora Mohanty <g...@mimirtech.com> wrote:
>>>
>>> On 5 August 2012 17:17, Andre Lopes <lopes80an...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm new to Solr. I've done some reading about how it works, but I can't
>>>> find a clue for my specific situation.
>>>>
>>>> Here is my case. I've 2 database tables that I need to add to the
>>>> index, but they are related. One entry in the table "clients" could
>>>> have more than one entry in the table "contacts".
>>>
>>> [...]
>>>
>>> There seem to be various things that you need clarity on:
>>> 1. Firstly, schema.xml describes the various fields that you
>>>     might be indexing, and/or storing in Solr. Thus, it should
>>>     contain a description for each field that you will be using,
>>>     no matter what data source the field might come from.
>>> 2. One typically flattens data when indexing into Solr.
>>>     Following your example, as customers can have multiple
>>>     phone numbers, you should denormalise your data.
>>>     E.g., each Solr record could have these fields:
>>>        <cust. name>, <cust. desc.>, <phone>
>>>     Thus, for customer 1 you would need two records, for
>>>     customer 2 one record, and for customer 3 three records.
>>>
>>>     You might find this blog useful, though it probably has
>>>      more detail than you need:
>>>      http://mysolr.com/tips/denormalized-data-structure/
>>> 3. You will need some way to index the data into Solr. One
>>>     way is to use the DataImportHandler which allows
>>>     indexing from multiple databases:
>>>     http://wiki.apache.org/solr/DataImportHandler
>>>
>>> Regards,
>>> Gora
>>
>>
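
For the two-table case that started this thread, the flattening Gora describes could be sketched as a single DataImportHandler entity whose SQL joins "clients" and "contacts", giving one Solr document per client/contact pair. This is only an illustration under assumptions: the column names (id, client_id, name, phone) and the entity name are hypothetical, and the schema would need "id" as its uniqueKey plus a declared "phone" field. The key is built from both row ids so that it stays unique, which avoids the duplicate-uniqueKey problem discussed above.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
  <!-- Connection details copied from the db-data-config.xml in this thread -->
  <dataSource driver="org.postgresql.Driver"
              url="jdbc:postgresql://localhost:5432/euvoudebicicleta"
              user="myuser" password="mypass" />
  <document>
    <!-- One row, and so one Solr document, per (client, contact) pair.
         Column names id, client_id, name and phone are assumptions. -->
    <entity name="clientcontacts"
            query="select c.id::text || '-' || ct.id::text as doc_id,
                          c.name as client_name,
                          ct.phone as contact_phone
                   from clients c
                   join contacts ct on ct.client_id = c.id">
      <!-- doc_id combines both primary keys, so no two rows share a key -->
      <field column="doc_id" name="id" />
      <field column="client_name" name="name" />
      <field column="contact_phone" name="phone" />
    </entity>
  </document>
</dataConfig>
```

With an inner join, clients that have no contacts would not be indexed at all; a left join would keep them, but the key expression would then need to handle the missing contact id.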
