Hi,

Thanks for the replies. The info in my admin/stats is the following:

searcherName : Searcher@f4e40da main
caching : true
numDocs : 654
maxDoc : 654
reader : SolrIndexReader{this=6a6078e7,r=ReadOnlyDirectoryReader@6a6078e7,refCnt=1,segments=1}
readerDir : org.apache.lucene.store.MMapDirectory@/home/andre/workspace/test/3rd_party/solr/apache-solr-3.6.1/example/solr/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@51a422f6
indexVersion : 1343578710140
openedAt : Sun Aug 05 19:04:35 WEST 2012
registeredAt : Sun Aug 05 19:04:35 WEST 2012
warmupTime : 15

There are 654 docs.

Some more info, my solrconfig.xml:

<!-- Request handler added by Andre Lopes to import data from database -->
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <!-- default values for query parameters can be specified, these will be overridden by parameters in the request -->
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
  </lst>
</requestHandler>

My db-data-config.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
  <dataSource driver="org.postgresql.Driver"
              url="jdbc:postgresql://localhost:5432/euvoudebicicleta"
              user="myuser" password="mypass" />
  <document>
    <entity name="bicyclebusinesses"
            query="select * from table_text__single_occurrencies order by date_inserted">
      <field column="uri" name="uri" />
      <field column="business_name" name="name" />
      <field column="business_address" name="address" />
    </entity>
  </document>
</dataConfig>

My schema.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="example" version="1.5">
  <types>
    <fieldType name="string" class="solr.StrField"/>
  </types>
  <fields>
    <dynamicField name="*" type="string" indexed="false" stored="false" />
    <field name="uri" type="string" indexed="true" stored="true" />
    <!--
    <field name="name" type="string" indexed="true" stored="true" />
    <field name="address" type="string" indexed="true" stored="true" />
    -->
  </fields>
  <uniqueKey>uri</uniqueKey>
  <!-- <defaultSearchField>catchall</defaultSearchField> -->
</schema>

I've
tested, and the SELECT in the db-data-config.xml outputs 654 results.

Any more clues?

Best Regards,

On Sun, Aug 5, 2012 at 6:59 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> A quick check here is to go to your admin/stats page and look at
> numDocs and maxDoc. numDocs is the number of documents that it's
> possible to find, i.e. non updated/deleted docs. maxDoc is the number
> of documents that have been added, and that count includes ones with
> duplicate unique IDs.
>
> So I'm guessing that numDocs == 9 and maxDoc == 654, which as Jack
> says indicates that your uniqueKey is repeated for lots and lots of
> your data...
>
> Best
> Erick
>
> On Sun, Aug 5, 2012 at 1:40 PM, Jack Krupansky <j...@basetechnology.com>
> wrote:
>> Make sure the id is not duplicated. You might have inadvertently populated
>> the id field in your Solr schema with some non-key value that occurs with
>> high frequency (and may have roughly 9 unique values.)
>>
>> Examine the 9 results and their id fields. Then look at some of your input
>> data to verify that the values placed in the id field are what you expected.
>>
>> If possible, identify one input record that isn't in the 9 results but
>> should be and verify its id.
>>
>>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Andre Lopes
>> Sent: Sunday, August 05, 2012 1:31 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: How to configure schema.xml to take in account two database
>> tables?
>>
>>
>> Thanks for the replies,
>>
>> I've now successfully indexed the database using the DataImportHandler
>> but there is something weird. I've indexed 654 entries but I can't
>> output all the 654 results.
>>
>> After I run
>> "http://localhost:8983/solr/dataimport?command=full-import" I get 654
>> adds:
>>
>> Aug 5, 2012 6:16:51 PM
>> org.apache.solr.update.processor.LogUpdateProcessor finish
>> INFO: {deleteByQuery=*:*,add=[http://1.com, http://2.com,
>> http://3.com, http://4.com, http://5.com, http://6a.com, http://7.vu,
>> http://8.com/, ... (654 adds)],commit=} 0 35
>>
>> But when I query Solr with
>> "http://localhost:8983/solr/select?q=*:*" I only get 9 results.
>>
>> I've used a very basic schema.xml:
>>
>> <?xml version="1.0" encoding="UTF-8" ?>
>> <schema name="example" version="1.5">
>>
>>   <types>
>>     <fieldType name="string" class="solr.StrField"/>
>>   </types>
>>
>>   <fields>
>>     <dynamicField name="*" type="string" indexed="true" stored="true" />
>>
>>     <field name="id" type="string" indexed="true" stored="true" multiValued="false" />
>>     <field name="name" type="string" indexed="true" stored="true" multiValued="false" />
>>     <field name="address" type="string" indexed="true" stored="true" multiValued="false" />
>>
>>   </fields>
>>
>>   <uniqueKey>id</uniqueKey>
>>   <!-- <defaultSearchField>catchall</defaultSearchField> -->
>>
>> </schema>
>>
>>
>> Any clues on what I'm doing wrong?
>>
>> Best Regards,
>>
>>
>> On Sun, Aug 5, 2012 at 1:19 PM, Gora Mohanty <g...@mimirtech.com> wrote:
>>>
>>> On 5 August 2012 17:17, Andre Lopes <lopes80an...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm new to Solr. I've done some reading about how it works, but I can't
>>>> find a clue for my specific situation.
>>>>
>>>> Here is my case. I have 2 database tables that I need to add to the
>>>> index, but they are related. One entry in the table "clients" could
>>>> have more than one entry in the table "contacts".
>>>
>>> [...]
>>>
>>> There seem to be various things that you need clarity on:
>>> 1. Firstly, schema.xml describes the various fields that you
>>>     might be indexing, and/or storing in Solr.
>>>     Thus, it should contain a description for each field that you
>>>     will be using, no matter what data source the field might come from.
>>> 2. One typically flattens data when indexing into Solr.
>>>     Following your example, as customers can have multiple
>>>     phone numbers, you should denormalise your data.
>>>     E.g., each Solr record could have these fields:
>>>       <cust. name>, <cust. desc.>, <phone>
>>>     Thus, for customer 1 you would need two records, for
>>>     customer 2 one record, and for customer 3 three records.
>>>
>>>     You might find this blog useful, though it probably has
>>>     more detail than you need:
>>>     http://mysolr.com/tips/denormalized-data-structure/
>>> 3. You will need some way to index the data into Solr. One
>>>     way is to use the DataImportHandler which allows
>>>     indexing from multiple databases:
>>>     http://wiki.apache.org/solr/DataImportHandler
>>>
>>> Regards,
>>> Gora
>>
>>
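
Gora's second point (one flat Solr record per customer/phone pair) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer data, the field names, and the way the `id` value is built are all hypothetical and not taken from the thread. The sketch gives every flattened record its own `id`, since records that share a uniqueKey value replace one another in the index, which is the symptom Jack and Erick describe above.

```python
# Hypothetical sample data: names, descriptions and phone numbers are made up
# for illustration only. Customer 1 has two phones, customer 2 has one,
# customer 3 has three, mirroring the counts mentioned in the thread.
customers = [
    {"name": "Customer 1", "desc": "first", "phones": ["111-0001", "111-0002"]},
    {"name": "Customer 2", "desc": "second", "phones": ["222-0001"]},
    {"name": "Customer 3", "desc": "third",
     "phones": ["333-0001", "333-0002", "333-0003"]},
]


def flatten(customers):
    """Return one flat record per (customer, phone) pair."""
    records = []
    for customer in customers:
        for phone in customer["phones"]:
            records.append({
                # Assumed key scheme: name plus phone, so that each record
                # has a distinct value in the uniqueKey field.
                "id": f'{customer["name"]}|{phone}',
                "name": customer["name"],
                "desc": customer["desc"],
                "phone": phone,
            })
    return records


records = flatten(customers)
unique_ids = {record["id"] for record in records}

# If these two numbers differ, some records share an id and would
# overwrite each other when indexed.
print(len(records), len(unique_ids))
```

The same comparison (number of rows against number of distinct key values) is a quick check to run on the real import query before blaming the schema.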