On Thu, Feb 4, 2010 at 10:50 AM, Lance Norskog <goks...@gmail.com> wrote: > I just tested this with a DIH that does not use database input. > > If the DataImportHandler JDBC code does not support a schema that has > optional fields, that is a major weakness. Noble/Shalin, is this true? The problem is obviously not with DIH. DIH blindly passes on all the fields it could obtain from the DB. if some field is missing DIH does not do anything > > On Tue, Feb 2, 2010 at 8:50 AM, Sascha Szott <sz...@zib.de> wrote: >> Hi, >> >> since some of the fields used in your DIH configuration aren't mandatory >> (e.g., keywords and tags are defined as nullable in your db table schema), >> add a default value to all optional fields in your schema configuration >> (e.g., default = ""). Note, that Solr does not understand the db-related >> concept of null values. >> >> Solr's log output >> >> SolrInputDocument[{keywords=keywords(1.0)={Dolce}, name=name(1.0)={Dolce >> & Gabbana D&G Neckties designer Tie for men 543}, >> productID=productID(1.0)={220213}}] >> >> indicates that there aren't any tags or descriptions stored for the item >> with productId 220213. Since no default value is specified, Solr raises an >> error when creating the index document. >> >> -Sascha >> >> Jean-Michel Philippon-Nadeau wrote: >>> >>> Hi, >>> >>> Thanks for the reply. >>> >>> On Tue, 2010-02-02 at 16:57 +0100, Sascha Szott wrote: >>>> >>>> * the output of MySQL's describe command for all tables/views referenced >>>> in your DIH configuration >>> >>> mysql> describe products; >>> >>> +----------------+------------------+------+-----+---------+----------------+ >>> | Field | Type | Null | Key | Default | Extra >>> | >>> >>> +----------------+------------------+------+-----+---------+----------------+ >>> | productID | int(10) unsigned | NO | PRI | NULL | >>> auto_increment | >>> | skuCode | varchar(320) | YES | MUL | NULL | >>> | >>> | upcCode | varchar(320) | YES | MUL | NULL | >>> | >>> | name | varchar(320) | NO | | NULL | >>> | >>> | description | text | NO | | NULL | >>> | >>> | keywords | text | YES | | NULL | >>> | >>> | disqusThreadID | varchar(50) | NO | | NULL | >>> | >>> | tags | text | YES | | NULL | >>> | >>> | createdOn | int(10) unsigned | NO | | NULL | >>> | >>> | lastUpdated | int(10) unsigned | NO | | NULL | >>> | >>> | imageURL | varchar(320) | YES | | NULL | >>> | >>> | inStock | tinyint(1) | YES | MUL | 1 | >>> | >>> | active | tinyint(1) | YES | | 1 | >>> | >>> >>> +----------------+------------------+------+-----+---------+----------------+ >>> 13 rows in set (0.00 sec) >>> >>> mysql> describe product_soldby_vendor; >>> +-----------------+------------------+------+-----+---------+-------+ >>> | Field | Type | Null | Key | Default | Extra | >>> +-----------------+------------------+------+-----+---------+-------+ >>> | productID | int(10) unsigned | NO | MUL | NULL | | >>> | productVendorID | int(10) unsigned | NO | MUL | NULL | | >>> | price | double | NO | | NULL | | >>> | currency | varchar(5) | NO | | NULL | | >>> | buyURL | varchar(320) | NO | | NULL | | >>> +-----------------+------------------+------+-----+---------+-------+ >>> 5 rows in set (0.00 sec) >>> >>> mysql> describe products_vendors_subcategories; >>> >>> +----------------------------+------------------+------+-----+---------+----------------+ >>> | Field | Type | Null | Key | Default | >>> Extra | >>> >>> +----------------------------+------------------+------+-----+---------+----------------+ >>> | productVendorSubcategoryID | int(10) unsigned | NO | PRI | NULL | >>> auto_increment | >>> | productVendorCategoryID | int(10) unsigned | NO | | NULL | >>> | >>> | labelEnglish | varchar(320) | NO | | NULL | >>> | >>> | labelFrench | varchar(320) | NO | | NULL | >>> | >>> >>> +----------------------------+------------------+------+-----+---------+----------------+ >>> 4 rows in set (0.00 sec) >>> >>> mysql> describe products_vendors_categories; >>> >>> +-------------------------+------------------+------+-----+---------+----------------+ >>> | Field | Type | Null | Key | Default | >>> Extra | >>> >>> +-------------------------+------------------+------+-----+---------+----------------+ >>> | productVendorCategoryID | int(10) unsigned | NO | PRI | NULL | >>> auto_increment | >>> | labelEnglish | varchar(320) | NO | | NULL | >>> | >>> | labelFrench | varchar(320) | NO | | NULL | >>> | >>> >>> +-------------------------+------------------+------+-----+---------+----------------+ >>> 3 rows in set (0.00 sec) >>> >>> mysql> describe product_vendor_in_subcategory; >>> +-------------------+------------------+------+-----+---------+-------+ >>> | Field | Type | Null | Key | Default | Extra | >>> +-------------------+------------------+------+-----+---------+-------+ >>> | productVendorID | int(10) unsigned | NO | MUL | NULL | | >>> | productCategoryID | int(10) unsigned | NO | MUL | NULL | | >>> +-------------------+------------------+------+-----+---------+-------+ >>> 2 rows in set (0.00 sec) >>> >>> mysql> describe products_vendors_countries; >>> >>> +------------------------+------------------+------+-----+---------+----------------+ >>> | Field | Type | Null | Key | Default | >>> Extra | >>> >>> +------------------------+------------------+------+-----+---------+----------------+ >>> | productVendorCountryID | int(10) unsigned | NO | PRI | NULL | >>> auto_increment | >>> | name | varchar(50) | NO | | NULL | >>> | >>> | code | varchar(2) | NO | | NULL | >>> | >>> >>> +------------------------+------------------+------+-----+---------+----------------+ >>> 3 rows in set (0.00 sec) >>> >>> mysql> describe product_vendor_shipsto_country; >>> +------------------------+---------+------+-----+---------+-------+ >>> | Field | Type | Null | Key | Default | Extra | >>> +------------------------+---------+------+-----+---------+-------+ >>> | productVendorID | int(11) | NO | MUL | NULL | | >>> | productVendorCountryID | int(11) | NO | MUL | NULL | | >>> +------------------------+---------+------+-----+---------+-------+ >>> 2 rows in set (0.00 sec) >>> >>> >>>> * the DIH configuration file (i.e., data-config.xml) >>> >>> <dataConfig> >>> <dataSource type="JdbcDataSource" >>> driver="com.mysql.jdbc.Driver" >>> url="jdbc:mysql://localhost/giftiniti_api" >>> user="giftiniti_api" >>> password="17i6ARfqJa1K"/> >>> <document> >>> >>> <entity name="products" >>> query="select productID,name,keywords,tags from products"> >>> <field column="productID" name="productID"/> >>> <field column="name" name="name"/> >>> <field column="keywords" name="keywords"/> >>> <field column="tags" name="tags"/> >>> <field column="description" name="description"/> >>> >>> <entity name="soldby" >>> query="select productVendorID,price from >>> product_soldby_vendor where productID='${product.productID}'"> >>> <field column="price" name="price"/> >>> >>> <entity name="insubcategory" >>> query="select productCategoryID from >>> product_vendor_in_subcategory where >>> productVendorID='${soldby.productVendorID}'"> >>> >>> <entity name="subcategories" >>> query="select concat_ws(' ',labelEnglish,labelFrench) >>> as label, productVendorCategoryID from products_vendors_subcategories >>> where productVendorSubcategoryID='${insubcategory.productCategoryID}'"> >>> <field column="label" name="subcategories"/> >>> >>> <entity name="categories" >>> query="select concat_ws(' >>> ',labelEnglish,labelFrench) as label from products_vendors_categories >>> where >>> productVendorCategoryID='${subcategories.productVendorCategoryID}'"> >>> <field column="label" name="categories"/> >>> </entity> >>> >>> </entity> >>> >>> </entity> >>> >>> <entity name="shipsto" >>> query="select productVendorCountryID from >>> product_vendor_shipsto_country where >>> productVendorID='${soldby.productVendorID}'"> >>> <entity name="countries" >>> query="select code from products_vendors_countries >>> where productVendorCountryID='${shipsto.productVendorCountryID}'"> >>> <field column="code" name="countries"/> >>> </entity> >>> </entity> >>> >>> </entity> >>> >>> </entity> >>> >>> </document> >>> </dataConfig> >>> >>>> * the schema definition (i.e., schema.xml) >>> >>> <?xml version="1.0" encoding="UTF-8" ?> >>> <schema name="example" version="1.2"> >>> >>> <types> >>> <fieldType name="string" class="solr.StrField" >>> sortMissingLast="true" omitNorms="true"/> >>> <fieldType name="textgen" class="solr.TextField" >>> positionIncrementGap="100"> >>> <analyzer type="index"> >>> <tokenizer class="solr.WhitespaceTokenizerFactory"/> >>> <filter class="solr.StopFilterFactory" ignoreCase="true" >>> words="stopwords.txt" enablePositionIncrements="true" /> >>> <filter class="solr.WordDelimiterFilterFactory" >>> generateWordParts="1" generateNumberParts="1" catenateWords="1" >>> catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/> >>> <filter class="solr.LowerCaseFilterFactory"/> >>> </analyzer> >>> <analyzer type="query"> >>> <tokenizer class="solr.WhitespaceTokenizerFactory"/> >>> <filter class="solr.SynonymFilterFactory" >>> synonyms="synonyms.txt" ignoreCase="true" expand="true"/> >>> <filter class="solr.StopFilterFactory" >>> ignoreCase="true" >>> words="stopwords.txt" >>> enablePositionIncrements="true" >>> /> >>> <filter class="solr.WordDelimiterFilterFactory" >>> generateWordParts="1" generateNumberParts="1" catenateWords="0" >>> catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/> >>> <filter class="solr.LowerCaseFilterFactory"/> >>> </analyzer> >>> </fieldType> >>> <fieldType name="text_ws" class="solr.TextField" >>> positionIncrementGap="100"> >>> <analyzer> >>> <tokenizer class="solr.WhitespaceTokenizerFactory"/> >>> </analyzer> >>> </fieldType> >>> <fieldType name="text" class="solr.TextField" >>> positionIncrementGap="100"> >>> <analyzer type="index"> >>> <tokenizer class="solr.WhitespaceTokenizerFactory"/> >>> <filter class="solr.StopFilterFactory" >>> ignoreCase="true" >>> words="stopwords.txt" >>> enablePositionIncrements="true" >>> /> >>> <filter class="solr.WordDelimiterFilterFactory" >>> generateWordParts="1" generateNumberParts="1" catenateWords="1" >>> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/> >>> <filter class="solr.LowerCaseFilterFactory"/> >>> <filter class="solr.SnowballPorterFilterFactory" >>> language="English" protected="protwords.txt"/> >>> </analyzer> >>> <analyzer type="query"> >>> <tokenizer class="solr.WhitespaceTokenizerFactory"/> >>> <filter class="solr.SynonymFilterFactory" >>> synonyms="synonyms.txt" ignoreCase="true" expand="true"/> >>> <filter class="solr.StopFilterFactory" >>> ignoreCase="true" >>> words="stopwords.txt" >>> enablePositionIncrements="true" >>> /> >>> <filter class="solr.WordDelimiterFilterFactory" >>> generateWordParts="1" generateNumberParts="1" catenateWords="0" >>> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/> >>> <filter class="solr.LowerCaseFilterFactory"/> >>> <filter class="solr.SnowballPorterFilterFactory" >>> language="English" protected="protwords.txt"/> >>> </analyzer> >>> </fieldType> >>> </analyzer> >>> </fieldType> >>> <fieldType name="double" class="solr.TrieDoubleField" >>> precisionStep="0" omitNorms="true" positionIncrementGap="0"/> >>> >>> </types> >>> >>> <fields> >>> <field name="productID" type="string" indexed="true" stored="true" >>> required="true" /> >>> >>> <field name="name" type="textgen" indexed="true" stored="true"/> >>> <field name="keywords" type="textgen" indexed="true" stored="true"/> >>> <field name="tags" type="textgen" indexed="true" stored="true"/> >>> <field name="description" type="text" indexed="true" stored="true"/> >>> >>> <field name="price" type="double" indexed="true" stored="true"/> >>> >>> <field name="categories" type="text_ws" indexed="true" stored="true" >>> multiValued="true" omitNorms="true" /> >>> <field name="subcategories" type="text_ws" indexed="true" >>> stored="true" multiValued="true" omitNorms="true" /> >>> <field name="countries" type="text_ws" indexed="true" stored="true" >>> multiValued="true" omitNorms="true" /> >>> >>> <!-- Text is a meta-data field containing all fields that are going >>> to be searchable by default --> >>> <field name="text" type="text" indexed="true" stored="false" >>> multiValued="true"/> >>> </fields> >>> >>> <uniqueKey>productID</uniqueKey> >>> >>> <defaultSearchField>text</defaultSearchField> >>> >>> <solrQueryParser defaultOperator="AND"/> >>> >>> <copyField source="name" dest="text"/> >>> <copyField source="keywords" dest="text"/> >>> <copyField source="tags" dest="text"/> >>> <copyField source="description" dest="text"/> >>> >>> </schema> >>> >>> >>> >>> Jean-Michel >>> >> >> > > > > -- > Lance Norskog > goks...@gmail.com >
-- ----------------------------------------------------- Noble Paul | Systems Architect| AOL | http://aol.com