> what am I missing?

Change your

    <entity name="category" query="select cfcr.feedId ...">

to

    <entity name="category" transformer="RegexTransformer" query="select cfcr.feedId ..">

The "splitBy" directive is understood by this transformer; in your case,
with no transformer declared on the entity, the attribute was simply
ignored.
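
As a rough sketch, the inner entity from the quoted config might then look
like the fragment below. Two things in it are assumptions rather than part
of the original message: the attribute is written as splitBy (the quoted
config spells it splityBy, which the transformer would not recognize
either), and the dangling AND before "group by" is left out because the
condition it belonged to is not shown.

```xml
<!-- Sketch only: inner "category" entity with the transformer declared. -->
<entity name="category"
        transformer="RegexTransformer"
        query="select cfcr.feedId,
               group_concat(cfcr.categoryType) as categoryType
               from CFR cfcr
               where cfcr.feedId = '${item.id}'
               group by cfcr.feedId">
  <!-- splitBy is handled by RegexTransformer: the comma-joined column
       value is split into separate values for the multiValued field. -->
  <field column="categoryType" name="categoryType" splitBy="," />
</entity>
```

With this in place, a value such as TOPIC,LANGUAGE should come out as two
separate <str> entries in the categoryType array.
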
Don't forget to re-index once you have made the change.

Cheers
Avlesh

On Fri, Oct 30, 2009 at 9:33 PM, Joel Nylund <jnyl...@yahoo.com> wrote:

> Thanks Chantal, I will keep that in mind for tuning.
>
> For SQL, I figured out a way to combine them into one row using concat,
> but I still seem to be having an issue splitting them.
>
> The DB now returns them as one column, categoryType:
> TOPIC,LANGUAGE
>
> But in my Solr result, the items in categoryType all seem to be within
> one str. I would expect them to be multiple strings within the array.
> Is this assumption wrong?
>
> <doc>
>   <arr name="categoryType">
>     <str>TOPIC,LANGUAGE</str>
>   </arr>
>   <str name="id">40</str>
>   <str name="title">feed title</str>
> </doc>
>
> Here is my import:
>
> <document name="doc">
>   <entity name="item"
>           query="SELECT f.id, f.title
>                  FROM Feed f">
>     <field column="id" name="id" />
>     <field column="title" name="title" />
>     <entity name="category"
>             query="select cfcr.feedId,
>                    group_concat(cfcr.categoryType) as categoryType
>                    from CFR cfcr
>                    where
>                    cfcr.feedId = '${item.id}'
>                    AND
>                    group by cfcr.feedId">
>       <field column="categoryType"
>              name="categoryType" splityBy="," />
>     </entity>
>
>   </entity>
>
> In schema:
>
> <field name="categoryType" type="text" indexed="true" stored="true"
>        required="false" multiValued="true"/>
> <field name="categoryName" type="text" indexed="true" stored="true"
>        required="false" multiValued="true"/>
>
> what am I missing?
>
> thanks
> Joel
>
> On Oct 30, 2009, at 10:00 AM, Chantal Ackermann wrote:
>
>> That depends a bit on your database, but it is tricky and might not be
>> performant.
>>
>> If you are more of a Java developer, you might prefer retrieving multiple
>> rows per Solr document from your dataSource (join your category and main
>> tables) and aggregating them in a custom EntityProcessor. I got far(!)
>> better performance retrieving everything in one query and doing the
>> aggregation in Java.
>> But this, of course, depends on your table structure and data.
>>
>> Noble Paul helped me with the custom EntityProcessor, and it turned out
>> to be quite easy. Have a look at the thread on this mailing list
>> (SOLR-USER) with the heading:
>> DataImportHandler / Import from DB : one data set comes in multiple rows
>>
>> Cheers,
>> Chantal
>>
>> Joel Nylund wrote:
>>
>>> Thanks, but I'm confused about how I can aggregate across rows. I don't
>>> know of any easy way to get my DB to return one row for all the
>>> categories (given the hint from your other email). I have split the
>>> category query into a separate entity, but it's returning multiple rows.
>>> How do I combine multiple rows into 1 index entity?
>>>
>>> thanks
>>> Joel
>>>
>>> On Oct 29, 2009, at 8:58 PM, Avlesh Singh wrote:
>>>
>>>>> In the database this is modeled as a 1-N where the category table has
>>>>> the mapping of feed to category.
>>>>> I need to be able to query: give me all the feeds in any given
>>>>> category.
>>>>> How can I best model this in Solr?
>>>>> Seems like a multiValued field might help, but how would I populate
>>>>> it, and would the query above work?
>>>>
>>>> Yes, you are right. A multivalued field for "categories" is the answer.
>>>>
>>>> For populating it in the index:
>>>>
>>>> 1. If you use DIH to populate your indexes and your datasource is a
>>>>    database, then you can use DIH's RegexTransformer on an aggregated
>>>>    list of categories. E.g. if your database query returns "a,b,c,d"
>>>>    in a column called "db_categories", this is how you would put it in
>>>>    DIH's data-config file:
>>>>    <field column="db_categories" name="categories" splitBy="," />
>>>> 2. If you "add" documents to Solr yourself, multiple values for the
>>>>    field can be specified as an array or list of values in the
>>>>    SolrInputDocument.
>>>>
>>>> A multivalued field provides the same faceting and searching
>>>> capabilities as regular fields. There is no special syntax.
>>>>
>>>> Cheers
>>>> Avlesh
>>>>
>>>> On Fri, Oct 30, 2009 at 4:55 AM, Joel Nylund <jnyl...@yahoo.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have one index so far, which contains feeds. I have been able to
>>>>> de-normalize several tables and map this data onto the feed entity.
>>>>> There is one tricky problem that I need help with.
>>>>>
>>>>> Feeds have 1 to many categories.
>>>>>
>>>>> So let's say we have Category1, Category2 and Category3:
>>>>>
>>>>> Feed 1 is in Category1
>>>>> Feed 2 is in Category2 and Category3
>>>>> Feed 3 is in Category2
>>>>> Feed 4 has no category
>>>>>
>>>>> In the database this is modeled as a 1-N where the category table has
>>>>> the mapping of feed to category.
>>>>>
>>>>> I need to be able to query: give me all the feeds in any given
>>>>> category.
>>>>>
>>>>> How can I best model this in Solr?
>>>>>
>>>>> Seems like a multiValued field might help, but how would I populate
>>>>> it, and would the query above work?
>>>>>
>>>>> thanks
>>>>> Joel