I just configured with the caching and it works mighty fast now.
Instead of unbelievable amount queries it queris only 4 times.
CPU usage has moved from the db to the solr computer but only for a very
short time.

Problem :
I dont see the multi value fields (Inner Entities) anymore
This is  my configuration

<entity name="PackageVersion" pk="PackageVersionId"
 query="select PackageVersion.Id PackageVersionId, .... from ....">
<entity name="PackageTag" pk="ResourceId"
processor="CachedSqlEntityProcessor" where="ResourceId =
'${PackageVersion.PackageId}'"
 query="SELECT [Text] PackageTag from [dbo].[Tag]">
</entity>
<entity name="PackageVersionTag" pk="ResourceId"
processor="CachedSqlEntityProcessor" where="ResourceId =
PackageVersion.PackageVersionId"
 query="SELECT [Text] PackageVersionTag from [dbo].[Tag]">
</entity>
<entity name="LibraryItem" pk="ResourceId"
processor="CachedSqlEntityProcessor" where="Asset.[PackageVersionId] =
PackageVersion.PackageVersionId"
 query="select CatalogVendorPartNum SKU, LibraryItems.[Description]
SKUDescription
FROM ...
 INNER JOIN ...
ON Asset.Id = LibraryVendors.DesignProjectId
INNER JOIN ...
 ON LibraryVendors.LibraryVendorId = LibraryItems.LibraryVendorId
WHERE Asset.[AssetTypeId]=1">
 </entity>
</entity>

Now, when I query
http://localhost:8983/solr/vaultCache/select?q=*&indent=true
it returns only the main entity attriburtes.
Where are my inner entities attributes now?
Thanks a lot.







On Thu, Jun 27, 2013 at 10:15 AM, Gora Mohanty <g...@mimirtech.com> wrote:

> On 27 June 2013 12:32, Mysurf Mail <stammail...@gmail.com> wrote:
> >
> > I have a relational database model
> > This is the basics of my data-config.xml
> >
> > <entity name="MyMainEntity" pk="pID" query="select ... from
> [dbo].[TableA]
> > inner join TableB on ...">
> >  <entity name="Entity1" pk="Id1" query="SELECT [Text] Tag from [Table2]
> > where ResourceId = '${MyMainEntity.pId}'"></entity>
> >                 <entity name="Entity1" pk="Id2" query="SELECT [Text] Tag
> > from [Table2] where ResourceId2 = '${MyMainEntity.pId}'"></entity>
> > <entity name="LibraryItem" pk="ResourceId"
> > query="select SKU
> >  FROM [TableB]
> > INNER JOIN ...
> > ON ...
> >  INNER JOIN ...
> > ON ...
> > WHERE ... AND ...'">
> >  </entity>
> > </entity>
> >
> > Now, this takes a lot of time.
> > 10000 rows in the first query and then each other inner entities are
> > fetched later (around 10 rows each).
> >
> > If I use a db profiler I see a the three inner entities query running
> over
> > and over (3 select sentences than again 3 select sentences over and over)
> > This is really not efficient.
> > And the import can run over 40 hrs ()
> > Now,
> > What are my options to run it faster .
> > 1. Obviously there is an option to flat the tables to one big table - but
> > that will create a lot of other side effects.
> >     I would really like to avoid that extra effort and run solr on my
> > production relational tables.
> >     So far it works great out of the box and I am searching here if there
> > is a configuration tweak.
> > 2. If I will flat the rows that - does the schema.xml need to be change
> > too? or the same fields that are multivalued will keep being multivalued.
>
> You have not shared your actual queries, so it is difficult
> to tell, but my guess would be that it is the JOINs that
> are the bottle-neck rather than the SELECTs. You should
> start by:
> 1. Profile queries from the database back-end to see
>     which are taking the most time, and try to simplify
>     them.
> 2. Make sure that relevant database columns are indexed.
>     This can make a huge difference, though going overboard
>      in indexing all columns might be counter-productive.
> 3. Use Solr DIH's CachedSqlEntityProcessor:
>     http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityProcessor
> 4. Measure the time that Solr indexing takes: From your
>     description, you seem to be guessing at it.
>
> In general, you should not flatten the records in the
> database as that is supposed to be relational data.
>
> Regards,
> Gora
>

Reply via email to