Re: Solr Near Realtime with denormalized Data

Jack Krupansky Mon, 26 Nov 2012 06:48:30 -0800

At least from what you have said, it doesn't sound as if a lot of updating (as opposed to adding new documents) would be needed. I suspect that a lot of users would rather enter a simple patient ID anyway (with a separate name lookup capability).

I mean, once a "visit" is complete, is it really updated that much? Ditto for a study, and an image.

In short, I still don't see any problem here - provided that you de-normalized only the patient ID and not all the patient metadata. Or maybe this is simply a "Phantom SQL" or "grieving" problem - wishing you could do things exactly as they were done in SQL. It may simply take time for you to finish "grieving" about Solr not being SQL. Focus on exploiting Solr's strengths rather than obsessing over its differences from SQL.

I mean, you were going to have more than 4,000 total rows in your visit, study, and image tables anyway.
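
To make the point concrete, here is a rough sketch in plain Python (no Solr calls) that compares the two layouts for the worst case quoted below: 10 visits, 10 studies per visit, 40 images per study. The field names `id`, `patient_id`, and `patient_name` and both helper functions are hypothetical, chosen only for illustration; they are not taken from anyone's actual schema.

```python
# Hypothetical sizes, taken from the worst case described in this thread.
VISITS_PER_PATIENT = 10
STUDIES_PER_VISIT = 10
IMAGES_PER_STUDY = 40


def image_docs(patient_id, patient_name=None):
    """Yield one flat image document per image of a single patient.

    If patient_name is given, it is copied into every document (full
    de-normalization); otherwise only the patient ID is carried.
    """
    for v in range(VISITS_PER_PATIENT):
        for s in range(STUDIES_PER_VISIT):
            for i in range(IMAGES_PER_STUDY):
                doc = {
                    "id": f"{patient_id}-{v}-{s}-{i}",  # assumed unique key
                    "patient_id": patient_id,
                }
                if patient_name is not None:
                    doc["patient_name"] = patient_name
                yield doc


def docs_to_reindex_on_rename(docs):
    """Count the documents that hold a copy of the patient name."""
    return sum(1 for d in docs if "patient_name" in d)


full = list(image_docs("55", patient_name="Doe"))
id_only = list(image_docs("55"))

print(len(full), docs_to_reindex_on_rename(full))
print(len(id_only), docs_to_reindex_on_rename(id_only))
```

Both layouts index the same number of image documents. The difference is that a change to the patient row touches every one of them when the metadata is copied in, and none of them when only the ID is carried (the metadata would then live in one separate patient record, wherever that is kept).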


-- Jack Krupansky

-----Original Message-----
From: zbindigonzales

Sent: Monday, November 26, 2012 9:08 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr Near Realtime with denormalized Data

Hello again

The problem is that the software is used in different fields. The table
schema for hospital software is not the same as in the industrial sectors.

Customers usually create their own schema. The worst-case scenario that we
know of is four connected tables.

Table patient --> visit --> study --> image

So if a patient has 10 visits, each visit has 10 studies, and each study has
40 images, then we would need to update 4000 documents just because some
values changed in the patient row.

Example queries are quite difficult to give because they change from customer
to customer. But a normal query would look like this:

query:q=(patient___idpatient__:"55" AND ((image_user__:*admin*) AND
((image_double_:{* TO 0.1} OR image_double_:{99.1 TO *})) AND
(image_text:*fiji*)))

But basically our customers can define their search fields on their own.
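
As a rough illustration of how a query of that shape might be assembled from customer-defined fields, here is a small Python sketch. The function `build_query` and its parameters are hypothetical; only the field names come from the example query. It does no escaping of special characters in the input values, which a real implementation would need.

```python
from urllib.parse import urlencode


def build_query(patient_id, user_substring, low, high, text_substring):
    """Assemble a Solr query string shaped like the example query.

    Hypothetical helper: values are inserted as-is, without escaping.
    """
    clauses = [
        f'patient___idpatient__:"{patient_id}"',
        # Substring match via leading and trailing wildcards.
        f"image_user__:*{user_substring}*",
        # Exclusive ranges: below `low` or above `high`.
        f"(image_double_:{{* TO {low}}} OR image_double_:{{{high} TO *}})",
        f"image_text:*{text_substring}*",
    ]
    return " AND ".join(clauses)


q = build_query("55", "admin", 0.1, 99.1, "fiji")
print(q)
# URL-encoded form, suitable for the q parameter of a select request.
print(urlencode({"q": q}))
```

The flat `AND` chain is logically the same as the nested parentheses in the example; only the inner `OR` needs its own group.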

I already tried out the JOIN capability but I couldn't find out how to join
over more than one table. I think denormalizing is a better solution than
trying to join the tables during the query.

What I had in mind was some kind of reference fields or something,
so that in an image document you could refer to the connected patient fields.
But I don't know if something like that exists.

What I am now trying is to reduce the number of updated fields. This will
speed up the delta import time, but I am not sure if this is the "best practice".

Regards Sandro

--

View this message in context: http://lucene.472066.n3.nabble.com/Solr-Near-Realtime-with-denormalized-Data-tp4022072p4022351.html
Sent from the Solr - User mailing list archive at Nabble.com.
