At least from what you have said, it doesn't sound as if a lot of updating (as opposed to adding new documents) would be needed. I suspect that a lot of users would rather enter a simple patient ID anyway (with a separate name lookup capability).

I mean, once a "visit" is complete, is it really updated that much? Ditto for a study, and an image.

In short, I still don't see any problem here, provided that you denormalized only the patient ID and not all the patient metadata. Or maybe this is simply a "Phantom SQL" or "grieving" problem: wishing you could do things exactly as they were done in SQL. It may simply take time for you to finish "grieving" about Solr not being SQL. Focus on exploiting Solr's strengths rather than obsessing over its differences from SQL.
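
As a rough sketch of what "denormalize only the patient ID" could mean in practice (plain Python dicts stand in for Solr documents here; the field names, the sample names, and both helper functions are made up for illustration and are not from the actual schema):

```python
# Hypothetical layout: patient metadata lives in ONE patient document,
# and every image document carries only the patient ID.
patient_docs = {
    "55": {"id": "patient-55", "type": "patient", "patient_id": "55", "name": "Jane Doe"},
}
image_docs = [
    {"id": f"image-{i}", "type": "image", "patient_id": "55", "image_text": "fiji"}
    for i in range(4000)
]


def rename_patient(patient_id, new_name):
    """Change patient metadata; return how many documents need reindexing."""
    patient_docs[patient_id]["name"] = new_name
    return 1  # only the patient document changes, no image document is touched


def images_for_name(name):
    """Two-step lookup: resolve the name to patient IDs, then filter images by ID."""
    ids = {d["patient_id"] for d in patient_docs.values() if d["name"] == name}
    return [d for d in image_docs if d["patient_id"] in ids]


touched = rename_patient("55", "Jane Smith")
matches = images_for_name("Jane Smith")
print(touched, len(matches))
```

The trade-off is that a search by patient name becomes two lookups (name to ID, then ID to images) instead of one, in exchange for a patient-level change touching a single document.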

I mean, you were going to have more than 4,000 total rows in your visit, study, and image tables anyway.

-- Jack Krupansky

-----Original Message-----
From: zbindigonzales
Sent: Monday, November 26, 2012 9:08 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr Near Realtime with denormalized Data

Hello again

The problem is that the software is used in different fields. The table
schema for hospital software is not the same as in the industrial sectors.

Customers usually create their own schema. The worst-case scenario that we
know of is four connected tables.

Table patient --> visit --> study --> image

So if a patient has 10 visits, each visit has 10 studies, and each study has
40 images, then we would need to update 4000 documents just because some
values changed in the patient row.
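
The arithmetic behind that figure, as a small sketch (the variable names are only for illustration; the counts are the ones from the example above):

```python
# Fan-out of one patient-level change when patient fields are copied
# into every image document.
visits_per_patient = 10
studies_per_visit = 10
images_per_study = 40

study_count = visits_per_patient * studies_per_visit   # studies per patient
image_count = study_count * images_per_study           # image documents per patient

# Rows across the visit, study, and image tables for this one patient.
total_rows = visits_per_patient + study_count + image_count

print(image_count, total_rows)
```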

Example queries are quite difficult because they change from customer to
customer. But a normal query would look like this:

query:q=(patient___idpatient__:"55" AND ((image_user__:*admin*) AND
((image_double_:{* TO 0.1} OR image_double_:{99.1 TO *})) AND
(image_text:*fiji*)))

But basically our customers can define their search fields on their own.

I already tried out the JOIN capability, but I couldn't find out how to join
over more than one table. I think denormalizing is a better solution than
trying to join the tables during the query.

What I had in mind was some kind of reference field or something,
so that in an image document you could refer to the connected patient fields.
But I don't know if something like that exists.

What I am now trying is to reduce the number of updated fields. This will speed up the
delta import time, but I am not sure if this is the "best practice".

Regards Sandro






--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Near-Realtime-with-denormalized-Data-tp4022072p4022351.html
Sent from the Solr - User mailing list archive at Nabble.com.
