Hi,
What's the correct way to create index(es) using denormalization?
1. Something like this, with nested entities?
<entity name="solr_publisher" query="select id, name from publishers">
<entity name="solr_journal" query="select id, name as j_name from
journals WHERE publisher_id='${solr_publisher.id}'">
<entity name="solr_articles" query="select id, title, abstract from
articles WHERE journal_id='${solr_journal.id}'">
<entity name="solr_authors" query="select given_name, last_name from
authors WHERE article_id='${solr_articles.id}'"/>
</entity>
</entity>
</entity>
Or even a single entity fed by one SQL join:
<entity name="solr_publisher" query="select a.x, a.y, j.z from
articles a inner join journals j [...]"/>
2. Or a different index for each SQL table?
-> If yes, how can I then retrieve all the needed data (i.e. the
intersection)? With a JOIN or with streaming expressions?
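To make option 1 concrete, here is a minimal Python sketch of the flat document I have in mind: one document per article, with the journal and publisher names copied into it. The function name, the "publisher_name" and "author_names" fields, and the id prefix are hypothetical; only title, abstract, j_name, given_name and last_name come from the queries above.

```python
def denormalize(publisher, journal, article, authors):
    """Flatten one article and its parent rows into a single flat document.

    All inputs are plain dicts standing in for SQL rows (an assumption for
    this sketch); the output is the kind of document one would send to Solr.
    """
    return {
        # Hypothetical unique key: one document per article.
        "id": f"article-{article['id']}",
        "title": article["title"],
        "abstract": article["abstract"],
        # Parent data is duplicated into every article document.
        "j_name": journal["name"],
        "publisher_name": publisher["name"],
        # Multi-valued field: one entry per author of the article.
        "author_names": [
            f"{a['given_name']} {a['last_name']}" for a in authors
        ],
    }
```

With this layout the journal and publisher names are repeated in every article document, which is the duplication mentioned below.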
I have more than 68 million articles, each of which is linked to one
journal and one publisher. Eight different services request the data,
so I cannot really provide one specific use case; I'd like a more
general answer.
But in general, would it be better/faster to query:
- a single denormalized index with all the data in one place (but a
larger index because of the duplicated data), or
- several indexes (smaller indexes, but requiring a Solr "join")?
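As a rough illustration of the two query shapes I am comparing, here is a small Python sketch that only builds the request parameters. The field names (j_name, journal_id, name) and the "journals" collection name are assumptions for this example; the {!join} syntax is Solr's join query parser, which, as far as I understand, filters the queried index but does not return fields from the joined one.

```python
def single_index_params(journal_name):
    """Query a denormalized article index: everything is in one document."""
    return {
        "q": f'j_name:"{journal_name}"',
        # Journal and publisher fields can be returned directly.
        "fl": "id,title,j_name,publisher_name",
    }


def join_params(journal_name, from_index="journals"):
    """Query a separate article index, filtered through a journal index.

    Assumes articles carry a journal_id field and journals have id and name
    fields (hypothetical names for this sketch).
    """
    join = f"{{!join from=id to=journal_id fromIndex={from_index}}}"
    return {
        "q": f'{join}name:"{journal_name}"',
        # Only article-side fields are available in the result.
        "fl": "id,title",
    }
```

Either dict could be passed as the query parameters of a request to the /select handler of the article index.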
I got good tips about using 'Streaming expressions' & 'Parallel SQL
interface', but I first want to know the best way to store the data.
Kind regards,
Bastien Latard
Web engineer
--
MDPI AG
Postfach, CH-4005 Basel, Switzerland
Office: Klybeckstrasse 64, CH-4057
Tel. +41 61 683 77 35
Fax: +41 61 302 89 18
E-mail:
lat...@mdpi.com
http://www.mdpi.com/