I have a project where the client wants to store time series data (maybe in SOLR, if it can work). We want to store daily "prices" over the last 20 years (about 6,000 values with associated dates) for up to 500,000 entities.
This data currently exists in a SQL database, but access to SQL is too slow for the client's needs at this point. The requirement is to fetch up to 6,000 daily prices for an entity and render a chart in real time on a web page.

One way we can do it is to generate one document for every daily price, per entity, so we would have 500,000 * 6,000 = 3 billion docs in SOLR. We created a simple proof of concept with 10 million documents and it works perfectly, but I assume up to 3 billion small documents is too much for a single index. What is the hard limit on the total # of documents you can put into a SOLR index (regardless of memory, disk space, etc.)? The good thing about this approach is that it works fine using the existing data import handler for SQL. I know we can shard the index per entity using some hash, but I want to know what the upper limit per index is.

Another way is to store each set of 6,000 prices as some blob (maybe JSON) in a single field, and have one document per entity (500,000 documents). That would work, but there is no way to do this using the existing data import handler, correct? If possible I don't want to develop a custom import handler or data loader unless I absolutely have to. Is there some template function or something available in the current DIH features to make this work? (I have put rough sketches of both approaches below.)

Thanks,
Bob
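
P.S. For reference, the DIH config we use in the one-document-per-price proof of concept is something along these lines; I have simplified it and made up the table/column names here:

<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://dbhost/prices"
              user="solr" password="..."/>
  <document>
    <!-- one Solr doc per (entity, date) row; id is a composite key -->
    <entity name="daily_price"
            query="SELECT CONCAT(entity_id, '_', price_date) AS id,
                          entity_id, price_date, price
                   FROM daily_prices">
      <field column="id" name="id"/>
      <field column="entity_id" name="entity_id"/>
      <field column="price_date" name="price_date"/>
      <field column="price" name="price"/>
    </entity>
  </document>
</dataConfig>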
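
The kind of per-entity document I have in mind for the second approach would look roughly like this (field names and the JSON layout are just illustrative, nothing is decided yet):

<add>
  <doc>
    <field name="id">entity_12345</field>
    <!-- single stored field holding all ~6,000 date/price pairs as one JSON string -->
    <field name="prices_json">[{"date":"2005-01-03","price":10.25},
                               {"date":"2005-01-04","price":10.31}, ...]</field>
  </doc>
</add>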