Hi,

Is your data partitioned in such a way that it would make sense to split it into multiple collections and arrange to keep only a few collections live at a time, loading index files from S3 on demand?

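To make the "few live collections" idea concrete, here is a rough Python sketch of a least-recently-used policy. It is only an illustration under assumptions: the class name and the `load`/`unload` callbacks are hypothetical placeholders, not Solr APIs. In a real setup `load` would have to fetch the index files from S3 and bring the collection online, and `unload` would take it offline and free local disk.

```python
from collections import OrderedDict
from typing import Callable, List


class LiveCollectionCache:
    """Keep at most `max_live` collections loaded, evicting the least recently used."""

    def __init__(self, max_live: int,
                 load: Callable[[str], None],
                 unload: Callable[[str], None]) -> None:
        if max_live < 1:
            raise ValueError("max_live must be at least 1")
        self.max_live = max_live
        self._load = load      # hypothetical: pull index files from S3, bring collection online
        self._unload = unload  # hypothetical: take collection offline, free local disk
        self._live: "OrderedDict[str, None]" = OrderedDict()

    def acquire(self, name: str) -> None:
        """Ensure collection `name` is live before a query is routed to it."""
        if name in self._live:
            self._live.move_to_end(name)  # mark as most recently used
            return
        # Evict least recently used collections until there is room for one more.
        while len(self._live) >= self.max_live:
            victim, _ = self._live.popitem(last=False)
            self._unload(victim)
        self._load(name)
        self._live[name] = None

    def live(self) -> List[str]:
        """Return live collection names, least recently used first."""
        return list(self._live)
```

Whether this pays off depends on how long a cold load takes and on how strongly queries concentrate on a small set of collections.
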
I cannot see how an S3 directory would be able to cache files stored in S3 effectively, or in what units the index files would be stored.

Have you investigated EFS as an alternative? It would look like a normal filesystem to Solr and might be cheaper storage-wise, though much slower.

Jan

> On 23 Apr 2020, at 06:57, dhurandar S <dhurandarg...@gmail.com> wrote:
>
> Hi,
>
> I am looking to use S3 as the place to store indexes. Just how Solr uses
> HdfsDirectory to store the index and all the other documents.
>
> We want to provide a search capability that is okay to be a little slow but
> cheaper in terms of the cost. We have close to 2 petabytes of data on which
> we want to provide the Search using Solr.
>
> Are there any open-source implementations around using S3 as the Directory
> for Solr ??
>
> Any recommendations on this approach?
>
> regards,
> Rahul