On Wed, May 13, 2020 at 7:24 AM Bernd Fehling <bernd.fehl...@uni-bielefeld.de> wrote:
>
> Thanks Erick for your answer.
>
> I was thinking too complex and seeing problems which are not there.
>
> I have your second scenario. The first huge collection still remains
> and will grow further, while the second will start with the same schema
> but content from a new source. Sure, I could also load the content
> from the new source into the first huge collection, but I want to
> keep source, loading, and maintenance handling separated.
> Maybe I will also start the new collection on a new instance.
>
> Regards
> Bernd
>
> On 13.05.20 at 13:40, Erick Erickson wrote:
> > So a doc in your new collection is expected to supersede a doc
> > with the same ID in the old one, right?
> >
> > What I’d do is delete the IDs from my old collection as they are added
> > to the new one; there’s not much use in keeping both if you always want
> > the new one.
> >
> > Let’s assume you do this. The next issue is making sure all of the docs
> > in the new collection are deleted from the old one, and your process
> > will inevitably have a hiccough or two. You could periodically use
> > streaming to produce a list of IDs common to both collections, and have
> > a cleanup process that you run occasionally to make up for any glitches
> > in the normal delete-from-the-old-collection process, see:
> > https://lucene.apache.org/solr/guide/6_6/stream-decorators.html#stream-decorators
> >
> > If that’s not the case, then having the same ID in the different
> > collections doesn’t matter. Solr doesn’t use the ID for combining
> > results, just for routing and then updating.
> >
> > This is illustrated by the fact that, through user error, you can even
> > get the same document repeated in a result set if it gets indexed to
> > two different shards.
> >
> > And if neither of those is on target, what about “handling” unique IDs
> > across two collections do you think might go wrong?
> >
> > Best,
> > Erick
> >
> >> On May 13, 2020, at 4:26 AM, Bernd Fehling
> >> <bernd.fehl...@uni-bielefeld.de> wrote:
> >>
> >> Dear list,
> >>
> >> in my SolrCloud 6.6 I have a huge collection, and now I will get
> >> much more data from a different source to be indexed.
> >> So I'm thinking about creating a new collection and combining both,
> >> the existing one and the new one, with an alias.
> >>
> >> But how do I handle the unique key across collections within a
> >> datacenter? Is it possible at all?
> >>
> >> I don't see any problems with add, update, and delete of documents,
> >> because these operations do not use the alias.
> >>
> >> But searching across collections with an alias and then fetching
> >> documents by ID from the result may lead to results where the ID is in
> >> both collections?
> >>
> >> I have no idea, but there are SolrClouds with a lot of collections out
> >> there.
> >> How do they handle uniqueness across collections within a datacenter?
> >>
> >> Regards
> >> Bernd
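
For anyone finding this thread later, the periodic cleanup Erick describes could be sketched roughly as below. This is only an illustration, not something posted in the thread: it assumes the unique key field is named "id" and has docValues (needed by the /export handler), that the intersect() decorator and the /stream handler are available on your cluster, and the function names, collection names, and base URL are made up for the example.

```python
import json
import urllib.parse
import urllib.request


def common_ids_expr(old, new, id_field="id"):
    """Build a streaming expression that emits IDs present in both collections.

    intersect() expects both inner streams sorted on the `on` field,
    so each search() sorts by the ID field.
    """
    def search(coll):
        return (f'search({coll}, q="*:*", fl="{id_field}", '
                f'sort="{id_field} asc", qt="/export")')
    return f'intersect({search(old)}, {search(new)}, on="{id_field}")'


def ids_from_stream_response(body, id_field="id"):
    """Extract IDs from a parsed /stream response, skipping the EOF marker."""
    docs = body["result-set"]["docs"]
    for doc in docs:
        if "EXCEPTION" in doc:
            raise RuntimeError(doc["EXCEPTION"])
    return [doc[id_field] for doc in docs if id_field in doc]


def fetch_common_ids(solr_url, old, new, id_field="id"):
    """POST the expression to the old collection's /stream handler.

    solr_url is a hypothetical base URL such as http://localhost:8983/solr.
    """
    data = urllib.parse.urlencode(
        {"expr": common_ids_expr(old, new, id_field)}).encode()
    with urllib.request.urlopen(f"{solr_url}/{old}/stream", data=data) as resp:
        return ids_from_stream_response(json.load(resp), id_field)


def delete_payload(ids):
    """JSON body for a delete-by-ID update against the old collection."""
    return json.dumps({"delete": list(ids)})
```

The resulting payload would be posted to the old collection's /update handler with Content-Type application/json, followed by a commit. For very large overlaps it is probably wiser to send the deletes in batches.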