Re: Time-Routed Alias Not Distributing Wrongly Placed Docs

Jason Gerlowski Wed, 28 Nov 2018 09:24:38 -0800

Hi John,

I'm not an expert on Time-Routed Aliases (TRA), but I don't think so.
The TRA functionality I'm familiar with involves creating and deleting
the underlying collections and routing incoming documents to them
based on their timestamp.  As far as I know that routing happens at
the UpdateRequestProcessor level, so once your data is indexed there's
nothing available to move it around.
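
To make the "routing happens at index time" point concrete, here is a
minimal Python sketch of how one could work out which collection a
document's timestamp belongs to, and so list the documents sitting in
the wrong place.  This is not Solr's code: the coll_ prefix and the
monthly dates are taken from the listing in your message, and the
function names are made up for illustration.

```python
from bisect import bisect_right
from datetime import datetime

PREFIX = "coll_"  # assumed naming, copied from the listing in the quoted message


def collection_start(name):
    """Parse the start date encoded in a name such as coll_2018-07-01."""
    return datetime.strptime(name[len(PREFIX):], "%Y-%m-%d")


def target_collection(collections, doc_time):
    """Return the collection whose time slice covers doc_time.

    Returns None when doc_time is earlier than every collection's start.
    """
    ordered = sorted(collections, key=collection_start)
    starts = [collection_start(c) for c in ordered]
    idx = bisect_right(starts, doc_time) - 1
    return ordered[idx] if idx >= 0 else None


def misplaced(collections, indexed_in, docs):
    """Yield (doc_id, target) for docs that belong somewhere other than indexed_in.

    docs is an iterable of (doc_id, datetime) pairs (hypothetical shape).
    """
    for doc_id, doc_time in docs:
        target = target_collection(collections, doc_time)
        if target is not None and target != indexed_in:
            yield doc_id, target
```

Documents flagged this way would still have to be re-indexed through
the alias name and removed from the first collection by whatever
client you use; as far as I know nothing in TRA does that for you.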


Best,

Jason
On Tue, Nov 27, 2018 at 12:42 PM John Nashorn <nashornj...@gmail.com> wrote:
>
> Hello Everyone,
> I'm using "hive-solr" from Lucidworks to index my data into Solr (v:7.5, 
> cloud mode). As written in the Solr Manual, TRA expects documents to be 
> indexed using its alias name, and not directly into the collections under it. 
> Unfortunately, hive-solr doesn't allow using TRA names as indexing targets. 
> So what I do is: I index data using the first collection created by TRA and 
> expect Solr to distribute my data into its respective collection under the 
> hood. This works to some extent, but a large portion of the data stays where 
> it was indexed, i.e. in the first collection of the TRA. For example 
> (approximate numbers):
>
> * coll_2018-07-01 => 800.000.000 docs
> * coll_2018-08-01 => 0 docs
> * coll_2018-09-01 => 0 docs
> * coll_2018-10-01 => 150.000.000 docs
> * coll_2018-11-01 => 0 docs
>
> Here, coll_2018-07-01 contains data that should normally be in the other four 
> collections.
>
> Is there a way to make TRA scan for (somewhat intentionally) misplaced data 
> and send it to the correct collections?
