Thanks, Doug. I now think it's better to customize ManifoldCF's output connector for Solr.
Sreenivas

On Thu, Dec 7, 2017 at 10:01 AM Doug Turnbull <dturnb...@opensourceconnections.com> wrote:

> A tokenizer plugin is probably not what you want. You probably want
> something more like an UpdateProcessor that can manipulate the whole
> document as it comes into Solr. Or you may want to avoid having a Solr
> plugin call an API and do this work outside of Solr (for example, what
> happens when the API is down? Should doc updates fail?).
>
> A tokenizer plugin would definitely not be recommended. Tokenizers need to
> be fast, low-level code that splits up text into tokens based on readily
> accessible config and data. The overhead of a network call would be far too
> high.
>
> You probably want to put your extracted tags into a different field anyway,
> and a tokenizer only works on text within a single field.
>
> -Doug
>
> On Wed, Dec 6, 2017 at 10:57 PM Sreenivas.T <sree...@gmail.com> wrote:
>
> > All,
> >
> > I need help from experts. We are trying to build a cognitive search
> > platform with enterprise content from sources like SharePoint, file
> > shares, etc. Before content is indexed into Solr, I need to call our
> > internal AI platform to get additional metadata such as classification
> > tags.
> >
> > I'm planning to use ManifoldCF to fetch the content from the sources,
> > and to write a custom tokenizer plugin that sends the content to the AI
> > platform, which in turn returns additional tags. I'll index the
> > additional tags dynamically through plugin code.
> >
> > Is this a feasible solution? Is there any other way to achieve the same?
> > I was planning not to customize ManifoldCF.
> >
> > Please suggest.
> >
> > Regards,
> > Sreenivas
>
> --
> Consultant, OpenSource Connections. Contact info at
> http://o19s.com/about-us/doug-turnbull/; Free/Busy (
> http://bit.ly/dougs_cal)
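
For illustration, the "do this work outside of Solr" option Doug describes, including his question about what should happen when the API is down, might look roughly like the sketch below. Everything named here is an assumption and not from the thread: `enrich_document` and `fetch_tags` are hypothetical names, the tags go into a separate field `ai_tags_ss`, and a flag field `ai_enriched_b` records whether enrichment succeeded (both names assume dynamic-field rules for `*_ss` and `*_b` exist in the schema). The policy shown, indexing the document without tags when the call fails, is only one possible answer to Doug's question.

```python
from typing import Any, Callable, Dict, List


def enrich_document(
    doc: Dict[str, Any],
    fetch_tags: Callable[[str], List[str]],
    text_field: str = "content",
) -> Dict[str, Any]:
    """Return a copy of doc with AI tags added in a separate field.

    fetch_tags is a hypothetical stand-in for the call to the internal
    AI platform; it takes the document text and returns a list of tags.
    """
    enriched = dict(doc)  # leave the caller's document untouched
    try:
        tags = fetch_tags(str(doc.get(text_field, "")))
        # Assumed field names; tags live in their own multi-valued field.
        enriched["ai_tags_ss"] = sorted(set(tags))
        enriched["ai_enriched_b"] = True
    except Exception:
        # Assumed policy: if the AI platform is unavailable, still index
        # the document and flag it so it can be re-enriched later.
        enriched["ai_enriched_b"] = False
    return enriched
```

The same function could be called from a customized ManifoldCF output connector or from any other pipeline step that runs before the document is sent to Solr. If failed enrichment should block indexing instead, the `except` branch would re-raise rather than set the flag.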