Yes, from a utilitarian perspective you're absolutely right. Mine is really more of an academic exercise.
I will be clearer about the steps that I would like to take:

1) Call the Solr analyzer, which returns an XML response in the following format (just a snippet as an example):

    <lst name="attributeNames">
      <lst name="index">
        <lst name="incomingArc|1.6 outgoingArc|1.6">
          <arr name="org.apache.lucene.analysis.WhitespaceTokenizer">
            <lst>
              <str name="text">incomingArc|1.6</str>
              <str name="type">word</str>
              <int name="start">0</int>
              <int name="end">15</int>
              <int name="position">1</int>
            </lst>
            <lst>
              <str name="text">outgoingArc|1.6</str>
              <str name="type">word</str>
              <int name="start">16</int>
              <int name="end">31</int>
              <int name="position">2</int>
            </lst>
          </arr>
          <arr name="org.apache.lucene.analysis.payloads.DelimitedPayloadTokenFilter">
            <lst>
              <str name="text">incomingArc</str>
              <str name="type">word</str>
              <int name="start">0</int>
              <int name="end">15</int>
              <int name="position">1</int>
              <str name="payload">org.apache.lucene.index.Payload:org.apache.lucene.index.Payload@ffe807d2</str>
            </lst>
            <lst>
            etc.....

2) Now I would like to be able to extract the info that I need from there and tell Solr directly what to index, including which tokens go with which payloads, without any further analysis being performed.

I know that Solr does all of this internally, starting from the original text, but is there a way to skip that phase by telling it up front, for a given field, what the tokens and their payloads are? They would be stored internally as before, only this time the two steps (analysis and indexing) would be performed in two different phases, with my application orchestrating both of them.

I don't know if building the documents with SolrJ could help... maybe that's the way to go? Or is there a particular XML format to send to Solr?
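To make the "extract the info" part of step 2 concrete, here is a minimal sketch of what I have in mind on the application side, written in Python with only the standard library. The names extract_tokens and split_payload, and the SAMPLE string (a closed-off copy of the WhitespaceTokenizer stage from the snippet above), are my own illustrations and not part of any Solr API. In the snippet the payload shows up only as an object reference string, so the sketch assumes the payload value is recovered from the "term|payload" text of the tokenizer stage instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the WhitespaceTokenizer stage from the snippet above,
# with the enclosing elements closed so that it parses on its own.
SAMPLE = """\
<lst name="attributeNames">
  <lst name="index">
    <lst name="incomingArc|1.6 outgoingArc|1.6">
      <arr name="org.apache.lucene.analysis.WhitespaceTokenizer">
        <lst>
          <str name="text">incomingArc|1.6</str>
          <str name="type">word</str>
          <int name="start">0</int>
          <int name="end">15</int>
          <int name="position">1</int>
        </lst>
        <lst>
          <str name="text">outgoingArc|1.6</str>
          <str name="type">word</str>
          <int name="start">16</int>
          <int name="end">31</int>
          <int name="position">2</int>
        </lst>
      </arr>
    </lst>
  </lst>
</lst>
"""


def extract_tokens(xml_text, stage_name):
    """Return one dict per token listed under the analysis stage `stage_name`."""
    root = ET.fromstring(xml_text)
    tokens = []
    for arr in root.iter("arr"):
        if arr.get("name") != stage_name:
            continue
        for lst in arr.findall("lst"):
            token = {}
            for child in lst:
                # <int> children become Python ints, everything else stays a string.
                value = int(child.text) if child.tag == "int" else child.text
                token[child.get("name")] = value
            tokens.append(token)
    return tokens


def split_payload(text, delimiter="|"):
    """Split 'term|1.6' into ('term', 1.6); the payload is None if no delimiter."""
    term, sep, payload = text.partition(delimiter)
    return (term, float(payload)) if sep else (term, None)


tokens = extract_tokens(SAMPLE, "org.apache.lucene.analysis.WhitespaceTokenizer")
# (term, payload, position) triples, ready to be sent on in whatever format applies.
triples = [split_payload(t["text"]) + (t["position"],) for t in tokens]
```

This only covers reading the analysis output; how to hand the resulting triples back to Solr is exactly the open question.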
For example, something like:

    <add>
      <doc>
        <field name="id">0001</field>
        <field name="text">
          <rawValue>this is text</rawValue>
          <token pos="1" payload="2.0">this</token>
          <token pos="2" payload="1.0">is</token>
          <token pos="3" payload="2.5">text</token>
        </field>
      </doc>
    </add>

Does it make sense? Or maybe I'm dreaming? :)

Thank you for answering!

--
View this message in context: http://lucene.472066.n3.nabble.com/Feed-index-with-analyzer-output-tp3131771p3132556.html
Sent from the Solr - User mailing list archive at Nabble.com.