Hi. I asked this on the Tika group and was advised to ask here instead. I am
using the following C# code to call Tika and would like it to return only the
raw text of the document. For example, if the Word document contains
"Hello World", the response should be exactly that text, with no XML or JSON
wrapped around it.

Currently this code returns JSON that wraps an XML string, which in turn
contains the text of the document. I need the raw text only, without the XML.
Thanks.

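using System.Net;
using System.Text;

// Solr's extract handler, which passes the uploaded file to Tika.
var url = "http://localhost:8983/solr/update/extract";

var client = new WebClient();
client.QueryString.Add("extractOnly", "true"); // return the extraction without indexing
client.QueryString.Add("wt", "json");          // ask for a JSON response
var data = client.UploadFile(url, "input.txt");
// Decode as UTF-8; ASCII would mangle any non-ASCII characters in the text.
var json = Encoding.UTF8.GetString(data);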



Sincerely,
Alex 
