With an essentially default install of the trunk version of Solr 1.4, when I index an XML file the XML tags appear to be stripped during indexing. The tag names and their frequencies are important to me for search purposes, so could someone tell me what my options are for keeping Solr from stripping out the XML tags?

For example, if I have an XML element such as <tag1> hello </tag1>, I'd like tag1 to appear as a term with a count of 2 in a term frequency vector (once for the opening tag and once for the closing tag).

I was trying out the examples from http://wiki.apache.org/solr/ExtractingRequestHandler and sending in an XML file. Would I need to modify some existing code, or is it just a matter of configuration to keep the XML tags during processing?

-Peter
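P.S. To make the expected result concrete, here is a small sketch in plain Python (not Solr code, and not how Solr analyzes text) of the term counts I am after. The function name `term_frequencies` is just something made up for illustration, and the tokenization is deliberately naive: it treats every run of word characters as a term, so attribute names and values would be counted too.

```python
import re
from collections import Counter

def term_frequencies(xml_text):
    """Count terms, keeping tag names from opening and closing tags as terms.

    Hypothetical helper for illustration only: '<', '>' and '/' simply act
    as separators, so 'tag1' is seen once per opening tag and once per
    closing tag.
    """
    tokens = re.findall(r"\w+", xml_text.lower())
    return Counter(tokens)

freqs = term_frequencies("<tag1> hello </tag1>")
print(freqs["tag1"])   # 2
print(freqs["hello"])  # 1
```

In other words, I would like the indexed terms for the field to look like the output above rather than containing only "hello".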
******************************************************************
Peter Thung
Software Developer
IBS Project Technical Lead - Web Developer
Code 56340 - Net-centric ISR Development Branch
Joint & National ISR Systems Division
Intelligence, Surveillance and Reconnaissance Department
US Navy Space & Naval Warfare Systems Center Pacific (SSC PAC)
Topside Campus, Bldg A33, room 0055
53560 Hull Street, San Diego, CA 92152
UNCLASS Email: peter.th...@navy.mil
SIPRNET Email: thu...@spawar.navy.smil.mil
COMM (Primary): (619) 553-6513
COMM (Secondary): (619) 553-0777
FAX: (619) 553-1586
******************************************************************