Hi, I'm new to Xerces-C and unsure about many of the concepts in this API. I thought I would learn it by following tutorials and the problems discussed on the mailing list.
I'm able to extract the attributes of the <LOCATE_protein> tag. This tag contains nested children, and I need to traverse the XML tree to fetch the required information from the nested tags (child or grandchild nodes). Can anyone suggest a simple function to do that in Xerces-C? Below are the sample XML file and my code, modified from the tutorial at http://www.yolinux.com/TUTORIALS/XML-Xerces-C.html.

Thanks in advance.

Cheers,
Gaurav

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<LOCATE_interaction xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <LOCATE_protein uid="6000002" uniprot="P27824" refseq="">
    <externalannot>
      <source db="HPRD" db_id="00252" goid="GO:0005764">Lysosomes</source>
      <source db="HPRD" db_id="00252" goid="GO:0005635">Nuclear Envelope</source>
      <source db="HPRD" db_id="00252" goid="GO:0005794">Golgi Apparatus</source>
      <source db="HPRD" db_id="00252" goid="GO:0005783">Endoplasmic Reticulum</source>
      <source db="HPRD" db_id="00252" goid="GO:0005886">Plasma Membrane</source>
      <source db="UniProt/SPTrEMBL" db_id="P27824" goid="GO:0005783">endoplasmic reticulum</source>
      <source db="UniProt/SPTrEMBL" db_id="P27824" goid="GO:0042470">melanosome</source>
    </externalannot>
    <literature></literature>
    <direct_interaction>
      <entry source="HPRD" source_id="00252" uniprot="P27824" refseq="NP_001737.1">
        <name>Calnexin</name>
        <interactor type="direct" pubmed_id="8136357">
          <molecule source_id="00127" gene_symbol="IFNGR1" uniprot="P15260" refseq="">Interferon gamma receptor 1</molecule>
        </interactor>
      </entry>
    </direct_interaction>
    <metabolic_interaction>
      <entry source_id="hsa:55832">
        <gene_name>CAND1</gene_name>
        <defination>cullin-associated and neddylation-dissociated 1</defination>
        <orthology></orthology>
        <class></class>
        <enzyme></enzyme>
      </entry>
      <entry source_id="ENSG00000111530-MONOMER"></entry>
    </metabolic_interaction>
  </LOCATE_protein>
  . . .
</LOCATE_interaction>

m_ConfigFileParser->parse( configFile.c_str() );

DOMDocument* xmlDoc = m_ConfigFileParser->getDocument();
DOMElement* elementRoot = xmlDoc->getDocumentElement();
if( !elementRoot ) throw(std::runtime_error( "empty XML document" ));

// Note: getChildNodes() returns every child node of the root,
// so this count includes text nodes as well as elements.
DOMNodeList* children = elementRoot->getChildNodes();
cout << "Total Locates Proteins : " << children->getLength() << endl;

for( XMLSize_t xx = 0; xx < children->getLength(); ++xx )
{
    DOMNode* currentNode = children->item(xx);
    if( currentNode->getNodeType() == DOMNode::ELEMENT_NODE )
    {
        // Found node which is an Element. Re-cast node as element
        DOMElement* currentElement = dynamic_cast< xercesc::DOMElement* >( currentNode );
        //cout << currentElement << endl;
        if( XMLString::equals(currentElement->getTagName(), TAG_locateProtein) )
        {
            // Already tested node as type element and of name TAG_locateProtein.
            // Read attributes of this element.
            const XMLCh* xmlch_locateID = currentElement->getAttribute(ATTR_locateID);
            m_locateID = XMLString::transcode(xmlch_locateID);

            const XMLCh* xmlch_locateUniprotID = currentElement->getAttribute(ATTR_locateUniprotID);
            m_locateUniprotID = XMLString::transcode(xmlch_locateUniprotID);

            const XMLCh* xmlch_locateRefseqID = currentElement->getAttribute(ATTR_locateRefseqID);
            m_locateRefseqID = XMLString::transcode(xmlch_locateRefseqID);

            cout << "Locate ID:" << m_locateID
                 << "|UniprotID:" << m_locateUniprotID
                 << "|RefseqID:" << m_locateRefseqID << endl;

            // First child of the matched element; this may be a
            // whitespace text node rather than <externalannot>.
            DOMNode* currentChild = currentNode->getFirstChild();
            if( currentChild )
            {
                // Transcode XMLCh* to char* before printing; streaming
                // an XMLCh* directly would not print the text.
                char* childText = XMLString::transcode(currentChild->getTextContent());
                char* childName = XMLString::transcode(currentChild->getNodeName());
                cout << childText << endl;
                cout << childName << endl;
                XMLString::release(&childText);
                XMLString::release(&childName);
            }
        }
    }
}

--
Mr. Gaurav Kumar
PhD Student (Bioinformatics/Computational Biology)
