I am working with an XML file, which can be found at the following link:
Sample XML file <https://www.dropbox.com/s/8kn9g8xev2u5n8o/Dummy.xml?dl=0&preview=Dummy.xml>

I am trying to extract every field to a CSV file. I want my output to look like this:

Required output: *Total of 20 columns and 2 rows*

Columns:
  DateCreated
  DateModified
  Creator.UserAccountName
  Creator.PersonName
  Creator..attrs.referenceNumber
  Modifier.UserAccountName
  Modifier.PersonName
  Modifier..attrs.referenceNumber
  AdditionalEmailStr
  AdditionalComment
  DateIssued
  DocumentaryInstructions
  NominationParcel.attr.Referencenumber
  NominationParcel.SecondContractNumber
  NominationParcel.Coordinator.RefernceNumber
  NominationParcel.Coordinator.Username
  NominationParcel.Coordinator.Email
  NominationParcel.Coordinator.Office.Name
  NominationParcel.Coordinator.Office.Email
  NominationParcel.Coordinator.Office.attrs.referenceNumber

Rows (row name first):
  Nomination 2007-11-25T17:01:32 2007-11-25T17:11:09 mkolker Merryn Kolker 15351 mkolker Merryn Kolker 15351 Good work 7 sam
  Nomination 2007-11-25T17:18:01 2007-11-25T17:19:11 mkolker Merryn Kolker 15351 mkolker Merryn Kolker 15351 Nicely Performed 10 107 102

However, I am not able to get my output in the required format. I have tried two different approaches.

1. Below is my first code. The problem with it is that my NULL fields are not captured correctly, so data spills over into adjacent cells.
Also, I am not able to capture all the fields of the nested lists in the XML.

*Code 1*

library(XML)

doc <- xmlParse("Dummy.xml")
lst <- xmlToList(doc)

# For each record, flatten the selected fields and bind the results row-wise.
# Note: the body refers to the global `cols`, not to the argument `col`.
f <- function(col) do.call(rbind, lapply(lst, function(x) unlist(x[cols])))

cols <- c("DateCreated", "DateModified", "Creator", "Modifier",
          "AdditionalEmailStr", "AdditionalComment", "DateIssued",
          "DocumentaryInstructions", "NominationParcel")

res <- setNames(lapply(cols, f), cols)
list2env(res, .GlobalEnv)

*Output 1*

Columns:
  DateCreated
  DateModified
  Creator.UserAccountName
  Creator.PersonName
  Creator..attrs.referenceNumber
  Modifier.UserAccountName
  Modifier.PersonName
  Modifier..attrs.referenceNumber
  AdditionalComment
  NominationParcel.Coordinator.UserAccountName
  NominationParcel.Coordinator.Office..attrs.referenceNumber
  NominationParcel.Coordinator..attrs.referenceNumber
  NominationParcel..attrs.referenceNumber

Rows (row name first):
  Nomination 2007-11-25T17:01:32 2007-11-25T17:11:09 mkolker Merryn Kolker 15351 mkolker Merryn Kolker 15351 Good Work sam 7
  Nomination 2007-11-25T17:18:01 2007-11-25T17:19:11 mkolker Merryn Kolker 15351 mkolker Merryn Kolker 15351 Nicely performed 102 107 10 2007-11-25T17:18:01

2. To avoid the spillover of information from one cell to another caused by "NULL", I have used a for loop to replace the NULL cells with NA.
With this I was able to capture the correct data, but I could not get all of the fields present in the XML.

*Code 2*

library(XML)

doc <- xmlParse("Dummy.xml")
lstsub <- xmlToList(doc)

# Replace NULL entries with NA, descending up to three levels below each record.
for (i in seq_along(lstsub)) {
  for (j in seq_along(lstsub[[i]])) {
    lstsub[[i]][[j]] <- ifelse(is.null(lstsub[[i]][[j]]), NA, lstsub[[i]][[j]])
    if (length(lstsub[[i]][[j]]) > 1) {
      for (k in seq_along(lstsub[[i]][[j]])) {
        lstsub[[i]][[j]][[k]] <- ifelse(is.null(lstsub[[i]][[j]][[k]]), NA, lstsub[[i]][[j]][[k]])
        if (length(lstsub[[i]][[j]][[k]]) > 1) {
          for (l in seq_along(lstsub[[i]][[j]][[k]])) {
            lstsub[[i]][[j]][[k]][[l]] <- ifelse(is.null(lstsub[[i]][[j]][[k]][[l]]), NA, lstsub[[i]][[j]][[k]][[l]])
          }
        }
      }
    }
  }
}

# Same extraction as in Code 1 (the body again uses the global `cols`).
f <- function(col) do.call(rbind, lapply(lstsub, function(x) unlist(x[cols])))

cols <- c("DateCreated", "DateModified", "Creator", "Modifier",
          "AdditionalEmailStr", "AdditionalComment", "DateIssued",
          "DocumentaryInstructions", "NominationParcel")

res <- setNames(lapply(cols, f), cols)
list2env(res, .GlobalEnv)
write.csv(Creator, "dummy_2.csv")

*Output 2*

            DateCreated          DateModified         Creator  Modifier  AdditionalEmailStr  AdditionalComment  DateIssued  DocumentaryInstructions
Nomination  2007-11-25T17:01:32  2007-11-25T17:11:09  mkolker  mkolker   NA                  Good Work          NA          NA
Nomination  2007-11-25T17:18:01  2007-11-25T17:19:11  mkolker  mkolker   NA                  Nicely performed   NA          NA

Could somebody please help me get the required output?

I have posted the same question on Stack Overflow; the link is below (it might give a clearer picture):
http://stackoverflow.com/questions/34963724/extracting-complete-information-from-nested-lists-in-xml-to-a-data-frame-using-r/34963821#34963821

Regards,
Sowmiyan

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
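
For illustration, the NULL-to-NA replacement described above could also be written recursively, so that it reaches any nesting depth, and the flattened records could then be aligned by column name rather than by position. The following is only a sketch under assumptions: null_to_na, flatten_records and the records list are hypothetical names, and records is a hand-written stand-in for what xmlToList() might return, not the real contents of Dummy.xml.

```r
# Hypothetical stand-in for an xmlToList() result (NOT the real Dummy.xml).
records <- list(
  Nomination = list(
    DateCreated = "2007-11-25T17:01:32",
    Creator = list(UserAccountName = "mkolker",
                   PersonName = "Merryn Kolker",
                   .attrs = c(referenceNumber = "15351")),
    AdditionalEmailStr = NULL,            # an empty XML element
    AdditionalComment = "Good work"
  ),
  Nomination = list(
    DateCreated = "2007-11-25T17:18:01",
    Creator = list(UserAccountName = "mkolker",
                   PersonName = "Merryn Kolker",
                   .attrs = c(referenceNumber = "15351")),
    AdditionalEmailStr = NULL,
    AdditionalComment = "Nicely Performed",
    NominationParcel = list(SecondContractNumber = NULL,   # illustrative only
                            .attrs = c(referenceNumber = "10"))
  )
)

# Recursively replace every NULL with NA, at any nesting depth.
null_to_na <- function(x) {
  if (is.null(x)) return(NA_character_)
  if (is.list(x)) return(lapply(x, null_to_na))
  x
}

# Flatten each record to a named character vector, then align all records
# on the union of their names so that fields missing from a record become NA.
flatten_records <- function(records) {
  rows <- unname(lapply(records, function(r) unlist(null_to_na(r))))
  all_cols <- unique(unlist(lapply(rows, names)))
  mat <- do.call(rbind, lapply(rows, function(r) unname(r[all_cols])))
  colnames(mat) <- all_cols
  as.data.frame(mat, stringsAsFactors = FALSE)
}

df <- flatten_records(records)
print(df)
```

With the real file, records would presumably be replaced by xmlToList(xmlParse("Dummy.xml")), and the result could be written out with write.csv(df, "dummy.csv", row.names = FALSE). Indexing each row by the full set of column names is what keeps a value from sliding into the wrong column when a field is absent from one record.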