Hi List, I am trying to extract the keywords from 1403 papers in XML format. I wrote the code below: version A does not work, and it only runs with the modification shown in version B. That variation is not the one I need, because its result does not match the 1403 XML files in my folder. Could you please tell me where the mistakes are in A or B so that I can correct them? The data frame has two columns: an id and the file paths.

A - Does not work, but it is the one I need.

keyword <-
  muestra %>%
  select(path) %>%
  read_xmlmap(.f = function(x) {
    read_xml(x) %>%
      xml_find_all(".//kwd") %>%
      xml_text(trim = TRUE)
  })

B - Works, but only with a small number of papers.

keyword <-
  muestra %>%
  select(path) %>%
  dplyr::sample_n(50) %>%
  unlist() %>%
  map(.f = function(x) {
    read_xml(x) %>%
      xml_find_all(".//kwd") %>%
      xml_text(trim = TRUE)
  })

Thank you,
Maicel Monzon MD, PHD

