Hi List,

I am trying to extract the keywords from 1403 papers in XML format. I
wrote the code below, but version A does not work; the code only runs
with the modification shown in version B. That variant is not the one
I need, because it does not cover the 1403 XML files in my folder.
Could you please tell me where the mistakes are in the code (A or B),
to help me correct them? The data frame has two columns: an id and
the file paths.
A - Does not work, but it is the one I need.

keyword <-
  muestra %>%
  select(path) %>%
  map(.f = function(x) {
    read_xml(x) %>%
      xml_find_all(".//kwd") %>%
      xml_text(trim = TRUE)
  })

B - It works, but only with a small number of papers.

keyword <-
  muestra %>%
  select(path) %>%
  dplyr::sample_n(50) %>%
  unlist() %>%
  map(.f = function(x) {
    read_xml(x) %>%
      xml_find_all(".//kwd") %>%
      xml_text(trim = TRUE)
  })

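
A possible explanation for the difference between A and B, offered as a
sketch rather than a confirmed diagnosis: in A, select(path) returns a
one-column data frame, so a purrr mapper iterates over its columns and
hands all the paths to read_xml() at once, whereas unlist() in B turns
the column into a character vector that is visited one path at a time.
The example below passes muestra$path directly. The two temporary XML
files, the deliberately missing third path, the stand-in muestra data
frame and the helper extract_kwd() are all made up for illustration.
Wrapping the helper in purrr::possibly() is an assumption about how
missing or unreadable files should be handled: they yield NA instead of
stopping the whole run.

```r
library(xml2)
library(purrr)

# Hypothetical stand-in for `muestra`: two small XML files plus one
# path that does not exist on disk.
paths <- c(tempfile(fileext = ".xml"),
           tempfile(fileext = ".xml"),
           file.path(tempdir(), "no_such_paper.xml"))
writeLines("<article><kwd-group><kwd> asthma </kwd><kwd>children</kwd></kwd-group></article>",
           paths[1])
writeLines("<article><kwd-group><kwd>diabetes</kwd></kwd-group></article>",
           paths[2])
muestra <- data.frame(id = 1:3, path = paths, stringsAsFactors = FALSE)

# Hypothetical helper: read ONE file and return the text of its <kwd> nodes.
extract_kwd <- function(x) {
  doc <- read_xml(x)
  kwd <- xml_find_all(doc, ".//kwd")
  xml_text(kwd, trim = TRUE)
}

# muestra$path is a character vector, so map() calls the helper once per
# file. possibly() converts an error (e.g. a missing file) into NA.
keyword <- map(muestra$path, possibly(extract_kwd, otherwise = NA_character_))
names(keyword) <- muestra$id
```

With the real data frame, only the helper and the map() call would be
needed. Entries that come back as NA point to paths that could not be
read, which may help to check whether every path in the data frame
exists in the folder.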
Thank you,
Maicel Monzon MD, PHD
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.