On Windows with R-2.15.1 in a 1252 locale, I had to read (and toss out) the initial 3 bytes (the byte-order mark?) to make things work:
  > socket <- url("http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt",open="r",encoding="utf-8")
  > readChar(socket, nchars=3, useBytes=TRUE)
  [1] ""
  > d <- read.table(socket, quote="", sep="|", stringsAsFactors=FALSE)
  > dim(d)
  [1] 485   5
  > head(d)
     V1 V2 V3             V4      V5
  1 aar    aa           Afar    afar
  2 abk    ab      Abkhazian abkhaze
  3 ace             Achinese    aceh
  4 ach                Acoli   acoli
  5 ada              Adangme adangme
  6 ady       Adyghe; Adygei  adyghé

If I deleted no initial bytes I got

  > socket <- url("http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt",open="r",encoding="utf-8")
  > d <- read.table(socket, quote="", sep="|", stringsAsFactors=FALSE)
  Warning messages:
  1: In read.table(socket, quote = "", sep = "|", stringsAsFactors = FALSE) :
    invalid input found on input connection 'http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt'
  2: In read.table(socket, quote = "", sep = "|", stringsAsFactors = FALSE) :
    incomplete final line found by readTableHeader on 'http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt'
  > dim(d)
  [1] 1 1
  > str(d)
  'data.frame':   1 obs. of  1 variable:
   $ V1: chr "?"
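
The 3 bytes tossed in the working case are presumably EF BB BF, the UTF-8 encoding of the byte-order mark. Rather than skipping them unconditionally, one could check for them first. The following is only a sketch under that assumption: strip_utf8_bom is a made-up helper name, and the sample bytes are an in-memory stand-in for the first two lines of the file, not a download from loc.gov.

```r
# Hypothetical helper (not part of base R): drop a leading UTF-8
# byte-order mark (EF BB BF) from a raw vector, if one is present.
strip_utf8_bom <- function(bytes) {
  bom <- as.raw(c(0xEF, 0xBB, 0xBF))
  if (length(bytes) >= 3L && identical(bytes[1:3], bom)) {
    bytes <- bytes[-(1:3)]
  }
  bytes
}

# Assumed stand-in for the start of the ISO 639-2 file: BOM + two records.
sample_bytes <- c(as.raw(c(0xEF, 0xBB, 0xBF)),
                  charToRaw("aar||aa|Afar|afar\nabk||ab|Abkhazian|abkhaze\n"))

# Strip the BOM (if any), then parse the remaining text as before.
txt <- rawToChar(strip_utf8_bom(sample_bytes))
d <- read.table(text = txt, quote = "", sep = "|", stringsAsFactors = FALSE)
dim(d)
```

Input that has no BOM passes through the helper unchanged, so the same code would serve for files saved either way.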

If I deleted the initial 2 bytes I got an "empty beginning of file" error:

  > socket <- url("http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt",open="r",encoding="utf-8")
  > readChar(socket, nchars=2, useBytes=TRUE)
  [1] "ï»"
  > d <- read.table(socket, quote="", sep="|", stringsAsFactors=FALSE)
  Error in read.table(socket, quote = "", sep = "|", stringsAsFactors = FALSE) :
    empty beginning of file
  In addition: Warning messages:
  1: In read.table(socket, quote = "", sep = "|", stringsAsFactors = FALSE) :
    invalid input found on input connection 'http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt'
  2: In read.table(socket, quote = "", sep = "|", stringsAsFactors = FALSE) :
    incomplete final line found by readTableHeader on 'http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt'

  > sessionInfo()
  R version 2.15.1 (2012-06-22)
  Platform: x86_64-pc-mingw32/x64 (64-bit)

  locale:
  [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
  [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
  [5] LC_TIME=English_United States.1252

  attached base packages:
  [1] stats     graphics  grDevices utils     datasets  methods   base

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of peter dalgaard
> Sent: Thursday, September 13, 2012 12:32 PM
> To: s...@gnu.org
> Cc: r-help@r-project.org
> Subject: Re: [R] cannot read iso639 table
>
>
> On Sep 13, 2012, at 19:42 , Sam Steingold wrote:
>
> > line 109 did not have 5 elements ... but it did!
> > empty beginning of file ... but it's not!
> >
> > details:
> > --8<---------------cut here---------------start------------->8---
> > get.language.ISO.table <- function () {
> >   socket <- url("http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt",
> >                 open="r",encoding="utf-8");
> >   data <- read.table(socket, as.is = TRUE, sep = "|", header = FALSE,
> >                      col.names = c("a3bibliographic","a3terminologic",
> >                                    "a2","english","french"));
>
> quote="" would seem to be your friend (apostrophes in the file are doing you in).
> I can't reproduce the "empty beginning" error, though.
>
> >   close(socket);
> >   data
> > }
> > language.ISO.table <- get.language.ISO.table()
> >
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd....@cbs.dk  Priv: pda...@gmail.com
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.