Re: [Rd] readLines() segfaults on large file & question on how to work around

2017-09-03 Thread rhelp
Although the problem can apparently be avoided in this case, readLines() causing a segfault still seems like unwanted behaviour to me. I can replicate this with the example below (sessionInfo is further down): # Generate an example file l <- paste0(sample(c(letters, LETTERS), 1E6, replace = TRUE),
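[A minimal sketch of how the truncated reproduction above might continue; the collapse argument, temporary file, and replication count are assumptions for illustration, not the poster's exact code:]

    # Build one very long line of 1e6 characters (assumed continuation of the snippet)
    l <- paste0(sample(c(letters, LETTERS), 1E6, replace = TRUE),
                collapse = "")
    # Write many copies to a temporary file to make it large (illustrative size)
    tmp <- tempfile(fileext = ".txt")
    writeLines(rep(l, 100), tmp)
    # The call reported to segfault on sufficiently large input
    x <- readLines(tmp)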

Re: [Rd] readLines() segfaults on large file & question on how to work around

2017-09-03 Thread Jennifer Lyon
Jeroen: Thank you for pointing me to ndjson, which I had not heard of and is exactly my case. My experience: jsonlite::stream_in - segfaults; ndjson::stream_in - my fault, I am running Ubuntu 14.04, which is too old, so the package won't compile; corpus::read_ndjson - works!!! Of course it
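[For reference, a minimal sketch of how the working corpus approach might be called; the file name is a placeholder and the mmap argument is an assumption about the installed package version:]

    library(corpus)
    # Read a newline-delimited JSON file, one JSON object per line
    records <- read_ndjson("data.ndjson")
    # For very large files, memory-mapping may help if supported:
    # records <- read_ndjson("data.ndjson", mmap = TRUE)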

Re: [Rd] readLines() segfaults on large file & question on how to work around

2017-09-03 Thread Jeroen Ooms
On Sat, Sep 2, 2017 at 8:58 PM, Jennifer Lyon wrote: > I have a 2.1GB JSON file. Typically I use readLines() and > jsonlite::fromJSON() to extract data from a JSON file. If your data consists of one JSON object per line, this is called 'ndjson'. There are several packages specialized to read ndjson
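[A small illustration of the ndjson pattern described here, using jsonlite::stream_in to parse one JSON object per line instead of reading the whole file at once; the file name and pagesize are illustrative:]

    library(jsonlite)
    # Stream the file record by record; each line is a complete JSON object
    df <- stream_in(file("data.ndjson"), pagesize = 10000)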