If "that" refers to using a database on disk to temporarily hold
the file, then example 6 on the home page shows it, as mentioned.
You may wish to look at the other examples there too, and there is
further documentation in the ?sqldf help file.
On Fri, Jan 16, 2009 at 11:11 AM, Gundala Viswanath wrote:
Hi,
> Unless you specify an in-memory database the database is stored on disk.
Thanks for your explanation.
I just downloaded 'sqldf'.
Where can I find the option for that? In sqldf I can't see the command.
I looked at:
envir = parent.frame()
but that doesn't appear to be the one.
- Gundala Viswanath
Only the portion you extract is ever in R -- the file itself is read
into a database without ever going through R, so your memory
requirements correspond to what you extract, not the size of the file.
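
As a rough illustration of the point above, here is a minimal sketch using read.csv.sql() from the sqldf package. The temporary file, its "tag value" header line and the single-space separator are assumptions made for the example; they stand in for the real repository file. Passing dbname = tempfile() asks for an on-disk database, so only the rows selected by the query should end up in R.

```r
library(sqldf)

# Hypothetical stand-in for the large repository file (assumed to have
# a header line and single-space separators).
repo <- tempfile(fileext = ".txt")
writeLines(c("tag value", "AAA 0.2", "AAT 0.3", "AAC 0.02",
             "AAG 0.02", "ATA 0.3", "ATT 0.7"), repo)

# read.csv.sql() loads the file into a temporary SQLite database,
# runs the query there, and returns only the query result to R.
# Inside the SQL statement the table is referred to as "file".
hits <- read.csv.sql(repo,
                     sql = "select * from file where tag in ('AAC', 'ATT')",
                     header = TRUE, sep = " ",
                     dbname = tempfile())
print(hits)
```

The query does the filtering, so the memory needed in R depends on the size of `hits`, not on the size of the file.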
On Fri, Jan 16, 2009 at 10:49 AM, Gundala Viswanath wrote:
Hi Gabor,
Do you mean that storing data with "sqldf" doesn't take memory?
For example, I have a 3GB data file. Read into a standard R object with
read.table(), the object size roughly doubles to ~6GB. My current 4GB
of RAM cannot handle that.
Do you mean that with "sqldf" this is not an issue?
Why is that?
Sorry for
On Fri, Jan 16, 2009 at 5:52 AM, r...@quantide.com wrote:
> I agree on the database solution.
> Databases are the right tool to solve this kind of problem.
> Only consider the start up cost of setting up the database. This could be a
> very time consuming task if someone is not familiar with databa
r...@quantide.com wrote:
> I agree on the database solution.
> Databases are the right tool to solve this kind of problem.
> Only consider the start up cost of setting up the database. This could
> be a very time consuming task if someone is not familiar with database
> technology.
and won't pay if
r...@quantide.com wrote:
>
> Using file() is not a real reading of all the file. This function will
> simply open a connection to the file without reading it.
> countLines should do something like "wc -l" from a bash shell
just for a test:
cat(rep('', 10^7), file='test.txt', fill=1)
library(R.ut
I agree on the database solution.
Databases are the right tool to solve this kind of problem.
Only consider the start up cost of setting up the database. This could
be a very time consuming task if someone is not familiar with database
technology.
Using file() is not a real reading of all the f
if the file is really large, reading it twice may add considerable penalty:
r...@quantide.com wrote:
> Something like this should work
>
> library(R.utils)
> out = numeric()
> qr = c("AAC", "ATT")
> n =countLines("test.txt")
# 1st pass
> file = file("test.txt", "r")
> for (i in 1:n){
# 2nd pass
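
One way to avoid the counting pass, sketched here under assumptions: read until readLines() returns a zero-length result instead of calling countLines() first. The temporary file and the single-space separator are made up for the example, and this is not necessarily what either poster had in mind.

```r
# Hypothetical stand-in for the large file.
path <- tempfile(fileext = ".txt")
writeLines(c("AAA 0.2", "AAT 0.3", "AAC 0.02",
             "AAG 0.02", "ATA 0.3", "ATT 0.7"), path)

qr  <- c("AAC", "ATT")
out <- numeric()

con <- file(path, "r")
repeat {
  line <- readLines(con, n = 1)
  if (length(line) == 0) break            # end of file: no second pass needed
  fields <- strsplit(line, split = " ")[[1]]
  if (fields[1] %in% qr) {
    out[fields[1]] <- as.numeric(fields[2])  # store the value under its tag
  }
}
close(con)
print(out)
```

The file is opened once and read once; the end of the file is detected by the empty result rather than by a line count known in advance.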
Something like this should work
library(R.utils)
out = numeric()
qr = c("AAC", "ATT")
n =countLines("test.txt")
file = file("test.txt", "r")
for (i in 1:n){
line = readLines(file, n = 1)
A = strsplit (line, split = " ")[[1]][1]
if(is.element(A, qr)) {
value = as.numeric(strsplit (line, split = "
The sqldf package can read a large file into a database without going
through R and then extract the portion you want. The package makes it
easier to use RSQLite by setting up the database for you and, after
extracting the portion you want, removing the database automatically.
You can specify all this in two li
You might try to iteratively read a limited number of lines in a
batch using readLines:
# filename, the name of your file
# n, the maximal count of lines to read in a batch
connection = file(filename, open="rt")
while (length(lines <- readLines(con=connection, n=n))) {
  # do your stuff here
}
close(connection)
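
To make the batch idea concrete, here is a minimal sketch that fills in the loop body for the tag lookup from the original question. The helper name lookup_in_batches, the temporary file and the single-space separator are assumptions for illustration only.

```r
# Hypothetical helper: scan `filename` in batches of `n` lines and
# return the values whose tag is in `tags`, named by tag.
lookup_in_batches <- function(filename, tags, n = 10000) {
  connection <- file(filename, open = "rt")
  on.exit(close(connection))               # close the file even on error
  found <- list()
  while (length(lines <- readLines(con = connection, n = n)) > 0) {
    fields <- strsplit(lines, " ", fixed = TRUE)
    tag    <- vapply(fields, `[`, character(1), 1)
    value  <- as.numeric(vapply(fields, `[`, character(1), 2))
    keep   <- tag %in% tags                # vectorised filter over the batch
    found[[length(found) + 1]] <- setNames(value[keep], tag[keep])
  }
  unlist(found)                            # NULL if the file is empty
}

# Usage on a small made-up file, with a tiny batch size to force
# several iterations of the loop.
path <- tempfile(fileext = ".txt")
writeLines(c("AAA 0.2", "AAT 0.3", "AAC 0.02",
             "AAG 0.02", "ATA 0.3", "ATT 0.7"), path)
res <- lookup_in_batches(path, c("AAC", "ATT"), n = 2)
print(res)
```

Only one batch of lines is held in memory at a time, and a larger `n` trades memory for fewer calls to readLines().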
On Fri, 2009-01-16 at 18:02 +0900, Gundala Viswanath wrote:
Dear all,
I have a repository file (let's call it repo.txt)
that contains two columns like this:
# tag value
AAA 0.2
AAT 0.3
AAC 0.02
AAG 0.02
ATA 0.3
ATT 0.7
Given another query vector
> qr <- c("AAC", "ATT")
I would like to find the corresponding value for each query above,
y