[R] R course in Italy

2009-06-12 Thread r...@quantide.com


Quantide is pleased to announce the above course in Milan:

--
   Introduction to the R language
6-7th July 2009
  Milano Italy
--

* Who Should Attend ?

This course is suitable for beginners and improvers in the R language
and is ideal for people wanting an all-round introduction to R.

* Course Goals

- To allow attendees to understand the technology behind the R package
- To improve attendees' programming style and confidence
- To enable users to access a wide range of available functionality

* Course Outline

1. Introduction to the R language and the R community
2. The R Environment
3. R data objects
4. Functions and Operators
5. Data import and export
7. Standard Graphics
8. Advanced Graphics
9. Introduction to R Statistics

The cost of this course is Euro 500+VAT

Should your organization have more than 3 possible attendees why not
talk to us about hosting a customized and focused course delivered at
your premises?

Should you want to book a place on this course or have any questions
please visit:
http://www.quantide.com/formazioneR.php
or contact:
Daniela Manzato
daniela.manz...@quantide.com
+39 328 537 51 09

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting multiple ablines

2009-04-02 Thread r...@quantide.com

Maybe:

plot(c(-1, 1) , c(-1, 1), type = "n")
n = 4
a = rep(0, n)
b = 1:n/n


fun = function(i, a, b, col = 1 , ...) {
   abline(a[i], b[i], col = col[i], ...)
}

lapply(1:n, fun, a=a, b=b, col = 1:n)
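Applied to the data in the question, the same idea could look like the sketch below. The sample values for kX, kY and kT are made up. It also assumes that the intent is a line through (kX, kY) at an angle of -kT degrees, so that tan(-kT*pi/180) is the slope; in abline(), `a` is the intercept and `b` is the slope. Rows with missing values are dropped before drawing, which avoids the "'a' and 'b' must be finite" error.

```r
# Hypothetical sample data; NA marks a missing observation
kX <- c(0.1, -0.3, NA, 0.5)
kY <- c(0.2, 0.4, 0.0, NA)
kT <- c(30, 45, 60, 10)

# Drop incomplete rows first so abline() never sees an NA
converge <- na.omit(data.frame(kX, kY, kT))

# Assumption: each line passes through (kX, kY) at an angle of -kT degrees
slope <- tan(-converge$kT * pi / 180)
intercept <- converge$kY - slope * converge$kX

plot(c(-1, 1), c(-1, 1), type = "n", xlab = "x", ylab = "y")

# seq_along() visits every row; for (z in length(x)) would visit only the last one
for (z in seq_along(slope)) {
  abline(a = intercept[z], b = slope[z], col = z)
}
```

Note that the loop in the question iterates over length(converge$kT), which is a single number, so at most one line would be drawn.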

Andrea


Thomas Levine wrote:

I really want to do this:

abline(
a=tan(-kT*pi/180),
b=kY-tan(-kT*pi/180)*kX
)

where kX,kY and kT are vectors of equal length. But I can't do that
with abline unless I use a loop, and I haven't figured out the least
unelegant way of writing the loop yet. So is there a way to do this
without a loop?

Or if I am to resort to the loop, what's the best way of doing it
considering that I have some missing data? Here's the mess that I
wrote.

converge <- na.omit(data.frame(kX,kY,kT))
for (z in (length(converge$kT)))
{abline(
a=tan(converge$kT[z]*pi/180),
b=converge$kY[z]-tan(-converge$kT[z]*converge$kX[z]*pi/180)
)}

I think the missing data are causing the problem; this happens when I run:

Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, ...) :
  'a' and 'b' must be finite



[R] JGR

2009-05-05 Thread r...@quantide.com

Dear R users,
I am using JGR on an Ubuntu Linux computer with 2 CPUs.
When opening JGR, one CPU goes up to 100% even though no calculation has
been started yet.

Has any of you already noticed this strange behaviour?
Thanks for your help
Andrea



Re: [R] JGR

2009-05-05 Thread r...@quantide.com

The point is that one CPU stays at 100% for the whole time JGR is up.
Any ideas?
Andrea

Uwe Ligges wrote:



r...@quantide.com wrote:

Dear R users,
I am using JGR on an Ubuntu Linux computer with 2 CPUs.
When opening JGR, one CPU goes up to 100% even though no calculation has
been started yet.

Has any of you already noticed this strange behaviour?


Does it take very long? On the first start it builds some databases 
and that takes some time if you have installed many packages.


Uwe Ligges



Thanks for your help
Andrea



Re: [R] Histograms: Boxes and lines

2009-01-14 Thread r...@quantide.com

Could be ...

legend("topright", legend = c("Histogram", "Kernel Density Estimate"),
       lty = c(NA, 1), lwd = c(NA, 2), pch = c(15, NA),
       col = c("lightblue", "black"), merge = TRUE, inset = .01, cex = .8, adj = 0)
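For completeness, here is a self-contained sketch that puts the legend call together with the histogram from the question. The seed and the pt.cex value are my own choices, not part of the original code. The idea is that the first entry gets a filled square symbol (pch = 15) and no line, while the second entry gets a line and no symbol, so only one box appears.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
x <- rnorm(1000, mean = 0, sd = 1)

hist(x, breaks = 50, freq = FALSE, main = "Histogram of x",
     xlab = "x", ylab = "Density", col = "lightblue", border = "black")
lines(density(x, kernel = "gaussian"), lwd = 2)

# First entry: filled square, no line. Second entry: line, no symbol.
lg <- legend("topright",
             legend = c("Histogram", "Kernel Density Estimate"),
             pch = c(15, NA), lty = c(NA, 1), lwd = c(NA, 2),
             col = c("lightblue", "black"),
             pt.cex = 2,  # assumption: enlarge the square so it reads as a box
             inset = 0.01, cex = 0.8)
```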


A.

John Kerpel wrote:

Hi folks!  I'm trying to get a histogram legend to give me a filled box and
a line.  The problem is I keep getting both filled boxes and a line.  How
can I get rid of the second box from the code below?



x<-rnorm(1000,mean=0,sd=1)

hist(x, breaks = 50, main="Histogram of x",freq=FALSE,
     xlab=" x", ylab="Density",col="lightblue", border="black")
x_dens<-density(x,kernel="gaussian")
points(x_dens,type="l",lwd=3)
legend("topright",legend=c("Histogram","Kernel Density Estimate"),
       lty=c(-1,1),lwd=c(-1,2),fill=c("lightblue"),merge=TRUE,inset=.01,cex=.8,adj=0)



Thx!  John



Re: [R] Value Lookup from File without Slurping

2009-01-16 Thread r...@quantide.com

Something like this should work

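library(R.utils)                       # provides countLines()
out = numeric()
qr = c("AAC", "ATT")
n = countLines("test.txt")             # first pass: only counts the lines
con = file("test.txt", "r")            # opens a connection; nothing is read yet
for (i in 1:n) {
  line = readLines(con, n = 1)         # second pass: one line at a time
  fields = strsplit(line, split = " ")[[1]]
  if (is.element(fields[1], qr)) {
    out = c(out, as.numeric(fields[2]))
  }
}
close(con)                             # release the connection when done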

You may want to improve execution speed by reading the data in chunks
instead of line by line. The code requires only a little modification.
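A minimal sketch of the chunked variant follows. The function name lookup_in_chunks, the chunk size, and the assumption that the file has whitespace-separated columns with "#" comment lines are all mine. It reads the file in a single pass, so countLines() is not needed, and it returns the matches in file order rather than in query order.

```r
# Hypothetical helper: scan a two-column text file in chunks and keep
# only the values whose tag is in 'keys'.
lookup_in_chunks <- function(path, keys, chunk_size = 10000) {
  con <- file(path, open = "r")
  on.exit(close(con))
  out <- numeric()
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break                 # end of file reached
    is_comment <- grepl("^[[:space:]]*#", lines)
    is_blank <- !nzchar(trimws(lines))
    lines <- lines[!(is_comment | is_blank)]
    if (length(lines) == 0) next
    fields <- strsplit(trimws(lines), "[[:space:]]+")
    tag <- vapply(fields, `[`, character(1), 1)
    value <- vapply(fields, `[`, character(1), 2)
    hit <- tag %in% keys
    out <- c(out, setNames(as.numeric(value[hit]), tag[hit]))
  }
  out
}

# Small demonstration on a temporary copy of the example data
tmp <- tempfile()
writeLines(c("# tag value", "AAA 0.2", "AAT 0.3", "AAC 0.02",
             "AAG 0.02", "ATA 0.3", "ATT 0.7"), tmp)
res <- lookup_in_chunks(tmp, c("AAC", "ATT"), chunk_size = 4)
```

Only one chunk is held in memory at a time, so the chunk size can be tuned to the available memory.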





Carlos J. Gil Bellosta wrote:

On Fri, 2009-01-16 at 18:02 +0900, Gundala Viswanath wrote:
  

Dear all,

I have a repository file (let's call it repo.txt)
that contains two columns like this:

# tag  value
AAA   0.2
AAT   0.3
AAC   0.02
AAG   0.02
ATA   0.3
ATT   0.7

Given another query vector



qr <- c("AAC", "ATT")
  

I would like to find the corresponding value for each query above,
yielding:

0.02
0.7

However, I want to avoid slurping the whole repo.txt into an object (e.g. a hash).
Is there any way to do that?

The reason I want to do that is that repo.txt is very, very large
(millions of lines,
with tag length > 30 bp), and my PC memory is too small to keep it.

- Gundala Viswanath
Jakarta - Indonesia



Hello,

You can always store your repo.txt into a database, say, SQLite, and
select only the values you want via an SQL query.

Thus, you will prevent loading the full file into memory.

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



Re: [R] faster version of split()?

2009-01-16 Thread r...@quantide.com
df = data.frame(x = sample(7:9, 100, replace = TRUE),
                y = sample(1:5, 100, replace = TRUE))

fun = function(x){length(unique(x))}
by(df$x, df$y, fun)
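To match the question more literally (unique values of y within each level of x, ignoring NA), the sketch below shows two base R variants. The sample data and the seed are made up. The second variant builds the x-by-y contingency table once and counts the non-empty cells per row; it may be faster on a large data frame, but that is worth timing on the real data.

```r
set.seed(42)  # arbitrary seed for a reproducible example
df <- data.frame(x = sample(7:9, 100, replace = TRUE),
                 y = sample(c(1:5, NA), 100, replace = TRUE))

# Variant 1: tapply() instead of split() + sapply()
count <- function(v) length(unique(na.omit(v)))
by_tapply <- tapply(df$y, df$x, count)

# Variant 2: one contingency table; table() drops NA by default,
# and each non-empty cell is one distinct y value for that x
by_table <- rowSums(table(df$x, df$y) > 0)
```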


Simon Pickett wrote:

Hi all,

I want to calculate the number of unique observations of "y" in each 
level of "x" from my data frame "df".


this does the job but it is very slow for this big data frame (159503 
rows, 11 columns).


group.list <- split(df$y,df$x)
count <- function(x) length(unique(na.omit(x)))
sapply(group.list, count, USE.NAMES=TRUE)

I couldnt find the answer searching for "slow split" and "split time" 
on help forum.


I am running R version 2.2.1, on a machine with 4gb of memory and I'm 
using windows 2000.


thanks in advance,

Simon.







- Original Message - From: "Wacek Kusnierczyk" 


To: "Gundala Viswanath" 
Cc: "R help" 
Sent: Friday, January 16, 2009 9:30 AM
Subject: Re: [R] Value Lookup from File without Slurping



you might try to iteratively read a limited number of lines in a
batch using readLines:

# filename, the name of your file
# n, the maximal count of lines to read in a batch
connection = file(filename, open="rt")
while (length(lines <- readLines(con=connection, n=n))) {
# do your stuff here
}
close(connection)

?file
?readLines

vQ


Gundala Viswanath wrote:

Dear all,

I have a repository file (let's call it repo.txt)
that contains two columns like this:

# tag value
AAA 0.2
AAT 0.3
AAC 0.02
AAG 0.02
ATA 0.3
ATT 0.7

Given another query vector



qr <- c("AAC", "ATT")



I would like to find the corresponding value for each query above,
yielding:

0.02
0.7

However, I want to avoid slurping the whole repo.txt into an object
(e.g. a hash).

Is there any way to do that?

The reason I want to do that is that repo.txt is very, very large
(millions of lines,
with tag length > 30 bp), and my PC memory is too small to keep it.






Re: [R] Value Lookup from File without Slurping

2009-01-16 Thread r...@quantide.com

I agree on the database solution.
Databases are the right tool to solve this kind of problem.
Only consider the start-up cost of setting up the database. This could
be a very time-consuming task if someone is not familiar with database
technology.


Using file() does not actually read the whole file. This function will
simply open a connection to the file without reading it.

countLines should do something like "wc -l" from a bash shell.

I would say that if this is a one-time job this solution should work,
even though it is not the fastest. If this job is a repetitive one,
then a database solution is surely better.
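To illustrate the two points above, here is a rough base R sketch of a "wc -l" style line counter. It is not the R.utils implementation of countLines(); the function name count_lines_base and the chunk size are assumptions of mine. file() only opens the connection, and the data are pulled in chunk by chunk by readLines().

```r
# Hypothetical helper: count lines without holding the whole file in memory
count_lines_base <- function(path, chunk_size = 10000) {
  con <- file(path, open = "r")   # opens the connection; no data is read yet
  on.exit(close(con))
  n <- 0
  while (length(chunk <- readLines(con, n = chunk_size)) > 0) {
    n <- n + length(chunk)        # only the current chunk is in memory
  }
  n
}

# Small demonstration on a temporary file
tmp <- tempfile()
writeLines(c("AAA 0.2", "AAT 0.3", "AAC 0.02"), tmp)
n_lines <- count_lines_base(tmp, chunk_size = 2)
```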


A.


Wacek Kusnierczyk wrote:

if the file is really large, reading it twice may add considerable penalty:

r...@quantide.com wrote:
  

Something like this should work

library(R.utils)
out = numeric()
qr = c("AAC", "ATT")
n =countLines("test.txt")

# 1st pass

file = file("test.txt", "r")
for (i in 1:n){

# 2nd pass

line = readLines(file, n = 1)
A = strsplit (line, split = " ")[[1]][1]
if(is.element(A, qr)) {
value = as.numeric(strsplit (line, split = " ")[[1]][2])
out = c(out, value)
}
}



if this is a one-go task, counting the lines does not pay, and why
bother.  if this is a repetitive task, a database-based solution will
probably be a better idea.

vQ






Re: [R] Dates in Common

2009-01-23 Thread r...@quantide.com
The problem is that the intersect function does x = as.vector(x) and
therefore transforms the date vector into a numeric one.

Try:
d1 = as.character(data1) ; d2 = as.character(data2)
d = intersect(d1, d2)
data = as.Date(d)
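Here is a small self-contained sketch of the same workaround, with made-up dates and a fixed time zone. It also shows an alternative that I would consider: subsetting with %in%, which keeps the original date-time class because it never rebuilds the vector.

```r
# Hypothetical sample data; tz = "UTC" is an assumption to keep the example stable
data1 <- as.POSIXct(c("1948-02-24", "1949-04-12", "1950-05-29"), tz = "UTC")
data2 <- as.POSIXct(c("1949-04-12", "1950-05-29", "1951-05-21"), tz = "UTC")

# Workaround from above: intersect the character representations,
# then convert back to dates
d1 <- format(data1, "%Y-%m-%d")
d2 <- format(data2, "%Y-%m-%d")
common_dates <- as.Date(intersect(d1, d2))

# Alternative: subsetting preserves the POSIXct class
common_idx <- unique(data1[data1 %in% data2])
```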

A.


Tom La Bone wrote:

I have two collections of dates and I want to figure out what dates they have
in common. This is not giving me what I want (I don't know what it is giving
me). What is the best way to do this?

Tom

  

data1


 [1] "1948-02-24 EST" "1949-04-12 EST" "1950-05-29 EDT" "1951-05-21 EDT"
 [5] "1951-12-20 EST" "1953-01-22 EST" "1955-02-28 EST" "1956-03-08 EST"
 [9] "1957-03-22 EST" "1958-02-07 EST"
  

data2


 [1] "1948-02-24 EST" "1949-04-12 EST" "1950-05-29 EDT" "1951-05-21 EDT"
 [5] "1951-12-20 EST" "1953-01-22 EST" "1955-02-28 EST" "1956-03-08 EST"
 [9] "1957-03-22 EST" "1958-02-07 EST"
  

intersect(data1,data2)


 [1] -689626800 -653943600 -618350400 -587505600 -569098800 -534625200
 [7] -468356400 -436042800 -403297200 -375476400





Re: [R] Table Modification

2009-01-23 Thread r...@quantide.com

If I understood properly:

> tapply(fact3, list(fact1, fact2) , paste, collapse = ",")

A.
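As a runnable sketch, here is the tapply() call applied to the ten rows printed in the question (entered by hand rather than sampled, so the result is reproducible). Empty cells come back as NA; replacing them with "-" is my own addition to mimic the hand-made table.

```r
# The ten rows shown in the question, typed in explicitly
fact1 <- factor(c("C", "A", "A", "C", "A", "A", "B", "B", "C", "A"))
fact2 <- factor(c("Z", "Y", "Y", "Z", "Z", "Y", "Y", "Y", "Z", "Y"))
fact3 <- letters[1:10]

# Paste together the fact3 values that fall into each fact1 x fact2 cell
res <- tapply(fact3, list(fact1, fact2), paste, collapse = ",")

# Empty cells are NA; show them as "-" as in the desired layout
res[is.na(res)] <- "-"
print(res, quote = FALSE)
```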

Derek Ogle wrote:

I am trying to construct a two-way table where, instead of printing the
two-way frequencies in the table, I would like to print the values of a
third variable that correspond to the frequencies.

 


For example, the following is easily constructed in R

 

  

fact1 <- factor(sample(LETTERS[1:3],10,replace=TRUE))
fact2 <- factor(sample(LETTERS[25:26],10,replace=TRUE))
fact3 <- letters[1:10]
data.frame(fact1,fact2,fact3)

   fact1 fact2 fact3
1      C     Z     a
2      A     Y     b
3      A     Y     c
4      C     Z     d
5      A     Z     e
6      A     Y     f
7      B     Y     g
8      B     Y     h
9      C     Z     i
10     A     Y     j

table(fact1,fact2)

     fact2
fact1 Y Z
    A 4 1
    B 2 0
    C 0 3

 


But I would like to create something like this (done physically by hand)
...

 


     fact2
fact1 Y        Z
    A b,c,f,j  e
    B g,h      -
    C -        a,d,i

 


Any help would be appreciated.  Thank you in advance.

 


For what it is worth,

 

  

Sys.info()

                     sysname                      release                      version
                   "Windows"                         "XP" "build 2600, Service Pack 2"

