[R] Merge data frames but prefer values in one

2009-09-10 Thread JiHO

Hello everyone,

My problem is better explained with an example:

> x=data.frame(a=1:4,b=1:4,c=rnorm(4))
> x
 a b  c
1 1 1 -0.8821089
2 2 2 -0.7082583
3 3 3 -0.5948835
4 4 4 -1.8571443
> y=data.frame(a=c(1,3),b=3,c=rnorm(2))
> y
  a b            c
1 1 3 -0.273155973
2 3 3  0.009517862

Now I want to merge x and y by columns a and b, hence creating a  
data.frame with all a:b combinations observed in x and y. That's  
easily done with merge:


> merge(x,y,by=c("a","b"),all=T)
  a b        c.x          c.y
1 1 1 -0.8821089           NA
2 1 3         NA -0.273155973
3 2 2 -0.7082583           NA
4 3 3 -0.5948835  0.009517862
5 4 4 -1.8571443           NA

But rather than two c columns I would want the merge to:
- keep the value in x if there is no corresponding value in y
- keep the value in y if there is no corresponding value in x
- prefer the value in y when the a:b combination exists in both x and y

So basically I want my result to look like:
 a b  c
1 1 1 -0.8821089
2 1 3 -0.2731559
3 2 2 -0.7082583
4 3 3  0.0095178
5 4 4 -1.8571443

I can't find a combination of options for merge that does this. Is
there another function that would do it, or do I have to resort to
some post-processing after merge? It seems it might be something
like a "right merge" in database terms, but I don't know that world at
all. I would be happy to look into sqldf if it allows that kind of
thing.
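
For reference, the kind of post-processing I could do after the merge above
is (a minimal sketch, assuming the merged columns are named c.x and c.y as
shown):

m = merge(x, y, by=c("a","b"), all=T)
# prefer the value from y, fall back on the value from x when y has none
m$c = ifelse(is.na(m$c.y), m$c.x, m$c.y)
m = m[, c("a","b","c")]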


Thanks in advance. Sincerely,

JiHO
---
http://maururu.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Merge data frames but prefer values in one

2009-09-14 Thread JiHO

On 2009-September-11  , at 13:55 ,  wrote:


Maybe:

do.call(rbind,
        lapply(with(xy <- rbind(x, y), split(xy, list(a, b), drop = TRUE)),
               tail, 1))


On Fri, Sep 11, 2009 at 3:45 AM, jo  wrote:
Thanks for the post-processing ideas. But is there any way to do that
in one step?


Thanks, but by "in one step" I meant within the merge, not in one
post-processing step ;)


JiHO
---
http://maururu.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Comparing spatial distributions - permutation test implementation

2009-05-20 Thread JiHO
x[idx, 3:4] = x[idx, 4:3]
# compute syrjala stat
return(syrjala.stat(x))
}, n, .progress="text")
}

# Compute the syrjala stat for the observations
psi = syrjala.stat(dataCod)

# Estimate the pvalue
pvalue = (sum(psis>=psi)+1)/nperm

psi
pvalue
# Should be:
# statistic = 0.224
# p-value   = 0.1900
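
For context, the general shape of the permutation loop is the following (a
rewritten sketch only: how idx is drawn is a guess, the data preparation
above is omitted, and I use plyr's raply here):

library("plyr")
nperm = 1000
# observed value of the statistic
psi = syrjala.stat(dataCod)
# null distribution: recompute the statistic on permuted data
psis = raply(nperm, {
	x = dataCod
	idx = sample(nrow(x), floor(nrow(x)/2))   # pick some observations at random
	x[idx, 3:4] = x[idx, 4:3]                 # and swap their two density columns
	syrjala.stat(x)
}, .progress="text")
# estimate the p-value as above
pvalue = (sum(psis >= psi) + 1) / nperm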

Thank you very much in advance. Sincerely,

JiHO
---
http://jo.irisson.free.fr/

[1] A statistical test for a difference between the spatial  
distributions of two populations. Syrjala SE. Ecology. 1996;77(1):75–80.

http://dl.getdropbox.com/u/1047321/Syrjala1996.pdf

[2] https://stat.ethz.ch/pipermail/r-sig-geo/2008-February/thread.html#3137


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-Geo] Comparing spatial distributions - permutation test implementation

2009-05-21 Thread JiHO

On 2009-May-21  , at 05:40 , Marcelino de la Cruz wrote:

Jose M. Blanco-Moreno and myself have implemented Syrjala's test in
the ecespa package. As a matter of fact, Jose M. has implemented a
Fortran version of Syrjala's original QBasic function. Using your
data, the results are very close to your reported statistic = 0.224
and p-value = 0.1900.


Thanks a lot. Having the Fortran implementation is very nice!
I still cannot reproduce the values computed with the QuickBASIC code,
but the values reported in the article for this data set are different
from those computed with QB and are actually closer to yours.


Looking at your code I still don't get what I am doing wrong though.  
It seems we both use sample to get a few of the columns and then swap  
them. Well, I'll use your test anyway.


JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cream Text Editor

2009-05-23 Thread JiHO

On 2009-May-23  , at 17:40 , Paul Heinrich Dietrich wrote:

I'm interested in easing my way into learning VIM by first using the Cream
text editor, liking the idea that it will work on both my Linux and Windows
computers.  I've installed Cream on my Linux machine, but can't figure out
how to make Cream talk to R?  Does anybody know?  I'm using Ubuntu if it
makes a difference.  Thanks.


You should install the R Vim Plugin and its dependencies:
http://www.vim.org/scripts/script.php?script_id=2628
This creates commands and icons dedicated to the interaction between  
Vim and R.



Then switch Cream to expert mode: Settings > Preferences > Expert Mode.

This will allow you to work in Cream and keep all the simple keyboard
shortcuts (Ctrl-C, Ctrl-V, etc.) but still be able to switch between
modes as in Vim. By default you are in insert mode. You need to switch
to normal mode (by pressing Esc) to be able to use the commands of the
R-Vim plugin.


The workflow is therefore:
- open an R file
- edit stuff
- press ESC (to switch to non-edit mode)
- start R in a terminal (click the icon or press F2)
- send lines/selection (F9) or document (F5)
- press ESC (to switch back to insert mode)
- edit 2 lines
- ESC
- F9
- F9
- ESC
- edit again
etc...

The terminal opened this way does not behave exactly like a regular one
and there are some caveats when reading help pages and using general
command-line editing shortcuts (Ctrl-R for backward search, for example).
I haven't found a way around them, so I usually open a second terminal
to read the help in, or set R to display the help as HTML files in a
browser window.


I must say that those caveats can be quite serious and I often find
myself just copy-pasting from gedit into a terminal:

- set your desktop to "focus follows mouse"
- select text in your editor
- move the mouse to the terminal
- click the middle mouse button
- move the mouse back to the editor
etc...
More cumbersome, but reliable.

Final note: since you are on Ubuntu, you may want to change the
terminal from the default xterm to gnome-terminal. You have to edit
the file .vim/ftplugin/r.vim: there is a commented-out line with the
gnome-terminal command instead of xterm. Uncomment that one and
comment out the xterm one.


JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cream Text Editor

2009-05-23 Thread JiHO

On 2009-May-23  , at 20:16 , Jakson Alves de Aquino wrote:


Just a note: there is no need of Esc before F9. Almost all key
bindings work in insert, normal and visual modes.


Well, without switching to the non-insert mode, I find that pressing  
F9 prints the commands in the file instead of executing them. Maybe  
that's specific to Cream.


JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Converting Matrix into List - problem (urgent)

2009-03-22 Thread JiHO

On 2009-March-22  , at 12:26 , Nidhi Kohli wrote:

I want to remove the Column name and Row name from the above output.  
Any help on this will be greatly appreciated (I'm open to any other  
alternative way to convert Matrix into List also)


What are you trying to achieve exactly? Do you just want to print
clean output on the screen? If so, look at `cat` or `print`.
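
If the goal is just to drop the names themselves, something like this may be
all you need (a sketch on a made-up matrix m, since I do not know your exact
object):

m <- matrix(1:4, nrow=2, dimnames=list(c("r1","r2"), c("c1","c2")))
dimnames(m) <- NULL   # or: m <- unname(m)
m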


JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 'require' equivalent for local functions

2009-03-22 Thread JiHO

Hello everyone,

I often create local "libraries" of functions (.R files containing only
functions) that I later call. In scripts that call a function from such
a library, I would like to be able to test whether the function is
already known in the namespace and, only if it is not, source the
library file. I.e. what `require` does for packages, I want to do with
my local functions.


Example:

lib.R:
foo <- function(x) { x*2 }

script.R:
require.local(foo, "lib.R")
# that searches for function "foo" and, if not found, executes source("lib.R")
foo(2)

Obviously, I want the test to be quite efficient, otherwise I might as
well source the local library every time. I am aware that it would
probably not be able to check for changes in lib.R (i.e. do complicated
things such as re-sourcing lib.R if foo in the namespace and foo in
lib.R differ), but that I can handle manually.


This seems like a common enough workflow but I cannot find a
pre-existing solution. Does anyone have pointers?


Otherwise I tried to put that together:

require.local <- function(fun, lib)
#
#   Searches for function "fun" and sources "lib" in
#   case it is not found
#
{
	if (! (deparse(substitute(fun)) %in% ls(".GlobalEnv") && class(fun) == "function") ) {
		cat("Sourcing", lib, "...\n")
		source(lib)
	}
}

but I am really not confident with all those deparse/substitute things  
and the environment manipulation, so I guess there should be a better  
way.
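
For comparison, a shorter variant I considered, which looks the function up
by name with exists() (but then the function must be passed as a string):

require.local <- function(fun.name, lib)
{
	if (!exists(fun.name, mode="function")) {
		cat("Sourcing", lib, "...\n")
		source(lib)
	}
}
require.local("foo", "lib.R")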


JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 'require' equivalent for local functions

2009-03-23 Thread JiHO
Thanks very much to everyone. I think I will use a combination of both  
techniques.


On 2009-March-22  , at 20:08 , Duncan Murdoch wrote:

That's pretty hard to make bulletproof.  Why not just put those  
functions in a package, and use that package?



I know it will be impossible to make this bulletproof and efficient at
the same time. However, my functions are pretty specific to each
project, have long names, and do not collide with variable names
(because I use dots in function names but camel case in variable names),
so just looking for the name should be OK. Plus, I have a simple
keyboard shortcut in my text editor to source the current file in the
current R session, so it will be easy to re-source some files after I
modify them.


On the other hand, I have a bundle of general-enough functions that I
import in many projects (http://github.com/jiho/r-utils/ for those who
might be interested in netCDF data handling, 2D arrow fields and
ggplot2 stuff) and this one is a good candidate for being turned into a
package.


So thanks again to everyone. Sincerely,

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Evaluating content of command line arguments

2009-04-30 Thread JiHO

Hello everyone,

I am writing an R script which will be given command line arguments by
a shell script. I read them with commandArgs() but I have trouble
making that foolproof.


For example, with test.R containing:

args=commandArgs(trailingOnly=TRUE)
args

I get:

$ Rscript test_commandLineArgs.R 3 4 5
[1] "3" "4" "5"

Now, to make the input more flexible and robust, I want to pass  
*named* arguments, so that the R script does not depend on the order  
of arguments passed and can check which are present/absent etc. I.e.


$ Rscript test_commandLineArgs.R foo=3 bar=4 5
[1] "foo=3" "bar=4" "5"

But I am stuck on how to actually execute the code within those
strings so as to get two variables, foo and bar, equal to 3 and 4
respectively. I looked at eval, deparse, substitute and all that but
did not find anything. Is that possible?


Otherwise I will resort to parsing the arguments with strsplit, but I
would much prefer a more elegant solution.
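
For the record, the strsplit fallback I have in mind looks roughly like this
(a rough sketch only):

args = c("foo=3", "bar=4")   # what commandArgs(trailingOnly=TRUE) returns
pieces = strsplit(args, "=")
vals = lapply(pieces, function(p) p[2])
names(vals) = sapply(pieces, function(p) p[1])
vals$foo   # "3" (still a string, to be converted with as.numeric etc.)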


Thank you very much in advance. Sincerely,

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Evaluating content of command line arguments

2009-05-01 Thread JiHO

On 2009-April-30  , at 22:28 , Gabor Grothendieck wrote:


Check out the getopt package.



Thanks! That's what I need.
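
For the archives, the kind of getopt spec I ended up with looks roughly like
this (a sketch; option names come from my example above, details untested here):

library("getopt")
# columns of the spec: long name, short flag, argument mask (1 = required), type
spec = matrix(c(
	"foo", "f", 1, "double",
	"bar", "b", 1, "double"
), byrow=TRUE, ncol=4)
opt = getopt(spec)   # parses commandArgs(TRUE) by default
# called as: Rscript test.R --foo 3 --bar 4
opt$foo
opt$bar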

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] quartz() and dpi

2010-01-27 Thread JiHO
Hello all,

I am using quartz (on OS X obviously) to produce PDFs and PNGs from my
plots, for later inclusion in LaTeX.

I am typically using something like:

plot(0)
dev.print(quartz, file="foo.pdf", width=5, height=3)
dev.print(quartz, file="foo.png", width=5, height=3, dpi=72)

I want the sizes of the PDF and PNG to be *equal* in *inches*, which
works with dpi=72. However, when I increase the dpi parameter, instead
of producing an image of the same size with increased resolution, it
creates a larger image of resolution = 72. E.g. try

dev.print(quartz, file="foo-72.png", width=5, height=3, dpi=72)
dev.print(quartz, file="foo-300.png", width=5, height=3, dpi=300)
system("open -a Preview.app foo-*.png")

The inspector in Preview should show 72 dpi for both files. This is with:

> sessionInfo()
R version 2.10.1 (2009-12-14)
x86_64-apple-darwin9.8.0

Is this a known bug/limitation? Is a solution planned? Is there a
workaround for now?

As a final note, I am aware that PDF is superior to PNG, particularly
in a LaTeX workflow; but for particularly complex plots I sometimes
fall back on high-resolution PNGs. Currently this forces me to add a
'scale' argument to \includegraphics in LaTeX for those. I would rather
leave the LaTeX document alone, use extension-less file names in
\includegraphics, and decide from R whether to produce a PDF or a PNG.

Thank you in advance,

JiHO
---
http://maururu.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] quartz() and dpi

2010-01-28 Thread JiHO
> As far as I can tell the Cairo device (CairoPNG) doesn't respect the
> size either. It looks like your best option is to switch between pdf()
> and png(type="cairo") using a wrapper like ggplot2::ggsave.

Using a wrapper function is exactly what I am doing (because I need it
to save the occasional base R plot, so ggsave won't work all the time).
However, after extensively testing the different devices, quartz is
the most desirable even for PNGs: png(type="cairo") messes up the
fonts. Helvetica is used by default and the X11 part of OS X (the
pango library specifically) cannot properly read the version of
Helvetica provided by the system (in many applications, the normal
variant actually renders as bold). cairo uses pango, so the fonts in
the PNG don't look as good as in quartz.
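
For the record, the skeleton of my wrapper is roughly the following (a
simplified sketch written with the pdf and png devices; the real one uses
quartz as discussed, and the details are untested here):

# choose the output device from the file extension
save.plot <- function(file, width=5, height=3, dpi=300) {
	ext = tools::file_ext(file)
	if (ext == "pdf") {
		dev.print(pdf, file=file, width=width, height=height)
	} else if (ext == "png") {
		dev.print(png, filename=file, width=width, height=height,
		          units="in", res=dpi)
	} else {
		stop("unsupported extension: ", ext)
	}
}
plot(0)
save.plot("foo.pdf")
save.plot("foo-300.png", dpi=300)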

So basically I get either the wrong size but correct fonts, or the
wrong fonts but correct size. For now I chose the fonts over the size,
since the size can be corrected afterwards and the fonts cannot. But of
course I would rather not have to choose ;)

JiHO
---
http://maururu.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Solving Classification problems in R

2014-02-28 Thread JiHO
Do you mean supervised or unsupervised classification?

If supervised, I have had great success using gradient boosted
classification in package gbm. The multinomial distribution will get
you multiple classes and it will select relevant predictors by itself,
given the training data.

Not sure about the customized cost functions.
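
For what it's worth, the kind of call I mean is sketched below (the train and
test objects and the tuning values are placeholders, not tested here):

library("gbm")
# train has a factor column 'class' plus the predictors
fit <- gbm(class ~ ., data=train, distribution="multinomial",
           n.trees=1000, interaction.depth=3, shrinkage=0.01, cv.folds=5)
best <- gbm.perf(fit, method="cv")   # number of trees chosen by cross-validation
summary(fit)                         # relative influence of each predictor
pred <- predict(fit, newdata=test, n.trees=best, type="response")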

Jean-Olivier Irisson
—
Université Pierre et Marie Curie
Laboratoire d'Océanographie de Villefranche
181 Chemin du Lazaret 06230 Villefranche-sur-Mer
Tel: +33 04 93 76 38 04
Mob: +33 06 21 05 19 90
http://www.obs-vlfr.fr/~irisson/
Send me large files at: http://www.obs-vlfr.fr/~irisson/upload/

On Fri, Feb 28, 2014 at 5:53 PM, Sergio Fonda  wrote:
> Focus on MASS, CCA and e1071 packages
> Brgds,
> Sergio
> Il 28/feb/2014 17:47 "Luca Cerone"  ha scritto:
>
>> Dear all,
>> I would like some advices on R packages to solve classification problems.
>> I have tried to search among the Task views, but couldn't find anything.
>>
>> Can somebody recommend me some packages?
>>
>> Some of the features I am looking for:
>> - deal with multiple classes
>> - use customized cost functions
>> - perform features/predictors selection
>>
>> Any hint would be greatly appreciated,
>> thanks a lot in advance for the help!
>> Cheers,
>> Luca
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data Rearrangement

2014-02-28 Thread JiHO
You actually want cast() from the reshape package, not melt().
I would recommend using dcast() from reshape2, the newer version of reshape.

Jean-Olivier Irisson
—
Université Pierre et Marie Curie
Laboratoire d'Océanographie de Villefranche
181 Chemin du Lazaret 06230 Villefranche-sur-Mer
Tel: +33 04 93 76 38 04
Mob: +33 06 21 05 19 90
http://www.obs-vlfr.fr/~irisson/
Send me large files at: http://www.obs-vlfr.fr/~irisson/upload/

On Fri, Feb 28, 2014 at 9:41 AM, PIKAL Petr  wrote:
> Hi
>
> Did you try what I had suggested? In what aspect it does not fulfil your 
> request?
>
> Petr
>
> From: dila radi [mailto:dilarad...@gmail.com]
> Sent: Friday, February 28, 2014 9:25 AM
> To: PIKAL Petr
> Subject: Re: [R] Data Rearrangement
>
> Dear Petr,
>
> I'm so sorry for the inconvenience caused by the data.
>
> I want to rearrange the data in this form:
>
> Year   Day Month Amount
> 2012 1 1   0.0
> 2012 2 1   0.0
> ..
> 2012 8 1  ..
> 2012 1 2  
> 2012 2 2 ..
> ..
> 2012 8 2   .
>
>
> I want it year by year (all of 2012 from Jan-Apr, followed by all of 2013 from
> Jan-Apr). Could you help me?
>
> Regards,
>
> Dila
>
> On 27 February 2014 23:24, PIKAL Petr 
> mailto:petr.pi...@precheza.cz>> wrote:
> Hi
>
> Your data came scrambled because, contrary to advice, you posted in HTML. So it
> is just a guess, but maybe you want
>
> library(reshape)
> melt(dat, id=c("Year", "Day"))
>
> Petr
>
>> -Original Message-
>> From: r-help-boun...@r-project.org 
>> [mailto:r-help-bounces@r-
>> project.org] On Behalf Of dila radi
>> Sent: Friday, February 28, 2014 4:25 AM
>> To: r-help@r-project.org
>> Subject: [R] Data Rearrangement
>>
>> Hi all,
>>
>> I know this is easy, but I really do not have any idea to solve it.
>>
>> I have this kind of data set:
>>
>> dat <- read.table(text="Day Year Jan Feb Mar Apr
>>   1 2012 0 2.5 0.5 2
>>   2  2012 0 6.5 0 29
>>   3  2012 0 9.5 0 0
>>   4  2012 0 0 8 0.5
>>   5  2012 0 5 0.5 110.5
>>   6  2012 0 4 3.5 22
>>   7  2012 11 0 12.5 3.5
>>   8  2012 0 5 8 36.5
>>   1  2013 0 2.5 0.5 2
>>   2  2013 0 6.5 0 29
>>   3  2013 0 9.5 0 0
>>   4  2013 0 0 8 0.5
>>   5  2013 0 5 0.5 110.5
>>   6  2013 0 4 3.5 22
>>   7  2013 11 0 12.5 3.5
>>   8  2013 0 5 8 36.5",sep="",header=TRUE)
>>
>> and I want it to be in this form:
>>
>> Year Day Month Amount
>> 2012 1 1 0
>> 2012 2 1 0
>> 2012 3 1 0
>> 2012 4 1 0
>> 2012 5 1 0
>> 2012 6 1 0
>> 2012 7 1 11
>> 2012 8 1 0
>> 2012 1 2 2.5
>> 2012 2 2 6.5
>> 2012 3 2 9.5
>> 2012 4 2 0
>> 2012 5 2 5
>> 2012 6 2 4
>> 2012 7 2 0
>> 2012 8 2 5
>> 2012 1 3 0.5
>> 2012 2 3 0
>> 2012 3 3 0
>> 2012 4 3 8
>> 2012 5 3 0.5
>> 2012 6 3 3.5
>> 2012 7 3 12.5
>> 2012 8 3 8
>> 2012 1 4 2
>> 2012 2 4 29
>> 2012 3 4 0
>> 2012 4 4 0.5
>> 2012 5 4 110.5
>> 2012 6 4 22
>> 2012 7 4 3.5
>> 2012 8 4 36.5
>> 2013 1 1 0
>> 2013 2 1 0
>> 2013 3 1 0
>> 2013 4 1 0
>> 2013 5 1 0
>> 2013 6 1 0
>> 2013 7 1 11
>> 2013 8 1 0
>> 2013 1 2 2.5
>> 2013 2 2 6.5
>> 2013 3 2 9.5
>> 2013 4 2 0
>> 2013 5 2 5
>> 2013 6 2 4
>> 2013 7 2 0
>> 2013 8 2 5
>> 2013 1 3 0.5
>> 2013 2 3 0
>> 2013 3 3 0
>> 2013 4 3 8
>> 2013 5 3 0.5
>> 2013 6 3 3.5
>> 2013 7 3 12.5
>> 2013 8 3 8
>> 2012 1 4 2
>> 2012 2 4 29
>> 2012 3 4 0
>> 2012 4 4 0.5
>> 2012 5 4 110.5
>> 2012 6 4 22
>> 2012 7 4 3.5
>> 2012 8 4 36.5
>> 2013 1 4 2
>> 2013 2 4 29
>> 2013 3 4 0
>> 2013 4 4 0.5
>> 2013 5 4 110.5
>> 2013 6 4 22
>> 2013 7 4 3.5
>> 2013 8 4 36.5
>> I want to rearrange the data according to the YEAR (year by year)
>>
>> Thank you.
>>
>> Regards,
>> Dila
>>
>>   [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-
>> guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 

[R] Plotting a cloud/fog of variable density in rgl

2010-11-22 Thread JiHO
Hi everyone,

I want to plot a 3D interpolation of the concentration of aquatic
organisms. My goal would be to have the result represented as clouds
with a density proportional to the abundance of organisms, so that I
could fly (well, swim actually ;) ) through the scene and see the
patches here and there. Basically, I want to do something like this:
http://www.youtube.com/watch?v=27mo_Y-aU-c
but simpler and with only clouds.

I thought about doing it this way:
1- interpolate to a fine grid
2- plot points at each grid intersection with transparency inversely
proportional to abundance
3- blur/fog each point a bit to create the general impression of a cloud

So far I am stuck on 3 but maybe there is a better overall solution.
Here is some code that reads the result of the interpolation on a
coarse grid and plots it:

# read a set of gridded data points in 3D
d = read.table("http://dl.dropbox.com/u/1047321/R/test3Ddata.txt", header=T)

# plot
library("rgl")
spheres3d(d$x, d$y, d$z, alpha=alpha, radius=0.05)

And here is a version that actually performs the interpolation of a
random set of points in 3D through kriging, in case you want to try
with increased precision.

# create a set of random data points in 3D
n = 50
data3D = data.frame(x = runif(n), y = runif(n), z = runif(n), v = rnorm(n))

# do 3d interpolation via kriging
library("gstat")
coordinates(data3D) = ~x+y+z
range1D = seq(from = 0, to = 1, length = 10)
grid3D = expand.grid(x = range1D, y = range1D, z = range1D)
gridded(grid3D) = ~x+y+z
res3D = krige(formula = v ~ 1, data3D, grid3D, model = vgm(1, "Exp", .2))

# convert the result to a data.frame
d = as.data.frame(res3D)

# compute transparency (proportional to the interpolated value)
maxD = max(d$var1.pred)
minD = min(d$var1.pred)
alpha = (d$var1.pred - minD)/(maxD - minD)
# reduce maximum alpha (all points are semi-transparent)
alpha = alpha/5

# plot
library("rgl")
spheres3d(d$x, d$y, d$z, alpha=alpha, radius=0.05)


I saw the fog effect but it seems to add fog to the scene to increase
depth perception. What I want is for the scene itself to actually look like fog.

Thanks in advance for any help. Sincerely,

JiHO
---
http://maururu.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting a cloud/fog of variable density in rgl

2010-11-28 Thread JiHO
Thanks for both the replies.

>> I think you can come closest to what you want within rgl by using
>> sprites rather than rendering transparent spheres. See
>> example(sprites3d).

Sprites helps a lot indeed. With enough transparency I am close to what I want.

> If you only have 2 things with simple properties, namely point emitters
> as your organisms and a uniform concentration of transparent scatterers (the
> fog), you can probably derive geometrical optics expressions for the ray trace
> results and just integrate those over your source distribution. This should
> be reasonably easy in R. I haven't been to SIGGRAPH since 1983 so can't help
> much, but you can probably find analytical solutions for fog on Google
> and just sum up your source distribution. I guess you could even do some
> wave optics etc., as presumably the fog could be done as a function
> of wavelength just as easily. In any case, if you only have two basic things
> with simple distributions it should be reasonably easy to do in R with your own code.

I am afraid this is a bit too advanced for me. I know next to nothing
about digital 3D imaging and, even if I could compute this, I am
not sure how I would plot the result. My goal is really to add this to
my R plotting arsenal and use it routinely, not to develop something
very specific for this particular application. But thank you for
taking the time to reply; maybe I'll come back to this when I know
more.

JiHO
---
http://maururu.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Marginality rule between powers and interaction terms in lm()

2011-01-21 Thread JiHO
Dear all,

I have a model with simple terms, quadratic effects, and interactions.
I am wondering what to do when a variable is involved in a significant
interaction and in a non-significant quadratic effect. Here is an
example:

d = data.frame(a=runif(20), b=runif(20))

d$y = d$a + d$b^2

So I create both a simple effect of a and a quadratic effect of b.

m = lm(y ~ a + b + I(a^2) + I(b^2) + a:b, data=d)
drop1(m)
...
       Df Sum of Sq      RSS      AIC
<none>                 0.00 -1487.56
I(a^2)  1      0.00     0.00 -1482.04
I(b^2)  1  0.098444 0.098444   -96.28
a:b     1      0.00     0.00 -1488.37

Here R cleverly shows that I can drop a:b or either quadratic term
(suggesting that they have equal marginality?) but not the simple terms,
since those are marginal to the quadratic or interaction terms. At
this point the interaction is not significant so the situation is
simple: drop a:b, then drop a^2, and then stop.

Now let's add an interaction

d[d$b > 0.5, "y"] = d[d$b > 0.5, "y"] + 0.01*d[d$b > 0.5, "a"]

m = lm(y ~ a + b + I(a^2) + I(b^2) + a:b, data=d)
summary(m)
...
(Intercept) -3.275e-04  1.585e-03  -0.207  0.83932
a            9.988e-01  5.839e-03 171.070  < 2e-16 ***
b           -1.613e-04  5.492e-03  -0.029  0.97698
I(a^2)      -6.515e-05  5.159e-03  -0.013  0.99010
I(b^2)       1.001e+00  4.892e-03 204.593  < 2e-16 ***
a:b          1.191e-02  3.221e-03   3.698  0.00238 **

Now the interaction *is* significant, but a^2 still isn't. drop1()
still suggests that I can remove either the interaction or the
quadratic terms:

drop1(m)
...
       Df Sum of Sq      RSS      AIC
<none>                 0.33 -254.306
I(a^2)  1      0.00     0.33 -256.306
I(b^2)  1  0.098611 0.098644  -96.239
a:b     1      0.32     0.65 -242.674

However, this: http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf
suggests that marginality rules between powers of variables might not
be implemented (although they might have been since 2000).

My question is: am I "allowed", according to marginality rules, to remove a^2?

I have found plenty of information on how the coefficients
corresponding to simple terms change meaning when a quadratic term or
an interaction is involved, and why they should not be removed in most
circumstances. I haven't found anything related to quadratic vs.
interaction terms.

Thanks in advance for your help. Sincerely,

JiHO
---
http://maururu.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Confidence interval around a mean count (poisson based?)

2010-05-05 Thread JiHO
Hello all,

I am observing animals in a behavioural arena and recording their
distance from a specific point at regular time intervals (large enough
that I can assume two successive positions are independent of each
other). Each animal provides a complete histogram of distances
which reflects its trajectory in the arena. I repeat those
observations with several animals in two scenarios and I want to
describe the distribution of distances in each treatment.

I computed the mean histogram per treatment: per bin, I count the
number of distances falling in the bin for each animal and then
average this count over all animals, within treatment. Now I want to
represent the variability around this average count and compute a
confidence interval.

The data is counts so, unsurprisingly, it is not normal. I have fewer
than 30 animals in each treatment so I cannot assume that the mean
would be normally distributed. The means indeed look
Poisson-distributed, as the counts are. I tried to find ways to
compute confidence intervals for Poisson processes but everything I
came up with in R (poi.ci in NCStats, exactci in the package of the
same name) requires integers. This makes sense for raw counts but
makes me wonder whether what I am trying to do is really "right" with
those means (which are floats).

I understand that this is more a statistics question than a R question
but I am a bit stumped and would appreciate any help I can get from
the experts on this list. Thank you very much in advance.

PS: what I did so far was just compute mean +/- SD. The result is here:
http://dl.dropbox.com/u/1047321/hist_mean-SD.pdf
Maybe the SD is already so large that it is not even worth trying to
pursue my goal above...

Sincerely,

JiHO
---
http://maururu.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read.table or read.csv without row index?

2010-05-05 Thread JiHO
> I tried as.matrix but it did not help.

as.matrix() won't work because a matrix requires everything in it to
be of the same type (number, character, logical, etc.). You do not have
only numbers in your data.frame, so it will convert everything to
character strings. If you try as.matrix(temp[,-1]) it should work
(assuming you only have characters in the first column; otherwise
remove all non-numeric columns).

But what you really want is to get around the fact that, on a
data.frame, mean works column-wise. In fact, when you call mean() on a
data.frame it calls mean.data.frame(), whose code you can see by
typing its name at the prompt:

> mean.data.frame
function (x, ...)
sapply(x, mean, ...)


And indeed, it uses sapply() to apply the function mean to each
column. You could have tried:

mean.default(temp[2,2:3])

but this returns an error because it needs a numeric vector and does
not automatically convert your data.frame into one. So you need:

mean.default(as.numeric(temp[2,2:3]))

or more simply

mean(as.numeric(temp[2,2:3]))

I hope that helped. Sincerely,

JiHO
---
http://maururu.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Class htest with non-numeric p-values

2011-04-20 Thread JiHO
Hi everyone,

For some tests, tables of critical values are readily available while
the algorithm to compute those critical values is not. In such cases,
only a range for the p-value can be given (e.g. 0.05 < p < 0.1)
from the table of critical values and the statistic.

Is there any way to include that in an object of class "htest"? I see
in print.htest() that the p.value element goes through format.pval(),
which expects a numeric argument. Is there any workaround?

Thank you in advance. Sincerely,

JiHO
---
http://maururu.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Heatmap in R and/or ggplot2

2011-06-15 Thread JiHO
On Tue, Jun 14, 2011 at 19:56, idris  wrote:
> Follow up question: My data contains x, y, height, and day.
>
> I want to create the heatmap for each day, keeping the color coding
> consistent.
>
> I've created an example with 2 days, and you can see the charts below.
>
> Notice that the legend changes from day 1 to day 2. How can I make the
> legend consistent?
>
> Also, my real data contains hundreds of days. Is there a way to create a
> 'movie' or a sequence of the heatmaps in chronological order of days?
>
> The way my code works now is obviously very naive as it just repeats the
> same code for day 1 and day 2. Is there a better way to do this?

legend: use the "limits" argument of scale_fill_gradientn and set it to
the max and min of height across all days

movie: there is no way to do that in R alone (that I know of). You can:
1- use the jpeg or png device with a file name including a special code
which increases every time a new plot is produced (this is the default;
see the sketch after this list)
2- plot the successive plots in a for loop
3- turn the sequence of images into a movie using specialized software.
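
For steps 1 and 2, a sketch assuming a data.frame dat with columns x, y,
height and day (as in your description; untested here):

library("ggplot2")
# one PNG per day, numbered automatically by the %03d in the file name
png("heatmap_%03d.png", width=600, height=500)
for (d in sort(unique(dat$day))) {
	print(
		ggplot(dat[dat$day == d, ], aes(x=x, y=y, fill=height)) +
		geom_tile() +
		scale_fill_gradientn(colours=heat.colors(10),
		                     limits=range(dat$height))
	)
}
dev.off()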

For step 3 you could use QuickTime on Mac OS X or mencoder on Linux/OS
X. I don't know about Windows.

Since mencoder works on the command line, you can call it from R and I
have code to ease that:
https://gitorious.org/r/r-utils/blobs/master/lib_movie.R
but you should get familiar with mencoder a little bit before trying
to read/understand it.

JiHO
---
http://maururu.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 2/3d interpolation from a regular grid to another regular grid

2007-12-04 Thread jiho
Hello R users,

I have numerical data sampled on two grids, each one shifted by 0.5  
from the other.

For example:

grid1 = expand.grid(x=0:3, y=0.5:2.5)
grid2 = expand.grid(x=0.5:2.5, y=0:3)
gridFinal = expand.grid(x=0.5:2.5, y=0.5:2.5)

plot(gridFinal, xlim=c(0,3), ylim=c(0,3), col="black", pch=19)
points(grid1, xlim=c(0,3), ylim=c(0,3), col="red", pch=19)
points(grid2, xlim=c(0,3), ylim=c(0,3), col="blue", pch=19)

I would like to interpolate the quantities on grid1 (red) and grid2
(blue) onto the same grid (black). This scenario is very common with
geophysical data and models. I only found:
- functions in package akima, which are designed for irregular grids
- krigging in package fields, which also requires irregularly spaced data
- approx or spline, which work in 1D and which I could apply line by
line and column by column, using the mean of both estimates
I am sure there are plenty of functions already available to do this
but searching R-help and the package sites did not help. A pointer to a
function/package would be highly appreciated.

Eventually, the same scenario will occur in 3D, so if the function is
3D-capable it would be a plus (but I am sure the solution to this is
generic enough to work in nD).

Thank you in advance.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 2/3d interpolation from a regular grid to another regular grid

2007-12-05 Thread jiho
On 2007-December-05  , at 16:47 , Scionforbai wrote:
>> I just read the description in ?Krig in the package fields which
>> says: "Fits a surface to irregularly spaced data."
>
> Yes, that is the most general case. Regular data locations are a subset
> of irregular ones. Anyway, it is kriging, with just one g, after Danie
> Krige, the South African statistician who first applied such methods
> to mining surveys.

ooops. sorry about the typo.

>> My problem is simpler
> ...
>> So it is really purely numerical.
> ...
>> I just hoped that R had that already coded ...
>
> Of course R has ... ;) If your grids are really as simple as the
> example you posted above, and you have really little variability,
> all you need is a "moving average", the arithmetic mean of the two
> nearest points belonging to grid1 and grid2 respectively. I assume
> that your regularly shaped grids are values stored in matrix objects.
>
> The function comes from the "diff.default" code (downloading the R
> source code, I assure you, is worth it):

I can imagine it is indeed. I use the source of package functions
very often.

> my.interp <- function(x, lag = 1)
> {
> r <- unclass(x)  # don't want class-specific subset methods
> i1 <- -1:-lag
> r <- (r[i1] + r[-length(r):-(length(r)-lag+1)])/2
> class(r) <- oldClass(x)
> return(r)
> }
>
> Finally,
>
> g1 <- apply(grid1val,1,my.interp)
> g2 <- apply(grid2val,2,my.interp)
>
> give the interpolations on gridFinal, provided that all gridFinal
> points are within the grid1 and grid2 ones.
>
> If you want the mean from 4 points, you apply once more with lag=3,
> cbind/rbind columns/rows of NAs to the result, and you calculate the
> mean of the points of the two matrices.
> This is the simplest (and quickest) moving average that you can do.
> For more complicated examples, and for 3d, you have to go a little
> further, but the principle holds.

Thanks very much. I'll test this soon (and it looks like the vector
operation might even be directly translatable into Fortran, which is
nice since I'll need to do it in Fortran too).

Thanks again.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 2/3d interpolation from a regular grid to another regular grid

2007-12-05 Thread jiho
On 2007-December-04  , at 21:38 , Scionforbai wrote:
>> - krigging in package fields, which also requires irregularly spaced
>> data
>
> That kriging requires irregularly spaced data sounds new to me ;) It
> cannot be, you misread something (I feel free to say that even if I
> never used that package).

Of Krigging I only know the name and general intent, so I gladly defer
to your opinion.
I just read the description in ?Krig in the package fields, which says:
"Fits a surface to irregularly spaced data."
But there are probably other Krigging methods I overlooked.

> It can be tricky doing kriging, though, if you're not comfortable with
> a little bit of geostatistics. You have to infer a variogram model for
> each data set; you possibly run into non-stationarity or anisotropy,
> which are indeed very well treated (maybe at best) by kriging in one
> of its forms, but ... it takes more than this list to help you then;
> basically kriging requires modelling, so it is often very difficult to
> set up an automatic procedure. I can recommend kriging if the spatial
> variability of your data (compared to grid refinement) is quite
> important.

This was the impression I had too: that Krigging is an art in itself
and that it requires you to know a lot about your data. My problem is
simpler: the variability is not very large between grid points (it is
oceanic current velocity data, so it is highly spatially
auto-correlated) and I can get grids fine enough for the variability to
be low anyway. So it is really purely numerical.

> In other simple cases, a weighted mean using the (squared) inverse of
> the distance as weight and a spherical neighbourhood could be the
> simplest way to perform the interpolation.

Yes, that would be largely enough for me. I had C routines for 2D
polynomial interpolation of similar cases and low-order polynomials
gave good results. I just hoped that R had this already coded
somewhere in a handy and generic function rather than having to
recode it myself in a probably highly specialized and not reusable
manner.
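
For the record, the kind of inverse-distance-squared weighting mentioned
above could look like this (a quick sketch, not something I have in use):

# interpolate values v known at points (x, y) onto points (gx, gy), using
# inverse-squared-distance weights within a circular neighbourhood
idw <- function(gx, gy, x, y, v, radius=1) {
	sapply(seq_along(gx), function(i) {
		d2 <- (x - gx[i])^2 + (y - gy[i])^2   # squared distances to data points
		keep <- d2 <= radius^2                # spherical neighbourhood
		w <- 1 / pmax(d2[keep], .Machine$double.eps)
		sum(w * v[keep]) / sum(w)
	})
}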

Thank you very much for your answer, and if someone knows a function
doing what is described above, that would be terrific.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with graphics device in Mac OS X

2007-12-10 Thread jiho
I think you should post this to R-Sig-Mac.
I don't notice any problem at all on my system with the same  
configuration.

On 2007-December-10  , at 21:37 , WAYNE KING wrote:

> Hello List,
>   I am teaching a basic course where students are encouraged to use
> R. There are a few students using Mac OS X. As a test we downloaded
> and installed the latest .dmg file (R-2.6.1.dmg) onto an Intel Mac
> running 10.5.1. A device query yields
>
>> getOption("device")
> "quartz"
>
> But any plot command does not bring up a plot (e.g. plot(),  
> boxplot(), hist()).
>
> I found a thread concerning X11 windows under Mac OS X but I feel  
> these users will most likely be just using the native quartz device.
>
> Invoking a call to quartz() first does not seem to help, e.g.
>
>> quartz()
>> plot(rnorm(100,0,1))
>
> produces no output and no error message (Nothing happens). A call to  
> dev.cur() seems to indicate a device is active.
>> quartz()
>> dev.cur()
> quartz
> 2
>
> but again a plot command produces no figure. Sorry, I am not a Mac OS
> user; I did check the archives but found mostly discussions of X11()
> under Mac OS X.
>
> Wayne
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] use ggplot in a function to which a column name is given

2007-12-13 Thread jiho
Hi everyone, Hi ggplot users in particular,

ggplot makes it very easy to plot things given their names when you  
use it interactively (and therefore can provide the names of the  
columns).
qplot(x,foo,data=A) where A has columns (x,y,foo,bar) for example

but I would like to use this from inside a function to which the name  
of the column is given. I cannot find an elegant way to make this  
work. Here are my attempts:

#

library(ggplot2)

A = data.frame(x=rep(1:10,10), y=rep(1:10,each=10), u=runif(100),  
v=rnorm(100))

# goal: extract values for y<=5 and plot them, either for u or v

foo1 <- function(uv="u")
{
# solution 1: do not use the data argument at all
#   (forces the use of qplot, could be more elegant)
B = A[A$y<=5,]
qplot(B$x, B$y, fill=B[[uv]], geom="tile")
}

foo2 <- function(uv="u")
{
# solution 2: extract and rename the colums, then use the data argument
#   (enables ggplot but could be shorter)
B = A[A$y<=5,c("x","y",uv)]
names(B)[3] = "value"
# rem: cannot use rename since rename(B,c(uv="value")) would not work
qplot(x, y, fill=value, data=B, geom="tile")
# or
# ggplot(B,aes(x=x,y=y,fill=value)) + geom_tile()
}

foo3 <- function(uv="u")
{
# solution 3: use the data argument and perform the extraction directly in it
#   (elegant and powerful but can't make it work)
ggplot(A[A$y<=5,c("x","y",uv)],aes(x=x,y=y,fill=???)) + geom_tile()
# or
ggplot(A[A$y<=5,],aes(x=x,y=y,fill=???)) + geom_tile()
# or ...
}

print(foo1("u"))
print(foo1("v"))
print(foo2("u"))
print(foo3("u"))

#

Any help in making foo3 work would be appreciated. Thanks in advance  
for your expertise.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] use ggplot in a function to which a column name is given

2007-12-13 Thread jiho
Follow up.

On 2007-December-13  , at 10:45 , jiho wrote:
> foo1 <- function(uv="u")
> {
>   # solution 1: do not use the data argument at all
>   #   (forces the use of qplot, could be more elegant)
>   B = A[A$y<=5,]
>   qplot(B$x, B$y, fill=B[[uv]], geom="tile")
> }

---> actually this does not even work currently:
Error in eval(expr, envir, enclos) : object "B" not found
Which only leaves the most inelegant solution: 2

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] use ggplot in a function to which a column name is given

2007-12-13 Thread jiho

On 2007-December-13  , at 15:56 , hadley wickham wrote:
> Hi Jiho,
>
> The key to solving this problem is to use aes_string instead of aes.
> Instead of the complicated munging that aes does to get the names of
> the variables, aes_string works directly with strings, so that:
>
> aes_string(x = "mpg", y = "wt") == aes(x = mpg, y = wt)
>
> So your function would look like:
>
> foo4 <- function(uv="u") {
>  ggplot(A, aes_string(x = "x", y= "y", fill = uv)) + geom_tile()
> }
>
> Or
>
> ggplot(A, aes(x=x, y=y)) + aes_string(fill=uv) + geom_tile()
>
> Hope that helps!  (And I've made a note to better document aes_string
> so you can discover it after looking at aes)

Great! I knew you would have thought this through. That's perfect. As
always, there's a trade-off between writing code and documenting the
code already written; in this case the trade-off leaned toward the code
part, I guess.

Auto-detection of strings by aes would be even better, but that would
prevent me from assigning the actual strings "u", "x", "y" to an aes
element, which I don't see as a problem for non-text-related functions
though...

Thanks again.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rscript on OSX

2008-01-09 Thread jiho
On 2008-January-09  , at 15:01 , Andrew Beckerman wrote:
> R is not designed to run from
> Terminal.app (actually, terminal does not do graphics well for R).


Actually it can, provided that you:
- install the package CarbonEL, which allows you to plot to quartz
windows from outside R.app (with minor quirks, but that's still very
nice to have)
- set the default plotting device to X11 in your .Rprofile and:
. do nothing if you are on Leopard (since X11 should start
automatically)
. if you are on previous versions, add
export DISPLAY=:0.0
to your shell startup script (.profile, .bashrc, whatever) so that the
display variable is defined. You'll have to start X11 manually but,
once it is started, you should be able to run X11 apps directly from
Terminal.app
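
Concretely, the .Rprofile part can be as simple as this (a sketch):

# in ~/.Rprofile: make X11 the default plotting device
options(device="X11")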

Hope that helps.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 4 dimensional graphics

2008-01-10 Thread jiho

On 2008-January-10  , at 17:41 , Petr PIKAL wrote:
> Thank you
>
> Basically I have a rectangular space (like an aquarium) in which I
> made some analysis.
> I can make
>
> image(lat, long, value) for each height, but what I dream about is to
> make something like scatterplot3d(lat, long, height) with points set
> according to a value.
>
> Up to now I can do
>
> Up to now i can do
>
> scatterplot3d(sloupecn, radan, vrstvan, color=as.numeric(cut(value,
> c(0, 100, 400, 1000))))
>
> which will give you green and red points in the upper right corner. I
> started to try to make cex.symbols scale according to value too, but
> up to now I did not manage to make it work correctly.
> did not manage to work correctly.
>
> in
>
> scatterplot3d(sloupecn, radan, vrstvan, cex.symbols = value/max(value)+2,
> color=as.numeric(cut(value, c(0, 100, 400, 1000))))
>
> the biggest points are at other places then I expected.

So you have measures at x,y,z points, basically, and your measures
appear to be on z layers, so you can probably make several x,y plots
with point size according to value, stacked on top of each other or
side by side. One-liner ggplots:

A=read.table("petr.txt",header=T)
library("ggplot2")
# stacked
ggplot(A, aes(x=x, y=y)) + geom_point(aes(size=value, colour=factor(z))) +
  scale_size(to=c(0,10)) + scale_colour_hue(alpha=0.3)
# side by side
ggplot(A, aes(x=x, y=y)) + geom_point(aes(size=value)) +
  scale_size(to=c(0,10)) + facet_grid(z~.)

If you want 3D to explore your data, rgl (in which you can rotate
plots, etc.) is probably the best choice.

# 3D with rgl
library("rgl")
open3d()
spheres3d(A$x, A$y, A$z, A$value/1000)
# NB: scaling your value is necessary since the values are big compared
# to the coordinates
axes3d()

hope that helps.

petr.txt:

x y z value
1   4   1   73.8
1   4   9   54.9
1   4   17  72
1   1   1   96
1   1   9   52.1
1   1   17  53.3
4   4   1   58.4
4   4   9   93.5
4   4   17  140.2
4   1   1   90.3
4   1   9   36.5
4   1   17  55.1
7   4   1   169.1
7   4   9   718
7   4   17  813
7   1   1   73.4
7   1   9   46.5
7   1   17  205


JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ggplot2, coord_equal and aspect ratio

2008-01-10 Thread jiho
Hi everyone, Hi Hadley,

I am a heavy user of coord_equal() in ggplot2 since most of my data is  
spatial, on x,y coordinates. Everything works. However by enforcing an  
aspect ratio of 1 for the plotting region, coord_equal() usually  
wastes a lot of space if the region of interest is not a perfect square.

For example:
x=runif(10)
a=data.frame(x=x*3,y=x)
ggplot(data=a, aes(x=x,y=y)) + geom_point() + coord_equal()
produces a square plotting region while all the data is constrained to
a horizontally extended rectangle. I would expect the coordinates to
stay equal but the plot to be extended so that it fills as much of the
plotting region as possible. It does not appear to be currently
doable. Is it a limitation of ggplot? Of the underlying grid
graphics? Is there a workaround for this?

Thanks in advance for your help.

PS:
x=runif(10)
qplot(x*3, x) + coord_equal()
produces some very strange scales. This looks like a bug to me,
especially since:
foo=x
qplot(foo*3, foo) + coord_equal()
works as expected

PPS: I tried
ggplot(data=a, aes(x=x,y=y)) + geom_point() + coord_equal() +  
scale_y_continuous(limits=c(0,1))
which has no effect. But the side effect of enforcing the square
domain is that:
ggplot(data=a, aes(x=x,y=y)) + geom_point() + coord_equal() +  
scale_y_continuous(limits=c(0,0.3))
has no effect either (I would expect to see only the points < 0.3)

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] things that are difficult/impossible to do in SAS or SPSS butsimple in R

2008-01-15 Thread jiho
On 2008-January-15  , at 21:58 , Greg Snow wrote:
> Some aspects of graphics, adding to graphs I believe is still quite a
> bit easier in R/S-PLUS.


Hadley would give better examples than me (and I'm no expert of
SAS/SPSS) but with ggplot2, in R, it is both very easy to produce
statistical plots (i.e. with results from lm's, various smoothers, data
density, etc.) and to overlay different graphics layers without having
to take care of the scales, legends, etc.
http://had.co.nz/ggplot2/

NB: the website itself only demonstrates simple graphs done with one  
function (since its purpose is documentation) but overlaying two  
graphs is often as simple as adding them:
p <- ggplot(mtcars, aes(x=wt, y=mpg)) + geom_point()
g <- geom_path(aes(x=wt,y=mpg, colour=qsec))
p + g
(this example is probably useless but it is only for demonstration  
purposes)

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Selecting rows conditionally between 2 data.frames

2008-01-18 Thread jiho
Hello everyone,

I have two data.frames that look like

calib:
place   zoom    scale
left    0.65    8
left    0.80    5.6
left    1.20    3
right   0.65    8.4
right   0.80    6
right   1.20    2.9

X:
... place   zoom
... left    0.80
... left    1.20
... right   0.65
... NA      NA
... right   0.8
... left    1.20

and I want to get the corresponding values of 'scale' in a new column
in X, i.e.:
X:
... place   zoom    scale
... left    0.80    5.6
... left    1.20    3
... right   0.65    8.4
... NA      NA      NA
... right   0.8     6
... left    1.20    3

I tried various combinations of `which` and `match` but could not make
it work.
I fell back on defining a custom function and applying it over the rows:
get.scales <- function(x)
{
if (is.na(x$zoom)) {
scaleF = NA
} else {
scaleF = calib[calib$zoom==x$zoom & 
calib$place==x$place, "scale"]
}
return(scaleF)
}
and
apply(X, 1, get.scales)
but:
- this is not very practical since using apply apparently converts the
data to an array and loses the column names (i.e. x$zoom does not work,
I need to use column numbers, which I'd rather not)
- I feel there is an easier, vector-based way.
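
For completeness, the vectorised route I am considering is a merge keeping
all rows of X (a sketch; I am not sure yet how it orders the result):

X2 = merge(X, calib, by=c("place","zoom"), all.x=TRUE, sort=FALSE)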

I would welcome suggestions. Thank you in advance.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] sorting in 'merge'

2008-01-21 Thread jiho
Hello everyone,

I've been advised to use merge to extract information from two
data.frames with a number of common columns, but I cannot get a grasp
of how it sorts the result. With sort=FALSE, I would expect it to give
the result back sorted exactly as the input was, but it seems this is
not always the case, especially when there are repeats in the input.

For example:

 > a = data.frame(field1=c(1,1,2,2),field2=c(1:2,1:2),var1=runif(4))
 > b = data.frame(field1=c(2,2,1,1),field2=c(1,2,2,1),var2=runif(4))
 > a
  field1 field2      var1
1      1      1 0.8327855
2      1      2 0.4309419
3      2      1 0.5134574
4      2      2 0.8063110
 > b
  field1 field2      var2
1      2      1 0.2739025
2      2      2 0.5147113
3      1      2 0.2958369
4      1      1 0.3703116

So b is in an irregular order, if I then merge:

 > merge(b,a)
  field1 field2      var2      var1
1      1      1 0.3703116 0.8327855
2      1      2 0.2958369 0.4309419
3      2      1 0.2739025 0.5134574
4      2      2 0.5147113 0.8063110

in that case the result is sorted, as expected. If i merge it without  
sorting:

 > merge(b,a,sort=F)
  field1 field2      var2      var1
1      2      1 0.2739025 0.5134574
2      2      2 0.5147113 0.8063110
3      1      2 0.2958369 0.4309419
4      1      1 0.3703116 0.8327855

it retains the order in b, which is what I want.
However if I now add a repeated row to b

 > b = rbind(b,b[1,])
 > b
  field1 field2      var2
1      2      1 0.2739025
2      2      2 0.5147113
3      1      2 0.2958369
4      1      1 0.3703116
5      2      1 0.2739025

and merge it, without sorting

 > merge(b,a,sort=F)
  field1 field2      var2      var1
1      2      1 0.2739025 0.5134574
2      2      1 0.2739025 0.5134574
3      2      2 0.5147113 0.8063110
4      1      2 0.2958369 0.4309419
5      1      1 0.3703116 0.8327855

the result is still somehow sorted according to the order of b. I  
would have expected the output to be:

merge(b,a,sort=F)
  field1 field2      var2      var1
1      2      1 0.2739025 0.5134574
2      2      2 0.5147113 0.8063110
3      1      2 0.2958369 0.4309419
4      1      1 0.3703116 0.8327855
5      2      1 0.2739025 0.5134574

Is it possible to get this output (another function similar to merge)?  
What is the overall reason (if someone knows it) for the current  
behaviour of merge?

Thanks in advance.

PS: code

a = data.frame(field1=c(1,1,2,2),field2=c(1:2,1:2),var1=runif(4))
b = data.frame(field1=c(2,2,1,1),field2=c(1,2,2,1),var2=runif(4))
a
b
merge(b,a)
merge(b,a,sort=F)
b = rbind(b,b[1,])
b
merge(b,a,sort=F)
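
PPS: The only workaround I can think of so far is to carry an explicit
row index through the merge and to sort on it afterwards (untested
sketch, using the objects above):

b$idx <- seq_len(nrow(b))       # remember the original row order of b
m <- merge(b, a, sort = FALSE)
m <- m[order(m$idx), ]          # restore the order of b
m$idx <- NULL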


JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] row-wise conditional update in dataframe

2008-01-22 Thread jiho

On 2008-January-22  , at 03:10 , Jon Erik Ween wrote:

> That got me there. I suppose R prefers absolute field references in
> scripts rather than macrosubstitutions of field names like you would
> do in pearl or shell scripts?

No, actually, the problem is that apply works on arrays/matrices[1],
not data.frames. So it converts the rows of your data.frame into an
array instead of using a one-row data.frame, hence you cannot refer to
the elements of this array by name.
This behavior has also bitten me several times and I would love to  
have an apply function that works on data.frames directly. Is there  
such a modified apply in some package?

[1] ?apply says
"If X is not an array but has a dimension attribute, apply attempts to  
coerce it to an array via as.matrix if it is two-dimensional (e.g.,  
data frames) or via as.array."
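
In the meantime, the workaround I use is to loop over row indices, so
that each row stays a one-row data.frame and the columns can still be
accessed by name (sketch; 'df' and the column names are placeholders):

res <- sapply(seq_len(nrow(df)), function(i) {
  row <- df[i, ]     # still a data.frame, unlike with apply()
  row$a + row$b      # hypothetical computation on named columns
})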

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting two variables with a third used for color

2009-03-21 Thread JiHO

On 2009-March-20  , at 16:23 , David Winsemius wrote:


On Mar 20, 2009, at 3:55 PM, Altaweel, Mark R. wrote:

I have a problem where I have two columns of data that I can simply  
plot using:


plot(wV[0:15,3],wY[0:15,3]).


Perhaps:
plot(wV[0:15,3],wY[0:15,3], col = ifelse(wY[0:15,3]>0, "blue","red") )


And you could look into the package ggplot2 which gives you a legend  
and is well suited for these things.


# quick version:
qplot(wV[0:15,3], wY[0:15,3], colour=ifelse(wY[0:15,3]>0,">0","<0"))

# more explicit version, using a data.frame
dat = data.frame(v=wV[0:15,3], y=wY[0:15,3],  
sign=ifelse(wY[0:15,3]>0,">0","<0"))

ggplot(data=dat) + geom_point(aes(x=v,y=y,colour=sign))

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plot and Boxplot in the same graph

2009-03-21 Thread JiHO

On 2009-March-20  , at 23:02 , johnhj wrote:


Is it possible to use the plot() function and the boxplot() function
together?
I would like to plot a simple graph and, additionally, show boxplots at
certain places on the graph. I have imagined plotting the graph a little
bit transparent and

showing the boxplots at certain places on the same graph


If you don't mind letting go of the base graphics in R, you can look
into http://had.co.nz/ggplot2/


The representations you are interested in are geom_jitter and  
geom_boxplot and in the help page of geom_jitter there is an example  
of specifically what you just asked:

http://had.co.nz/ggplot2/geom_jitter.html
Look at the end. As for transparency, you can use
geom_boxplot(fill=alpha("white",0.5)) for example.
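
A minimal sketch of the combination, using the mtcars example data (the
variable names are only for illustration, adapt them to your data):

library(ggplot2)
ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot(fill = alpha("white", 0.5)) +
  geom_jitter()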


JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Converting Matrix into List - problem (urgent)

2009-03-22 Thread JiHO

On 2009-March-22  , at 13:06 , Nidhi Kohli wrote:

I'm reading this file into a variable named congeneric (see my code)  
and now trying to pick up first 20 values and need these 20 values  
in a list format like


0.610251D+00 (row 1, col 1)
0.615278D+00 (row 1, col 2)
0.583581D+00 (row 1, col 3)
...




Basically you want a list in which each element has only one value in
it. And those elements are the first twenty of the previous matrix,
reading by row, then by column.


I am not sure about:

I want to remove the Column name and Row name from the above output.  
Any help on this will be greatly appreciated (I'm open to any other  
alternative way to convert Matrix into List also)


I don't see any column or row name in


[[1]]
[1] "0.520404D+00"
[[2]]
[1] "0.601942D+00"
[[3]]
[1] "0.603340D+00"
[[4]]
[1] "0.655582D+00"
[[5]]
[1] "0.490995D+00"
.
..
..
...
[[20]]
[1] "0.627368D+00"


That's just the way a list is printed. If you just want a clean
visual output, then use cat in a for loop.


So I guess my question is what are you trying to achieve with that?  
What do you need the list for? It would help to know (a bit) of the  
general context to help you.


JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Converting Matrix into List - problem (urgent)

2009-03-22 Thread JiHO

On 2009-March-22  , at 13:06 , Nidhi Kohli wrote:

Thank you so much for the quick reply. Let me explain you what I  
want. I have a data file in the following format:


0.610251D+00 0.615278D+00 0.583581D+00 0.560295D+00
0.501325D+00 0.639512D+00 0.701607D+00 0.544963D+00
0.589746D+00 0.648588D+00 0.608216D+00 0.582599D+00
0.625204D+00 0.523065D+00 0.627593D+00 0.621433D+00
0.733730D+00 0.498495D+00 0.748673D+00 0.591025D+00
0.578333D+00 0.564807D+00 0.652199D+00 0.579333D+00

I'm reading this file into a variable named congeneric (see my code)  
and now trying to pick up first 20 values and need these 20 values  
in a list format like


0.610251D+00 (row 1, col 1)
0.615278D+00 (row 1, col 2)
0.583581D+00 (row 1, col 3)
...


Can you tell me how can i achieve this? I think i'm pretty close but  
don't know how to remove the Row Name and Col. name from the  
conFirstTwenty list (see my code)


Alternatives:

congeneric <- scan(textConnection("0.610251 0.615278 0.583581 0.560295
0.501325 0.639512 0.701607 0.544963
0.589746 0.648588 0.608216 0.582599
0.625204 0.523065 0.627593 0.621433
0.733730 0.498495 0.748673 0.591025
0.578333 0.564807 0.652199 0.579333"), sep=" ")
# you would use scan with the filename in which the data is, rather  
than the textConnection. That's just for the purpose of the  
demonstration here.


# possible outputs
congeneric[1:20]

as.list(congeneric[1:20])

for (i in 1:20) {
cat(congeneric[i],"\n")
}

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to suppress some tk dialogs

2008-02-05 Thread jiho
Dear List,

I noticed that, when executing R without X11 (e.g. on a remote machine  
without X forwarding), when R needs to display a Tk dialog (e.g. when  
presenting the list of mirrors for install.packages, or of available
packages containing help on a given keyword) it replaces it with a
simple numbered text list. I would love this behaviour to be the  
default, even when I have X11, since I find it quicker and less  
intrusive[1]. Is that possible?
man R, RSiteSearch and Google were not helpful; it seems that
everybody is trying to get Tk to work rather than trying to suppress
it...

Thanks in advance.

[1] I am mainly using R in a simple terminal on OS X (X11 is usually  
not running, so each TK dialog has to start the whole X server) or via  
ssh to a remote machine (each Tk dialog has to start my local X server  
and needs to be sent through the network, twice as painful). This may  
give you the reason behind this seemingly strange question.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to suppress some tk dialogs

2008-02-05 Thread jiho

On 2008-February-05  , at 16:13 , Prof Brian Ripley wrote:

> Without reproduction instructions we have to guess at what you are  
> doing.

Sorry not to have included any. I did not think it was relevant since
this was not directly a coding issue. I'll be more thorough in the
future.

> But I think the answer is in the help for options(), and more  
> obvious from ?chooseCRANmirror (which seems to be one of the  
> functions you are using).

Indeed, the option to set is menu.graphics=FALSE
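
For the archives, this is the line I now keep in my ~/.Rprofile (a
one-line sketch, nothing more to it):

options(menu.graphics = FALSE)  # use text menus instead of Tk dialogs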

Thank you very much!

> On Tue, 5 Feb 2008, jiho wrote:
>
>> Dear List,
>>
>> I noticed that, when executing R without X11 (e.g. on a remote  
>> machine
>> without X forwarding), when R needs to display a Tk dialog (e.g. when
>> presenting the list of mirrors for install.packages,or of available
>> packages containing help on a given keyword) it replaces it by a
>> simple numbered text list. I would love this behaviour to be the
>> default, even when I have X11, since I find it quicker and less
>> intrusive[1]. Is that possible?
>> man R, RSiteSearch and google were not helpful, it seems that
>> everybody is trying to get Tk to work rather than trying to suppress
>> it...
>>
>> Thanks in advance.
>>
>> [1] I am mainly using R in a simple terminal on OS X (X11 is usually
>> not running, so each TK dialog has to start the whole X server) or  
>> via
>> ssh to a remote machine (each Tk dialog has to start my local X  
>> server
>> and needs to be sent through the network, twice as painful). This may
>> give you the reason behind this seemingly strange question.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Incomplete output with sink and split=TRUE

2008-02-05 Thread jiho
Dear List,

I am trying to get R's terminal output to a file and to the terminal  
at the same time, so that I can walk through some tests and keep a log  
concurrently. The function 'sink' with the option split=TRUE seems to  
do just that. It works fine for most output but for objects of class  
htest, the terminal output is incomplete (the lines are there but  
empty). Here is an example session which shows the problem:

 > sink("textout.txt", type="output", split=T)
 > b=bartlett.test(runif(10),c(1,1,1,1,2,2,2,2,2,2))
 > class(b)
[1] "htest"
 > b


data:  runif(10) and c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2)

 > t=t.test(runif(10),c(1,1,1,1,2,2,2,2,2,2))
 > t


data:  runif(10) and c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2)
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  -1.5807338 -0.7316803
sample estimates:
mean of x mean of y
0.4437929 1.600

 > sink()   # output in the file is complete
 > b

Bartlett test of homogeneity of variances

data:  runif(10) and c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2)
Bartlett's K-squared = 0.9959, df = 1, p-value = 0.3183

 > t

Welch Two Sample t-test

data:  runif(10) and c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2)
t = -5.7659, df = 16.267, p-value = 2.712e-05
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  -1.5807338 -0.7316803
sample estimates:
mean of x mean of y
0.4437929 1.600

 >

Is this a known bug (I'm using R 2.6.1 on OS X and Linux - FC8)? Is
there an inherent reason why some portions of this output are not
echoed back to the terminal?

Thank you in advance for your help.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to search for packages

2008-02-05 Thread jiho
On 2008-February-04  , at 21:10 , hadley wickham wrote:
>> The real answer was Task Views on CRAN (most of the OQs topics  
>> *are* already
>> Task Views), so crantastic is very partial. If you have a little  
>> time and want
>
> I think crantastic and task views solve somewhat different problems
> (although I agree that crantastic should mirror the task views too).
> Task views are of considerable effort to set up, and often written by
> one person.  Tags on crantastic are dead simple to set up and can be
> contributed to by multiple people (although I'm not yet sure what to
> do about potential conflicts)
>
>> to really draw in the masses, try doing clickable image maps from the
>> Bioconductor pkgDepTools:
>>
>> http://bioconductor.org/packages/2.1/bioc/html/pkgDepTools.html
>>
>> because some of the unobtrusive short-name packages are key nodes  
>> in package
>> dependency graphs. The dependency trees are very illuminating.  
>> Automating the
>> updates would be positive. If you could also run against the Task  
>> View listings,
>
> That's a good idea.  I do mean to add dependency links in both
> direction, and to automate the updates I just need to get a cron job
> set up.

I like the fact that Task Views are written by experts but the
community aspect of crantastic is really appealing. Depending on how the
aforementioned experts feel about crantastic, and how it grows and
scales, I would be very glad to see:
- some "expert reviews" of some packages. Some people considered
experts in a field would be given special credit and their reviews would
stand out from the rest. While community reviews are nice, I would rather
have two words of advice from _the_ expert than a full book from a
community of ignorants (I am voluntarily exaggerating here). Having both
(good community reviews complemented by one expert's advice) would of
course be the ideal solution. Another solution (but that would be in
hadley's hands) would be to have reviews in a wiki form: review text
editable + comments.
- some tag reviews. They would essentially serve the purpose of Task
Views but would probably be more dynamic (since the number of packages
linked to a tag can change) and integrate with the rest.

(Thanks for asking the question by the way)

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Incomplete output with sink and split=TRUE

2008-02-06 Thread jiho

On 2008-February-06  , at 11:25 , Alex Brown wrote:
> you could use the unix function 'script' before invoking the R  
> interpreter.


Thanks for the suggestion. It would work in some cases and I did not
know about this utility. But most of the time I just put a list of
commands in a .R file and source it to execute it. In this case the
transcript would only contain "source("whatever.R", print.eval=T)",
which won't be very informative :/. And in the case of sourcing, the
problem with the disappearing text stays the same.

Thanks for your answer anyway.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Incomplete output with sink and split=TRUE

2008-02-06 Thread jiho
On 2008-February-06  , at 14:45 , Duncan Murdoch wrote:
> On 2/5/2008 11:12 AM, jiho wrote:
>> Dear List,
>> I am trying to get R's terminal output to a file and to the  
>> terminal  at the same time, so that I can walk through some tests  
>> and keep a log  concurrently. The function 'sink' with the option  
>> split=TRUE seems to  do just that. It works fine for most output  
>> but for objects of class  htest, the terminal output is incomplete  
>> (the lines are there but  empty). Here is an example session which  
>> shows the problem:
>
> stats:::print.htest() uses writeLines to write some of its output to  
> stdout(), and it looks as though sink(split=T) misses those bits.
>
> I'll change print.htest to use cat(), but it is probably a sign of a  
> bigger problem in sink(), and it's too late in the schedule to touch  
> that for 2.6.2.
>
> Duncan Murdoch


Thank you for your attention to this. Given your comment I filed a
bug report on it (same subject as this email, no bug ID yet).

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Comparing spatial point patterns - Syrjala test

2008-02-09 Thread jiho
Dear Lists,

At several stations distributed regularly in space[1], we sampled  
repeatedly (4 times) the abundance of organisms and measured  
environmental parameters. I now want to compare the spatial  
distribution of various species (and test whether they differ or not),
or to compare the distribution of a particular organism with the  
distribution of some environmental variable.
Syrjala's test[2] seems to be appropriate for such comparisons. The  
hamming distance is also used (but it is not associated with a test).  
However, as far as I understand it, Syrjala's test only compares the  
distribution gathered during one sampling event, while I have four  
successive repeats and:
- I am interested in testing whether, on average, the distributions are
the same
- I would prefer to keep the information regarding the variability of  
the abundances in time, rather than just comparing the means, since  
the abundances are quite variable.

Therefore I have two questions for all the knowledgeable R users on  
these lists:
- Is there a package in which Syrjala's test is implemented for R?
- Is there another way (a better way) to test for such differences?

Thank you very much in advance for your help.

[1] http://jo.irisson.free.fr/work/research_tetiaroa.html
[2] http://findarticles.com/p/articles/mi_m2120/is_n1_v77/ai_18066337/pg_7

Jean-Olivier Irisson
---
UMR 5244 CNRS-EPHE-UPVD, 52 av Paul Alduy, 66860 Perpignan Cedex, France
+336 21 05 19 90
http://jo.irisson.free.fr/work/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-Geo] Comparing spatial point patterns - Syrjala test

2008-02-09 Thread jiho
On Feb 9, 2008 10:39 PM, milton ruser <[EMAIL PROTECTED]> wrote:
> I have no idea of how to solve this issue, but I suggest you write to the
> authors of spatstat package. I think they could help you very much. Another
> thing is that there was a thread on this some time ago. Maybe Alan
> Swanson could also help you.
>
> http://finzi.psych.upenn.edu/R/Rhelp02a/archive/51226.html

Thank you for your answer. Indeed I saw this email during my research
before posting my message. Unfortunately nobody answered it. I just
hoped that things might have improved in this area of R since then and
that potential authors, such as the authors of the spatstat package,
would be reading these lists.
I will indeed contact those people personally.

Thanks again.

> > At several stations distributed regularly in space[1], we sampled
> > repeatedly (4 times) the abundance of organisms and measured
> > environmental parameters. I now want to compare the spatial
> > distribution of various species (and test whether they differ or not),
> > or to compare the distribution of a particular organism with the
> > distribution of some environmental variable.
> > Syrjala's test[2] seems to be appropriate for such comparisons. The
> > hamming distance is also used (but it is not associated with a test).
> > However, as far as I understand it, Syrjala's test only compares the
> > distribution gathered during one sampling event, while I have four
> > successive repeats and:
> > - I am interested in comparing if, on average, the distributions are
> > the same
> > - I would prefer to keep the information regarding the variability of
> > the abundances in time, rather than just comparing the means, since
> > the abundances are quite variable.
> >
> > Therefore I have two questions for all the knowledgeable R users on
> > these lists:
> > - Is there a package in which Syrjala's test is implemented for R?
> > - Is there another way (a better way) to test for such differences?
> >
> > Thank you very much in advance for your help.
> >
> > [1] http://jo.irisson.free.fr/work/research_tetiaroa.html
> > [2] http://findarticles.com/p/articles/mi_m2120/is_n1_v77/ai_18066337/pg_7

-- 
Jean-Olivier Irisson
UMR 5244 CNRS-EPHE-UPVD, 52 av Paul Alduy, 66860 Perpignan Cedex, France
+336 21 05 19 90
http://jo.irisson.free.fr/work/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-Geo] Comparing spatial point patterns - Syrjala test

2008-02-10 Thread jiho
Hi,

I went ahead and implemented something. However:
- I cannot guarantee it gives correct results since, unfortunately, the
data used in Syrjala 1996 are not published along with the paper. To
avoid mistakes, I started by coding things in a quick and simple way
and then tried to optimize the code. At least all versions give the
same results.
- As expected, the test is still quite slow since it relies on
permutations to compute the p-value. The successive optimizations
allowed me to go from 73 to 13 seconds on my machine, but 13 seconds is
still a long time. Furthermore, I don't know how the different
versions would scale with the number of points (I only tested
with one dataset). I'm not very good at "thinking vector" so if
someone could look at this and further improve it, I would welcome
patches. Maybe the only real solution would be to go the Fortran way
and link some code to R, but I did not want to wander into such scary
places ;)
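
For those curious, once the Syrjala-specific statistic is set aside, the
permutation part is just the usual machinery; roughly (generic sketch,
the function names are placeholders, not the ones in the linked file):

perm.pvalue <- function(data, statistic, permute, n = 999) {
  obs   <- statistic(data)                         # statistic on observed data
  perms <- replicate(n, statistic(permute(data)))  # statistic on permuted data
  (sum(perms >= obs) + 1) / (n + 1)                # upper-tail permutation p-value
}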

The code and test data are here:

http://cbetm.univ-perp.fr/irisson/svn/distribution_data/tetiaroa/trunk/data/lib_spatial.R
Warning: it probably uses non-canonical S syntax; sorry for those with
sensitive eyes.

On 2008-February-10  , at 17:02 , Jan Theodore Galkowski wrote:
> I'm also interested here in comparing spatial point patterns.  So, if
> anyone finds any further R-based, or S-plus-based work on the  
> matter, or
> any more recent references, might you please include me in the
> distribution list?
>
> Thanks much!


Begin forwarded message:
> From: jiho <[EMAIL PROTECTED]>
> Subject: Comparing spatial point patterns - Syrjala test
>
> Dear Lists,
>
> At several stations distributed regularly in space[1], we sampled  
> repeatedly (4 times) the abundance of organisms and measured  
> environmental parameters. I now want to compare the spatial  
> distribution of various species (and test whether they differ or
> not), or to compare the distribution of a particular organism with  
> the distribution of some environmental variable.
> Syrjala's test[2] seems to be appropriate for such comparisons. The  
> hamming distance is also used (but it is not associated with a  
> test). However, as far as I understand it, Syrjala's test only  
> compares the distribution gathered during one sampling event, while  
> I have four successive repeats and:
> - I am interested in comparing if, on average, the distributions are  
> the same
> - I would prefer to keep the information regarding the variability  
> of the abundances in time, rather than just comparing the means,  
> since the abundances are quite variable.
>
> Therefore I have two questions for all the knowledgeable R users on  
> these lists:
> - Is there a package in which Syrjala's test is implemented for R?
> - Is there another way (a better way) to test for such differences?
>
> Thank you very much in advance for your help.
>
> [1] http://jo.irisson.free.fr/work/research_tetiaroa.html
> [2] http://findarticles.com/p/articles/mi_m2120/is_n1_v77/ai_18066337/pg_7


JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R on Mac PRO does anyone have experience with R on such a platform ?

2008-02-11 Thread jiho
On 2008-February-11  , at 19:14 , Roger Day wrote:
> My experience with R.app on a MACbook has been mostly very positive.
> I like the interface much better than that of Windows--
> with two exceptions.
>
> a)  I use stepping thru code with control-R.  It's not as convenient  
> on Mac-
> the code you want to run has to be actually selected; not good  
> enough just
> to be on the line you want.
> That slows down code-stepping.
> b)  saveHistory() doesn't save the history of the current session --  
> beware,
> I lost some work that way.  you have to actually click a button.
> c) no resizing graphs post-hoc,
> d) saving graphics to a file is inconvenient except for pdf output.
>
> Some plusses are:
> a) better built-in editor (if you're not using ESS), including  
> delimiter
> matching
> b) the history pane is nice,
> c) the package installer and manager are nicer than on Win,
> d) autocompletion with ctrl-period,
> e) you can select text on the current or past command line much  
> easier,
> f) attractive interface with lots of cosmetic options.
>
> I've done some tkrplot work in both (using X11 in OSX)
> -- some inconsistencies with placement of widgets show up.
>
> This is off the top of my head.
> Check out the mailing list R-sig-mac for more info.

After using R via R.app (which is indeed very nice to start with) I
eventually switched to a combination of TextMate + Terminal + CarbonEL
- TextMate[1] is a very powerful editor, well worth the $40 price tag,  
and has nice goodies for R besides syntax highlighting such as command  
autocompletion, command templates, plenty of snippets, etc.
- I run R in a regular Terminal window. This way I get command line  
editing and searching through history. In addition it makes it as easy  
to run R on my local machine as on a remote server (useful for running
demanding tasks on a powerful machine). I can send code from TextMate to the
terminal prompt using AppleScript commands in TextMate[2]. This allows me
to send the selected text _or_ the current line directly to the Terminal with
just a keystroke.
- CarbonEL is a package which allows plotting to a Quartz window even
from a plain Terminal (Quartz is Mac OS X's graphics engine). The plots
on Quartz look gorgeous and going back to X11 would have been a pain.
Another similar solution would be to use the Cairo package.

All in all, I find it a very convenient and flexible way to use R. It
has the added bonus that the same combination (TM+Terminal) works for  
anything that can run in a terminal window (MATLAB, Scilab, python  
etc.). So, even if you don't use only R, you can keep the same habits  
with a nice editor.

I haven't tried Emacs+ESS. I've heard a lot of good things about it  
but learning Emacs is a task in itself.

[1] http://macromates.com/
[2] modification of those http://jo.irisson.free.fr/?p=32 for the  
built-in Terminal, since Terminal on Leopard finally has tabs

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R on Mac PRO does anyone have experience with R on such a platform ?

2008-02-12 Thread jiho
On 2008-February-11  , at 21:06 , Charilaos Skiadas wrote:
> JiHO, in case you are not following TextMate's mailing list, you  
> might want to check out Hans-Jorg Bibiko's work on Rdaemon:
>
> http://article.gmane.org/gmane.editors.textmate.general/24195/
>
> It provides a lot of the terminal functionality within a TextMate  
> window, uses X11 for the plots, and opens help files either in a  
> browser or in a TextMate HTML window. It essentially runs an R  
> process in the background, and communicates with it, so I'm not sure  
> it would allow you to run R on a remote server. But I think it is  
> worth checking out otherwise. Currently you have to install the  
> bundles from the above link, but I'm hoping soon we'll be able to  
> commit these bundles to TextMate's bundle repository.


That's sweet, particularly the new R bundle. I hope this will be
merged soon into the main bundle repository. I'm not sure about Rdaemon
since I have a pretty good workflow with a terminal (plus, I always
have one open for git-related stuff anyway) and I never liked the
Rconsole thing.

For those wanting a powerful, easy-to-customize editor (for R or
anything else) but who feel a bit scared by Emacs, TextMate is really a
worthy alternative.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fun.aggregate=mean in reshape

2008-02-12 Thread jiho

On 2008-February-12  , at 11:11 , Henrique Dallazuanna wrote:
> You can use aggregate:
>
> aggregate(data[,c("ozone", "solar.r", "wind", "temp")],
> list(month=data$month), mean)
>
> On 12/02/2008, [Ricardo Rodriguez] Your XEN ICT Team <[EMAIL PROTECTED] 
> > wrote:
>> Hi all,
>>
>> We are facing a problem while introducing ourselves to Reshape  
>> package
>> use. Melt seems to work fine, but cast fails when we use mean as
>> fun.aggregate. As you see here, length and sum work fine, but mean
>> throws this same error whatever dataset we use.
>>
>>> cast(aqm, month ~ variable, length)
>>   month ozone solar.r wind temp
>> 1     5    26      27   31   31
>> 2     6     9      30   30   30
>> 3     7    26      31   31   31
>> 4     8    26      28   31   31
>> 5     9    29      30   30   30
>>> cast(aqm, month ~ variable, sum)
>>   month ozone solar.r  wind temp
>> 1     5   614    4895 360.3 2032
>> 2     6   265    5705 308.0 2373
>> 3     7  1537    6711 277.2 2601
>> 4     8  1559    4812 272.6 2603
>> 5     9   912    5023 305.4 2307
>>> cast(aqm, month ~ variable, mean)
>> Error in get(as.character(FUN), mode = "function", envir = envir) :
>>  variable "fun" of mode "function" was not found
>>>
>>
>>
>> Our environment:
>>
>>> version
>>   _
>> platform   i386-apple-darwin8.10.1
>> arch   i386
>> os darwin8.10.1
>> system i386, darwin8.10.1
>> status
>> major  2
>> minor  6.2
>> year   2008
>> month  02
>> day08
>> svn rev44383
>> language   R
>> version.string R version 2.6.2 (2008-02-08)
>>
>>
>>> installed.packages()
>>
>> reshape"reshape"
>> "/Library/Frameworks/R.framework/Resources/library" "0.8.0"
>> NANA
>>
>>
>> Please, could you help use to work out this issue? Thanks!

It probably won't help much, but may be interesting for diagnostic
purposes: it works for me in the same environment except with R 2.6.1
instead of 2.6.2.
Have you tried specifying the argument name explicitly?
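
I.e. something along these lines (untested here, since I cannot
reproduce the error on 2.6.1):

cast(aqm, month ~ variable, fun.aggregate = mean)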

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ggplot and xlim/ylim

2007-09-20 Thread jiho
Hello everyone,

I am (happily) using ggplot2 for all my plotting now and I wondered
if there is an easy way to specify xlim and ylim somewhere when using
the ggplot syntax, as opposed to the qplot syntax. E.g.

  qplot(data=mtcars,y=wt, x=qsec,xlim=c(0,30))

<->

ggplot(mtcars, aes(y=wt, x=qsec)) + geom_point() + ???

Indeed the ggplot syntax is in general more flexible and powerful and  
I usually rely on it in scripts. It would be nice to know how to use  
xlim/ylim with this syntax.
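
My guess is that an explicit scale is the intended way, something like
the line below, but I could not confirm it:

ggplot(mtcars, aes(y=wt, x=qsec)) + geom_point() + scale_x_continuous(limits=c(0,30))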

Thank you in advance.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Plotting from different data sources on the same plot (with ggplot2)

2007-09-27 Thread jiho
Hello everyone (and Hadley in particular),

I often need to plot data from multiple datasets on the same graph. A  
common example is when mapping some values: I want to plot the  
underlying map and then add the points. I currently do it with base  
graphics, by recording the maximum region in which my map+points will
fit, plotting both with these xlim and ylim parameters, adding par 
(new=T) between plot calls and setting the graphical parameters (to  
draw axes, titles, to set the aspect ratio) by hand. This is neither easy
nor practical when the plots become more and more complicated.

The ggplot book specifies that "[ggplot] makes it easy to combine  
data from multiple sources". Since I use ggplot2 as much as I can  
(thanks it's really really great!) I thought I would try producing  
such a plot with ggplot2.

NB: If this is possible/easy with an other plotting package please  
let me know. I am not looking for something specific to maps but  
rather for a generic mechanism to throw several pieces of data to a  
graph and have the plotting routine take care of setting up axes that  
will fit all data on the same scale.

So, now for the ggplot2 part. I have two data sources: the  
coordinates of the coastlines in a region of interest and the  
coordinates of sampling stations in a subset of this region. I want
to plot the coastline as a line and the stations as points, on the  
same graph. I can plot them independently easily:

p1 = ggplot(coast,aes(x=lon,y=lat)) + geom_path() + coord_equal(ratio=1)
p1$aspect.ratio = 1

p2 = ggplot(coords,aes(x=lon,y=lat)) + geom_point() + coord_equal(ratio=1)
p2$aspect.ratio = 1

but I cannot find how to combine the two graphs. I suspect this probably
has to be done via different layers but I really can't find how.
In particular, I would like to know how to deal with the scales: can  
ggplot take care of plotting the two datasets on the same coordinate
system or do I have to manually record the maximal range of x and y  
and force ggplot to use this on both layers, as I did with base  
graphics? (of course I would prefer the former ;) ).

To test it further with real data, here is my code and data:
http://jo.irisson.free.fr/dropbox/test_ggplot2.zip

One additional point: I would like the two datasets to stay
separate. Indeed I could probably combine them and plot everything
in one step by clever use of ggplot arguments. However this is just a  
simple example and I would like to add more in the future (like  
trajectories at each station, points proportional to some value at  
each station etc.) so I really want the different data sources to be  
separated and to produce the plot in several steps, otherwise it will  
soon become too complicated to manage.

Thank you very much in advance for your help.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Graphics and LaTeX documents with the same font

2007-09-28 Thread jiho

On 2007-September-28  , at 16:57 , Frank E Harrell Jr wrote:
> jiho wrote:
>> On 2007-September-28  , at 15:18 , Paul Smith wrote:
>>> On 9/28/07, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
>>>>> I know how to export graphics as pdf files and then how to include
>>>>> them in LaTeX documents. However, I do not know how to do in   
>>>>> order to
>>>>> have the text of the graphics written with the font selected  
>>>>> for the
>>>>> LaTeX document. Is that possible?
>>>> [...]
>> If you don't mind an extra step between R and LaTeX, you could  
>> use  Inkscape to modify your graphics:
>>  http://www.inkscape.org/
>> It is a (very nice!) vector graphics editor which:
>> - works with SVGs (as produced with the RSvgDevice package)
>> - imports PDFs (really well in the latest development version)
>> - is available for free, on most platforms
>> and
>> - exports PDFs that nicely integrate in LaTeX documents
>> - exports PSTricks graphics
>> Then two roads are opened for you:
>> 1- either get a TTF version of the LaTeX fonts (there are  
>> packages  for this on all linux distros I know, for use with Lyx  
>> and you can  probably find them on the web otherwise) and change  
>> all the fonts to  those once your document is in Inkscape (select  
>> all > text and font >  select the font)
>> 2- or open the document with inkscape and export it to pstricks
>> I personally use Inkscape on all my R graphics because I find it   
>> easier and quicker to get decent graphics and R and refine their  
>> look  in Inkscape than to get them perfect in R in one shot  
>> ( though with  ggplot2 things are improving on R's side).

> As this works against principles of reproducible research, I  
> wouldn't recommend it.

Do you consider that changing the font size of the graphic would be  
altering the research result? Or laying out a 2d contour and a 3d  
plot in parallel, or changing the line color/pattern...? My  
modifications are usually of this kind. Of course those things are  
doable with R but they are usually immensely easier in a graphics  
program (where the color palettes are predefined, the dash patterns  
are more diverse etc.).

For example, I often find myself using the same plot in an article, a  
presentation, and a poster, usually with different color palettes and  
font requirements. I just open the pdf, change the colors, font and  
font size to match the design of the article/presentation/poster,  
realign the labels a bit and re-save it. I don't think that I am  
doing any harm to my results or presenting any false information to the
readers, I just make the graphics easier on their eyes.

But maybe I am a bit too much of a purist on these matters. I just
find that, much too often, research results that represent months of  
work are presented as narrow, black and white (possibly even  
pixelated!) captures of article graphics which don't do justice to
the quality of the work behind them. I don't think there is any harm  
in making (good) science look a bit "sexier", do you?

Jean-Olivier Irisson
---
UMR 5244 CNRS-EPHE-UPVD, 52 av Paul Alduy, 66860 Perpignan Cedex, France
+336 21 05 19 90
http://jo.irisson.free.fr/work/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Graphics and LaTeX documents with the same font

2007-09-28 Thread jiho

On 2007-September-28  , at 15:18 , Paul Smith wrote:
> On 9/28/07, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
>>> I know how to export graphics as pdf files and then how to include
>>> them in LaTeX documents. However, I do not know how to do in  
>>> order to
>>> have the text of the graphics written with the font selected for the
>>> LaTeX document. Is that possible?
>>
>> Well, it depends on what that font is.  But if it is TeX font,
>> see the section called 'TeX fonts' in ?postscript and the detailed
>> description in the article in R-news 6/2 by Paul Murrell and myself.
>>
>> If it is an Adobe Type1 font such as Times New Roman, just specify an
>> appropriate family in the pdf() call.
>>
>> Dietrich Trenkler wrote:
>>
>>> maybe you will find the psfrag package useful.
>>
>> I doubt it will be even usable with PDF (there are pdfrack and  
>> Xfigfrag,
>> though), and with postscript it is at best a kludge as R does its own
>> micro-positioning of text based on the font metrics.
>
> Thanks to both. PSTricks
>
> http://en.wikipedia.org/wiki/PSTricks
>
> draws figures that, when inserted in a LaTeX document, their font
> matches the one selected for the LaTeX document. If I may, I would
> like to submit to your consideration the suggestion of implementing
> the exportation of R graphics to PSTricks.

If you don't mind an extra step between R and LaTeX, you could use  
Inkscape to modify your graphics:
http://www.inkscape.org/
It is a (very nice!) vector graphics editor which:
- works with SVGs (as produced with the RSvgDevice package)
- imports PDFs (really well in the latest development version)
- is available for free, on most platforms
and
- exports PDFs that nicely integrate in LaTeX documents
- exports PSTricks graphics
Then two roads are opened for you:
1- either get a TTF version of the LaTeX fonts (there are packages
for this on all Linux distros I know, for use with LyX, and you can
probably find them on the web otherwise) and change all the fonts to
those once your document is in Inkscape (select all > text and font >
select the font)
2- or open the document with Inkscape and export it to PSTricks

I personally use Inkscape on all my R graphics because I find it
easier and quicker to get decent graphics in R and refine their look
in Inkscape than to get them perfect in R in one shot (though with
ggplot2 things are improving on R's side).

Cheers,

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Graphics and LaTeX documents with the same font

2007-09-29 Thread jiho
On 2007-September-28  , at 18:25 , Frank E Harrell Jr wrote:
> jiho wrote:
>> On 2007-September-28  , at 16:57 , Frank E Harrell Jr wrote:
>>> jiho wrote:
>>>> On 2007-September-28  , at 15:18 , Paul Smith wrote:
>>>>> On 9/28/07, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
>>>>>>> I know how to export graphics as pdf files and then how to  
>>>>>>> include
>>>>>>> them in LaTeX documents. However, I do not know how to do in   
>>>>>>> order to
>>>>>>> have the text of the graphics written with the font selected  
>>>>>>> for the
>>>>>>> LaTeX document. Is that possible?
>>>>>> [...]
>>>> If you don't mind an extra step between R and LaTeX, you could  
>>>> use  Inkscape to modify your graphics:
>>>> [...]
>>>> I personally use Inkscape on all my R graphics because I find  
>>>> it  easier and quicker to get decent graphics and R and refine  
>>>> their look  in Inkscape than to get them perfect in R in one  
>>>> shot ( though with  ggplot2 things are improving on R's side).
>>> As this works against principles of reproducible research, I  
>>> wouldn't recommend it.
>> Do you consider that changing the font size of the graphic would  
>> be altering the research result? Or laying out a 2d contour and a  
>> 3d plot
>
> Not per se, but accidents happen when editing graphics.  More  
> importantly it creates more work.  Datasets get updated/corrected  
> and graphics need to be reproduced.
>
>> in parallel, or changing the line color/pattern...? My  
>> modifications are usually of this kind. Of course those things are  
>> doable with R but they are usually immensely easier in a graphics  
>> program (where the color palettes are predefined, the dash  
>> patterns are more diverse etc.).
>> For example, I often find myself using the same plot in an  
>> article, a presentation, and a poster, usually with different  
>> color palettes and font requirements. I just open the pdf, change  
>> the colors, font and font size to match the design of the article/ 
>> presentation/poster, realign the labels a bit and re-save it. I  
>> don't think that I am doing any harm to my result or present any  
>> false information to the readers, I just make the graphics easier  
>> on their eyes.
>
> A great application for a wrapper graphics function with an  
> argument for presentation mode.

I could do that indeed but it would require changing the margins,
device size, fonts, colors, etc. all by hand in R. I am not saying
this is impossible (well, some things are: R may not have access to
all the fonts in my system, R won't produce print-ready CMYK PDFs,
etc.) but it is just much more trouble than producing one "OK"
graphic with R and handling the finer presentational details in a
program more suited to these matters. Not to mention that it would
also suppose that I know all the presentation requirements in
advance, when writing the plotting function, which is usually not the
case. If I have to redo the plot months later I may as well rewrite a
new plot script based on the old one and go with that.
Once again I am not saying this is impossible, I am just skeptical
about the balance between the cost of producing pixel-perfect
graphics from code and the associated reproducibility benefit,
particularly in R. The MATLAB or Scilab plotting models are better suited
in this respect: the plot is represented as an object that can be
saved, with properties that you can change _after_ its creation. So
it is easy to come back, even months after the analysis, change
the colors, the margins, etc. of the plot and produce a PDF again.
The grid package goes this way, fortunately!
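
To be concrete, I understand the suggestion as something along these
lines (hypothetical sketch with base graphics, not code I actually use):

preset.plot <- function(x, y, mode = c("article", "slides", "poster"), ...) {
  mode <- match.arg(mode)
  # coarse presentation presets; real requirements are rarely known in advance
  cex <- switch(mode, article = 1, slides = 1.4, poster = 2)
  op <- par(cex = cex, lwd = cex)
  on.exit(par(op))   # restore the previous graphical parameters
  plot(x, y, ...)
}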

>> But maybe I am a bit too much of a purist on these maters. I just  
>> find that, much too often, research results that represent months  
>> of work are presented as narrow, black and white (possibly even  
>> pixallated!) captures of article graphics which don't do justice  
>> to the quality of the work behind them. I don't think there is any  
>> harm in making (good) science look a bit "sexier", do you?
>
> Yes there is harm.  But to make bold lines, easy to read titles is  
> fine.  See the spar function in http://biostat.mc.vanderbilt.edu/ 
> SgraphicsHints for a starter.  Also see the setps, ps.slide, and  
> setpdf functions in the Hmisc package.

Thanks for the pointers, these functions look useful indeed.

I try to do as much as I can in R [1], to reduce Inkscape to

Re: [R] Plotting from different data sources on the same plot (with ggplot2)

2007-09-30 Thread jiho
On 2007-September-30  , at 18:35 , hadley wickham wrote:
>> The ggplot book specifies that "[ggplot] makes it easy to combine
>> data from multiple sources". Since I use ggplot2 as much as I can
>> (thanks it's really really great!) I thought I would try producing
>> such a plot with ggplot2.
>>
>> NB: If this is possible/easy with an other plotting package please
>> let me know. I am not looking for something specific to maps but
>> rather for a generic mechanism to throw several pieces of data to a
>> graph and have the plotting routine take care of setting up axes that
>> will fit all data on the same scale.
>
> I don't think it's easy with any other plotting system (although I'd
> be happy to be proven wrong), and was one of the motivations for the
> construction of ggplot.
>
>> So, now for the ggplot2 part. I have two data sources: the
>> coordinates of the coastlines in a region of interest and the
>> coordinated of sampling stations in a subset of this region. I want
>> to plot the coastline as a line and the stations as points, on the
>> same graph. I can plot them independently easily:
>>
>> p1 = ggplot(coast,aes(x=lon,y=lat)) + geom_path() + coord_equal 
>> (ratio=1)
>> p1$aspect.ratio = 1
>>
>> p2 = ggplot(coords,aes(x=lon,y=lat)) + geom_point() + coord_equal
>> (ratio=1)
>> p2$aspect.ratio = 1
>
> There are a few ways you could describe the graph you want.  Here's
> the one that I'd probably choose:
>
> ggplot(mapping = aes(x = lon, y = lat)) +
> geom_path(data = coast) +
> geom_point(data = coords) +
> coord_equal()
>
> We don't define a default dataset in the ggplot call, but instead
> explicitly define the dataset in each of the layers. By default,
> ggplot will make sure that all the data is displayed on the plot -
> i.e. the x and y scales show the union of the ranges over all
> datasets.
>
> Does that make sense?

It makes perfect sense indeed... unfortunately it does not work  
here ;) :

 > p = ggplot(mapping = aes(x=lon, y=lat)) + geom_path(data = coast)  
+ geom_point(data = coords) + coord_equal()
 > p
Error in get("get_scales", env = .$.scales, inherits = TRUE)(.$.scales,  :
  invalid subscript type

As expected there is nothing in the data part of the p object
 > p$data
NULL

But there is no data specification either in the layers
 > p$layers
[[1]]
geom_path: (colour=black, size=1, linetype=1) + ()
stat_identity: (...=) + ()
position_identity: ()
mapping: ()

[[2]]
geom_point: (shape=19, colour=black, size=2) + ()
stat_identity: (...=) + ()
position_identity: ()
mapping: ()

  There are no scales either, which apparently causes the error
 > p$scales
Scales:   ->

Should I get a newer version of ggplot? (I have version 0.5.4)

About the other solution:

>> When tinkering a bit more with this I thought that the more natural
>> and "ggplot" way to do it, IMHO, would be to have a new addition (`
>> +`) method for the ggplot class and be able to do:
>> p = p1 + p2
>> and have p containing both plots, on the same scale (the union of the
>
> You were obviously pretty close to the solution already!  - you just
> need to remove the elements that p2 already has in common with p1 and
> just add on the components that are different.

I would love to be able to do so because this way I can define custom
plot functions that all return a ggplot object and then combine
these at will to get final plots (e.g. one function for the coastline,
another for station coordinates, another which gets one data
value, yet another for bathymetry contours, etc.). This modular
design would be more efficient than having to predefine all
combinations in ad hoc functions (e.g. one function for coast+bathy
+stations, another for coast+stations only, another for coast+bathy
+stations+data1, another for... you get the point).
However I don't see what to add and what to remove from the objects.  
Specifically, there is only one "data" element in the ggplot object while
my two objects (p1 and p2) both contain something different in $data.  
Should I define p$data as a list with p$data[[1]]=p1$data and p$data 
[[2]]=p2$data?

> You also need to
> remember that the ggplot function just sets up a list of defaults that
> can be overridden within each layer - there is very little
> functionality provided by the ggplot object itself.
>
>> scales of p1 and p2), and just one set of axes. And even:
>> p = add(p1, p2, drop=T)
>> which would give p1 and p2 plots clipped to the xlim and ylim of p2.
>
> Yes, it would be nice to have some syntax to overrule the default
> policy of showing all the data, although

Re: [R] plot graph with error bars trouble

2007-09-30 Thread jiho
On 2007-September-30  , at 22:40 , hadley wickham wrote:
>> hadley wickham wrote:
>>> [...]
>> PS if one specifies "errorbars" without specifying min and max one  
>> gets
>> the error
>>
>> Error in rbind(max, max, max, min, min, min) :
>> cannot coerce type closure to list vector
>>
>>   perhaps a more transparent error message could be supplied in this
>> (admittedly
>> stupid-user-error-obvious-in-hindsight) case?
>
> Yes, that's a good idea.  I'm still working on making the error
> messages more user friendly.  I think I'm making some progress, but
> it's fairly slow.

BTW, have you thought about opening up ggplot2 development (providing a
way to check out the dev code and the possibility to submit
patches, at least) or do you prefer to keep it a personal project for
now? I don't know how intricate your research and the development of
ggplot2 are, and would understand that you want to keep it 100% Hadley
Wickham if you are to be judged on it academically. But boring work
such as improving error messages, writing documentation and chasing
small bugs is probably more efficiently done by a team than by a
single person with little free time. Furthermore, most of these
things can be done without deep knowledge of the architecture of
ggplot2.
I probably won't be able to make significant contributions for a
while but I would be happy to see how ggplot2 progresses and which
directions are taken by following an SVN tree.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting from different data sources on the same plot (with ggplot2)

2007-09-30 Thread jiho
This was meant to be sent to the list:

On 2007-September-30  , at 23:12 , jiho wrote:
> On 2007-September-30  , at 21:01 , hadley wickham wrote:
>>>> [...]
>>> As expected there is nothing in the data part of the p object
>>>> p$data
>>> NULL
>>>
>>> But there is no data specification either in the layers
>>>> p$layers
>>> [[1]]
>>> geom_path: (colour=black, size=1, linetype=1) + ()
>>> stat_identity: (...=) + ()
>>> position_identity: ()
>>> mapping: ()
>>>
>>> [[2]]
>>> geom_point: (shape=19, colour=black, size=2) + ()
>>> stat_identity: (...=) + ()
>>> position_identity: ()
>>> mapping: ()
>>
>> Compare geom_point(data=mtcars) with str(geom_point(data =mtcars))
>> (which throws an error but you should be able to see enough).  So the
>> layers aren't printing out their dataset if they have one - another
>> bug.  I'll add it to my todo.
>
> I see. I did not know the `str` function. very useful.
>
>>> [...]
>>> About the other solution:
>>>
>>>>> When tinkering a bit more with this I thought that the more  
>>>>> natural
>>>>> and "ggplot" way to do it, IMHO, would be to have a new  
>>>>> addition (`
>>>>> +`) method for the ggplot class and be able to do:
>>>>> p = p1 + p2
>>>>> and have p containing both plots, on the same scale (the union  
>>>>> of the
>>>>
>>>> You were obviously pretty close to the solution already!  - you  
>>>> just
>>>> need to remove the elements that p2 already has in common with  
>>>> p1 and
>>>> just add on the components that are different.
>>>
>>> I would love to be able to do so because this way I can define  
>>> custom
>>> plot functions that all return me a ggplot object and then combine
>>> these at will to get final plots (Ex: one function for the  
>>> coastline,
>>> another for stations coordinates, another one which gets one data
>>> value, yet another for bathymetry contours etc etc.). This modular
>>> design would be more efficient than to have to predefine all
>>> combinations in ad hoc functions (e.g. one function for coast+bathy
>>> +stations, another for coast+stations only, another for coast+bathy
>>> +stations+data1, another for... you get the point).
>>> However I don't see what to add and what to remove from the objects.
>>> Specifically, there is only "data" element in the ggplot object  
>>> while
>>> my two objects (p1 and p2) both contain something different in  
>>> $data.
>>> Should I define p$data as a list with p$data[[1]]=p1$data and p$data
>>> [[2]]=p2$data?
>>
>> You can do this already :
>>
>> sample <- c(geom_point(data = coast), geom_path(data = streams),  
>> coord_equal())
>> p + sample
>>
>> I think the thing you are missing is that the elements in ggplot()  
>> are
>> just defaults that can be overridden in the individual layers
>> (although the bug above means that isn't working quite right at the
>> moment).  So just specify the dataset in the layer that you are
>> adding.
>>
>> You can do things like:
>>
>> p <- ggplot(mapping = aes(x=lat, y = long)) + geom_point()
>> # no data so there's nothing to plot:
>> p
>>
>> # add on data
>> p %+% coast
>> p %+% coords
>
> That's great!
> In fact I think I found exactly what I was looking for. I can just do:
>   p = ggplot() + coord_equal()
>   p$aspect.ratio = 1
> to set up the plot, and then add the layers and have ggplot take  
> care of resizing and laying out everything automagically:
>   p = p + geom_path(data=coast, mapping=aes(x=lon, y=lat))
>   p = p + geom_point(data=coords, mapping=aes(x=lon, y=lat))
>   p = p + geom_text(data=coords, mapping=aes(x=lon, y=lat,  
> label=station))
>   etc...
> Oh, I love ggplot ;) !
>
>> The data is completely independent of the plot specification.   
>> This is
>> very different from the other plotting models in R, so it may take a
>> while to get your head around it.
>
> Yes, indeed. That's a completely new way of thinking (especially  
> given my MATLAB, Scilab background) but how powerful! I found the  
> whole "data mapping" concept very elegant but did not grasp all the  
> flexibility behind it. I wonder how mainstream it can

Re: [R] Plotting from different data sources on the same plot (with ggplot2)

2007-09-30 Thread jiho
This would probably also be interesting to some:

On 2007-October-01  , at 00:48 , hadley wickham wrote:
>> That's great!
>> In fact I think I found exactly what I was looking for. I can just  
>> do:
>> p = ggplot() + coord_equal()
>> p$aspect.ratio = 1
>> to set up the plot, and then add the layers and have ggplot take care
>> of resizing and laying out everything automagically:
>> p = p + geom_path(data=coast, mapping=aes(x=lon, y=lat))
>> p = p + geom_point(data=coords, mapping=aes(x=lon, y=lat))
>> p = p + geom_text(data=coords, mapping=aes(x=lon, y=lat,
>> label=station))
>> etc...
>> Oh, I love ggplot ;) !
>
> Or even less verbosely:
>
> p = ggplot(mapping = aes(x=lon, y=lat)) + coord_equal()
> p = p + geom_path(data=coast)
> p = p + geom_point(data=coords)
> p = p + geom_text(data=coords, aes(label = station))
>
>>> The data is completely independent of the plot specification.   
>>> This is
>>> very different from the other plotting models in R, so it may take a
>>> while to get your head around it.
>>
>> Yes, indeed. That's a completely new way of thinking (especially
>> given my MATLAB, Scilab background) but how powerful! I found the
>> whole "data mapping" concept very elegant but did not grasp all the
>> flexibility behind it. I wonder how mainstream it can get since so
>> many people are used to another graphics paradigm.
>
> It's definitely a big change, but I hope that people will see the
> potential benefits and invest some time learning it.  I definitely
> have a lot to improve on the documentation though!
>
>> Anyway, I just need to define a new geom_arrow now, to plot wind
>> velocity arrows at several locations, and I'll be a happy man. Is
>> there a specific reason why '...' arguments are not passed to grid
>> functions or is it just to keep the complexity under control? I am
>> thinking in particular that:
>> p = ggplot(coords) + geom_segment(mapping=aes(x=lon, y=lat,
>> xend=lon+0.03, yend=lat-0.02), arrow=arrow(length=unit(0.1,"inches")))
>> would do exactly what I want provided that the 'arrow' argument is
>> passed on to segmentsGrob which is used in geom_segment.
>
> In general, ... doesn't get passed on to the underlying grid function
> because there isn't a one-to-one mapping from geoms to grobs (take
> geom_boxplot for example).  However, funnily enough, I have just added
> the arrows argument to geom_segment for my sister, so if you let me
> know what platform you're on, I can send you an updated version.
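
To make the exchange above concrete, here is a self-contained sketch in
the spirit of those snippets, written for current ggplot2 (the coast and
coords data frames are invented stand-ins for the real coastline and
station tables, and u/v are made-up wind components):

library(ggplot2)
library(grid)   # for arrow() and unit()

coast  <- data.frame(lon = c(0, 1, 2, 3), lat = c(0, 0.5, 0.3, 1))
coords <- data.frame(lon = c(0.5, 1.5, 2.5), lat = c(0.2, 0.4, 0.8),
                     station = c("A", "B", "C"),
                     u = c(0.03, -0.02, 0.01), v = c(-0.02, 0.01, 0.03))

# default aesthetics go in ggplot(); each layer brings its own data
p <- ggplot(mapping = aes(x = lon, y = lat)) + coord_equal()
p <- p + geom_path(data = coast)
p <- p + geom_point(data = coords)
p <- p + geom_text(data = coords, aes(label = station), vjust = -1)
# wind arrows drawn with geom_segment() and its arrow argument
p <- p + geom_segment(data = coords,
                      aes(xend = lon + u, yend = lat + v),
                      arrow = arrow(length = unit(0.1, "inches")))
p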

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] clipping viewports

2007-10-01 Thread jiho
On 2007-October-01  , at 08:42 , Paul Murrell wrote:
> Mikkel Grum wrote:
>> Dear useRs,
>>
>> Why are the rotated blue and yellow boxes in the example below  
>> clipped outside of 6 x 6 inch window in the middle of the page??  
>> Where does the 6 x 6 inch window come from? I would like to make  
>> use of the entire page.
>
>
> 6x6 corresponds to the default size of the pdf() device, so it looks
> like an issue with the implementation of paper="a4".  Indeed, this
> variation works as expected for me ...
>
> pdf(file = "FarmMaps.pdf", width=7.6, height=11)
>
> I will look at why the original approach does not work.

I have been bitten by this several times already. From what I
understood, the 'paper' argument just specifies the size of the sheet
on which a device of size 'width' x 'height' is printed. The only
relations between the paper argument and the device size arguments are:
- when paper is left at its default, "special", the sheet simply takes
the size of the device;
- when the device size is larger than the paper size, the device is
resized to fit on the sheet.
I would also like the device to fill the whole sheet when the paper
argument is given, unless overridden by explicit width and height
arguments.
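
A small sketch of the three cases (file names and sizes are arbitrary):

pdf("a4_default.pdf", paper = "a4")    # device keeps its default size, centred on the A4 sheet
plot(1:10)
dev.off()
pdf("a4_full.pdf", paper = "a4", width = 7.6, height = 11)   # explicit size, close to the full sheet
plot(1:10)
dev.off()
pdf("special.pdf", width = 7.6, height = 11)   # paper = "special" (the default): sheet follows the device
plot(1:10)
dev.off()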

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Version control and R package development (was: plot graph with error bars trouble)

2007-10-01 Thread jiho
First a quick summary of the beginning of an interesting discussion  
(not all indented correctly but in the correct order at least):

> From: "Gabor Grothendieck" <[EMAIL PROTECTED]>
> On 9/30/07, hadley wickham <[EMAIL PROTECTED]> wrote:
>> On 9/30/07, jiho <[EMAIL PROTECTED]> wrote:
>>> [...]
>>> BTW, have you thought about opening ggplot2 development (provide a
>>> way to check out the dev code and have the possibility to submit
>>> patches at least) or do you prefer to keep it a personal project for
>>> now? [...]
>>
>> It's something I have thought a little bit about, but I haven't made
>> much progress. Ideally, if it's something that I do for ggplot2, I
>> should do it for all my other R packages too.  I have thought about
>> setting up google code projects for each package, which would also
>> provide a nice set of bugtracking tools.  I've cc'd Gabor on this
>> email in the hope that he might describe his experiences with this
>> approach.
>>
>>> [...]
>>
>> The one thing that google code currently lacks is a nice timeline +
>> browser interface.  I find this very useful for GGobi
>> (http://src.ggobi.org) and would like to maintain that functionality
>> somehow.  It also makes it easier to track progress of the code
>> through rss, or intermittent reading of the trac site.
>>
>> [...]
>
> If you already know svn then google code is very easy to use.  Setting
> yourself up on it is really just a few minutes of work in that  
> case.  I have
> used other similar sites but google code is by far the easiest one to
> work with of the ones I have tried. By default everyone has read  
> access
> and only you have write access so you still control the project.   
> You can
> browse through the R projects that are already in google code here:
> http://code.google.com/hosting/search?q=label:R


> From: "hadley wickham" <[EMAIL PROTECTED]>
> On 10/1/07, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
>> On 10/1/07, hadley wickham <[EMAIL PROTECTED]> wrote:
>>>
>>> The biggest drawback (to me) of both google code and R-forge is their
>>> failure to offer a nice interface to browse the svn repository and
>>> view the timeline of changes.  I particularly like trac (e.g.
>>> http://src.ggobi.org/) despite its many problems, and I don't think I
>>> want to do without that convenient view of my code.
>>
>> Maybe you are referring to something else but both R-Forge and
>> Google code allow you to browse the svn repository over the
>> internet from within a web browser.  In Google code click on the
>> Source
>> tab and then the Subversion repository link.  For example,
>
> Yes, but compare with:
>
> http://src.ggobi.org/timeline for seeing what has changed recently  
> and by who
> http://src.ggobi.org/browser for easily navigating the repository and
> stepping back through revisions
>
> You can also subscribe to the RSS feed of the project timeline to keep
> track of what is changing.


On 2007-October-01  , at 20:24 , Gabor Grothendieck wrote:
> On 10/1/07, hadley wickham <[EMAIL PROTECTED]> wrote:
>>> These seem nearly identical to what you can get with R-Forge or with
>>> TortoiseSVN (and likely other svn clients too).  Since any developer
>>> is likely to have an svn client, a web interface more sophisticated
>>> than what is already available via the net has less utility than if
>>> this info were not already available anyway.  Google code can send
>>> out email alerts.  On the other hand, the complexity of dealing with
>>> Trac is a significant disadvantage for projects the size of an R
>>> package.  I previously used Trac for Ryacas but currently use a
>>> WISHLIST and a NEWS file (both plain text files created in a text
>>> editor) plus the svn log, and find that adequate.  Clearly a lot of
>>> this is a matter of taste and of project size and there is no right
>>> answer.
>>
>> That's true.  From my perspective, using a command line svn client on
>> OS X, I certainly prefer the web interface for exploring past  
>> commits.
>>  However, while any developer will have a svn client, a more casual
>> user or someone just interested in looking at the code won't, and I
>> don't think the google interface is that friendly.  (Mind you, that's
>> probably not a very common use case).

Re: [R] Version control and R package development (was: plot graph with error bars trouble)

2007-10-01 Thread jiho

On 2007-October-01  , at 23:13 , jiho wrote:
> [...]
> The end result looks like this:
>   http://cbetm.univ-perp.fr/irisson/svn/
> to see what an inner view looks like:
>   http://cbetm.univ-perp.fr/irisson/svn/distribution_data/tetiaroa/ 
> trunk/data/?rev=0&sc=1
> You can play around with diffs, view logs, compare files etc.

Forgot to mention that it also has RSS feeds (this seemed important
to Hadley); I just disabled them since they are of little use to me.

Cheers,

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] pdf() device uses fonts to represent points - data alteration?

2007-10-04 Thread jiho
Hello all,

I discovered that the pdf device uses fonts to represent "points"
symbols (as in plot(..., type="p", ...)). Namely, it uses ZapfDingbats
with symbol U+25cf. This can lead to problems when the font is not
available, or is available in another version (points being replaced
by other symbols, or worse, slightly displaced). Furthermore, it also
causes problems when opening the pdf files for editing in other
programs. I know that for reproducibility one should avoid doing this,
but there are cases where R is simply not suited to produce the final
graphic directly from code (e.g. replacing some colors by CMYK versions
for color consistency in print). In addition, publishers often like
being able to retouch graphics to ensure font consistency and such, and
this will be destructive in the case of these pdfs. For example,
Inkscape interprets the points as squares (more like U+2751 in
ZapfDingbats) and Adobe Illustrator does not even recognize the font
(substituting AdobePiStd).
I tried to embed the fonts with embedFonts() but this does not solve
the editing issue (Inkscape produces a kind of star and AI still chokes
on the font) and, worse, it changes how the original graphic renders in
pdf viewers: the circles are now filled (I believe this is the default
state of the ZapfDingbats character).

So my questions are:
- does anyone have a workaround for this?
- why can't the pdf device use shapes instead of fonts to represent
data points? That would seem a much more robust approach and would
ensure that the points are rendered the same everywhere. Font
substitution in axis labels is not as bad since it does not modify the
data itself (at worst the labels are offset a little), but font
substitution on the data points can really harm the graphic.

Example code:
pdf("test.pdf")
plot(0, 0, xlab="", ylab="", bty="n", xaxt="n", yaxt="n")  # a single point symbol, no axes
grid(lty=1)
dev.off()
embedFonts("test.pdf", "pdfwrite", "test_embed.pdf")       # attempt to embed the fonts

To list the fonts used in the file:
pdffonts test.pdf

and a package with the two pdf files and bitmaps of how they render  
or are interpreted in various programs:
http://jo.irisson.free.fr/dropbox/test_R_pdf_fonts.zip

Thank you in advance for your attention and help.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Linux editor for R+LaTeX, but not Emacs

2007-10-05 Thread jiho

On 2007-October-05  , at 18:56 , Christian Salas wrote:
> With Tinn-R (in Windows) it is possible to run latex and R from the same
> editor, which was great. Now I am using Ubuntu Linux, which has been
> better than WinXP. Currently I am using Emacs (with ESS installed)
> to run LaTeX and R from the same editor, as I was doing with Tinn-R
> in Windows. Nevertheless, even though I have been using Emacs for
> almost 2 years, it is not as friendly as I would like.
> I was wondering if you know of a Linux editor from which it is
> possible to run R and compile LaTeX files.

I would look for an editor with scripting capabilities; then sending
the selection or current line to a terminal, and running latex on the
current file, are easy to set up. If you want more specialised features
such as snippets or auto-completion, you should probably keep a
dedicated editor.
I am currently using TextMate on OS X and find it very powerful in
combination with a plain old terminal[1]. An "equivalent" of TextMate
on Linux could be Scribes:
http://scribes.sourceforge.net/
I don't know whether it has functions designed explicitly for R or
LaTeX, but it is extensible and chances are that people have already
written scripts you can use.
I would be interested in knowing what you find, since I essentially
have the same needs and stayed on OS X in large part because of
TextMate. I would be happy to find a Linux replacement and ditch my
half-bitten apple for a penguin.

[1] http://jo.irisson.free.fr/?p=32

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pdf() device uses fonts to represent points - data alteration?

2007-10-31 Thread jiho
Thank you very much for your answer, even so long after I first  
posted the message.

On 2007-October-31  , at 12:00 , Paul Murrell wrote:
> Hi
>
> jiho wrote:
>> Hello all,
>> I discovered that the pdf device uses fonts to represent "points"   
>> symbols (as in plot(...,type="p",...) ). Namely it uses  
>> ZapfDingbats  with symbol U+25cf. This can lead to problems when  
>> the font is not  available, or available in another version (such  
>> as points being  replaced by other symbols, or worst: slightly  
>> displaced).  Furthermore, it also causes problems when opening the  
>> pdf files for  editing in other programs. I know that for  
>> reproducibility one should  avoid doing this but there are cases  
>> where R is simply not suited to  produce the end result graphic  
>> directly using code (Ex: replace some  colors by CMYK versions for  
>> color consistency in print). In addition,  publishers also often  
>> like being able to retouch graphics to ensure  fonts consistency  
>> or such, and this will be destructive in the case  of these pdfs.  
>> For example, Inkscape interprets points as squares  (more like U 
>> +2751 in ZapfDingbats) and Adobe Illustrator does not  even  
>> recognize the font (substituting AdobePiStd).
>> I tried to embed fonts with embedFonts() but this does not solve
>> the  issue with editing (Inkscape produces a kind of star and AI  
>> still  chokes on the font) and worst, it modifies how the original  
>> graphic  renders in pdf viewers: the circles are now filled (I  
>> believe this is  because this is the default state of the  
>> ZapfDingbats character).
>> So my questions are:
>> - does anyone have a work around this?
>> - why can't the pdf device use shapes instead of fonts to  
>> represent  data point? It would appear as a much more robust  
>> approach and would  ensure that the points are rendered the same  
>> everywhere. Font  substitution in axes labels is not as bad since  
>> it does not modify  the data itself (at worst the labels are  
>> offset a little bit) but  font substitution on the data points can  
>> really harm the graphic.
>
> If I recall correctly, the PDF device uses a character for small  
> circles because that looks better.  There is no PDF circle  
> primitive, so circles have to be drawn using bezier curves.  The  
> original author may be able to elaborate on that.

OK, I indeed suspected that PDF had no circle primitive.
That's a good reason.

> Two suggestions for workarounds:
> (i)  produce PostScript and then convert to PDF using something  
> like ghostscript (e.g., ps2pdf)
> (ii)  use an almost-but-not-quite opaque colour, e.g., rgb(0, 0,  
> 0, .99) for the points.  If the points are not fully opaque, the  
> character is not used.

(ii) is really good to know (and I would probably never have found it  
myself). (i) is not applicable since I use PDF to keep transparency.
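
For the record, a minimal sketch of workaround (ii); the alpha value
just needs to be slightly below 1, and more recent versions of R also
offer a useDingbats = FALSE argument to pdf(), which sidesteps the font
entirely (file name arbitrary):

pdf("test_noglyph.pdf")
x <- rnorm(20); y <- rnorm(20)
plot(x, y, col = rgb(0, 0, 0, 0.99))  # not fully opaque, so circles are drawn as curves, not glyphs
dev.off()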

Thanks for your help. I still think that avoiding fonts altogether
would be preferable, because really strange things can happen with
fonts, while Bezier curves are robust and do not depend at all on the
rest of the OS. In this particular matter, robustness should probably
be preferred over appearance, since the data itself is involved.
Anyway, I'm fine with your workaround for now. Maybe I should file a
bug report so this can be addressed in a future release.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pdf() device uses fonts to represent points - dataalteration?

2007-10-31 Thread jiho

On 2007-October-31  , at 17:01 , Greg Snow wrote:
> Another approach is to use the my.symbols function in the  
> TeachingDemos
> package (in place of the points function) and define how you want your
> circles represented (a polygon with enough sides is a good  
> approximation
> to a circle for most cases).

Thank you for your suggestion, but unfortunately this won't solve the
problem here since I am not using base graphics and the 'points'
function. I am using ggplot, which in turn uses Grid for the plotting
part. In fact I never even call the point-drawing function in Grid
myself; it is called when appropriate (e.g. when adding outliers to a
boxplot). I could of course modify the source of Grid to plot polygons
instead of points and recompile the package, but the problem is really
in the pdf part (since outputting to SVG draws real circles), so I
would rather not tweak the rest to get the pdf right.
Thanks anyway.
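
For base-graphics users, though, here is a minimal sketch of Greg's
suggestion (the argument names follow my reading of the my.symbols()
help page, where a two-column matrix of coordinates in [-1, 1]
describes the symbol; treat them as assumptions):

library(TeachingDemos)
theta <- seq(0, 2 * pi, length.out = 33)
circle <- cbind(cos(theta), sin(theta))   # 32-sided polygon approximating a circle
x <- rnorm(20); y <- rnorm(20)
pdf("test_polygons.pdf")
plot(x, y, type = "n")                    # set up the axes only
my.symbols(x, y, symb = circle, inches = 0.1, add = TRUE)
dev.off()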

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pdf() device uses fonts to represent points - data alteration?

2007-11-01 Thread jiho
On 2007-November-01  , at 20:18 , Thomas Petzoldt wrote:
> I had the same problem. When opening PDFs with a recent development
> version of Inkscape all circles were replaced by the letter "q";
> see a screenshot of the imported figure:
>
> http://www.simecol.de/figs/R_pdf_inkscape.png
>
> I spent at least two hours trying different development versions of  
> inkscape, different versions of R, reading docs, trying different  
> machines, installing fonts etc., finally giving up. Now, the two  
> workarounds of Paul Murrell indeed solved the problem for *me*.  
> Thank you. Here are example results of workarounds (i) and (ii):
>
> http://www.simecol.de/figs/R_ps_pdf_inkscape.png
>
> or
>
> http://www.simecol.de/figs/R_pdftrans_inkscape.png
>
> One problem remains. I wanted to demonstrate post-editing PDFs with
> Inkscape to motivate students to use R for graphics even if they
> don't want to "become programmers". However, double conversion (via
> postscript) or the magic of transparency and opaqueness is not
> really the way to increase the trust of point-and-click users in R.
> Maybe this is a topic for r-devel?

By the way, depending on which OS you are on, you may find an entirely
SVG workflow more suitable:
R with the RSvgDevice package -1-> SVG figures -2-> Inkscape -3->
whatever you like (SVG, PNG, PDF...)
This hands transparency, fonts, etc. over to Inkscape, so that side is
fine. The only "problem" with this workflow for me is that many of my
plots stay between stages 1 and 2, and I like to be able to view them
quickly. I would need a quick SVG viewer but there are none on OS X. If
you are on Linux, many document viewers (eog, evince, gThumb) can
display SVGs, so you would be all set.
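
As a concrete sketch of stage 1, using the svg() device available in
grDevices on cairo-capable builds of R (RSvgDevice provides its own
device function with the same role; file name arbitrary):

svg("map.svg", width = 6, height = 4)
plot(rnorm(50), rnorm(50))   # points come out as real vector shapes, not font glyphs
dev.off()
# map.svg can then be opened in Inkscape and exported to PDF or PNG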

JiHO
---
http://jo.irisson.free.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.