[R] Pattern Matching within Vector?
Dear mailing list,

I'm stuck with a tricky problem here - at least it seems tricky to me, as I'm not really talented in pattern matching and regex matters.

I'm analysing amino acid mutations by position and type of mutation. E.g. (fictitious example) in position 92, I can find L92V, L92MV, L92I... In this example, L is the wild-type amino acid, and everything after the position number is a mutation (a single amino acid or a mixture). I'm only interested in the mutation information.

Say I've got this vector:

bla <- c("V", "MV", "I", "IL", "PT", "M", "E", "OM")

I'd like to count only those elements that are "truly unique" mutations, i.e. count "V", "MV" as 1, "I", "IL" as 1, "PT" as 1, "M" as 1, "E" as 1, and not count "OM".

I could do it iteratively:

Element 1: V. Keep. Keep = V
Element 2: MV. Match Keep vs New -> 1. I already have a V, so don't count.
Element 3: I. Match Keep vs New -> 0. I is new, keep. Keep = V,I
Element 4: IL. Match Keep vs New -> 1. I already have an I, so don't count.
Element 5: PT. Match Keep vs New -> 0. PT is new, keep. Keep = V,I,PT
Element 6: M. Match Keep vs New -> 0. M is new, keep. Keep = V,I,PT,M
Element 7: E. Match Keep vs New -> 0. E is new, keep. Keep = V,I,PT,M,E
Element 8: OM. Match Keep vs New -> 1. I already have an M, so don't count.

Keep vector = (V,I,PT,M,E), count = 5. OK.

There must be a more elegant way to do this! Something with vector-wise pattern matching or so?...

By the way, I don't care e.g. which of "V" or "MV" is counted; what is important is that they are only counted as 1.

Thanks for your help!
Anne-Marie

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
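The iterative procedure described in the post can be written down fairly directly. What follows is only a minimal sketch, assuming that two elements "match" whenever they share at least one amino-acid letter, and that elements are processed in vector order. The function name count_unique_mutations is hypothetical and not part of any package.

```r
# Greedy pass over the vector: an element is kept (counted) only if it
# shares no single letter with any element that was kept before it.
count_unique_mutations <- function(x) {
  keep <- character(0)
  for (el in x) {
    letters_new  <- strsplit(el, "")[[1]]      # letters of the candidate
    letters_kept <- unlist(strsplit(keep, "")) # all letters kept so far
    if (!any(letters_new %in% letters_kept)) {
      keep <- c(keep, el)
    }
  }
  keep
}

bla  <- c("V", "MV", "I", "IL", "PT", "M", "E", "OM")
kept <- count_unique_mutations(bla)
kept          # the elements that were counted
length(kept)  # the number of "truly unique" mutations
```

On the example vector this keeps V, I, PT, M and E, i.e. a count of 5, as in the manual walk-through. Note that the result depends on the order of the elements, which is consistent with the remark that it does not matter which of "V" or "MV" is counted.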
[R] position legend below x-axis title
Dear helpers,

I'm using an R script on several different datasets, which means that axis scales may vary quite a lot from dataset to dataset. So what I'm looking for now is how to automagically position the legend (horizontal) in the space below the x-axis title, and how to make sure that the legend stays within the limits of the lower inner or outer margin.

I'm aware of plot, device and figure regions, of the "din, fin, pin, usr, mai, mar, omi, oma and xpd" parameters, and of inner and outer margins. I cannot simply position my legend at "minus something", as that depends on "usr" coordinates, and those depend on the scale of the y-axis.

I tried finding the ideal spot by taking the figure height and subtracting the upper margin, the height of the plot region, and half of the lower margin. This is quite tedious and doesn't help me, as I get a spot in inches, which doesn't correspond to the "usr" coordinates. I also tried to convert between inches and "usr" coordinates using "xy.coords", but the result did not correspond to the position returned by "locator".

I'm also aware of "mtext" and its fabulous arguments "side" and "line", but there I lose the functionality of displaying the legend symbols.

Finally, I found a hint about using "layout", but there I admit I would need more than a hint - a small tutorial or graphical example with code would be very helpful.

So, the question is whether there is by any chance a way to pass "side" and "line" arguments to "legend", and if not, what is the best way to do what I'm trying to do?

BTW, the plot types involved are mostly line plots and barplots.

Thanks a lot for your help,
Anne-Marie
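One possible approach to the inches-versus-"usr" problem described above, sketched here under assumptions: base graphics provides grconvertX() and grconvertY(), which convert between coordinate systems such as normalised figure coordinates ("nfc"), normalised plot coordinates ("npc") and "user" coordinates. The data, margin size, colours and the 0.02 offset below are made up for illustration, and the null PDF device is used only so the sketch runs without opening a window.

```r
pdf(NULL)                            # off-screen device, no file is written
op <- par(mar = c(7, 4, 2, 2) + 0.1) # enlarge the bottom margin for the legend

x <- 1:10
plot(x, x^2, type = "l", col = "blue", xlab = "x", ylab = "y")
lines(x, 50 + 5 * x, col = "red", lty = 2)

# x: horizontal centre of the plot region; y: 2% above the bottom edge of
# the figure region. Both are converted to user coordinates, so the spot
# does not depend on the scale of the axes.
leg_x <- grconvertX(0.5,  from = "npc", to = "user")
leg_y <- grconvertY(0.02, from = "nfc", to = "user")

leg <- legend(leg_x, leg_y,
              legend = c("series A", "series B"),
              col = c("blue", "red"), lty = c(1, 2),
              horiz = TRUE, bty = "n",
              xjust = 0.5, yjust = 0, # centred, bottom edge sits at leg_y
              xpd = TRUE)             # allow drawing outside the plot region

usr <- par("usr")                     # plot-region limits in user coordinates
par(op)
dev.off()
```

Because the position is given relative to the figure region and only then converted, the same two lines should work regardless of the y-axis scale. The bottom margin still has to be large enough to hold the axis title and the legend.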
Re: [R] Sweave / Latex per-chapter output
Hello to all who have helped me on this topic,

first I need to apologize for apparently replying only now... In fact I use the "Pan" newsreader to read the list. I posted a reply to the thread through Pan a week after your suggestions, and I only now realised that the posting never arrived on the list, although Pan gave me no error message at all! So, let me try again using the good old email-to-email way.

You all helped me so much! This is what I'm doing now:

- I separated my large file into several chapter files.
- In each file, I include a Sweave options file using "\SweaveInput".
- I make sure to first run a "pre" file, which checks whether some files containing data from the database that I need repeatedly are there or not, and whether they are not too old. If the files are absent or expired, the database is queried and the files are recreated.
- In each file, I then "source" an init.R file, which reads the previously created files and sets some "global" variables I'll need all the time. So, the querying is done at most once, while the file reading and variable setting is re-run for each chapter, which doesn't seem to be a problem (performance-wise).
- In my master tex file, I include the different chapters. If I want to generate a PDF for only a single chapter, I cannot use the "\includeonly" directive, because I will always need to run "pre" first, and then the chapter I want. So, I just comment out the things I don't want to run.
- In "my" Makefile (it's Mark's really, with some minor adaptations), I specify the following to make sure "pre" is run first before running the individual chapters:

RNWFILES = pre.Rnw intro.Rnw $(wildcard c*.Rnw)

So, with all this I can get "whole document" PDFs or "per-chapter" PDFs - Great!

Actually, then I go on and feed the tex file(s) to "latex2html", a great tool, to generate HTML equivalents all ready with navigation, buttons and all.

Maybe it's not yet perfect, but for sure it is much, much better than before!
Greetings from Luxembourg,
Anne-Marie
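The "pre" step described in the post (query the database only if a cached file is absent or too old) could look roughly like the sketch below. This is not the poster's actual code: the helper names cache_is_fresh and load_cached, the 7-day limit, the use of saveRDS()/readRDS() and the stand-in query function are all assumptions made for illustration.

```r
# TRUE if the cache file exists and is at most max_age_days old.
cache_is_fresh <- function(path, max_age_days = 7) {
  if (!file.exists(path)) return(FALSE)
  age <- difftime(Sys.time(), file.info(path)$mtime, units = "days")
  as.numeric(age) <= max_age_days
}

# Re-run the (expensive) query only when the cache is missing or expired,
# then always read the data back from the cache file.
load_cached <- function(path, query_fun, max_age_days = 7) {
  if (!cache_is_fresh(path, max_age_days)) {
    saveRDS(query_fun(), path)
  }
  readRDS(path)
}

# Usage with a stand-in for the real database query (hypothetical data).
cache_file <- tempfile(fileext = ".rds")
n_queries  <- 0
fake_query <- function() {
  n_queries <<- n_queries + 1
  data.frame(region = c("A", "B"), population = c(1000, 2500))
}

pop1 <- load_cached(cache_file, fake_query) # cache missing: runs the query
pop2 <- load_cached(cache_file, fake_query) # cache fresh: reads the file only
n_queries                                   # number of times the query ran
```

With this layout each chapter can call load_cached() on its own, so the database is hit at most once per expiry period while the reading step is repeated per chapter, which matches the behaviour described in the post.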
[R] recursively divide a value to get a sequence
Hi,

if given the value of, say, 15000, I would like to be able to divide that value recursively by, say, 5, and to get a vector of a determined length, say 9, the last value being (set to) zero - i.e. like this:

15000 3000 600 120 24 4.8 0.96 0.192 0

These are in fact concentration values from an experiment. For my script, I get only the starting value (here 15000), and the factor by which the concentration is divided for each well, the last one having, by definition, no antagonist at all.

I have tried to use "seq", but it can "only" do a positive or negative increment. I didn't find a way with "rep", "sweep" etc. either. These functions normally start from an existing vector, which is not the case here, as I have only got a single value to start with.

I suppose I could do something "loopy", but I'm sure there is a better way to do it.

Thanks a lot for your help, hope the question is not too dumb...
Anne-Marie
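Since each value is the starting value divided by a power of the factor, the series can be built without a loop. The following is a minimal sketch of that idea; the function name dilution_series is hypothetical.

```r
# start / factor^0, start / factor^1, ..., followed by a final 0.
# n is the total length of the result, including the trailing zero.
dilution_series <- function(start, factor, n) {
  stopifnot(n >= 2, factor != 0)
  c(start / factor^(0:(n - 2)), 0)
}

conc <- dilution_series(15000, 5, 9)
conc
```

For a start of 15000, a factor of 5 and a length of 9, this yields the nine concentrations listed in the post, ending with 0 for the well without antagonist.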
Re: [R] recursively divide a value to get a sequence
Keith,

I am simply baffled! I didn't think for a second about doing it this way, tsss - Great! Thanks also for Daniel's, Jim's and Bart's proposals!

R is cool, I realise it every day again :-)

Thanks!!

On Wed, Jul 9, 2008 at 12:33 PM, Jim Lemon <[EMAIL PROTECTED]> wrote:
> On Wed, 2008-07-09 at 11:40 +0200, Anne-Marie Ternes wrote:
>> Hi,
>>
>> if given the value of, say, 15000, I would like to be able to divide
>> that value recursively by, say, 5, and to get a vector of a determined
>> length, say 9, the last value being (set to) zero- i.e. like this:
>>
>> 15000 3000 600 120 24 4.8 0.96 0.192 0
>>
>> These are in fact concentration values from an experiment. For my
>> script, I get only the starting value (here 15000), and the factor by
>> which concentration is divided for each well, the last one having, by
>> definition, no antagonist at all.
>>
>> I have tried to use "seq", but it can "only" do positive or negative
>> increment. I didn't either find a way with "rep", "sweep" etc. These
>> function normally start from an existing vector, which is not the case
>> here, I have only got a single value to start with.
>>
>> I suppose I could do something "loopy", but I'm sure there is a better
>> way to do it.
>>
> Well, if you really want to do it recursively (and maybe loopy as well)
>
> recursivdiv<-function(x,denom,lendiv,firstpass=TRUE) {
>  # reserve one slot for the starting value on the first call
>  if(firstpass) lendiv<-lendiv-1
>  if(lendiv > 1) {
>   divvec<-c(x/denom,recursivdiv(x/denom,denom,lendiv-1,FALSE))
>   # print the partial result at each level of the recursion
>   cat(divvec,"\n")
>  }
>  else divvec<-0
>  if(firstpass) divvec<-c(x,divvec)
>  return(divvec)
> }
>
> Jim
[R] tryCatch - return from function to main script
Dear helpers,

I've got a main script, which calls a function 4 times, once on each of 4 different datasets. This function runs "nls" and is located in another R script which is sourced into my main script.

What I would like to have is this: if, e.g. in the 3rd call of the function, nls fails because it can't converge, I would like the function to return an error (value or message), and the main script to continue with the 4th call.

I've tried "try", but it always completely stops execution. I've also played around with "tryCatch", but to be honest, the help page is quite cryptic to me. I'm sure "tryCatch" has a way of being told "ok, stop this, and continue with the main script".

As I'm quite in a hurry (I need to finish this before leaving tomorrow), I'd be glad if you could give me a very practical example - I promise to look deeper into the details of exception handling in R when I'm back.

Thanks a lot!!
Anne-Marie
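A practical pattern for this situation is to wrap the nls() call in tryCatch() with an error handler that returns a marker value (here NULL) instead of stopping. The sketch below is an illustration under assumptions: the exponential model, the starting values, the simulated data and the name fit_one are all made up. The third dataset has a constant x, which makes the gradient matrix singular so that nls() fails on purpose.

```r
# Fit the model; on any error, report it and return NULL instead of stopping.
fit_one <- function(dat) {
  tryCatch(
    nls(y ~ a * exp(b * x), data = dat, start = list(a = 2, b = 0.3)),
    error = function(e) {
      message("nls failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Simulated datasets (hypothetical); d3 is constructed so that nls() fails.
set.seed(1)
make_data <- function(x) {
  data.frame(x = x, y = 2 * exp(0.3 * x) + rnorm(length(x), sd = 0.2))
}
datasets <- list(d1 = make_data(1:10),
                 d2 = make_data(1:12),
                 d3 = make_data(rep(0, 10)),
                 d4 = make_data(1:15))

# The "main script": all four calls run, even though one of them fails.
fits   <- lapply(datasets, fit_one)
failed <- names(fits)[vapply(fits, is.null, logical(1))]
failed
```

The main script can then test each result with is.null() to decide what to do with failed fits. Wrapping the call as try(..., silent = TRUE) and checking inherits(result, "try-error") is a common alternative to the same effect.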
[R] genotype analysis
Dear mailing list,

I'm still quite a newbie in the statistical analysis of genotype/allele data, and more generally in the analysis of categorical variables. Moreover, I'm currently totally confused by the many R packages available to do such analysis.

Here is my case: I've got a list of genes and a number of case-control population pairs, and for each population and gene, the various genotypes that have been found. I've got both aggregate data (ex. gene1: homozygote wildtype: 201, heterozygote mutation carrier: 34, homozygote mutation carrier: 5) and per-gene data (i.e. for gene1 a list of e.g. "V/V", "V/I", "II" etc).

The question asked is whether there is a difference in the mutation pattern between the case and the control groups influencing the outcome, both at the level of a single gene and at the level of their combination. Moreover, I would like to check for linkage disequilibrium (LD), as I know that some of these genes are located quite closely on the chromosome.

OK, so up to now I've been doing Chi-square tests, the McNemar matched-pairs test, and the Fisher test if my numbers were too small. As for the LD question, if I have understood correctly, I have to use log-linear regression.

I have been trying several R packages, and I'm so confused now, because I don't know which one is best suited for my problem. I have to add that I'm also new to log-linear regression...

I've used "hwde", and read the paper on which it is based (see hwde doc), but the package leaves out certain output rows that are shown in the paper, and it doesn't show which of the output rows are significant, as the paper does. Is there any simple way to interpret "hwde" output (something like a p-value)?

Then there are the "GeneticsBase", "Genetics", "mapLD", "Hardy-Weinberg" packages. Some work only for a single gene, some apply a thing called "MLE", some "generalized linear models", etc.

I know these questions are as much basic statistics questions as R questions. But I'd be glad if you could help me find the best solution for my type of analysis, or point me to good resources that show me how to do this. The problem is that most resources show "how to" do the analysis, but they don't explain at all how to *interpret* their output.

Thanks a lot in advance,
Anne-Marie
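For the single-gene comparison mentioned above (Chi-square test, or Fisher test for small counts), the aggregate genotype counts can be arranged as a group-by-genotype table and passed to the base R tests. This is only a sketch of that one step: the first row reuses the example counts from the post, while the second row and the assignment of rows to "case" and "control" are invented for illustration. It does not address LD or the combination of genes.

```r
# 2 x 3 contingency table: groups in rows, genotypes in columns.
geno <- matrix(c(201, 34,  5,   # example counts from the post
                 190, 45, 12),  # hypothetical second group
               nrow = 2, byrow = TRUE,
               dimnames = list(group    = c("case", "control"),
                               genotype = c("wt/wt", "wt/mut", "mut/mut")))

chi <- chisq.test(geno)   # Pearson chi-square test of independence
fis <- fisher.test(geno)  # exact test, useful when expected counts are small

chi$expected              # expected counts under "no difference"
chi$parameter             # degrees of freedom
chi$p.value
fis$p.value
```

In both tests a small p-value indicates that the genotype distribution differs between the two groups more than would be expected by chance. The expected counts help to judge whether the chi-square approximation is reasonable or whether the exact test should be preferred.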
[R] Sweave / Latex per-chapter output
Dear R-help,

I am using Sweave and pdflatex to generate a large report from data contained in my database (Postgres via RODBC).

Currently, I work with a single R/Sweave file, containing several "chapter" indications for the Latex engine. My master tex file sets the document class, and includes the introduction, the main Sweave file, and a conclusions and reference file. I use a makefile to produce the final PDF (based on the thread "Sweave, R and complex latex projects": http://tolstoy.newcastle.edu.au/R/e2/help/06/11/4891.html)

What I would like to do is to be able to get 2 types of output with the same code (I'm lazy ;-) ):

1. my large report in a single PDF file, for printing out and distributing
2. a PDF and HTML file *per chapter*, for displaying on our website and allowing people to download individual chapters

I have tried the following things:

- see if pdflatex has an option to split PDF output per chapter; as far as I can see, it doesn't
- separate the Sweave file into chapter parts. The problems here are 1) that I do a certain number of R preparations (variable setting, table querying) whose results I will need in later parts of the code, and 2) that I would need to embed the generated tex files in per-chapter master tex files setting the documentclass and other options and including the chapter; I also tried to see if it was possible to tell pdflatex to assume documentclass X even if it wasn't specified in the file, but that doesn't seem to work either
- generate my large PDF report as usual and manually cut it into chapters (tedious)
- use R via PHP to output per-chapter HTMLs which I could turn into PDFs using output buffering; this works for the graphics, but I'm unable to get back e.g. tabular data for proper display; also I would lose the latex-y beauty of my PDF

As I'm a novice in Latex and Makefile usage, I'd be glad if you could tell me if what I want to do is feasible (I'm sure it is), and which would be the best, fussless method to do it (i.e. generate both types of output without changing the R/Sweave code).

I know you'll probably tell me to break my long Sweave code into smaller parts, but as I briefly said above, I do some variable setting and table querying at the start - things I will repeatedly need in later chapters (e.g. I query a population table for computing incidence rates several times in later chapters). If there is a better way to split the code without having to requery the database at each chapter, I'll be glad to know about that too!

BTW, I'm working on Ubuntu Linux.

Thanks a lot for your insight,
Anne-Marie