Dear All,

I am reposting because my problem is a real issue that I have been working on for a while. I know this might be simple for those who know how to do it, but I need help! Let me make my point clear.

I have a huge number of data points plotted using either the base plot() function or xyplot() in lattice (I prefer lattice).

      name xvar           p
    1   M1    1 0.107983837
    2   M2   11 0.209125624
    3   M3   21 0.163959428
    4   M4   31 0.132469859
    5   M5   41 0.086095130
    6   M6   51 0.180822010
    7   M7   61 0.246619925
    8   M8   71 0.147363687
    9   M9   81 0.162663127
    ........ 5000 observations

I need to plot xvar (x variable) against p (y variable) using either plot() or xyplot(). I want to show (print on the graph) the name labels only for those rows that have a p value < 0.01 (meaning that they are significant). With my limited R knowledge I can use text(x, y, labels) to add the text manually, but I have a huge number of data points (I provide just 5000 here, but potentially it can go up to 50,000). So I want to display the name corresponding to those observations (rows) whose p value is below the threshold (0.01 in this example).

Here is my example dataset and where I have got to:

    name <- paste("M", 1:5000, sep = "")
    xvar <- seq(1, 50000, 10)
    set.seed(134)
    p <- rnorm(5000, 0.15, 0.05)
    dataf <- data.frame(name, xvar, p)

    # using lattice (my first preference)
    require(lattice)
    xyplot(p ~ xvar, dataf)

    # I want to display names for the observations that meet the requirement p < 0.01
    which(dataf$p < 0.01)
    [1]  811  854 1636 1704 2148 2161 2244 3205 3268 4177 4564 4614 4639 4706

Thus the significant observations are:

          name  xvar             p
    811   M811  8101  0.0050637068
    854   M854  8531 -0.0433901783
    1636 M1636 16351 -0.0279014039
    1704 M1704 17031  0.0029878335
    2148 M2148 21471  0.0048898232
    2161 M2161 21601 -0.0354130557
    2244 M2244 22431  0.0003255200
    3205 M3205 32041  0.0079758430
    3268 M3268 32671  0.0012797145
    4177 M4177 41761  0.0015487439
    4564 M4564 45631  0.0024867152
    4614 M4614 46131  0.0078381964
    4639 M4639 46381 -0.0063151605
    4706 M4706 47051  0.0032200517
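The rows listed above can also be pulled out in one step with logical indexing rather than being read off the which() output. This is only a minimal sketch on the example data; `threshold` and `sig_rows` are illustrative names, not part of the original code.

```r
# Rebuild the example data from the post
name <- paste("M", 1:5000, sep = "")
xvar <- seq(1, 50000, 10)
set.seed(134)
p <- rnorm(5000, 0.15, 0.05)
dataf <- data.frame(name, xvar, p)

threshold <- 0.01                          # assumed cut-off, change as needed

# Keep only the rows whose p is below the cut-off
sig_rows <- dataf[dataf$p < threshold, ]
print(sig_rows)
```

The xvar, p and name columns of `sig_rows` are then exactly the coordinates and labels that need to go on the plot.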
I want the data point (8101, 0.0050637068) labelled with M811 in the plot, and similarly for all of the significant observations above. I do not want to label the rest of the 5000 observations, which do not have a p value < 0.01.

I know I can add the labels manually with text() after plot() in base graphics:

    plot(dataf$xvar, p)
    text(8101, 0.0050637068, "M811")
    text(8531, -0.0433901783, "M854")

I need more automation to deal with as many as 50,000 observations. In practice, I do not know how many variables there will be.

Your help is highly appreciated.

Thank you;

Best Regards

Umesh R

_____

From: Umesh Rosyara [mailto:rosyar...@gmail.com]
Sent: Saturday, March 05, 2011 12:30 PM
To: 'r-help@r-project.org'
Subject: displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function

Dear R users,

Here is my problem:

    # example data
    name <- paste("M", 1:1000, sep = "")
    xvar <- seq(1, 10000, 10)
    set.seed(134)
    p <- rnorm(1000, 0.15, 0.05)
    dataf <- data.frame(name, xvar, p)

    plot(dataf$xvar, p)
    abline(h = 0.05)

    # I can find out which observation numbers are less than 0.05
    which(dataf$p < 0.05)
    [1]  12  20  80 269 272 338 366 368 397 403 432 453 494 543 592 691 723 789 811
    [20] 854 891 931 955

I want to display (label) the corresponding names on the plot above: that means M12 for the 12th observation, M20 for the 20th observation, and so on. Please note that my real names are not in a numerical sequence (they are all different names); I only used sequential names here to create the example dataset easily.

Thanks in advance

Umesh R
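One possible way to automate the labelling described above is to build a logical vector once and pass only the matching rows to text() (base graphics) or panel.text() (lattice). The following is a sketch under stated assumptions, not a definitive solution: it uses the 5000-row example data, a single-panel xyplot, and the illustrative names `threshold`, `sig` and `lp`.

```r
library(lattice)

# Example data as in the post
name <- paste("M", 1:5000, sep = "")
xvar <- seq(1, 50000, 10)
set.seed(134)
p <- rnorm(5000, 0.15, 0.05)
dataf <- data.frame(name, xvar, p)

threshold <- 0.01                 # assumed cut-off
sig <- dataf$p < threshold        # TRUE for the rows that should get a label
lab <- as.character(dataf$name)   # labels as plain character strings

# Base graphics: plot everything, then label only the selected points
plot(dataf$xvar, dataf$p, xlab = "xvar", ylab = "p")
abline(h = threshold, lty = 2)
text(dataf$xvar[sig], dataf$p[sig], labels = lab[sig], pos = 3, cex = 0.7)

# Lattice: the same idea inside a custom panel function.
# 'subscripts' gives the row numbers of dataf drawn in this panel.
lp <- xyplot(p ~ xvar, data = dataf,
             panel = function(x, y, subscripts, ...) {
               panel.xyplot(x, y, ...)
               panel.abline(h = threshold, lty = 2)
               keep <- sig[subscripts]
               panel.text(x[keep], y[keep],
                          labels = lab[subscripts][keep],
                          pos = 3, cex = 0.7)
             })
print(lp)
```

Because the selection is computed from the data, the same code should work unchanged for 50,000 rows or for names that are not sequential; only `threshold` needs adjusting. With many significant points the labels may overlap, so a smaller `cex` or a stricter threshold may be needed.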