Hi, I have been trying unsuccessfully to plot data using different colors
based on a variable within a subset of an imported file. The file I am
reading is about 20000 lines long and has a column (in the example called
FILE) that contains approximately 100 unique entries. I would like to plot a
subset of the data from the file and key the color to the FILE column.
This is what my file looks like:
 
CHR         SNP        BP  NMISS      BETA     SE         R2         T       P  REGION     FILE       RANDOM
  1  rs17035189  10519610    135    0.3518  1.928  0.0002501    0.1824  0.8555    TCTX  4730341  0.284627081
  6   rs3763311  32484154    109     -2.05  1.624    0.01467    -1.262  0.2096    TCTX   670603  0.083147673
  6   rs3892710  32790839    106    0.5695  4.743  0.0001386    0.1201  0.9047    TCTX  7150403  0.549192815
  6   rs3864300  32379785    102     9.208  6.416    0.02018     1.435  0.1544    TCTX  7210017  0.837265988
  6   rs6912002  32873245     13    -1.295  5.043   0.005963   -0.2569   0.802    TCTX  2710441  0.170566699
  5   rs4024109  35955374      9     26.19  31.01    0.09245    0.8444  0.4263    TCTX  2650653  0.298573497
  6   rs3129719  32769757     16     10.35   7.44     0.1215     1.391  0.1859    TCTX  2900504  0.378538235
  6    rs476885  32402690    109  -0.09378  1.552  3.411e-05  -0.06041  0.9519    TCTX   670603  0.017970964
 10  rs12570766   5602540    139    0.6182   6.66  6.289e-05   0.09283  0.9262    TCTX  4560767  0.004973939
etc.


And this is the code that I have:

assoc_data <- read.table("master.out", header =TRUE)
par(fig=c(0, 10, 0,  10 )/10, mar=c(10,8,2,8),xpd=NA, cex.axis=2)
attach(assoc_data)
#these criteria change based on input from another file
curr_assoc <- assoc_data[CHR == 1 & BP > 500000 & BP < 1000000, ]

#count the number of transcripts
transcripts <- length(unique(curr_assoc$FILE))

#generate one color for each unique "FILE" entry in my subset of data
my_colors <- rainbow(transcripts)

plot(curr_assoc$BP, log10(curr_assoc$P)*-1, pch=20,
col=c(my_colors)[curr_assoc$FILE], ylim=c(-15, 15),xaxs="i", xlab=NA,
cex=0.7, cex.lab=2)
detach(assoc_data)


The problem is that when I plot this I only see (for example) 2 colors
instead of the expected 10. I believe the problem is that the FILE column
is being recoded when I read the table (as a factor?) and that only the
codes that fall within the range of my colors are being plotted (so if I
have 10 colors but there are 100 unique entries in my FILE column, and the
values recoded as 2, 7, 12, 34, 60, 64, 65, 70 and 71 are used, only 2 and 7
will be plotted).
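
A minimal sketch of the suspected mechanism and one possible workaround,
using a small made-up data frame (the BP values and the names file_index and
point_cols are illustrative assumptions, not part of my real script). The
idea is to re-code FILE inside the subset so that the codes run from 1 to
the number of unique entries, which should behave the same whether FILE was
read in as a factor or as a number:

```r
# Hypothetical stand-in for the subset; the values are for illustration only.
curr_assoc <- data.frame(
  BP   = c(510000, 620000, 730000, 840000, 950000),
  P    = c(0.8555, 0.2096, 0.9047, 0.1544, 0.8020),
  FILE = c(4730341, 670603, 7150403, 670603, 4730341)
)

# Indexing a short color vector with the raw FILE values (or with factor
# codes taken from the full table) can point past its end and return NA.
my_colors <- rainbow(length(unique(curr_assoc$FILE)))
bad_cols  <- my_colors[curr_assoc$FILE]

# Re-coding within the subset gives codes 1..k with no gaps.
file_index <- as.integer(factor(curr_assoc$FILE))
point_cols <- my_colors[file_index]
```

If this is right, point_cols could be passed as col=point_cols in the plot()
call above in place of c(my_colors)[curr_assoc$FILE].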

Many thanks for any suggestions/pointers, I have dug around in the help
archives for a couple of hours but no joy.
-----------------------
Andrew Singleton



