Hi,
I am given the following table:
> head(hsa_refseq)
   chr       genome region    start     stop nu strand nu.1    nu.2   gene_id
1 chr1 hg19_refGene    CDS 67000042 67000051  0      +    0 gene_id NM_032291
2 chr1 hg19_refGene   exon 66999825 67000051  0      +    . gene_id NM_032291
3 chr1 hg19_refGene    CDS 67091530 67091593  0      +    2 gene_id NM_032291
4 chr1 hg19_refGene   exon 67091530 67091593  0      +    . gene_id NM_032291
5 chr1 hg19_refGene    CDS 67098753 67098777  0      +    1 gene_id NM_032291
6 chr2 hg19_refGene   exon 67098753 67098777  0      +    . gene_id NM_032291

So far I have counted how many entries in the third column (region) are
"CDS" and how many are "exon":
sum(hsa_refseq$region=="CDS")
sum(hsa_refseq$region=="exon")

What I would like instead is to print, for each chromosome, how many rows
are CDS and how many are exons. For example:
chr1 has 5 CDS and 2 exons
chr2 has 10 CDS and 3 exons
...
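
One possible way to get per-chromosome counts is to cross-tabulate the chr and region columns with table(). The sketch below is only an illustration: it rebuilds a small stand-in data frame from the six rows shown above (only the chr and region columns, which is an assumption about what matters here), and the loop variable chrom is a made-up name.

```r
# Stand-in for hsa_refseq, rebuilt from the six rows printed by head();
# only the two columns needed for counting are included (assumption).
hsa_refseq <- data.frame(
  chr    = c("chr1", "chr1", "chr1", "chr1", "chr1", "chr2"),
  region = c("CDS", "exon", "CDS", "exon", "CDS", "exon"),
  stringsAsFactors = FALSE
)

# Cross-tabulate: one row per chromosome, one column per region type.
counts <- table(hsa_refseq$chr, hsa_refseq$region)
print(counts)

# Print one sentence per chromosome in the format described above.
for (chrom in rownames(counts)) {
  cat(chrom, "has", counts[chrom, "CDS"], "CDS and",
      counts[chrom, "exon"], "exons\n")
}
```

On the full table the same two calls should work unchanged, provided the region column contains both "CDS" and "exon" values; if one of them were missing entirely, the corresponding column would not exist in counts and the lookup would fail.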

Can you tell me what I should add? Or, if I am going about this the wrong
way, how should I do it?

Thank you,
Regards,
Nanami


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
