I am using a simple R statement to read in the file:

a <- read.csv("Sample.dat", header=TRUE)

There is a lot of data, but the first few lines look like:

DayOfYear,Quantity,Fraction,Category,SubCategory
1,82,0.0000390392720794458,(Unknown),(Unknown)
2,78,0.0000371349173438631,(Unknown),(Unknown)
. . .
71,2,0.0000009521773677913,WOMEN,Piratesses
72,4,0.0000019043547355827,WOMEN,Piratesses
73,3,0.0000014282660516870,WOMEN,Piratesses
74,14,0.0000066652415745395,WOMEN,Piratesses
75,2,0.0000009521773677913,WOMEN,Piratesses

If I read the data in as above, the command

a[1]

results in the output 

[ reached getOption("max.print") -- omitted 16193 rows ]]

Shouldn't this be the first row?
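
A minimal sketch of how data frame indexing behaves here, using a small made-up data frame (the name df and its values are hypothetical, not the contents of Sample.dat). With a single index, df[1] selects the first column as a one-column data frame, which would explain why so many rows get printed; df[1, ] with a comma selects the first row.

```r
# Hypothetical stand-in for the data read from Sample.dat
df <- data.frame(DayOfYear = 1:3,
                 Quantity  = c(82, 78, 2),
                 Category  = c("(Unknown)", "(Unknown)", "WOMEN"))

first_col <- df[1]     # one-column data frame holding DayOfYear
first_row <- df[1, ]   # one-row data frame holding every column

dim(first_col)         # rows x columns of the column selection
dim(first_row)         # rows x columns of the row selection
```

Applied to the data above, a[1, ] would be the call that shows the first row.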

a$Category[1]

results in the output

[1] (Unknown)
4464 Levels:   Tags ... WOMEN

But

a$Category[365]

gives me:

[1] 7 Plates   (Dessert),Western\n120,5,0.0000023804434194784,7 Plates   
(Dessert)
4464 Levels:   Tags ... WOMEN

There is something fundamental about either vectors or the read.csv command 
that I am missing here.
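
One possible cause, offered as a guess because the raw file is not shown: the value of a$Category[365] contains an embedded newline and part of the following record, which is what read.csv tends to produce when a field holds a stray double-quote character (for example an inch mark). With the default quote = "\"", everything up to the next double quote is read as part of the same field. The sketch below uses made-up lines (the inch marks and all values are assumptions) and compares the default with quote = "", which disables quote handling.

```r
# Made-up sample; the inch mark (") in the Category field is an assumption
lines <- c('DayOfYear,Quantity,Category,SubCategory',
           '1,82,(Unknown),(Unknown)',
           '2,78,(Unknown),(Unknown)',
           '3,80,(Unknown),(Unknown)',
           '4,75,(Unknown),(Unknown)',
           '119,3,7" Plates (Dessert),Western',
           '120,5,7" Plates (Dessert),Western',
           '121,4,WOMEN,Piratesses')
txt <- paste(lines, collapse = "\n")

# Default quoting: the first inch mark opens a quoted string that runs
# across the line break until the next inch mark
bad <- read.csv(text = txt, header = TRUE)

# Quote handling disabled: each line is read as its own record
good <- read.csv(text = txt, header = TRUE, quote = "")

nrow(bad)    # fewer rows than nrow(good) when records were merged
nrow(good)   # one row per data line
```

If this is what is happening in Sample.dat, rereading the file with quote = "" should bring the number of Category levels down to something closer to the expected count.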

Thank you.

Kevin

---- jim holtman <[EMAIL PROTECTED]> wrote: 
> Please provide commented, minimal, self-contained, reproducible code,
> or at least a before/after of what your data would look like.  Taking a
> guess at what you are asking, here is one way of doing it:
> 
> 
> > x <- data.frame(cat=sample(LETTERS[1:3],20,TRUE),a=1:20, b=runif(20))
> > x
>    cat  a          b
> 1    B  1 0.65472393
> 2    C  2 0.35319727
> 3    B  3 0.27026015
> 4    A  4 0.99268406
> 5    C  5 0.63349326
> 6    A  6 0.21320814
> 7    C  7 0.12937235
> 8    A  8 0.47811803
> 9    A  9 0.92407447
> 10   A 10 0.59876097
> 11   A 11 0.97617069
> 12   A 12 0.73179251
> 13   B 13 0.35672691
> 14   C 14 0.43147369
> 15   C 15 0.14821156
> 16   C 16 0.01307758
> 17   B 17 0.71556607
> 18   B 18 0.10318424
> 19   C 19 0.44628435
> 20   B 20 0.64010105
> > # create a list of the indices of the data grouped by 'cat'
> > split(seq(nrow(x)), x$cat)
> $A
> [1]  4  6  8  9 10 11 12
> 
> $B
> [1]  1  3 13 17 18 20
> 
> $C
> [1]  2  5  7 14 15 16 19
> 
> > # or do you want the data
> > split(x, x$cat)
> $A
>    cat  a         b
> 4    A  4 0.9926841
> 6    A  6 0.2132081
> 8    A  8 0.4781180
> 9    A  9 0.9240745
> 10   A 10 0.5987610
> 11   A 11 0.9761707
> 12   A 12 0.7317925
> 
> $B
>    cat  a         b
> 1    B  1 0.6547239
> 3    B  3 0.2702601
> 13   B 13 0.3567269
> 17   B 17 0.7155661
> 18   B 18 0.1031842
> 20   B 20 0.6401010
> 
> $C
>    cat  a          b
> 2    C  2 0.35319727
> 5    C  5 0.63349326
> 7    C  7 0.12937235
> 14   C 14 0.43147369
> 15   C 15 0.14821156
> 16   C 16 0.01307758
> 19   C 19 0.44628435
> 
> 
> On Sat, Jul 12, 2008 at 3:32 AM,  <[EMAIL PROTECTED]> wrote:
> > I have searched the archive and I could not find what I need so I will try to 
> > ask the question here.
> >
> > I read a table in (read.table)
> >
> > a <- read.table(.....)
> >
> > The table has column names like DayOfYear, Quantity, and Category.
> >
> > The values in the row for Category are strings (characters).
> >
> > I want to get all of the rows grouped by Category. The number of unique 
> > category names could be around 50. Say for argument's sake the number of 
> > categories is exactly 50. Can I somehow get a vector of length 50 
> > containing the rows corresponding to the category (another vector)? I 
> > realize I can access any row a[i]$Category (right?). But I want a vector 
> > containing the rows corresponding to each distinct Category name.
> >
> > Thank you.
> >
> > Kevin
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> 
> 
> -- 
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
> 
> What is the problem you are trying to solve?

