On Tue, Jun 17, 2008 at 9:28 AM, Tom Backer Johnsen <[EMAIL PROTECTED]> wrote:
> In a research project we are using a web-based tool for collecting data
> from questionnaires.  The system generates files that are simple to read as a
> data frame in the "long" format, which is simple to convert to the "wide"
> format.
>
> Two things might go wrong: (a) there are two (or more) references to
> the same cell, and (b) there are missing values.  For example, this data set
> has two references to S2/T2 and none to the S2/T1 combination:
>
>> d
>     values person time
>  1       1     S1   T1
>  2       2     S1   T2
>  3       3     S1   T3
>  4       4     S1   T4
>  5      22     S2   T2
>  6       6     S2   T2
>  7       7     S2   T3
>  8       8     S2   T4
>  9       9     S3   T1
>  10     10     S3   T2
>  11     11     S3   T3
>  12     12     S3   T4
> reshape (d, idvar="person", v.names=c("values"), timevar="time",
> direction="wide")
>   person values.T1 values.T2 values.T3 values.T4
>  1     S1         1         2         3         4
>  5     S2        NA        22         7         8
>  9     S3         9        10        11        12
>
> The missing cell gets an NA as expected.  But the surprise is in the case
> where there are two references to the same cell.  The *first* is used
> (22 rather than 6).

You might try using the reshape package instead:

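library(reshape)
# Aggregation function: keep the last of any duplicated values
last <- function(x) x[length(x)]
# Rename the measured column to "value", the name cast() looks for
names(d) <- c("value", "person", "time")
cast(d, person ~ time, last)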

You can find out more at http://had.co.nz/reshape
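
If you would rather stay in base R, here is a minimal sketch of the same
idea. It assumes that "last" means the row appearing latest in d; the names
key, superseded, d_last and wide are only illustrative. It first counts the
rows per cell to expose duplicates and gaps, then drops every row that is
superseded by a later one before calling reshape() as in the original post.

```r
# Data from the original post: S2/T2 appears twice, S2/T1 is absent
d <- data.frame(
  values = c(1, 2, 3, 4, 22, 6, 7, 8, 9, 10, 11, 12),
  person = rep(c("S1", "S2", "S3"), each = 4),
  time   = c("T1", "T2", "T3", "T4", "T2", "T2", "T3", "T4",
             "T1", "T2", "T3", "T4")
)

# Count rows per cell: 0 marks a missing cell, 2 or more a duplicated one
counts <- table(d$person, d$time)

# Flag every row whose person/time pair occurs again later in the data
key <- d[c("person", "time")]
superseded <- duplicated(key, fromLast = TRUE)

# Keep only the last row per cell, then reshape as in the original post
d_last <- d[!superseded, ]
wide <- reshape(d_last, idvar = "person", v.names = "values",
                timevar = "time", direction = "wide")
wide
```

With the data above, values.T2 for S2 comes out as 6 rather than 22, and
values.T1 for S2 is still NA.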

Hadley


-- 
http://had.co.nz/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
