I have a computer simulation in which a virtual agent ends up in different areas of a layout based on several factors. There are 18 conditions in total. If I collapse the data points into bins, where each bin is one of the areas, the data would look like this:
x0 <- c(3,3,5,5,2) # computer simulation

Now I would like to validate this model by having human subjects go through the same conditions, but I run into two sets of issues:

1. The first issue is that the dataset is discrete and small (there may be fewer than 5 counts in a bin, which is a problem for a chi-square goodness-of-fit test); there may also be ties. After some online digging I found two options:

- a permutation test
- a Cramer-von Mises goodness-of-fit test (see this paper: https://journal.r-project.org/archive/2011/RJ-2011-016/RJ-2011-016.pdf)

I thought the Cramer-von Mises goodness-of-fit test could work, so I ran it with made-up data for *one human subject* and got the following result:

x0 <- c(3,3,5,5,2) # computer simulation
x1 <- c(4,2,5,4,3) # subject 1
library(goftest)
cvm.test(x0, ecdf(x1))

>Cramer-von Mises test of goodness-of-fit
>Null hypothesis: distribution ‘ecdf(x1)’
>data: x0
>omega2 = 0.14667, p-value = 0.4106

So far so good. But now let’s say I would like to have more than one human subject, say four of them. These are the results from the additional subjects:

x2 <- c(3,3,5,2,5) # subject 2
x3 <- c(2,2,5,6,3) # subject 3
x4 <- c(3,2,5,6,2) # subject 4

Now I run into the second set of issues:

2. On one side I have a single computer simulation; on the other side I have data from four subjects. Should I take the mean of the results for the human subjects? Would my data then still be “discrete”? Or should I run my simulation four times? But I would always get the same results, so the variance of the two datasets would differ.

Any ideas? Maybe I should change the design and have more levels for my factors, so that I have more trials and the bins get bigger?
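To make the second issue concrete, here is a minimal sketch of one way I could combine the subjects while keeping the data discrete. It rests on assumptions I am not sure about: that the subjects can be pooled by summing their counts per area (which ignores any between-subject differences), and that the simulation counts can be treated as fixed expected proportions because the simulation is deterministic. Instead of a hand-written permutation test it uses the Monte Carlo option of base R's chisq.test(), which avoids relying on the large-sample chi-square approximation when bins are small. The object names (subjects, p0, pooled, res) are only illustrative.

```r
# Counts per area (bin), copied from the made-up data above.
x0 <- c(3, 3, 5, 5, 2)            # computer simulation
subjects <- rbind(
  s1 = c(4, 2, 5, 4, 3),          # subject 1
  s2 = c(3, 3, 5, 2, 5),          # subject 2
  s3 = c(2, 2, 5, 6, 3),          # subject 3
  s4 = c(3, 2, 5, 6, 2)           # subject 4
)

# Assumption: the deterministic simulation defines the expected
# proportion of trials ending in each area.
p0 <- x0 / sum(x0)

# Assumption: subjects are exchangeable, so their counts can be summed.
# Summing (rather than averaging) keeps the data as integer counts.
pooled <- colSums(subjects)

# Goodness-of-fit test with a Monte Carlo p-value (B simulated tables),
# so small expected counts are less of a concern.
set.seed(1)                       # arbitrary seed, only for reproducibility
res <- chisq.test(pooled, p = p0, simulate.p.value = TRUE, B = 10000)
print(res)
```

If pooling is not appropriate, the same call could be run once per subject by replacing pooled with a single row such as subjects["s1", ], at the cost of much smaller counts per test.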