On Mon, 2007-10-08 at 15:35 +0200, Birgit Lemcke wrote:
> Hello James,
>
> all of your suggestions work very well except for this:
>
> FemMal <- cbind(FemV1gezählt[2,], MalV1gezählt[2,])
>
> colnames(FemMal) <- ("Females", "Males")
> Error: syntax error

The OP missed out c() above, hence the syntax error.

> colnames(FemMal) <- c("Females", "Males")
> FemMal
  Females Males
1     133    79
2     203   237
3      51    76

> But it works if I do that:
>
> Namen<-c("Female","Male")
> colnames(FemMal) <- (Namen)
                      ^     ^
This is a bit redundant, unless you actually need Namen for something
else. You also don't need the "(" ")" around Namen in the second line
there.

G

> FemMal
>
>   Female Male
> 1    133   79
> 2    203  237
> 3     51   76
>
> Greetings
>
> Birgit
>
>
>
> On 04.10.2007 at 17:19, James Reilly wrote:
>
> >
> > Hi Birgit,
> >
> > First, can I suggest that you don't copy off-list conversations to
> > the mailing list partway through? Not that I minded in this case,
> > but it probably confuses people and the posting guide warns against
> > it.
> >
> > I'll address your questions in reverse order.
> >
> > To get tables for each column, try:
> > apply(FemV1Test, 2, table)
> >
> > Likewise for males:
> > apply(MalV1, 2, table)
> >
> > To compare them, perhaps put them side by side:
> > FemMal <- cbind(apply(FemV1Test, 2, table)[2,],
> >                 apply(MalV1, 2, table)[2,])
> > colnames(FemMal) <- ("Females", "Males")
> > FemMal
> >
> > You can then do arithmetic, plot them, sort by the difference, etc.
> > plot(FemMal)
> > FemMal[order(FemMal[,1]-FemMal[,2]),]
> >
> > About crossprod, cell (i,j) in the resulting matrix shows the
> > number of cases with a 1 for attribute i and attribute j. This
> > shows which attributes overlap most and least.
> >
> > The command "tab <- tab - diag(diag(tab))" puts zeroes down the
> > diagonal, as was requested. One cosmetic reason for doing this is
> > that the diagonal elements are often much larger than the
> > off-diagonal ones, and zeroing them makes the table easier to read
> > or display graphically. E.g.
> > http://pbil.univ-lyon1.fr/ADE-4/ade4-html/table.dist.html
> >
> > Yes, any row with all NAs will make the crossprod all NAs too. You
> > can ignore any rows with NAs as follows:
> > CrossFemMal1_3 <- crossprod(as.matrix(
> >   CrossFemMalVar1_3[apply(CrossFemMalVar1_3, 1,
> >                           function(x) !any(is.na(x))), ]))
> >
> > I'm not sure if I follow why you want to know about statistical
> > significance here. Do you really think of the species in your study
> > as a sample from a larger population of plant species, which you
> > are trying to generalise about?
> >
> > If so, is the population much larger than your sample? And was your
> > sample of species selected randomly, i.e. with equal selection
> > probabilities? If not, standard tests probably won't apply.
> >
> > Regards,
> > James
> >
> >
> > On 2/10/07 2:44 AM, Birgit Lemcke wrote:
> >> Hello James,
> >> first I have to thank you for your help but there are some things
> >> I don't understand now.
> >> I am not sure if I understand what this example gives me back:
> >> ratings <- data.frame(id = c(1,2,3,4), att1 = c(1,1,0,1),
> >>                       att2 = c(1,0,0,1), att3 = c(0,1,1,1))
> >> ratings
> >>   id att1 att2 att3
> >> 1  1    1    1    0
> >> 2  2    1    0    1
> >> 3  3    0    0    1
> >> 4  4    1    1    1
> >> tab <- crossprod(as.matrix(ratings[,-1]))
> >> tab <- tab - diag(diag(tab))
> >> tab
> >>      att1 att2 att3
> >> att1    0    2    2
> >> att2    2    0    1
> >> att3    2    1    0
> >> As I understood, it gives me the number of times we find the same
> >> value, for example comparing att1 and att2 for all ids. Is that
> >> right?
> >> What is this line doing: tab <- tab - diag(diag(tab))
> >> And what does the original output of crossprod mean:
> >>      att1 att2 att3
> >> att1    3    2    2
> >> att2    2    2    1
> >> att3    2    1    3
> >> I tried to do this with a part of my dataset.
> >> I used a table with 3 variables (only binary).
> >> In the first part of the table I have the females (348 rows) and
> >> in the second part the males (also 348 rows).
> >> Then I tried this:
> >> CrossFemMal1_3<-crossprod(as.matrix(CrossFemMalVar1_3))
> >> The output:
> >> CrossFemMal1_3
> >>    V1 V2 V3
> >> V1 NA NA NA
> >> V2 NA NA NA
> >> V3 NA NA NA
> >> There was one row of NAs in my dataset. I presume this is
> >> responsible for the NA results? So how can I deal with NAs here?
> >> If I use two matrices (male and female) I get back, amongst others,
> >> the comparison of att1 male to att1 female. In the case that I use
> >> the possibility of a percentage table output I get for example
> >> 40%. Can I say then that if the percentage is lower than 50% the
> >> attributes are significantly different?
> >> Corresponding to your other suggestion:
> >> sapply(c("1","2","3"),
> >>        function(x) ifelse(regexpr(x, FemV1) > 0, 1, 0))
> >> It gives me this output
> >>      1 2 3
> >> [1,] 1 0 0
> >> [2,] 1 0 0
> >> [3,] 1 0 0
> >> [4,] 1 0 0
> >> [5,] 1 0 0
> >> [6,] 1 0 0
> >> [7,] 1 0 0
> >> [8,] 1 0 0
> >> [9,] 0 1 0
> >>  .   . . .
> >>  .   . . .
> >> I think now I should count the ones for 1, 2 and 3?
> >> I tried to use table but it gives me only the counts for 1 and zero:
> >> table(FemV1Test)
> >> FemV1Test
> >>   0   1
> >> 657 387
> >> How can I specify that it gives me the counts for every column?
> >> And then do the same for MalV1 and compare both somehow?
> >> Thanks again in advance for your help.
> >> Greetings Birgit
> >> On 29.09.2007 at 14:45, James Reilly wrote:
> >>>
> >>> Hi Birgit,
> >>>
> >>> The first argument to regexpr should be just one character value,
> >>> not a vector. Your call:
> >>> regexpr(c("1","2","3"),FemV1)
> >>> seems to have been interpreted as:
> >>> regexpr("1",FemV1)
> >>>
> >>> I think you probably need something more like:
> >>> sapply(c("1","2","3"),
> >>>        function(x) ifelse(regexpr(x, FemV1) > 0, 1, 0))
> >>> This will also work on multiple response data such as
> >>> FemV1 <- c("13", "2", "13", "123", "1", "23")
> >>> Then colSums will give you frequency counts for each attribute.
> >>>
> >>> I think you would need to greatly simplify the multiple response
> >>> data to apply anything like a paired t-test. Have you considered
> >>> just crosstabulating the attributes of male plants versus the
> >>> females? For some R code, see
> >>> https://stat.ethz.ch/pipermail/r-help/2007-February/126125.html
> >>>
> >>> Regards,
> >>> James
> >>>
> >>>
> >>> On 29/9/07 3:37 AM, Birgit Lemcke wrote:
> >>>> Hello James,
> >>>> sorry that I have to ask you a second time but I don't
> >>>> understand what regexpr() is doing and how the syntax works.
> >>>> I have a vector that I converted to character strings
> >>>> as.character(FalV1)
> >>>> [1] "1" "1" "1" "1" "1" "1" "1" "1" "2"
> >>>> after that I did this, but without knowing what I am really doing
> >>>> regexpr(c("1","2","3"),FemV1)
> >>>> The output looked like this
> >>>> [1] 1 1 1 1 1 1 1 1 -1
> >>>> As I understood it, the function looks in this case for 1, 2 or 3.
> >>>> If there is a match it gives me back 1, if not it gives me
> >>>> back -1.
> >>>> But I don't know how this helps me now, so I hope you will
> >>>> explain it to me.
> >>>> And there is another problem I have. For the continuous variables
> >>>> I used a paired t-test; can I perform this approach also paired?
> >>>> Thanks a lot in advance.
> >>>> Greetings
> >>>> Birgit
> >>>> On 21.09.2007 at 11:38, James Reilly wrote:
> >>>>>
> >>>>> If I understand you right, you have several multiple response
> >>>>> variables (with the responses encoded in numeric strings) and
> >>>>> you want to see whether these are associated with sex. To
> >>>>> tabulate the data, I would convert your variables into
> >>>>> collections of dummy variables using regexpr(), then use
> >>>>> table(). You can use a modified chi-squared test with a Rao-Scott
> >>>>> correction on the resulting tables; see Thomas and Decady
> >>>>> (2004). Bootstrapping is another possible approach.
> >>>>>
> >>>>> @article{,
> >>>>> Author = {Thomas, D. Roland and Decady, Yves J.},
> >>>>> Journal = {International Journal of Testing},
> >>>>> Number = {1},
> >>>>> Pages = {43 - 59},
> >>>>> Title = {Testing for Association Using Multiple Response Survey
> >>>>> Data: Approximate Procedures Based on the Rao-Scott Approach.},
> >>>>> Volume = {4},
> >>>>> Year = {2004},
> >>>>> Url = {http://search.ebscohost.com/login.aspx?direct=true&db=pbh&AN=13663214&site=ehost-live}
> >>>>> }
> >>>>>
> >>>>> Hope this helps,
> >>>>> James
> >>>>> --
> >>>>> James Reilly
> >>>>> Department of Statistics, University of Auckland
> >>>>> Private Bag 92019, Auckland, New Zealand
> >>>>>
> >>>>> On 21/9/07 7:14 AM, Birgit Lemcke wrote:
> >>>>>> First thanks for your answer.
> >>>>>> Now I try to explain better:
> >>>>>> I have species in the rows and morphological attributes in
> >>>>>> the columns coded by numbers (qualitative variables; nominal
> >>>>>> and ordinal).
> >>>>>> In one table for the male plants of every species and in the
> >>>>>> other table for the female plants of every species. The
> >>>>>> variables contain every possible occurrence in this species
> >>>>>> and this gender.
> >>>>>> I would like to compare every variable between male and female
> >>>>>> plants for example using a ChiSquare Test.
> >>>>>> The Null-hypothesis could be: Variable male is equal to
> >>>>>> variable Female.
> >>>>>> The question behind all is, if male and female plants in this
> >>>>>> species are significantly different and which attributes are
> >>>>>> responsible for this difference.
> >>>>>> I really hope that this is better understandable. If not
> >>>>>> please ask.
> >>>>>> Thanks a million in advance.
> >>>>>> Greetings
> >>>>>> Birgit
> >>>>>
> >>>> Birgit Lemcke
> >>>> Institut für Systematische Botanik
> >>>> Zollikerstrasse 107
> >>>> CH-8008 Zürich
> >>>> Switzerland
> >>>> Ph: +41 (0)44 634 8351
> >>>> [EMAIL PROTECTED]
> >>
> >
> > --
> > James Reilly
> > Department of Statistics, University of Auckland
> > Private Bag 92019, Auckland, New Zealand
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson                 [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
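
For anyone reading this thread in the archive, the steps discussed in it
can be tried end to end with the small sketch below. It is only an
illustration under stated assumptions: it uses the toy `ratings` data,
the `FemV1` example vector and the three female/male counts quoted in the
thread, not the real plant data. The appended NA row and the object names
`m`, `m_na`, `tab0`, `tab_clean`, `dummies`, `counts` and `sorted` are
made up for the example, and complete.cases() is used in place of the
apply() call from the thread (it should select the same rows).

```r
# --- 1. colnames() needs a character vector built with c() ---------------
# Counts as reported in the thread (rows are the categories 1, 2, 3).
FemMal <- cbind(c(133, 203, 51), c(79, 237, 76))
colnames(FemMal) <- c("Females", "Males")   # c() was the missing piece

# Sort the rows by the female-minus-male difference, as suggested.
sorted <- FemMal[order(FemMal[, 1] - FemMal[, 2]), ]

# --- 2. crossprod() on binary attributes ---------------------------------
ratings <- data.frame(id   = c(1, 2, 3, 4),
                      att1 = c(1, 1, 0, 1),
                      att2 = c(1, 0, 0, 1),
                      att3 = c(0, 1, 1, 1))
m <- as.matrix(ratings[, -1])

# Cell (i, j) counts the rows with a 1 in both attribute i and attribute j;
# the diagonal holds the total count of each attribute.
tab <- crossprod(m)

# Zero the diagonal so that only overlaps between attributes remain.
tab0 <- tab - diag(diag(tab))

# --- 3. Rows containing NA ------------------------------------------------
# Hypothetical all-NA row, added only to reproduce the problem.
m_na <- rbind(m, c(NA, NA, NA))
all_na <- all(is.na(crossprod(m_na)))       # every cell becomes NA

# Drop incomplete rows before taking the cross product.
tab_clean <- crossprod(m_na[complete.cases(m_na), , drop = FALSE])

# --- 4. Multiple-response strings to dummy variables ----------------------
FemV1 <- c("13", "2", "13", "123", "1", "23")
dummies <- sapply(c("1", "2", "3"),
                  function(x) ifelse(regexpr(x, FemV1) > 0, 1, 0))
counts <- colSums(dummies)                  # frequency of each attribute
```

Printing `sorted`, `tab0` and `counts` shows the results. Note that the
regexpr() approach only works while each attribute code is a single
character; codes such as "10" would need a different encoding.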