Hi David,

Thank you for your suggestions. I am quite the beginner at R and don’t understand how to actually implement your suggestion, so I am hoping for some further advice on that, if possible.
This is a subset of my data. Rows are host species, and columns are parasite species. Three of the parasites are generalists, but P4L is a strict specialist on FORCOL (27 individuals have this parasite).

        H17L  P25L  P41L  P4L
AUTINF    39     0     0    0
GLYSPI    16     2    15    0
FORCOL     1     0     0   27
HYLPOE     3     0     2    0
HYLNAE     1     4     2    0
MYRMYO     2     5     2    0
THAARD     0     8     0    0

This is a list of host trait values for each of the hosts:

        abundance  weight  survival
AUTINF        488    38        0.48
GLYSPI        827    14.1      0.59
FORCOL        156    44.3      0.55
HYLPOE        322    17.5      0.54
HYLNAE        309    14.5      0.73
MYRMYO        475    20.8      0.59
THAARD        429    18.4      0.67

And this is an estimate of host specificity of the parasites, incorporating prevalence and phylogeny:

      Specificity
H17L         2.08
P25L         1.72
P41L         2.19
P4L          0

I want to determine whether specificity of the parasites relates to any of the host traits. For this, I would like to do a multiple regression. To avoid pseudoreplication, I want to include each host species only once in the matrix. So, for H17L, I could pick any of the hosts (except THAARD), etc., but once a host is picked for one parasite, it cannot be picked for another. For example, if I pick GLYSPI for H17L, GLYSPI has to be removed as a choice for P25L and P41L. Thus, I also have to randomize which parasite has its host picked first.

In all cases, I want to lock FORCOL and P4L together, so FORCOL will no longer be an option for H17L. I'm still uncertain about this last part: I might instead just randomly pick hosts for all parasites and accept the risk of losing the strict host species specialists from some matrices.
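For reference, the selection rule described above could be sketched in R roughly as follows. This is only a sketch under assumptions that are not settled in this post: every host on which a parasite was found is treated as equally likely to be picked (prevalence is ignored), P4L is always locked to FORCOL, and the names `assoc`, `traits`, `specificity`, `pick_hosts` and `make_table` are placeholders invented for illustration. Picking hosts one parasite at a time in random order also does not guarantee that every valid combination is equally likely.

```r
# Host-by-parasite counts, copied from the subset above
assoc <- matrix(
  c(39, 0,  0,  0,
    16, 2, 15,  0,
     1, 0,  0, 27,
     3, 0,  2,  0,
     1, 4,  2,  0,
     2, 5,  2,  0,
     0, 8,  0,  0),
  nrow = 7, byrow = TRUE,
  dimnames = list(
    c("AUTINF", "GLYSPI", "FORCOL", "HYLPOE", "HYLNAE", "MYRMYO", "THAARD"),
    c("H17L", "P25L", "P41L", "P4L")
  )
)

# Host traits, one row per host, with host names as row names
traits <- data.frame(
  abundance = c(488, 827, 156, 322, 309, 475, 429),
  weight    = c(38, 14.1, 44.3, 17.5, 14.5, 20.8, 18.4),
  survival  = c(0.48, 0.59, 0.55, 0.54, 0.73, 0.59, 0.67),
  row.names = rownames(assoc)
)

# Parasite specificity, named by parasite
specificity <- c(H17L = 2.08, P25L = 1.72, P41L = 2.19, P4L = 0)

# Hypothetical helper: draw one host per parasite without reusing a host.
# `locked` holds parasite = host pairs that are fixed in advance (assumption).
pick_hosts <- function(assoc, locked = c(P4L = "FORCOL")) {
  picks <- locked
  free <- setdiff(colnames(assoc), names(locked))
  for (p in sample(free)) {  # parasites are processed in random order
    # hosts on which parasite p occurs and that have not been used yet
    candidates <- setdiff(rownames(assoc)[assoc[, p] > 0], picks)
    if (length(candidates) == 0) return(NULL)  # dead end, caller should redraw
    picks[p] <- candidates[sample.int(length(candidates), 1)]
  }
  picks[colnames(assoc)]  # return in the original parasite order
}

# Hypothetical helper: link one draw to the specificity and trait values
make_table <- function(picks, traits, specificity) {
  data.frame(
    parasite    = names(picks),
    host        = unname(picks),
    Specificity = unname(specificity[names(picks)]),
    traits[unname(picks), ],
    row.names   = NULL
  )
}

set.seed(1)  # only so that the draws can be reproduced
draws  <- replicate(1000, pick_hosts(assoc), simplify = FALSE)
tables <- lapply(draws, make_table, traits = traits, specificity = specificity)

# One regression per table; a single predictor is used here for illustration
slopes <- vapply(
  tables,
  function(d) coef(lm(Specificity ~ weight, data = d))[["weight"]],
  numeric(1)
)
summary(slopes)
```

Each element of `tables` has one row per parasite, holding its specificity and the trait values of the host drawn for it, so any model-fitting function that takes a data frame can be applied to it with `lapply`. With only the four parasites in this subset, a model containing all three traits plus an intercept would leave no residual degrees of freedom, which is why the example fits one predictor only.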
If I make 2 random selections I might end up with:

      Random1  Random2
H17L  AUTINF   GLYSPI
P25L  GLYSPI   HYLNAE
P41L  HYLPOE   MYRMYO
P4L   FORCOL   FORCOL

For the first random table I would then do a multiple regression with specificity as the dependent variable and the host trait values as the independent variables:

Specificity  abundance  weight  survival
       2.08        488    38        0.48
       1.72        827    14.1      0.59
       2.19        322    17.5      0.54
       0           156    44.3      0.55

If I generate 1000 randomly selected host-parasite combinations, I would have 1000 such tables, on which I would have to run 1000 independent regressions. Since I’m using model selection and multimodel inference to estimate parameter values, I will end up doing the model selection 1000 times.

Your second suggestion makes the most sense to me, but I don’t understand how to implement it. Would you (or someone else) please give me some advice on that? Also, once I have the 1000 random host-parasite matrices, how do I link these to the tables of actual values (host traits and parasite specificity)?

Thanks so much!
Maria

--
View this message in context: http://r.789695.n4.nabble.com/Random-resampling-of-columns-in-species-association-matrices-tp4620618p4623563.html
Sent from the R help mailing list archive at Nabble.com.