Dear all,

I’m always amazed at how these discussions balloon. Although I don’t think it has been intentional, there’s the potential for a juxtaposition where one should not really exist. I might not be reading the room correctly, but there seems to be a tendency to view code-based software as mistake-laden, because its flexibility can induce spurious results, and canned software as tried and true, with limitations that prevent misadventures in data analysis. Essentially, there is only code-based software; “canned” software is a constrained presentation of programming behind a user interface (UI) that relieves the user of coding. For example, as Andrés pointed out, gmShiny is canned software that spares users the amount of coding required to use geomorph in R. But it is still geomorph in R. (One benevolence of gmShiny, like other good canned software packages, is that one can export the code that was used “under the hood” to perform the canned analysis, so one could better develop the coding skills needed to reproduce results found in the canned software.) Comparatively, it is a more limited presentation of geomorph, simply because some analytical procedures are difficult to can, which I think supports Murat’s original point that it is better to embrace the challenge of learning the coding.

I also empathize with all parties regarding validation, and agree with James’ comment about the responsibility of programmers to be validators. I think there are two types of validation. One is statistical validation, i.e., evaluation of statistical properties like type I error rates and statistical power. This should be a requisite component of any development of a statistical method, whether offered via code-based or canned versions of software. I empathize with others’ concerns about whether this has been performed, as publishing R packages on CRAN does not require it. There are many publications of new statistical methods without statistical validation, and I can attest, as someone who often reviews manuscripts for new methods, that many authors do not recognize statistical validation as fundamental.

The other type of validation is computational validation. This one is tricky because weird results can elude programmers and, unfortunately, might not cause errors. Andrea remarked on the many occasions on which he has found bugs. I feel he was being kind to me by not revealing one recent bug he helped to expose in our software. (This was related to QR decomposition in R producing computational 0s, that is, numbers that should be 0 but are computationally small numeric values, like 1 x 10^-17, which caused issues with phantom linear model parameters.) Although good programmers will try to anticipate all possible issues with their coding, it is nearly impossible to do so. Even a reading of the NEWS file with any R update will reveal the many missteps of the programmers of the base software, who are certainly more numerous and experienced than most package developers. Computational validation is certainly important, and programmers should be willing to assume this responsibility. (Unfortunately, many do not. It is easy to make software and never update it.) Again, this is not unique to code-based software packages. Canned software packages require computational validation as well.
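
To make the "computational 0" idea concrete, here is a minimal Python sketch. It is not the R code in question and not how any particular package handles the problem; the function name `count_nonzero`, the example values in `diag`, and the tolerance of 1e-7 are all assumptions chosen for illustration. It only shows why testing a floating-point value for exact equality with 0 can report a parameter that is not really there, and how a relative tolerance avoids that.

```python
def count_nonzero(values, tol=0.0):
    """Count entries whose magnitude exceeds tol times the largest magnitude."""
    scale = max(abs(v) for v in values) or 1.0  # fall back to 1.0 if all entries are 0
    return sum(1 for v in values if abs(v) > tol * scale)


# Hypothetical diagonal of R from a QR decomposition of a design matrix
# with one redundant column. The last entry "should" be exactly 0, but
# floating-point arithmetic leaves a tiny nonzero value instead.
diag = [2.0, 1.5, 0.1 + 0.2 - 0.3]

naive_count = count_nonzero(diag)             # exact comparison against 0
robust_count = count_nonzero(diag, tol=1e-7)  # relative tolerance (arbitrary choice)

print(diag[2] == 0.0)  # False: a computational 0, not a true 0
print(naive_count)     # 3: includes one phantom parameter
print(robust_count)    # 2
```

Here the exact comparison counts three parameters where only two are real, and nothing raises an error along the way, which is what makes this kind of bug easy to miss.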

And if canned software packages offer only very limited options to avoid validation concerns, this is worse, in my opinion, than potentially introducing bugs in more comprehensive software.

One comment made originally by Murat was important and should not be overlooked: “There are tons of resources and a very supportive community to help you with your challenges.” My recommendation for software choice is to use the options that provide the most flexibility, have obvious statistical and computational validation, and are curated by people who are responsive, helpful, and proactive in improving their software. I might be biased in my perception, but I don’t think there is a shortage of these attributes in the R packages developed within the morphometric community. I would not worry about validation concerns, at least compared to the canned software options.

Best,
Mike

> On Jan 7, 2025, at 9:43 AM, 'Adams, Dean [EEOB]' via Morphmet
> <[email protected]> wrote:
>
> The main issue with canned software is its inflexibility. (Actually this
> could be an issue with either canned software or R packages, but one can
> modify R code.) What it does is force a statistical philosophy on its
> users whether or not that implementation is correct for the particular
> application the user wishes to investigate. That is undesirable. Statistical
> analytics are not one-size-fits-all; they must be adapted to the
> particular hypothesis. Further, having easy-to-use software results in
> biologists not thinking about their analytics to the extent that they should.
> My perspective is that we should all spend as much time learning our
> analytical approaches as we do learning the ‘biology’ that underlies our
> hypotheses and our hard-earned data. Both are critical to sound quantitative
> biology.
>
> Furthermore, as Andrea alluded, canned software is coded by people capable of
> introducing bugs and is no different than R in this regard.
> If one relies on the programmer to provide reliable results, there is no
> difference between canned software and the R environment. However, because
> open source options allow better scrutiny by the user, solutions to
> programming bugs are easier to elucidate in code-based packages. The
> exception to this might be if one pays a lot of money for software that
> comes with a large support department.
>
> To the questions stemming from Marianna’s post:
>
> Andrea, the default in base R and most R packages is type I SS. However,
> geomorph and RRPP allow the user to choose type I, type II or type III SS
> depending on their specific circumstance. That is more flexible and more
> appropriate (not to be confused with model I and model II anova/regression,
> which is a different thing).
>
> Marianna, Procrustes ANOVA in geomorph allows type I, II, or III SS, uses
> correct permutation procedures, and allows one to select the residual SS
> against which terms are evaluated. The default in R and most other software
> is testing terms against the model residual error. For nested factors and
> random effects this is not correct, so one can select the correct term based
> on one’s specific design.
>
> Regarding Murat’s original response, it is worth reiterating: embracing
> coding can only expand one’s professional development and enhance creative
> research opportunities. Avoiding it, especially because of potential
> pitfalls caused by mistakes made by people brave enough to offer software to
> others, will not provide safety or inspire innovation.
>
> Dean
>
> Dr. Dean C. Adams
> Distinguished Professor of Evolutionary Biology
> Department of Ecology, Evolution, and Organismal Biology
> Iowa State University
> https://faculty.sites.iastate.edu/dcadams/
> phone: 515-294-3834
>
> From: [email protected] <[email protected]> On Behalf Of
> alcardini
> Sent: Tuesday, January 7, 2025 12:56 AM
> To: Ann Ross <[email protected]>
> Cc: Morphmet <[email protected]>
> Subject: Re: [MORPHMET2] MorphoJ ProcrustesANOVA
>
> Dear All,
> R has many advantages, but I am very sympathetic to Ann's point: in theory
> we can and should check the software and, if it is open source, one has the
> code; in practice most of us users don't do it because, if we had those
> advanced skills, we would probably program the functions ourselves in the
> first place.
> I tend to double-check results with at least two independent programs
> whenever possible. That has limitations and cannot, of course, exclude errors
> in both. Over the years, I've found bugs (including in R) in almost all the
> programs I have used. Often they were minor ones, but sometimes they were
> serious. In a small field, I do wonder how many independent validations are
> done, and that means of both the theory and the code behind functions in any
> type of software. As I wrote, with open source software in theory one can
> check everything, but even peer-reviewed functions are not always checked by
> reviewers.
> I suspect there is literature on this type of issue in science, but I haven't
> had time to search.
>
> For the Procrustes ANOVA, if Marianna is looking for an alternative to
> MorphoJ with the same identical design, that is difficult in my clearly
> limited experience.
> PAST has a permutational ANOVA but at least the version I use requires a
> perfectly balanced design and cannot replicate, for instance, the
> symmetry/asymmetry analysis (if that's Marianna's case).
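
A note on the sums-of-squares point raised in this thread: the sketch below illustrates, in plain Python, why type I (sequential) SS depends on the order of terms when a design is unbalanced and does not when it is balanced. It is not code from geomorph, RRPP, MorphoJ, or PAST. The helper names `residual_ss` and `sequential_ss`, the 0/1 coding of two-level factors, and the tiny data sets are all hypothetical and chosen only for illustration.

```python
def residual_ss(y, columns):
    """Residual sum of squares of y after least-squares fitting on the columns."""
    resid = list(y)
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:  # remove what earlier columns already explain
            coef = sum(a * b for a, b in zip(v, q))
            v = [a - coef * b for a, b in zip(v, q)]
        norm = sum(a * a for a in v) ** 0.5
        if norm < 1e-12:  # redundant column: contributes nothing new
            continue
        q = [a / norm for a in v]
        basis.append(q)
        coef = sum(a * b for a, b in zip(resid, q))
        resid = [a - coef * b for a, b in zip(resid, q)]
    return sum(r * r for r in resid)


def sequential_ss(y, terms):
    """Type I SS: the drop in residual SS as each term is added, in order."""
    cols = [[1.0] * len(y)]  # intercept
    previous = residual_ss(y, cols)
    out = {}
    for name, col in terms:
        cols.append(col)
        current = residual_ss(y, cols)
        out[name] = previous - current
        previous = current
    return out


# Unbalanced design: factor levels A and B are correlated across observations.
y_u = [1.0, 2.0, 4.0, 3.0, 6.0, 7.0]
A_u = [0, 0, 0, 1, 1, 1]
B_u = [0, 0, 1, 0, 1, 1]
ss_A_first = sequential_ss(y_u, [("A", A_u), ("B", B_u)])["A"]
ss_A_second = sequential_ss(y_u, [("B", B_u), ("A", A_u)])["A"]
print(round(ss_A_first, 3), round(ss_A_second, 3))  # the two values differ

# Balanced design: every A-by-B combination occurs equally often.
y_b = [1.0, 2.0, 4.0, 7.0]
A_b = [0, 0, 1, 1]
B_b = [0, 1, 0, 1]
bal_first = sequential_ss(y_b, [("A", A_b), ("B", B_b)])["A"]
bal_second = sequential_ss(y_b, [("B", B_b), ("A", A_b)])["A"]
print(round(bal_first, 3), round(bal_second, 3))  # the two values agree
```

In the unbalanced case the SS attributed to A changes depending on whether A is entered before or after B, so two programs that use different SS types (or a different term order) can report different results from the same data.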
> R packages have functions for Procrustes ANOVA and the like, but they (again
> in my limited experience as a basic R user) use type II sum of squares,
> unlike MorphoJ which, I believe, uses type I. To avoid misunderstandings on
> this, I am not advocating one or the other type of SSQ but simply stating
> that different programs may use different types and that usually matters
> unless the design is perfectly balanced.
>
> R is a great tool and has been a revolution. However, as Ann pointed out, in
> a context such as forensics, where errors can have dramatic consequences,
> most users, like me, probably have to trust the statisticians and coders. I
> may be wrong and be surprised to discover that the vast majority of R users
> can and do carefully check the functions (and theory) they use. To avoid
> unnecessary discussions on an obvious point, I completely agree that, at the
> very least, R provides the option, unlike proprietary software. I admire
> those who use it routinely and wish I could do the same. However, as a
> biologist without expertise in numerical methods, I find most of the
> underlying theory and code well beyond my understanding.
>
> All the best
>
> Andrea
>
> On Mon, 6 Jan 2025 at 04:47, 'Ann Ross' via Morphmet
> <[email protected]> wrote:
> Dear Murat,
> Exactly the point. Dealing with black-box and unvalidated code, and
> determining whether said code is doing what it’s supposed to do (such as
> permutations), is an issue. So many untested and unvalidated GUIs exist that
> do not meet forensic practice standards. I guess that is perhaps fine in
> general research. And it should not fall on the user. However, in forensic
> practice validation and testing are required. I will respectfully disagree.
> A
> Ann H. Ross, Ph.D., D-ABFA
>
> On Jan 5, 2025, at 10:04 PM, Murat Maga <[email protected]> wrote:
>
> Dear Ann,
> Not sure what you mean by not validated?
> What is the validation of a t-test function in R or any other software
> library? Validation is the responsibility of the person using the tools, not
> the developers. Open-source tools make this validation far simpler, since
> if you have any concern you can look under the hood. Evaluate the code line
> by line, and then if you find an issue you can easily take it up with the
> developers.
>
> In closed-source software this is nearly impossible. I would argue this is a
> far superior method of "validation" than appealing to authority.
>
> On Sunday, January 5, 2025 at 6:46:15 PM UTC-8 Ann Ross wrote:
> Hi All,
> The one thing to keep in mind is that all this coding is important but not
> validated. One needs to trust the results, and a lack of validation leaves a
> lot of questions. This is from a forensic perspective.
> Ann
> Ann H. Ross, Ph.D., D-ABFA
>
> On Jan 5, 2025, at 7:48 PM, 'Adams, Dean [EEOB]' via Morphmet
> <[email protected]> wrote:
>
> Marianna,
>
> I completely understand that R and other coding approaches can seem daunting.
> But I agree 100% with Murat, and encourage you and others to steer into the
> wind! Coding is empowering!
>
> First, a bit of R coding allows you to improve your data manipulation and
> curation. This helps with scientific repeatability, compliance with journal
> requirements (which increasingly require one to submit curated data and
> scripts to a public repository), and basically enhances and encourages
> open-source science.
>
> But more importantly, coding empowers you as a scientist. By moving to R, you
> remove yourself from the restraints that exist with canned software, whose
> options are necessarily limited by the buttons and options that the user has
> available to point-and-click.
> In essence, while prepackaged software is easy to use, it limits thinking
> and creativity by restricting one’s analysis to those options that the
> software happens to have.
>
> The unfortunate outcome of such canned software is that our GM literature
> becomes filled with many studies of similar analyses: not because those
> biological topics are inherently interesting per se, but because that is what
> the software happens to allow. This is the analytical version of what one of
> my mentors (Larry Slobodkin) once called ‘intellectual painting by numbers.’
> I seriously hope that our field can do better now in the 21st century.
>
> My message: learning a bit of code breaks this cycle, and frees one to
> investigate the questions that one actually wishes to explore, not just those
> for which canned software has already provided. I strongly encourage you (and
> others) to learn a bit of R, Python, or some other language so that your
> creative science is not restricted!
>
> Best of luck in your journey.
>
> Dean
>
> Dr. Dean C. Adams
> Distinguished Professor of Evolutionary Biology
> Department of Ecology, Evolution, and Organismal Biology
> Iowa State University
> https://faculty.sites.iastate.edu/dcadams/
> phone: 515-294-3834
>
> From: [email protected] <[email protected]> On Behalf Of
> Murat Maga
> Sent: Sunday, January 5, 2025 12:29 PM
> To: Morphmet <[email protected]>
> Subject: Re: [MORPHMET2] MorphoJ ProcrustesANOVA
>
> Dear Marianna,
>
> A quick comment: Instead of trying to work this out with PAST, I highly
> encourage you to spend that time trying to do the same in R using geomorph or
> other shape analysis libraries. Yes, it will probably take longer; yes, it
> will be a somewhat bumpy road initially. But you will be much better set for
> the next challenge. Graphical user interface applications are good up to a
> point (for common tasks).
> And I am telling you this as someone who is developing UI-based
> morphometrics analysis.
>
> Most often in biology, you will have to customize your analysis to the
> specific question you are trying to answer. This is best done via scripting
> in a flexible programming environment (whether that's R or Python or some
> other language is irrelevant). There are tons of resources and a very
> supportive community to help you with your challenges. Going forward, all
> fields of biology will be more computational, not less, and the sooner you
> start warming up to the idea, the better it will be for your career.
>
> Your future self will thank you for that decision.
>
> M
>
> On Sunday, January 5, 2025 at 5:05:47 AM UTC-8 [email protected] wrote:
> Thanks everyone... I suspected this was the case but wanted to be sure I
> wasn't missing anything. I'm going to look into using PAST first, since I'm
> not the most comfortable in R.
>
> On Saturday, January 4, 2025 at 5:34:57 AM UTC-8 Adams, Dean [EEOB] wrote:
> In R use geomorph.
>
> Dean
>
> From: [email protected] <[email protected]> on behalf of
> alcardini <[email protected]>
> Sent: Friday, January 3, 2025 11:56:13 PM
> To: [email protected] <[email protected]>; morphmet2 <[email protected]>
> Subject: Re: [MORPHMET2] MorphoJ ProcrustesANOVA
>
> Dear Marianna,
> the tests are parametric in MorphoJ as far as I know.
> One can do the permutations in R. If you also have a main factor, however, it
> is a bit convoluted to design it (or at least I found only a convoluted way
> of doing it).
> You'll find a description in a footnote in Table 1 of
> https://europeanjournaloftaxonomy.eu/index.php/ejt/article/view/2527
> If you're analyzing symmetry/asymmetry, the 'trick' I used needs to be
> reworked and may not work but it should be possible to do an equivalent
> analysis in one of the morphometric packages (morpho, geomorph, not sure
> about momocs).
> I am sure a skilled R coder will be able to suggest better ways.
> Good luck.
> Cheers
>
> Andrea
>
> On Sat, 4 Jan 2025 at 04:37, [email protected] <[email protected]> wrote:
> Dear morphmet members,
> I expect this will be a relatively easy question.
> Does the Procrustes ANOVA in MorphoJ use permutations? I don't see it as an
> option, though I see it in regression and in Matrix correlation.
> I expect that since there isn't an option for it, then it is not a
> permutation-based test, but it seems so odd that it wouldn't be.
> I can't find it in the documentation, and I'm convinced it must be there and
> I'm missing it. If there is something that discusses this, could you kindly
> point me to it.
> Thanks in advance,
> Marianna C.
> --
> You received this message because you are subscribed to the Google Groups
> "Morphmet" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion visit
> https://groups.google.com/d/msgid/morphmet2/549e6df3-2eaf-46cc-9b34-f30f5caf2777n%40googlegroups.com.
>
> --
> E-mail address: [email protected], [email protected]
> WEBPAGE: https://sites.google.com/view/alcardini2/
> or https://tinyurl.com/andreacardini
