Hello,
I've got a small dataset on box turtle shell measurements that I would like to
perform a detrended correspondence analysis on. I thought that it would be
interesting to examine the morphometrics for each species in the area of
overlap and in areas where neither species occurs.
I've taken a look at the dune and dune.env datasets in vegan. Using the str()
command gives me
> str(dune)
'data.frame': 20 obs. of 30 variables:
$ Belper: num 3 0 2 0 0 0 0 2 0 0 ...
$ Empnig: num 0 0 0 0 0 0 0 0 0 0 ...
$ Junbuf: num 0 3 0 0 0 0 0 0 0 0 ...
$ Junart: num 0 0 0 3 0 0 4 0 0 3 ...
...
However, when I try looking directly at the data frame using the edit command I
see that there is a column called "row.names" to the left of "Belper".
Likewise, when I use the str() command on dune.env I get
> str(dune.env)
'data.frame': 20 obs. of 5 variables:
$ A1 : num 3.5 6 4.2 5.7 4.3 2.8 4.2 6.3 4 11.5 ...
$ Moisture : Ord.factor w/ 4 levels "1"<"2"<"4"<"5": 1 4 2 4 1 1 4 1 2 4 ...
$ Management: Factor w/ 4 levels "BF","HF","NM",..: 1 4 4 4 2 4 2 2 3 3 ...
$ Use : Ord.factor w/ 3 levels "Hayfield"<"Haypastu"<..: 2 2 2 3 2 2 3 1 1 2 ...
$ Manure : Ord.factor w/ 5 levels "0"<"1"<"2"<"3"<..: 3 4 5 4 3 5 4 3 1 1 ...
but using the edit() command shows a column named "row.names".
I assume that the "row.names" column is used to link the two files together.
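One way to test that assumption is sketched below on a made-up stand-in for dune (the object `toy` and its values are invented for illustration and are not part of vegan). Row names are stored as an attribute of a data frame rather than as a column, which would explain why str() does not list them while edit() displays them.

```r
# Toy stand-in for dune: two species columns, three sites (values invented).
toy <- data.frame(Belper = c(3, 0, 2),
                  Empnig = c(0, 0, 0),
                  row.names = c("site1", "site2", "site3"))

names(toy)                    # column names only
"row.names" %in% names(toy)   # is there a real column called "row.names"?
rownames(toy)                 # the row labels, stored as an attribute
ncol(toy)                     # number of actual variables
```

If the same checks on dune and dune.env show no column named "row.names", then the labels are not a variable and would not need to be added to the CSV as one.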
My turtle data is saved as a *.csv, and I've added a column called "row.names",
so that it looks like this
row.names,CL,CCL,CW,CCW,CH,CCH
1,104.4,131.8,89.887,137.4,43.391,89.7
2,108.79,135.9,87.78,118.1,50.72,71.2
3,114.12,126.1,89.33,132.8,142.39,78.3
4,102.87,128.2,84.2,125,45.42,72.4
5,84.6,104.8,72.61,111.8,41.1,57.3
I've called this file "turtles_dca.csv". I've also created a file called
"turtles_dca_env.csv" that looks like this
row.names,Species,Sex,Distribution,Concatenated,Species_overlap
1,Terrapene_ornata,Female,overlap,TO_F_Overlap,TO_Overlap
2,Terrapene_ornata,Female,overlap,TO_F_Overlap,TO_Overlap
3,Terrapene_ornata,Female,overlap,TO_F_Overlap,TO_Overlap
4,Terrapene_ornata,Female,overlap,TO_F_Overlap,TO_Overlap
5,Terrapene_ornata,Female,overlap,TO_F_Overlap,TO_Overlap
However, when I read the data into R using these commands
turtles = read.csv("turtles_dca.csv", header = TRUE)
turtles.env = read.csv("turtles_dca_env.csv", header = TRUE)
and then use the str() command, I get
> str(turtles)
'data.frame': 67 obs. of 7 variables:
$ row.names: int 1 2 3 4 5 6 7 8 9 10 ...
$ CL : num 104.4 108.8 114.1 102.9 84.6 ...
$ CCL : num 132 136 126 128 105 ...
$ CW : num 89.9 87.8 89.3 84.2 72.6 ...
$ CCW : num 137 118 133 125 112 ...
$ CH : num 43.4 50.7 142.4 45.4 41.1 ...
$ CCH : num 89.7 71.2 78.3 72.4 57.3 73.4 67 57 68.8 68 ...
When I run decorana() on this dataset, it appears that the column "row.names"
is included in the analysis, which isn't what I'm looking for.
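A possible way around this, sketched under the assumption that the first CSV column is meant only as a row label: the row.names argument of read.csv() can turn that column into row names, so it never becomes a variable. The five rows shown above are pasted inline through the text argument so the sketch runs without the file; `csv_text` is a name made up for this example.

```r
# First five rows of turtles_dca.csv, pasted inline so the sketch is self-contained.
csv_text <- "row.names,CL,CCL,CW,CCW,CH,CCH
1,104.4,131.8,89.887,137.4,43.391,89.7
2,108.79,135.9,87.78,118.1,50.72,71.2
3,114.12,126.1,89.33,132.8,142.39,78.3
4,102.87,128.2,84.2,125,45.42,72.4
5,84.6,104.8,72.61,111.8,41.1,57.3"

# row.names = 1 uses the first column as row labels instead of as a variable.
turtles <- read.csv(text = csv_text, header = TRUE, row.names = 1)

str(turtles)     # only the six measurement columns should be listed
names(turtles)
```

With the real file, the call would presumably be read.csv("turtles_dca.csv", header = TRUE, row.names = 1).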
If I go ahead and delete the column "row.names" from my data frames (i.e.
removing it from turtles and turtles.env), I don't believe that the analysis is
performed correctly. The two species differ significantly in most of their
measurements, but the ordihull() and ordispider() commands show them
overlapping almost completely.
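One thing that can be checked here is whether the rows of the two tables still line up, since ordihull() and ordispider() take the grouping factor in row order. The sketch below uses the first three rows of each file pasted inline; `meas_text`, `env_text` and `same_rows` are names made up for this example, and it assumes both files are read with the first column as row names.

```r
# First three rows of each file, pasted inline so the sketch is self-contained.
meas_text <- "row.names,CL,CCL,CW,CCW,CH,CCH
1,104.4,131.8,89.887,137.4,43.391,89.7
2,108.79,135.9,87.78,118.1,50.72,71.2
3,114.12,126.1,89.33,132.8,142.39,78.3"

env_text <- "row.names,Species,Sex,Distribution,Concatenated,Species_overlap
1,Terrapene_ornata,Female,overlap,TO_F_Overlap,TO_Overlap
2,Terrapene_ornata,Female,overlap,TO_F_Overlap,TO_Overlap
3,Terrapene_ornata,Female,overlap,TO_F_Overlap,TO_Overlap"

turtles     <- read.csv(text = meas_text, row.names = 1)
turtles.env <- read.csv(text = env_text, row.names = 1, stringsAsFactors = TRUE)

# The grouping factor is matched to the ordination by position,
# so both tables need the same rows in the same order.
same_rows <- identical(rownames(turtles), rownames(turtles.env))
same_rows

# How many individuals fall into each group.
table(turtles.env$Species)
```

If same_rows comes back FALSE on the full data, the hulls and spiders would be drawn around the wrong individuals.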
I think that I'm missing something pretty basic about inputting and formatting
this data for this analysis. Can anyone offer a suggestion on where I'm going
astray? I can send a copy of the data if anyone wants to look at it.
Best wishes,
Chris
University of Central Oklahoma