On Jun 20, 2012, at 10:34 PM, C W wrote:

I am a noob. I am familiar with factors, but not familiar with how that relates to "two distinct values". How were you able to tell?

Please point me in the right direction.
Mike

The first part of a factor's structure is a vector of integers, the part I copied. The second part, the .Label attribute, is a character vector. The value of each factor element is the n-th item in that character vector, where n is the corresponding integer in the first part. I noted that you had only two unique integer values, 78 and 1.
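
As a small illustration of that structure, here is a sketch using a made-up factor `f` (toy values, not your data). The integer codes and the labels can be inspected separately, and indexing the labels by the codes recovers the values:

```r
# Hypothetical toy factor, for illustration only (not the poster's data)
f <- factor(c("7.14%", "", "", "12%"))

codes  <- as.integer(f)   # integer codes: positions into the levels vector
labels <- levels(f)       # character vector of labels

# The value of each element is the label at the position given by its code
values <- labels[codes]
identical(values, as.character(f))
```

Calling as.numeric() directly on a factor returns the codes rather than the labels, which is why the conversion has to go through as.character() or levels().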

What you should have done was convert the Excel column to numeric from percentage using the "Format/Cell" menu and then import.


On Wed, Jun 20, 2012 at 9:53 PM, David Winsemius <dwinsem...@comcast.net > wrote:
Dear Conventional Wisdom;

You do realize you only have two distinct values in that factor variable, right?

dat <- structure( c(78L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L)....

After assigning your structure to 'dat':

> levels(dat)[78]
[1] "7.14%"

> levels(dat)[1]
[1] ""

You have 9 values of the empty string and one value of 7.14%.

You should start your journey to understanding factors by reading the FAQ entry on converting factors to numbers. This problem happens to be more complex because R has no 'percentage' type, so there is no as.numeric.percent coercion function for vectors of class factor or class character, although it would not be that difficult to construct one.

> setClass("percent", representation(a="factor")  )
[1] "percent"
> setAs("percent", "numeric", function(from) as.numeric(sub("%", "", as.character(from)))/100)
> class(dat) <- c("percent", class(dat))
> class(dat)
[1] "percent" "factor"
> as(dat, "numeric")
[1] 0.0714 NA NA NA NA NA NA NA NA NA
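
If a formal class is more than is needed, a plain function may be enough to get at the mean. The following is only a sketch: `percent_to_numeric` is a hypothetical helper name (nothing by that name exists in base R), and `pct` is toy data standing in for the imported column.

```r
# Hypothetical helper (not part of base R): strip "%" and rescale to a proportion
percent_to_numeric <- function(x) {
  x <- as.character(x)      # go via the labels, never via the integer codes
  x[x == ""] <- NA          # treat empty strings as missing
  as.numeric(sub("%", "", x, fixed = TRUE)) / 100
}

# Toy data standing in for the imported column
pct <- factor(c("45%", "65%", "12%", ""))
p <- percent_to_numeric(pct)

# Mean of the non-missing proportions
mean(p, na.rm = TRUE)
```

Setting the empty strings to NA first and using na.rm = TRUE keeps the blank cells from turning the mean into NA.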

--
David.



On Jun 20, 2012, at 9:26 PM, C W wrote:

Hi R list,
I imported values from Excel; there is a column with numbers like 45%, 65%,
12%.

I want to find its mean.  What should I use?

strsplit()
split()
parse()

Data from dput(),

structure(c(78L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("",
"-0.15%", "-0.34%", "-1.3%", "-10.77%", "-100.00%", "-11.45%",
"-12.53%", "-13.06%", "-15.36%", "-15.82%", "-16.96%", "-18.71%",
"-2.02%", "-2.94%", "-21.23%", "-25.00%", "-26.20%", "-29.79%",
"-3.16%", "-3.67%", "-30.52%", "-33.44%", "-37.48%", "-37.89%",
"-39.42%", "-45.88%", "-5.09%", "-51.64%", "-61.58%", "-62.87%",
"-63.51%", "-7.00%", "-7.90%", "-8.33%", "-8.58%", "-8.88%",
"-91.10%", "-94.08%", "-96.01%", "0.98%", "10.00%", "10.04%",
"10.64%", "11.11%", "114.32%", "12.09%", "12.68%", "13.77%",
"14.10%", "15.51%", "16.25%", "16.93%", "16.94%", "18.57%", "18.88%",
"2.46%", "2.55%", "2.79%", "2.93%", "20.00%", "22.67%", "24.50%",
"25.76%", "28.18%", "3.26%", "3.80%", "3.83%", "36.05%", "37.22%",
"40.63%", "5.53%", "5.70%", "6.19%", "6.62%", "6.72%", "63.33%",
"7.14%", "7.21%", "7.39%", "9.15%", "9.99%", "95.00%"), class = "factor")

Thanks,

Mike


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT


