Hi Sarah, apologies for the excess. A smaller example:

f<-structure(list(c("GDP per capita (LCU)", "Ratio to EZ GDP Per Cap"
), `2005` = c(32128, 0.1), `2009` = c(52163, 0.1), `2010` = c(63100,
0.1), `2011` = c(72461, 0.1), `2012` = c(81313, 0.1)), .Names = c("",
"2005", "2009", "2010", "2011", "2012"), row.names = 3:4,
class = c("cast_df", "data.frame"))

nam2<- structure(list(var1 = c("GDP per capita (LCU)", "Ratio to EZ GDP Per Cap"
), digi = c(0, 1)), .Names = c("var1", "digi"), row.names = c("98",
"110"), class = "data.frame")

I'm trying to place a thousand separator in the numbers in the table f:

> f
                             2005    2009    2010    2011    2012
3    GDP per capita (LCU) 32128.0 52163.0 63100.0 72461.0 81313.0
4 Ratio to EZ GDP Per Cap     0.1     0.1     0.1     0.1     0.1

and also have precision given by variable digi:

> nam2
                       var1 digi
98     GDP per capita (LCU)    0
110 Ratio to EZ GDP Per Cap    1

format

hi<-format(f,big.mark=",",scientific=F)

gives me the comma, but now I'm not sure how to get the precision.
Your answer seems to be doing what I want, although when I changed the
testdata slightly

> testdata[1,1]<-10000
> hi<-format(testdata,big.mark=",",scientific=F)
> hi
    values digits
1 10,000.0      0
2      5.3      1
3      1.1      2
> apply(hi, 1, function(x)sub(paste("(^.*\\.\\d{", x[2], "})(\\d*)", sep=""),
  "\\1", x[1]))
         1          2          3
 "10,000." "     5.3" "     1.1"

The decimal appears to be left behind in 10,000. Unfortunately your
approach is a bit too advanced for me, so I can't adapt it. Perhaps
you could recommend somewhere where I could read up on what the caret
and other symbols mean in your paste call?

thanks for your help!

Aidan

On Wed, Dec 7, 2011 at 12:05 PM, Sarah Goslee <sarah.gos...@gmail.com> wrote:
> Hi,
>
> Example data is crucial, but small simple example data is even better.
> I'm too lazy to figure out which bits I need from your data, so here's
> a simple example of one way to approach your question. You could
> use gsub() in very much the same manner if you need more complex
> output.
>
>> testdata <- data.frame(values=c(2.0, 5.3, 1.1), digits=c(0, 1, 2))
>> testdata
>   values digits
> 1    2.0      0
> 2    5.3      1
> 3    1.1      2
> # a nice way that works on numbers
>> apply(testdata, 1, function(x)sprintf(paste("%0.", x[2], "f", sep=""), x[1]))
> [1] "2"    "5.3"  "1.10"
>
> # a messy way that works on strings
>> apply(testdata, 1, function(x)sub(paste("(^.*\\.\\d{", x[2], "})(\\d*)",
>> sep=""), "\\1", x[1]))
> [1] "2"   "5.3" "1.1"
>
> Also note that the second method will not add zeros to pad out the
> end. If you need that, I'd consider rearranging the order of your
> steps so that you can use sprintf().
>
> Someone else might have a more flexible way too; I'd be interested to see it.
> Unfortunately I don't think sprintf() has a way to insert a thousands
> separator, or that would be a one-step solution.
>
> Sarah
>
> On Wed, Dec 7, 2011 at 6:05 AM, Aidan Corcoran
> <aidan.corcora...@gmail.com> wrote:
>> Dear all,
>>
>> I'm trying to remove some text after the period (a decimal point) in
>> the data frame 'hi', below. This is one step in formatting a table. So
>> I would like e.g.
>> "2.0" to become "2"
>> and "5.3" to be "5.3",
>> where the variable digordered contains the number of digits after the
>> decimal that I would like to display, in the same order in which the
>> variables appear in hi. If it makes it easier to use, this info is
>> also contained in the dataframe nam2. The reason the numbers are
>> recorded as characters is because I used format to get a thousand
>> separator, which I also need.
>>
>> The string manipulation functions in R generally don't seem to work
>> with matrices or data frames, so e.g. regexpr("\\.", hi[1,2]) works
>> but not regexpr("\\.", hi). Finding the location of the period and
>> then using substring was the approach I was thinking of taking, but
>> this would seem to need for loops here. I was wondering if anyone
>> knows any easier ways.
>>
>> Thanks very much for any help!
>>
>> Aidan
>>
>>
>> digordered<- c(0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1)
>> f<-structure(list(c("GDP (LCU,bn)", "GDP ($, bn)", "GDP per capita (LCU)",
>> "Ratio to EZ GDP Per Cap", "Share of World GDP (Intl $, %)",
>> "Real GDP Growth (%)", "Population (mn)", "Unemployment Rate (%)",
>> "Ratio of Employed/Unemployed", "PPP Exchange Rate",
>> "Nominal Exchange Rate (LCU per $)",
>> "Inflation (%)", "Main Lending Rate to Private Sector (%)",
>> "Claims on Central Gov",
>> "Claims on Private Sector", "Bank Assets", "Regulator Capital to RWA",
>> "Tier 1 Capital to RWA", "Return on Equity",
>> "Liquid Assets to ST Liabilities"
>> ), `2005` = c(35662, 809, 32128, 0.1, 4.3, 9, 1110, 3.5, NA,
>> 14.7, 44.1, 4, 10.8, 7, 15, 22835, NA, NA, NA, NA), `2009` = c(61240,
>> 1265, 52163, 0.1, 5.2, 6.8, 1174, NA, NA, 16.8, 48.4, 10.9, 12.2,
>> 14, 31, 47180, 13.6, 9, 10.8, 42.8), `2010` = c(75122, 1632,
>> 63100, 0.1, 5.5, 10.1, 1191, NA, NA, 18.5, 45.7, 12, NA, 15,
>> 39, 56787, 14.7, 9.9, 10.5, 41.1), `2011` = c(87455, 1843, 72461,
>> 0.1, 5.7, 7.8, 1207, NA, NA, 19.6, NA, 10.6, NA, NA, NA, NA,
>> 13.5, 9.3, 14.3, 35.8), `2012` = c(99459, 2013, 81313, 0.1, 5.9,
>> 7.5, 1223, NA, NA, 20.5, NA, 8.6, NA, NA, NA, NA, NA, NA, NA,
>> NA)), .Names = c("", "2005", "2009", "2010", "2011", "2012"),
>> row.names = c(NA, 20L), class = c("cast_df", "data.frame"))
>>
>> hi<-format(f,big.mark=",",scientific=F)
>> regexpr("\\.", hi) #don't know how to get location of "." in a dataframe of chars
>>
>>
>> nam2<- structure(list(var1 = c("GDP (LCU,bn)", "GDP ($, bn)",
>> "GDP per capita (LCU)",
>> "Ratio to EZ GDP Per Cap", "GDP per capita (Intl $)",
>> "EU GDP per capita (Intl $)",
>> "Share of World GDP (Intl $, %)", "Real GDP Growth (%)", "Population (mn)",
>> "Unemployment Rate (%)", "Ratio of Employed/Unemployed",
>> "Employment (1000s)",
>> "Unemployment (1000s)", "PPP Exchange Rate",
>> "Nominal Exchange Rate (LCU per $)",
>> "Inflation (%)", "Main Lending Rate to Private Sector (%)",
>> "Claims on Central Gov",
>> "Claims on Private Sector", "Bank Assets", "Regulator Capital to RWA",
>> "Tier 1 Capital to RWA", "Return on Equity",
>> "Liquid Assets to ST Liabilities",
>> "Reserves"), digi = c(0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0,
>> 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0)), .Names = c("var1", "digi"
>> ), row.names = c("96", "97", "98", "110", "99", "100", "101",
>> "102", "103", "111", "112", "104", "105", "106", "107", "108",
>> "109", "114", "115", "113", "119", "120", "121", "122", "116"
>> ), class = "data.frame")
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Sarah Goslee
> http://www.stringpage.com
> http://www.sarahgoslee.com
> http://www.functionaldiversity.org
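
A possible way to get the thousands separator and the per-row precision in one step, offered as a sketch and not as a tested solution for the full table: formatC() accepts big.mark together with format = "f" and digits, so it can play the role of the sprintf() call in the thread. The stand-in table f_small, its column name label, and the helpers fmt_one and trim_decimals are assumptions made for illustration; none of them appear in the messages above. The second helper keeps the string-based sub() route but drops the period that is left behind when digits is 0. In that pattern, ^ anchors the match at the start of the string, .* matches any run of characters, \\. is a literal period, \\d{n} is exactly n digits, parentheses capture what they enclose, and "\\1" in the replacement puts the first captured group back.

```r
# Small stand-in for the table f. The column name "label" is an assumption;
# in the thread the first column has an empty name.
f_small <- data.frame(
  label = c("GDP per capita (LCU)", "Ratio to EZ GDP Per Cap"),
  `2005` = c(32128, 0.1),
  `2012` = c(81313, 0.1),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
nam2 <- data.frame(
  var1 = c("GDP per capita (LCU)", "Ratio to EZ GDP Per Cap"),
  digi = c(0, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: format one number with d decimals and a comma
# as the thousands separator.
fmt_one <- function(v, d) formatC(v, format = "f", digits = d, big.mark = ",")

# Look up the digits for each row by its label, then format every year column
# row by row (formatC takes a single digits value, hence mapply).
digs <- nam2$digi[match(f_small$label, nam2$var1)]
year_cols <- setdiff(names(f_small), "label")
out <- f_small
out[year_cols] <- lapply(f_small[year_cols],
                         function(col) mapply(fmt_one, col, digs))

# Hypothetical helper for the string route: keep d digits after the period,
# then remove a period left dangling at the end (the d = 0 case).
trim_decimals <- function(s, d) {
  s <- trimws(s)
  kept <- sub(paste0("^(.*\\.\\d{", d, "})\\d*$"), "\\1", s)
  sub("\\.$", "", kept)
}

print(out)
print(trim_decimals("10,000.0", 0))
```

Working from the numbers keeps the padding zeros that the string route cannot add. The full table contains NA entries, so it would be worth checking how those come out before relying on this.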