Hi

Please do not use HTML formatting in your posts. It brings no advantage.
See inline.

From: Verena Weinbir [mailto:vwein...@gmail.com]
Sent: Thursday, May 29, 2014 3:33 PM
To: PIKAL Petr
Subject: Re: [R] Dataframe: Average cells of two rows and replace them with one 
row

Hey,
Thank you for your reply!

I've attached some sample data. When I tried your code it gave me the error 
message, that arguments must have same
Why did you attach the data? Using dput is preferable. When I tried to read 
your data, row 13 (and probably others) had the wrong number of items. Excel is 
not famous for keeping the same formatting across versions.
> test<-read.table("clipboard", header=T, na.string="NA", dec=",")
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  line 13 did not have 25 elements
So I read only lines 1:10.
> test<-read.table("clipboard", header=T, na.string="NA", dec=",")
This results in a data frame with two factor variables, Author and Test. BTW, 
there is no variable “Name” in your data.
> str(test)
'data.frame':   10 obs. of  25 variables:
$ Author  : Factor w/ 4 levels "Beck","Joll",..: 2 2 2 2 1 1 1 1 3 4
$ Year    : int  2006 2006 2006 2006 1988 1988 1988 1988 2004 2004
$ Number  : int  720 720 720 720 33 41 41 41 19 26
$ NumberA : int  344 344 344 344 5 6 6 6 9 12
$ NumberB : int  376 376 376 376 28 35 35 35 10 14
$ Age     : num  15 15 15 15 25.5 NA NA NA 37.4 37.2
$ AgeA    : int  NA NA NA NA 27 NA NA NA NA NA
$ AgeB    : int  NA NA NA NA 24 NA NA NA NA NA
$ Test    : Factor w/ 2 levels "green","red": 2 2 2 2 1 1 1 1 1 1
$ ScoreA  : num  64.8 63 64.7 60.6 61 ...
$ ScoreAdv: num  9.96 9.96 9.96 9.96 20.64 ...
$ ScoreB  : num  75.5 73.4 74.6 69.2 70.8 ...
$ ScoreBdv: num  9.04 9.04 9.04 9.04 16.36 ...
$ Sub1    : logi  NA NA NA NA NA NA ...
$ Sub2    : logi  NA NA NA NA NA NA ...
$ Sub3    : logi  NA NA NA NA NA NA ...
$ Sub4    : logi  NA NA NA NA NA NA ...
$ Sub5    : logi  NA NA NA NA NA NA ...
$ Sub6    : logi  NA NA NA NA NA NA ...
$ Sub7    : logi  NA NA NA NA NA NA ...
$ Sub8    : logi  NA NA NA NA NA NA ...
$ Sub8.1  : logi  NA NA NA NA NA NA ...
$ Sub10   : logi  NA NA NA NA NA NA ...
$ yi      : num  1.124 1.092 1.04 0.903 0.515 ...
$ vi      : num  0.00643 0.00638 0.0063 0.00612 0.23337 ...
Here is the output from dput, which you can use to check whether my data are 
the same as yours (that is why dput is preferable).
> dput(test)
structure(list(Author = structure(c(3L, 3L, 3L, 3L, 1L, 1L, 1L,
1L, 4L, 5L, 2L), .Label = c("Beck", "Con", "Joll", "Per(a)",
"Per(b)"), class = "factor"), Year = c(2006L, 2006L, 2006L, 2006L,
1988L, 1988L, 1988L, 1988L, 2004L, 2004L, 2012L), Number = c(720L,
720L, 720L, 720L, 33L, 41L, 41L, 41L, 19L, 26L, 312L), NumberA = c(344L,
344L, 344L, 344L, 5L, 6L, 6L, 6L, 9L, 12L, 156L), NumberB = c(376L,
376L, 376L, 376L, 28L, 35L, 35L, 35L, 10L, 14L, 156L), Age = c(15,
15, 15, 15, 25.5, NA, NA, NA, 37.4, 37.2, 37.25), AgeA = c(NA,
NA, NA, NA, 27, NA, NA, NA, NA, NA, 38.3), AgeB = c(NA, NA, NA,
NA, 24, NA, NA, NA, NA, NA, 36.2), Test = structure(c(3L, 3L,
3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 1L), .Label = c("blue", "green",
"red"), class = "factor"), ScoreA = c(64.8, 63, 64.7, 60.6, 61,
60.66, 58.5, 61.66, 87.58, 91.2, 0.26), ScoreAdv = c(9.955, 9.955,
9.955, 9.955, 20.64, 19.38, 20.35, 19.44, 16.79, 15.6, 0.27),
    ScoreB = c(75.5, 73.4, 74.6, 69.2, 70.83, 70.34, 70.91, 71.19,
    98.08, 86.87, 0.3), ScoreBdv = c(9.043, 9.043, 9.043, 9.043,
    16.36, 17.78, 18.23, 18.93, 16.35, 15.73, 0.26), Sub1 = c(NA,
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Sub2 = c(NA, NA,
    NA, NA, NA, NA, NA, NA, NA, NA, NA), Sub3 = c(NA, NA, NA,
    NA, NA, NA, NA, NA, NA, NA, NA), Sub4 = c(NA, NA, NA, NA,
    NA, NA, NA, NA, NA, NA, NA), Sub5 = c(NA, NA, NA, NA, NA,
    NA, NA, NA, NA, NA, NA), Sub6 = c(NA, NA, NA, NA, NA, NA,
    NA, NA, NA, NA, NA), Sub7 = c(NA, NA, NA, NA, NA, NA, NA,
    NA, NA, NA, NA), Sub8 = c(NA, NA, NA, NA, NA, NA, NA, NA,
    NA, NA, NA), Sub8.1 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA,
    NA, NA), Sub10 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
    NA), yi = c(1.12396298138735, 1.09245000060079, 1.03992836595652,
    0.90337211588142, 0.514940166844419, 0.510422808437657, 0.629923007603453,
    0.487074464117519, 0.605177248294008, -0.26766583881062,
    0.150551071047105), vi = c(0.0064268782069221, 0.00637821308397186,
    0.00630017975096319, 0.00611528303580472, 0.233373905723904,
    0.212826775760406, 0.211924228535386, 0.222536036643126,
    0.224889816220824, 0.158901797586393, 0.0128772400934118)), .Names = 
c("Author",
"Year", "Number", "NumberA", "NumberB", "Age", "AgeA", "AgeB",
"Test", "ScoreA", "ScoreAdv", "ScoreB", "ScoreBdv", "Sub1", "Sub2",
"Sub3", "Sub4", "Sub5", "Sub6", "Sub7", "Sub8", "Sub8.1", "Sub10",
"yi", "vi"), class = "data.frame", row.names = c(NA, -11L))
>
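To show why dput output is convenient for readers, here is a minimal sketch of the round trip. The data frame `dat` and its values are made up for illustration (they are not the poster's data), and `eval(parse())` only simulates a reader pasting the dput text into their own session.

```r
# Toy data frame (hypothetical values, for illustration only)
dat <- data.frame(Name = c("A", "B", "C", "C", "D"),
                  C1 = c(3, 2, 4, 4, 5),
                  C2 = c(3, 7, 3, 4, 5),
                  stringsAsFactors = FALSE)

# dput() writes R code that recreates the object; capture it as text
txt <- paste(capture.output(dput(dat)), collapse = "\n")

# A reader would paste that text into R; here this is simulated
dat2 <- eval(parse(text = txt))

# Compare the recreated object with the original
all.equal(dat, dat2)
```

The recreated object keeps column types and factor levels, which a copy from a spreadsheet may not.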
I can use aggregate without problems
> test.ag<-aggregate(test[,-1], list(test[,1]), mean, na.rm=T)
Here is the result
> dput(test.ag)
structure(list(Group.1 = structure(1:5, .Label = c("Beck", "Con",
"Joll", "Per(a)", "Per(b)"), class = "factor"), Year = c(1988,
2012, 2006, 2004, 2004), Number = c(39, 312, 720, 19, 26), NumberA = c(5.75,
156, 344, 9, 12), NumberB = c(33.25, 156, 376, 10, 14), Age = c(25.5,
37.25, 15, 37.4, 37.2), AgeA = c(27, 38.3, NaN, NaN, NaN), AgeB = c(24,
36.2, NaN, NaN, NaN), Test = c(NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_), ScoreA = c(60.455, 0.26, 63.275, 87.58,
91.2), ScoreAdv = c(19.9525, 0.27, 9.955, 16.79, 15.6), ScoreB = c(70.8175,
0.3, 73.175, 98.08, 86.87), ScoreBdv = c(17.825, 0.26, 9.043,
16.35, 15.73), Sub1 = c(NaN, NaN, NaN, NaN, NaN), Sub2 = c(NaN,
NaN, NaN, NaN, NaN), Sub3 = c(NaN, NaN, NaN, NaN, NaN), Sub4 = c(NaN,
NaN, NaN, NaN, NaN), Sub5 = c(NaN, NaN, NaN, NaN, NaN), Sub6 = c(NaN,
NaN, NaN, NaN, NaN), Sub7 = c(NaN, NaN, NaN, NaN, NaN), Sub8 = c(NaN,
NaN, NaN, NaN, NaN), Sub8.1 = c(NaN, NaN, NaN, NaN, NaN), Sub10 = c(NaN,
NaN, NaN, NaN, NaN), yi = c(0.535590111750762, 0.150551071047105,
1.03992836595652, 0.605177248294008, -0.26766583881062), vi = 
c(0.220165236665706,
0.0128772400934118, 0.00630513851941547, 0.224889816220824, 0.158901797586393
)), .Names = c("Group.1", "Year", "Number", "NumberA", "NumberB",
"Age", "AgeA", "AgeB", "Test", "ScoreA", "ScoreAdv", "ScoreB",
"ScoreBdv", "Sub1", "Sub2", "Sub3", "Sub4", "Sub5", "Sub6", "Sub7",
"Sub8", "Sub8.1", "Sub10", "yi", "vi"), row.names = c(NA, -5L
), class = "data.frame")
You can see that the Test variable is turned into NA, as it is not numeric and 
cannot be averaged.
> test.ag$Test
[1] NA NA NA NA NA
length.  Regarding the test variable I want it to look the same as before.
You can check whether the Test variable is the same within each aggregated group.
> aggregate(test$Test, list(test[,1]), paste)
  Group.1                          x
1    Beck green, green, green, green
2     Con                       blue
3    Joll         red, red, red, red
4  Per(a)                      green
5  Per(b)                      green
and if so, you can pick one of them
> aggregate(test$Test, list(test[,1]), function(x) x[1])
  Group.1     x
1    Beck green
2     Con  blue
3    Joll   red
4  Per(a) green
5  Per(b) green
I believe this can also be accomplished in other ways. Now you can add these 
values to the aggregated data, e.g. by
test.ag$Test <- aggregate(test$Test, list(test[,1]), function(x) x[1])$x
> test.ag$Test
[1] green blue  red   green green
Levels: blue green red
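
The steps above can also be put together in one short script. This is only a sketch on made-up toy data (the values and the names `num`, `n.lev`, `first.test` and `res` are my assumptions, not part of the original data): it averages just the numeric columns, checks that Test is constant within each Author, and then merges one Test value per Author back in.

```r
# Toy data (hypothetical values), similar in shape to the data in this thread
test <- data.frame(Author = c("Beck", "Beck", "Joll", "Joll", "Con"),
                   Test   = c("green", "green", "red", "red", "blue"),
                   ScoreA = c(61, 60, 64, 63, 0.26),
                   ScoreB = c(70, 71, 75, 73, 0.30),
                   stringsAsFactors = TRUE)

# Logical index of the numeric columns, so mean() is never applied to a factor
num <- sapply(test, is.numeric)

# 1. Average the numeric columns within each Author
test.ag <- aggregate(test[, num], list(Author = test$Author), mean, na.rm = TRUE)

# 2. Stop if Test takes more than one value within any Author
n.lev <- tapply(as.character(test$Test), test$Author,
                function(x) length(unique(x)))
stopifnot(all(n.lev == 1))

# 3. Take the first Test value per Author and merge it with the averages
first.test <- aggregate(test["Test"], list(Author = test$Author),
                        function(x) as.character(x[1]))
res <- merge(test.ag, first.test, by = "Author")
res
```

Using merge instead of assigning the column directly means the result does not depend on both aggregate calls returning the groups in the same order.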

I hope this solves your problem. Again, please use plain text and dput for 
presenting data. It is much more convenient.
Regards
Petr
Best,
Verena


On Thu, May 29, 2014 at 10:16 AM, PIKAL Petr 
<petr.pi...@precheza.cz> wrote:
Hi

So what do you want to do with the test variable when averaging?
Did you try the aggregate function?
What were the results?

Please show real data (at least the structure) and the code you used.

Regards
Petr

From: Verena Weinbir [mailto:vwein...@gmail.com]
Sent: Thursday, May 29, 2014 9:48 AM
To: PIKAL Petr

Cc: r-help
Subject: Re: [R] Dataframe: Average cells of two rows and replace them with one 
row

Hello,
thank you for your reply.

Actually, the whole rows would have to be averaged anyways - my mistake :-)
Besides the first column "name" there is one other string (chr) variable "Test" 
in the dataset (the rows I want to average always have the same Test variable); 
the other variables are numeric or integer.
Best,
Verena

On Wed, May 28, 2014 at 2:57 PM, PIKAL Petr 
<petr.pi...@precheza.cz> wrote:
Hi

AFAIK you cannot average values in only 2 columns while leaving the others 
intact. The exact code depends on what is in columns 2-39 of your data frame. 
If they are numbers, you can average them as well.

Something like

dat.ag <- aggregate(dat[,-1], list(dat$Name), mean, na.rm=TRUE)

if your data frame is named dat and the first column is called Name. You get a 
new object with aggregated values for each Name.
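
As a quick check of this idea, here is a small sketch applied to the toy example from the original question. It assumes all columns except Name are numeric, and it names the grouping column "Name" in the list (a small addition of mine) so the result keeps that header instead of Group.1.

```r
# Toy data from the original question, read from inline text
dat <- read.table(text = "Name C1 C2 C3
A 3 3 5
B 2 7 4
C 4 3 3
C 4 4 6
D 5 5 3", header = TRUE, stringsAsFactors = FALSE)

# Average every column except the first within each Name
dat.ag <- aggregate(dat[, -1], list(Name = dat$Name), mean, na.rm = TRUE)
dat.ag
```

The two C rows collapse into a single row with C1 = 4, C2 = 3.5 and C3 = 4.5, while the rows for A, B and D are unchanged.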

If some columns are non-numeric, the problem gets trickier and the solution 
strongly depends on what mode those columns are and what you want to do with 
them when aggregating the values in columns 40 and 41.

Show us at least the structure of your data frame.

?str

Regards
Petr

> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-bounces@r-project.org] On Behalf Of Verena Weinbir
> Sent: Wednesday, May 28, 2014 2:00 PM
> To: arun
> Cc: r-help
> Subject: Re: [R] Dataframe: Average cells of two rows and replace them
> with one row
>
> Hey guys,
>
> thank you very much for your help.  Since I am a R-newbie I am still
> checking out how your code works and how I could adapt it to my
> dataframe,
> which has 124 rows and 41 columns/variables.  The first column would be
> "name", the last ones, 40 and 41, contain the cells I want to average
> for
> some rows. Is it possible to read the dataframe without copying the
> whole
> thing into the text"" function (just tried it and got an error
> message)?
>
> Thank you!
>
> Verena
>
>
> On Wed, May 28, 2014 at 3:48 AM, arun 
> <smartpink...@yahoo.com> wrote:
>
> > Hi,
> > You can also try:
> > dat <- read.table(text="Name C1 C2 C3
> >   1  A  3  3  5
> >   2  B  2  7  4
> >   3  C  4  3  3
> >   4  C  4  4  6
> >   5  D  5  5  3",sep="",header=TRUE,stringsAsFactors=FALSE)
> >
> >
> >  library(plyr)
> >  ddply(dat,.(Name),numcolwise(mean,na.rm=TRUE))
> > A.K.
> >
> >
> > On Tuesday, May 27, 2014 4:08 PM, Verena Weinbir 
> > <vwein...@gmail.com>
> > wrote:
> > Hello,
> >
> > I have a big dataframe, and want to average two specific cells of two
> > specific rows and then replace those two rows with one row which
> contains
> > the averaged cells. Example (row 3 and 4: Cells2 and Cells3 averaged
> and
> > replaced)
> >
> >     Name C1 C2 C3
> >   1  A  3  3  5
> >   2  B  2  7  4
> >   3  C  4  3  3
> >   4  C  4  4  6
> >   5  D  5  5  3
> >
> >
> >
> >     Name C1 C2  C3
> >   1  A  3  3   5
> >   2  B  2  7   4
> >   3  C  4  3.5 4.5
> >   4  D  5  5   3
> >
> >
> > Many thanks in advance!
> >
> > Best,
> >
> > Verena
> >
> >     [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
