Thanks, Calum. After rereading the post, I came to interpret it as you did. So glad that we agree.
"easier" of course is in the mind of the beholder. But I'm glad that you presented a "tidyverse" approach. There are other issues of dependencies and efficiency that also might be relevant. Anyway, here is another "simpler" approach -- in the sense that only base R without any dependencies is needed. Beyond that, I make no claims. I first extracted the data from the post and converted them into a data frame, dat, with two (numeric) column named "Value" and "Group". Then the following does what (I think) was requested: spl <-split(seq_len(nrow(dat)), dat$Group) ## a structure giving all row numbers per group for(grp in unique(dat$Group)){ ix <- spl[[grp]] ## extract indices for the group dat[ix, 'Value'] <- na.omit(dat[ix,'Value'] )[1] ##extract values for the group ##set all values in the Value column for these indices to the first non-NA value" } yielding: > dat Value Group 1 6 8 2 9 5 3 2 1 4 5 6 5 2 7 6 7 2 7 4 4 8 2 7 9 2 7 10 10 3 11 7 2 12 4 4 13 5 6 14 9 5 15 9 5 16 5 6 17 10 3 18 7 2 19 2 1 20 2 7 21 7 2 22 6 8 23 4 4 24 9 5 25 5 6 26 2 1 27 4 4 28 6 8 29 10 3 30 10 3 31 6 8 32 2 1 As Calum said, whether this is really a good approach depends on what the OP wants to do after this. Cheers, Bert On Tue, Aug 27, 2024 at 5:07 PM CALUM POLWART <polc1...@gmail.com> wrote: > > Bert > > I thought she meant she wanted to replace the NAs with the 6. But I could be > wrong. > > It looks like the data is combined from cbind. > > I'm going to give tidyverse examples because it's (/s) *"always"* (/s) easier. > > require(tidyverse) > # impute the missing NAs > myData <- cbind(VB1d[,1],s1id[,1]) > > myData |> said[ > filter(!is.na(1)) |> #uses col1 would be better to use a name > unique() -> referenceData > > myData |> > select(2) |> #better to name > left_join(referenceData) -> cleanData > > You will notice I've used column numbers. I suspect cbind will name the > columns oddly. And I'm typing this on my phone so it's untested. 
>
> If you wanted counts
>
> myData |>
>   filter (!is.na(1)) |>
>   group_by(2) |>
>   summarise (n())
>
> I won't answer the c(5,5) that Bert mentions because that's an extra question
> of what you do next with the data to know how best to present it.
>
>
> On Wed, 28 Aug 2024, 00:06 Bert Gunter, <bgunter.4...@gmail.com> wrote:
>>
>> Sorry, not clear to me.
>>
>> For group 8 in your example, do you want to extract the values in column
>> 1 that are not NA, i.e. one value, 6; or do you want to extract the
>> number of values -- that is, the count -- that are not NA, i.e. 1?
>>
>> ... and for group 5, would it be c(9,9) for the values; or 2 for the count?
>>
>> Or something else entirely if I have completely misunderstood.
>>
>> Either of the above is easy and quick to do. You can also just remove
>> the NA's via a version of ?na.omit if that's what you want.
>>
>> Of course, feel free to ignore this and wait for a more helpful
>> response from someone who understands your query better than I.
>>
>> Cheers,
>> Bert
>>
>> On Tue, Aug 27, 2024 at 3:45 PM Francesca PANCOTTO via R-help
>> <r-help@r-project.org> wrote:
>> >
>> > Dear Contributors,
>> > I have a problem with a database composed of many individuals for many
>> > periods, for which I need to perform a manipulation of data as follows.
>> > Here I report the procedure I need to do for the first 32 observations of
>> > the first period.
>> >
>> >
>> > cbind(VB1d[,1],s1id[,1])
>> >       [,1] [,2]
>> >  [1,]    6    8
>> >  [2,]    9    5
>> >  [3,]   NA    1
>> >  [4,]    5    6
>> >  [5,]   NA    7
>> >  [6,]   NA    2
>> >  [7,]    4    4
>> >  [8,]    2    7
>> >  [9,]    2    7
>> > [10,]   NA    3
>> > [11,]   NA    2
>> > [12,]   NA    4
>> > [13,]    5    6
>> > [14,]    9    5
>> > [15,]   NA    5
>> > [16,]   NA    6
>> > [17,]   10    3
>> > [18,]    7    2
>> > [19,]    2    1
>> > [20,]   NA    7
>> > [21,]    7    2
>> > [22,]   NA    8
>> > [23,]   NA    4
>> > [24,]   NA    5
>> > [25,]   NA    6
>> > [26,]    2    1
>> > [27,]    4    4
>> > [28,]    6    8
>> > [29,]   10    3
>> > [30,]   NA    3
>> > [31,]   NA    8
>> > [32,]   NA    1
>> >
>> >
>> > In column s1id, I have numbers from 1 to 8, which are the ids of 8 groups,
>> > randomly mixed in the larger group of 32.
>> > For each group, I want the value that is reported for only two group
>> > members to be assigned to all four group members.
>> >
>> > For example, the value 8 in the first row, second column, is group 8. The
>> > value of the variable VB1d for group 8 is 6. At row 28, again for s1id
>> > equal to 8, I have 6.
>> > But in row 22, where the second variable is also 8, the value reported is NA.
>> > It is the same in each group: only two values have the correct number; the
>> > other two are NA.
>> > I need each group, identified by the values of the variable s1id, to
>> > correctly report the value of the variable VB1d that is present for just two
>> > group members.
>> >
>> > I hope my explanation is acceptable.
>> > The task appears complex to me right now, especially because I will need to
>> > multiply this procedure for x12x14 similar databases.
>> >
>> > Has anyone ever encountered a similar problem?
>> > Thanks in advance for any help provided.
>> >
>> > ----------------------------------
>> >
>> > Francesca Pancotto
>> >
>> > Associate Professor Political Economy
>> >
>> > University of Modena, Largo Santa Eufemia, 19, Modena
>> >
>> > Office Phone: +39 0522 523264
>> >
>> > Web: https://sites.google.com/view/francescapancotto/home
>> >
>> > ----------------------------------
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > https://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.