Javad,
 
There may be nothing wrong with the methods people are showing you, and if 
they satisfy you, great.
 
But I note you have a lot of data, over a quarter million rows. If much of the 
text data is redundant, and you want to simplify operations such as changing 
some of the values to others in multiple ways, have you learned about an R 
feature that is very useful for dealing with categorical data, called 
"factors"?
 
If you have a vector or a column in a data.frame that contains text, it can be 
replaced by a factor, which often takes less space because it stores a sort of 
dictionary of all the unique values and records only integer codes like 1, 2, 3 
to tell which one each item is.
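
As a rough sketch of what that means in practice (the vector below is made up 
for illustration and stands in for a real column such as data2$Layer; how much 
space you actually save depends on your data):

```r
# Hypothetical data: a long character vector with only three distinct values
layer <- rep(c("Level 1", "Level 2", "Level 12"), times = 1000)

# Convert the text to a factor
f <- factor(layer)

levels(f)            # the "dictionary" of unique values
head(as.integer(f))  # the integer codes that point into that dictionary
nlevels(f)           # how many distinct values there are

# Compare the memory used by the two representations
object.size(layer)
object.size(f)
```

Converting back with as.character(f) recovers the original text, so nothing is 
lost by storing the column this way.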
 
You can access the unique values using levels(whatever) and also change them. 
There are packages that make this straightforward, such as forcats, one of the 
tidyverse packages. The tidyverse includes many other tools some find useful, 
but they are beyond the usual scope of this mailing list.
 
As an example, if you have a vector in mydata$col1, then code like this 
converts it to a factor:
 
mydata$col1 <- factor(mydata$col1)
 
However you create the factor, you can now access the levels, make whatever 
changes you want, and save the changes. One example could be to apply a regular 
expression substitution to the levels. There is a family of built-in functions 
related to grep(), such as sub() and gsub(), that match a regular expression 
and replace the match with what you want.
 
This has the same effect as changing every entry, without doing all the work. I 
mean if level 5 used to be "OLD" and is now "NEW", then any of your quarter 
million entries that have a 5 will now be seen as having a value of "NEW".
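
A small sketch of that idea follows. The vector, the level names and the 
replacement values are all made up for illustration, not taken from your data:

```r
# Hypothetical column with a handful of distinct text values
col1 <- factor(c("OLD", "keep", "OLD", "other_Des Section", "keep", "OLD"))

levels(col1)

# Replace one level outright: every element that was "OLD" now reads "NEW"
levels(col1)[levels(col1) == "OLD"] <- "NEW"

# Or edit the levels with a regular expression; sub() runs once per unique
# value rather than once per row
levels(col1) <- sub("_Des Section$", "", levels(col1))

as.character(col1)
```

With a quarter million rows and only a hundred or so levels, the substitution 
touches about a hundred strings instead of every row.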
 
I will stop here and suggest you may want to read a book that explains R as a 
unified set of features, with some emphasis on the features it is designed 
around that can make life easier, rather than only the features it shares with 
most languages. Some of your questions suggest you have little grounding in the 
language and are mainly following recipes you stumble across. 
 
Otherwise, you will end up with a collection of what you call "codes", and what 
others like me call programs, that do not necessarily fit well together.
 
 
-----Original Message-----
From: R-help <r-help-boun...@r-project.org> On Behalf Of javad bayat
Sent: Tuesday, June 13, 2023 3:47 PM
To: Eric Berger <ericjber...@gmail.com>
Cc: R-help@r-project.org
Subject: Re: [R] Problem with filling dataframe's column
 
Dear all;
I used these codes and I get what I wanted.
Sincerely
 
pat = c("Level 12","Level 22","0")
data3 = data2[-which(data2$Layer == pat),]
dim(data2)
[1] 281549      9
dim(data3)
[1] 244075      9
 
On Tue, Jun 13, 2023 at 11:36 AM Eric Berger <ericjber...@gmail.com> wrote:
 
> Hi Javad,
> grep returns the positions of the matches. See an example below.
> 
> > v <- c("abc", "bcd", "def")
> > v
> [1] "abc" "bcd" "def"
> > grep("cd",v)
> [1] 2
> > w <- v[-grep("cd",v)]
> > w
> [1] "abc" "def"
> >
> 
> 
> On Tue, Jun 13, 2023 at 8:50 AM javad bayat <j.bayat...@gmail.com> wrote:
> >
> > Dear Rui;
> > Hi. I used your codes, but it seems it didn't work for me.
> >
> > > pat <- c("_esmdes|_Des Section|0")
> > > dim(data2)
> >     [1]  281549      9
> > > grep(pat, data2$Layer)
> > > dim(data2)
> >     [1]  281549      9
> >
> > What does grep function do? I expected the function to remove 3 rows of
> the
> > dataframe.
> > I do not know the reason.
> >
> >
> >
> >
> >
> >
> > On Mon, Jun 12, 2023 at 5:16 PM Rui Barradas <ruipbarra...@sapo.pt> wrote:
> >
> > > At 23:13 on 12/06/2023, javad bayat wrote:
> > > > Dear Rui;
> > > > Many thanks for the email. I tried your codes and found that the
> length
> > > of
> > > > the "Values" and "Names" vectors must be equal, otherwise the results
> > > will
> > > > not be useful.
> > > > For some of the characters in the Layer column that I do not need to
> be
> > > > filled in the LU column, I used "NA".
> > > > But I need to delete some of the rows from the table as they are
> useless
> > > > for me. I tried this code to delete entire rows of the dataframe
> which
> > > > contained these three value in the Layer column: It gave me the
> following
> > > > error.
> > > >
> > > >> data3 = data2[-grep(c("_esmdes","_Des Section","0"), data2$Layer),]
> > > >       Warning message:
> > > >        In grep(c("_esmdes", "_Des Section", "0"), data2$Layer) :
> > > >        argument 'pattern' has length > 1 and only the first element
> will
> > > be
> > > > used
> > > >
> > > >> data3 = data2[!grepl(c("_esmdes","_Des Section","0"), data2$Layer),]
> > > >      Warning message:
> > > >      In grepl(c("_esmdes", "_Des Section", "0"), data2$Layer) :
> > > >      argument 'pattern' has length > 1 and only the first element
> will be
> > > > used
> > > >
> > > > How can I do this?
> > > > Sincerely
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > > On Sun, Jun 11, 2023 at 5:03 PM Rui Barradas <ruipbarra...@sapo.pt> wrote:
> > > >
> > > > >> At 13:18 on 11/06/2023, Rui Barradas wrote:
> > > > >>> At 22:54 on 11/06/2023, javad bayat wrote:
> > > >>>> Dear Rui;
> > > >>>> Many thanks for your email. I used one of your codes,
> > > >>>> "data2$LU[which(data2$Layer == "Level 12")] <- "Park"", and it
> works
> > > >>>> correctly for me.
> > > >>>> Actually I need to expand the codes so as to consider all
> "Levels" in
> > > >> the
> > > >>>> "Layer" column. There are more than hundred levels in the Layer
> > > column.
> > > >>>> If I use your provided code, I have to write it hundred of time as
> > > >> below:
> > > >>>> data2$LU[which(data2$Layer == "Level 1")] <- "Park";
> > > >>>> data2$LU[which(data2$Layer == "Level 2")] <- "Agri";
> > > >>>> ...
> > > >>>> ...
> > > >>>> ...
> > > >>>> .
> > > >>>> Is there any other way to expand the code in order to consider
> all of
> > > >> the
> > > >>>> levels simultaneously? Like the below code:
> > > >>>> data2$LU[which(data2$Layer == c("Level 1","Level 2", "Level 3",
> ...))]
> > > >> <-
> > > >>>> c("Park", "Agri", "GS", ...)
> > > >>>>
> > > >>>>
> > > >>>> Sincerely
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>>
> > > > >>>> On Sun, Jun 11, 2023 at 1:43 PM Rui Barradas <ruipbarra...@sapo.pt> wrote:
> > > >>>>
> > > > >>>>> At 21:05 on 11/06/2023, javad bayat wrote:
> > > >>>>>> Dear R users;
> > > >>>>>> I am trying to fill a column based on a specific value in
> another
> > > >>>>>> column
> > > >>>>> of
> > > >>>>>> a dataframe, but it seems there is a problem with the codes!
> > > >>>>>> The "Layer" and the "LU" are two different columns of the
> dataframe.
> > > >>>>>> How can I fix this?
> > > >>>>>> Sincerely
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> for (i in 1:nrow(data2$Layer)){
> > > >>>>>>              if (data2$Layer == "Level 12") {
> > > >>>>>>                  data2$LU == "Park"
> > > >>>>>>                  }
> > > >>>>>>              }
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>> Hello,
> > > >>>>>
> > > >>>>> There are two bugs in your code,
> > > >>>>>
> > > >>>>> 1) the index i is not used in the loop
> > > >>>>> 2) the assignment operator is `<-`, not `==`
> > > >>>>>
> > > >>>>>
> > > >>>>> Here is the loop corrected.
> > > >>>>>
> > > > >>>>> for (i in seq_along(data2$Layer)){
> > > >>>>>      if (data2$Layer[i] == "Level 12") {
> > > >>>>>        data2$LU[i] <- "Park"
> > > >>>>>      }
> > > >>>>> }
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > > >>>>> But R is a vectorized language, the following two ways are the
> > > > >>>>> idiomatic ways of doing what you want to do.
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> i <- data2$Layer == "Level 12"
> > > >>>>> data2$LU[i] <- "Park"
> > > >>>>>
> > > >>>>> # equivalent one-liner
> > > >>>>> data2$LU[data2$Layer == "Level 12"] <- "Park"
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> If there are NA's in data2$Layer it's probably safer to use
> ?which()
> > > in
> > > >>>>> the logical index, to have a numeric one.
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> i <- which(data2$Layer == "Level 12")
> > > >>>>> data2$LU[i] <- "Park"
> > > >>>>>
> > > >>>>> # equivalent one-liner
> > > >>>>> data2$LU[which(data2$Layer == "Level 12")] <- "Park"
> > > >>>>>
> > > >>>>>
> > > >>>>> Hope this helps,
> > > >>>>>
> > > >>>>> Rui Barradas
> > > >>>>>
> > > >>>>
> > > >>>>
> > > >>> Hello,
> > > >>>
> > > >>> You don't need to repeat the same instruction 100+ times, there is
> a
> > > way
> > > >>> of assigning all new LU values at the same time with match().
> > > >>> This assumes that you have the new values in a vector.
> > > >>
> > > >> Sorry, this is not clear. I mean
> > > >>
> > > >>
> > > >> This assumes that you have the new values in a vector, the vector
> Names
> > > >> below. The vector of values to be matched is created from the data.
> > > >>
> > > >>
> > > >> Rui Barradas
> > > >>
> > > >>>
> > > >>>
> > > >>> Values <- sort(unique(data2$Layer))
> > > >>> Names <- c("Park", "Agri", "GS")
> > > >>>
> > > >>> i <- match(data2$Layer, Values)
> > > >>> data2$LU <- Names[i]
> > > >>>
> > > >>>
> > > >>> Hope this helps,
> > > >>>
> > > >>> Rui Barradas
> > > >>>
> > > >>> ______________________________________________
> > > >>>  <mailto:R-help@r-project.org> R-help@r-project.org mailing list -- 
> > > >>> To UNSUBSCRIBE and more, see
> > > >>>  <https://stat.ethz.ch/mailman/listinfo/r-help> 
> > > >>> https://stat.ethz.ch/mailman/listinfo/r-help
> > > >>> PLEASE do read the posting guide
> > > >>>  <http://www.R-project.org/posting-guide.html> 
> > > >>> http://www.R-project.org/posting-guide.html
> > > >>> and provide commented, minimal, self-contained, reproducible code.
> > > >>
> > > >>
> > > >
> > > Hello,
> > >
> > > Please cc the r-help list, R-Help is threaded and this can in the future
> > > be helpful to others.
> > >
> > > You can combine several patterns like this:
> > >
> > >
> > > pat <- c("_esmdes|_Des Section|0")
> > > grep(pat, data2$Layer)
> > >
> > > or, programmatically,
> > >
> > >
> > > pat <- paste(c("_esmdes","_Des Section","0"), collapse = "|")
> > >
> > >
> > > Hope this helps,
> > >
> > > Rui Barradas
> > >
> > >
> >
> > --
> > Best Regards
> > Javad Bayat
> > M.Sc. Environment Engineering
> > Alternative Mail:  <mailto:bayat...@yahoo.com> bayat...@yahoo.com
> >
> 
 
 
-- 
Best Regards
Javad Bayat
M.Sc. Environment Engineering
Alternative Mail:  <mailto:bayat...@yahoo.com> bayat...@yahoo.com
 
