Richard, it is indeed possible for different languages to choose different approaches. If your point is that an R named list can simulate a Python dictionary (or, for that matter, a set), there is some validity to that. You can also use environments similarly. Arguably there are differences, including in things like which notations are built into the language.

If you look the other way, Python chose to make lists a major feature: they can hold any combination of things and can even be used to emulate a matrix with sub-lists. It also had a tuple version that is similar but immutable, and it initially neglected something as simple as a vector containing just one kind of content. If you look at it now, many people simply load numpy (and often pandas) to get functionality that is faster and that comes by default in R.

I think this discussion was about my (amended) offhand remark suggesting that R factors stored plain text in a vector attached to the variable, and that the number stored in the main factor vector was the offset into it. If that changed to internally use something hashed like a dictionary, fine.

I have often made data structures such as the one in your example to store named items, but I did not call it a dictionary, simply a named list. In one sense, the two map into each other, but I could argue there remain differences. For example, you can use something immutable like a tuple as a key in Python.

This is not an argument about which language is better. Each has developed to fill its own needs and has been extended, and quite a few things can now be done in either one. Still, it can be interesting to combine the two inside RStudio so each does some of what it may do better or faster or in a way you find more natural.
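To make the Python side of that comparison concrete, here is a minimal sketch. The variable names and sample values are made up for illustration and do not come from the thread. It shows a list of lists standing in for a matrix, and a dictionary keyed by tuples, which is the case where a named R list (whose names are strings) is not an exact match.

```python
# Hypothetical sample data; names and values are illustrative only.

# A list can mix types, and nested lists can stand in for a matrix.
mixed = [1, "two", 3.0, [4, 5]]
matrix = [[1, 2],
          [3, 4]]
bottom_left = matrix[1][0]   # row index 1, column index 0

# A tuple is like a list but immutable, so it is hashable
# (as long as its elements are) and can serve as a dictionary key.
cell_labels = {
    (0, 0): "top left",
    (1, 0): "bottom left",
}
cell_labels[(0, 1)] = "top right"   # add an entry
del cell_labels[(0, 0)]             # remove an entry

# A list is mutable and therefore unhashable: using one as a key fails.
try:
    cell_labels[[1, 1]] = "bottom right"
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

print(bottom_left)           # 3
print(sorted(cell_labels))   # [(0, 1), (1, 0)]
print(list_key_allowed)      # False
```

In R one could probably get a similar effect by pasting the index values into a single string name, but that is a convention layered on top of the list rather than something the language checks.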
From: Richard O'Keefe <rao...@gmail.com>
Sent: Wednesday, June 14, 2023 10:34 PM
To: avi.e.gr...@gmail.com
Cc: Bert Gunter <bgunter.4...@gmail.com>; R-help@r-project.org
Subject: Re: [R] Problem with filling dataframe's column

Consider

    m <- list(foo=c(1,2),"B'ar"=as.matrix(1:4,2,2),"!*#"=c(FALSE,TRUE))

It is a collection of elements of different types/structures, accessible via string keys (and also by position).
Entries can be added:

    m[["fred"]] <- 47

Entries can be removed:

    m[["!*#"]] <- NULL

How much more like a Python dictionary do you need it to be?

On Wed, 14 Jun 2023 at 11:25, <avi.e.gr...@gmail.com> wrote:

Bert,
I stand corrected. What I said may once have been true, but the implementation has apparently changed at some level. I did not factor that in.

Nevertheless, whether you use an index as a key or as an offset into an attached vector of labels, it seems to work the same, and I think my comment applies well enough: changing a few labels instead of scanning lots of entries can sometimes be a good thing. As far as I can tell, the external interface seems the same for now.

One issue with R for a long time was how they did not do something more like a Python dictionary and it looks like … ABOVE

From: Bert Gunter <bgunter.4...@gmail.com>
Sent: Tuesday, June 13, 2023 6:15 PM
To: avi.e.gr...@gmail.com
Cc: javad bayat <j.bayat...@gmail.com>; R-help@r-project.org
Subject: Re: [R] Problem with filling dataframe's column

Below.

On Tue, Jun 13, 2023 at 2:18 PM <avi.e.gr...@gmail.com> wrote:
>
> Javad,
>
> There may be nothing wrong with the methods people are showing you and if it
> satisfied you, great.
>
> But I note you have lots of data in over a quarter million rows. If much of
> the text data is redundant, and you want to simplify some operations such as
> changing some of the values to others in multiple ways, have you done any
> learning about an R feature very useful for dealing with categorical data
> called "factors"?
>
> If you have a vector or a column in a data.frame that contains text, then it
> can be replaced by a factor that often takes way less space as it stores a
> sort of dictionary of all the unique values and just records numbers like
> 1,2,3 to tell which one each item is.

-- This is false. It used to be true a **long time ago**, but R has for quite a while used hashing/global string tables to avoid this problem.
See here <https://stackoverflow.com/questions/50310092/why-does-r-use-factors-to-store-characters> for details/references. As a result, I think many would argue that working with strings *as strings*, not factors, is often a better default, though of course there are still situations where factors are useful (e.g. in ordering results by factor levels where the desired level order is not alphabetical).

**I would appreciate correction/clarification if my claims are wrong or misleading!**

In any case, please do check such claims before making them on this list.

Cheers,
Bert

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.