Hello,
I have uploaded a csv file that looks like this:
> gc
alpha_id beta_id
1 142053 1
2 9454 1
3 295618 2
4 42691 2
5 389224 3
6 9455 3
The alpha_id contains 310660 unique values and the beta_id contain
Hi,
I have two columns with data (both identifiers - it's an affiliation list)
and I would like to delete the rows in which the observations in the second
column have a frequency < 5 in the entire second column. Example:
1 a
1 b
1 c
2 a
2 b
2 d
Let's say, I would like to
Hi Phil,
That worked perfectly! Thanks
Mathijs
--
View this message in context:
http://r.789695.n4.nabble.com/Delete-observations-with-a-frequency-x-tp3081226p3081264.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.
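For readers landing on this thread: the row filter asked for above (drop rows whose second-column value occurs fewer than a threshold number of times) can be sketched in base R as follows. This is an illustration only, not necessarily the solution Phil posted; the data frame `aff`, its column names `id` and `grp`, and the threshold `min_freq` are assumptions made for the example.

```r
# Hypothetical affiliation list mirroring the example in the question.
aff <- data.frame(
  id  = c(1, 1, 1, 2, 2, 2),
  grp = c("a", "b", "c", "a", "b", "d"),
  stringsAsFactors = FALSE
)

min_freq <- 2  # the question uses 5; 2 keeps this toy example non-empty

# For every row, count how often its grp value occurs in the whole column.
freq <- ave(seq_along(aff$grp), aff$grp, FUN = length)

# Keep only the rows whose grp value is frequent enough.
kept <- aff[freq >= min_freq, ]
print(kept)
```

Because `ave` returns one count per row, the filter is a single logical subset and needs no explicit loop.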
Hi,
I have a dataset (CSV) with some counts of firms located around the globe.
Each count is assigned to the longitude and latitude of the specific
location. Now I want to plot these counts on a world map using dots (size of
dots represent the count). I have been unable to find any info on whethe
Thanks for the suggestions, but I am not there yet (I'm a real novice). In
the code provided by Patrick (see below), I changed the shape input (from
sids to world) which I downloaded here:
http://thematicmapping.org/downloads/world_borders.php. As a result I also
need to change the "CNTY_ID" and "
Hi Patrick,
Thanks! That worked perfectly!
M
--
View this message in context:
http://r.789695.n4.nabble.com/Projecting-data-on-a-world-map-using-long-lat-tp3081298p3085834.html
Hi,
I have a postgresql and a mysql database and I would like to combine the
info from two different tables in R. Both databases contain a table with
three columns: project_name, release_id and release_date. So each project
output could be released multiple times (I am interested in the first
rel
ames = list(unique(DF$B), names(DF)[-2:-3]))[, -1]), 1))
names(newDF) <- time(z)
lapply(newDF, function(mat) tcrossprod(mat / sqrt(rowSums(mat^2
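The last line above is cut off in this archive view. As a general note, dividing a matrix by the square root of its row sums of squares and passing the result to `tcrossprod` is a common way to obtain cosine similarities between rows. A minimal sketch of that idiom, using a made-up matrix `mat` (not data from the thread):

```r
# Hypothetical 3 x 2 matrix; each row is treated as a vector.
mat <- rbind(a = c(1, 0), b = c(1, 1), c = c(0, 2))

# Scale every row to unit length (the vector of norms recycles down the rows).
unit_rows <- mat / sqrt(rowSums(mat^2))

# Inner products of unit-length rows are cosine similarities.
cos_sim <- tcrossprod(unit_rows)
print(round(cos_sim, 3))
```

The diagonal is 1 by construction, and orthogonal rows (here `a` and `c`) get a similarity of 0.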
Gabor Grothendieck wrote:
>
> On Sat, Apr 9, 2011 at 5:14 AM, mathijsdevaan
> <mathijsdev...@gmail.com> wrote:
>> Hi,
nstead of C ~ B
sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA
r <- rollapply(aa, 3, sum.na, align = "right", partial = TRUE)
Thanks!
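A note on the `sum.na` helper quoted above: it differs from a plain `sum(x, na.rm = TRUE)` only when every value in the window is missing. A small base-R check, with invented input vectors, illustrates the difference without needing zoo:

```r
# Same helper as in the message above.
sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA

all_missing    <- c(NA_real_, NA_real_, NA_real_)  # hypothetical window
partly_missing <- c(1, NA, 2)                      # hypothetical window

plain_all      <- sum(all_missing, na.rm = TRUE)  # plain sum reports 0
helper_all     <- sum.na(all_missing)             # helper keeps NA
helper_partial <- sum.na(partly_missing)          # NAs are skipped as usual
```

Returning NA for an all-missing window keeps "no data" distinguishable from a true total of zero.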
Gabor Grothendieck wrote:
>
> On Wed, Apr 27, 2011 at 2:03 PM, mathijsdevaan
> <mathijsdev...@gmail.com>
Hi list,
Can anyone tell me why the following does not work? Thanks a lot! Your help
is very much appreciated.
DF = data.frame(read.table(textConnection("B C D E F G
8025 1995 0 4 1 2
8025 1997 1 1 3 4
8026 1995 0 7 0 0
8026 1996 1 2 3 0
8026 1997 1 2 3 1
8026 1
Great, thanks! Still need to figure out all these functions... ;)
--
View this message in context:
http://r.789695.n4.nabble.com/For-loop-and-sqldf-tp3484559p3484715.html
Hi,
I have a dataset with info on individuals (B) that have been involved in
projects (A) during multiple years (C). The dataset contains three columns:
A, B, C. Example:
A B C
1 1 a 1999
2 1 b 1999
3 1 c 1999
4 2 c 2001
5 2 d 2001
6 3 a 2004
7 3 b 2004
I am interested in the
That worked great! Thanks!
--
View this message in context:
http://r.789695.n4.nabble.com/Data-manipulation-tp3302717p3303001.html
Hi,
I have a large dataset with info on individuals (B) that have been involved
in projects (A) during multiple years (C). The dataset contains three
columns: A, B, C. Example:
A B C
1 1 a 1999
2 1 b 1999
3 1 c 1999
4 1 d 1999
5 2 c 2001
6 2 d 2001
7 3 a 2004
8 3 c 2004
9
Hi,
I have a data frame containing two columns:
x<-as.factor(c('a','a','a','a','a','b','b','b','c','d','d','d'))
y<-c(1,3,6,8,12,3,4,7,5,6,7,10)
X<-data.frame(x,y)
X
x y
1 a 1
2 a 3
3 a 6
4 a 8
5 a 12
6 b 3
7 b 4
8 b 7
9 c 5
10 d 6
11 d 7
12 d 10
I would like to add
Thanks for your quick response. The solution that you propose comes close to
what I want; however, by using 'FUN = order', the new variable is based on
the row order rather than on y < focal y. In the case of two identical rows
(Xrow1=Xrow2 and Yrow1=Yrow2), the value of Zrow1 < Zrow2. Any solutions?
Thanks!
>Here is a method that works despite generating a warning:
>cbind(X, z = ave(X$y, X$x, FUN = seq) - 1)
>
>David Winsemius, MD
>West Hartford, CT
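One reading of the request above is that z should count, within each level of x, how many y values are strictly smaller than the focal y, so that tied rows share the same z. Under that assumption (it is an interpretation, not something the thread confirms), `rank` with `ties.method = "min"` gives the count directly. The data below reuse the tied example from the follow-up message:

```r
x <- as.factor(c('a','a','a','a','a','b','b','b','c','d','d','d'))
y <- c(1, 3, 6, 8, 8, 3, 4, 7, 5, 6, 7, 10)
X <- data.frame(x, y)

# Within each level of x: the minimum rank minus 1 equals the number of
# values strictly below the focal y, and tied values get the same result.
X$z <- ave(X$y, X$x, FUN = function(v) rank(v, ties.method = "min") - 1)
print(X)
```

Unlike an approach based on `seq` or `order`, this does not depend on the rows being sorted by y within each group.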
I was happy a bit too early. There's still an error:
x<-as.factor(c('a','a','a','a','a','b','b','b','c','d','d','d'))
y<-c(1,3,6,8,8,3,4,7,5,6,7,10)
y <- c(1,3,6,8,8,3,4,7,5,6,7,10)
> X <- data.frame(x, y)
>
> cbind(X, z = ave(X$y, X$x,
> FUN = function (x) match(x, unique(x)) - 1))
>
>
> I hope it helps.
>
> Best,
> Dimitris
>
>
> On 2/17/2011 11:15 AM, mathijsdevaan wrote:
>>
>>>
d run it step by step to see how each part works. You
>will need some time to read the vignettes and ?data.table
>(which has recently been improved) but I hope you think it is
>worth it. Support is available at maintainer("data.table").
>HTH
>Matthew
>>On Mo
OK, for the last step I have tried this (among other things):
library(data.table)
DT = data.table(read.table(textConnection("A B C
1 1 a 1999
2 1 b 1999
3 1 c 1999
4 1 d 1999
5 2 c 2001
6 2 d 2001
7 3 a 2004
8 3 b 2004
9 3 d 2004"),head=TRUE,stringsAsFactors=FALSE))
Hi,
I have a DF like this:
DF = data.frame(read.table(textConnection("A B C
1 b1 1999 0.25
2 c1 1999 0.25
3 d1 1999 0.25
4 a2 1999 0.25
5 c2 1999 0.25
6 d2 1999 0.25
7 a3 1999 0.25
8 b3 1999 0.25
9 d3 1999 0.25
10 a4 1999 0.25
11 b4 1999 0.25
12 c4 1999 0.25
13 b1
C)-x$C})
DF$D = unlist(by(DF,DF$group, FUN = function(x){cumsum(x$C)}))
DF$D = DF$D-DF$C
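The two `DF$D` lines above compute, for each row, the sum of `C` over the earlier rows of the same group. A compact alternative is `ave` with `cumsum`, which writes results back in the original row order and so does not rely on the data being sorted by group. The toy data below are invented; only the column names `group`, `C` and `D` come from the snippet.

```r
# Hypothetical data, deliberately not sorted by group.
DF <- data.frame(
  group = c("a", "b", "a", "b", "a"),
  C     = c(0.25, 0.5, 0.25, 0.5, 0.25)
)

# Running total within group, minus the current value,
# leaves the sum of the earlier rows of that group.
DF$D <- ave(DF$C, DF$group, FUN = cumsum) - DF$C
print(DF)
```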
Dieter Menne wrote:
>
>
> mathijsdevaan wrote:
>>
>> I have a DF like this:
>>
>> DF = data.frame(read.table(textConnection("A B C
>> 1 b1 1999 0.25
>>
I am still struggling (I'm an R novice). Basically I just want to sum the
values per group if the year condition is met. I have the feeling that using
a loop would work, but I am not really familiar with loops. Something like
this?
for(DF$C in 1:length(DF$C))
{
DF<-which(DF$year
>
> I think you may have said you have large data. If so, this
> method should be fast. Please let us know how you get on.
>
> HTH
> Matthew
>
>
>
> On Thu, 17 Feb 2011 23:07:19 -0800, mathijsdevaan wrote:
>
>> OK, for the last step I have tried this (among
The output for the new example should be:
project v
1 0
2 0.5
3 1.5
4 0.5
The output you calculated was correct for the v per year, but the v per
group would be incorrect. I think the problem lies in the fact that
expand.grid(B,B) doesn't take into account that combinations of B can only
be
Gabor, that worked great! I was unaware of the sqldf package, but it is great
for data manipulation. Thanks!
--
View this message in context:
http://r.789695.n4.nabble.com/Re-Transforming-relational-data-tp3307449p3320067.html
Thanks Matthew that worked great. What a great forum this is: I am learning a
lot!
PS. I am now running both solutions on two similar computers. Let's see
which is fastest.
--
View this message in context:
http://r.789695.n4.nabble.com/Re-Transforming-relational-data-tp3307449p3321214.html
Hi,
I have two questions:
1. How do I combine "DF$F =" and "DF$G =" into one function? (The original
dataset contains many more columns for which I want to execute the same
operation)
2. How do I improve the ave function so that the value DF(12,G) = 0 instead
of 1 (see bold font)? Both DF(12,B)=
0 1 1 2
Thanks,
Mathijs
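On the first question in this thread (applying the same grouped operation to many columns without writing one assignment per column), one possible pattern is to `lapply` over the columns. The actual operation is cut off in this archive view, so a within-group `cumsum` is used purely as a stand-in, and the data frame and column names below are invented for the sketch.

```r
# Hypothetical data: A is the grouping column, D and E are the inputs.
DF <- data.frame(
  A = c(1, 1, 2, 2),
  D = c(1, 0, 1, 1),
  E = c(0, 1, 0, 1)
)

src_cols <- c("D", "E")  # columns to transform
new_cols <- c("F", "G")  # names for the results, in the same order

# Apply the same grouped function to every source column in one step.
DF[new_cols] <- lapply(DF[src_cols], function(v) ave(v, DF$A, FUN = cumsum))
print(DF)
```

Adding more columns then only means extending `src_cols` and `new_cols`.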
Joshua Wiley-2 wrote:
>
> On Wed, Feb 23, 2011 at 8:32 AM, mathijsdevaan
> wrote:
>> Hi,
>>
>> I have two questions:
>> 1. How do I combine "DF$F =" and "DF$G =" into one function? (The
>> original
>> data
Problem solved:
DF = data.frame(read.table(textConnection("A B C D E
1 1 a 1999 1 0
2 1 b 1999 0 1
3 1 c 1999 0 1
4 1 d 1999 1 0
5 2 c 2001 1 0
6 2 d 2001 0 1
7 3 a 2004 0 1
8 3 b 2004 0 1
9 3 d 2004 0 1
10 4 b 2001 1 0
11 4 c 2001 1
Hi, I am running the following script for a different (much larger) data
frame:
DF = data.frame(read.table(textConnection("A B C D E
1 1 a 1999 1 0
2 1 b 1999 0 1
3 1 c 1999 0 1
4 1 d 1999 1 0
5 2 c 2001 1 0
6 2 d 2001 0 1
7 3 a 2004 0 1
8 3 b 2004 0
I simply don't understand why I get this error when using a larger dataset.
Error in `[<-.data.frame`(`*tmp*`, i, , value = integer(0)) :
replacement has 0 items, need 37597770
In addition: Warning message:
In max(i) : no non-missing arguments to max; returning -Inf
Any ideas on what this
Mean doesn't work either... I understand that the message "replacement has 0
items, need 37597770" implies that the function is not returning any values,
but I don't understand why then this is not the case in the example.
DF = data.frame(read.table(textConnection("A B C D E
1 1 a 1999
Hi,
I have a data.frame of the following type:
F = data.frame(read.table(textConnection("A B
1 1 4
2 1 3
3 1 1
4 1 4
5 1 2
6 1 2
7 1 2
8 2 1
9 2 1
10 2 1
11 2 1
12 3 2
13 3 4
14 3 1
15 3 1
16 3 1"),head=TRUE,stringsAsFactors=FALSE))
F
A B
1 1 4
2 1 3
3 1 1
Thanks Gabor, that worked great!
Gabor Grothendieck wrote:
>
> On Thu, Mar 10, 2011 at 11:27 AM, mathijsdevaan
> <mathijsdev...@gmail.com> wrote:
>> Hi,
>>
>> I have a data.frame of the following type:
>>
>> F = data.frame(read.table(textConnection("
Hi,
I am trying to calculate Principal Component Scores per id per year using
the psych package. The following lines provide the scores per observation
pca = data.frame(read.table(textConnection("id year A B C D
1001 1972 64 56 14 23
1003 1972 60 55
Hi,
I have a data frame listing US counties and a quantity ("number") per county
and I have a shapefile of the US with county IDs. I would like to plot the
"number" variable on a map (in the shapefile) using a color range per county
(e.g. white = min(number) = 2, black = max(number) = 15). Can a
Hi,
I need to perform calculations on subsets of a data frame:
DF = data.frame(read.table(textConnection("A B C D E F
1 a 1995 0 4 1
2 a 1997 1 1 3
3 b 1995 3 7 0
4 b 1996 1 2 3
5 b 1997 1 2 3
6 b 1998 6 0 0
7 b 1999 3 7 0
8 c 1997 1 2 3
9 c 1998 1 2
Solved the problem: I guess I was still using the main version of zoo. Thanks
again!
--
View this message in context:
http://r.789695.n4.nabble.com/Yearly-aggregates-and-matrices-tp3438140p3441723.html
Hi list, I would like to use the following data.frame to generate matrices
over a 3 year moving window:
DF = data.frame(read.table(textConnection(" A B C
80 8025 1995
80 8026 1995
80 8029 1995
81 8026 1996
82 8025 1997
82 8026 1997
83 8025 1997
83 8027 1997
84 8026 1999
84 80
Thanks a lot!
Given the calculations below, I would like to generate a data.frame in which
the value of case 8029 (i) in 1998 = sum (element c(i,j) * element
final1(1,j) * 1st eigenvalue 1998 * element final2 (i,j)) * (1/sum
final2(i)) in which j are all other cases in year 1998 and i IS NOT j?
so
As a follow up on this post, I am trying to slightly adjust the solution
kindly provided by Gabor. However, I am getting some results that I do not
understand. Example:
# devel version of zoo
install.packages("zoo", repos = "http://r-forge.r-project.org")
library(zoo)
DF1 = data.frame(read.tab
Thanks for clarifying that.
Best
Gabor Grothendieck wrote:
>
> On Wed, Apr 20, 2011 at 5:49 AM, mathijsdevaan
> <mathijsdev...@gmail.com> wrote:
>> As a follow up on this post, I am trying to slightly adjust the solution
>> kindly provided by Gabor. However, I am get
43 matches