[R] data.frame, converting row data to columns

2009-04-04 Thread ds

I have a data frame something like:
   name wrist nLevel       emot
1  4094  3.34      1 frustrated
2  4094  3.94      1 frustrated
3  4094    NA      1 frustrated
4  4094  3.51      1 frustrated
5  4094  3.81      1 frustrated
6  4101  2.62      4    excited
7  4094  2.65      1 frustrated
8  4101    NA      4    excited
9  4101  0.24      4    excited
10 4101  0.23      4    excited

I am trying to change it to this:

  name nLevel       emot   w1   w2   w3   w4   w5   w6
  4094      1 frustrated 3.34 3.94   NA 3.51 3.81 2.65
  4101      4    excited 2.62   NA 0.24 0.23   NA   NA

The nLevel and emot will never vary with the name, so there can be one  
row per name.  But I need the wrist measurements to be in the same  
row.  The number of wrist measures is variable, so I could fill in  
with NAs.  But I really just need help with reshaping the data frame.

I think I had some success with the melt

x = melt.data.frame(bsub, id.vars=c("name","nLevel","emot"), measure.vars=c("wrist"))

But I can't figure out the cast to get the wrist values in the rows.
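
One possible way to get there without melt/cast is base R's reshape(). The sketch below is only an illustration under assumptions: it rebuilds the example data by hand as `bsub`, adds a helper column `idx` (a made-up name) that numbers the measurements within each name, and then spreads `wrist` across columns.

```r
# Example data from the post, typed in by hand
bsub <- data.frame(
  name   = c(4094, 4094, 4094, 4094, 4094, 4101, 4094, 4101, 4101, 4101),
  wrist  = c(3.34, 3.94, NA, 3.51, 3.81, 2.62, 2.65, NA, 0.24, 0.23),
  nLevel = c(1, 1, 1, 1, 1, 4, 1, 4, 4, 4),
  emot   = c(rep("frustrated", 5), "excited", "frustrated", rep("excited", 3))
)

# Hypothetical helper column: running count of measurements within each name
bsub$idx <- ave(seq_along(bsub$name), bsub$name, FUN = seq_along)

# One row per name; wrist values spread over wrist1, wrist2, ...
wide <- reshape(bsub, idvar = c("name", "nLevel", "emot"),
                timevar = "idx", v.names = "wrist",
                direction = "wide", sep = "")

# Rename wrist1..wristN to w1..wN to match the layout asked for
names(wide) <- sub("^wrist", "w", names(wide))
wide
```

Names with fewer measurements are padded with NA automatically, which matches the fill-in behaviour described above.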


Thanks

ds


David H. Shanabrook
dhsha...@acad.umass.edu
256-1019 (home)





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] question on lm or glm matrix of coeficients X test data terms

2008-07-07 Thread DS

Hi,

  is there an easy way to get the calculated weights in a regression equation?



For example, if my model has 2 variables, var1 and var2, with coefficients .05 and .6,
how can I get the computed value for each coefficient on a test dataset?

data

var1,var2
10,100

So I want to get .5, 60 back in a vector.  This is a one-row example, but I 
would want to get a matrix of multiplied-out coefficients and terms for use in 
comparing the contribution of each variable to the final score, as in a scorecard using 
logistic regression.
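
A minimal sketch of the multiplication being asked about, assuming the coefficients are held in a named vector (`betas` and `test` are illustrative names, not from the post): sweep() multiplies each column of the data by its matching coefficient and leaves the products unsummed.

```r
# Hypothetical coefficients and test data; the first row is the example above
betas <- c(var1 = 0.05, var2 = 0.6)
test  <- data.frame(var1 = c(10, 20), var2 = c(100, 50))

# Multiply every column by its coefficient; one row of products per test row
contrib <- sweep(as.matrix(test[, names(betas)]), 2, betas, `*`)
contrib
```

Selecting the columns by `names(betas)` keeps the coefficients aligned with the right variables even if the data has extra columns.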



Please advise.

thanks

Dhruv



Re: [R] question on lm or glm matrix of coeficients X test data terms

2008-07-07 Thread DS

thanks Jorge.  I appreciate your quick help.



Will this work if I have 20 columns of data but my regression only has 5 
variables?

I am looking for something generic where I can give it my model and test data 
and get back a vector of the multiplied coefficients (with no hard coding).  
When predict is called with a model and data, R must be multiplying each 
coefficient by its variable and summing the results, but is there a way to get 
the components of the regression terms stored in a matrix before they are added?

The idea is to build n models with various terms and, after producing a 
prediction, list the top 3 variables that had the biggest impact for that 
particular set of predictor values.

E.g. if I build a model to predict loan defaults, I would then need to list 
the top factors in the model that can be used to explain why the loan is risky. 
With 10-16 variables, which can be present or not for each case, a 
different 2 or 3 variables may have led to each prediction.
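
For what it is worth, predict() on an lm or glm fit accepts type = "terms", which returns the per-term pieces before they are summed. The sketch below uses simulated data and made-up names (`dat`, `fit`, `newdat`); note that the terms are centred on the training means, with the offset stored in the "constant" attribute.

```r
set.seed(1)
# Hypothetical data: 20 candidate columns, only five used by the model
dat <- as.data.frame(matrix(rnorm(50 * 20), ncol = 20))
names(dat) <- paste0("x", 1:20)
dat$y <- 1 + 2 * dat$x1 - 0.5 * dat$x4 + rnorm(50)

fit <- lm(y ~ x1 + x4 + x7 + x9 + x12, data = dat)

newdat <- dat[1:5, ]   # stand-in for a test set
tm <- predict(fit, newdata = newdat, type = "terms")

# One column per model term; unused columns of newdat are ignored.
# Row sums plus the "constant" attribute give back the usual prediction.
check <- all.equal(unname(rowSums(tm) + attr(tm, "constant")),
                   unname(predict(fit, newdata = newdat)))
check
```

Because of the centring, a term's value is its contribution relative to an average case rather than the raw coefficient times the variable.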



Dhruv





 --- On Mon 07/07, Jorge Ivan Velez < [EMAIL PROTECTED] > wrote:
From: Jorge Ivan Velez [mailto: [EMAIL PROTECTED]]
To: [EMAIL PROTECTED]
Date: Mon, 7 Jul 2008 20:12:53 -0400
Subject: Re: [R] question on lm or glm matrix of coeficients X test data terms

Dear Dhruv,

Try also:

# data set
set.seed(123)
X=matrix(rpois(10,10),ncol=2)

# Function to estimate your outcome
outcome=function(x,betas){
  if(length(x)!=length(betas)) stop("x and betas are of different length!")
  y=x*betas
  y
}

# outcome for beta1=0.05 and beta2=0.6
t(apply(X,1,outcome,betas=c(0.05,0.6)))

# outcome for beta1=5 and beta2=6
t(apply(X,1,outcome,betas=c(5,6)))

HTH,
Jorge

On Mon, Jul 7, 2008 at 7:56 PM, DS <[EMAIL PROTECTED]> wrote:





Re: [R] question on lm or glm matrix of coeficients X test data terms

2008-07-07 Thread DS

thanks Jorge.  I appreciate your multiple improvements.



This still involves hard coding the coefficients.  I wonder if this is what 
glm and lm are doing.

For example:

m<-lm(K~a+b,data=data)

I assume m$coefficients would have 0 for all variables except a and b, and then R must be 
multiplying the weights the same way as your function.
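
A quick way to check that assumption is to look at the fitted object. In the sketch below (simulated data; `dat`, `z` and `contrib` are made-up names) the coefficient vector holds entries only for the intercept and the terms in the formula, and multiplying the design matrix by it column by column gives the unsummed pieces.

```r
set.seed(2)
# Hypothetical data with one more column (z) than the model uses
dat <- data.frame(K = rnorm(30), a = rnorm(30), b = rnorm(30), z = rnorm(30))
m <- lm(K ~ a + b, data = dat)

# Only "(Intercept)", "a" and "b" appear; z has no entry at all
names(coef(m))

# Design-matrix columns times their coefficients, before summing
mm <- model.matrix(delete.response(terms(m)), dat)
contrib <- sweep(mm, 2, coef(m), `*`)
head(contrib)
```

Summing each row of `contrib` reproduces the fitted values, so the matrix is the prediction split into its components.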



I will try to use your code with the coefficients matrix from the model and see 
if that works and report back what I find tomorrow.



Then if I can add code to return the names of the columns with the resulting 
highest 3 values of the numbers then I should be done.



thanks a lot Jorge.



regards,



Dhruv







 --- On Mon 07/07, Jorge Ivan Velez < [EMAIL PROTECTED] > wrote:
From: Jorge Ivan Velez [mailto: [EMAIL PROTECTED]]
To: [EMAIL PROTECTED]
Date: Mon, 7 Jul 2008 21:42:54 -0400
Subject: Re: [R] question on lm or glm matrix of coeficients X test data terms

That's R: you come up with new solutions every time. I hope I'm not bothering you with 
this. Try also:

# data set (10 rows, 10 columns)
set.seed(123)
X=matrix(rpois(100,10),ncol=10)

# Function to estimate your outcome
outcome=function(x,betas){
  if(length(x)!=length(betas)) stop("x and beta have different lengths!")
  y=x*betas
  sum(y)
}

# let's assume that you want to include x1, x4, x7 and x9 only
# by using beta1=0.5, beta4=0.6, beta7=-0.1, beta9=0.3
betas=c(0.5,0,0,0.6,0,0,-0.1,0,0.3,0)

# Results
apply(X,1,outcome, betas=betas)

HTH,
Jorge

On Mon, Jul 7, 2008 at 9:31 PM, Jorge Ivan Velez <[EMAIL PROTECTED]> wrote:

Sorry, I forgot the sum over the rows:

# data set (10 rows, 10 columns)
set.seed(123)
X=matrix(rpois(100,10),ncol=10)

# Function to estimate your outcome
outcome=function(x,betas){
  if(length(x)!=length(betas)) stop("x and beta have different lengths!")
  y=x*betas
  y
}

# let's assume that you want to include x1, x4, x7 and x9 only
# by using beta1=0.5, beta4=0.6, beta7=-0.1, beta9=0.3
betas=c(0.5,0,0,0.6,0,0,-0.1,0,0.3,0)

# Results
apply(t(apply(X,1,outcome, betas=betas)),1,sum)

HTH,
Jorge

On Mon, Jul 7, 2008 at 9:23 PM, Jorge Ivan Velez <[EMAIL PROTECTED]> wrote:



Dear Dhruv,

It's me again. I've been thinking about it a little bit. If you want to 
include/exclude variables to estimate your outcome, you could try something 
like this:

# data set (10 rows, 10 columns)
set.seed(123)
X=matrix(rpois(100,10),ncol=10)

# Function to estimate your outcome
outcome=function(x,betas){
  if(length(x)!=length(betas)) stop("x and beta have different lengths!")
  y=x*betas
  y
}

# let's assume that you want to include x1, x4, x7 and x9 only
# by using beta1=0.5, beta4=0.6, beta7=-0.1, beta9=0.3
betas=c(0.5,0,0,0.6,0,0,-0.1,0,0.3,0)

# Results
t(apply(X,1,outcome, betas=betas))

HTH,
Jorge

On Mon, Jul 7, 2008 at 9:11 PM, Jorge Ivan Velez <[EMAIL PROTECTED]> wrote:





Dear Dhruv,

The short answer is no, because the function I built doesn't work 
for more variables than coefficients (see the "stop" I introduced). You should 
make some modifications, such as setting the extra coefficients equal to 1 or 0. For example:

# data set (10 rows, 10 columns)
set.seed(123)
X=matrix(rpois(100,10),ncol=10)
X

# Function to estimate your outcome
outcome=function(x,betas,val){
  k=length(x)
  nb=length(betas)
  # pad betas with val so that it matches the length of x
  if(length(x)!=length(betas)) betas=c(betas, rep(val,k-nb))
  y=x*betas
  y
}

# beta1=1, beta2=2, the rest is equal to zero
t(apply(X,1,outcome,betas=c(1,2),val=0))

# beta1=1, beta2=2, the rest is equal to 1
t(apply(X,1,outcome,betas=c(1,2),val=1))

HTH,
Jorge

On Mon, Jul 7, 2008 at 8:57 PM, DS <[EMAIL PROTECTED]> wrote:










Re: [R] question on lm or glm matrix of coeficients X test data terms

2008-07-08 Thread DS

Hi,

  I found some of what I was looking for.

Using the following, I can get a matrix of the regression coefficients multiplied out 
by the variable data.

g<-predict(comodel,type='terms',data4)
m<-cbind(data4,g)

What remains is how to pick, for each data row, the 3-4 columns with the highest 
values.

I need to get the column names of the top 3 coefficients from this matrix.

I could loop through each row, pick the 3 highest 
coefficient/variable products, and then get the column names for these 3.

Is there an easy way to get this in an R function?
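
A sketch of one loop-free approach, assuming the terms matrix has column names: apply() over the rows, order each row in decreasing order, and keep the names of the first k entries. The matrix `g` below is a small made-up stand-in for the output of predict(..., type='terms'), and `top_names` is a hypothetical helper.

```r
# Hypothetical terms matrix: rows are cases, columns are model terms
g <- matrix(c(0.2, 0.9, 0.1, 0.5,
              0.7, 0.3, 0.8, 0.6),
            nrow = 2, byrow = TRUE,
            dimnames = list(NULL, c("A", "B", "C", "D")))

# Names of the k largest values in a named vector
top_names <- function(x, k = 3) names(x)[order(x, decreasing = TRUE)][seq_len(k)]

# apply() returns one column per row of g, so transpose to get one row per case
top3 <- t(apply(g, 1, top_names, k = 3))
top3
```

If large negative contributions should also count as high impact, ordering on abs(x) instead of x would be the natural variation.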



thanks

Dhruv










Re: [R] question on lm or glm matrix of coeficients X test data terms

2008-07-08 Thread DS

thanks Jorge.  This is great!



regards,

Dhruv





 --- On Tue 07/08, Jorge Ivan Velez < [EMAIL PROTECTED] > wrote:
From: Jorge Ivan Velez [mailto: [EMAIL PROTECTED]]
To: [EMAIL PROTECTED]
Date: Tue, 8 Jul 2008 20:45:06 -0400
Subject: Re: [R] question on lm or glm matrix of coeficients X test data terms

Hi Dhruv,

Thanks for the data. Here is what you need so far:

# Data set
yourdata=structure(c(0.024575733, 0.775009533, 0.216823408, 0.413676529, 
0.270053406, 0.579946123, 0.634013362, 0.928518128, 0.405825012, 
0.862204203, 0.856558209, 0.187865722, 0.818774004, 0.918802224, 0.469496189, 
0.240583922, 0.390818789, 0.767969261, 0.13339806, 0.986023924, 0.442655239, 
0.437441939, 0.313678293, 0.952285599, 0.528433974, 0.328609537, 0.84584467, 
0.608194527, 0.96139021, 
0.485592658, 0.251827955, 0.289777559), .Dim = c(4L, 8L), .Dimnames = list(
NULL, c("A", "B", "C", "D", "E", "F", "A:B", "G:H")))

# Column names (cnames is not defined in the message as archived;
# taking it from the data is an assumption)
cnames=colnames(yourdata)

# Function to select the top k values (names)
ftopk= function(x,top=3){
  res=cnames[order(x, decreasing = TRUE)][1:top]
  paste(res,collapse=";",sep="")
}

# Application of the function to each row, keeping the top 3 columns
topk=apply(yourdata,1,ftopk,top=3)

# Result
data.frame(yourdata,topk)

           A         B         C         D         E         F       A.B       G.H      topk
1 0.02457573 0.2700534 0.4058250 0.8187740 0.3908188 0.4426552 0.5284340 0.9613902 G:H;D;A:B
2 0.77500953 0.5799461 0.8622042 0.9188022 0.7679693 0.4374419 0.3286095 0.4855927     D;C;A
3 0.21682341 0.6340134 0.8565582 0.4694962 0.1333981 0.3136783 0.8458447 0.2518280   C;A:B;B
4 0.41367653 0.9285181 0.1878657 0.2405839 0.9860239 0.9522856 0.6081945 0.2897776     E;F;B

HTH,
Jorge

On Tue, Jul 8, 2008 at 8:19 PM, DS <[EMAIL PROTECTED]> wrote:

 Hi Jorge,

  I am attaching some sample data that looks like the coefficient matrix.

In the spreadsheet, for each row I have listed the column names I would want to 
extract (the ones with the highest values in the row).

Hope this helps.

thanks

regards,

Dhruv

 --- On Tue 07/08, Jorge Ivan Velez < [EMAIL PROTECTED] > wrote:
From: Jorge Ivan Velez [mailto: [EMAIL PROTECTED]]
To: [EMAIL PROTECTED]
Date: Tue, 8 Jul 2008 19:36:52 -0400
Subject: Re: [R] question on lm or glm matrix of coeficients X test data terms

Dear Dhruv,

Could you please send me part of your data set m?  Just 10-20 rows, so 
I'll have an idea of what you have and what you'd like. I hope you don't 
mind.

Thanks a lot,
Jorge




[R] r format questions

2008-09-21 Thread DS
Hi,

1)   I have noticed that when I use the aggregate function it outputs row numbers 
in the results. For example:
aggregate by product

   Group.1   Aggregate
1 ProductA  1000400.00
2 ProductB 23232323.00
3  Missing   232323.00

Is there a way to suppress the numbers in front of the aggregate output?  I checked 
and they don't look like columns when I do a summary, so I can't -1 them away.
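
Those leading numbers are the data frame's row names rather than a column, which is why dropping a column does not remove them. A minimal sketch with made-up data (`sales` and `agg` are illustrative names): print() for data frames takes a row.names argument that hides them.

```r
# Hypothetical input; aggregate() names its output columns Group.1 and x
sales <- data.frame(product = c("ProductA", "ProductB", "ProductA"),
                    amount  = c(1000000, 23232323, 400))
agg <- aggregate(sales$amount, by = list(sales$product), FUN = sum)

# The leading 1, 2, ... are row names, not data; hide them when printing
print(agg, row.names = FALSE)
```

When the table goes through xtable, the corresponding switch is, as far as I know, include.rownames = FALSE in print.xtable.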

2) Is there an easy way to then take my aggregate matrix and format the 
sum with $ and commas? For example, instead of 10000 it should show
$10,000.00.
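
One way to get that display format, sketched with made-up amounts: formatC() with big.mark = "," inserts the thousands separators, and paste0() adds the dollar sign. The result is character, so it is best applied as a last step before the table is printed.

```r
# Hypothetical amounts to format as currency strings
amounts <- c(10000, 1000400, 23232323)
dollars <- paste0("$", formatC(amounts, format = "f", digits = 2, big.mark = ","))
dollars
```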

I am trying to create a report and am piping the aggregate into an xtable and 
feeding it to R2HTML.

thanks
Dhruv




[R] design question on piping multiple data sets from 1 file into R

2008-09-21 Thread DS
Hi,
   I have 8 separate queries that I use to get time series information; 
each deals with a different set of time series.

  I run my queries, save the output as a csv file, and then format 
the data into graphs in Excel.

  I wanted to know if there is an elegant and clean way to read in 1 csv file 
but to read the separate matrices on different rows into separate R data 
objects.

  If this is easy then I can read the 8 datasets in the csv file into 8 R 
objects and pipe them to time series objects for graphs.
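
A sketch of one way to do this, under the assumption that the data sets in the file are separated by blank lines and each starts with its own header row (the post does not say how they are delimited). The lines are split into blocks and each block is parsed on its own; the inline `lines` vector stands in for the file, and the file name in the comment is made up.

```r
# Hypothetical file contents: two data sets separated by a blank line
lines <- c("date,sales", "2008-01,10", "2008-02,12",
           "",
           "date,visits", "2008-01,100", "2008-02,90")
# For a real file: lines <- readLines("queries.csv")   # made-up file name

blank  <- !nzchar(trimws(lines))        # TRUE on separator lines
block  <- cumsum(blank)[!blank]         # block id for each non-blank line
pieces <- split(lines[!blank], block)

# Parse each block as its own CSV; the result is a list of data frames
datasets <- lapply(pieces, function(p) read.csv(text = paste(p, collapse = "\n")))
length(datasets)
```

Keeping the results in a list rather than 8 separately named objects makes it easy to loop over them when building the time series and graphs.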

thanks
Dhruv

