[R] How to replace match words with column name of data frame?

2017-07-30 Thread Abraham Mathew
Try the stringr package. This should work chemical=c("basic", "alkalin", "alkali", "acid", " ph ", "hss") chemical_match <- str_c(chemical, collapse = "|") chemical_match concept_df$match[str_detect(concept_df$concept, chemical_match)] <- "chemical" concept_df > concept_df conc
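A minimal base-R sketch of the same idea, assuming a small made-up `concept_df` (the original data frame is cut off in this excerpt). Here `grepl()` stands in for `str_detect()` and `paste(..., collapse = "|")` for `str_c()`; the example rows are hypothetical.

```r
# Hypothetical data: the original concept_df is not shown in full.
concept_df <- data.frame(
  concept = c("basic solution", "alkaline water", "wind speed", "acid rain"),
  stringsAsFactors = FALSE
)

chemical <- c("basic", "alkalin", "alkali", "acid", " ph ", "hss")

# Collapse the keywords into one alternation pattern: "basic|alkalin|..."
chemical_match <- paste(chemical, collapse = "|")

# Label every row whose concept matches any keyword; other rows stay NA
concept_df$match <- NA_character_
concept_df$match[grepl(chemical_match, concept_df$concept)] <- "chemical"
concept_df
```

Because the pattern is a regular expression, keywords containing special characters would need escaping first.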

[R] Odd results when computing confidence intervals

2016-12-02 Thread Abraham Mathew
I have a vector of values, and have written a function that takes each value in that vector, generates a normal distribution with that value as the mean, and then finds the interval at different levels. However, these intervals don't seem to be right (too narrow). ### CREATE PREDICTION INTERVALS

[R] Merging two data frame with different lengths

2016-06-10 Thread Abraham Mathew
So I have two data frames. The first one is a recommendation data frame and the second is a melted list with a pairing of OpportunityIds and ProductIds. There are multiple product IDs per opportunity ID. What I want to do is merge based on ProductId so that I can add the OpportunityId to the
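One way this kind of join is often done in base R is with `merge()`. The sketch below assumes two small made-up data frames; only the column names `ProductId` and `OpportunityId` come from the post, while `recs`, `pairs`, and `score` are hypothetical.

```r
# Hypothetical data; only ProductId and OpportunityId are named in the post.
recs <- data.frame(ProductId = c("p1", "p2", "p3"),
                   score = c(0.9, 0.7, 0.4))
pairs <- data.frame(OpportunityId = c("o1", "o1", "o2", "o3"),
                    ProductId = c("p1", "p2", "p2", "p4"))

# all.x = TRUE keeps recommendations that have no matching opportunity;
# a product linked to several opportunities yields one row per pairing.
merged <- merge(recs, pairs, by = "ProductId", all.x = TRUE)
merged
```

The two inputs do not need the same number of rows; the result has one row per matching pair, plus the unmatched rows from `recs`.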

[R] Error: Invalid First Argument in DPlyr

2015-10-30 Thread Abraham Mathew
I'm getting an "invalid first argument" error for the following. However, con is an actual connection and is set up properly. So what does this error actually refer to? library(dplyr) con <- RSQLServer::src_sqlserver("***", database = "***") myData <- con %>% tbl("table") %>% group_by( work_d

[R] knitr error when using knit2wp: "need a login and password"

2015-08-30 Thread Abraham Mathew
I'm using the knitr package to post an .Rmd file to WordPress. This is the first time I'm working on this type of project, and I'm running into the following error. Can anyone help identify the issue? I have done a number of Google searches but haven't seen similar issues. Also tried to use the newPost function for t

[R] Issues with RPostgres

2015-08-27 Thread Abraham Mathew
I have a user-defined function that I'm using alongside a postgresql connection to summarize some data. I've connected to the local machine with no problem. However, the connection keeps throwing the following error when I attempt to use it. Can anyone point to what I could be doing wrong. > ds_su

[R] Response Variable Coding

2015-08-19 Thread Abraham Mathew
Very simple question that I want to confirm. Let's say that I have a response variable. What are the appropriate ways that it can be coded for a logistic regression model? 1. It can be 0/1 and a factor 2. It can be 1/2 and a factor 3. It can be characters and a factor, where the second letter takes

[R] Find common two and three word phrases in a corpus

2014-10-07 Thread Abraham Mathew
Let's say I have a corpus and want to find the two, three, etc word phrases that occur most frequently in the data. I normally do this in the following manner but am getting an error message and am having some difficulty diagnosing what is going wrong. Given the following data, I'd just want a coun

[R] Reshaping Data in R - Transforming Two Columns Into One

2014-03-17 Thread Abraham Mathew
I have the following data frame. Using the stringr package, I've attempted to map the url's to some specific elements that are in each url. I then used the reshape package to join two different data frames. The next step is to transform the two columns in the mydt data frame (forester and customer_

Re: [R] Parsing aspects of a url path in R

2014-03-06 Thread Abraham Mathew
> dices the url. > > library(XML) > parseURI('http://www.mdd.com/food/pizza/index.html') > > Might that help? > > Cheers, > Ben > > On Mar 6, 2014, at 12:23 PM, Abraham Mathew wrote: > > > Let's say that I have the following character vector with a s

[R] Parsing aspects of a url path in R

2014-03-06 Thread Abraham Mathew
should be /food/pizza/index.html build-your-own/index.html /special-deals.html If anyone has a solution using the stringr package, that'd be of interest also. Thanks. -- *Abraham Mathew**Analytics Strategist* *Minneapolis, MN* *720-648-0108* *abmathe...@gmail.com * *Twitter <https://twit

[R] Convert date column with two different structures

2013-11-05 Thread Abraham Mathew
Let's say I have the following data frame and the date column has two different ways in which date is presented. How can I use as.Date or the lubridate package to have one date structure for the entire colum df = data.frame(Date=c("5/1/13","8/1/13","9/1/13","Apr-10", "Apr-11","Apr-1
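A base-R sketch for the mixed-format column, under two loudly stated assumptions: values like "5/1/13" are read as month/day/year, and values like "Apr-10" are read as month-year (April 2010) with the day set to the 1st. If "Apr-10" actually means April 10th, the second branch would need to change. Only the first five values from the post are used, since the rest is cut off.

```r
df <- data.frame(Date = c("5/1/13", "8/1/13", "9/1/13", "Apr-10", "Apr-11"),
                 stringsAsFactors = FALSE)

is_slash <- grepl("/", df$Date, fixed = TRUE)
parsed <- rep(as.Date(NA), nrow(df))

# Format 1 (assumed month/day/two-digit year)
parsed[is_slash] <- as.Date(df$Date[is_slash], format = "%m/%d/%y")

# Format 2 (assumed Mon-yy): look the month up in month.abb to avoid
# locale-dependent parsing, and assume the first day of the month
mon_yr <- df$Date[!is_slash]
month_num <- match(substr(mon_yr, 1, 3), month.abb)
year_num <- 2000L + as.integer(substr(mon_yr, 5, 6))
parsed[!is_slash] <- as.Date(sprintf("%04d-%02d-01", year_num, month_num))

df$Date2 <- parsed
df
```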

[R] Making predictions from a linear model

2013-08-27 Thread Abraham Mathew
I'm trying to educate myself about predictive analytics and am using R to generate a linear model with the following data. age <- c(23, 19, 25, 10,9, 12, 11,8) steroid <- c(27.1, 22.1, 21.9, 10.7, 7.4, 18.8, 14.7, 5.7) gpa <- c( 2.1, 2.9, 2.8, 3.5, 3.2, 3.9, 2.8, 2.6) sample

[R] Generating Predicted Probabilities in a Data Frame from a Logit Model

2013-05-24 Thread Abraham Mathew
ulate predict() such that I can get a similar output as ^^. mod1 = glm(posted ~ amount, data=ndat, family=binomial(link="probit")) summary(mod1) Can anyone help? Thanks!

[R] Finding predicted probabilities and their confidence intervals for a logit model

2013-01-29 Thread Abraham Mathew
I want to construct a logit model, plot the probability curve with the confidence intervals, and then I want to print out a data frame with the predictor, response value, predicted value, the low ci predicted value, and the high ci predicted value. So it should look something like: value low_ci
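A common approach, sketched here on made-up data, is to predict on the link scale with standard errors and then back-transform with `plogis()`, so the interval stays inside [0, 1]. The data frame `dat`, the variables `x` and `y`, and the 1.96 multiplier (an approximate 95% Wald interval) are assumptions for illustration; the column names `value` and `low_ci` follow the post, and `high_ci` is named by analogy.

```r
# Hypothetical data: one numeric predictor and a binary response
dat <- data.frame(x = 1:10,
                  y = c(0, 0, 1, 0, 0, 1, 1, 0, 1, 1))

fit <- glm(y ~ x, data = dat, family = binomial(link = "logit"))

# Predict on the linear-predictor (log-odds) scale with standard errors
pr <- predict(fit, type = "link", se.fit = TRUE)

# Back-transform the fit and the approximate 95% Wald limits
out <- data.frame(
  x = dat$x,
  y = dat$y,
  value = plogis(pr$fit),
  low_ci = plogis(pr$fit - 1.96 * pr$se.fit),
  high_ci = plogis(pr$fit + 1.96 * pr$se.fit)
)
out
```

These are pointwise intervals for the fitted probability, not prediction intervals for individual outcomes.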

[R] Error message with the effects package - 'Subscript out of bounds'

2012-10-15 Thread Abraham Mathew
LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C LC_TIME=English_United States.1252 attached base packages: [1] grid stats graphics grDevices utils datasets methods base other attached packages: [1] effects_2.2-1colorspace_1.1-

[R] Specifying a response variable in a Bayesian network

2012-09-26 Thread Abraham Mathew
,"GOOD")) dat$won = factor(dat$won) dat$sold = factor(dat$sold) dat$insured = factor(dat$insured) dat$credit = factor(dat$credit) highlight.opts <- list(nodes = c("won","sold","insured","credit"), col = "red"

[R] Plotting every probability curve

2012-09-11 Thread Abraham Mathew
nt","rent","own"), income=c(50,20,20,50,50), gender=c("M","M","F","F","F")) df$sell = as.factor(df$sell) df$home = as.factor(df$home) df$income = as.factor(df$income) df$gender = as.factor(df$gender) str(df) m1

Re: [R] Using the effects package to plot logit probabilities

2012-08-13 Thread Abraham Mathew
not use > > ?predict.glm ## with type = "response" ? > > -- Bert > > On Mon, Aug 13, 2012 at 12:39 PM, Abraham Mathew > wrote: > > I'm trying to run a logit model and plot the probability curve for a > number > > of the important predictors. I'm trying to

[R] Using the effects package to plot logit probabilities

2012-08-13 Thread Abraham Mathew
or any of the other predictors). However, what I want to do is generate the same plot, with won on the y axis and income on the x axis, but with the curves showing the probabilities for age and home. I'm not seeing how to do this in the effects documentation. Help! Thanks.

[R] Calculating percentages across multiple columns

2012-08-08 Thread Abraham Mathew
5,8,3,5,4,2,3,5), purchase=c(6,3,4,5,5,5,6,2,3,7), sold=c(0,1,0,0,0,1,1,0,0,1)) f Thanks. -- *Abraham Mathew Statistical Analyst www.amathew.com 720-648-0108 @abmathewks* [[alternative HTML version deleted]] __ R-help@r-project.o

[R] Re-grouping data in R

2012-08-07 Thread Abraham Mathew
ls(dat$final_purchase_amount)character(0) Can anyone point to what I'm doing wrong. Thanks!

[R] Naive Bayes in R

2012-08-02 Thread Abraham Mathew
ny diagnostic test to determine the overall misclassification rate of a NB classifier, and if there is a function in R that is available to implement it? Thanks, Abraham

[R] Removing values from a string

2012-07-19 Thread Abraham Mathew
ub to find a solution, and there doesn't seem to be anything helpful in the stringr package for this task. Thanks

[R] Using the effects package

2012-07-09 Thread Abraham Mathew
I've been looking into the effects package and it seems to be a great tool for plotting the probabilities of the response variable by the predictors. However, I'm wondering if I can use the effects package to plot the probabilities on the y axis and one predictor on the x axis, with the curve having t

Re: [R] Plotting the probability curve from a logit model with 10 predictors

2012-07-06 Thread Abraham Mathew
model. Do I simply expand the >> expand.grid() function to include all the variables? >> >> So my question is how do I form a plot of a logit probability curve when I >> have 10 predictors? >> >> would be nice to do this in ggplot2. >> >> Thank

[R] Plotting the probability curve from a logit model with 10 predictors

2012-07-05 Thread Abraham Mathew
"l") I'm not sure how to proceed when I have 10 or so predictors in the logit model. Do I simply expand the expand.grid() function to include all the variables? So my question is how do I form a plot of a logit probability curve when I have 10 predictors? would be nice to do

[R] Finding all the coefficients for a logit model

2012-02-09 Thread Abraham Mathew
Let's say I have a variable, day, which is saved as a factor with 7 levels, and I use it in a logistic regression model. I ran the model using the car package in R and printed out the results. mod1 = glm(factor(status1) ~ factor(day), data=mydat, family=binomial(link="logit")) print(summary(mod1))

[R] Grouping together a time variable

2012-02-09 Thread Abraham Mathew
2" "00:05:22" [23] "00:05:28" "00:05:44" "00:05:54" "00:06:54" "00:06:54" "00:07:10" "00:08:15" "00:08:26" What I am trying to do is group the data into one hour incr
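Since the times are "HH:MM:SS" strings, one simple sketch for hourly grouping is to take the first two characters as the hour and tabulate. The first nine values below come from the excerpt; the last two are made up so that more than one hour appears.

```r
times <- c("00:05:22", "00:05:28", "00:05:44", "00:05:54", "00:06:54",
           "00:06:54", "00:07:10", "00:08:15", "00:08:26",
           "01:15:00", "13:40:09")  # last two values are hypothetical

# The hour is the first two characters of each "HH:MM:SS" string
hour <- as.integer(substr(times, 1, 2))

# Fixing the levels to 0..23 keeps empty hours in the table
counts <- table(factor(hour, levels = 0:23))
counts
```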

Re: [R] Error message with glm

2011-12-22 Thread Abraham Mathew
")) On Thu, Dec 22, 2011 at 8:23 AM, Abraham Mathew wrote: > > I'm working on a logistic regression in R with the car package but keep > getting the following error message. > It's only a warning and not an error, but I'm just not sure how to > resolve the issues

[R] Finding predicted probabilities

2011-12-22 Thread Abraham Mathew
ot(s.out) When I run with mbid as 300, I get 49%. At 500, it's 49% and at 700 it's 50%. At 1500, it's 51% These results are just really weird. I was expecting an exponential curve when I plotted mbid by probability of winning, but that doesn't seem to be the case

[R] Error message with glm

2011-12-22 Thread Abraham Mathew
... $ mbid: int 700 300 700 300 500 300 300 700 300 300 ... Can anyone tell me what I should do to fix the warnings.

Re: [R] Predicting a linear model for all combinations

2011-12-21 Thread Abraham Mathew
845 > Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | > www.r-statistics.com (English) > > ------ > > > > > On Wed, Dec 21, 2011 at 12:48 PM, Abraham Mathew wrote: > >>

Re: [R] Predicting a linear model for all combinations

2011-12-21 Thread Abraham Mathew
li.com (Hebrew) | www.biostatistics.co.il (Hebrew) | > www.r-statistics.com (English) > > ---------- > > > > > On Wed, Dec 21, 2011 at 12:04 PM, Abraham Mathew wrote: > >> >> I looked into what you suggested an

Re: [R] Predicting a linear model for all combinations

2011-12-21 Thread Abraham Mathew
w) | www.biostatistics.co.il (Hebrew) | > www.r-statistics.com (English) > > ---------- > > > > > On Wed, Dec 21, 2011 at 6:59 AM, Abraham Mathew wrote: > >> Lets say I have a linear model and I want to find the a

[R] Predicting a linear model for all combinations

2011-12-21 Thread Abraham Mathew
combination of values in the independent variables. So Expected price when: weather=1, gender=male weather=1, gender=female weather=2, gender=male etc. Can anyone help with this problem?
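A usual way to get the expected value for every combination of predictor values is to build the grid with `expand.grid()` and pass it to `predict()`. The sketch below uses simulated data; the variable names `weather`, `gender`, and `price` follow the post, while the numbers and the model form are assumptions.

```r
set.seed(1)

# Hypothetical data with two categorical predictors
dat <- data.frame(
  weather = factor(rep(c(1, 2, 3), each = 4)),
  gender = factor(rep(c("male", "female"), times = 6))
)
dat$price <- 10 + 2 * as.integer(dat$weather) +
  3 * (dat$gender == "male") + rnorm(12)

fit <- lm(price ~ weather + gender, data = dat)

# One row per combination of the factor levels
grid <- expand.grid(weather = levels(dat$weather),
                    gender = levels(dat$gender))
grid$expected_price <- predict(fit, newdata = grid)
grid
```

With numeric predictors, the grid would instead list the specific values of interest for each variable.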

[R] Zelig Error Message

2011-12-16 Thread Abraham Mathew
ran everything but the explanatory variable as a numeric variable. Now, I'm trying everything and no luck.

[R] Error constructing probabilities in Zelig

2011-12-16 Thread Abraham Mathew
lternative solution that I can use to generate the probabilities.

[R] Incorrect Number of Dimensions in Zelig with setx()

2011-12-15 Thread Abraham Mathew
dimensions I googled this problem and couldn't find anything, minus a question by me on this same problem from 1.5 years ago. Just don't remember what I did to solve the problem. Help!

[R] Reordering a numeric variable

2011-12-15 Thread Abraham Mathew
s(educ)[c(3,5)]] <- "Advanced Degree" educ2[educ %in% levels(educ)[c(6,8)]] <- "Other" educ2 = factor(educ2) levels(educ2) The above code is how I regrouped the variable. How can I regroup it so that its levels are from lowest to highest. What if they're numeric
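Level order is typically controlled by passing `levels =` explicitly to `factor()`. In the sketch below, "Advanced Degree" and "Other" come from the post, while "High School", "College", and the numeric example are hypothetical.

```r
# Hypothetical regrouped variable; default levels are alphabetical
educ2 <- factor(c("Other", "College", "Advanced Degree", "High School", "College"))
levels(educ2)

# Re-create the factor with the levels in the desired low-to-high order
educ2 <- factor(educ2,
                levels = c("High School", "College", "Advanced Degree", "Other"))
levels(educ2)

# For number-like levels, sort them numerically rather than as text
num <- factor(c("10", "2", "33"))
num_sorted <- factor(num, levels = sort(as.numeric(levels(num))))
levels(num_sorted)
```

Any value not listed in `levels` becomes `NA`, so the list should cover every category present.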

[R] Fisher Exact Test

2011-11-17 Thread Abraham Mathew
This mean First, I am no expert but I am analyzing some marketing data. I have information on two versions of the same site, and I have data on the number of times people filled out a form on each version of the site. Sample data: Site 1 Site 2 Fill
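For a two-by-two comparison like this, `fisher.test()` from the stats package takes a matrix of counts. The counts below are invented for illustration, since the sample data in the post is cut off; the row and column labels are also assumptions.

```r
# Hypothetical counts: form fills vs. non-fills for each site version
forms <- matrix(c(25, 175,
                  40, 160),
                nrow = 2,
                dimnames = list(outcome = c("filled", "not_filled"),
                                site = c("site1", "site2")))

res <- fisher.test(forms)
res$p.value    # two-sided p-value for independence of site and outcome
res$estimate   # conditional maximum-likelihood estimate of the odds ratio
```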

[R] Error When Installing the RODBC Package

2011-11-16 Thread Abraham Mathew
't have a package installed that is necessary for RODBC. What is that package?

[R] Difference between a data frame and data table

2011-08-29 Thread Abraham Mathew
I didn't learn about data tables until recently. (They're never covered in any intro R books). In any case, I'm not sure what (if any) is the difference between a data frame and a data table. Can anyone provide a brief explanation? Is one preferred over another or is it just dependent on the tas

[R] Placing brackets around the values in a data frame

2011-07-27 Thread Abraham Mathew
Let's say I have the following data frame. df = data.frame(word = c("David", "James", "Sara", "Jamie", "Jon")) df I was trying to place brackets , [ ] , around each string. I'll be exporting it with write.table and quote=FALSE, so it will eventually look like: [David] [James] [Sara] Can
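A short sketch of one way to do this with `paste0()`, using the data frame from the post. Writing to the console (the default when no file is given) is chosen here only for illustration.

```r
df <- data.frame(word = c("David", "James", "Sara", "Jamie", "Jon"),
                 stringsAsFactors = FALSE)

# Wrap every value in square brackets
df$word <- paste0("[", df$word, "]")

# quote = FALSE drops the surrounding double quotes on export
write.table(df, quote = FALSE, row.names = FALSE, col.names = FALSE)
```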

[R] Dealing with many parameters in a function

2011-07-25 Thread Abraham Mathew
I'm creating a function in R. However, I have a large number of function parameters, and need to find an efficient solution for running the function with all the parameters. So in the following function, I have about 20 parameters that I assign to the function, with almost all the values being diff
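One pattern that may help with long parameter lists is to keep the arguments in a named list and call the function with `do.call()`. The function `make_label` and its parameters are hypothetical stand-ins, since the original function is not shown in this excerpt.

```r
# Hypothetical function with several parameters
make_label <- function(prefix, root, suffix, sep = " ") {
  paste(prefix, root, suffix, sep = sep)
}

# Store the arguments once as a named list ...
params <- list(prefix = "cheap", root = "car insurance", suffix = "quote")

# ... and pass them all in a single call
res <- do.call(make_label, params)
res

# Individual entries can be overridden without retyping the rest
res2 <- do.call(make_label, modifyList(params, list(prefix = "budget")))
res2
```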

[R] xml2-config issues

2011-07-23 Thread Abraham Mathew
I'm trying to install the XML package on Ubuntu 10.10, and I keep getting a warning message that XML could not be found and had non-zero exit status. How can I fix this problem? > install.packages() Loading Tcl/Tk interface ... done --- Please select a CRAN mirror for use in this session --- Instal

[R] Stacked Bar Plot in ggplot2

2011-07-19 Thread Abraham Mathew
I'm trying to develop a stacked bar plot in R with ggplot2. My data: conv = c(10, 4.76, 17.14, 25, 26.47, 37.5, 20.83, 25.53, 32.5, 16.7, 27.33) click = c(20, 42, 35, 28, 34, 48, 48, 47, 40, 30, 30) date = c("July 7", "July 8", "July 9", "July 10", "July 11", "July 12", "July 13", "July 14", "Jul

[R] Finding Confidence Intervals

2011-07-11 Thread Abraham Mathew
This is a very basic question, so please bear with me. I've been learning about AB Testing, which is largely used in internet marketing to examine the effectiveness of certain aspects of ads, websites, etc. Here are a couple of links for people who want to know more about AB Testing: http://visualwebsi

[R] Finding the "levels" in a data frame column

2011-06-23 Thread Abraham Mathew
I have a data frame that looks as follows. df <- data.frame(city=c("Houston", "Houston", "El Paso", "Waco", "Houston", "Plano", "Plano")) What I want to do is get a list of the city values. Currently, when I run df$city, I get all the values. I just want to know the four cities that appear. So ins
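A minimal sketch of getting the distinct values, using the data from the post with the quoting repaired. `unique()` keeps order of first appearance, while `levels(factor())` returns the values sorted.

```r
df <- data.frame(city = c("Houston", "Houston", "El Paso", "Waco",
                          "Houston", "Plano", "Plano"))

# Distinct values in order of first appearance
cities <- unique(as.character(df$city))
cities

# Distinct values sorted alphabetically
sorted_cities <- levels(factor(df$city))
sorted_cities
```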

[R] Invalid Regular Expression

2011-06-14 Thread Abraham Mathew
I'm working with some data, and am trying to generate it in the following format.
                  state  city  zipcode
I like pizza          0     0        0
I live in Denver      0     1        0

[R] Putting commas in between character strings

2011-06-14 Thread Abraham Mathew
I have a number of strings in a vector, and want the output to be separated by commas. > t [1] "35004" "35005" "35006" "35007" "35010" "35014" "35016" So I want it to look like: "35004", "35005", "35006", "35007",... Can anyone help? I initially thought strsplit would be the correct funct
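Joining a vector into one comma-separated string is the job of `paste()` with `collapse`, rather than `strsplit()`, which goes the other way. A sketch using the values from the post, with the vector named `zips` here to avoid masking the base function `t()`:

```r
zips <- c("35004", "35005", "35006", "35007", "35010", "35014", "35016")

# Plain comma-separated string
plain <- paste(zips, collapse = ", ")
plain

# Same, but with each value wrapped in double quotes
quoted <- paste0('"', zips, '"', collapse = ", ")
cat(quoted, "\n")
```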

[R] Counting the Number of Letters in a row

2011-06-10 Thread Abraham Mathew
I'm trying to find the total number of letters in a row of a data frame. Let's say I have the following data frame. f1 <- data.frame(keyword=c("I live in Denver", "I live in Kansas City, MO", "Pizza is good")) The following function gives me the number of characters in each string. So for "I live
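A sketch of counting letters only, as opposed to all characters: strip everything that is not alphabetic with `gsub()` and then apply `nchar()`. The data frame is the one from the post with the quoting repaired; the column names `n_chars` and `n_letters` are made up.

```r
f1 <- data.frame(keyword = c("I live in Denver",
                             "I live in Kansas City, MO",
                             "Pizza is good"),
                 stringsAsFactors = FALSE)

# All characters, including spaces and punctuation
f1$n_chars <- nchar(f1$keyword)

# Letters only: remove every non-alphabetic character before counting
f1$n_letters <- nchar(gsub("[^[:alpha:]]", "", f1$keyword))
f1
```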

[R] Trying to make code more efficient

2011-06-09 Thread Abraham Mathew
I have a repetitive task in R and I'm trying to find a more efficient way to perform the following task. lst <- list(roots = c("car insurance", "auto insurance"), roots2 = c("insurance"), prefix = c("cheap", "budget"), prefix2 = c("low cost"), suffix = c("quote", "quotes

Re: [R] Problem with a if statement inside a function

2011-06-09 Thread Abraham Mathew
I passed it as an argument to the function because every week I'll need to add keywords to the lst, and that function will make the process more automated. On Thu, Jun 9, 2011 at 10:21 AM, Sarah Goslee wrote: > On Thu, Jun 9, 2011 at 11:53 AM, Abraham Mathew > wrote: > >

Re: [R] Problem with a if statement inside a function

2011-06-09 Thread Abraham Mathew
a bad idea, because > it guarantees that nobody but you can ever use it. And why would you, > rather than passing the working directory as an argument if it's > crucial? > > Sarah > > > On Thu, Jun 9, 2011 at 11:14 AM, Abraham Mathew > wrote: > > I have a rea

[R] Problem with a if statement inside a function

2011-06-09 Thread Abraham Mathew
I have a really long function, and at the end of the function, I am using an if statement to tag certain keywords based on whether they have certain values contained in them. However, the if statement doesn't seem to work. When I had split up the commands into various functions, it worked fine, b

[R] Using a function inside a function

2011-06-08 Thread Abraham Mathew
I'm trying to run a function inside a function but get an error message. lst <- list(roots = c("car insurance", "auto insurance"), roots2 = c("insurance"), prefix = c("cheap", "budget"), prefix2 = c("low cost"), suffix = c("quote", "quotes"), suffix2 = c("rate", "rates"), suffix3 = c("comparison")

[R] Error: missing values where TRUE/FALSE needed

2011-06-08 Thread Abraham Mathew
I'm writing a function and keep getting the following error message. myfunc <- function(lst) { lst <- list(roots = c("car insurance", "auto insurance"), roots2 = c("insurance"), prefix = c("cheap", "budget"), prefix2 = c("low cost"), suffix = c("quote", "quotes"), suffix2 = c("rate", "rates"), suf

[R] Automating a process

2011-06-08 Thread Abraham Mathew
I have a series of strings and I am trying to find all combinations and then assign 1 or 0 to them based on whether they contain the words car or budget. I want the data to look like:
                             car  budget
cheap car insurance quote      1       0
budget car insurance quote     1       1
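A sketch of one way to build the combinations and the 0/1 flags with `expand.grid()` and `grepl()`. The word lists are assumptions chosen so the output reproduces the two example rows from the post.

```r
# Hypothetical word groups
combos <- expand.grid(prefix = c("cheap", "budget"),
                      root = "car insurance",
                      suffix = "quote",
                      stringsAsFactors = FALSE)

# Glue each combination into a single keyword string
keyword <- paste(combos$prefix, combos$root, combos$suffix)

# 1 if the keyword contains the word, 0 otherwise
flags <- data.frame(keyword = keyword,
                    car = as.integer(grepl("car", keyword, fixed = TRUE)),
                    budget = as.integer(grepl("budget", keyword, fixed = TRUE)),
                    stringsAsFactors = FALSE)
flags
```

Note that `grepl("car", ...)` also matches longer words such as "cars"; a pattern like `"\\bcar\\b"` (without `fixed = TRUE`) would restrict it to the whole word.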

Re: [R] Adding values to the end of a data frame

2011-06-08 Thread Abraham Mathew
<- one(roots, suffix) > rbind(d1, d2) > > To see a potential flaw in your function (at least as far as console > output is concerned), try > rbind(d1, one(roots, suffix)) > > HTH, > Dennis > > On Tue, Jun 7, 2011 at 3:30 PM, Abraham Mathew > wrote: > > Let's

[R] Adding values to the end of a data frame

2011-06-07 Thread Abraham Mathew
Let's say that I'm trying to write a function that will allow me to automate a process where I examine all possible combinations of various string groupings. Each time I run the one function, I want to include the new values to the end of a data frame. The data frame will basically be one column w

[R] Regular Expressions for "Large" Data Set

2011-06-07 Thread Abraham Mathew
I'm running R 2.13 on Ubuntu 10.10 I have a data set which is comprised of character strings. site = readLines('http://www.census.gov/tiger/tms/gazetteer/zips.txt') dat <- c("01, 35004, AL, ACMAR, 86.51557, 33.584132, 6055, 0.001499") dat I want to loop through the data and construct a data fra
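Because each line is comma-separated, one option that avoids regular expressions entirely is to let `read.csv()` split it. The sketch below parses the single example line from the post; the column names are guesses at what the fields mean and should be treated as hypothetical.

```r
dat <- c("01, 35004, AL, ACMAR, 86.51557, 33.584132, 6055, 0.001499")

# Column names are assumptions; the post does not label the fields.
zips <- read.csv(text = dat, header = FALSE, strip.white = TRUE,
                 colClasses = "character",
                 col.names = c("state_fips", "zip", "state", "city",
                               "longitude", "latitude", "population",
                               "allocation"))
zips

# Reading everything as character preserves leading zeros such as "01";
# numeric columns can be converted afterwards
zips$population <- as.integer(zips$population)
```

The same call works on a vector of many lines, since `text =` accepts one element per line.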

[R] Merge two columns of a data frame

2011-06-06 Thread Abraham Mathew
I have the following data: prefix <- c("cheap", "budget") roots <- c("car insurance", "auto insurance") suffix <- c("quote", "quotes") prefix2 <- c("cheap", "budget") roots2 <- c("car insurance", "auto insurance") roots3 <- c("car insurance", "auto insurance") suffix3 <- c("quote", "quotes") df

[R] Partial Matching

2011-06-04 Thread Abraham Mathew
Let's say that I have a string and I want to know if a single word is present in the string. I've written the following function to see if the word "Geico" is mentioned in the string "Cheap Geico car insurance". However, it doesn't work, and I assume it has something to do with the any() function.
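Checking whether a word occurs in a string is usually done with `grepl()`. The helper name `has_word` is hypothetical, and the original function is not shown in this excerpt, so this is a sketch of the check itself rather than a fix for that code.

```r
# Hypothetical helper: TRUE if `word` occurs anywhere in `text`
has_word <- function(word, text) {
  grepl(word, text, fixed = TRUE)  # fixed = TRUE: literal, case-sensitive match
}

has_word("Geico", "Cheap Geico car insurance")
has_word("Allstate", "Cheap Geico car insurance")

# grepl() is vectorised over text, so any() is only needed to reduce
# several results to a single TRUE/FALSE
any(has_word("Geico", c("Cheap car insurance", "Geico quotes")))
```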

[R] Using SQLDF to pick values based on word count

2011-06-02 Thread Abraham Mathew
I have a data frame in R with the following values. cars autocar cars info what is that donna drive car telephone i need car... I want to select all values which contain 'car', values with three words, and those keywords with car that contain three words. The first part is done with : sqldf("SE

[R] My First Attempt at Screen Scraping with R

2011-05-06 Thread Abraham Mathew
Hello Folks, I'm working on trying to scrape my first web site and ran into an issue because I really don't know anything about regular expressions in R. library(XML) library(RCurl) site <- "http://thisorthat.com/leader/month"; site.doc <- htmlParse(site, ?, xmlValue) At the ?, I realize that

[R] Subsetting Data

2011-04-28 Thread Abraham Mathew
I'm using the subset() function in R. dat <- data.frame(one=c(6,7,8,9,10), Number=c(5,15,13,1,13)) subset(dat, Number >= 10) However, I want to find the number of all rows that meet the Number>=10 condition. I've done this in the past with something like colSums or rowSums or another similar fun
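Two equivalent ways to count the matching rows, using the data from the post: sum the logical condition directly, or take `nrow()` of the subset.

```r
dat <- data.frame(one = c(6, 7, 8, 9, 10),
                  Number = c(5, 15, 13, 1, 13))

# TRUE counts as 1, so summing the condition counts the matching rows
n_match <- sum(dat$Number >= 10)
n_match

# Same count via the subset itself
n_subset <- nrow(subset(dat, Number >= 10))
n_subset
```

If `Number` can contain `NA`, `sum(dat$Number >= 10, na.rm = TRUE)` avoids an `NA` result.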

Re: [R] Merging two columns of a data frame

2011-04-28 Thread Abraham Mathew
02 unfortunately, can I delete the Year and Month Columns. Once that's done, I can reconfigure the columns Abraham On Thu, Apr 28, 2011 at 11:00 AM, Abraham Mathew wrote: > > Hi folks, I have a simple question that I just can't solve. > > I'm trying to merge two column

[R] Merging two columns of a data frame

2011-04-28 Thread Abraham Mathew
Hi folks, I have a simple question that I just can't solve. I'm trying to merge two columns in my data frame. > sessionInfo() R version 2.13.0 (2011-04-13) Platform: i686-pc-linux-gnu (32-bit) > head(dat) Year Month Number 2002 Jan 0 2002 Feb 0 2002 March0 2002 April
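A sketch of combining the two columns with `paste()` and then dropping the originals. The first three rows follow the post; the `Number` value for April and the new column name `Period` are assumptions.

```r
dat <- data.frame(Year = c(2002, 2002, 2002, 2002),
                  Month = c("Jan", "Feb", "March", "April"),
                  Number = c(0, 0, 0, 0),  # April's value is hypothetical
                  stringsAsFactors = FALSE)

# Combine month and year into a single label column
dat$Period <- paste(dat$Month, dat$Year)

# Remove the columns that are no longer needed
dat$Year <- NULL
dat$Month <- NULL
dat
```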

[R] Install and Configure RSQLite in Ubuntu

2011-04-25 Thread Abraham Mathew
Hi Folks, I'm new to the linux world and am having some trouble installing the RSQLite package. SQLite is installed, but some dependencies(?) seem to be missing. Can anyone help? > sessionInfo() R version 2.13.0 (2011-04-13) Platform: i686-pc-linux-gnu (32-bit) > install.packages() Installing

Re: [R] Problem installing XML in Ubuntu 10.10

2011-04-25 Thread Abraham Mathew
> on your Ubuntu machine. > > - Phil > > > > > On Mon, 25 Apr 2011, Abraham Mathew wrote: > > Hello folks, >> >> >> Here's is info on what system I'm working on. >> >>> sessionInfo() >>> >> R version 2.13.0 (2

[R] Problem installing XML in Ubuntu 10.10

2011-04-25 Thread Abraham Mathew
Hello folks, Here is info on what system I'm working on. > sessionInfo() R version 2.13.0 (2011-04-13) Platform: i686-pc-linux-gnu (32-bit) I'm trying to install the XML package. However, I end up with the following error message. > install.packages("XML") checking for xml2-config... no