[R] Simple look up
So I am hoping this solution is simple, which I believe it is. I would like to look up a value in one column of a data set and display the corresponding value in the second column. For example:

TAZ  VACANT ACRES
100  45
101  62
102  23
103  64
104  101
105  280

So if I want to find TAZ 103 it would give me 64 as the corresponding value. Hope you guys can help.

Cheers
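A minimal, reproducible sketch of the kind of lookup being asked about (the data frame name "taz" is made up here):

taz <- data.frame(TAZ = 100:105, VACANT_ACRES = c(45, 62, 23, 64, 101, 280))
taz$VACANT_ACRES[taz$TAZ == 103]   # returns 64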
Re: [R] Simple look up
WTF, mate. You think I'm out to waste people's time? I have been seeking a solution for this for two weeks. Excuse my lack of programming savvy, but that is why I posted.

bartjoosen wrote:
> Please take a look at the manuals before posting (DO read the posting guide!)
>
> Bart
> PS: yourdataframe[yourdataframe$TAZ==103,2]
>
> PDXRugger wrote:
>> So I am hoping this solution is simple, which I believe it is. I would
>> like to look up a value in one column of a data set and display the
>> corresponding value in the second column. For example:
>>
>> TAZ  VACANT ACRES
>> 100  45
>> 101  62
>> 102  23
>> 103  64
>> 104  101
>> 105  280
>>
>> So if I want to find TAZ 103 it would give me 64 as the corresponding
>> value. Hope you guys can help.
>>
>> Cheers
[R] creating selection vector with 2 attributes
Please consider the following:

Puma <- c(702, 702, 701, 702, 701, 702, 701, 702, 702, 702, 701, 702, 702, 701, 701, 702, 701, 702, 702, 702, 701, 702, 702, 702, 701)

PumaNums <- c(100, 200, 300, 400, 500, 600, 701, 702, 800, 900, 1000, 1101, 1102, 1200, 1301, 1302, 1303, 1304, 1305, 1306, 1307, 1308, 1309, 1310, 1311, 1312, 1313)

PumaNames <- c("Northeast", "NorthCentral", "Southeast", "Deschutes", "NorthCoast", "BentonLinn", "Lane", "Eugene", "CoosCurryJosephine", "Jackson", "Douglas", "Salem", "Marion", "PolkYamhill", "Portland", "Portland", "Portland", "Portland", "Portland", "Portland", "ClackamasMultnomah", "Portland", "Portland", "Portland", "Washington", "Portland", "Portland")

PumsAreaNames <- data.frame(PumaNums, PumaNames)

HhSerialno <- c(38, 71, 80, 100, 100, 106, 126, 152, 157, 181, 182, 210, 244, 267, 296, 361, 387, 399, 430, 434, 468, 480, 483, 486, 532)

Hhdata <- data.frame(HhSerialno, Puma)

IsEugene <- Hhdata$Puma %in% PumsAreaNames$PumaNums[PumsAreaNames$PumaNames == "Eugene"]

IsEugene returns a TRUE/FALSE for whether the Hhdata$Puma attribute matches "Eugene". I would also like it to carry the serial number along, not in data frame form but rather like the following:

  FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE
9995828 9995840 9995843 9996259 9996287 9996375 9996581 9996724 9996796 9997176
  FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE
9997492 9997546 9997664 9997924 9998062 9998073 9998216 9998682 9998691 9998721
  FALSE   FALSE   FALSE   FALSE   FALSE   FALSE    TRUE
9998740     082     198     331     390     805     811
  FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE

where the numerics are the serial numbers. The reason I am having trouble is that I am using some existing code: the output above is what the original code returns, while my new code returns just the logical values. I have done the obvious and made sure my data are in the correct format, which they are, so I am wondering how to get from what I have to what I need. Thanks everyone.

JR
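One way to get a logical vector that carries the serial numbers along is to attach them as names; a sketch (setNames() only adds the names, it does not change the TRUE/FALSE values):

IsEugene <- setNames(Hhdata$Puma %in% PumsAreaNames$PumaNums[PumsAreaNames$PumaNames == "Eugene"],
                     Hhdata$HhSerialno)
IsEugene   # prints each serial number together with its TRUE/FALSE value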
[R] Creating a simple line graph
Hey everyone,
Sorry for yet another simple question, but hopefully it makes whoever comes up with the answer feel good about helping others. I would like to simply plot the following two sets of data in a line graph. The one set is observed points and the other is predicted. I have looked through the documentation (which makes any graphing very complicated to me) but I haven't found what I need. So for:

Sz <- c("h1", "h2", "h3", "h4")
Pred <- c(34790.0, 47559.8, 21197.8, 28198.6)
Obs <- c(34740, 48615, 20420, 26840)
MeanEst2000.Sz <- cbind(Sz, Pred)
LaneCo2000HH.Sz <- cbind(Sz, Obs)

I would like the x-axis to display the labels (Sz) and the y-axis the values. I am currently using the code below (it won't work with the sample data), which gives me the proportions of the observed versus the predicted in four different graphs in histogram format:

panelHist(DataMatrix = t(apply(Hh2000.SnSz, 1, 4)),
          ObsMeans = proportion(rowSums(LaneCo2000HH.SzWk), 4),
          Bounds = c(0.95, 1.05))

Also, if there is additional documentation for these operations I would appreciate any pointers.

Thanks
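A sketch with base graphics; the colours and legend position are arbitrary choices, everything else comes from the vectors above:

plot(seq_along(Sz), Obs, type = "o", xaxt = "n", xlab = "Sz", ylab = "Value",
     ylim = range(c(Obs, Pred)))
lines(seq_along(Sz), Pred, type = "o", col = "red")
axis(1, at = seq_along(Sz), labels = Sz)    # put the h1..h4 labels on the x-axis
legend("topright", legend = c("Observed", "Predicted"), col = c("black", "red"), lty = 1)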
[R] Replacing null values (#NULL!)
I am stumped. I have a data set with multiple columns and about 65,000 cases. Some of the cases have a "#NULL!" value, so for data frame "Props_":

      access_emp pct_vacant TAVAIL park PARKACRES totlandare
4538       52.15     #NULL!      0    1     22.99     74,129.70400
4539       52.15        .09      0    1     22.99    982,850.80400
10292      54.20     #NULL!      1    0      0.00    212,382.82400
11235      52.93     #NULL!      1    1     15.57  1,355,691.04700
14294      55.42        .09      0    0      0.00     36,693.81700
14295      55.42     #NULL!      0    0      0.00     97,526.17400
14296      55.42     #NULL!      0    0      0.00    125,131.20300
14297      55.42     #NULL!      0    0      0.00    209,691.18400
14298      55.42        .09      0    0      0.00           #NULL!
14299      55.42     #NULL!      0    0      0.00     26,161.29100

So I need to replace the #NULL! with 0. I have tried:

Props_$pct_vacant <- Props_$pct_vacant[Props_$pct_vacant != "#NULL!"]

which seems to delete the #NULL! cases, but then the vectors are of different lengths and I think that causes problems. Ideally I would load the data and remove all the cases with #NULL! values no matter what column they are in, or at least be able to specify the column. So question one: since the null value is actually the string "#NULL!", do the same commands for NULL/NA apply? It seems like they don't. The main question is how I can remove the #NULL! values from each of the columns (dropping entire cases is fine) or from specified columns. Thanks in advance.

Josh
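Two possible approaches, sketched; "yourfile.csv" stands in for however Props_ was actually read in:

# 1. treat "#NULL!" as NA at read time, then drop or zero the affected cases
Props_ <- read.csv("yourfile.csv", na.strings = "#NULL!")
Props_complete <- Props_[complete.cases(Props_), ]   # drop rows with any NA
Props_[is.na(Props_)] <- 0                           # or replace the NAs with 0

# 2. on an already-loaded data frame, drop rows with "#NULL!" in a given column
Props_ <- Props_[Props_$pct_vacant != "#NULL!", ]    # note the comma: keep whole rows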
[R] Trouble with If statements
Hey everyone,
Consider the code below. I would like to look up each of the items in "data" and store the result in "BinStore". In this example it isn't storing any value but the last. In my actual code I am getting another error altogether; I get

1: In if (VacAcresVals.CandTaz <= 4) (BinNumber <- 1) ... :
   the condition has length > 1 and only the first element will be used
2: In if (VacAcresVals.CandTaz > 651) (BinNumber <- 10) ... :
   the condition has length > 1 and only the first element will be used

for each iteration. Thoughts about how to fix each of these issues? Thanks guys.

data = c(1034.06001, 102.6600, 219.92000, 306.16001, 134.1, 21.13999, 363.08999,
         337.27000, 498.43999, 429.28000, 234.08000, 51.82000, 148.68999, 116.83999,
         14.33000, 40.46001, 59.0, 67.43000, 60.88999, 12.31001, 43.5,
         128.37000, 241.9, 223.77000, 159.88000, 45.63000, 235.43999, 414.28999,
         75.05000, 621.48999, 148.92000, 814.66001, 272.68000, 108.98000, 49.05000,
         20.16001, 33.13999, 222.72000, 677.52000, 209.53999, 511.7, 584.88000,
         143.12000, 726.7, 472.43000, 88.56001, 89.51001, 97.88999, 573.72000,
         176.36001, 196.21001, 267.63000, 325.37000, 421.75000, 76.41001, 113.38999,
         31.7, 35.78999, 76.62000, 94.58999, 140.92000, 80.16001, 471.78000,
         78.53999, 341.48999, 179.12000, 98.83999, 245.38999, 83.37000, 523.81001,
         799.22000, 578.53999, 246.01001, 321.31001, 489.63999, 523.53000, 684.7,
         1262.2, 937.9, 36.11001, 101.76001, 25.52000, 77.47000, 49.7,
         104.53999, 20.5, 18.96001, 14.31001)

BinStore = list()

for (i in 1:length(data)) {
  IterData = data[i]

  if (IterData <= 4) (BinNumber <- 1)
  if (IterData > 4 && IterData <= 7) (BinNumber <- 2)
  if (IterData > 7 && IterData <= 17) (BinNumber <- 3)
  if (IterData > 17 && IterData <= 28) (BinNumber <- 4)
  if (IterData > 28 && IterData <= 50) (BinNumber <- 5)
  if (IterData > 50 && IterData <= 91) (BinNumber <- 6)
  if (IterData > 91 && IterData <= 151) (BinNumber <- 7)
  if (IterData > 151 && IterData <= 341) (BinNumber <- 8)
  if (IterData > 341 && IterData <= 651) (BinNumber <- 9)
  if (IterData > 651) (BinNumber <- 10)

  BinStore[[i]] = BinNumber
}
BinStore
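For reference, the same binning can be done without a loop or any if() calls, which also avoids the "condition has length > 1" warning (that warning comes from handing a whole vector such as VacAcresVals.CandTaz to if()). A sketch using cut() with the same (a, b] boundaries as the if() chain:

breaks <- c(-Inf, 4, 7, 17, 28, 50, 91, 151, 341, 651, Inf)
BinNumber <- as.integer(cut(data, breaks))
BinNumber[1:5]   # 10 7 8 8 7 for the data above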
Re: [R] Trouble with If statements
Alright guys, I figured it out, no need to reply. Sorry for any bother...

PDXRugger wrote:
> Hey everyone,
> Consider the code below. I would like to look up each of the items in
> "data" and store the result in "BinStore". In this example it isn't
> storing any value but the last. In my actual code I am getting another
> error altogether; I get
>
> 1: In if (VacAcresVals.CandTaz <= 4) (BinNumber <- 1) ... :
>    the condition has length > 1 and only the first element will be used
> 2: In if (VacAcresVals.CandTaz > 651) (BinNumber <- 10) ... :
>    the condition has length > 1 and only the first element will be used
>
> for each iteration. Thoughts about how to fix each of these issues?
> Thanks guys.
>
> [...]
>
> BinStore
[R] truncating values into separate categories
Hi all,
A simple question which I thought I had the answer to, but it isn't so simple for some reason. I am sure someone can easily help. I would like to categorize the values in NP into one of the five values in "Per", with the last category ("4") representing values >= 4 (hence 4:max(NP)). The problem is that R is reading max(NP) as multiple values instead of a range, so the lengths of the labels and the breaks do not match. Suggestions?

Per <- c("NA", "1", "2", "3", "4")

NP = c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5, 3, 3, 1, 5, 3, 5, 1, 6, 1, 2, 2, 2,
       4, 4, 1, 2, 1, 3, 3, 1, 2, 2, 1, 2, 1, 2, 2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)

Person_CAT <- cut(NP, breaks = c(0, 1, 2, 3, 4:max(NP)), labels = Per)
Re: [R] truncating values into separate categories
I must apologize, as I wasn't clear about what I wanted to happen. I don't want to count the occurrences but rather recode them. I am trying to replace all of the values with the new coded values in Person_CAT. So with

NP <- c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5,
        3, 3, 1, 5, 3, 5, 1, 6, 1, 2, 2, 2,
        4, 4, 1, 2, 1, 3, 3, 1, 2, 2, 1, 2, 1, 2,
        2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)

Person_CAT would be: 1, 1, 2, 1, 1, 2, 2, 1, 4, 1, NA, 4, and so on. This task would easily be done in SPSS, but I am trying to automate it using R. I hope this is clearer.

Bill.Venables wrote:
> Here is a suggestion:
>
>> Per <- c("NA", "1", "2", "3", "4")
>> NP <- c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5,
> +         3, 3, 1, 5, 3, 5, 1, 6, 1, 2, 2, 2,
> +         4, 4, 1, 2, 1, 3, 3, 1, 2, 2, 1, 2, 1, 2,
> +         2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)
>> Person_CAT <- cut(NP, breaks = c(0:4, Inf) - 0.5, labels = Per)
>> table(Person_CAT)
> Person_CAT
> NA  1  2  3  4
>  1 19 15  6  9
>
> You should be aware, though, that items corresponding to the level "NA"
> will NOT be treated as missing.
>
> Bill Venables
> http://www.cmis.csiro.au/bill.venables/
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of PDXRugger
> Sent: Friday, 31 July 2009 9:54 AM
> To: r-help@r-project.org
> Subject: [R] truncating values into separate categories
>
> [...]
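For the recode itself (rather than a table of counts), the cut() call suggested above already does it; converting the factor back to values gives the replacement vector. A sketch:

Person_CAT <- cut(NP, breaks = c(0:4, Inf) - 0.5, labels = Per)
NP_recode  <- as.character(Person_CAT)
head(NP_recode, 12)   # "1" "1" "2" "1" "1" "2" "2" "1" "4" "1" "NA" "4"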
[R] Removing initial numerals
I would like to recreate "data" so that only the last 5 digits of each of the values below are included, so 200502019 would become 02019. Any ideas?

data = c(200500735, 200502019, 200504131, 200504217, 200504629, 200504822,
         200510115, 200511605, 200514477, 200515314, 200515438, 200519040,
         200519603, 200522735, 200522853, 200523415, 200524227, 200524423)
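Two sketches, depending on whether the result should stay numeric or keep the leading zero as text:

data %% 1e5                                              # numeric: 735, 2019, 4131, ...
formatC(data %% 1e5, width = 5, flag = "0", format = "d")  # character: "00735", "02019", ...
substr(as.character(data), 5, 9)                         # same, by position in the string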
[R] Truncating based on attribute range and serial no
Consider the following:

Age <- c(48, 57, 56, 76, 76, 66, 70, 14, 7, 3, 62, 62, 30, 10, 7, 53, 44, 29, 46, 47, 15, 13, 84, 77, 26)

SerialNo <- c(001147, 005979, 005979, 006128, 006128, 007004, 007004, 007004, 007004, 007004, 007438, 007438, 009402, 009402, 009402, 012693, 012693, 012693, 014063, 014063, 014063, 014063, 014811, 014811, 016570)

TestSet <- cbind(Age, SerialNo)
TestSet <- data.frame(TestSet)

I am looking to create a third column titled "IsHead". This column would be either TRUE or FALSE depending on whether the Age value is the greatest for that set of SerialNo's. So for the above I would return:

Age  SerialNo  IsHead
48   1147      TRUE
57   5979      TRUE
56   5979      FALSE
76   6128      TRUE
76   6128      FALSE
66   7004      FALSE
70   7004      TRUE
14   7004      FALSE
7    7004      FALSE
3    7004      FALSE

I am thinking this is simple but cannot get my own code to work. Thanks for any insights.

JR
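A sketch using ave(): within each SerialNo it flags the first occurrence of the maximum Age, so a tie such as the two 76s marks only one row TRUE:

TestSet$IsHead <- as.logical(ave(TestSet$Age, TestSet$SerialNo,
                                 FUN = function(x) seq_along(x) == which.max(x)))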
[R] Referencing columns and pulling selected data
Please consider the following inputs:

PrsnSerialno <- c(735, 1147, 2019, 4131, 4131, 4217, 4629, 4822, 4822, 5979, 5979, 6128, 6128, 7004, 7004, 7004, 7004, 7004, 7438, 7438, 9402, 9402, 9402, 10115, 10115, 11605, 12693, 12693, 12693)

PrsnAge <- c(59, 48, 42, 24, 24, 89, 60, 43, 47, 57, 56, 76, 76, 66, 70, 14, 7, 3, 62, 62, 30, 10, 7, 20, 21, 50, 53, 44, 29)

IsHead <- c(TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE)

PrsnData <- cbind(PrsnSerialno, PrsnAge, IsHead)

HhSerialno <- c(735, 1147, 2019, 4131, 4217, 4629, 4822, 5979, 6128, 7004, 7438, 9402, 10115, 11605, 12693)

HhData <- cbind(HhSerialno)

What I would like to do is add an age column to HhData that corresponds to the serial number and is also the oldest person in the house, i.e. the row that corresponds to "TRUE" (which designates the oldest person). The TRUE/FALSE doesn't have to be used, but it is preferable. The result would then be:

HhSerialno  HhAge
735         59
1147        48
2019        42
4131        24
4217        89
4629        60
4822        47
5979        57
6128        76
7004        70
7438        62
9402        30
10115       21
11605       50
12693       53

I tried

PumsHh..$Age <- PumsPrsn[PumsPrsn$SERIALNO == PumsHh..$Serialno, PumsPrsn$AGE]

but because the data frames are of different lengths it doesn't work, so I'm unsure of another way of doing this. Thanks in advance.

JR
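A sketch using match(); it pulls the age of the row flagged IsHead for each household serial number. Note that cbind() makes PrsnData a matrix (IsHead gets stored as 0/1 there), so the sketch works from the original vectors instead:

heads <- data.frame(PrsnSerialno, PrsnAge)[IsHead, ]
HhData <- data.frame(HhSerialno,
                     HhAge = heads$PrsnAge[match(HhSerialno, heads$PrsnSerialno)])

# if the IsHead flags cannot be trusted, the maximum age per household works too:
HhAge <- tapply(PrsnAge, PrsnSerialno, max)[as.character(HhSerialno)]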
[R] Reading and analyzing a word document
file=("LUSDR/letter.doc") Howdy Y'all, So i am looking to read a word document in the following formats(.doc) or any type of accessible word processor software (e.g. text .txt, notepad, etc). Had the ability to search certain words, for instance "banana", "peacock","Weapons" "Mass" "Destruction". Then i could summarize and view the results. i looked and the only thing i could find was the below where i want to analyze "letter.doc" and look for the words mentioned in quotes above. Its aparently wrong but im wondering if this is even possible. Please advise. Thanks In Solidarity JR cat"banana", "peacock","Weapons" "Mass" "Destruction" file=("letter.doc"),sep="\n") readLines(file, n=-1) unlink("letter.doc") # tidy up ## difference in blocking cat("123\nabc", file = "test1") readLines("test1") # line with a warning a=con <- file("test1", "r", blocking = FALSE) readLines(con) # empty cat(" def\n", file = "test1", append = TRUE) readLines(con) # gets both close(con) unlink("test1") # tidy up -- View this message in context: http://www.nabble.com/reading-and-analyzing-a-word-document-tp25691972p25691972.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reading and analyzing a word document
Considering your instructions:

# Define words to find
to.find <- c('the', 'is', 'are', 'dr')

# Read in the file
file.text <- readLines('data/letter.txt')

# Count the number of occurrences of the defined words in the text
line.matches <- unlist(lapply(to.find, grep, x = unlist(file.text[2])))

Result:

> line.matches
[1] 1 1 1

This is not right, of course, as there are actually four words, and secondly because the searched words appear multiple times. I think the problem is that file.text is coming in so that

file.text[2] <- "\tHello sir, I write to you seeking your guidance organizing some data. I have a ."

So it is reading the document; it's just putting it into this type of format. I'm stuck. I tried doing it by saving the doc to a csv and searching strings, and tried using a match process. It would also be useful to simply get a rundown, similar to a summary, expressing the most common words. Ideas?

cls59 wrote:
> PDXRugger wrote:
>> [...]
>
> Well... you could make a vector of the words you want to find:
>
> to.find <- c('banana', 'peacock', 'Weapons')
>
> Read in the file...
>
> file.text <- readLines('myFile.txt')
>
> And recursively apply the grep command in order to determine which lines
> contain matches for your words:
>
> line.matches <- unlist(lapply(to.find, grep, x = file.text))
>
> It may do what you want for plain text files, as for Microsoft Word
> files... well...
>
> Sometimes there is a price to pay for using a closed proprietary binary
> document format.
>
> Good luck!
>
> -Charlie
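grep() only reports which lines match, not how many times each word occurs. A sketch that splits the text into words and tabulates them (the file path is the one used above; the split pattern is a simple assumption about what counts as a word):

file.text <- readLines("data/letter.txt")
words <- unlist(strsplit(tolower(paste(file.text, collapse = " ")), "[^a-z']+"))
to.find <- c("the", "is", "are", "dr")
table(factor(words, levels = to.find))           # count of each searched word
head(sort(table(words), decreasing = TRUE), 10)  # the most common words overall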
[R] text mining
The following code is derived from a paper titled "Text Mining Infrastructure in R" (http://www.jstatsoft.org/v25/i05/paper). The example below seems to load some default documents for analysis, some sort of Latin text. I cannot for the life of me figure out how to load my own document, let alone an entire corpus. I have searched the above document as well as the related documentation. Any leads or help would be appreciated. Thanks everyone.

From the document:

txt <- system.file("texts", "txt", package = "tm")
(ovid <- Corpus(DirSource(txt),
                readerControl = list(reader = readPlain, language = "la", load = TRUE)))

My attempt:

txt <- system.file("Speeches/speech", "txt", package = "tm")
(ovid <- Corpus(DirSource(txt),
                readerControl = list(reader = readPlain, language = "la", load = TRUE)))
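system.file() only looks up paths inside an installed package (here the texts/txt folder shipped with tm), which is why pointing it at your own folder returns nothing useful. For your own corpus the directory path goes straight into DirSource(). A sketch, assuming a folder of plain-text files at a made-up location and the tm interface used in the paper:

speeches <- Corpus(DirSource("C:/Speeches"),
                   readerControl = list(reader = readPlain, language = "en", load = TRUE))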
[R] Selecting initial numerals
I just want to create a new object with the first two numerals of the data. Not sure why this isn't working; consider the following:

EmpEst$naics = c(238321, 624410, 484121, 238911, 81, 531110, 621399, 541613, 524210, 236115, 811121, 236115, 236115, 621610, 814110, 812320)

EmpEst$naics2 <- formatC(EmpEst$naics %% 1e2, width = 2, flag = "", mode = "integer")
# RESULT: Warning message:
# In Ops.factor(EmpEst$naics, 100) : %% not meaningful for factors

EmpEst$naics2 <- format(EmpEst$naics, trim = FALSE, digits = 2, width = 2)
# RESULT: changes data to string

I know this is super simple, but I'm not sure why I get the error in the first try. Help with some explanation would be appreciated, thanks.

Cheers
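The warning says EmpEst$naics is a factor (it was probably read in that way), so arithmetic such as %% is refused; and %% 1e2 would give the last two digits rather than the first two anyway. A sketch:

naics <- as.numeric(as.character(EmpEst$naics))                 # factor -> numeric
EmpEst$naics2 <- as.integer(substr(as.character(naics), 1, 2))  # first two digits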
[R] Problems with rJava and tm packages
I am looking to do some text analysis using R and have run into some issues with some of the packages. I'm not sure if it's my goofy Vista OS or what, but using R 2.8.1 I was relatively successful loading the text, though the rJava package was messed up somehow:

library(tm)
> library(rJava)
Error in if (!nchar(javahome)) stop("JAVA_HOME is not set and could not be determined from the registry") :
  argument is of length zero
In addition: Warning message:
package 'rJava' was built under R version 2.9.1
Error : .onLoad failed in 'loadNamespace' for 'rJava'
Error: package/namespace load failed for 'rJava'
>
> #Set documents directory
> DIR <- "G:/TextSearch/Speeches"
>
> #Load corpus
> speech <- Corpus(DirSource(DIR), readerControl = list(reader = readPlain,
+   language = "en_US", load = TRUE))
>
> #Remove stopwords
> speech <- tmMap(speech, stripWhitespace)
> speech
A corpus with 2 text documents
> tdm <- TermDocumentMatrix(speech)
Error in if (!nchar(javahome)) stop("JAVA_HOME is not set and could not be determined from the registry") :
  argument is of length zero
Error: .onLoad failed in 'loadNamespace' for 'rJava'

So the initial question is: what's going on with the rJava package? I get the same error when I try to load the package and again when I try to use a function that depends on it. I tried installing 2.9.2 and ran into more problems when running these lines:

> utils:::menuInstallPkgs()
Warning: package 'tm' is in use and will not be installed
> speech <- tmMap(speech, stripWhitespace)
Error: could not find function "tmMap"

The package is installed correctly, but this version of R is not able to pick it up. Again, I'm not sure if it's something with Vista or what. Thanks guys and gals.

Cheers,
JR
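The rJava error means R cannot find a Java installation: JAVA_HOME is neither set nor discoverable from the registry. A possible workaround is to install a Java runtime that matches the R build (32-bit vs 64-bit) and point JAVA_HOME at it before loading rJava; the path below is only an example of where a Windows JRE might live:

Sys.setenv(JAVA_HOME = "C:/Program Files/Java/jre6")  # hypothetical path, adjust to your JRE
library(rJava)

The "could not find function tmMap" message under the newer R is probably a separate issue: later releases of the tm package renamed tmMap() to tm_map(), so the call would become tm_map(speech, stripWhitespace).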
[R] Reading and Creating Shape Files
Hello R Community,
I have imported a data set which contains X/Y coordinates and would like to recreate a shapefile after some data analysis. What I have done is import some taxlot data and join the records based on some criteria. I want to check how well the joining went by reviewing the results in GIS. A couple of things. I can't seem to import a shapefile correctly using the maptools package and readShapeSpatial. I have tried:

Building = file("data/input/BuildingShape/Building.shp")
Bldg <- readShapeSpatial(fn = data/input/BuildingShape/Building, proj4string = NAD83)
#--
Bldg <- readShapeSpatial(data/input/BuildingShape/Building, proj4string = NAD83)
#---
Building = file("data/input/BuildingShape/Building.shp")
Bldg <- readShapeSpatial(Building, proj4string = NAD83)

I know I am misinterpreting the documentation, but it doesn't seem like it should be very complicated, so I am of course confused. Also, I am wondering if I can create a shapefile by simply using X/Y coordinates from a data frame. So for:

Ycoord = c(865296.4, 865151.5, 865457.0, 865363.4, 865311.0, 865260.9, 865210.7, 865173.3, 865123.6, 865038.2, 864841.1, 864745.4, 864429.1, 864795.6, 864334.9, 864882.0)
Xcoord = c(4227640, 4227816, 4228929, 4228508, 4229569, 4229498, 4226747, 4226781, 4229597, 4229204, 4228910, 4228959, 4229465, 4229794, 4229596, 4229082)
Lot <- c(1900, 2000, 2100, 100, 200, 300, 400, 500, 600, 701, 900, 1000, 1100, 300, 100, 200)

XYcoord <- spCbind(Ycoord, Xcoord)  # doesn't work, so
XYcoord = c(Ycoord, Xcoord)
TaxLots <- cbind(Ycoord, Xcoord, Lot)

writeSpatialShape(XYcoord, TaxLots.., file = data/input/test/Taxlots, strictFilename = FALSE)

So: help reading in shapefiles and then creating them from X/Y coordinates, if possible. Any help would be appreciated. Thank you.
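A sketch of both steps with sp/maptools. The file paths are quoted strings (they were bare names in the attempts above), and the coordinate reference system is an assumption; substitute whatever projection the taxlot data actually uses:

library(sp)
library(maptools)

# reading: pass the path as a string (no file() connection) and a CRS object
nad83 <- CRS("+proj=longlat +datum=NAD83")   # assumed CRS, replace with the real projection
Bldg  <- readShapeSpatial("data/input/BuildingShape/Building", proj4string = nad83)

# writing: build points from the X/Y columns, then write them out
TaxLots <- SpatialPointsDataFrame(coords = cbind(Xcoord, Ycoord),
                                  data = data.frame(Lot),
                                  proj4string = nad83)
writeSpatialShape(TaxLots, "data/input/test/Taxlots")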
[R] Summing identical IDs
Hello All,
I would like to select records with identical IDs, sum an attribute over them, and return them to the data frame as a single record. Please consider:

Acres <- c(100, 101, 100, 130, 156, .5, 293, 300, .09)
Bldgid <- c(1, 2, 3, 4, 5, 5, 6, 7, 7)

DF = cbind(Acres, Bldgid)
DF <- as.data.frame(DF)

So that:

   Acres  Bldgid
1  100.00 1
2  101.00 2
3  100.00 3
4  130.00 4
5  156.00 5
6    0.50 5
7  293.00 6
8  300.00 7
9    0.09 7

becomes

   Acres  Bldgid
1  100.00 1
2  101.00 2
3  100.00 3
4  130.00 4
5  156.50 5
7  293.00 6
8  300.09 7

dup <- unique(DF$Bldgid[duplicated(Bldgid)])
dupbuild <- DF[DF$Bldgid %in% dup, ]
dupbuild..dupareasum <- sum(dupbuild$Acres[duplicated(dupbuild$Bldgid)])

This sums the unique IDs of the duplicated records, which is not what I want. Thanks ahead of time.

JR
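One base-R sketch that collapses the duplicated ids to one sum per Bldgid in a single call:

rowsum(DF$Acres, DF$Bldgid)   # one row per Bldgid, Bldgid as the row name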
Re: [R] Summing identical IDs
Terrific help, thank you.

dupbuild <- aggregate(DF$Acres, list(Bldgid), sum)

This line worked best. Now I'm going to challenge everyone (I think?). Consider the following:

Acres <- c(100, 101, 100, 130, 156, .5, 293, 300, .09, 100, 12.5)
Bldgid <- c(1, 2, 3, 4, 5, 5, 6, 7, 7, 8, 8)
Year <- c(1946, 1952, 1922, 1910, 1955, 1955, 1999, 1990, 1991, 2000, 2000)
ImpValue <- c(1000, 1400, 1300, 900, 5000, 1200, 500, 1000, 300, 1000, 1000)
DF = cbind(Acres, Bldgid, Year, ImpValue)
DF <- as.data.frame(DF)

I would like to do the same, except there are some rules I want to follow. I only want to aggregate the Acres if:
a) The Years are not identical
b) The ImpValues are not identical
c) The Years are identical and the ImpValues are not
d) The ImpValues are identical and the Years are not

but if the Acres and ImpValues are identical I would still like to add the Acres together and form one case. If the cases are put together I would also like to add the ImpValues together. So the below

    Acres  Bldgid Year ImpValue
1   100.00 1      1946 1000
2   101.00 2      1952 1400
3   100.00 3      1922 1300
4   130.00 4      1910 900
5   156.00 5      1955 5000
6     0.50 5      1955 1200
7   293.00 6      1999 500
8   300.00 7      1990 1000
9     0.09 7      1991 300
10  100.00 8      2000 1000
11   12.50 8      2000 1000

would become

    Acres  Bldgid Year ImpValue
1   100.00 1      1946 1000
2   101.00 2      1952 1400
3   100.00 3      1922 1300
4   130.00 4      1910 900
5   156.50 5      1955 6200
7   293.00 6      1999 500
8   300.09 7      1990 1300
10  112.50 8      2000 1000

Thanks, I gave it a bunch of shots but nothing worth posting.

PDXRugger wrote:
> Hello All,
> I would like to select records with identical IDs, sum an attribute over
> them, and return them to the data frame as a single record.
>
> [...]
>
> This sums the unique IDs of the duplicated records, which is not what I want.
> Thanks ahead of time
>
> JR
Re: [R] Summing identical IDs
David,
You are correct. I think the first two assumptions can be thrown out and only the latter two (c, d) need to be considered. So how would I combine Acres for matching Bldgids based on assumptions c and d?

David Winsemius wrote:
> On Oct 29, 2009, at 5:23 PM, PDXRugger wrote:
>> Terrific help, thank you.
>> dupbuild <- aggregate(DF$Acres, list(Bldgid), sum)
>> This line worked best.
>>
>> Now I'm going to challenge everyone (I think?)
>>
>> [...]
>
> As I review your Boolean logic, I run into serious problems.
>
> c) and d) cannot be true if a) and b) are true.
>
> So no cases satisfy all 4 specs. In particular, both of the pairs you
> say you want aggregated (5+6 and 10+11) violate rule a), and the
> second pair also violates b).
>
> --
> David
[R] Summing rows based on criteria
I am attempting to clean up some land-use building data and need to join some buildings together, making sure not to double count GIS slivers. The first data frame is the original; the second adds all the acres for each identical Bldgid. I now want to a) throw out all but one of the cases where the Year and ImpValue are identical, and b) sum the ImpValues when:
1) the Years are identical and the ImpValues are not, or
2) the ImpValues are identical and the Years are not,
resulting in the third data frame. Please consider the following:

Acres <- c(100, 101, 100, 130, 156, .5, 293, 300, .09, 100, 12.5)
Bldgid <- c(1, 2, 3, 4, 5, 5, 6, 7, 7, 8, 8)
Year <- c(1946, 1952, 1922, 1910, 1955, 1955, 1999, 1990, 1991, 2000, 2000)
ImpValue <- c(1000, 1400, 1300, 900, 5000, 1200, 500, 9000, 9000, 1000, 1000)
DF = cbind(Acres, Bldgid, Year, ImpValue)
DF <- as.data.frame(DF)
DF

    Acres  Bldgid Year ImpValue
1   100.00 1      1946 1000
2   101.00 2      1952 1400
3   100.00 3      1922 1300
4   130.00 4      1910 900
5   156.00 5      1955 5000
6     0.50 5      1955 1200
7   293.00 6      1999 500
8   300.00 7      1990 9000
9     0.09 7      1991 9000
10  100.00 8      2000 1000
11   12.50 8      2000 1000

# Aggregate acres where the ids are identical
dupbuild <- aggregate(DF$Acres, DF["Bldgid"], sum)
colnames(dupbuild)[2] <- "Acres"

# Add the aggregated acres back to DF
DF$Acres <- dupbuild$Acres[match(DF$Bldgid, dupbuild$Bldgid)]
DF

    Acres  Bldgid Year ImpValue
1   100.00 1      1946 1000
2   101.00 2      1952 1400
3   100.00 3      1922 1300
4   130.00 4      1910 900
5   156.50 5      1955 5000
6   156.50 5      1955 1200
7   293.00 6      1999 500
8   300.09 7      1990 9000
9   300.09 7      1991 9000
10  112.50 8      2000 1000
11  112.50 8      2000 1000

# desired outcome data frame
    Acres  Bldgid Year ImpValue
1   100.00 1      1946 1000
2   101.00 2      1952 1400
3   100.00 3      1922 1300
4   130.00 4      1910 900
5   156.50 5      1955 6200    # combined 5 & 6
7   293.00 6      1999 500
8   300.09 7      1990 18000   # combined 8 & 9
10  112.50 8      2000 1000    # one case thrown out

So in this case the ImpValues are added together for rows 5 & 6 (from the second data frame) because the Years are identical and the ImpValues are not; rows 8 and 9 have their ImpValues summed because their ImpValues are identical but the Years are not; and one of the cases in rows 10 & 11 is thrown out because they have identical Years and ImpValues. When rows are joined, the Year value is no longer important, but to remain consistent I would like to keep the earliest (lowest) year. There will be instances in the actual data where there are more than two cases to consider, if that makes any coding difference; I didn't include any in my example data. It would also be useful to include a new column keeping track of how many Bldgids were joined. I think I can figure that one out, though. Hope this is all clear. Thanks for the guidance and insights.

JR
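A sketch of the collapse, assuming the rules reduce to: within each Bldgid, count an ImpValue only once when both Year and ImpValue duplicate another row, otherwise sum; keep the (already aggregated) Acres and the earliest Year:

collapse <- function(d) {
  keep <- !duplicated(d[c("Year", "ImpValue")])   # identical Year + ImpValue pairs count once
  data.frame(Acres    = d$Acres[1],               # Acres were already summed per Bldgid above
             Bldgid   = d$Bldgid[1],
             Year     = min(d$Year),
             ImpValue = sum(d$ImpValue[keep]),
             nJoined  = nrow(d))                  # how many original rows went into this one
}
out <- do.call(rbind, lapply(split(DF, DF$Bldgid), collapse))
out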
[R] Problems installing on Suse 10.3 x86_64 OS
Alright, I am unsure of the posting rules for these types of questions, but I will be as helpful as possible. My Windows-based system can't handle a model I am running, so I am trying to install R on our Linux-based machine, but I have encountered the following. I don't know Linux well, but my intuition is that I need to install some other files first. Any thoughts?

YaST2 conflicts list - generated 2009-05-14 12:20:48

No valid solution found with just resolvables of best architecture.
With this run only resolvables with the best architecture have been regarded.
Regarding all possible resolvables takes time, but can come to a valid result.
Conflict Resolution:
( ) Make a solver run with ALL possibilities.

R-base cannot be installed due to missing dependencies
There are no installable providers of libtcl8.5.so()(64bit) for R-base-2.9.0-2.1.x86_64[home:dsteuer]
=== R-base-2.9.0-2.1.x86_64[home:dsteuer] ===
R-base-2.9.0-2.1.x86_64[home:dsteuer] will be installed by another application. (ApplLow/ApplHigh)
pango-1.18.2-4.x86_64 is needed by R-base-2.9.0-2.1.x86_64[home:dsteuer] (libpangocairo-1.0.so.0()(64bit))
xorg-x11-libXext-7.2-65.x86_64 is needed by R-base-2.9.0-2.1.x86_64[home:dsteuer] (libXext.so.6()(64bit))
25 more...
Conflict Resolution:
( ) do not install R-base
( ) Ignore this requirement just here

YaST2 conflicts list END

I'm not sure what other info to provide. Hope you guys can help.

Cheers,
JR
[R] loading and manipulating 10 data frames-simplified
I have to load 10 different data frames and then manipulate those 10 data frames, but I would like to do this with more streamlined code than what I have. I have tried a couple of approaches but cannot get them to work correctly. So the initial (bulky) code is:

#Bin 1
#---
#Loads bin data frame from csv file with acres and TAZ data
Bin1_main <- read.csv(file="I:/Research/Samba/urb_transport_modeling/LUSDR/Workspace/BizLandPrice/data/Bin_lookup_values/Bin1_lookup.csv", header=FALSE)
#Separates Acres data from main data and converts acres to square feet
Bin1_Acres = Bin1_main[[1]]*43560
#Separates TAZ data from main data
Bin1_TAZ = Bin1_main[[2]]
#Separates TAZ vacant data from main data and converts acres to square feet
Bin1_TAZvacant = Bin1_main[[3]]*43560
#Sums each parcel acreage data of the bin
Bin1Acres_sum = sum(Bin1_Acres)
#Creates data frame of cumulative percentages of each parcel of bin
Bin1_cumper = cumsum(Bin1_Acres/Bin1Acres_sum)
#Calculates the probability of choosing a particular parcel from the bin
Bin1_parprob = abs(1-Bin1_cumper)
#Combines parcel acreage data and cumulative percentage data
Bin1Main.data = cbind(Bin1_Acres, Bin1_parprob, Bin1_TAZ, Bin1_TAZvacant)

#Bin 2
#---
#Loads bin data frame from csv file with acres and TAZ data
Bin2_main <- read.csv(file="I:/Research/Samba/urb_transport_modeling/LUSDR/Workspace/BizLandPrice/data/Bin_lookup_values/Bin2_lookup.csv", header=FALSE)
#Separates Acres data from main data and converts acres to square feet
Bin2_Acres = Bin2_main[[1]]*43560
#Separates TAZ data from main data
Bin2_TAZ = Bin2_main[[2]]
#Separates TAZ vacant data from main data and converts acres to square feet
Bin2_TAZvacant = Bin2_main[[3]]*43560
#Sums each parcel acreage data of the bin
Bin2Acres_sum = sum(Bin2_Acres)
#Creates data frame of cumulative percentages of each parcel of bin
Bin2_cumper = cumsum(Bin2_Acres/Bin2Acres_sum)
#Calculates the probability of choosing a particular parcel from the bin
Bin2_parprob = abs(1-Bin2_cumper)
#Combines parcel acreage data and cumulative percentage data
Bin2Main.data = cbind(Bin2_Acres, Bin2_parprob, Bin2_TAZ, Bin2_TAZvacant)

#Bin 3
#---
#Loads bin data frame from csv file with acres and TAZ data
Bin3_main <- read.csv(file="I:/Research/Samba/urb_transport_modeling/LUSDR/Workspace/BizLandPrice/data/Bin_lookup_values/Bin3_lookup.csv", header=FALSE)
#Separates Acres data from main data and converts acres to square feet
Bin3_Acres = Bin3_main[[1]]*43560
#Separates TAZ data from main data
Bin3_TAZ = Bin3_main[[2]]
#Separates TAZ vacant data from main data and converts acres to square feet
Bin3_TAZvacant = Bin3_main[[3]]*43560
#Sums each parcel acreage data of the bin
Bin3Acres_sum = sum(Bin3_Acres)
#Creates data frame of cumulative percentages of each parcel of bin
Bin3_cumper = cumsum(Bin3_Acres/Bin3Acres_sum)
#Calculates the probability of choosing a particular parcel from the bin
Bin3_parprob = abs(1-Bin3_cumper)
#Combines parcel acreage data and cumulative percentage data
Bin3Main.data = cbind(Bin3_Acres, Bin3_parprob, Bin3_TAZ, Bin3_TAZvacant)

#Bin 4
#---
#Loads bin data frame from csv file with acres and TAZ data
Bin4_main <- read.csv(file="I:/Research/Samba/urb_transport_modeling/LUSDR/Workspace/BizLandPrice/data/Bin_lookup_values/Bin4_lookup.csv", header=FALSE)
#Separates Acres data from main data and converts acres to square feet
Bin4_Acres = Bin4_main[[1]]*43560
#Separates TAZ data from main data
Bin4_TAZ = Bin4_main[[2]]
#Separates TAZ vacant data from main data and converts acres to square feet
Bin4_TAZvacant = Bin4_main[[3]]*43560
#Sums each parcel acreage data of the bin
Bin4Acres_sum = sum(Bin4_Acres)
#Creates data frame of cumulative percentages of each parcel of bin
Bin4_cumper = cumsum(Bin4_Acres/Bin4Acres_sum)
#Calculates the probability of choosing a particular parcel from the bin
Bin4_parprob = abs(1-Bin4_cumper)
#Combines parcel acreage data and cumulative percentage data
Bin4Main.data = cbind(Bin4_Acres, Bin4_parprob, Bin4_TAZ, Bin4_TAZvacant)

#Bin 5
#---
#Loads bin data frame from csv file with acres and TAZ data
Bin5_main <- read.csv(file="I:/Research/Samba/urb_transport_modeling/LUSDR/Workspace/BizLandPrice/data/Bin_lookup_values/Bin5_lookup.csv", header=FALSE)
#Separates Acres data from main data and converts acres to square feet
Bin5_Acres = Bin5_main[[1]]*43560
#Separates TAZ data from main data
Bin5_TAZ = Bin5_main[[2]]
#Separates TAZ vacant data from main data and converts acres to square feet
Bin5_TAZvacant = Bin5_main[[3]]*43560
#Sums each parcel acreage data of the bin
Bin5Acres_sum = sum(Bin5_Acres)
#Creates data frame of cumulative percentages of each parcel of bin
Bin5_cumper = cumsum(Bin5_Acres/Bin5Acres_sum)
#Calculates the probability of choosing a particular parcel from the bin
Bin5_parprob = abs(1-Bin5_cumper)
#Combines parcel acreage data and cumulative percentage data
Bin5Main.data = cbind(Bin5_Acres, Bin5_parprob, Bin5_TAZ, Bin5_TAZvacant)

#Bin 6
#---
#Loads bin data frame from csv file with acres and TAZ data
Bin6_main <- read.csv(file="I:/Research/Samb
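A sketch of one way to collapse the ten blocks into a single loop; it assumes all ten files live in the same Bin_lookup_values folder and share the three-column layout used above:

bin.dir <- "I:/Research/Samba/urb_transport_modeling/LUSDR/Workspace/BizLandPrice/data/Bin_lookup_values"
BinMain.data <- lapply(1:10, function(i) {
  b <- read.csv(file.path(bin.dir, paste("Bin", i, "_lookup.csv", sep = "")), header = FALSE)
  Acres     <- b[[1]] * 43560                       # acres -> square feet
  TAZ       <- b[[2]]
  TAZvacant <- b[[3]] * 43560
  parprob   <- abs(1 - cumsum(Acres) / sum(Acres))  # probability of choosing each parcel
  cbind(Acres, parprob, TAZ, TAZvacant)
})
# BinMain.data[[1]] corresponds to Bin1Main.data, BinMain.data[[2]] to Bin2Main.data, and so on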
[R] Removing unwanted double values in a list
I have a procedure that sorts out some numbers based on specified criteria, and for some reason the resulting list contains two values in some of its elements, such as:

TAZs <-
[[84]]
[1] 638

[[85]]
[1] 643

[[86]]
[1] 644 732

[[87]]
[1] 651 801

I would like to check the list TAZs for double values and remove any that are present. I have tried

if (length(TAZDs == 2)) rm(TAZDs[2])

but no luck. I can't find nor think of another way. Any help would be appreciated. Thanks in advance.
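Two sketches, depending on whether the extra values should be dropped or the whole element discarded; length() gives the number of values in each list element, which is what the if() above was reaching for:

TAZs <- lapply(TAZs, function(x) x[1])     # keep only the first value of each element
TAZs <- TAZs[sapply(TAZs, length) == 1]    # or drop elements holding more than one value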
[R] subscript out of bounds
Consider:

if (BinNumber == 1)
  Loc_Prob = Bin1Main.data[findInterval(Dev_Size, Bin1Main.data[, 1]) + 1, 2]
.
..
...
if (BinNumber == 10)
  Loc_Prob = Bin10Main.data[findInterval(Dev_Size, Bin10Main.data[, 1]) + 1, 2]

BinNumber is just referencing 1 of 10 sets of data. So where BinNumber = 1 and Dev_Size = 16 and the tail of Bin1Main.data is

[946,]  75407.58720 5.777982e-02 240 1.089000e+05
[947,]  86986.83708 4.762052e-02 833 9.713880e+04
[948,]  90271.43532 3.707760e-02 366 1.228392e+05
[949,]  91780.87644 2.635840e-02 349 9.583200e+04
[950,]  95189.31576 1.524112e-02 787 1.258884e+05
[951,] 130498.79040 4.440892e-16 625 1.306800e+05

I am getting the obvious "subscript out of bounds" error. How can I default Loc_Prob to 0.0 if there isn't a large enough value to match Dev_Size to? Something simple, I am thinking, but I can't get any of my ideas to work. Thanks.

JR
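A sketch of the bounds check: findInterval() returns nrow() when Dev_Size lies above the last value in column 1, so the +1 walks off the end of the matrix; testing for that supplies the 0.0 default:

idx <- findInterval(Dev_Size, Bin1Main.data[, 1]) + 1
Loc_Prob <- if (idx > nrow(Bin1Main.data)) 0.0 else Bin1Main.data[idx, 2]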
[R] Error in table lookup
I am trying to look up a value in 1 of 10 loaded two-column data sets (Bins) by displaying the value of the second column based on the value of the first. For instance, in

       Bin1_Acres Bin1_parprob Bin1_TAZ
 [1,]  0.000442   0.978        356
 [2,]  0.000453   0.954        356
 [3,]  0.000583   0.925        366
 [4,]  0.000635   0.893        403
 [5,]  0.000756   0.854        358
 [6,]  0.000774   0.815        530
 [7,]  0.000813   0.773        405
 [8,]  0.000970   0.724        576
 [9,]  0.001022   0.672        569
[10,]  0.001066   0.618        620

I would like to display the value in Bin1_parprob based on the closest match to Bin1_Acres. So if the value I am referencing in column 1 (Bin1_Acres) is .00, the value output would be .672. I keep getting a numeric(0) result with the code I am using (see below). I think the issue is that I usually don't have an exact match and I am using a >= sign, which may be causing the problem. I need the closest match, and it needs to be larger, not smaller, if no exact match is found (which is very common).

# Test value for vacant acres in TAZ
TAZDetermine = 24
# Test value for development size
Dev_Size = 3.5

# Determines bin number based on vacant acres in TAZ
BinSize = function(Dev_Size, BinNumer) {
  if (TAZDetermine <= 3.999) BinNumber = 1
  if (TAZDetermine >= 4:6.999) (BinNumber = 2)
  if (TAZDetermine >= 10:16.999) (BinNumber = 3)
  if (TAZDetermine >= 17:27.999) (BinNumber = 4)
  if (TAZDetermine >= 28:49.999) (BinNumber = 5)
  if (TAZDetermine >= 50:90.999) (BinNumber = 6)
  if (TAZDetermine >= 91:150.999) (BinNumber = 7)
  if (TAZDetermine >= 151:340.999) (BinNumber = 8)
  if (TAZDetermine >= 341:650.999) (BinNumber = 9)
  if (TAZDetermine >= 651:3000) (BinNumber = 10)
  BinNumber
}
# so in this case Bin 4 is selected

# Based on the previously selected bin, display the second column value (Bin1_parprob).
# The selected value in column 1 may be slightly larger, but the closest match is desirable.
if (BinNumber == 1) (Loc_Prop = Bin1Main.data[Bin1Main.data$Bin1_Acres >= Dev_Size, 1])
if (BinNumber == 2) (Loc_Prop = Bin2Main.data[Bin2Main.data$Bin2_Acres >= Dev_Size, 1])
if (BinNumber == 3) (Loc_Prop = Bin3Main.data[Bin3Main.data$Bin3_Acres >= Dev_Size, 1])
if (BinNumber == 4) (Loc_Prop = Bin4Main.data[Bin4Main.data$Bin4_Acres >= Dev_Size, 1])
if (BinNumber == 5) (Loc_Prop = Bin5Main.data[Bin5Main.data$Bin5_Acres >= Dev_Size, 1])
if (BinNumber == 6) (Loc_Prop = Bin6Main.data[Bin6Main.data$Bin6_Acres >= Dev_Size, 1])
if (BinNumber == 7) (Loc_Prop = Bin7Main.data[Bin7Main.data$Bin7_Acres >= Dev_Size, 1])
if (BinNumber == 8) (Loc_Prop = Bin8Main.data[Bin8Main.data$Bin8_Acres >= Dev_Size, 1])
if (BinNumber == 9) (Loc_Prop = Bin9Main.data[Bin9Main.data$Bin9_Acres >= Dev_Size, 1])
if (BinNumber == 10) (Loc_Prop = Bin10Main.data[Bin10Main.data$Bin10_Acres >= Dev_Size, 1])

I hope this question is clear. I have tried a number of different lines of code but get the same result (numeric(0)).

Cheers,
JR
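A sketch of the lookup for one bin. Two things in the code above can produce numeric(0): if Bin1Main.data was built with cbind() it is a matrix rather than a data frame, so $ subsetting does not behave as expected, and the subset pulls column 1 (the acres) rather than the parprob column. Converting to a data frame and taking the first row at or above the target gives the closest larger match:

Bin1Main.data <- as.data.frame(Bin1Main.data)
i <- which(Bin1Main.data$Bin1_Acres >= Dev_Size)[1]   # first (smallest) acreage >= Dev_Size
Loc_Prop <- if (is.na(i)) NA else Bin1Main.data$Bin1_parprob[i]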
[R] Calling object outside function
What I thought was a simple process isn't working for me. After I create multiple objects in a function (see below), how do I use those objects later in the program? I tried calling the function again and then the object I wanted, and it worked the first time, but now it doesn't (I think I defined the object outside the function accidentally, so it worked then, but when run properly it doesn't). I did this using Testdata(TAZDetermine) to first recall the function and then the object I wanted to use. This doesn't work, and it errors that the object cannot be found. Do I use attach()? That didn't seem to work either. I just want to call an object defined in a function outside of the function. Hope you can help.

Cheers,
JR

# Function to create hypothetical numbers for process testing
Testdata = function(TAZ_VAC_ACRES, Loc_Mod_TAZ, Dev_Size, TAZDetermine, Dev_Size) {
  # Loads TAZ and corresponding vacant acres data
  TAZ_VAC_ACRES = read.csv(file = "I:/Research/Samba/urb_transport_modeling/LUSDR/Workspace/BizLandPrice/data/TAZ_VAC_ACRES.csv", header = TRUE)
  # Test Location Choice Model selected TAZ
  Loc_Mod_TAZ = 120
  # Create test development
  Dev_Size = 58
  # Determines vacant acres by TAZ
  TAZDetermine = TAZ_VAC_ACRES[TAZ_VAC_ACRES$TAZ == Loc_Mod_TAZ, 2]
  # Displays number of vacant acres in Location Choice Model selected TAZ
  TAZDetermine
}

Testdata(TAZDetermine)

This errors, indicating that the object cannot be found, even though it's part of the argument list in the main function.
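Objects created inside a function are local to it; the usual way to use one afterwards is to return it and assign the result. A sketch, dropping the arguments that are really created inside the body:

Testdata <- function(Loc_Mod_TAZ = 120, Dev_Size = 58) {
  TAZ_VAC_ACRES <- read.csv("I:/Research/Samba/urb_transport_modeling/LUSDR/Workspace/BizLandPrice/data/TAZ_VAC_ACRES.csv", header = TRUE)
  TAZ_VAC_ACRES[TAZ_VAC_ACRES$TAZ == Loc_Mod_TAZ, 2]   # returned value
}
TAZDetermine <- Testdata()   # now TAZDetermine exists outside the function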
[R] conditional loop
I am looking up a number based upon a randomly selected number and then proceeding to the rest of my code if the corresponding value is greater than or equal to yet another value. So if Dev_Size = 14 and my randomly selected number is 102, and I am looking up 102 in the following table

100  21
101  4
102  9
103  52
104  29

then I select the value corresponding to 102, which is 9, and determine whether it is greater than or equal to Dev_Size. If it is, I proceed; if not, I would like my random number generator to select another number until one complies with the requirement. I have tried the code below, but my while loop doesn't do what I would like. I can't seem to find a good resource for my problem; any help would be much appreciated.

Cheers,
JR

# Loads TAZ and corresponding vacant acres data
TAZ_VAC_ACRES = read.csv(file = "C:/LUSDR/Workspace/BizLandPrice/data/TAZ_VAC_ACRES.csv", header = TRUE)

# Test Location Choice Model selected TAZ
while (TAZDetermine <= Dev_Size) {
  # Randomly generates test TAZ from 100 to 844
  RndTAZ = sample(100:844, 1, replace = T)
  # Renames randomly generated TAZ's object
  Loc_Mod_TAZ = RndTAZ
  # Create test development
  Dev_Size = 50
  # Determines vacant acres by TAZ
  TAZDetermine = TAZ_VAC_ACRES[TAZ_VAC_ACRES$TAZ == Loc_Mod_TAZ, 2]
  # Displays number of vacant acres in Location Choice Model selected TAZ
  TAZDetermine
}
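A sketch using a repeat loop, which sidesteps the problem that the while() condition above tests TAZDetermine and Dev_Size before either has been given a value:

Dev_Size <- 50
repeat {
  Loc_Mod_TAZ  <- sample(100:844, 1)
  TAZDetermine <- TAZ_VAC_ACRES[TAZ_VAC_ACRES$TAZ == Loc_Mod_TAZ, 2]
  if (length(TAZDetermine) == 1 && TAZDetermine >= Dev_Size) break  # redraw if TAZ missing or too small
}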
[R] Removing Numeric,0 from dataframe
I realize I am breaking the posting rules by not posting sample code, but I tried building some sample test code for this problem based on my working code and it wasn't producing what I wanted, so hopefully a brief explanation and my result will give you enough information to offer some advice. My result:

     allTAZprobs TAZS
[1,] Numeric,0   640
[2,] 0.4385542   641
[3,] 0.2876207   642
[4,] Numeric,0   643
[5,] Numeric,0   649
[6,] Numeric,0   650
[7,] 0.7543349   652
[8,] Numeric,0   654

This is a data frame built after about four iterative processes of looking up numbers in different tables and plugging them into the next process. The "Numeric,0" result is produced because that missing value didn't get through a "greater than" comparison with a random number; that part is fine. What I want the result to be is:

[1,] 0.4385542 641
[2,] 0.2876207 642
[3,] 0.7543349 652

throwing out all of the rows with a "Numeric,0" and keeping the rest. I haven't been able to find a way to do this while keeping the two columns matched. I am not sure whether I will have to alter things farther up in my code or whether I can just use some sort of unlist() operation to clear out the unwanted rows. Again I apologize for the lack of working code and realize the breach of protocol, but I hope my question is clear and I hope someone can help. Cheers, JR
-- View this message in context: http://www.nabble.com/Removing-Numeric%2C0-from-dataframe-tp21339704p21339704.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
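A minimal sketch of one way to drop those rows, under the assumption that allTAZprobs is a two-column list-matrix produced by cbind()-ing loop results (which is what the Numeric,0 entries suggest); the column contents and names are taken from the output shown above.

probs <- allTAZprobs[, 1]                 #list of probabilities, some numeric(0)
tazs  <- allTAZprobs[, 2]                 #list of TAZ numbers
keep  <- sapply(probs, length) > 0        #FALSE wherever the entry is numeric(0)

cleaned <- data.frame(allTAZprobs = unlist(probs[keep]),
                      TAZS        = unlist(tazs[keep]))
cleaned                                   #rows 641, 642 and 652 only, columns still matched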
[R] Environment change?
So I have a simple question that doesn't require sample code; I am not sure whether that violates the posting rules or not. Is this:

[1] "111" "112" "113" "114" "115" "116" "118" "119" "120" "123" "125" "126"
[13] "127" "128" "132" "137" "138" "139" "140" "143" "149" "154" "156" "157"
[25] "158" "164" "165" "166" "173" "177" "188" "189" "190" "191" "192" "193"
[37] "194" "195" "196" "197" "198" "199" "211" "213" "215" "222" "223" "225"
[49] "227" "228" "229" "230" "231" "232" "236" "241" "242" "243" "246" "247"
[61] "249" "250" "252" "253" "254" "255" "256" "257" "258" "259" "260" "261"
[73] "264" "286" "295" "296" "298" "299" "300" "301" "303" "304" "306" "307"
[85] "308" "309" "311" "312" "316" "348" "349" "352" "354" "355" "356" "357"
[97] "358" "359" "363" "367" "370" "373" "374" "375" "376" "377" "386" "387"
[109] "391" "392" "393" "396" "397" "398" "399" "400" "401" "402" "403" "404"
[121] "405" "406" "407" "408" "409" "410" "411" "412" "413" "414" "415" "416"
[133] "417" "418" "419" "420" "421" "422" "423" "424" "425" "426" "427" "428"
[145] "430" "431" "432" "433" "434" "435" "439" "440" "441" "442" "443" "445"
[157] "446" "447" "453" "454" "455" "460" "466" "475" "489" "493" "494" "495"
[169] "497" "498" "499" "501" "502" "503" "504" "505" "506" "507" "509" "510"
[181] "511" "512" "513" "514" "515" "516" "517" "518" "519" "520" "521" "522"
[193] "523" "524" "525" "526" "527" "528" "529" "530" "531" "533" "535" "536"
[205] "537" "538" "539" "540" "541" "543" "544" "545" "546" "547" "548" "549"
[217] "563" "567" "568" "569" "570" "571" "572" "574" "575" "577" "579" "580"
[229] "581" "584" "587" "588" "589" "590" "591" "592" "593" "603" "604" "607"
[241] "608" "609" "620" "622" "623" "631" "632" "634" "637" "640" "642" "647"
[253] "649" "651" "652" "653" "658" "659" "673" "674" "677" "681" "682" "683"
[265] "684" "685" "686" "697" "698" "699" "702" "703" "704" "705" "706" "708"
[277] "710" "714" "715" "717" "719" "720" "721" "722" "723" "725" "727" "728"
[289] "733" "735" "737" "738" "744" "745" "746" "747" "750" "751" "754" "763"
[301] "766" "769" "770" "771" "784" "785" "786" "787" "788" "789" "790" "791"
[313] "792" "802" "803" "804" "805" "813" "818" "822" "826" "827" "828" "829"
[325] "830" "831" "832" "834" "835" "837" "840" "841" "842" "843"

the same as this:

[1] 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
[22] 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141
[43] 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162
[64] 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183
[85] 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204
[106] 205

Now, besides the fact that the values and lengths are different, does the fact that there are quotes around each value change the way R deals with the values? I am getting an error when I try to use the object that contains the first listing's values. The error is:

Error in `[.data.frame`(TAZ_VAC_FEET, TAZ_VAC_FEET$TAZ == Loc_Mod_TAZ, : (list) object cannot be coerced to type 'integer'

The object Loc_Mod_TAZ is set to one of the values from the first listing inside a loop, like Loc_Mod_TAZ = Candidates[i]. So again, the question is: does R treat these two lists the same? Sorry I can't supply my code; it would require megabytes of data in order for it to work anyhow. Thanks in advance.
Cheers, JR -- View this message in context: http://www.nabble.com/Environment-change--tp21636186p21636186.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
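A small sketch of the difference, using made-up values in place of the long listings above: quoted values are character strings, not numbers, and a list-valued Candidates object would explain the coercion error (how Candidates was actually built is an assumption here).

x <- c("111", "112", "113")   #character, like the first listing
y <- 111:113                  #numeric, like the second
x == y                        #TRUE TRUE TRUE: == coerces the numbers to character
identical(x, y)               #FALSE: the storage types differ
as.numeric(x)                 #111 112 113 as numbers again

#The "(list) object cannot be coerced to type 'integer'" error typically means
#Loc_Mod_TAZ is a list element rather than a plain value; flattening first helps:
Candidates  <- list("111", "112")                  #hypothetical stand-in
Loc_Mod_TAZ <- as.numeric(unlist(Candidates))[1]   #111, usable in == lookups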
[R] For Loop - loading 10 sets of data and calculating
I am trying to simplify my code by adding a for loop that will load data and run a sequence of code 10 times. The way I run it now, the same 8 lines of code are basically reproduced 10 times. I would like to replace the numeric value in the code (e.g. Bin1, Bin2, ..., Bin10) each time the loop goes around. Below I tried doing this with a simple for loop, adding the string character before each numeric value. I need to first load the data and then calculate; I'm sure this is possible, as I have seen it done, but I can't seem to reproduce it in my own code. Hope my question is clear and I hope someone can offer some guidance. Cheers, JR

for (i in 1:10) {
#---
#Loads bin data frame from csv files with acres and TAZ data
Bin$i_main <- read.csv(file="I:/Research/Samba/urb_transport_modeling/LUSDR/Workspace/BizLandPrice/data/Bin_lookup_values/Bin$i_lookup.csv",head=FALSE);
#Separates Acres data from main data and converts acres to square feet
Bin$i_Acres=Bin$i_main[[1]]*43560
#Separates TAZ data from main data
Bin$i_TAZ=Bin$i_main[[2]]
#Separates TAZ vacant acres from main data and converts acres to square feet
Bin$i_TAZvacant=Bin$i_main[[3]]*43560
#Sums each parcel acreage data of the bin
Bin$iAcres_sum=sum(Bin$i_Acres)
#Creates data frame of cumulative percentages of each parcel of bin
Bin$i_cumper=cumsum(Bin$i_Acres/Bin$iAcres_sum)
#Calculates the probability of choosing particular parcel from bin
Bin$i_parprob=abs(1-Bin$i_cumper)
#Combines parcel acreage data and cumulative percentage data
Bin$iMain.data = cbind(Bin$i_Acres,Bin$i_parprob,Bin$i_TAZ,Bin$i_TAZvacant)
}
-- View this message in context: http://www.nabble.com/For-Loop---loading-10-sets-of-data-and-calculating-tp20386673p20386673.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
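One hedged way to do this, assuming the ten files are really named Bin1_lookup.csv through Bin10_lookup.csv in the directory used above: build each file name with paste0() and keep the results in a list indexed by i, rather than trying to splice i into object names with $.

bin.dir <- "I:/Research/Samba/urb_transport_modeling/LUSDR/Workspace/BizLandPrice/data/Bin_lookup_values"
Bins <- vector("list", 10)

for (i in 1:10) {
  #Loads bin i's lookup table (no header, as in the original code)
  main <- read.csv(file.path(bin.dir, paste0("Bin", i, "_lookup.csv")), header = FALSE)
  Acres     <- main[[1]] * 43560                      #acres to square feet
  TAZ       <- main[[2]]
  TAZvacant <- main[[3]] * 43560
  parprob   <- abs(1 - cumsum(Acres / sum(Acres)))    #probability of choosing each parcel
  Bins[[i]] <- cbind(Acres, parprob, TAZ, TAZvacant)  #combined table for bin i
}

Bins[[3]]   #the combined table for bin 3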
[R] Conditionally Removing data from a vector
I am having trouble removing entries from a vector based on the value of the vector and another object's value. It works in my pseudo test run:

DEV=400
#Y
CANDS=c(100:105)
#Z
DEVS=c(120,220,320,420,520)
if(DEV>DEVS) (CANDS=CANDS[which(DEV<DEVS)])

     CANDS DEVS
[1,]   100  120
[2,]   101  220
[3,]   102  320
[4,]   103  420
[5,]   104  520
[6,]   105  620

So the result CANDS is 103, 104 and 105, because the corresponding DEVS are larger than DEV (400 in this case). In my working code the equivalent line, inside a loop, is:

if(Dev_Size>TAZDetermine_FEET) (Candidates=Candidates[which(Dev_Size>TAZDetermine_FEET)])
}

So basically I am starting out with a list of candidate TAZs and I want to remove the TAZs that are not big enough to accommodate the development (Dev..At). My test code works, but again I can't get it to run in my working code. Is there anything stupid I'm missing, or maybe a better approach than the which() function? remove() only seems useful for entire objects, so I'm not sure what else would work. Thanks guys and gals. Cheers, JR
-- View this message in context: http://www.nabble.com/Conditionally-Removing-data-from-a-vector-tp20627244p20627244.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
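For the toy data, plain logical subsetting does the whole job in one step; the sketch below uses the six DEVS values given in the follow-up message, and no if() or which() is needed.

DEV   <- 400
CANDS <- 100:105
DEVS  <- c(120, 220, 320, 420, 520, 620)

CANDS[DEVS >= DEV]   #103 104 105: keep candidates whose DEVS value is at least DEV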
Re: [R] Conditionally Removing data from a vector
I greatly apologize; it was clear in my own mind, I just didn't explain it well enough. DEVS should have 6 values, the last being 620, as in the data frame:

     CANDS DEVS
[1,]   100  120
[2,]   101  220
[3,]   102  320
[4,]   103  420
[5,]   104  520
[6,]   105  620

And yes, my first question is whether I need an "if" statement at all, as the which() command seems to do the same thing. In my working code I start with a vector of Candidates, and a process occurs that looks up a value associated with each of those Candidates. If the corresponding value is equal to or larger than the starting value (in the test case the DEV value), then that Candidate is kept and the next Candidate is analyzed. The final product is a new vector of Candidates: those whose value is equal to or larger than the DEV value. Hope I didn't further the perfection of your confusion. Thanks for your time; let me know if I need to ask a better question. Cheers, JR

Philipp Pagel-5 wrote:
>
>> I am having trouble removing entries from a vector based on the value of
>> the vector and another object value. It works in my pseudo test run:
>>
>> DEV=400
>> #Y
>> CANDS=c(100:105)
>> #Z
>> DEVS=c(120,220,320,420,520)
>> if(DEV>DEVS)
>> (CANDS=CANDS[which(DEV<DEVS)])
>>
>>      CANDS DEVS
>> [1,]   100  120
>> [2,]   101  220
>> [3,]   102  320
>> [4,]   103  420
>> [5,]   104  520
>> [6,]   105  620
>
> The above code does not produce the output you show when I
> paste it into R. It would be highly desirable to have a short,
> documented, working example...
>
>> So the result CANDS is 103, 104 and 105 b/c the corresponding DEVS are
>> larger than the DEV(400 in this case).
>
> How do they correspond? Do you mean they are both columns of a
> data.frame as your example output suggests? But CANDS and DEVS
> are of differnt length, so this is probably not what you meant.
> Finally, your code seems to indicate that you want all elements
> of CANDS for which the corresponding (?) value of DEVS is less
> than threshold DEV, but only if DEV>DEVS. The latter produces a
> warning, as you are comparing a single value to a vector of 5,
> which is most likely not what you want. At this point my
> confusion is perfect and I give up.
>
> I think you need to give us a better explanantion of what data
> you have and what exactly you want to accomplish.
>
> cu
> Philipp
>
> --
> Dr. Philipp Pagel
> Lehrstuhl für Genomorientierte Bioinformatik
> Technische Universität München
> Wissenschaftszentrum Weihenstephan
> 85350 Freising, Germany
> http://mips.gsf.de/staff/pagel
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
-- View this message in context: http://www.nabble.com/Conditionally-Removing-data-from-a-vector-tp20627244p20628349.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
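Given that correspondence, the filter can be written against the data frame itself; a short sketch with the six-row table above (subset() is just one option):

tab <- data.frame(CANDS = 100:105,
                  DEVS  = c(120, 220, 320, 420, 520, 620))

subset(tab, DEVS >= 400)$CANDS   #103 104 105: candidates large enough for DEV = 400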
[R] creating a new vector in a for loop
I have consulted the intro manual and Nabble but have not found an answer to what should be a simple question, so here goes: I am doing a cross-check of a data frame and pulling out a single value based on an inputted value, i.e. based on x I select y, so if x = 2 then my code returns 7.

x  y
1  4
2  7
3  10
4  2

My code currently iterates for as many times as the length of the x data frame. What I would like to do is build a vector of the values selected. So if the vector x = 1, 4, 7 then "new-y" would equal 4, 2, 3.

x  y
1  4
2  7
3  10
4  2
5  21
6  13
7  3
8  90

I ultimately need to re-sort my x vector based on the values of y, but the function I am using is subset() and I can't do that without y being a vector itself; it is currently only a one-value object.

#Creates test Candidates Vector
Candidates=c(100,101,102,103,104,105)
#Creates object equaling the number of candidate TAZs from the main script Location Choice Model
NumCands1=length(Candidates)
Dev..At=999
for(i in 1:NumCands1){
#Renames Location Choice Model generated TAZ's object
Loc_Mod_TAZ=Candidates[i]
#Converts Development size from main script to Development density function format
Dev_Size=Dev..At
#This is my "y" value in the example above; this is the value that needs to go into a
#vector, a vector with as many values as there are candidates
TAZDetermine_FEET=TAZ_VAC_FEET[TAZ_VAC_FEET$TAZ==Loc_Mod_TAZ,2]
#Creates new vector based on adequate vacancy in the TAZ and the development to be located
Newcands=subset(Loc_Mod_TAZ, TAZDetermine_FEET>=Dev_Size)
}

Thanks for the help. Cheers, JR
-- View this message in context: http://www.nabble.com/creating-a-new-vecotr-in-a-for-loop-tp20691663p20691663.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
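A minimal sketch of the small example above: match() looks up all of the x values at once, so new-y comes out as a vector without growing anything inside a loop. The lookup data frame is rebuilt here from the second table in the post.

lookup <- data.frame(x = 1:8, y = c(4, 7, 10, 2, 21, 13, 3, 90))
xs     <- c(1, 4, 7)
new.y  <- lookup$y[match(xs, lookup$x)]   #4 2 3

#Inside the loop from the post, the equivalent fix is to append rather than
#overwrite (TAZ_VAC_FEET and Dev..At assumed to exist as in the original code):
#Newcands <- c(Newcands, subset(Loc_Mod_TAZ, TAZDetermine_FEET >= Dev_Size))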
[R] Creating a vector based on lookup function
I am still searching for a solution to what I think is a simple problem I am having with building a vector in a for loop. I have built a more understandable example, so hopefully that will help... help you help me, if you know what I mean.

dev=400
#test location model TAZs to reference
cands=c(101,105,109)
#Create Object of length of cands
candslength=length(cands)
#TEST TAZ Vector
CANDS=(100:110)
#Test Dev Vector
TAZDEVS=c(120,220,320,420,520,620,720,820,920,1020,1120)
#Creates dataframe of CANDS and DEVS
TAZLIST=data.frame(CANDS,TAZDEVS)
for(i in 1:candslength){
cand=cands[i]
#Creates an object based on the value of "cands" in TAZLIST
TAZDet=TAZLIST[TAZLIST$CANDS==cand[i],2]
}

What I would like to see is TAZDet filled with the values corresponding to "cands" in TAZLIST's second column, TAZDEVS; they would be 120, 520, 920. So if things worked the way I would like, TAZDet would be a vector containing these three values (120, 520, 920). At this point it gives me NAs. Any help would be appreciated. Thanks guys and gals. Cheers, JR
-- View this message in context: http://www.nabble.com/Creating-a-vector-based-on-lookup-function-tp20706588p20706588.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
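A sketch with the toy data from the post: the NAs come from indexing cand[i] when cand already holds a single value, and TAZDet is overwritten on every pass. match() does the whole lookup at once; note that with CANDS running from 100 these candidates map to 220, 620 and 1020 (the 120/520/920 quoted above would follow if CANDS started at 101).

cands   <- c(101, 105, 109)
TAZLIST <- data.frame(CANDS   = 100:110,
                      TAZDEVS = c(120, 220, 320, 420, 520, 620,
                                  720, 820, 920, 1020, 1120))

TAZDet <- TAZLIST$TAZDEVS[match(cands, TAZLIST$CANDS)]
TAZDet   #220 620 1020 with these toy values, one entry per candidate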
[R] Creating a vector
Good day all. I am having some trouble building a simple vector. My sample code below shows what I'm trying to do, and I have pointed out where the issue is. What happens now is that a single "TAZDetermine_FEET" is selected, but I need multiple values, as many as there are "cands". I am thinking that this should happen within the for loop, adding a "TAZDetermine_FEET" to a new vector ("TAZDs") each time the loop goes round, but I can't seem to make it work with what I know. Probably simple, so sorry, and any help will be much appreciated; I have been struggling with this for too long. Cheers, JR

#test location model TAZs
TAZS=101:108
#Builds test Vacant TAZ vector
TAZDEVS=c(125481,174581,556789,14755776,9984275,1324587,12457841,4511475)
#Builds dataframe simulating TAZ_VAC_FEET
TAZ_VAC_FEET=data.frame(TAZS,TAZDEVS)
#Candidate TAZs from location choice model
cands=c(101,102,107,108)
#Create Object of length of cands
candslength=length(cands)
#Test Location Choice Model selected TAZ
for(i in 1:candslength){
#Renames randomly generated TAZ's object
Loc_Mod_TAZ=cands[i]
#Create test Development in Sq. Ft.
Dev_Size=50
#Determines vacant square feet by TAZ
TAZDetermine_FEET=TAZ_VAC_FEET[TAZ_VAC_FEET$TAZS==Loc_Mod_TAZ,2]
#Here's my problem... my attempt below is to build a vector of length "candslength" of all of the
#"TAZDetermine_FEET" values that are less than or equal to Dev_Size.
#Here's what I have tried; it doesn't get me what I want though
#is.vector(TAZDs)
#TAZDs=list(TAZDetermine_FEET)
#TAZDs=vector(mode = "logical", length = candslength)
}
-- View this message in context: http://www.nabble.com/Creating-a-vector-tp20985205p20985205.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
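A sketch that reuses TAZ_VAC_FEET, cands and Dev_Size from the test code above: preallocate TAZDs, fill one element per pass, and then compare the whole vector against Dev_Size at the end (the >= direction matches the earlier posts' "big enough" requirement; flip it if the other subset is wanted).

TAZDs <- numeric(length(cands))            #one slot per candidate TAZ
for (i in seq_along(cands)) {
  #Vacant square feet for candidate i
  TAZDs[i] <- TAZ_VAC_FEET[TAZ_VAC_FEET$TAZS == cands[i], 2]
}
TAZDs                      #125481 174581 12457841 4511475
cands[TAZDs >= Dev_Size]   #candidates with at least Dev_Size square feet vacant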