Hi Shashi,
First off, keep the thread on the list. Compare the two statements below:

Jim: If this method is revealed to us, we may be able to help you.
Shashi: "if this method reveal to me i can help"

Regardless, I will attempt to help. This looks like number 2 - inefficient code. You appear to be forming a very large vector bit by bit. This is _very_ inefficient. If you want to get the data frame "matrixdata" as a vector:

# this may work
fitness_1_data<-unlist(matrixdata)
# if not, try this
fitness_1_data<-as.vector(as.matrix(matrixdata))

This is written to a file and the file is read and again reformatted into vectors for processing. If you are able, try to create a _small_ data set that will be processed in the same way as "matrixdata" (e.g. a 10x10 data frame):

smalldata<-as.data.frame(matrix(sample(1:100,100),nrow=10))
names(smalldata)<-paste("Col",1:10,sep="")

This will allow you to try out your code without spending a day on each run. For instance, you can probably substitute:

matrixdata2<-matrixdata[,-1]

for a lot of the code in the second half of your script.

Jim

On Wed, May 11, 2016 at 10:16 AM, SHASHI SETH <sethsha...@rediffmail.com> wrote:
>
> Hi Jim,
>
> Thanks a lot.. I could not understand what do u mean by "if this method
> reveal to me i can help" I am giving full program again and putting
> comment at calculation part. When I execute it, I can see after every
> one minute 29 kb is written in the file. Pls see.
>
> fitness_1_data <- c();
> src="dtm_mydata.csv"
> matrixdata <- read.csv(src)
> #get no vector/column from file/matrix
> noofvec <- length(matrixdata)
>
> #set no of records/rows/document
> noofrecords <- length(matrixdata[,1])
> #set row index
> rindex<-1;
> #preapare header
> colindex<-1;
> colList <- colnames(matrixdata)
>
> combine<-"";
>
> vec_fitness_data<- c();
>
> while(colindex <= length(colList))
> {
>   fitness_1_data <- append(fitness_1_data,colList[colindex])
>
>   colindex<- colindex+1
> }
> #add two additional vector for percentage and cluster
> fitness_1_data <- append(fitness_1_data,"percentage")
> fitness_1_data <- append(fitness_1_data,"Cluster")
> #write.csv(matrix(fitness_1_data, nrow=1), file ="myfile.csv", row.names=FALSE)
> write.table(as.list(fitness_1_data), file ="Res_mydata_cycle1.csv",append = TRUE,
>   row.names=FALSE, col.names=FALSE, sep=",")
>
> #end header record
>
> #while (rindex < 2) #fitness will apply for first record everytime (first record will be compare with all below records)
>
> nestedloopindex <- 2
>
> while( nestedloopindex <= noofrecords )
> {
>
>   #init of temperory variables
>   sums1 <- 0;
>   sums2 <- 0;
>   sum <- 0;
>
>   #set initial index of column 2 , coloumn one hold document no not actual data
>   colindex <- 3;
>
>   # combine <-"";
>
>   vec1 <- c();
>   vec2 <- c();
>
>   #add document number in vector
>   vec1 <- append(vec1,matrixdata[rindex,1]);
>   vec2 <- append(vec2,matrixdata[nestedloopindex,1]);
>   vec1 <- append(vec1,matrixdata$ID[rindex]);
>   vec2 <- append(vec2,matrixdata$ID[nestedloopindex]);
>
>   baseSum <- 0;
>
>   ############################## Calculation Part ##############################
>   while(colindex <= noofvec )
>   {
>
>     baseSum <- baseSum + matrixdata[rindex,colindex]
>
>     vec1 <- append(vec1,matrixdata[rindex,colindex]);
>     vec2 <- append(vec2,matrixdata[nestedloopindex,colindex]);
>
>     sum = sum + matrixdata[rindex,colindex]*matrixdata[nestedloopindex,colindex]
>
>     sums1 <- sums1 + matrixdata[rindex,colindex]^2;
>
>     sums2 <- sums2 + matrixdata[nestedloopindex,colindex]^2;
>
>     colindex <- colindex+1
>   }
>
>   if(sum > 0 && sums1 > 0 && sums2 > 0)
>   {
>     out <- sum / ((sqrt(sums1) * sqrt(sums2)))
>   }else
>   {
>     out <-0
>   }
>   ############################## End Calculation ##############################
>   vec1 <- append(vec1,out);
>   vec1 <-append(vec1, "1")
>   vec2 <- append(vec2, out);
>
>   if(nestedloopindex==2)
>   {
>     write.table(as.list(vec1), file ="Res_mydata_cycle1.csv",append = TRUE, row.names=FALSE, col.names=FALSE, sep=",")
>     write.table(as.list(vec2), file ="Res_mydata_cycle1.csv",append = TRUE, row.names=FALSE, col.names=FALSE, sep=",")
>     nestedloopindex<- nestedloopindex+1
>   } else
>   {
>     write.table(as.list(vec2), file ="Res_mydata_cycle1.csv",append = TRUE, row.names=FALSE, col.names=FALSE, sep=",")
>     nestedloopindex<- nestedloopindex+1
>   }
> }
>
> Thanks,
> Shashi
>
> On Wed, 11 May 2016 03:49:19 +0530 Jim Lemon wrote
> > Hi Shashi,
> > The assumption that anyone on the list apart from yourself knows what
> > "some calculation" involves is incorrect. I suspect that "what is
> > wrong" may be one of two things:
> >
> > 1) "some calculation" includes a very large number of operations,
> > perhaps leading to "disk-thrashing" when your 16GB of memory is full
> > of intermediate values. There is no software problem, buy more
> > hardware.
> >
> > 2) "some calculation" is a very inefficient method of getting the
> > result you want. If this method is revealed to us, we may be able to
> > help you.
> >
> > Jim
> >
> > On Wed, May 11, 2016 at 2:24 AM, SHASHI SETH wrote:
> > > Hi,
> > >
> > > I have implemented following program in R, that reads data from the
> > > "dtm_mydata.csv". file size is 114,029 kB, saved document Term matrix.
> > > Prog. performing some calculation and writing in a file. my computer
> > > RAM is 16 GB.
> > > To execute this program its taking around 25 hours. can any body
> > > help me what is wrong, why this much time is taken. Although it is
> > > doing the job what is required
> > >
> > > fitness_1_data
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
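
For what it is worth, the calculation part of the quoted script appears to compute the cosine similarity between the first row and every other row, one cell at a time. The following is only a rough sketch of how that might be vectorized, not a drop-in replacement. It assumes the term counts start in column 3 (as the loop's "colindex <- 3" suggests) and runs on random stand-in data; "smalldata", "termcols", "terms" and the other names are made up for illustration and do not come from "dtm_mydata.csv".

```r
# Random stand-in for "matrixdata" (hypothetical data, 10 rows x 10 columns)
set.seed(1)
smalldata <- as.data.frame(matrix(sample(0:5, 100, replace = TRUE), nrow = 10))
names(smalldata) <- paste("Col", 1:10, sep = "")

# Assumption: the term counts live in columns 3 onward, as in the quoted loop
termcols <- 3:ncol(smalldata)
terms <- as.matrix(smalldata[, termcols])

first  <- terms[1, ]                  # the reference row (rindex = 1)
others <- terms[-1, , drop = FALSE]   # every row it is compared with

# Dot product of the first row with each other row, in one matrix product
dots <- as.vector(others %*% first)

# Product of the two vector lengths for each pair
norms <- sqrt(rowSums(others^2)) * sqrt(sum(first^2))

# Same guard as the quoted script: 0 when there is no overlap
cosine <- ifelse(dots > 0 & norms > 0, dots / norms, 0)

# Build the whole result once and write it in a single call
result <- cbind(smalldata[-1, ], percentage = cosine)
outfile <- tempfile(fileext = ".csv")
write.csv(result, outfile, row.names = FALSE)
```

The main design point is that the data frame is converted to a matrix once and the file is written once, instead of indexing the data frame cell by cell and calling write.table for every row. Whether the output layout matches what the later steps of the script expect would need checking on a small data set first.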