Here is how to whittle it down for the first two parts of your question. I am not exactly sure what you are after in the third part. Is it that you want specific DATEs, or do you want the ratio of DATA1 at DATE[max] to DATA1 at DATE[min]?
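If the third part means literally DATA1(1989) / DATA1(1987) for each remaining company, as written in the original question, the following is a minimal sketch under that assumption. It is self-contained, so it rebuilds the sample data and repeats the two filtering steps before computing the ratio. The names `ratio_for`, `pieces` and `result` are made up for illustration, and `grepl()` is used in place of `grep()` with a negative index so that nothing is dropped by accident when no name matches.

```r
# Sample data from the original question (quotes keep the embedded spaces in NAME).
x <- read.table(text = "CODE NAME DATE DATA1
4813 'ADVANCED TELECOM' 1987 0.013
3845 'ADVANCED THERAPEUTIC SYS LTD' 1987 10.1
3845 'ADVANCED THERAPEUTIC SYS LTD' 1989 2.463
3845 'ADVANCED THERAPEUTIC SYS LTD' 1988 1.563
2836 'ADVANCED TISSUE SCI -CL A' 1987 0.847
2836 'ADVANCED TISSUE SCI -CL A' 1989 0.872
2836 'ADVANCED TISSUE SCI -CL A' 1988 0.529",
                header = TRUE, stringsAsFactors = FALSE)

# Part 1: drop names ending in -CL A, -OLD or -ADS.
# A logical index is safe even when there are no matches.
x <- x[!grepl("-(CL A|OLD|ADS)$", x$NAME), ]

# Part 2: keep names whose years span at least 1987..1989 (range of 2),
# the same criterion as in the session for the first two parts.
date_range <- tapply(x$DATE, x$NAME, function(dates) diff(range(dates)))
x <- x[x$NAME %in% names(date_range)[date_range >= 2], ]

# Part 3 (assumed meaning): DATA1 in num_year divided by DATA1 in den_year.
# Returns NA when either year is missing or duplicated for a company.
ratio_for <- function(d, num_year = 1989, den_year = 1987) {
  num <- d$DATA1[d$DATE == num_year]
  den <- d$DATA1[d$DATE == den_year]
  if (length(num) != 1 || length(den) != 1) return(NA_real_)
  num / den
}

# One data frame per company, then one output row per company.
pieces <- split(x, x$NAME)
result <- data.frame(
  CODE  = sapply(pieces, function(d) d$CODE[1]),
  NAME  = names(pieces),
  RATIO = sapply(pieces, ratio_for),
  row.names = NULL
)
print(result)
```

With the sample data only ADVANCED THERAPEUTIC SYS LTD survives the two filters, and its ratio is 2.463 / 10.1. For other data variables, the same function could be applied with a different column in place of DATA1.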

> x <- read.table(textConnection("CODE NAME DATE DATA1
+ 4813 'ADVANCED TELECOM' 1987 0.013
+ 3845 'ADVANCED THERAPEUTIC SYS LTD' 1987 10.1
+ 3845 'ADVANCED THERAPEUTIC SYS LTD' 1989 2.463
+ 3845 'ADVANCED THERAPEUTIC SYS LTD' 1988 1.563
+ 2836 'ADVANCED TISSUE SCI -CL A' 1987 0.847
+ 2836 'ADVANCED TISSUE SCI -CL A' 1989 0.872
+ 2836 'ADVANCED TISSUE SCI -CL A' 1988 0.529"), header=TRUE)
> # matches on things to delete
> delete_indx <- grep("-CL A$|-OLD$|-ADS$", x$NAME)
> # delete them
> x <- x[-delete_indx,]
> x
  CODE                         NAME DATE  DATA1
1 4813             ADVANCED TELECOM 1987  0.013
2 3845 ADVANCED THERAPEUTIC SYS LTD 1987 10.100
3 3845 ADVANCED THERAPEUTIC SYS LTD 1989  2.463
4 3845 ADVANCED THERAPEUTIC SYS LTD 1988  1.563
> # I assume you want to use NAME to check for ranges of data
> date_range <- tapply(x$DATE, x$NAME, function(dates) diff(range(dates)))
> date_range
            ADVANCED TELECOM ADVANCED THERAPEUTIC SYS LTD
                           0                            2
   ADVANCED TISSUE SCI -CL A
                          NA
> # delete ones with less than 3 years
> names_to_delete <- names(date_range[date_range < 2])
> # delete those entries
> x <- x[!(x$NAME %in% names_to_delete),]
> x
  CODE                         NAME DATE  DATA1
2 3845 ADVANCED THERAPEUTIC SYS LTD 1987 10.100
3 3845 ADVANCED THERAPEUTIC SYS LTD 1989  2.463
4 3845 ADVANCED THERAPEUTIC SYS LTD 1988  1.563
>

On Nov 13, 2007 2:34 PM, Jonas Malmros <[EMAIL PROTECTED]> wrote:
> Dear R users,
>
> I have a huge database and I need to adjust it somewhat.
>
> Here is a very little cut out from database:
>
> CODE  NAME                          DATE  DATA1
> 4813  ADVANCED TELECOM              1987  0.013
> 3845  ADVANCED THERAPEUTIC SYS LTD  1987  10.1
> 3845  ADVANCED THERAPEUTIC SYS LTD  1989  2.463
> 3845  ADVANCED THERAPEUTIC SYS LTD  1988  1.563
> 2836  ADVANCED TISSUE SCI -CL A     1987  0.847
> 2836  ADVANCED TISSUE SCI -CL A     1989  0.872
> 2836  ADVANCED TISSUE SCI -CL A     1988  0.529
>
> What I need is:
> 1) Delete all cases containing -CL A (and also -OLD, -ADS, etc) at the end
> 2) Delete all cases that have less than 3 years of data
> 3) For each remaining case compute ratio DATA1(1989) / DATA1(1987)
> [and then ratios involving other data variables] and output this into
> new database consisting of CODE, NAME, RATIOs.
>
> Maybe someone can suggest an effective way to do these things? I
> imagine the first one would involve grep(), and 2 and 3 would involve
> apply family of functions, but I cannot get my mind around the actual
> code to perform this adjustments. I am new to R, I do write code but
> usually it consists of for-functions and plotting. I would much
> appreciate your help.
> Thank you in advance!
> --
> Jonas Malmros
> Stockholm University
> Stockholm, Sweden
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?