Hello, I bumped into the following funny use case: I have too much data for a given plot. I have the following data frame df:

    > str(df)
    'data.frame':   5015 obs. of  5 variables:
     $ n          : Factor w/ 5 levels "1000","2000",..: 1 1 1 1 1 1 1 1 1 1 ...
     $ iter       : int  10 20 30 40 50 60 70 80 90 100 ...
     $ Error      : num  1.05e-02 1.24e-03 3.67e-04 1.08e-04 4.05e-05 ...
     $ Duality_Gap: num  20080 3789 855 443 321 ...
     $ Runtime    : num  0.00536 0.01353 0.01462 0.01571 0.01681 ...

But if I plot e.g. Runtime vs log(Duality Gap), I have too many observations, because I take a snapshot every 10 iterations rather than, say, every 500, and the plot looks very cluttered. So I would like to trim the data frame to include only those records for which iter is a multiple of 500, and so I do this:

    df <- subset(df, iter %% 500 == 0)

This gives me almost exactly what I need, except that the last and most important Duality Gap observations are of course gone due to the filtering. I would like to change the subset clause to be "iter %% 500 == 0 _or_ the record is the last one per n" (n is my problem size and the category in this case). How can I do that?

I thought of adding a new Boolean column "last" that flags whether a given row is the last element per category, but this is a bit too complicated. Is there a simpler condition construct that can be used with the subset command?

TIA,
Best regards,
Giovanni
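
One possible way to write the combined condition directly inside subset(), as a minimal sketch: it assumes "last per n" means the row with the largest iter within each level of n. The small data frame below is made up for illustration only (it is not the data from the post); ave() and subset() are base R.

```r
# Hypothetical stand-in data: two problem sizes, a snapshot every 10
# iterations, and runs that do not stop on a multiple of 500.
df <- data.frame(
  n    = factor(rep(c("1000", "2000"), times = c(123, 87))),
  iter = c(seq(10, 1230, by = 10), seq(10, 870, by = 10))
)
df$Duality_Gap <- 1 / df$iter  # made-up values, only to have something to plot

# ave(iter, n, FUN = max) returns, for every row, the largest iter within
# that row's n group, so the second clause is TRUE only for the final
# snapshot of each n.
trimmed <- subset(df, iter %% 500 == 0 | iter == ave(iter, n, FUN = max))

print(trimmed[, c("n", "iter")])
```

If the rows are already sorted by iter within each n, the second clause could probably also be written as !duplicated(n, fromLast = TRUE), which keeps the last occurrence of each level without computing a maximum.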