Thom_249 wrote:
Hi
For a school project I have a file with 120 columns and ~2000 lines. This
file contains the timestamps of spikes detected in 60 channels, and the time
elapsed since the previous spike.
I need to remove values that are too high. About 98% of the values are
between 0 and 2000, and 2% are between 2000 and 20'000. I want to get rid of
these values.
Please could you help me?
Regards
Thom
Hi,
If you are dealing with a data frame, you can simply use logical
indexing, like this:
my.data.frame[my.data.frame > 2000] <- NA
This way, all values in your data frame that are greater than 2000 will
be replaced by NA. However, this kind of substitution is not
recommended if your goal is to fit a regression model to the data.
Missing data can sometimes be informative, so be careful when making
such substitutions.
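As a minimal sketch of the indexing approach above, here is a toy example. The data frame `spikes`, its values, and the column names (`t1`, `dt1`, ...) are made up for illustration, and it assumes that only the interval columns should be filtered, which the original question does not state explicitly. If every column should be filtered, the one-liner above is enough.

```r
# Toy data standing in for the real file (hypothetical names and values).
spikes <- data.frame(
  t1  = c(100, 2500, 4000, 9000),   # spike timestamps, channel 1
  dt1 = c(100, 2400, 1500, 5000),   # time since previous spike, channel 1
  t2  = c(50, 300, 15000, 15800),   # spike timestamps, channel 2
  dt2 = c(50, 250, 14700, 800)      # time since previous spike, channel 2
)

# Assumption: interval columns are the ones whose names start with "dt".
dt_cols <- grep("^dt", names(spikes))

# Work on a copy so the raw data stays available for comparison.
cleaned <- spikes
sub <- cleaned[dt_cols]
sub[sub > 2000] <- NA               # logical indexing, as in the reply above
cleaned[dt_cols] <- sub

# Count how many values were set to NA in each column.
colSums(is.na(cleaned))

# Most summary functions need na.rm = TRUE once NAs are present.
colMeans(cleaned[dt_cols], na.rm = TRUE)
```

Counting the NAs per column is a quick check that roughly the expected share of values (about 2% in the real data) was removed.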
--
*Luc Villandré*
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.