Manli Yan wrote:
Assume a continuous dependent variable y and a continuous independent variable x. I am trying to categorize x into intervals such that the intervals have the most significantly different effects on y. Does anyone know which method I should apply? I know this will cause a loss of information, but can I really do it? Or, which method would keep the loss minimal? All I want is some keywords. Thanks in advance.
Categorizing a continuous predictor in this way is bad statistical practice and should be avoided. Use modern methods that keep x continuous, such as regression splines, penalized splines, loess, etc.
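To make the regression-spline suggestion a little more concrete, here is a minimal sketch, written in Python rather than R and using only the standard library. It assumes the simplest possible case: a linear spline with a single hand-picked knot, fitted by ordinary least squares. The names `solve`, `basis`, `fit_linear_spline` and `predict` are hypothetical helpers made up for this illustration. In real work one would more likely use an established spline routine with several knots and a smoother basis.

```python
def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting.
    Raises ZeroDivisionError if the system is singular (e.g. knot outside the data)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - s) / m[r][r]
    return coef


def basis(x, knot):
    """Linear spline basis: intercept, x, and a hinge term that switches on at the knot."""
    return [1.0, x, max(x - knot, 0.0)]


def fit_linear_spline(xs, ys, knot):
    """Least-squares fit of y on the linear spline basis via the normal equations."""
    rows = [basis(x, knot) for x in xs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)


def predict(coef, x, knot):
    """Fitted value at x: a continuous function of x, with no jumps at interval edges."""
    return sum(c * b for c, b in zip(coef, basis(x, knot)))


# Toy data (made up for illustration): the slope changes at x = 0.5.
xs = [i / 10 for i in range(-20, 21)]
ys = [2 + 0.5 * x + 3 * max(x - 0.5, 0.0) for x in xs]
coef = fit_linear_spline(xs, ys, knot=0.5)
print(coef)
```

Unlike a categorized x, the fitted curve changes smoothly with x, and the only arbitrary choice is the knot location rather than a full set of interval boundaries.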
Howard Wainer provided an algorithm that, for any set of uncorrelated x-y pairs, finds a set of 5 intervals of x across which the mean of y increases, and another set of intervals across which the mean of y decreases.
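The danger can be illustrated with a small brute-force search. This is not Wainer's algorithm, only a sketch in Python of the same phenomenon, assuming distinct x values and a tiny made-up data set in which x and y have exactly zero correlation. The names `interval_means` and `find_monotone_cuts` are hypothetical.

```python
from itertools import combinations


def interval_means(ys, cuts):
    """Mean of y within each contiguous interval; cuts are the indices where a new interval starts."""
    edges = [0, *cuts, len(ys)]
    return [sum(ys[a:b]) / (b - a) for a, b in zip(edges, edges[1:])]


def find_monotone_cuts(xs, ys, k=5, increasing=True):
    """Try every way of splitting the x-ordered data into k intervals and return
    the first (cuts, means) whose interval means are strictly monotone, or None.
    Assumes the x values are distinct."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ys_sorted = [ys[i] for i in order]
    for cuts in combinations(range(1, len(ys_sorted)), k - 1):
        means = interval_means(ys_sorted, cuts)
        pairs = zip(means, means[1:])
        if all((a < b) if increasing else (a > b) for a, b in pairs):
            return cuts, means
    return None


# Made-up data: y is symmetric about the middle of x, so the correlation is zero.
xs = list(range(1, 11))
ys = [9, 0, 5, 16, 0, 0, 16, 5, 0, 9]

up = find_monotone_cuts(xs, ys, k=5, increasing=True)
down = find_monotone_cuts(xs, ys, k=5, increasing=False)
print("increasing means:", up)
print("decreasing means:", down)
```

On this toy data both searches succeed, so the same uncorrelated points can be made to show either an upward or a downward "trend" purely through the choice of cut points.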
Frank
--
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University