[R] clusterwise regression from fpc (fixed point clustering) package

Stefan Habermehl Mon, 14 Jan 2008 01:08:41 -0800

Hi there,

whenever I try the clusterwise regression from the fpc package, I run into
the following problem:

The first cluster is always built in such a way that, when I run an ordinary
linear regression of the dependent variable on the independent variables
(using only the respondents from the first cluster), the regression relies on
a single independent variable that explains the entire variance of the
dependent variable.
This looks interesting, but it doesn't make sense. It seems as if the
algorithm searches for a variable that has the same values as the dependent
variable and gives this variable all of the explanatory power.
 
A few words on clusterwise regression:
As far as I understand it, this is a very interesting method that combines
clustering and regression: it splits the respondents into n clusters in such
a way that the n regressions (one separate regression per cluster) each have
high explanatory power.
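
To make the idea concrete, here is a minimal sketch in base R on simulated
data. It is not the algorithm used by regmix in fpc (which, as far as I
understand, fits a mixture model by EM); it simply alternates between fitting
one regression per cluster and moving each respondent to the cluster whose
regression fits it best. The data, the object names (d, cl, fit_clusters) and
the choice of two clusters are assumptions made for illustration only.

```r
# Simulated stand-in data (not the survey data from this post):
# two groups of respondents that follow different regression lines.
set.seed(1)
n <- 200
d <- data.frame(x = runif(n, 1, 5))
group <- rep(1:2, each = n / 2)
d$y <- ifelse(group == 1, 1 + 0.8 * d$x, 5 - 0.8 * d$x) + rnorm(n, sd = 0.2)

k <- 2                                   # number of clusters (assumed)
cl <- sample(rep(1:k, length.out = n))   # random starting partition

# Fit one linear regression per cluster for a given partition.
fit_clusters <- function(cl) {
  lapply(seq_len(k), function(j) lm(y ~ x, data = d[cl == j, ]))
}

for (iter in 1:50) {
  fits <- fit_clusters(cl)
  # n x k matrix of squared residuals of every respondent under every fit
  sq_res <- sapply(fits, function(f) (d$y - predict(f, newdata = d))^2)
  # reassign each respondent to the regression with the smallest residual
  new_cl <- max.col(-sq_res, ties.method = "first")
  # stop if a cluster would become too small or nothing changes any more
  if (min(tabulate(new_cl, nbins = k)) < 3 || all(new_cl == cl)) break
  cl <- new_cl
}

# Compare the pooled error of the cluster regressions with one global fit.
fits <- fit_clusters(cl)
sse_clusterwise <- sum(sapply(fits, function(f) sum(resid(f)^2)))
sse_single <- sum(resid(lm(y ~ x, data = d))^2)
print(c(clusterwise = sse_clusterwise, single = sse_single))
print(table(cluster = cl))
```

The clusterwise error can never exceed the error of the single regression,
because the global line is also a candidate within every cluster. Whether the
clusters found this way are meaningful is a separate question.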
 
My data look like this (just an example; it wouldn't make sense to post the
whole table):
dep__i1__i2__i3__i4__...
3____2___1___5___2
2____2___3___1___1
5____5___3___4___4
1____1___2___1___2
 
In other words: one dependent variable (scale 1-5) and several independent
variables (scale 1-5), so pretty standard, with about 4000 respondents.
 
 
When I use it, the result of the clusterwise regression seems to be the
following:

For the first cluster, it apparently just checks whether there is an
independent variable that is identical to the dependent variable for a large
number of respondents.

In the example it would choose i1, since i1 is identical to the dependent
variable for every respondent except the first.
 
So it returns this clustering:
 
dep__i1__i2__i3__i4__cl
3____2___1___5___2___2
2____2___3___1___1___1
5____5___3___4___4___1
1____1___2___1___2___1
 
When I run a linear regression of dep on i1-i4 for cluster one, it uses
almost only i1 to explain dep, with 100% explained variance. However, the
model is often not significant, and this is not the kind of cluster I am
looking for.
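
The pattern can be checked directly. The sketch below uses only the four
example rows with their cluster labels from this post, not the real data; the
object names (toy, match_tab, fit1, rss1) are made up for illustration. It
cross-tabulates "dep equals i1" against the cluster label and then fits dep
on i1 within cluster one.

```r
# The four example rows from this post, with the cluster labels (cl).
toy <- data.frame(
  dep = c(3, 2, 5, 1),
  i1  = c(2, 2, 5, 1),
  i2  = c(1, 3, 3, 2),
  i3  = c(5, 1, 4, 1),
  i4  = c(2, 1, 4, 2),
  cl  = c(2, 1, 1, 1)
)

# How often does dep equal i1 exactly, split by cluster?
match_tab <- table(dep_equals_i1 = toy$dep == toy$i1, cluster = toy$cl)
print(match_tab)

# Regression of dep on i1 using only the respondents in cluster one.
fit1 <- lm(dep ~ i1, data = toy[toy$cl == 1, ])
rss1 <- sum(resid(fit1)^2)   # residual sum of squares within cluster one
print(coef(fit1))
print(rss1)
```

With these rows, every respondent in cluster one has dep equal to i1, so the
fitted line is (up to rounding) dep = i1 and the residual sum of squares is
essentially zero. Running the same table on the full data would show whether
the first cluster really consists of such exact matches.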
 
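This is my code:

library(fpc)
# unabh: independent variables, abh: dependent variable (column VVV)
unabh <- read.table("G:/Data_Files/SPSS Files/unabh.csv", sep=";", header=TRUE)
abh <- read.table("G:/Data_Files/SPSS Files/abh.csv", sep=";", header=TRUE)
m <- as.matrix(unabh)
attach(abh)  # makes the column VVV available by name
rmt1 <- regmix(m, VVV, ir=1, nclus=1:2, icrit=1.e-5, minsig=1.e-6, warning=TRUE)
write.table(rmt1$g)  # print the cluster assigned to each respondent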
 
 
This is just an example. I have also tried other numbers of clusters, with
and without warning=TRUE, and with no arguments beyond the variables, but
when I ask for more clusters it never stops calculating (maybe there are too
many respondents, about 4000?).
 
 
My objective:
I want a couple of meaningful clusters with meaningful regressions.


Does anybody know what I should change?

Thanks a lot for any kind of recommendation,
Stefan

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
