Dear R help list,

I have a training dataset that looks like Table 1 and an unknown sample that
looks like Table 2.
I want a program that searches the training dataset and identifies which
category (type1, type2 or type3) the unknown sample belongs to.
If the unknown does not belong to any of the categories, the program should
tell me so.
The real dataset has 600 variables and 50 sample types.

I tried linear discriminant analysis (lda in the MASS package) together with
its predict function. It works well, but as far as I understand, lda always
assigns an unknown to one of the types in the training dataset.
Most of my unknowns will not come from any category in the training dataset,
and I want to avoid false positive identifications.


Table 1: Three types and 10 variables

         type1  type1  type1  type2  type2  type2  type3  type3  type3
var1        24     28     25     50     51     46     18     20     16
var2         4      5      4      9      8      9     10      9     10
var3         7      7      7     12     12     12      9      6      6
var4         4      5      4     10     12      9      2      2      2
var5         4      5      4     10      9     10      3      2      3
var6         5      4      5      2      3      2      1      3      5
var7         5      4      5      7      7      7      3      3      3
var8         3      4      3     10     10      8      4      2      4
var9         3      4      3      2      2      2      2      2      2
var10        3      3      3      4      4      4      3      1      2


Table 2

         unknown
var1          23
var2           4
var3           7
var4           4
var5           4
var6           6
var7           5
var8           3
var9           3
var10          3
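
For reference, a minimal sketch of the lda/predict approach described above,
using the values from Table 1 and Table 2. The object names (train, unknown,
fit, pred) and the formula interface are assumptions made for illustration and
may differ from the real analysis. The tables are transposed so that each row
is one sample.

```r
library(MASS)  # provides lda(); MASS ships with standard R installations

# Table 1, transposed: one row per sample, one column per variable.
train <- data.frame(
  type  = factor(rep(c("type1", "type2", "type3"), each = 3)),
  var1  = c(24, 28, 25, 50, 51, 46, 18, 20, 16),
  var2  = c(4, 5, 4, 9, 8, 9, 10, 9, 10),
  var3  = c(7, 7, 7, 12, 12, 12, 9, 6, 6),
  var4  = c(4, 5, 4, 10, 12, 9, 2, 2, 2),
  var5  = c(4, 5, 4, 10, 9, 10, 3, 2, 3),
  var6  = c(5, 4, 5, 2, 3, 2, 1, 3, 5),
  var7  = c(5, 4, 5, 7, 7, 7, 3, 3, 3),
  var8  = c(3, 4, 3, 10, 10, 8, 4, 2, 4),
  var9  = c(3, 4, 3, 2, 2, 2, 2, 2, 2),
  var10 = c(3, 3, 3, 4, 4, 4, 3, 1, 2)
)

# Table 2 as a one-row data frame with the same column names.
unknown <- data.frame(var1 = 23, var2 = 4, var3 = 7, var4 = 4, var5 = 4,
                      var6 = 6, var7 = 5, var8 = 3, var9 = 3, var10 = 3)

# With only 9 samples and 10 variables, lda() warns that the variables
# are collinear; the warning is suppressed here to keep the sketch short.
fit  <- suppressWarnings(lda(type ~ ., data = train))
pred <- predict(fit, newdata = unknown)

pred$class      # always one of the training types, never "none"
pred$posterior  # posterior probabilities over the training types only
```

The posterior probabilities are normalised over the training types, so they
always sum to 1 even for a sample that resembles none of them. This is the
behaviour that causes the false positive concern.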



Thanks

RS


