Thank you so much, Jessica,

The specifics of my case: I have a very detailed variable, 'Interests', 
which can take several thousand possible values. Each customer usually has 
3-10 different interests. For example:
customer_id|...|interests
10000001   |...| cycling, swimming, cooking
10000002   |...| cooking, singing, dancing

The total number of distinct values is several thousand. I'm curious how 
to use these interests in an SVM (represent each customer as a vector of real 
numbers with several thousand elements?).
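
To make the question concrete, here is a minimal sketch in base R of the 
representation I have in mind: one 0/1 indicator column per distinct interest. 
The data frame `customers` only mirrors the two toy rows above, and the 
comma-separated storage of the field is an assumption.

```r
# Toy data mirroring the example above (hypothetical; the real variable
# has several thousand distinct interests)
customers <- data.frame(
  customer_id = c(10000001L, 10000002L),
  interests   = c("cycling, swimming, cooking", "cooking, singing, dancing"),
  stringsAsFactors = FALSE
)

# Split each comma-separated string into a character vector of interests
interest_list <- strsplit(customers$interests, ",\\s*")

# Vocabulary: every distinct interest, sorted for a stable column order
vocab <- sort(unique(unlist(interest_list)))

# One row per customer, one 0/1 column per interest in the vocabulary
indicators <- do.call(
  rbind,
  lapply(interest_list, function(x) as.integer(vocab %in% x))
)
dimnames(indicators) <- list(as.character(customers$customer_id), vocab)

indicators
```

With 3-10 interests per customer, nearly every entry of such a matrix would be 
zero, so at full scale I assume it would have to be stored in some sparse form.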

If you have any ideas please let me know.


Thank you,
-Alex

________________________________
From: Jessica Streicher [j.streic...@micromata.de]
Sent: 27 March 2012 11:18
To: Alekseiy Beloshitskiy
Subject: Re: [R] normalization of multi-value string variable

Well, I'm not sure what you mean by scaling and normalizing strings, but if 
you want to represent the interests as numbers, you can do something like this:

n <- seq(1, length(unique(my_strings)))[factor(my_strings)]
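
As a small illustration, assuming `my_strings` holds one interest per element 
(the values below are made up), each string is mapped to the position of its 
factor level, and the levels are sorted alphabetically:

```r
# Hypothetical input: one interest per element
my_strings <- c("cycling", "swimming", "cooking", "cooking")

# Index 1..k by the factor's integer codes (levels: cooking, cycling, swimming)
n <- seq(1, length(unique(my_strings)))[factor(my_strings)]
n
# 2 3 1 1
```

The codes are only labels, so their numeric order carries no meaning. Note also 
that a whole string such as 'cycling, swimming, cooking' would count as a single 
value here unless it is split first.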


Am 26.03.2012 um 18:50 schrieb Alekseiy Beloshitskiy:

Hi All,

I need to normalize/scale a string variable that represents the interests of 
customers (e.g., 'cycling, rollerblading, swimming', etc.).

Does anybody know how to do this? I then want to use it along with other 
numeric variables for SVM classification.

I would appreciate any advice.

-Alex

[[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



