On Sat, Dec 14, 2013 at 12:29 PM, Amit Saha <amitsaha...@gmail.com> wrote:
> On Sat, Dec 14, 2013 at 12:14 PM, Michael Crawford <dalu...@gmail.com> wrote:
>> I found this piece of code on github
>>
>> https://gist.github.com/kljensen/5452382
>>
>> def one_hot_dataframe(data, cols, replace=False):
>>     """ Takes a dataframe and a list of columns that need to be encoded.
>>     Returns a 3-tuple comprising the data, the vectorized data,
>>     and the fitted vectorizor.
>>     """
>>     vec = DictVectorizer()
>>     mkdict = lambda row: dict((col, row[col]) for col in cols)
>> #<<<<<<<<<<<<<<<<<<
>>     vecData = pandas.DataFrame(vec.fit_transform(data[cols].apply(mkdict,
>> axis=1)).toarray())
>>     vecData.columns = vec.get_feature_names()
>>     vecData.index = data.index
>>     if replace is True:
>>         data = data.drop(cols, axis=1)
>>         data = data.join(vecData)
>>     return (data, vecData, vec)
>>
>> I don't understand how that lambda expression works.
>> For starters where did row come from?
>> How did it know it was working on data?
>
> Consider this simple example:
>
> >>> l = lambda x: x**2
> >>> apply(l, (3,))
> 9
>
> A lambda is an anonymous function. So, when you use apply(), the
> lambda, l gets the value 3 in x and then returns x**2 which is 9 in
> this case.
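
(Side note on the quoted example: the built-in apply() exists only in
Python 2. A minimal Python 3 sketch of the same idea follows; the names
square, args and result are my own choices for illustration.)

```python
# Python 3 sketch of the quoted example; the builtin apply() was removed in Python 3.
square = lambda x: x ** 2   # an anonymous function, bound here to the name "square"

args = (3,)                 # the same argument tuple as in apply(l, (3,))
result = square(*args)      # unpacking the tuple does what apply() used to do
print(result)               # prints 9
```

Either way, x is just the parameter name; it gets its value from whoever
calls the function.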

Argh, no, sorry, that doesn't answer your question. My bad, I should
have read your query properly.

-- 
http://echorand.me