Having done something similar, I'd say the options are (depending on the dataset):

1: Python to read, clean and classify the data, then R to do the analysis (e.g. regression analysis)
2: Python to read, clean and classify the data, and Python for the analysis as well
3: All in R

If you want to use Python for the analysis, most people would probably use Pandas for the data cleaning and SciPy for the stats. However, there are alternatives.

There is a tutorial that describes almost exactly the same problem as yours, using Pandas and some other packages:
http://blog.yhat.com/posts/logistic-regression-and-python.html

HTH,
Matt

<Trimmed original message>
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
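
P.S. To make option 2 a bit more concrete, here is a minimal sketch of the Pandas-for-cleaning, SciPy-for-stats route. The column names (hours, score), the helper names and the data are all made up for illustration and are not from your dataset. I have also assumed a plain linear fit with scipy.stats.linregress as a stand-in for whatever analysis you actually need, so treat it as a starting point rather than the way to do it.

```python
import io

import pandas as pd
from scipy import stats

# Hypothetical raw data (invented for this example): one missing value
# and one non-numeric value that need cleaning before any analysis.
RAW_CSV = """hours,score
1,52
2,54
3,
4,58
five,60
6,62
"""


def load_and_clean(csv_text):
    """Read CSV text, coerce both columns to numbers, drop unusable rows."""
    df = pd.read_csv(io.StringIO(csv_text))
    for col in ("hours", "score"):
        # Anything that cannot be parsed as a number becomes NaN.
        df[col] = pd.to_numeric(df[col], errors="coerce")
    # Remove rows where either column is missing or was unparseable.
    return df.dropna().reset_index(drop=True)


def fit_line(df):
    """Least-squares fit of score against hours using SciPy."""
    return stats.linregress(df["hours"].to_numpy(), df["score"].to_numpy())


clean = load_and_clean(RAW_CSV)
result = fit_line(clean)
print(clean)
print("slope:", result.slope, "intercept:", result.intercept)
```

With real data you would pass a filename to pd.read_csv instead of the in-memory text, and the cleaning step would depend on what is actually wrong with your file.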