kitty wrote:
I'm new to python and I have read through the tutorial on:
http://docs.python.org/tutorial/index.html
which was really good, but I have been an R user for 7 years and am
finding it difficult to do even basic things in python, for example I want
to import my data (a tab-delimited .txt file) so that I can index and select
a random sample of one column based on another column. my data has
2 columns named 'area' and 'change.dens'.
In R I would just
data<-read.table("FILE PATH\\Road.density.municipio.all.txt", header=T)
#header =T gives columns their headings so that I can call each individually
names(data)
attach(data)
Then to Index I would simply:
subset<-change.dens[area<2000&area>700] # so return change.dens values that
# have corresponding 'area's of between 700 and 2000
then to randomly sample a value from that I just need to
random<-sample(subset,1)
My question is how do I get python to do this???
Good question! This does look like something where R is easier to use
than Python, especially with the read.table() function doing most of the
work for you.
Here's one way to do it in Python.
# Open the file and read two tab-delimited columns.
# Note that there is minimal error checking here.
f = open('Road.density.municipio.all.txt')
next(f)  # Skip the header line (the column names).
data = []
for row in f:
    if not row.strip():
        # Skip blank lines.
        continue
    area, dens = row.split('\t')  # Split into two columns at tab
    pair = (float(area), float(dens))
    data.append(pair)
f.close()  # Close the file when done.
# Select items with specified areas.
subset = [pair for pair in data if 700 < pair[0] < 2000]
# Get a single random sample.
import random
sample = random.choice(subset)
# Get ten random samples, sampling with replacement.
samples = [random.choice(subset) for i in range(10)]
# Get ten random samples, without replacement.
copy = subset[:]
random.shuffle(copy)
samples = copy[:10]
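
If you would rather refer to the columns by name, the way R lets you, the
standard csv module can read the header row for you. The following is only a
rough sketch for Python 3, not the one true way to do it. It assumes the header
row contains exactly the names 'area' and 'change.dens', separated by tabs, and
that every value is a number. The function names read_table and sample_dens are
made up for this example.

```python
import csv
import random


def read_table(path):
    """Read a tab-delimited file with a header row into a list of dicts.

    Hypothetical helper: assumes every column holds numeric values.
    """
    with open(path, newline='') as f:
        # DictReader takes the column names from the first row of the file.
        reader = csv.DictReader(f, delimiter='\t')
        return [{name: float(value) for name, value in row.items()}
                for row in reader]


def sample_dens(rows, low=700, high=2000, k=1):
    """Return k 'change.dens' values, sampled without replacement, from the
    rows whose 'area' lies strictly between low and high."""
    subset = [row['change.dens'] for row in rows if low < row['area'] < high]
    # random.sample raises ValueError if k is larger than len(subset).
    return random.sample(subset, k)
```

With those two helpers, rows = read_table('Road.density.municipio.all.txt')
followed by sample_dens(rows) gives a one-item list, much like sample(subset,1)
in R. Unlike the pairs in my code above, this returns only the change.dens
values, which is closer to what your R subset does.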
--
Steven
_______________________________________________
Tutor maillist - Tutor@python.org