[Tutor] Syntax for Simplest Way to Execute One Python Program Over 1000's of Datasets

2011-06-09 Thread B G
I'm trying to analyze thousands of different cancer datasets and run the
same Python program on them.  I use Windows XP, Python 2.7 and the IDLE
interpreter.  I already have the input files in a directory and I want to
learn the syntax for the quickest way to execute the program over all these
datasets.

As an example, for the sample Python program below, I don't want to have to
go into the program each time and change filename and countfile.  A
computer could do this much quicker than I ever could.  Thanks in advance!


import string

filename = 'draft1.txt'
countfile = 'draft1_output.txt'

def add_word(counts, word):
    # Increment the tally for word, starting at 1 the first time it is seen.
    if word in counts:
        counts[word] += 1
    else:
        counts[word] = 1

def get_word(item):
    # Strip surrounding digits and punctuation, then lowercase.
    item = item.strip(string.digits)
    item = item.lstrip(string.punctuation)
    item = item.rstrip(string.punctuation)
    return item.lower()


def count_words(text):
    text = ' '.join(text.split('--')) #replace '--' with a space
    items = text.split() #leaves in leading and trailing punctuation,
                         #'--' not recognised by split() as a word separator
    counts = {}
    for item in items:
        word = get_word(item)
        if word != '':
            add_word(counts, word)
    return counts

infile = open(filename, 'r')
text = infile.read()
infile.close()

counts = count_words(text)

outfile = open(countfile, 'w')
outfile.write("%-18s%s\n" %("Word", "Count"))
outfile.write("===\n")

counts_list = sorted(counts.items())  # sorted() works on Python 2 and 3
for word in counts_list:
    outfile.write("%-18s%d\n" %(word[0], word[1]))

outfile.close()  # the parentheses are needed to actually call close()
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Syntax for Simplest Way to Execute One Python Program Over 1000's of Datasets

2011-06-09 Thread B G
Hmm, thanks for the help.  So I kinda got it working, although from an
efficiency perspective it leaves a lot to be desired.

I managed to do the following:
1) Create a script that gives me a list of all the filenames in the folder:
   import os
   path = "...\\Leukemia_Project"
   i = 0
   for files in os.walk(path):
       print(files)
       print("\n")
       i += 1
2) Manually copy and paste the resulting list from the IDLE interpreter
into a .txt file.
3) Open the .txt file in Excel and remove the few bits I don't need (i.e.
single quotes, etc.).
4) Write another Python script to print the result from step 3 to a new txt
file, where each filename has its own row (pretty easy to do b/c they are all
separated by tabs in the original txt file).
5) Put a for loop around my original Python script that takes as input each
line from the result of step 4.

As it stands it is painfully awkward, but it works and has already saved me
a lot of aggravation...
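
Steps 1 to 4 above can probably be collapsed into one script. os.walk()
yields a (dirpath, dirnames, filenames) tuple for every directory it visits,
which is why printing the loop variable directly shows brackets and quotes
that then need cleaning up. The following is only a minimal sketch under
that reading: the names write_file_list and listing_path are made up for
illustration, and it assumes every file under the folder is wanted.

```python
import os


def write_file_list(folder, listing_path):
    """Write the full path of every file under folder to listing_path.

    One path per line, so no manual clean-up is needed afterwards.
    Returns the list of paths that were written.
    """
    paths = []
    # Unpack the 3-tuple that os.walk() yields for each directory.
    for dirpath, dirnames, filenames in os.walk(folder):
        for name in sorted(filenames):
            paths.append(os.path.join(dirpath, name))

    with open(listing_path, "w") as listing:
        for path in paths:
            listing.write(path + "\n")
    return paths
```

Calling write_file_list with the project folder and the name of a text file
would leave one filename per row in that file, ready for the loop in step 5.
Since the function also returns the list, the intermediate text file could be
skipped altogether.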


On Thu, Jun 9, 2011 at 4:07 PM, Walter Prins  wrote:

>
>
> On 9 June 2011 20:49, B G  wrote:
>
>> I'm trying to analyze thousands of different cancer datasets and run the
>> same Python program on them.  I use Windows XP, Python 2.7 and the IDLE
>> interpreter.  I already have the input files in a directory and I want to
>> learn the syntax for the quickest way to execute the program over all these
>> datasets.
>>
>> As an example, for the sample Python program below, I don't want to have to
>> go into the program each time and change filename and countfile.  A
>> computer could do this much quicker than I ever could.  Thanks in advance!
>>
>
> OK, so make a function of the current program logic for a single filename,
> and make the base filename a parameter to that function.  Then write another
> function (or main program) that iterates over your directory tree and calls
> the processing function for each input file.  See documentation for module
> os, specifically os.walk().  (See here:
> http://docs.python.org/library/os.html )
>
> If that doesn't get you going then post back again.
>
> Walter
>
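
Walter's suggestion might look roughly like the sketch below: the word-count
logic becomes process_file(filename, countfile), and process_tree(top) walks
the directory and calls it once per input file. The function names
process_file and process_tree are invented here, and the sketch assumes the
inputs end in .txt and that each output should be named like the
draft1.txt -> draft1_output.txt pair in the original program. It is written
to run on both Python 2.7 and Python 3.

```python
import os
import string


def get_word(item):
    # Strip surrounding digits, then surrounding punctuation, then lowercase.
    item = item.strip(string.digits)
    item = item.strip(string.punctuation)
    return item.lower()


def count_words(text):
    counts = {}
    # Treat '--' as a word separator, as in the original program.
    for item in text.replace('--', ' ').split():
        word = get_word(item)
        if word:
            counts[word] = counts.get(word, 0) + 1
    return counts


def process_file(filename, countfile):
    """Count the words in filename and write the table to countfile."""
    with open(filename, 'r') as infile:
        counts = count_words(infile.read())
    with open(countfile, 'w') as outfile:
        outfile.write("%-18s%s\n" % ("Word", "Count"))
        outfile.write("===\n")
        for word, count in sorted(counts.items()):
            outfile.write("%-18s%d\n" % (word, count))


def process_tree(top):
    """Run process_file on every .txt input under top; return the inputs."""
    processed = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            base, ext = os.path.splitext(name)
            # Assumption: skip non-.txt files and our own earlier outputs.
            if ext.lower() != '.txt' or base.endswith('_output'):
                continue
            filename = os.path.join(dirpath, name)
            process_file(filename, os.path.join(dirpath, base + '_output.txt'))
            processed.append(filename)
    return processed
```

With this layout, a single call to process_tree on the data folder replaces
editing filename and countfile by hand. Skipping names that end in _output
means the script can be re-run without counting its own result files.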


[Tutor] Program to Predict Chemical Properties and Reactions

2011-07-16 Thread B G
I was just wondering how feasible it would be to build something like the
following.

Brief background: in chemistry, the ionization energy is defined as the
energy required to remove an electron from an atom. The ionization energies
of different elements follow general trends (i.e. moving left to right across
the periodic table, the ionization energy increases; moving down a group, the
ionization energy decreases).

What if I wanted something where we could type in a few elements and the
program would list them in order of increasing ionization energy
(for, say, the first ionization energy)?  Any suggestions for going about
this?  I'm trying to brush up on my chemistry and thought the funnest way to
do it would be to build a program that can do this (if it works, I'd also
like to replicate it for electronegativity, atomic radius, electron
affinity, and the 2nd, 3rd, and 4th ionization energies). I have a good idea
of pseudocode that could make this happen in terms of the rules of chemistry,
but I'm having some trouble visualizing the actual implementation.

Thanks!
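
One simple way to get the ordering described above is a lookup table plus
sorted(). The sketch below is only illustrative: it covers the first twelve
elements, the energies are approximate first ionization energies in kJ/mol
as found in standard reference tables, and the names
FIRST_IONIZATION_KJ_PER_MOL and order_by_ionization_energy are invented for
this example.

```python
# Approximate first ionization energies in kJ/mol (standard table values).
FIRST_IONIZATION_KJ_PER_MOL = {
    "H": 1312.0, "He": 2372.3,
    "Li": 520.2, "Be": 899.5, "B": 800.6, "C": 1086.5,
    "N": 1402.3, "O": 1313.9, "F": 1681.0, "Ne": 2080.7,
    "Na": 495.8, "Mg": 737.7,
}


def order_by_ionization_energy(symbols):
    """Return symbols sorted by increasing first ionization energy."""
    unknown = [s for s in symbols if s not in FIRST_IONIZATION_KJ_PER_MOL]
    if unknown:
        raise KeyError("no data for: " + ", ".join(unknown))
    # Sort by the table value rather than by the symbol itself.
    return sorted(symbols, key=FIRST_IONIZATION_KJ_PER_MOL.get)
```

The same pattern would carry over to electronegativity or atomic radius by
swapping in a different table.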


Re: [Tutor] Program to Predict Chemical Properties and Reactions

2011-07-16 Thread B G
Thanks, Emile, although I'm not sure I was completely clear about my
objective. What I really meant is: is there a way (via machine learning)
to give the computer a list of rules and exceptions, and then have it
predict the ionization energy based on these rules? Ultimately I'm pretty
interested in the idea of building a program that could predict the expected
result of a chemical reaction, but that's a huge project, so I wanted to
start with something much, much, much smaller and easier.

So I don't really want to manually input ionization energy, atomic radius,
etc., since I'm not sure how much that would help towards the bigger
goal.  For this project, I already have the ~15 rules that describe trends
on the periodic table.  I want to write a program that, based on these rules
(and the few exceptions to them), could predict the properties that I want.
Does this make sense?
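
For what it's worth, rules plus exceptions can be coded directly, without
any machine learning. The ~15 rules are not listed in this thread, so the
sketch below assumes only the two trends from the earlier message (ionization
energy rises left to right across a period and falls down a group) and four
commonly cited exceptions (B below Be, O below N, Al below Mg, S below P).
The names POSITION, EXCEPTIONS and predict_lower are hypothetical, and only
the first 18 elements are covered.

```python
# (period, group) for the first 18 elements.
POSITION = {
    "H": (1, 1), "He": (1, 18),
    "Li": (2, 1), "Be": (2, 2), "B": (2, 13), "C": (2, 14),
    "N": (2, 15), "O": (2, 16), "F": (2, 17), "Ne": (2, 18),
    "Na": (3, 1), "Mg": (3, 2), "Al": (3, 13), "Si": (3, 14),
    "P": (3, 15), "S": (3, 16), "Cl": (3, 17), "Ar": (3, 18),
}

# Pairs where the left-to-right trend is reversed, mapped to the element
# that actually has the LOWER first ionization energy.
EXCEPTIONS = {
    frozenset(("Be", "B")): "B",
    frozenset(("N", "O")): "O",
    frozenset(("Mg", "Al")): "Al",
    frozenset(("P", "S")): "S",
}


def predict_lower(a, b):
    """Predict which of a and b has the lower first ionization energy.

    Returns a symbol, or None when the two trend rules disagree (or when
    a and b are the same element) and no exception applies.
    """
    if a == b:
        return None
    pair = frozenset((a, b))
    if pair in EXCEPTIONS:
        return EXCEPTIONS[pair]
    period_a, group_a = POSITION[a]
    period_b, group_b = POSITION[b]
    # Lower energy is predicted for the element that is no further right
    # AND no further up the table than the other one.
    if group_a <= group_b and period_a >= period_b:
        return a
    if group_b <= group_a and period_b >= period_a:
        return b
    return None
```

The None case is the interesting part: for a pair such as Mg and Li the two
trends pull in opposite directions, so the rules alone cannot decide, and
more rules (or actual data) would be needed.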