[Tutor] Syntax for Simplest Way to Execute One Python Program Over 1000's of Datasets
I'm trying to analyze thousands of different cancer datasets and run the same Python program on them. I use Windows XP, Python 2.7 and the IDLE interpreter. I already have the input files in a directory and I want to learn the syntax for the quickest way to execute the program over all these datasets.

As an example, for the sample Python program below, I don't want to have to go into the program each time and change filename and countfile. A computer could do this much quicker than I ever could. Thanks in advance!

    import string

    filename = 'draft1.txt'
    countfile = 'draft1_output.txt'

    def add_word(counts, word):
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1

    def get_word(item):
        word = ''
        item = item.strip(string.digits)
        item = item.lstrip(string.punctuation)
        item = item.rstrip(string.punctuation)
        word = item.lower()
        return word

    def count_words(text):
        text = ' '.join(text.split('--'))  # replace '--' with a space
        items = text.split()  # leaves in leading and trailing punctuation,
                              # '--' not recognised by split() as a word separator
        counts = {}
        for item in items:
            word = get_word(item)
            if not word == '':
                add_word(counts, word)
        return counts

    infile = open(filename, 'r')
    text = infile.read()
    infile.close()

    counts = count_words(text)

    outfile = open(countfile, 'w')
    outfile.write("%-18s%s\n" % ("Word", "Count"))
    outfile.write("===\n")

    counts_list = sorted(counts.items())
    for word in counts_list:
        outfile.write("%-18s%d\n" % (word[0], word[1]))

    outfile.close()

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Syntax for Simplest Way to Execute One Python Program Over 1000's of Datasets
hmm, thanks for the help. So I kinda got it working, although from an efficiency perspective it leaves a lot to be desired. I managed to do the following:

1) Create a script that gives me a list of all the filenames in the folder:

    import os

    path = "...\\Leukemia_Project"
    i = 0
    # each item yielded by os.walk is a (dirpath, dirnames, filenames) tuple
    for files in os.walk(path):
        print(files)
        print("\n")
        i += 1

2) Manually copy and paste the resulting list from the IDLE interpreter into a .txt file.
3) Open the .txt file in Excel and remove the few bits I don't need (i.e. single quotes, etc.).
4) Write another Python script to print the result from step 3 to a new .txt file, where each filename has its own row (pretty easy to do because they are all separated by tabs in the original .txt file).
5) Put a for loop around my original Python script that takes as input each line from the result of step 4.

As it stands it is painfully awkward, but it works and has already saved me a lot of aggravation...

On Thu, Jun 9, 2011 at 4:07 PM, Walter Prins wrote:
> On 9 June 2011 20:49, B G wrote:
>
>> I'm trying to analyze thousands of different cancer datasets and run the
>> same python program on them. I use Windows XP, Python 2.7 and the IDLE
>> interpreter. I already have the input files in a directory and I want to
>> learn the syntax for the quickest way to execute the program over all these
>> datasets.
>>
>> As an example,for the sample python program below, I don't want to have to
>> go into the python program each time and change filename and countfile. A
>> computer could do this much quicker than I ever could. Thanks in advance!
>
> OK, so make a function of the current program logic for a single filename,
> and make the base filename a parameter to that function. Then write another
> function (or main program) that iterates over your directory tree and calls
> the processing function for each input file. See documentation for module
> os, specifically os.walk(). (See here:
> http://docs.python.org/library/os.html )
>
> If that doesn't get you going then post back again.
>
> Walter
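For anyone reading the archive later, here is a minimal sketch of what Walter's suggestion could look like: the word-count logic becomes a function that takes the input and output paths as parameters, and a second function walks the directory tree with os.walk() and calls it for each file. The names process_file and process_tree, the '.txt' input suffix and the '_output.txt' report suffix are assumptions made for illustration (the suffixes are modelled on 'draft1.txt' and 'draft1_output.txt' from the original post), and the word-counting function is condensed from the original script. It is not necessarily what was finally used.

```python
import os
import string


def count_words(text):
    """Count words as the original script does: treat '--' as a space,
    strip digits and punctuation from both ends, and lowercase."""
    counts = {}
    for item in text.replace('--', ' ').split():
        word = item.strip(string.digits).strip(string.punctuation).lower()
        if word:
            counts[word] = counts.get(word, 0) + 1
    return counts


def process_file(in_path, out_path):
    """Run the word count on one input file and write one report file."""
    with open(in_path, 'r') as infile:
        counts = count_words(infile.read())
    with open(out_path, 'w') as outfile:
        outfile.write("%-18s%s\n" % ("Word", "Count"))
        outfile.write("===\n")
        for word, count in sorted(counts.items()):
            outfile.write("%-18s%d\n" % (word, count))


def process_tree(root, in_suffix='.txt', out_suffix='_output.txt'):
    """Walk root and process every file whose name ends in in_suffix.

    Returns the list of report paths that were written.
    """
    written = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            # Skip reports left by an earlier run so they are not counted too.
            if not name.endswith(in_suffix) or name.endswith(out_suffix):
                continue
            in_path = os.path.join(dirpath, name)
            out_name = name[:-len(in_suffix)] + out_suffix
            out_path = os.path.join(dirpath, out_name)
            process_file(in_path, out_path)
            written.append(out_path)
    return written
```

Calling process_tree() with the path of the project folder should then replace steps 1 to 5 above with a single call: each report is written next to its input file, and no filenames need to be copied by hand.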
[Tutor] Program to Predict Chemical Properties and Reactions
I was just wondering how feasible it would be to build something like the following.

Brief background: in chemistry, the ionization energy is defined as the energy required to remove an electron from an atom. The ionization energies of different elements follow general trends (i.e. moving left to right across the periodic table, the ionization energy increases; moving down a group, the ionization energy decreases).

What if I wanted a program where we could type in a few elements and it would list them in order of increasing ionization energy (for, say, the first ionization energy)? Any suggestions for going about this?

I'm trying to brush up on my chemistry and thought the most fun way to do it would be to build a program that can do this (if it works, I'd also like to replicate it for electronegativity, atomic radius, electron affinity, and the 2nd, 3rd, and 4th ionization energies). I have a good idea of the pseudocode that could make this happen in terms of the rules of chemistry, but I'm having some trouble visualizing the actual implementation.

Thanks!
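One simple way to get the "type in a few elements, get them back in order" behaviour is a lookup table plus a sort. The sketch below assumes a small hand-entered table of approximate first ionization energies in kJ/mol for the first twelve elements; the names FIRST_IONIZATION_KJ_PER_MOL and order_by_ionization_energy are made up for this example, and the table would need extending for other elements or other properties.

```python
# Approximate first ionization energies in kJ/mol (small sample only).
FIRST_IONIZATION_KJ_PER_MOL = {
    "H": 1312.0, "He": 2372.3, "Li": 520.2, "Be": 899.5,
    "B": 800.6, "C": 1086.5, "N": 1402.3, "O": 1313.9,
    "F": 1681.0, "Ne": 2080.7, "Na": 495.8, "Mg": 737.7,
}


def order_by_ionization_energy(symbols):
    """Return the given element symbols sorted by increasing first
    ionization energy, using the lookup table above."""
    unknown = [s for s in symbols if s not in FIRST_IONIZATION_KJ_PER_MOL]
    if unknown:
        raise KeyError("no data for: " + ", ".join(unknown))
    return sorted(symbols, key=FIRST_IONIZATION_KJ_PER_MOL.get)
```

For example, order_by_ionization_energy(["F", "Li", "O"]) returns ["Li", "O", "F"]. The same pattern should work for electronegativity or atomic radius by swapping in a different table.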
Re: [Tutor] Program to Predict Chemical Properties and Reactions
Thanks, Emile-- although I'm not sure I was completely clear about my objective. What I really meant was: is there a way (via machine learning) to give the computer a list of rules and exceptions, and then have it predict the ionization energy based on those rules?

Ultimately I'm pretty interested in the idea of building a program that could predict the expected result of a chemical reaction, but that's a huge project, so I wanted to start with something much, much, much smaller and easier. So I don't really want to manually input ionization energy, atomic radius, etc., since I'm not sure how much that would help towards the bigger goal.

For this project, I already have the ~15 rules that describe trends on the periodic table. I want to write a program that, based on these rules (and the few exceptions to them), could predict the results that I want. Does this make sense?
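A list of rules and exceptions can be encoded directly, without any machine learning, as a small rule-based comparison. The sketch below is only an illustration under stated assumptions: it covers periods 2 and 3, uses just the two trends mentioned earlier in this thread (first ionization energy rises left to right across a period and falls going down a group), and adds two commonly taught exceptions (group 13 below group 2, and group 16 below group 15, within the same period). These stand in for the ~15 rules, which are not given in the thread. The names POSITION, EXCEPTION_PAIRS and compare_first_ie are hypothetical. When the rules do not decide a pair, the function returns None rather than guessing.

```python
# (period, group) for the main-group elements of periods 2 and 3.
POSITION = {
    "Li": (2, 1), "Be": (2, 2), "B": (2, 13), "C": (2, 14),
    "N": (2, 15), "O": (2, 16), "F": (2, 17), "Ne": (2, 18),
    "Na": (3, 1), "Mg": (3, 2), "Al": (3, 13), "Si": (3, 14),
    "P": (3, 15), "S": (3, 16), "Cl": (3, 17), "Ar": (3, 18),
}

# Assumed exceptions to the left-to-right trend: (left_group, right_group)
# pairs where the element on the LEFT has the higher first ionization energy.
EXCEPTION_PAIRS = {(2, 13), (15, 16)}


def compare_first_ie(a, b):
    """Compare the predicted first ionization energies of elements a and b.

    Returns 1 if a is predicted higher, -1 if lower, 0 if a and b are the
    same element, and None if these rules cannot decide.
    """
    if a == b:
        return 0
    period_a, group_a = POSITION[a]
    period_b, group_b = POSITION[b]
    if period_a == period_b:
        # Rule 1: within a period, further right means higher,
        # unless the pair of groups is a listed exception.
        left, right = sorted((group_a, group_b))
        right_is_higher = (left, right) not in EXCEPTION_PAIRS
        a_is_right = group_a > group_b
        return 1 if a_is_right == right_is_higher else -1
    if group_a == group_b:
        # Rule 2: within a group, higher up (smaller period) means higher.
        return 1 if period_a < period_b else -1
    # Different period and different group: these two rules alone
    # do not settle the comparison.
    return None
```

With this, compare_first_ie("Li", "F") gives -1, compare_first_ie("Be", "B") gives 1 because of the exception, and compare_first_ie("Li", "Mg") gives None. Further rules could be added as extra branches, and how often None comes back is one rough measure of how much the rule set still leaves undecided.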