Re: [Tutor] Question regular expressions - the non-greedy pattern
Now I think I got it. Thanks a lot again.

Marcin

On 22.01.2013 12:00, tutor-requ...@python.org wrote:
>
> Message: 1
> Date: Tue, 22 Jan 2013 11:31:01 +1100
> From: Steven D'Aprano
> To: tutor@python.org
> Subject: Re: [Tutor] Question regular expressions - the non-greedy
> pattern
> Message-ID: <50fdddc5.6030...@pearwood.info>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> On 22/01/13 10:11, Marcin Mleczko wrote:
>> Hello Hugo, hello Walter,
>>
>> first thank you very much for the quick reply.
>>
>> The functions used here i.e. re.match() are taken directly from the
>> example in the mentioned HowTo. I'd rather use re.findall() but I
>> think the general interpretation of the given regexp should be nearly
>> the same in both functions.
>
> Regular expressions are not just "nearly" the same, they are EXACTLY
> the same, in whatever re function you call, with one exception:
>
> re.match only matches at the start of the string.
>
>> So I'd like to neglect the choice of a particular function for a
>> moment and concentrate on the pure theory.
>> What I got so far:
>> in theory form s = '> '<.*?>' would match
>
> Incorrect. It will match
>
> '<'
> ''
> ''
> ''
>
> Why don't you try it and see?
>
> py> s = '
> py> import re
> py> re.findall('<.*?>', s)
> ['<', '', '', '']
>
> The re module is very stable. The above is what happens in every Python
> version between *at least* 1.5 and 3.3.
>
>> to achieve this the engine should:
>> 1. walk forward along the text until it finds <
>
> Correct. That matches the first "<".
>
>> 2. walk forward from that point until it finds >
>
> Correct. That matches the first ">".
>
> Since the regex has now found a match, it moves on to the next part
> of the regex. Since this regex is now complete, it is done, and
> returns what it has found.
>
>> 3. walk backward from that point (the one of >) until it finds <
>
> Regexes only backtrack on *misses*, not once they successfully find
> a match. Once a regex has found a match, it is done.
>
>> 4. return the string between < from 3. and > from 2. as this gives the
>> least possible string between < and >
>
> Incorrect.
>
>> Did I get this right so far? Is this (= least possible string between <
>> and >) what non-greedy really translates to?
>
> No. The ".*" regex searches forward as far as possible; the ".*?" searches
> forward as little as possible. They do not backtrack.
>
> The only time a non-greedy regex will backtrack is if the greedy version
> will backtrack. Since ".*" has no reason to backtrack, neither does ".*?".
>
>> For some reason I have not understood so far, the regexp engine in
>> Python omits step 3. and returns the string between < from 1. and >
>> from 2., resulting in '<'
>>
>> Am I right? If so, is there an easily graspable reason for the engine
>> designers to implement it this way?
>
> Because that's the way regexes work. You would need to learn about
> regular expression theory, which is not easy material. But you can start
> here:
>
> http://en.wikipedia.org/wiki/Regular_expression
>
> and for a more theoretical approach:
>
> http://en.wikipedia.org/wiki/Chomsky_hierarchy
> http://en.wikipedia.org/wiki/Regular_language
> http://en.wikipedia.org/wiki/Regular_grammar
>
> If you don't understand all the theory, don't worry, neither do I.

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
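The behaviour discussed in this message can be checked with a short script. This is only a sketch: the sample string `s` and the names `non_greedy`, `greedy` and `innermost` are assumptions chosen for illustration (the string deliberately starts with a doubled "<"), not necessarily the exact input used in the thread.

```python
import re

# Hypothetical sample text; the doubled '<' at the start is deliberate.
s = '<<html><head><title>Title</title>'

# Non-greedy: the match starts at the leftmost '<' and '.*?' then consumes
# as little as possible, so each match ends at the nearest following '>'.
non_greedy = re.findall('<.*?>', s)

# Greedy: '.*' consumes as much as possible, so the single match runs
# from the first '<' to the last '>' in the string.
greedy = re.findall('<.*>', s)

# If the shortest "tag-like" pieces are wanted, one option is to forbid
# '<' and '>' inside the match instead of relying on non-greediness.
innermost = re.findall('<[^<>]*>', s)

print(non_greedy)
print(greedy)
print(innermost)
```

With this input the non-greedy pattern still returns a first match that begins with two "<" characters: the engine fixes the starting point first and only then decides how far to extend, so non-greedy shortens the right-hand end of a match, never the left-hand end.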
[Tutor] Calculate hours
Hello All,

I originally wrote this program to calculate and print the employee with the most hours worked in a week. I would now like to change this to calculate and print the hours for all 8 employees in ascending order.

The employees are named employee 0 - 7

Any ideas?

Thanks,
Tony

Code below:

# Create table of hours worked

matrix = [
    [2, 4, 3, 4, 5, 8, 8],
    [7, 3, 4, 3, 3, 4, 4],
    [3, 3, 4, 3, 3, 2, 2],
    [9, 3, 4, 7, 3, 4, 1],
    [3, 5, 4, 3, 6, 3, 8],
    [3, 4, 4, 6, 3, 4, 4],
    [3, 7, 4, 8, 3, 8, 4],
    [6, 3, 5, 9, 2, 7, 9]]

maxRow = sum(matrix[0])  # Get sum of the first row in maxRow
indexOfMaxRow = 0

for row in range(1, len(matrix)):
    if sum(matrix[row]) > maxRow:
        maxRow = sum(matrix[row])
        indexOfMaxRow = row

print("Employee", indexOfMaxRow, "has worked: ", maxRow, "hours")
Re: [Tutor] Calculate hours
On 01/22/2013 09:52 PM, anthonym wrote:
> Hello All,
>
> I originally wrote this program to calculate and print the employee
> with the most hours worked in a week. I would now like to change this
> to calculate and print the hours for all 8 employees in ascending
> order.
>
> The employees are named employee 0 - 7
>
> Any ideas?
>
> Thanks,
> Tony
>
> Code below:
>
> # Create table of hours worked
>
> matrix = [
>     [2, 4, 3, 4, 5, 8, 8],
>     [7, 3, 4, 3, 3, 4, 4],
>     [3, 3, 4, 3, 3, 2, 2],
>     [9, 3, 4, 7, 3, 4, 1],
>     [3, 5, 4, 3, 6, 3, 8],
>     [3, 4, 4, 6, 3, 4, 4],
>     [3, 7, 4, 8, 3, 8, 4],
>     [6, 3, 5, 9, 2, 7, 9]]
>
> maxRow = sum(matrix[0])  # Get sum of the first row in maxRow
> indexOfMaxRow = 0
>
> for row in range(1, len(matrix)):
>     if sum(matrix[row]) > maxRow:
>         maxRow = sum(matrix[row])
>         indexOfMaxRow = row
>
> print("Employee", indexOfMaxRow, "has worked: ", maxRow, "hours")

There is an issue with this program: it omits the first row.

It's better to use enumerate, e.g.:

    for n, row in enumerate(matrix): ...

To make the change you need, use a list comprehension to make sums of all rows, sort it (using the list sort method), then iterate over it using enumerate() and print out "employee N, sum of hours:"

HTH, -m

--
Lark's Tongue Guide to Python: http://lightbird.net/larks/

Idleness is the mother of psychology. Friedrich Nietzsche
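One way the list-comprehension suggestion might look is sketched here. An assumption of this sketch is that each total is paired with its row index before sorting; sorting the bare sums alone would lose track of which employee each total belongs to. The name `totals` is illustrative only.

```python
matrix = [
    [2, 4, 3, 4, 5, 8, 8],
    [7, 3, 4, 3, 3, 4, 4],
    [3, 3, 4, 3, 3, 2, 2],
    [9, 3, 4, 7, 3, 4, 1],
    [3, 5, 4, 3, 6, 3, 8],
    [3, 4, 4, 6, 3, 4, 4],
    [3, 7, 4, 8, 3, 8, 4],
    [6, 3, 5, 9, 2, 7, 9]]

# Build (total hours, employee index) pairs so the index survives sorting.
totals = [(sum(row), n) for n, row in enumerate(matrix)]

# Tuples sort by their first item, then by the second on ties,
# so this orders by hours ascending and by employee index on equal hours.
totals.sort()

for hours, n in totals:
    print("Employee", n, "sum of hours:", hours)
```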
Re: [Tutor] Calculate hours
On 01/22/2013 09:52 PM, anthonym wrote:
> Hello All,
>
> I originally wrote this program to calculate and print the employee
> with the most hours worked in a week. I would now like to change this
> to calculate and print the hours for all 8 employees in ascending
> order.
>
> The employees are named employee 0 - 7
>
> Any ideas?
>
> Thanks,
> Tony
>
> Code below:
>
> # Create table of hours worked
>
> matrix = [
>     [2, 4, 3, 4, 5, 8, 8],
>     [7, 3, 4, 3, 3, 4, 4],
>     [3, 3, 4, 3, 3, 2, 2],
>     [9, 3, 4, 7, 3, 4, 1],
>     [3, 5, 4, 3, 6, 3, 8],
>     [3, 4, 4, 6, 3, 4, 4],
>     [3, 7, 4, 8, 3, 8, 4],
>     [6, 3, 5, 9, 2, 7, 9]]
>
> maxRow = sum(matrix[0])  # Get sum of the first row in maxRow
> indexOfMaxRow = 0
>
> for row in range(1, len(matrix)):
>     if sum(matrix[row]) > maxRow:
>         maxRow = sum(matrix[row])
>         indexOfMaxRow = row
>
> print("Employee", indexOfMaxRow, "has worked: ", maxRow, "hours")

What other constraints does your professor make? For example, are you allowed to use sorting with a custom key function?

If you want to keep it simple, then invert your logic to find the min element. Then after finding that min element, print it, delete it, and repeat the whole thing till the matrix is empty.

--
DaveA
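If sorting with a custom key function is allowed, a minimal sketch could look like the following. The name `order` and the choice to sort the row indices (rather than the rows themselves) are assumptions made here for illustration.

```python
matrix = [
    [2, 4, 3, 4, 5, 8, 8],
    [7, 3, 4, 3, 3, 4, 4],
    [3, 3, 4, 3, 3, 2, 2],
    [9, 3, 4, 7, 3, 4, 1],
    [3, 5, 4, 3, 6, 3, 8],
    [3, 4, 4, 6, 3, 4, 4],
    [3, 7, 4, 8, 3, 8, 4],
    [6, 3, 5, 9, 2, 7, 9]]

# Sort the employee indices, using each employee's weekly total as the key.
# sorted() is stable, so employees with equal totals keep their index order.
order = sorted(range(len(matrix)), key=lambda n: sum(matrix[n]))

for n in order:
    print("Employee", n, "has worked:", sum(matrix[n]), "hours")
```

Sorting indices leaves `matrix` untouched, which keeps the employee numbers meaningful after the sort.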
Re: [Tutor] Calculate hours
On 01/22/2013 10:08 PM, Mitya Sirenef wrote:
> On 01/22/2013 09:52 PM, anthonym wrote:
>> Hello All,
>>
>> I originally wrote this program to calculate and print the employee
>> with the most hours worked in a week. I would now like to change this
>> to calculate and print the hours for all 8 employees in ascending
>> order.
>>
>> The employees are named employee 0 - 7
>>
>> Any ideas?
>>
>> Thanks,
>> Tony
>>
>> Code below:
>>
>> # Create table of hours worked
>>
>> matrix = [
>>     [2, 4, 3, 4, 5, 8, 8],
>>     [7, 3, 4, 3, 3, 4, 4],
>>     [3, 3, 4, 3, 3, 2, 2],
>>     [9, 3, 4, 7, 3, 4, 1],
>>     [3, 5, 4, 3, 6, 3, 8],
>>     [3, 4, 4, 6, 3, 4, 4],
>>     [3, 7, 4, 8, 3, 8, 4],
>>     [6, 3, 5, 9, 2, 7, 9]]
>>
>> maxRow = sum(matrix[0])  # Get sum of the first row in maxRow
>> indexOfMaxRow = 0
>>
>> for row in range(1, len(matrix)):
>>     if sum(matrix[row]) > maxRow:
>>         maxRow = sum(matrix[row])
>>         indexOfMaxRow = row
>>
>> print("Employee", indexOfMaxRow, "has worked: ", maxRow, "hours")
>
> There is an issue with this program: it omits the first row.

No, it doesn't. The OP fills in item 0 in the initial values for maxRow and indexOfMaxRow. Then he figures he can skip that row in the loop, which is correct.

> It's better to use enumerate, e.g.:
>
>     for n, row in enumerate(matrix): ...
>
> To make the change you need, use list comprehension to make sums of all
> rows, sort it (using list sort method); iterate over it using
> enumerate() and print out "employee N, sum of hours:"
>
> HTH, -m

--
DaveA
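The point about row 0 can be checked with a small experiment. The sketch below wraps the OP's loop in a hypothetical helper, `index_of_max_row` (a name introduced here, not from the thread), and runs it on a made-up table whose first row is the largest.

```python
def index_of_max_row(matrix):
    """Return (index, total) of the row with the largest sum.

    Row 0 seeds the running maximum, so the loop can start at row 1
    without row 0 ever being left out of the comparison.
    """
    max_row = sum(matrix[0])
    index = 0
    for row in range(1, len(matrix)):
        if sum(matrix[row]) > max_row:
            max_row = sum(matrix[row])
            index = row
    return index, max_row


# Made-up data: the first row has the largest total.
first_is_largest = [[9, 9], [1, 1], [2, 2]]
result = index_of_max_row(first_is_largest)
print(result)
```

If row 0 were really skipped, this call could never report index 0 as the winner.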
Re: [Tutor] Calculate hours
On 01/22/2013 10:34 PM, Dave Angel wrote:
> On 01/22/2013 10:08 PM, Mitya Sirenef wrote:
>> On 01/22/2013 09:52 PM, anthonym wrote:
>>> Hello All,
>>>
>>> I originally wrote this program to calculate and print the employee
>>> with the most hours worked in a week. I would now like to change this
>>> to calculate and print the hours for all 8 employees in ascending
>>> order.
>>>
>>> The employees are named employee 0 - 7
>>>
>>> Any ideas?
>>>
>>> Thanks,
>>> Tony
>>>
>>> Code below:
>>>
>>> # Create table of hours worked
>>>
>>> matrix = [
>>>     [2, 4, 3, 4, 5, 8, 8],
>>>     [7, 3, 4, 3, 3, 4, 4],
>>>     [3, 3, 4, 3, 3, 2, 2],
>>>     [9, 3, 4, 7, 3, 4, 1],
>>>     [3, 5, 4, 3, 6, 3, 8],
>>>     [3, 4, 4, 6, 3, 4, 4],
>>>     [3, 7, 4, 8, 3, 8, 4],
>>>     [6, 3, 5, 9, 2, 7, 9]]
>>>
>>> maxRow = sum(matrix[0])  # Get sum of the first row in maxRow
>>> indexOfMaxRow = 0
>>>
>>> for row in range(1, len(matrix)):
>>>     if sum(matrix[row]) > maxRow:
>>>         maxRow = sum(matrix[row])
>>>         indexOfMaxRow = row
>>>
>>> print("Employee", indexOfMaxRow, "has worked: ", maxRow, "hours")
>>
>> There is an issue with this program: it omits the first row.
>
> No, it doesn't. The OP fills in item 0 in the initial values for maxRow
> and indexOfMaxRow. Then he figures he can skip that row in the loop,
> which is correct.

Yes, I noticed after writing the reply. To the OP: that's an odd way to handle it; you can set maxRow to 0 and then iterate over all rows.

-m

--
Lark's Tongue Guide to Python: http://lightbird.net/larks/

Habit is stronger than reason. George Santayana
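A sketch of that alternative, combined with the earlier enumerate() suggestion, might look like this. It assumes hours are never negative, which is what makes 0 a safe starting value; the snake_case names are a stylistic choice made here.

```python
matrix = [
    [2, 4, 3, 4, 5, 8, 8],
    [7, 3, 4, 3, 3, 4, 4],
    [3, 3, 4, 3, 3, 2, 2],
    [9, 3, 4, 7, 3, 4, 1],
    [3, 5, 4, 3, 6, 3, 8],
    [3, 4, 4, 6, 3, 4, 4],
    [3, 7, 4, 8, 3, 8, 4],
    [6, 3, 5, 9, 2, 7, 9]]

# Start from 0 (safe only because hours cannot be negative)
# and look at every row, including the first one.
max_row = 0
index_of_max_row = 0

for n, row in enumerate(matrix):
    total = sum(row)  # sum each row once instead of twice
    if total > max_row:
        max_row = total
        index_of_max_row = n

print("Employee", index_of_max_row, "has worked:", max_row, "hours")
```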
Re: [Tutor] Calculate hours
Thanks Dave. I think I would like to keep it simple. How would I get it to repeat and print before deleting?

On 1/22/13 7:10 PM, "Dave Angel" wrote:
> On 01/22/2013 09:52 PM, anthonym wrote:
>> Hello All,
>>
>> I originally wrote this program to calculate and print the employee
>> with the most hours worked in a week. I would now like to change this
>> to calculate and print the hours for all 8 employees in ascending
>> order.
>>
>> The employees are named employee 0 - 7
>>
>> Any ideas?
>>
>> Thanks,
>> Tony
>>
>> Code below:
>>
>> # Create table of hours worked
>>
>> matrix = [
>>     [2, 4, 3, 4, 5, 8, 8],
>>     [7, 3, 4, 3, 3, 4, 4],
>>     [3, 3, 4, 3, 3, 2, 2],
>>     [9, 3, 4, 7, 3, 4, 1],
>>     [3, 5, 4, 3, 6, 3, 8],
>>     [3, 4, 4, 6, 3, 4, 4],
>>     [3, 7, 4, 8, 3, 8, 4],
>>     [6, 3, 5, 9, 2, 7, 9]]
>>
>> maxRow = sum(matrix[0])  # Get sum of the first row in maxRow
>> indexOfMaxRow = 0
>>
>> for row in range(1, len(matrix)):
>>     if sum(matrix[row]) > maxRow:
>>         maxRow = sum(matrix[row])
>>         indexOfMaxRow = row
>>
>> print("Employee", indexOfMaxRow, "has worked: ", maxRow, "hours")
>
> What other constraints does your professor make? For example, are you
> allowed to use sorting with custom key function?
>
> If you want to keep it simple, then invert your logic to find the min
> element. Then after finding that min element, print it, delete it, and
> repeat the whole thing till the matrix is empty.
>
> --
> DaveA
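One possible reading of the "find the min, print it, delete it, repeat" idea is sketched below. It rests on an assumption not spelled out in the thread: the loop works on a separate list of (employee, hours) pairs, because deleting rows from `matrix` itself would shift the remaining indices and lose the employee numbers. The names `remaining` and `printed` are illustrative.

```python
matrix = [
    [2, 4, 3, 4, 5, 8, 8],
    [7, 3, 4, 3, 3, 4, 4],
    [3, 3, 4, 3, 3, 2, 2],
    [9, 3, 4, 7, 3, 4, 1],
    [3, 5, 4, 3, 6, 3, 8],
    [3, 4, 4, 6, 3, 4, 4],
    [3, 7, 4, 8, 3, 8, 4],
    [6, 3, 5, 9, 2, 7, 9]]

# Remember each employee's number next to the total before deleting anything.
remaining = [(n, sum(row)) for n, row in enumerate(matrix)]
printed = []  # kept only so the output order can be inspected afterwards

# Outer loop: repeat until nothing is left.
while remaining:
    # Inner loop: the OP's search with the comparison inverted (min, not max).
    min_pos = 0
    for pos in range(1, len(remaining)):
        if remaining[pos][1] < remaining[min_pos][1]:
            min_pos = pos

    # Print first, then delete, so the next pass finds the next smallest.
    employee, hours = remaining[min_pos]
    print("Employee", employee, "has worked:", hours, "hours")
    printed.append((employee, hours))
    del remaining[min_pos]
```

The "repeat" part is just the outer `while` loop wrapped around the existing search; the print happens inside that loop, immediately before the `del`.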