[Tutor] map vs. list comprehension

2006-02-14 Thread Michael Broe
I read somewhere that the function 'map' might one day be deprecated  
in favor of list comprehensions.

But I can't see a way to do this in a list comprehension:

 >>> map (pow, [2, 2, 2, 2], [1, 2, 3, 4])
[2, 4, 8, 16]

Is there a way?

Cheers,
Mike
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] map vs. list comprehension

2006-02-14 Thread Michael Broe

On Feb 14, 2006, at 3:46 PM, Andre Roberge wrote:

> [2**i for i in [1, 2, 3, 4]]

Ah yes, I'm sorry, I was thinking of the most general case, where the  
arguments are
two arbitrary lists. My example was too specific.

Is there a way to do something like the following in a list  
comprehension?

map(pow, [1, 7, 6, 2], [3, 8, 2, 5])

Mike
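
One common way to express the two-list case, sketched here rather than quoted from the thread, is to pair the lists up with the built-in zip() and unpack each pair inside the comprehension. The names bases, exponents and powers are illustrative only, and the snippet is written for Python 3, where print is a function.

```python
# Pair corresponding elements of two lists, then apply pow to each pair.
bases = [1, 7, 6, 2]
exponents = [3, 8, 2, 5]

powers = [pow(base, exponent) for base, exponent in zip(bases, exponents)]
print(powers)
```

Like map with two sequences in Python 3, zip() stops at the end of the shorter list.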



[Tutor] Unexpected behavior of +=

2006-02-15 Thread Michael Broe
I just discovered the following behavior, but can't find any  
documentation about it:

 >>> list = []
 >>> list = list + 'abc'
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
TypeError: can only concatenate list (not "str") to list

but:

 >>> list = []
 >>> list += 'abc'
 >>> list
['a', 'b', 'c']

Is this a special characteristic that has been added to the augmented  
assignment operator +=; or is it an automatic consequence of +=  
assignment being performed 'in place'? (Tho I can't see how it could  
be...)

It just seems very un-Pythonesque to be able to successfully  
concatenate objects of different types like this. And it seems very  
inconsistent with standard assignment.

Indeed, the Python Reference Manual, section 6.3.1 states:

"With the exception of assigning to tuples and multiple targets in a  
single statement, the assignment done by augmented assignment  
statements is handled the same way as normal assignments. Similarly,  
with the exception of the possible in-place behavior, the binary  
operation performed by augmented assignment is the same as the normal  
binary operations."

...which is patently not the case here.

I was scandalized lol!
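
A minimal sketch of what seems to be going on, assuming standard CPython behavior: for lists, += is handled by the in-place method list.__iadd__, which behaves like list.extend() and therefore accepts any iterable, whereas the binary + insists on another list. The names items, alias, extended and plus_error are illustrative, and the snippet is written for Python 3.

```python
items = []
try:
    items = items + "abc"      # binary +: the right operand must be a list
except TypeError as exc:
    plus_error = str(exc)      # keep the message for inspection

alias = items                  # a second name bound to the same list object
items += "abc"                 # in-place: iterates over the string, like extend()

extended = []
extended.extend("abc")         # the explicit method call that += mirrors

print(plus_error)
print(items, extended, alias is items)
```

Because the list is modified in place, alias still refers to the same object after the += line, which is the "possible in-place behavior" the Reference Manual mentions.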



[Tutor] 'in-place' methods

2006-02-17 Thread Michael Broe
I think I understand this sorting-a-list 'in place' stuff, and things  
of that kind (reversing for example); but I am finding it very  
difficult to get used to, since sorting a list doesn't return the  
sorted list as a value, but simply does the work as a side effect.

The place where it really trips me up is when I want to do something  
like pass a sorted list as a value in a program, say.

Here is a trivial example. Say I want to define a function to always  
print lists in sorted form. I can't do it as follows, tho it's what I  
want to do intuitively:

def sort_print(L):
    print L.sort()

This is on a strict analogy to writing a function that prints strings  
in upper case only:

def upper_print(s):
    print s.upper()

(The fact that the second works and the first doesn't really does bug  
me as a newbie.)

Anyway, first question: is the fact that the first doesn't work  
purely a function of the fact that lists are mutable? (At least I  
could kind of understand that, that methods work differently for  
objects of different types). Or is it possible to have methods  
like .upper() that return the result of performing the operation even  
for lists?

Second question. Do I really have to write the sort_print() function  
like this:

def sort_print(L):
    L.sort()
    print L

i.e. first perform the operation in-place, then pass the variable? Is  
this the idiomatic way of doing it?

Third question: is this in-place behavior of methods in effect  
restricted to lists, or are there other places I should be on the  
lookout for it?

Cheers,
Mike
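
A short sketch bearing on the first two questions, assuming the built-in sorted() is available (it has been since Python 2.4): sorted() returns a new sorted list, while list.sort() rearranges the list in place and returns None. The names original, copy_sorted and in_place_result are illustrative, and the snippet is written for Python 3.

```python
def sort_print(values):
    """Print a sorted copy of values without modifying the argument."""
    print(sorted(values))

original = [3, 1, 2]
sort_print(original)                  # original is still [3, 1, 2] here

copy_sorted = sorted(original)        # new list; original untouched
in_place_result = original.sort()     # sorts original itself, returns None

print(copy_sorted, original, in_place_result)
```

Other mutating methods such as list.reverse(), list.append() and dict.update() follow the same convention of returning None, so the behavior is tied to mutation rather than to lists alone.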




[Tutor] Unicode and regexes

2006-03-10 Thread Michael Broe
Does Python support the Unicode-flavored class-specifications in  
regular expressions, e.g. \p{L} ? It doesn't work in the following  
code, any ideas?

-

#! /usr/local/bin/python

""" usage: ./uni_read.py file
"""
import codecs
import re
import sys

text = codecs.open(sys.argv[1], mode='r', encoding='utf-8').read()

unicode_property_pattern = re.compile(r"\p{L}")
dot_pattern = re.compile(".")

letters = unicode_property_pattern.findall(text)
characters = dot_pattern.findall(text)

print 'var letters =', letters
print 'var characters = ', characters

-

The input file, encoded in utf-8 is

abc αβγ

The output is:

var letters = []
var characters =  [u'a', u'b', u'c', u' ', u'\u03b1', u'\u03b2',  
u'\u03b3']



Re: [Tutor] Unicode and regexes

2006-03-11 Thread Michael Broe
Thanks Kent, for breaking the bad news. I'm not angry, just terribly,  
terribly disappointed. :)

"From http://www.unicode.org/unicode/reports/tr18/ I see that \p{L} is
intended to select Unicode letters, and it is part of a large number of
selectors based on Unicode character properties."

Yeah, that's the main cite, and yeah, a large, large number. The only  
sane way to use regexes with Unicode. Also see Friedl's 'Mastering  
Regular Expressions' Chapter 3: or actually, if you are a Python only  
person, don't: it will make you weep.

"Python doesn't support this syntax. It has limited support for  
Unicode character properties [...]".

Umm Earth to Python-guys, you *have heard* of Unicode, right? Call me  
crazy, but in this day and age, I assume a scripting language with  
regex support will implement standard Unicode conventions, unless  
there is a compelling reason not to. Very odd.

Back to Perl. Right now. Just kidding. Not. Sheesh. This is a big  
flaw in Python, IMHO. I never saw it coming.
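
For readers who land here with the same problem, two standard-library workarounds can be sketched. Neither is a full substitute for \p{L}: the character class [^\W\d_] is only an approximation of "letter" (it takes the word characters and subtracts digits and the underscore), while unicodedata.category() checks the actual Unicode general category. The sample text matches the input file from the original post, the variable names are illustrative, and the snippet is written for Python 3, where str patterns are Unicode-aware by default.

```python
import re
import unicodedata

text = "abc \u03b1\u03b2\u03b3"   # 'abc' followed by Greek alpha, beta, gamma

# Approximation: word characters minus digits and underscore.
letter_pattern = re.compile(r"[^\W\d_]")
letters_by_regex = letter_pattern.findall(text)

# Exact test on the general category: 'Lu', 'Ll', 'Lt', 'Lm', 'Lo' all start with 'L'.
letters_by_category = [ch for ch in text
                       if unicodedata.category(ch).startswith("L")]

print(letters_by_regex)
print(letters_by_category)
```

The third-party regex package on PyPI is another option that accepts \p{L} directly, though it is not part of the standard library.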




[Tutor] Bigrams and nested dictionaries

2006-03-28 Thread Michael Broe
I'm playing with the whole idea of creating bigram (digram?)  
frequencies for text analysis and cryptographic and entropy analysis  
etc (this is as much an exercise in learning Python and programming  
as anything else, I realise everything has already been done  
somewhere somehow :) Though I *am* aiming to run this over unicoded  
phonetic representations of natural languages, which doesn't seem to  
be that common.

I implemented a single character count using a dictionary, and it  
worked really well:

Character_Count = {}
for character in text:
    Character_Count[character] = Character_Count.get(character, 0) + 1

And so for bigrams I was thinking of creating a data-structure that  
was a nested dictionary, where the key was the character in position  
1, and the value was a sub-dictionary that gave a count of each  
character in position 2.

So the dictionary might begin something like:

{'a': {'a':1, 'b':8, 'c':10,...}, 'b' : {'a':23, 'b':0, 'c': 
1,...},..., 'z' : {'a':11, 'b':0, 'c':0,...}}

The count of the character in position one could be retrieved by  
summing over the characters in position 2, so the bigram and single- 
character counts can be done in one pass.
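
To make the summing step concrete, here is a small sketch that only reads such a structure and does not build it. The counts are the made-up ones from the example above, trimmed to three letters, and the name first_position_totals is illustrative. Written for Python 3.

```python
# Outer key: character in position 1; inner dict: counts of the character in position 2.
bigram_counts = {
    'a': {'a': 1, 'b': 8, 'c': 10},
    'b': {'a': 23, 'b': 0, 'c': 1},
}

# Total occurrences of each character in position 1 of a bigram.
first_position_totals = {
    first: sum(followers.values())
    for first, followers in bigram_counts.items()
}
print(first_position_totals)
```

One caveat: the very last character of the text never appears in position 1, so its total from this method comes out one lower than its true single-character count.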

I don't want anyone to tell me how to do this :) I'd just like to get  
a feel for:

(i) is this idea of a nested dictionary a good/efficient/tractable  
data-structure?

(ii) is there a straightforward path to the goal of constructing this  
thing in a single pass over the input string (yes or no will suffice  
for the moment!), or is there a can of worms lurking in my future.

I'd rather have less information than more, but there is something  
about grabbing characters two-at-a-time, but moving forward one-at-a- 
time, and sticking the count down a level inside the dictionary  
that's just a little baffling at the moment. On the other hand it's  
like moving a two-character window across the text and recording  
information as you go, which seems like it should be a good thing to  
do computationally. I like being baffled, I'd just kinda like to know  
if this is a good problem to work on to gain enlightenment, or a bad  
one and I should think about a totally different path with a less  
jazzy data structure.





Re: [Tutor] Bigrams and nested dictionaries

2006-03-28 Thread Michael Broe
Well I ran into an interesting glitch already. For a dictionary D, I  
can pull out a nested value using this syntax:

 >>> D['b']['a']
23

and I can assign to this dictionary using

 >>> D['d'] = {'a':7, 'b':0}

but I can't assign like this:

 >>> D['d']['c'] = 1
TypeError: object does not support item assignment.

hmmm.




Re: [Tutor] Bigrams and nested dictionaries

2006-03-28 Thread Michael Broe
Aha! John wrote:

"Are you sure you haven't mistakenly assigned something other than a  
dict to D or D['d'] ?"

Thanks for the tip! Yup that was it (and apologies for not reporting  
the problem more precisely). I hadn't initialized the nested  
dictionary before trying to assign to it. (I think Perl doesn't  
require initialization of dictionaries prior to assignment, which in  
this case, would be a nice thing...)

 >>> D['a'] = {'a':1, 'b':2}   #oops
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
NameError: name 'D' is not defined
 >>> D = {}
 >>> D['a'] = {'a':1, 'b':2}
 >>> D
{'a': {'a': 1, 'b': 2}}
 >>> D['c']['a'] = 1   #ooops
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
KeyError: 'c'
 >>> D['c'] = {}
 >>> D['c']['a'] = 1
 >>> D
{'a': {'a': 1, 'b': 2}, 'c': {'a': 1}}
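
Two ways to get something close to Perl's create-on-first-use behavior, offered as a sketch rather than as anything suggested in the thread: dict.setdefault() creates the inner dictionary only when the key is missing, and collections.defaultdict (added in Python 2.5, so slightly later than this post) does the same thing automatically. The names nested and auto are illustrative. Written for Python 3.

```python
from collections import defaultdict

# Plain dict: setdefault returns the existing inner dict, or inserts a new one.
nested = {}
nested.setdefault('c', {})['a'] = 1

# defaultdict: a missing outer key is filled in by calling dict().
auto = defaultdict(dict)
auto['c']['a'] = 1

print(nested)
print(dict(auto))
```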

And Kent wrote:

"Encapsulating this in a generator is probably a good plan."

Yay, I get to play with generators... thanks for the suggestion, I  
would never have looked in that direction.
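
A minimal sketch of what such a generator might look like. This is one possible reading of Kent's suggestion, not his code; the function name bigrams is hypothetical. It yields each adjacent pair while stepping through the input one item at a time. Written for Python 3.

```python
def bigrams(text):
    """Yield (left, right) for every pair of adjacent items in text."""
    iterator = iter(text)
    try:
        left = next(iterator)      # preload the first item
    except StopIteration:
        return                     # empty input: nothing to yield
    for right in iterator:
        yield left, right
        left = right               # slide the window one step

print(list(bigrams("abcd")))
```

Because it works on any iterable, the same generator can be fed a string, a list, or characters read lazily from a file.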

--
Python: [x for S in L for x in S]
Mathematica: Flatten[L] (but where's the fun in that?)






Re: [Tutor] Bigrams and nested dictionaries

2006-04-03 Thread Michael Broe
Well coming up with this has made me really love Python. I worked on  
this with my online pythonpenpal Kyle, and here is what we came up  
with. Thanks to all for input so far.

My first idea was to use a C-type indexing for-loop, to grab a two- 
element sequence [i, i+1]:

dict = {}
for i in range(len(t) - 1):
    if not dict.has_key(t[i]):
        dict[t[i]] = {}
    if not dict[t[i]].has_key(t[i+1]):
        dict[t[i]][t[i+1]] = 1
    else:
        dict[t[i]][t[i+1]] += 1

Which works, but. Kyle had an alternative take, with no indexing, and  
after we worked on this strategy it seemed very Pythonesque, and ran  
almost twice as fast.



#!/usr/local/bin/python

import sys
file = open(sys.argv[1], 'rb').read()

# We imagine a 2-byte 'window' moving over the text from left to right
#
#           +-------+
#  L  o  n  | d   o |  n  .  M  i  c  h  a  e  l  m  a  s  t  e  r  m ...
#           +-------+
#
# At any given point, we call the leftmost byte visible in the window 'L',
# and the rightmost byte 'R'.
#
#           +-----------+
#  L  o  n  | L=d   R=o |  n  .  M  i  c  h  a  e  l  m  a  s  t  e  r  m ...
#           +-----------+
#
# When the program begins, the first byte is preloaded into L, and we
# position R at the second byte of the file.
#

dict = {}

L = file[0]
for R in file[1:]:  # move right edge of window across the file
    if not L in dict:
        dict[L] = {}

    if not R in dict[L]:
        dict[L][R] = 1
    else:
        dict[L][R] += 1

    L = R   # move character in R over to L

# that's it. here's a printout strategy:

for entry in dict:
    print entry, ':', sum(dict[entry].values())
    print dict[entry]
    print
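
For comparison, the same sliding-window count can be sketched more compactly in Python 3 with zip() and the collections module. This is an alternative formulation, not the code from the post: the function name count_bigrams is hypothetical, and Counter and defaultdict both postdate the Python version used above.

```python
from collections import Counter, defaultdict

def count_bigrams(text):
    """Return {left: Counter({right: count})} for adjacent pairs in text."""
    counts = defaultdict(Counter)
    # zip(text, text[1:]) pairs each item with its right-hand neighbour.
    for left, right in zip(text, text[1:]):
        counts[left][right] += 1
    return counts

counts = count_bigrams("abab")
for left, followers in counts.items():
    print(left, ':', sum(followers.values()))
    print(dict(followers))
```

Unlike the script above, this version returns an empty result for empty input instead of raising an IndexError on the first-byte lookup.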





