Re: [Tutor] Any Tutor there ? Removing redundant parameters in a models file having include files.

2010-03-02 Thread Steven D'Aprano
On Tue, 2 Mar 2010 11:25:44 am Andreas Kostyrka wrote:
> Furthermore I do not think that most of the "core" community has a
> problem with the alternate implementations, as they provide very
> useful functions (it helps on the architecture side, because it
> limits somewhat what can be done, it helps on the personal side,
> because it increases the value of Python skills, ...), ...

The Python development team values alternative implementations, as they
give Python the language a much wider user base.

It also allows other people to shoulder some of the development burden. 
For example, people who want Python without the limitations of the C 
call stack can use Stackless Python, instead of ordinary CPython. 
Google is sponsoring a highly optimized version of Python with a JIT 
compiler: Unladen Swallow. It looks likely that Unladen Swallow will 
end up being merged with CPython too, which will be a great benefit.


-- 
Steven D'Aprano
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


[Tutor] parsing a "chunked" text file

2010-03-02 Thread Andrew Fithian
Hi tutor,

I have a large text file that has chunks of data like this:

headerA n1
line 1
line 2
...
line n1
headerB n2
line 1
line 2
...
line n2

Where each chunk is a header and the lines that follow it (up to the next
header). A header has the number of lines in the chunk as its second field.

I would like to turn this file into a dictionary like:
dict = {'headerA':[line 1, line 2, ... , line n1], 'headerB':[line1, line 2,
... , line n2]}

Is there a way to do this with a dictionary comprehension or do I have to
iterate over the file with a "while 1" loop?

-Drew


Re: [Tutor] parsing a "chunked" text file

2010-03-02 Thread Christian Witts

Andrew Fithian wrote:
> Hi tutor,
>
> I have a large text file that has chunks of data like this:
>
> headerA n1
> line 1
> line 2
> ...
> line n1
> headerB n2
> line 1
> line 2
> ...
> line n2
>
> Where each chunk is a header and the lines that follow it (up to the
> next header). A header has the number of lines in the chunk as its
> second field.
>
> I would like to turn this file into a dictionary like:
> dict = {'headerA':[line 1, line 2, ... , line n1], 'headerB':[line1,
> line 2, ... , line n2]}
>
> Is there a way to do this with a dictionary comprehension or do I have
> to iterate over the file with a "while 1" loop?
>
> -Drew


A solution that could work for you is something like:

dict([(z.splitlines()[0].split()[0], z.splitlines()[1:])
      for z in open(filename).read().split('header') if z.strip()])

which gives:

{'A': ['line 1', 'line 2', '...', 'line n1'],
 'B': ['line 1', 'line 2', '...', 'line n2']}

Of course that doesn't look very pretty, and it only works for the specific
case shown in your sample data. Note that the keys lose their 'header'
prefix, because the text is split on that word.


--
Kind Regards,
Christian Witts
Business Intelligence




Re: [Tutor] parsing a "chunked" text file

2010-03-02 Thread spir
On Mon, 1 Mar 2010 22:22:43 -0800
Andrew Fithian wrote:

> Hi tutor,
> 
> I have a large text file that has chunks of data like this:
> 
> headerA n1
> line 1
> line 2
> ...
> line n1
> headerB n2
> line 1
> line 2
> ...
> line n2
> 
> Where each chunk is a header and the lines that follow it (up to the next
> header). A header has the number of lines in the chunk as its second field.
> 
> I would like to turn this file into a dictionary like:
> dict = {'headerA':[line 1, line 2, ... , line n1], 'headerB':[line1, line 2,
> ... , line n2]}
> 
> Is there a way to do this with a dictionary comprehension or do I have to
> iterate over the file with a "while 1" loop?

The nice way would be to split the source into a list of chunk texts, but there
seems to be no easy way to do this without traversing the source. If you
generate the source yourself, just add blank lines between chunks (so that the
separator is '\n\n'). Then a dict comprehension can map the items using some
makeChunk() function.

If this is not doable, I would traverse the lines using a "while n < s" loop,
where n is the current line number and s is the total number of lines.
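
Both ideas could be sketched roughly as follows. The names make_chunk,
parse_blank_separated and parse_by_count are only illustrative. The first
parser assumes the chunks are separated by a blank line; the second assumes
the count in each header can be trusted.

```python
def make_chunk(chunk_text):
    """Turn the text of one chunk into a (header_name, lines) pair."""
    lines = chunk_text.splitlines()
    name = lines[0].split()[0]        # first field of the header line
    return name, lines[1:]


def parse_blank_separated(text):
    """Assumption: a blank line separates the chunks in the source."""
    chunks = [c.strip() for c in text.split("\n\n") if c.strip()]
    return {name: body for name, body in map(make_chunk, chunks)}


def parse_by_count(lines):
    """Walk the lines by index, trusting the count given in each header."""
    result = {}
    n = 0                 # current line number
    s = len(lines)        # total number of lines
    while n < s:
        name, count = lines[n].split()
        count = int(count)
        result[name] = lines[n + 1:n + 1 + count]
        n += 1 + count    # jump straight to the next header
    return result
```

Neither function checks for malformed input; a header with a wrong count
would silently shift everything that follows it in parse_by_count.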

Denis
-- 


la vita e estrany

spir.wikidot.com



[Tutor] correctly format and insert html block using python into mysql table

2010-03-02 Thread Norman Khine
hello,
I have this code:

>>> import re
>>> import MySQLdb, csv, sys
>>> conn = MySQLdb.connect(host="localhost", user="usr", passwd="pass", db="databasename")
>>> c = conn.cursor()
>>> file = open('Data/asdsp-lao-farmers-et-batieng-products.html', 'r')
>>> data = file.read()
>>> get_records = re.compile(r"""<div class=\"flexicontent\">(.*)<\/div>""", re.DOTALL).findall
>>> get_titles = re.compile(r"""(.*)<\/h3>""").findall
>>> get_description = re.compile(r"""(.*)<\/div>""", re.DOTALL).findall

>>> block_record = []
>>> block_url = []
>>> records = get_records(data)
>>> for record in records:
...     description = get_description(record)
...     print(description) # see http://paste.lisp.org/+21XF for output
...     c.execute("INSERT INTO a (description) VALUES (%s)", description)
>>> conn.commit()  # commit is a method of the connection, not the cursor
>>> c.close()

the problem is that the 'html' comes out like:

http://paste.lisp.org/+21XF

is there a way to format the output so that it does not include the
\n\t\t and has the correct encoding?

thanks
norman
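
One possible way to drop the \n\t\t runs before inserting is sketched below.
It assumes each description is a plain string and that collapsing every run
of whitespace into a single space is acceptable for this HTML; clean_fragment
and the sample string are made up for illustration, and the database call is
left out because it depends on the setup.

```python
import re

# Matches any run of spaces, tabs and newlines.
_WHITESPACE = re.compile(r"\s+")


def clean_fragment(fragment):
    """Collapse runs of whitespace into one space and trim both ends."""
    return _WHITESPACE.sub(" ", fragment).strip()


# Made-up sample, shaped like the output described in the question.
raw = "\n\t\t<p>first line\n\t\tsecond line</p>\n\t"
cleaned = clean_fragment(raw)
print(cleaned)
```

Since findall returns a list, each string in it would be cleaned separately
and then passed to execute as a one-element tuple, for example
c.execute(sql, (cleaned,)).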


Re: [Tutor] parsing a "chunked" text file

2010-03-02 Thread Steven D'Aprano
On Tue, 2 Mar 2010 05:22:43 pm Andrew Fithian wrote:
> Hi tutor,
>
> I have a large text file that has chunks of data like this:
>
> headerA n1
> line 1
> line 2
> ...
> line n1
> headerB n2
> line 1
> line 2
> ...
> line n2
>
> Where each chunk is a header and the lines that follow it (up to the
> next header). A header has the number of lines in the chunk as its
> second field.

And what happens if the header is wrong? How do you handle situations 
like missing headers and empty sections, header lines which are wrong, 
and duplicate headers?

line 1
line 2
headerB 0
headerC 1
line 1
headerD 2
line 1
line 2
line 3
line 4
headerE 23
line 1
line 2
headerB 1
line 1



This is a policy decision: do you try to recover, raise an exception, 
raise a warning, pad missing lines as blank, throw away excess lines, 
or what?


> I would like to turn this file into a dictionary like:
> dict = {'headerA':[line 1, line 2, ... , line n1], 'headerB':[line1,
> line 2, ... , line n2]}
>
> Is there a way to do this with a dictionary comprehension or do I
> have to iterate over the file with a "while 1" loop?

I wouldn't do either. I would treat this as a pipe-line problem: you 
have a series of lines that need to be processed. You can feed them 
through a pipe-line of filters:

def skip_blanks(lines):
    """Remove leading and trailing whitespace, ignore blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield line

def collate_sections(lines):
    """Yield (header, lines) pairs, one for each section."""
    current_header = ""
    accumulator = []
    for line in lines:
        if line.startswith("header"):
            yield (current_header, accumulator)
            current_header = line
            accumulator = []
        else:
            accumulator.append(line)
    yield (current_header, accumulator)


Then put them together like this:


fp = open("my_file.dat", "r")
data = {}  # don't shadow the built-in dict
non_blank_lines = skip_blanks(fp)
sections = collate_sections(non_blank_lines)
for (header, lines) in sections:
    data[header] = lines
fp.close()


Of course you can add your own error checking.
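
As one possible form of that error checking, here is a sketch of an extra
pipeline stage that compares the count declared in each header with the
number of lines actually collected. It assumes headers look like "headerA 3"
and takes the policy of raising an exception on any inconsistency; the name
check_sections is made up, and a different policy (warn, pad, truncate)
would go in the same place.

```python
def check_sections(sections):
    """Yield (name, lines) pairs, raising ValueError on inconsistent input.

    `sections` is any iterable of (header_line, lines) pairs. A pair with
    an empty header and no lines is skipped silently.
    """
    seen = set()
    for header, lines in sections:
        if not header:
            if lines:
                raise ValueError("%d line(s) before any header" % len(lines))
            continue
        fields = header.split()
        # Assumption: a header is exactly "<name> <count>".
        if len(fields) != 2 or not fields[1].isdigit():
            raise ValueError("malformed header: %r" % header)
        name, declared = fields[0], int(fields[1])
        if name in seen:
            raise ValueError("duplicate header: %r" % name)
        if declared != len(lines):
            raise ValueError("%s declares %d line(s) but has %d"
                             % (name, declared, len(lines)))
        seen.add(name)
        yield name, lines


# Hand-written sample input in the (header_line, lines) shape.
good = [("", []), ("headerA 2", ["line 1", "line 2"]), ("headerB 0", [])]
data = dict(check_sections(good))
```

With this stage the dictionary keys are the bare header names, since the
count has been checked and is no longer needed.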


-- 
Steven D'Aprano


Re: [Tutor] Why is the max size so low in this mail list?

2010-03-02 Thread Lie Ryan
On 03/02/2010 04:13 AM, Wayne Watson wrote:
> See Subject. 40K here, but other Python lists allow for larger (total)
> sizes.

I don't know; I had never noticed it, which suggests that the 40K limit is
reasonable, at least to me. What happened when you posted a mail over 40K?
Was your mail bounced? And if it was, was the bounce message helpful, like
"please use pastebin or put a link"?



Re: [Tutor] Any Tutor there ? Removing redundant parameters in a models file having include files.

2010-03-02 Thread Karim Liateni


Hello,

Thanks a lot for this overview of the state of the language, very instructive.
I now see the tip of the iceberg  ;o)

Karim

Steven D'Aprano wrote:
> On Tue, 2 Mar 2010 07:07:57 am Karim Liateni wrote:
> > Thanks for the clarification!
> > I'm using standard Python so this is ok!
> > Why do people use proprietary Python?
> > To have more trouble? To be different from the rest of the community?
>
> Python is a language, but there can be many different implementations of
> that language, just like there are different C compilers or different
> Javascript engines.
>
> CPython is the version which was made first. It is the most common
> version, but it is not the only one. It is called CPython because it is
> written in C.
>
> Jython is a version of Python written in Java, and it was created by
> people wanting to use Python as a front-end to Java libraries, and to
> take advantage of Java's garbage collector.
>
> IronPython is Microsoft's version of Python written for .Net and Mono.
>
> PyPy is an experimental version of Python written in Python, used by
> people wanting to experiment with Python compilers.
>
> "Python for S60" is a version of Python written for Nokia's S60 devices.
>
> CapPython is an experimental version of Python designed for security.
>
> There are many others. They are all Python, but they have differences.
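
If a script needs to know which implementation it is running on, the
standard library's platform module can report it. A small sketch;
describe_interpreter is only an illustrative name, and the strings returned
depend on the interpreter in use.

```python
import platform


def describe_interpreter():
    """Return the implementation name and version as one string."""
    # python_implementation() returns a name such as 'CPython', 'PyPy',
    # 'Jython' or 'IronPython'.
    name = platform.python_implementation()
    version = platform.python_version()
    return "%s %s" % (name, version)


print(describe_interpreter())
```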
  




[Tutor] getting diagonals from a matrix

2010-03-02 Thread David Eccles (gringer)
I've managed to drum up some code to obtain a list containing joined diagonal
elements of a matrix (I'm making a word finder generator), but am wondering if
there's any better way to do this:

# setup so code snippet works properly
sizeW = 4
sizeH = 3
puzzleLayout = ['spam'] * (sizeW * sizeH)

# Starting with, say, an array with these indices:
# 0 1 2 3
# 4 5 6 7
# 8 9 A B

# I want the following items for a back diagonal (not in square brackets):
# [-2],[3],8 (+5)  | div 4 = -1, 0, 2
# [-1],4,9 (+5)    | div 4 = -1, 1, 2
# 0,5,A (+5)       | div 4 =  0, 1, 2
# 1,6,B (+5)       | div 4 =  0, 1, 2
# 2,7,[C] (+5)     | div 4 =  0, 1, 3
# 3,[8],[D] (+5)   | div 4 =  0, 2, 3

# in other words, increase sequence by sizeW + 1 each time (sizeW - 1
# for forward diagonals), only selecting if the line you're on matches
# the line you want to be on

# back as in backslash-like diagonal
# (// is floor division, so the row test gives the same result on Python 2 and 3)
puzzleDiagBack = [''.join([puzzleLayout[pos*(sizeW+1) + i]
        for pos in range(sizeH) if (pos*(sizeW+1) + i) // sizeW == pos])
    for i in range(-sizeH+1, sizeW)]
puzzleDiagBackRev = [''.join(reversed([puzzleLayout[pos*(sizeW+1) + i]
        for pos in range(sizeH) if (pos*(sizeW+1) + i) // sizeW == pos]))
    for i in range(-sizeH+1, sizeW)]
# fwd as in forwardslash-like diagonal
puzzleDiagFwdRev = [''.join([puzzleLayout[pos*(sizeW-1) + i]
        for pos in range(sizeH) if (pos*(sizeW-1) + i) // sizeW == pos])
    for i in range(sizeW+sizeH-1)]
puzzleDiagFwd = [''.join(reversed([puzzleLayout[pos*(sizeW-1) + i]
        for pos in range(sizeH) if (pos*(sizeW-1) + i) // sizeW == pos]))
    for i in range(sizeW+sizeH-1)]
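
An alternative that may be easier to read works with (row, column) pairs
instead of flat indices: along a backslash-like diagonal col - row is
constant, and along a forwardslash-like diagonal col + row is constant.
This is only a sketch. The name diagonals is invented, it assumes the same
row-major flat list as puzzleLayout, and the ordering of the results is a
choice made here.

```python
def diagonals(layout, width, height):
    """Return (back, fwd) lists of joined diagonals of a row-major grid.

    back: backslash-like diagonals, grouped by col - row
    fwd:  forwardslash-like diagonals, grouped by col + row
    Every diagonal is read from its top row downwards.
    """
    back = {}
    fwd = {}
    for row in range(height):
        for col in range(width):
            cell = layout[row * width + col]
            back.setdefault(col - row, []).append(cell)
            fwd.setdefault(col + row, []).append(cell)
    back_list = [''.join(back[k]) for k in sorted(back)]
    fwd_list = [''.join(fwd[k]) for k in sorted(fwd)]
    return back_list, fwd_list


# The 4 x 3 grid of indices 0..B used in the comments of the original code.
grid = list("0123456789AB")
back, fwd = diagonals(grid, 4, 3)
print(back)   # ['8', '49', '05A', '16B', '27', '3']
print(fwd)    # ['0', '14', '258', '369', '7A', 'B']
```

The reversed variants can then be built with a slice, for example
[d[::-1] for d in back].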

Cheers,
David Eccles (gringer)