Re: [Tutor] Any Tutor there ? Removing redundant parameters in a models file having include files.
On Tue, 2 Mar 2010 11:25:44 am Andreas Kostyrka wrote:
> Furthermore I do not think that most of the "core" community has a
> problem with the alternate implementations, as they provide very
> useful functions (it helps on the architecture side, because it
> limits somewhat what can be done, it helps on the personal side,
> because it increases the value of Python skills, ...), ...

The Python development team values alternative implementations, as they give Python the language a much wider user base. They also allow other people to shoulder some of the development burden.

For example, people who want Python without the limitations of the C call stack can use Stackless Python instead of ordinary CPython. Google is sponsoring a highly optimized version of Python with a JIT compiler: Unladen Swallow. It looks likely that Unladen Swallow will end up being merged with CPython too, which will be a great benefit.

--
Steven D'Aprano
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
[Tutor] parsing a "chunked" text file
Hi tutor,

I have a large text file that has chunks of data like this:

headerA n1
line 1
line 2
...
line n1
headerB n2
line 1
line 2
...
line n2

Where each chunk is a header and the lines that follow it (up to the next header). A header has the number of lines in the chunk as its second field.

I would like to turn this file into a dictionary like:

dict = {'headerA':[line 1, line 2, ... , line n1], 'headerB':[line1, line 2, ... , line n2]}

Is there a way to do this with a dictionary comprehension or do I have to iterate over the file with a "while 1" loop?

-Drew
Re: [Tutor] parsing a "chunked" text file
Andrew Fithian wrote:
> Hi tutor,
>
> I have a large text file that has chunks of data like this:
>
> headerA n1
> line 1
> line 2
> ...
> line n1
> headerB n2
> line 1
> line 2
> ...
> line n2
>
> Where each chunk is a header and the lines that follow it (up to the
> next header). A header has the number of lines in the chunk as its
> second field.
>
> I would like to turn this file into a dictionary like:
> dict = {'headerA':[line 1, line 2, ... , line n1], 'headerB':[line1,
> line 2, ... , line n2]}
>
> Is there a way to do this with a dictionary comprehension or do I
> have to iterate over the file with a "while 1" loop?
>
> -Drew

A solution that could work for you could be something like...

    dict([(z.splitlines()[0].split()[0], z.splitlines()[1:])
          for z in [x for x in open(filename).read().split('header') if x.strip()]])

which gives:

    {'A': ['line 1', 'line 2', '...', 'line n1'], 'B': ['line 1', 'line 2', '...', 'line n2']}

Of course that doesn't look very pretty and only works for a specific case as demonstrated on your sample data.

--
Kind Regards,
Christian Witts
Re: [Tutor] parsing a "chunked" text file
On Mon, 1 Mar 2010 22:22:43 -0800 Andrew Fithian wrote:
> Hi tutor,
>
> I have a large text file that has chunks of data like this:
>
> headerA n1
> line 1
> line 2
> ...
> line n1
> headerB n2
> line 1
> line 2
> ...
> line n2
>
> Where each chunk is a header and the lines that follow it (up to the next
> header). A header has the number of lines in the chunk as its second field.
>
> I would like to turn this file into a dictionary like:
> dict = {'headerA':[line 1, line 2, ... , line n1], 'headerB':[line1, line 2,
> ... , line n2]}
>
> Is there a way to do this with a dictionary comprehension or do I have to
> iterate over the file with a "while 1" loop?

The nice way would be to split the source into a list of chunk texts. But there seems to be no easy way to do this without traversing the source.

If the source is generated, just add blank lines between chunks (so that the separator is '\n\n'). Then a dict comprehension can map the items using any makeChunk() function.

If this is not doable, I would traverse the lines using a "while n < s" loop, where n is the current line number and s is the total number of lines.

Denis
--
la vita e estrany
spir.wikidot.com
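The count-driven traversal suggested above could be sketched roughly as follows. This is only an illustration, not code from the thread: the name parse_chunks is made up, and it assumes every header line has exactly two whitespace-separated fields (name and line count) and that the counts are correct.

```python
from itertools import islice

def parse_chunks(lines):
    """Build {header name: [lines]} using the count in each header line.

    Hypothetical helper: assumes each header is "<name> <count>".
    """
    lines = iter(lines)  # one shared iterator for headers and bodies
    chunks = {}
    for header_line in lines:
        header_line = header_line.strip()
        if not header_line:
            continue  # tolerate blank lines between chunks
        name, count = header_line.split()
        # islice consumes exactly `count` lines from the shared iterator,
        # so the next loop iteration starts at the next header.
        chunks[name] = [line.rstrip("\n") for line in islice(lines, int(count))]
    return chunks

sample = ["headerA 2\n", "line 1\n", "line 2\n", "headerB 1\n", "line 1\n"]
result = parse_chunks(sample)
```

Because the count decides where a chunk ends, this does not need the header lines to share a recognisable prefix; the flip side is that one wrong count throws off everything after it.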
[Tutor] correctly format and insert html block using python into mysql table
hello,

I have this code:

>>> import re
>>> import MySQLdb, csv, sys
>>> conn = MySQLdb.connect (host = "localhost",user = "usr", passwd= "pass",db = "databasename")
>>> c = conn.cursor()
>>> file = open('Data/asdsp-lao-farmers-et-batieng-products.html', 'r')
>>> data = file.read()
>>> get_records = re.compile(r""">> class=\"flexicontent\">(.*)<\/div>""", re.DOTALL).findall
>>> get_titles = re.compile(r"""(.*)<\/h3>""").findall
>>> get_description = re.compile(r"""(.*)<\/div>""", re.DOTALL).findall
>>> block_record = []
>>> block_url = []
>>> records = get_records(data)
>>> for record in records:
...     description = get_description(record)
...     print description # see http://paste.lisp.org/+21XF for output
...     c.execute("INSERT INTO a (description) VALUES (%s)", description)
>>> c.commit()
>>> c.close()

the problem is that the 'html' comes out like: http://paste.lisp.org/+21XF

is there a way to format the output so that it does not include the \n\t\t and has the correct encoding?

thanks
norman
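One possible way to get rid of the \n\t\t runs, sketched under assumptions: the helper name collapse_whitespace and the sample string are made up, and the real matches may need different handling. Note that findall() returns a list, and printing a list shows the repr of each string, escapes included, so part of what the paste shows may simply be that repr rather than damaged data.

```python
import re

def collapse_whitespace(text):
    """Replace every run of whitespace (newlines, tabs, spaces) with one space."""
    return re.sub(r"\s+", " ", text).strip()

# Made-up stand-in for what get_description(record) might return.
matches = ["\n\t\t<p>Coffee</p>\n\t\t<p>Tea</p>\n\t"]

# Clean each matched string individually rather than the list as a whole.
cleaned = [collapse_whitespace(m) for m in matches]
```

Each cleaned string could then be passed to execute() as a one-element tuple. Also note that in the Python DB-API, commit() is a method of the connection (conn.commit()), not of the cursor.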
Re: [Tutor] parsing a "chunked" text file
On Tue, 2 Mar 2010 05:22:43 pm Andrew Fithian wrote:
> Hi tutor,
>
> I have a large text file that has chunks of data like this:
>
> headerA n1
> line 1
> line 2
> ...
> line n1
> headerB n2
> line 1
> line 2
> ...
> line n2
>
> Where each chunk is a header and the lines that follow it (up to the
> next header). A header has the number of lines in the chunk as its
> second field.

And what happens if the header is wrong? How do you handle situations like missing headers and empty sections, header lines which are wrong, and duplicate headers?

line 1
line 2
headerB 0
headerC 1
line 1
headerD 2
line 1
line 2
line 3
line 4
headerE 23
line 1
line 2
headerB 1
line 1

This is a policy decision: do you try to recover, raise an exception, raise a warning, pad missing lines as blank, throw away excess lines, or what?

> I would like to turn this file into a dictionary like:
> dict = {'headerA':[line 1, line 2, ... , line n1], 'headerB':[line1,
> line 2, ... , line n2]}
>
> Is there a way to do this with a dictionary comprehension or do I
> have to iterate over the file with a "while 1" loop?

I wouldn't do either. I would treat this as a pipe-line problem: you have a series of lines that need to be processed. You can feed them through a pipe-line of filters:

    def skip_blanks(lines):
        """Remove leading and trailing whitespace, ignore blank lines."""
        for line in lines:
            line = line.strip()
            if line:
                yield line

    def collate_sections(lines):
        """Yield a (header, list of lines) pair for each section."""
        current_header = ""
        accumulator = []
        for line in lines:
            if line.startswith("header"):
                # A new header closes the previous section. Skip the
                # empty placeholder that precedes the first header.
                if current_header or accumulator:
                    yield (current_header, accumulator)
                current_header = line
                accumulator = []
            else:
                accumulator.append(line)
        if current_header or accumulator:
            yield (current_header, accumulator)

Then put them together like this:

    fp = open("my_file.dat", "r")
    data = {}  # don't shadow the built-in dict
    non_blank_lines = skip_blanks(fp)
    sections = collate_sections(non_blank_lines)
    for (header, lines) in sections:
        data[header] = lines
    fp.close()

Lines that appear before the first header end up under the key "", and the keys are the full header lines, including the count. Of course you can add your own error checking.

--
Steven D'Aprano
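As one illustration of the error checking mentioned above, here is a minimal sketch of the "raise an exception" policy. The function name check_sections is hypothetical, and it assumes the sections arrive as (header line, list of lines) pairs in which the header line has the form "<name> <count>"; other policies (warn, pad, truncate) would look different.

```python
def check_sections(sections):
    """Validate (header_line, lines) pairs and return {name: lines}.

    Hypothetical helper: raises ValueError on a missing or malformed
    header, a duplicate header, or a count that does not match.
    """
    data = {}
    for header_line, lines in sections:
        fields = header_line.split()
        if len(fields) != 2 or not fields[1].isdigit():
            raise ValueError("missing or malformed header: %r" % header_line)
        name, declared = fields[0], int(fields[1])
        if name in data:
            raise ValueError("duplicate header: %r" % name)
        if declared != len(lines):
            raise ValueError("%s declares %d lines but has %d"
                             % (name, declared, len(lines)))
        data[name] = lines
    return data

good = check_sections([("headerA 2", ["line 1", "line 2"]), ("headerB 0", [])])

def rejects(sections):
    """Return True if check_sections raises ValueError for these sections."""
    try:
        check_sections(sections)
    except ValueError:
        return True
    return False
```

With this policy the first problem stops the whole run, which is the simplest behaviour to reason about but may be too strict for files that are known to be messy.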
Re: [Tutor] Why is the max size so low in this mail list?
On 03/02/2010 04:13 AM, Wayne Watson wrote:
> See Subject. 40K here, but other Python lists allow for larger (total)
> sizes.

I don't know; I had never noticed the limit, which suggests that 40K is reasonable, at least for me.

What happened when you posted a mail larger than 40K? Did it bounce? And if it did, was the bounce message helpful, like "please use pastebin or put a link"?
Re: [Tutor] Any Tutor there ? Removing redundant parameters in a models file having include files.
Hello,

Thanks a lot for this overview of the state of the language, very instructive. I now see the tip of the iceberg ;o)

Karim

Steven D'Aprano wrote:
> On Tue, 2 Mar 2010 07:07:57 am Karim Liateni wrote:
>> Thanks for this precision!
>> I'm using standard python so this is ok!
>> Why do people use proprietary python ?
>> To have more trouble ? To be different from the rest of the community ?
>
> Python is a language, but there can be many different implementations
> of that language, just like there are different C compilers or
> different Javascript engines.
>
> CPython is the version which was made first, it is the most common
> version, but it is not the only one. It is called CPython because it
> is written in C.
>
> Jython is a version of Python written in Java, and it was created by
> people wanting to use Python as a front-end to Java libraries, and to
> take advantage of Java's garbage collector.
>
> IronPython is Microsoft's version of Python written for .Net and Mono.
>
> PyPy is an experimental version of Python written in Python, used by
> people wanting to experiment with Python compilers.
>
> "Python for S60" is a version of Python written for Nokia's S60
> devices.
>
> CapPython is an experimental version of Python designed for security.
>
> There are many others, they are all Python, but they have differences.
[Tutor] getting diagonals from a matrix
I've managed to drum up some code to obtain a list containing joined diagonal elements of a matrix (I'm making a word finder generator), but am wondering if there's any better way to do this:

    # setup so code snippet works properly
    sizeW = 4
    sizeH = 3
    puzzleLayout = ['spam'] * (sizeW * sizeH)

    # Starting with, say, an array with these indices:
    # 0 1 2 3
    # 4 5 6 7
    # 8 9 A B

    # I want the following items for a back diagonal (not in square brackets):
    # [-2],[3],8 (+5)  | div 4 = -1, 0, 2
    # [-1],4,9 (+5)    | div 4 = -1, 1, 2
    # 0,5,A (+5)       | div 4 = 0, 1, 2
    # 1,6,B (+5)       | div 4 = 0, 1, 2
    # 2,7,[C] (+5)     | div 4 = 0, 1, 3
    # 3,[8],[D] (+5)   | div 4 = 0, 2, 3

    # in other words, increase sequence by sizeW + 1 each time (sizeW - 1
    # for forward diagonals), only selecting if the line you're on matches
    # the line you want to be on
    # (// is floor division, so the test also works under Python 3)

    # back as in backslash-like diagonal
    puzzleDiagBack = [
        ''.join([puzzleLayout[pos*(sizeW+1) + i]
                 for pos in range(sizeH)
                 if (pos*(sizeW+1) + i) // sizeW == pos])
        for i in range(-sizeH+1, sizeW)]
    puzzleDiagBackRev = [
        ''.join(reversed([puzzleLayout[pos*(sizeW+1) + i]
                          for pos in range(sizeH)
                          if (pos*(sizeW+1) + i) // sizeW == pos]))
        for i in range(-sizeH+1, sizeW)]

    # fwd as in forwardslash-like diagonal
    puzzleDiagFwdRev = [
        ''.join([puzzleLayout[pos*(sizeW-1) + i]
                 for pos in range(sizeH)
                 if (pos*(sizeW-1) + i) // sizeW == pos])
        for i in range(sizeW+sizeH-1)]
    puzzleDiagFwd = [
        ''.join(reversed([puzzleLayout[pos*(sizeW-1) + i]
                          for pos in range(sizeH)
                          if (pos*(sizeW-1) + i) // sizeW == pos]))
        for i in range(sizeW+sizeH-1)]

Cheers,
David Eccles (gringer)
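One alternative that may be easier to read, offered only as a sketch: every cell on a backslash-like diagonal shares the same value of (column - row), and every cell on a forward-slash-like diagonal shares the same (column + row), so the cells can be grouped by those keys in a single pass. The function name diagonals and the one-character-per-cell sample grid are made up for the illustration.

```python
from collections import defaultdict

def diagonals(layout, width, height):
    """Group the cells of a flat, row-major grid by diagonal.

    Returns (back, fwd): lists of joined strings, each read from the
    top row downwards.
    """
    back = defaultdict(list)  # backslash-like: col - row is constant
    fwd = defaultdict(list)   # forward-slash-like: col + row is constant
    for row in range(height):
        for col in range(width):
            cell = layout[row * width + col]
            back[col - row].append(cell)
            fwd[col + row].append(cell)
    back_diags = ["".join(back[key]) for key in sorted(back)]
    fwd_diags = ["".join(fwd[key]) for key in sorted(fwd)]
    return back_diags, fwd_diags

# The 4 x 3 grid of indices from the comments above, one character per cell.
back_diags, fwd_diags = diagonals("0123456789AB", 4, 3)

# Reversed readings are just the same strings backwards.
back_diags_rev = [diag[::-1] for diag in back_diags]
```

The back diagonals come out in the same order as the table in the comments (starting with the one that contains only 8), and no out-of-range index is ever computed, so the row-matching test is not needed.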