Dinesh and Kent -
I've been lurking along as you run this problem to ground. The syntax you
are working on looks very slippery, and reminds me of some of the issues I
had writing a generic street address parser with pyparsing
(http://pyparsing.wikispaces.com/file/view/streetAddressParser.py). Ma
Pyparsing has a built-in helper called nestedExpr that fits neatly in with
this data. Here is the whole script:
from pyparsing import nestedExpr
syntax_tree = nestedExpr()
results = syntax_tree.parseString(st_data)
from pprint import pprint
pprint(results.asList())
Prints:
[[['S',
['NP-SB
-Original Message-
From: Emad Nawfal (عماد نوفل) [mailto:emadnaw...@gmail.com]
Sent: Saturday, February 14, 2009 8:59 AM
To: Paul McGuire
Cc: tutor@python.org
Subject: Re: [Tutor] extracting phrases and their memberships from syntax
Thank you so much Paul, Kent, and Hoftkamp.
I was
Has anyone already mentioned the article in Python Magazine, May, 2008?
-- Paul
___
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
I second Alan G's appreciation for a well-thought-through and well-conveyed
description of your text processing task. (Is "Alan G" his gangsta name, I
wonder?)
This pyparsing snippet may point you to some easier-to-follow code,
especially once you go beyond the immediate task and do more exhausti
Emad wrote:
>>>
Since I'm learning Pyparsing, this was a nice exercise. I've written this
elementary script which does the job well in light of the data we have
from pyparsing import *
ID_TAG = Literal("")
FULL_NAME_TAG1 = Literal("")
END_TAG = Literal("', 'Joseph
You can look at a site called UtilityMill (http://utilitymill.com/) that
hosts user-defined Python code, and wraps it in an API to be used
interactively through an HTML form, or programmatically over HTTP. I'm
pretty sure that the author makes the source code available for this site.
Also, you co
These are very interesting links, and I just downloaded the evoque code to
see how they handled the one "impossible" case that has stymied me in
supporting pyparsing under Python 2.x (pre-2.6) and Python 3.0 with a single
code base: exception handling.
For those new to this topic, here is the prob
1. Python is not Java (see Philip Eby's blog entry
http://dirtsimple.org/2004/12/python-is-not-java.html). Let go of your
concepts that only Items can go into an ItemCollection - Python already has
some perfectly good collection builtins. Instead of writing a custom
ItemCollection, why not write
Ronald -
I really encourage you to try to embrace some of the basic Python idioms as
part of your Java->Python journey:
1. Iterators
for item in list_of_items:
    # do something with item
is all that is needed to visit each item in a Python list. Your verbose
MoveFirst, MoveNext, if more <> N
>
> xrange was a kludge to improve on range's memory efficiency
> but it is a horrible name that obscures the code.
>
> Also it does not exist in v3 so if you use it you will need to change
> the code for v3. It is as well to be as consistent with v3 as possible
> IMHO
>
> Alan G
I have felt th
For the given test case, this pyparsing sample parses the data, without
having to anticipate all the possible 2-letter keys.
from pyparsing import *
integer = Word(nums)
DASH = Literal('-').suppress()
LT = Literal('<').suppress()
GT = Literal('>').suppress()
entrynum = LT + integer + GT
keycod
Original:
'case_def_gen':['case_def','gen','null'],
'nsuff_fem_pl':['nsuff','null', 'null'],
'abbrev': ['abbrev, null, null'],
'adj': ['adj, null, null'],
'adv': ['adv, null, null'],}
Note the values for 'abbrev', 'adj' and 'adv' are not lists, but strings
containing comma-separated lists.
S
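A minimal sketch of one way to repair such entries, assuming the intent was a list of separate strings. The name tag_map and the helper normalize are made up for illustration, and only a few of the quoted entries are repeated here:

```python
# Hypothetical name for the dictionary quoted above; only a few entries shown.
tag_map = {
    'case_def_gen': ['case_def', 'gen', 'null'],
    'abbrev': ['abbrev, null, null'],
    'adj': ['adj, null, null'],
}

def normalize(values):
    """Split any comma-separated string items into separate list items."""
    fixed = []
    for value in values:
        fixed.extend(part.strip() for part in value.split(','))
    return fixed

normalized = {key: normalize(values) for key, values in tag_map.items()}
print(normalized['abbrev'])
```

Entries that were already proper lists pass through unchanged, so the same helper can be applied to the whole dictionary.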
If you are looking for all possible 60-card deals of this deck, then you
also probably want to filter out duplicate deals caused by equivalent cards
exchanging places. That is, say you had a deck of 3 cards: 2 Mountain, and
1 Vial, and you want to deal out all 3 in various orders. Using th
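To make the duplicate problem concrete, here is a small sketch for the 3-card deck only. It uses itertools.permutations plus a set, which is fine at this size but would not scale to a 60-card deck:

```python
from itertools import permutations

deck = ['Mountain', 'Mountain', 'Vial']

# permutations() treats the two Mountains as different cards...
all_deals = list(permutations(deck))

# ...so collapsing to a set keeps one copy of each visibly distinct deal.
distinct_deals = sorted(set(all_deals))

print(len(all_deals), len(distinct_deals))
for deal in distinct_deals:
    print(deal)
```

Six orderings come out of permutations(), but only three of them look different once the Mountains are interchangeable.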
My favorite rendition of this is in the related field of text-to-speech. In
early Object Management Group days, OMG conference proceedings were
available as transcripts, but there was substantial delay in getting these
out. An attempt to automate this with TTS software of the time was
discarded q
This is a good little pyparsing exercise. Pyparsing makes it easy to define
the structure of a "Write { ( )* }" block, and use the names
given to the parsed tokens to easily find the "name" and "file" entries.
from pyparsing import (Literal, Word, alphanums, empty, restOfLine, dictOf)
# make up
Pedro -
If you are trying to extract a simple pattern like a numeric word followed
by an alpha word, I would suggest using one of the scanString or
searchString methods. scanString is probably the better choice, since you
seem to need not only the matching tokens, but also the location within the
> Hi Paul. Thanks. I would like to get better at pyparsing. Can you
> recommend a good resource for learning? I am finding that what info I
> have found on the subject (on the web) isn't really explaining it from
> the ground up as much as I would like.
Hrmmm, not sure what to recommend - I might
Ok, I've seen various passes at this problem using regex, split('='), etc.,
but the solutions seem fairly fragile, and the OP doesn't seem happy with
any of them. Here is how this problem looks if you were going to try
breaking it up with pyparsing:
- Each line starts with an integer, and the stri
> -Original Message-
> From: amr...@iisermohali.ac.in [mailto:amr...@iisermohali.ac.in]
> Sent: Wednesday, July 29, 2009 12:13 AM
> To: Paul McGuire
> Cc: tutor@python.org
> Subject: Re: [Tutor] how to get blank value
>
> Thanks for help Sir but with these comm
ng the
0'th element. So I'm going to fix pyparsing so that in the next release,
you'll be able to reference the sub-elements as:
print parsedEntries.country.tag
print parsedEntries.country.ai.war
print parsedEntries.country.ai.ferocity
This *may* break some existing code,
', so RHS only worked
if it was handed a quoted string. Probably good practice to always enclose
in quotes the expression being assigned to a Forward using '<<'.
-- Paul
-Original Message-
From: Liam Clarke [mailto:[EMAIL PROTECTED]
Sent: Saturday, July 23, 2005 9:03
^ RHS ) + RBRACE ) )
-- Paul
-Original Message-
From: Liam Clarke [mailto:[EMAIL PROTECTED]
Sent: Sunday, July 24, 2005 10:21 AM
To: Paul McGuire
Cc: tutor@python.org
Subject: Re: [Tutor] Parsing problem
Hi Paul,
That is fantastic. It works, and using that pp.group is the key with the
nest
-Original Message-
From: Liam Clarke [mailto:[EMAIL PROTECTED]
Sent: Sunday, July 24, 2005 10:21 AM
To: Paul McGuire
Cc: tutor@python.org
Subject: Re: [Tutor] Parsing problem
Hi Paul,
That is fantastic. It works, and using that pp.group is the key with the
nested braces.
I just ran th
s is about the only optimization I can think of.
-- Paul
-Original Message-
From: Liam Clarke [mailto:[EMAIL PROTECTED]
Sent: Monday, July 25, 2005 7:38 AM
To: Paul McGuire
Cc: tutor@python.org
Subject: Re: [Tutor] Parsing problem
Hi Paul,
Well various tweaks and such done, it parses perfectly,
Liam -
The two arguments to Word work this way:
- the first argument lists valid *initial* characters
- the second argument lists valid *body* or subsequent characters
For example, in the identifier definition,
identifier = pp.Word(pp.alphas, pp.alphanums + "_/:.")
identifiers *must* start wit
'10'], ['bar', '20'
[['foo', '10'], ['bar', '20']]
10
20
-- Paul
-Original Message-
From: Liam Clarke [mailto:[EMAIL PROTECTED]
Sent: Monday, July 25, 2005 7:38 AM
To: Paul McGuire
Cc: tutor@python.org
Subject:
integer expression "masks" the real,
and since it occurs first, it will match first. The two solutions are:
number = (integer ^ real)
Or
number = (real | integer)
That is, use an Or, which will match the longest, or reorder the MatchFirst
to put the most restrictive expression first.
Welcom
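The masking effect can be seen with a small sketch. The definitions of integer and real below are assumptions for illustration (the originals are not shown in this snippet), and pyparsing must be installed:

```python
from pyparsing import Word, Combine, nums

# Assumed definitions, for illustration only.
integer = Word(nums)
real = Combine(Word(nums) + "." + Word(nums))

masked = integer | real      # MatchFirst: integer is tried first and wins
longest = integer ^ real     # Or: every alternative is tried, longest wins
reordered = real | integer   # MatchFirst, most restrictive expression first

print(masked.parseString("3.14").asList())
print(longest.parseString("3.14").asList())
print(reordered.parseString("3.14").asList())
print(reordered.parseString("42").asList())
```

With the masked version only the leading "3" is matched, while both fixes return the whole "3.14" and still accept a plain integer.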
> Also interesting is that our processors, which aren't overly
> far apart in clock speed, vary so greatly in processing this
> problem. Maybe Intel is better *grin*
Urp, turns out I have an "Athlon Inside" label right here on the deck of my
laptop! Maybe the difference is my 1.2Gb of RAM.
Ara -
I found your question about the Pyparsing-based adventure game that I wrote.
You can find more info on this from the presentation I made at PyCon'06,
(http://www.python.org/pycon/2006/papers/4/). This link opens up at the
title page, there are navigation controls in the lower right corner o
1. Don't name your dict 'dict' or your list 'list', as this then masks the
builtin dict and list types.
2. Your application is a textbook case for defaultdict:
from collections import defaultdict
recordDict = defaultdict(list)
for record in recordList:
    recordDict[record[0]].append(record)
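For comparison, a sketch with made-up sample records, showing the manual bookkeeping that defaultdict removes. The record contents are hypothetical, and the first field is assumed to be the grouping key:

```python
from collections import defaultdict

# Hypothetical sample records; the first field is the grouping key.
record_list = [('apple', 1), ('pear', 2), ('apple', 3)]

# Plain dict: every append needs a "have I seen this key?" check.
manual = {}
for record in record_list:
    if record[0] not in manual:
        manual[record[0]] = []
    manual[record[0]].append(record)

# defaultdict(list): a missing key is created as an empty list on first use.
grouped = defaultdict(list)
for record in record_list:
    grouped[record[0]].append(record)

print(dict(grouped))
```

Both loops build the same grouping; the defaultdict version just drops the membership test.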
Here is a pyparsing approach to your question. I've added some comments to
walk you through the various steps. By using pyparsing's makeHTMLTags
helper method, it is easy to write short programs to skim selected data tags
from out of an HTML page.
-- Paul
from pyparsing import makeHTMLTags, Sk
>>>
My initial tests using pickle and a simple class system (shown below) have
failed. The method shown below fails with an AttributeError:
'FakeModule' object has no attribute 'Spod', so when I create an empty
class Spod in the new session, it generates an IndexError:(list index out of
range)
I
Augghh! I can't stand it!!! If position is a boolean, then *why* must we
test if it is equal to True?!!! It's a boolean! Just test it! For that
matter, let's rename "position" to something a little more direct,
"print_line" perhaps?
Did you know that files are now iterators? If going through
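A rough sketch of both points together, using a flag named print_line and iterating directly over a file-like object. The BEGIN/END markers and the sample text are invented for the example, and io.StringIO stands in for a real open file:

```python
import io

# Stand-in for an open file; a real file object iterates line by line the same way.
source = io.StringIO("header\nBEGIN\nalpha\nbeta\nEND\nfooter\n")

print_line = False
selected = []
for line in source:              # no readlines() or index counter needed
    line = line.rstrip("\n")
    if line == "END":            # hypothetical end marker
        print_line = False
    if print_line:               # test the boolean directly, no "== True"
        selected.append(line)
    if line == "BEGIN":          # hypothetical start marker
        print_line = True

print(selected)
```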
>>
Hi - I wrote a custom exception class as follows:
class CustomError(Exception):
    def __init__(self, msg):
        super(CustomError, self).__init__(self, msg)
But this doesn't work as expected:
>>
Correct use of super would be:
class
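The reply is cut off here, so the following is only a sketch of what the corrected class presumably looks like: super() already binds self, so only msg is passed along.

```python
class CustomError(Exception):
    def __init__(self, msg):
        # super(CustomError, self) is already bound to this instance,
        # so self must not be passed a second time.
        super(CustomError, self).__init__(msg)

try:
    raise CustomError("something went wrong")
except CustomError as err:
    message = str(err)
    arguments = err.args

print(message)
print(arguments)
```

With the extra self argument, err.args would hold the exception instance as well as the message, which is why str(err) did not look as expected.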
Here is a solution for your problem that will compute the values up to
(5000,5000) in about 16 seconds:
upperlimit = 5000
cube = {}
for i in xrange(1,upperlimit):
    cube[i] = i*i*i
cubes = set(cube.itervalues())
for i in xrange(upperlimit, int(upperlimit*1.26)):
    cubes.add(i*i*i)
pairs = [
>
> > def __init__(self,p1,p2):
> > self.p1 = p1
> > self.p2 = p2
> >
> > And since a line should not have zero length (although you might argue
> > with that!) you could also check if
> > p1==p2
>
> In this case he should define Point.__cmp__() so the comparison is by
value rather than iden
> def length(self):
> dx,dy = self.p1 - self.p2
> return (dx**2 + dy **2) ** 0.5
How about:
def length(self):
    return math.hypot( *(self.p1 - self.p2) )
Compiled C code will be much faster than squaring and square rooting.
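A self-contained sketch of how this could fit together. The Point class here is hypothetical; it is only assumed that subtracting two points yields a (dx, dy) pair that can be unpacked with *:

```python
import math

class Point:
    """Hypothetical 2-D point; subtraction yields a (dx, dy) tuple."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __sub__(self, other):
        return (self.x - other.x, self.y - other.y)

class Line:
    def __init__(self, p1, p2):
        self.p1 = p1
        self.p2 = p2

    def length(self):
        # hypot(dx, dy) computes sqrt(dx**2 + dy**2) in one call
        return math.hypot(*(self.p1 - self.p2))

segment = Line(Point(0, 0), Point(3, 4))
print(segment.length())
```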
-- Paul
I have used pywinauto for such tasks in the past.
http://pywinauto.openqa.org/
In my case, I used pywinauto to automate mouse clicks on a browser in order
to auto-play a Flash game running in the browser. I had to use PIL to take
screenshots and then process images to "read" the screen.
-- Pau
I am thinking along similar lines as Simone, but I suspect there is still a
missing step. I think you need to know what the mapping is from old name to
new name, so that you can do something like rename or copy the files that
these names represent.
Python's built-in zip() function does the trick he
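A minimal sketch of the old-name to new-name pairing with zip(). The file names and the renumbering scheme are invented for illustration, and the actual rename call is left as a comment so nothing on disk is touched:

```python
# Hypothetical existing names and the names they should become.
old_names = ["frame.0003.dpx", "frame.0007.dpx", "frame.0012.dpx"]
new_names = ["frame.%04d.dpx" % n for n in range(1, len(old_names) + 1)]

# zip() pairs the two lists position by position.
rename_plan = list(zip(old_names, new_names))

for old, new in rename_plan:
    print(old, "->", new)    # os.rename(old, new) would go here
```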
One problem with my previous post. I didn't look closely enough at the
original list of files; it turns out you have multiple entries for some of
them. If you used the glob module to create this list (with something like
"original_list = glob.glob('frame.*.dpx')"), then there should be no problem
Instead of trying to match on the weird characters, in order to remove them,
here is a pyparsing program that ignores those header lines and just
extracts the interesting data for each section.
In a pyparsing program, you start by defining what patterns you want to look
for. This is similar to th
>>> the below doesn't work in python
>>> >>> if !(os.access(target_dir, os.F_OK)):
>>> SyntaxError: invalid syntax
What is that '!' doing there? A left-handed factorial?
I'm just kidding, I know that '!' is a common 'not' operator in many other
programming languages. But this is Python, ma
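For reference, a sketch of the same test written with Python's not keyword. The directory names are made up, and a scratch directory from tempfile is used so the example runs anywhere:

```python
import os
import tempfile

base_dir = tempfile.mkdtemp()                   # real, empty scratch directory
target_dir = os.path.join(base_dir, "backup")   # hypothetical name, never created

if not os.access(target_dir, os.F_OK):
    status = "missing"
else:
    status = "present"
print(status)

os.rmdir(base_dir)  # clean up the scratch directory
```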
Question 1:
format_code := '+' | '-' | '*' | '#'
I need to specify that a single format_code may be repeated, not that
several different ones may appear in a sequence.
format := (format_code)+
would catch '+-', which is wrong. I want only patterns such as '--',
'+++',...
T
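The reply is cut off here, so its actual suggestion is not known. As a stand-in, this sketch uses the standard re module rather than a grammar tool: a backreference says "the same character again", which is the constraint being asked for. The helper name is_valid_format is made up:

```python
import re

# One format code, then zero or more repeats of that same character.
format_re = re.compile(r"([+\-*#])\1*")

def is_valid_format(text):
    """True if text is one format code repeated one or more times."""
    return format_re.fullmatch(text) is not None

for sample in ["--", "+++", "#", "+-", ""]:
    print(repr(sample), is_valid_format(sample))
```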
It's not free, but I have had good success with Enterprise Architect from
Sparx Systems (http://www.sparxsystems.com.au/). It will generate class
diagrams from Python, C/C++, C#, Java. It also supports the full complement
of UML diagrams - sequence diagrams are a special treat when you just drag
I guess PEP8 gives some tips on how things should be named, and what case to
use, but I think Denis's question goes beyond that. I think the point of
Denis's question was not to imply that resources like help or docstrings are
not useful, or would not be available. I think he is polling to see wh
I would say, though, that you should be careful in your implementation of
is_on_line, for floating point round-off errors. Try this at the command
prompt (Python 2.5.2, with __future__ division imported):
>>> 49 * (1/49) == 1
False
>>> 1 - 49 * (1/49)
1.1102230246251565e-016
I would suggest a sl
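One common way to guard against this, sketched under assumptions: the signature of is_on_line below is invented (the original is not shown), and it compares against a small tolerance instead of testing for exact equality. The tolerance value is arbitrary and would need to suit the scale of the coordinates:

```python
import math

def is_on_line(p1, p2, point, tolerance=1e-9):
    """Hypothetical check: does point lie on the infinite line through p1 and p2?"""
    (x1, y1), (x2, y2), (x, y) = p1, p2, point
    # The cross product is zero when the three points are collinear.
    cross = (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1)
    return math.isclose(cross, 0.0, abs_tol=tolerance)

print(49 * (1 / 49) == 1)
print(math.isclose(49 * (1 / 49), 1))
print(is_on_line((0, 0), (1, 1), (0.1 + 0.2, 0.3)))
print(is_on_line((0, 0), (1, 1), (1, 2)))
```

The point (0.1 + 0.2, 0.3) is accepted even though 0.1 + 0.2 is not exactly 0.3 in floating point.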
Even simpler than Rich Lovely's:
newlist = [a+b for a,b in itertools.izip(l1[:-1], l1[1:])]
is to just use the built-in zip:
newlist = [a+b for a,b in zip(l1[:-1], l1[1:])]
since you can be sure that l1[:-1] and l1[1:] will always be the same
length, so there is no need for a fill valu
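A quick worked example with made-up data. Since zip() stops at the shorter input, the first slice can even be dropped:

```python
l1 = [1, 2, 3, 4, 5]

# Pair each element with its right-hand neighbour and add them.
newlist = [a + b for a, b in zip(l1[:-1], l1[1:])]

# zip() stops when the shorter sequence runs out, so this gives the same pairs.
shorter = [a + b for a, b in zip(l1, l1[1:])]

print(newlist)
print(shorter)
```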
The Python Wiki has some example decorators at
http://wiki.python.org/moin/PythonDecoratorLibrary. I think the CGIMethod
wrapper is a good intuitive example, and memoize is a good technique to add
to your Python toolkit.
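As a taste of the memoize idea, here is a minimal sketch. This is not the wiki's implementation, just one simple way to write it, and it only handles hashable positional arguments:

```python
import functools

def memoize(func):
    """Cache results by positional arguments (which must be hashable)."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrapper

calls = []   # records every time the real function body runs

@memoize
def fib(n):
    calls.append(n)
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(30), len(calls))
```

Without the decorator, fib(30) would run the function body well over a million times; with it, each value of n is computed once.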
-- Paul
> By the way, (totally off-topic, of course, my apologies): what do all
> y'all call the "@" operator?
Back when the "what syntax should we use for decorators?" debate raged, this
symbol was referred to as a "pie", I guess because it looks like the swirl
on top of a cream pie. I think this term
Denis -
What you are seeing is standard procedure for any built-in type - no dynamic
assignment of attributes allowed. Here is an analogous case to your
example, but based on str instead of object:
greeting = "bon jour"
greeting.language = "French" # raises AttributeError: 'str' object has no
a
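A runnable sketch of the str case, plus one possible workaround that the snippet itself does not mention: instances of a Python-level subclass get their own __dict__ and so accept new attributes. TaggedStr is a made-up name:

```python
greeting = "bon jour"
try:
    greeting.language = "French"
except AttributeError as err:
    error_text = str(err)
    print("AttributeError:", error_text)

class TaggedStr(str):
    """Hypothetical subclass; its instances have a __dict__ for attributes."""

tagged = TaggedStr("bon jour")
tagged.language = "French"
print(tagged, tagged.language)
```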
Sometimes pyparsing is less stressful than struggling with RE's
typoglyphics, especially for a one-off conversion (also note handling of
quoted strings - if a 'new Date(y,m,d)' occurs inside a quoted string, this
script *won't* convert it):
from pyparsing import nums,Word,Literal,quotedString
# p
Wow! Everybody jumped on the "floating point inaccuracy" topic, I'm
surprised no one tried to find out what the OP was trying to do with his
cube root solver in the first place. Of course, the first-cut approach to
solving the cube root is to raise to the 1/3 power, but this is not the only
possi
"Finding the shortest word among a list of words" sounds like something of a
trick question to me. I think a more complete problem statement would be
"Find the list of words that are the shortest", since there is no guarantee
that the list does not contain two words of the same shortest length. I
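A short sketch of the "list of shortest words" version, using invented sample words: find the minimum length first, then keep every word of that length.

```python
words = ["pear", "fig", "banana", "kiwi", "yam"]

shortest_len = min(len(word) for word in words)
shortest_words = [word for word in words if len(word) == shortest_len]

print(shortest_len, shortest_words)
```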
-Original Message-
From: tutor-bounces+ptmcg=austin.rr@python.org
[mailto:tutor-bounces+ptmcg=austin.rr@python.org] On Behalf Of
tutor-requ...@python.org
Sent: Wednesday, January 21, 2009 11:29 AM
To: tutor@python.org
Subject: Tutor Digest, Vol 59, Issue 109
Send Tutor mailing l
If your actual interest in defining a bit type is to work with an array of
bits, try Ilan Schnell's bitarray module
(http://pypi.python.org/pypi/bitarray/0.3.4). It uses a compiled extension,
so it is quite fast and space-efficient.
Ilan gave a presentation on this at the Texas Unconference last