On Tue, Feb 10, 2009 at 5:54 PM, Kent Johnson wrote:
> Another attempt attached, it recognizes the n. separator and gets the last
> item.
And here is the actual attachment.
Kent
# Parser for legal citations, PLY version
# This version doesn't parse the names
from ply import lex, yacc
debug =
Subject: Re: [Tutor] Picking up citations
Dinesh and Kent -
I've been lurking along as you run this problem to ground. The syntax you
are working on looks very slippery, and remin
On Tue, Feb 10, 2009 at 12:42 PM, Dinesh B Vadhia
wrote:
> Kent
>
> The citation without the name is perfect (and this appears to be how most
> citation parsers work). There are two issues in the test run:
>
> 1. The parallel citation 422 U.S. 490, 499 n. 10, 95 S.Ct. 2197, 2205 n.
> 10, 45 L.Ed
To: Dinesh B Vadhia
Cc: tutor@python.org
Subject: Re: [Tutor] Picking up citations
On Tue, Feb 10, 2009 at 12:42 PM, Dinesh B Vadhia
wrote:
> Kent
>
> The citation without the name is perfect (and this appears to be how most
> citation parsers work). There are two issues in the test run:
>
Dinesh and Kent -
I've been lurking along as you run this problem to ground. The syntax you
are working on looks very slippery, and reminds me of some of the issues I
had writing a generic street address parser with pyparsing
(http://pyparsing.wikispaces.com/file/view/streetAddressParser.py). Ma
On Tue, Feb 10, 2009 at 12:42 PM, Dinesh B Vadhia
wrote:
> Kent
>
> The citation without the name is perfect (and this appears to be how most
> citation parsers work). There are two issues in the test run:
>
> 1. The parallel citation 422 U.S. 490, 499 n. 10, 95 S.Ct. 2197, 2205 n.
> 10, 45 L.Ed
last citation, i.e. 463 U.S. 29, 43, 103 S.Ct. 2856,
2867, 77 L.Ed.2d 443 (1983). I tested it on another sample text and it missed
the last citation there too.
Thanks!
Dinesh
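
A parallel citation such as the one quoted above can be split into its volume, reporter, page, pinpoint page, and "n." footnote with a single regular expression. This is only a minimal sketch and not the PLY or regex program discussed in this thread: the names REPORTERS, CITE_RE, and find_citations are hypothetical, and the reporter list is an assumption covering just the abbreviations that appear in these messages.

```python
import re

# Assumption: a small hand-picked set of reporter abbreviations seen in the
# thread; a real parser would need a much longer list.
REPORTERS = r"U\.S\.|S\.Ct\.|L\.Ed\.2d|L\.Ed\.|F\.3d|F\.2d"

CITE_RE = re.compile(
    r"\b(?P<volume>\d+)\s+"
    r"(?P<reporter>" + REPORTERS + r")\s+"
    r"(?P<page>\d+)\b"
    # Optional pinpoint page (or page range) and optional "n. 10" footnote.
    # The lookahead rejects a number that is really the next volume,
    # i.e. one followed by a capitalised reporter abbreviation.
    r"(?:,\s*(?P<pin>\d+(?:-\d+)?)\b"
    r"(?:\s+n\.\s*(?P<note>\d+)\b)?"
    r"(?!\s+[A-Z]))?"
)

def find_citations(text):
    """Return one dict per reporter citation found in text (hypothetical helper)."""
    return [m.groupdict() for m in CITE_RE.finditer(text)]

sample = "463 U.S. 29, 43, 103 S.Ct. 2856, 2867, 77 L.Ed.2d 443 (1983)"
for cite in find_citations(sample):
    print(cite)
```

Because each reporter citation is matched on its own, the last item in a parallel citation does not depend on what follows it. The sketch ignores case names and the year in parentheses.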
From: Kent Johnson
Sent: Tuesday, February 10, 2009 4:01 AM
To: Dinesh B Vadhia
Cc: tutor@python.org
Subject: Re: [Tutor] Pi
On Mon, Feb 9, 2009 at 12:51 PM, Dinesh B Vadhia
wrote:
> Kent /Emmanuel
>
> Below are the results using the PLY parser and Regex versions on the
> attached 'sierra' data which I think covers the common formats. Here are
> some 'fully unparsed' citations that were missed by the programs:
>
> Smit
On Mon, 09 Feb 2009 14:42:47 -0800, Marc Tompkins wrote:
> Aha! My list of "magic words"!
> (Sorry for the top post - anybody know how to change quoting defaults in
> Android Gmail?)
> --- www.fsrtechnologies.com
>
> On Feb 9, 2009 2:16 PM, "Dinesh B Vadhia"
> wrote:
>
> Kent /Emmanuel
>
> I
Aha! My list of "magic words"!
(Sorry for the top post - anybody know how to change quoting defaults in
Android Gmail?)
--- www.fsrtechnologies.com
On Feb 9, 2009 2:16 PM, "Dinesh B Vadhia" wrote:
Kent /Emmanuel
I found a list of words before the first word that can be removed which I
think i
Kent /Emmanuel
I found a list of words that can appear before the first word of a citation
and can be removed, which I think is the only way to parse the citations
successfully. Here they are:
| E.g. | Accord | See | See + Also | Cf. | Compare | Contra | But + See |
| But + Cf. | See Generally | Citing | In |
Dinesh
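
The signal words listed above can be stripped from the front of a case name before the name is reported. The following is a minimal sketch, not code from this thread: SIGNALS and strip_signals are hypothetical names, and it assumes the signal words only need removing at the very start of the name.

```python
import re

# Signal phrases taken from the list above. Longer phrases come first so that
# "See Also" is tried before "See".
SIGNALS = [
    "See Generally", "See Also", "But See", "But Cf.",
    "E.g.", "Accord", "See", "Cf.", "Compare", "Contra", "Citing", "In",
]

# One or more signals at the start of the string, each followed by
# whitespace and/or commas (so "See, e.g., Smith" is handled too).
_SIGNAL_RE = re.compile(
    r"^(?:(?:" + "|".join(re.escape(s) for s in SIGNALS) + r")[,\s]+)+",
    re.IGNORECASE,
)

def strip_signals(name):
    """Remove leading signal words from a citation name (hypothetical helper)."""
    return _SIGNAL_RE.sub("", name.strip())

print(strip_signals("See Also Carter v. Jury Commission of Greene County"))
print(strip_signals("In Turner v. Fouche"))
```

One caveat of this approach: a name that legitimately begins with a signal word, such as an "In re ..." case, would lose its first word, so a real parser would need an exception list.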
Kent /Emmanuel
Below are the results using the PLY parser and Regex versions on the attached
'sierra' data, which I think covers the common formats. Here are some 'fully
unparsed' citations that were missed by the programs:
Smith v. Wisconsin Dept. of Agriculture, 23 F.3d 1134, 1141 (7th Cir.1
On Sun, Feb 8, 2009 at 7:07 PM, Dinesh B Vadhia
wrote:
> Hi Kent
>
> From pyparsing to PLY in a few days ... this is too much to handle! I tried
> the program and like you said it works except for the inclusion of the full
> name. I tested it on different text and it doesn't work as expected (se
On Sun, Feb 8, 2009 at 5:53 PM, Emmanuel Ruellan
wrote:
> Dinesh B Vadhia wrote:
>> Hi! I want to process text that contains citations, in this case in legal
>> documents, and pull-out each individual citation.
>
>
> Here is my stab at it, using regular expressions. Any comments welcome.
It's a
Dinesh B Vadhia wrote:
> Hi! I want to process text that contains citations, in this case in legal
> documents, and pull-out each individual citation.
Here is my stab at it, using regular expressions. Any comments welcome.
I had to use two regexes, one to find all citations, and the other one
I guess I'm in the mood for a parsing challenge this weekend, I wrote
a PLY version of the citation parser, see attached. It generates
exactly the output you asked for except for the inclusion of "In" in
the name.
Kent
# Parser for legal citations, PLY version
from ply import lex, yacc
text = ""
, 493 U.S. 146, 159-60 (1934)"
I didn't know about pyparsing, which appears to be very powerful, and I have
joined their list. Thank you for your help.
Dinesh
From: Kent Johnson
Sent: Saturday, February 07, 2009 1:19 PM
To: Dinesh B Vadhia
Cc: tutor@python.org
Subject: Re: [Tutor]
On Sat, Feb 7, 2009 at 1:19 PM, Kent Johnson wrote:
>
> It is correct except for the inclusion of "In" in the name and the
> extra space before the comma separating the page numbers in the last
> citation.
>
As I've been reading along, I've been thinking that the word "In" qualifies
as a "magic
It turns out you can use Or expressions to cause a kind of
backtracking in Pyparsing. This is very close to what you want:
Name1 = Forward()
Name1 << Combine(Word(alphas) + Name1 | Word(alphas) + Suppress('v.'),
joinString=' ', adjacent=False).setResultsName('name1')
Name2 = Combine(OneOrMore(Word
On Sat, Feb 7, 2009 at 11:53 AM, Dinesh B Vadhia
wrote:
> Wow Kent, what a great start!
>
> I found this
> http://mail.python.org/pipermail/python-list/2006-April/376149.html which
> lays out some patterns of legal citations, i.e.
Here is another good reference:
http://philip.greenspun.com/politics
n petit jury, he would clearly have
standing to challenge the systematic exclusion of any identifiable group from
jury service."
Okay, I'd better get to grips with pyparsing!
Dinesh
From: Kent Johnson
Sent: Saturday, February 07, 2009 6:21 AM
To: Dinesh B Vadhia
Cc: tutor@pytho
On Sat, Feb 7, 2009 at 1:11 AM, Dinesh B Vadhia
wrote:
> Hi! I want to process text that contains citations, in this case in legal
> documents, and pull-out each individual citation. Here is a sample text:
> The results required are:
>
> Carter v. Jury Commission of Greene County, 396 U.S. 32
On Sat, Feb 7, 2009 at 1:11 AM, Dinesh B Vadhia
wrote:
> Hi! I want to process text that contains citations, in this case in legal
> documents, and pull-out each individual citation.
> Before attempting to solve this problem I thought I'd first ask if anyone
> has seen a solution before?
This g
On Fri, 6 Feb 2009 22:11:14 -0800,
"Dinesh B Vadhia" wrote:
> Hi! I want to process text that contains citations, in this case in legal
> documents, and pull-out each individual citation. Here is a sample text:
>
> text = "Page 500 Carter v. Jury Commission of Greene County, 396 U.S. 320,
Hi! I want to process text that contains citations, in this case in legal
documents, and pull out each individual citation. Here is a sample text:
text = "Page 500 Carter v. Jury Commission of Greene County, 396 U.S. 320, 90
S.Ct. 518, 24 L.Ed.2d 549 (1970); Lathe Turner v. Fouche, 396 U.S. 34