Re: [Tutor] Request review: A DSL for scraping a web page

2015-04-04 Thread boB Stepp
On Thu, Apr 2, 2015 at 2:49 PM, Albert-Jan Roskam wrote:
> On Thu, Apr 2, 2015 1:17 PM CEST Alan Gauld wrote:
>> On 02/04/15 12:09, Dave Angel wrote:
>>> Ah, Jon Bentley (notice the extra 'e'). I should dig out my *Pearls
>>> books, and have a trip down memor

Re: [Tutor] Request review: A DSL for scraping a web page

2015-04-04 Thread Joe Farro
Joe Farro writes:
> indentation doesn't (always) reflect the hierarchy of the data being
> generated, which seems more clear.

Meant to say: However, the indentation doesn't (always) reflect the hierarchy of the data being generated, which seems more clear **in the bs4 version**.

Re: [Tutor] Request review: A DSL for scraping a web page

2015-04-04 Thread Joe Farro
Joe Farro writes:
> Thanks, Peter.
>
> Peter Otten <__peter__ web.de> writes:
>> Can you give a real-world example where your DSL is significantly cleaner
>> than the corresponding code using bs4, or lxml.xpath, or lxml.objectify?

Peter, I worked up what I hope is a fairly

Re: [Tutor] Request review: A DSL for scraping a web page

2015-04-02 Thread Alan Gauld
On 02/04/15 20:49, Albert-Jan Roskam wrote:
>> Yes, the Pearls books should be required reading
>
> Is this the book you are referring to?
> http://www.amazon.com/Programming-Pearls-2nd-Edition-Bentley/dp/0201657880

Yes, that's it.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g

Re: [Tutor] Request review: A DSL for scraping a web page

2015-04-02 Thread Dave Angel
On 04/02/2015 03:49 PM, Albert-Jan Roskam wrote:
> On Thu, Apr 2, 2015 1:17 PM CEST Alan Gauld wrote:
>> On 02/04/15 12:09, Dave Angel wrote:
>>> Ah, Jon Bentley (notice the extra 'e'). I should dig out my *Pearls books, and have a trip down memory lane. I bet 95% of t

Re: [Tutor] Request review: A DSL for scraping a web page

2015-04-02 Thread Albert-Jan Roskam
On Thu, Apr 2, 2015 1:17 PM CEST Alan Gauld wrote:
> On 02/04/15 12:09, Dave Angel wrote:
>> Ah, Jon Bentley (notice the extra 'e'). I should dig out my *Pearls
>> books, and have a trip down memory lane. I bet 95% of those are still
>> useful, even if they refer

Re: [Tutor] Request review: A DSL for scraping a web page

2015-04-02 Thread Emile van Sebille
On 4/2/2015 4:22 AM, Dave Angel wrote:
> There was somewhere in one of the books a list of 'good practice,' including an item something like:
>
> Solve the right problem.
>
> There's a world of wisdom in that one alone.

+1

Emile

Re: [Tutor] Request review: A DSL for scraping a web page

2015-04-02 Thread Joe Farro
Alan Gauld writes:
> DSL?

Good to know the term/acronym is not ubiquitous. I was going for succinct, possibly too succinct...

> Have you looked at the existing web scraping tools in Python?
> There are several to pick from. They all avoid the kind of mess
> you describe.

I'm fam

Re: [Tutor] Request review: A DSL for scraping a web page

2015-04-02 Thread Joe Farro
Thanks, Peter.

Peter Otten <__peter__ web.de> writes:
> Can you give a real-world example where your DSL is significantly cleaner
> than the corresponding code using bs4, or lxml.xpath, or lxml.objectify?

Yes, definitely. Will work something up.

> Your code on github looks good to me (too fe

Re: [Tutor] Request review: A DSL for scraping a web page

2015-04-02 Thread Peter Otten
Joe Farro wrote:
> The package implements a DSL that is intended to make web-scraping a bit
> more maintainable :)
>
> I generally find my scraping code ends up being rather chaotic with
> querying, regex manipulations, conditional processing, conversions, etc.,
> ending up being too close togethe

Re: [Tutor] Request review: A DSL for scraping a web page

2015-04-02 Thread Dave Angel
On 04/02/2015 07:17 AM, Alan Gauld wrote:
> On 02/04/15 12:09, Dave Angel wrote:
>> Ah, Jon Bentley (notice the extra 'e'). I should dig out my *Pearls books, and have a trip down memory lane. I bet 95% of those are still useful, even if they refer to much earlier versions of language(s).
>
> Yes, t

Re: [Tutor] Request review: A DSL for scraping a web page

2015-04-02 Thread Alan Gauld
On 02/04/15 12:09, Dave Angel wrote:
> Ah, Jon Bentley (notice the extra 'e'). I should dig out my *Pearls books, and have a trip down memory lane. I bet 95% of those are still useful, even if they refer to much earlier versions of language(s).

Yes, the Pearls books should be required reading

Re: [Tutor] Request review: A DSL for scraping a web page

2015-04-02 Thread Dave Angel
On 04/02/2015 06:41 AM, Alan Gauld wrote:
> On 02/04/15 10:50, Dave Angel wrote:
>> On 04/02/2015 04:22 AM, Alan Gauld wrote:
>>> DSL?
>>
>> This is "Domain Specific Language". This is a language built around a specific problem domain,
>
> Ah, Thanks Dave! I am used to those being called simply "Little lang

Re: [Tutor] Request review: A DSL for scraping a web page

2015-04-02 Thread Alan Gauld
On 02/04/15 10:50, Dave Angel wrote:
> On 04/02/2015 04:22 AM, Alan Gauld wrote:
>> DSL?
>
> This is "Domain Specific Language". This is a language built around a specific problem domain,

Ah, Thanks Dave! I am used to those being called simply "Little languages" after the famous Jon Bently ACM art

Re: [Tutor] Request review: A DSL for scraping a web page

2015-04-02 Thread Dave Angel
On 04/02/2015 04:22 AM, Alan Gauld wrote:
> DSL?

This is "Domain Specific Language". This is a language built around a specific problem domain, in order to more easily express problems for that domain than the usual general purpose languages.

I was a bit surprised to find few google matche

Re: [Tutor] Request review: A DSL for scraping a web page

2015-04-02 Thread Alan Gauld
On 02/04/15 04:18, Joe Farro wrote:
> Hello,
>
> I recently wrote a python package and was wondering if anyone might have time to review it?

This list is for people learning Python and answering questions about the core language and standard library. I suspect this is more appropriate to the main py

[Tutor] Request review: A DSL for scraping a web page

2015-04-02 Thread Joe Farro
Hello,

I recently wrote a python package and was wondering if anyone might have time to review it? I'm fairly new to python - it's been about 1/2 of my workload at work for the past year. Any suggestions would be super appreciated.

https://github.com/tiffon/take
https://pypi.python.org/pypi/tak