On Mon, Feb 1, 2010 at 1:19 PM, Kent Johnson <ken...@tds.net> wrote:
> On Mon, Feb 1, 2010 at 6:29 AM, Norman Khine <nor...@khine.net> wrote:
>
>> thanks, what about the whitespace problem?
>
> \s* will match any amount of whitespace, including newlines.

thank you, this worked well.
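
As a quick check of that point, here is a minimal sketch on a made-up string (not the real data), where the two coordinates are split across lines:

```python
import re

# Hypothetical input: a comma, a newline, then indentation before the second number.
text = "GLatLng(44.5,\n        -1.25)"

# \s* absorbs the newline and the spaces between the two coordinates.
match = re.search(r"GLatLng\((-?\d+\.\d*),\s*(-?\d+\.\d*)\)", text)
coords = match.groups()
```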

here is the code:

###
import io
import re

# Read the file as UTF-8 text. repr() turns real newlines and tabs into
# the literal two-character sequences \n and \t that the patterns expect.
with io.open('producers_google_map_code.txt', 'r', encoding='utf-8') as f:
    data = repr(f.read())

# Compile each pattern once, outside the loop.
block = re.compile(r"""openInfoWindowHtml\(.*?\\ticon: myIcon\\n""")
t = re.compile(r"""<strong>(.*)<\/strong>""")
u = re.compile(r"""a href=\"\/(.*)\">En savoir plus""")
g = re.compile(r"""GLatLng\((\-?\d+\.\d*)\,\\n\s*(\-?\d+\.\d*)\)""")

block_list = []
for html in block.findall(data):
    namespace = {}
    # If a pattern matches more than once in a block, the last match wins.
    for item in t.findall(html):
        namespace['title'] = item
    for item in u.findall(html):
        namespace['url'] = item
    for item in g.findall(html):
        namespace['LatLng'] = item
    block_list.append(namespace)

###

can this be made better?
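
One possible direction, as a rough sketch only: skip the repr() step and match the raw text, using re.DOTALL so the block pattern can span lines and \s* for the whitespace between the coordinates. The names BLOCK_RE, TITLE_RE, URL_RE, LATLNG_RE, parse_blocks and SAMPLE are made up for illustration, and SAMPLE is an invented stand-in for the real file, whose exact layout may differ. Two deliberate changes from the code above: the title and URL groups are non-greedy, and search() keeps the first match in a block rather than the last.

```python
import re

# Patterns run on the raw text, so \t and \n here match a real tab and newline.
BLOCK_RE = re.compile(r"openInfoWindowHtml\(.*?\ticon: myIcon\n", re.DOTALL)
TITLE_RE = re.compile(r"<strong>(.*?)</strong>", re.DOTALL)
URL_RE = re.compile(r'a href="/(.*?)">En savoir plus')
LATLNG_RE = re.compile(r"GLatLng\((-?\d+\.\d*),\s*(-?\d+\.\d*)\)")


def parse_blocks(text):
    """Return one dict per block, with whichever fields were found."""
    entries = []
    for html in BLOCK_RE.findall(text):
        entry = {}
        title = TITLE_RE.search(html)
        if title:
            entry['title'] = title.group(1)
        url = URL_RE.search(html)
        if url:
            entry['url'] = url.group(1)
        latlng = LATLNG_RE.search(html)
        if latlng:
            # Kept as a tuple of strings, like findall() gives in the code above.
            entry['LatLng'] = latlng.groups()
        entries.append(entry)
    return entries


# Invented sample in the assumed layout: info window, coordinates, icon line.
SAMPLE = (
    "marker.openInfoWindowHtml('<strong>Ferme A</strong> "
    "<a href=\"/producteurs/ferme-a\">En savoir plus</a>');\n"
    "var point = new GLatLng(44.5,\n"
    "        -1.25);\n"
    "\ticon: myIcon\n"
)

result = parse_blocks(SAMPLE)
```

Working on the raw text also means accented characters in the titles are not turned into escape sequences by repr().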

>
> Kent
>
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
