On Mon, Feb 1, 2010 at 1:19 PM, Kent Johnson <ken...@tds.net> wrote: > On Mon, Feb 1, 2010 at 6:29 AM, Norman Khine <nor...@khine.net> wrote: > >> thanks, what about the whitespace problem? > > \s* will match any amount of whitespace includin newlines.
thank you, this worked well. here is the code: ### import re file=open('producers_google_map_code.txt', 'r') data = repr( file.read().decode('utf-8') ) block = re.compile(r"""openInfoWindowHtml\(.*?\\ticon: myIcon\\n""") b = block.findall(data) block_list = [] for html in b: namespace = {} t = re.compile(r"""<strong>(.*)<\/strong>""") title = t.findall(html) for item in title: namespace['title'] = item u = re.compile(r"""a href=\"\/(.*)\">En savoir plus""") url = u.findall(html) for item in url: namespace['url'] = item g = re.compile(r"""GLatLng\((\-?\d+\.\d*)\,\\n\s*(\-?\d+\.\d*)\)""") lat = g.findall(html) for item in lat: namespace['LatLng'] = item block_list.append(namespace) ### can this be made better? > > Kent > _______________________________________________ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor