Re: [Tutor] unicode decode/encode issue

2016-09-26 Thread Steven D'Aprano
I'm sorry, I have misinterpreted your question.

On Mon, Sep 26, 2016 at 12:59:04PM -0400, bruce wrote:
> I've got a page from a web fetch. I'm simply trying to go from utf-8 to
> ascii.

Why would you do that? It's 2016, not 1953, and ASCII is well and truly obsolete. (ASCII was even obsolete i

Re: [Tutor] unicode decode/encode issue

2016-09-26 Thread bruce
Hey folks. (peter!) Thanks for the reply. I wound up doing:

#s=s.replace('\u2013', '-')
#s=s.replace(u'\u2013', '-')
#s=s.replace(u"\u2013", "-")
#s=re.sub(u"\u2013", "-", s)
s=s.encode("ascii", "ignore")
s=s.replace(u"\u2013", "-")
s=s.replace("–", "-") ##<<< this was actually in
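One thing worth noting about the order of the calls above: once the text has been encoded with "ignore", the dash is already gone, so a later replace has nothing left to match. A minimal sketch of doing it the other way round, assuming Python 3 and a made-up helper name (to_ascii is not from the thread):

```python
# Hypothetical helper, not from the original post: swap the en dash for a
# plain hyphen first, then drop whatever else is not ASCII.
def to_ascii(text):
    text = text.replace("\u2013", "-")          # U+2013 EN DASH -> "-"
    return text.encode("ascii", "ignore").decode("ascii")

title = "English 120 Course Syllabus \u2013 Fall \u2013 2006"
print(to_ascii(title))  # English 120 Course Syllabus - Fall - 2006
```

The final decode turns the ASCII bytes back into a str; leave it off if bytes are what you want.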

Re: [Tutor] unicode decode/encode issue

2016-09-26 Thread Peter Otten
bruce wrote:
> Hi.
>
> I've got a "basic" situation that should be simple. So it must be a user
> (me) issue!
>
>
> I've got a page from a web fetch. I'm simply trying to go from utf-8 to
> ascii. I'm not worried about any cruft that might get stripped out as the
> data is generated from a US sit

Re: [Tutor] unicode decode/encode issue

2016-09-26 Thread Steven D'Aprano
On Mon, Sep 26, 2016 at 12:59:04PM -0400, bruce wrote:
> When I look at the input content, I have:
>
> u'English 120 Course Syllabus \u2013 Fall \u2013 2006'
>
> So, any pointers on replacing the \u2013 with a simple '-' (dash) (or I
> could even handle just a ' ' (space)

You misinterpret wha
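For reference, the \u2013 shown in the quoted repr is an escape sequence for a single character, U+2013 EN DASH, not six literal characters in the string. A small sketch to check this, assuming Python 3 and a shortened sample string:

```python
import unicodedata

s = "Fall \u2013 2006"   # shortened sample, same escape as in the quoted text
dash = s[5]              # the character right after "Fall "

print(len(dash))               # 1
print(unicodedata.name(dash))  # EN DASH
print(ascii(s))                # 'Fall \u2013 2006'
```

The last line shows that the escape is only how the interpreter displays the string when asked for an ASCII-safe representation.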

[Tutor] unicode decode/encode issue

2016-09-26 Thread bruce
Hi.

I've got a "basic" situation that should be simple. So it must be a user (me) issue!

I've got a page from a web fetch. I'm simply trying to go from utf-8 to ascii. I'm not worried about any cruft that might get stripped out as the data is generated from a US site. (It's a college/class dataset
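As a rough illustration of what "going from utf-8 to ascii" involves, here is a sketch assuming Python 3. The bytes below are an assumed sample, not the actual page from the fetch:

```python
# Assumed sample: UTF-8 bytes as a fetch might return them. The three bytes
# \xe2\x80\x93 are the UTF-8 encoding of U+2013 EN DASH.
raw = b"Syllabus \xe2\x80\x93 Fall"

text = raw.decode("utf-8")       # bytes -> str; the dash becomes one character
print(len(raw), len(text))       # 17 15

# Encoding to ASCII needs an error handler for characters ASCII lacks.
print(text.encode("ascii", "replace"))  # b'Syllabus ? Fall'
print(text.encode("ascii", "ignore"))   # b'Syllabus  Fall'
```

With "ignore" the dash vanishes silently, which is why replacing it with "-" before encoding is usually the nicer option.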