Re: Unicode 7
On Thu, 01 May 2014 21:42:21 -0700, Rustom Mody wrote:

> Whats the best cure for headache?
>
> Cut off the head

o_O

I don't think so.

> Whats the best cure for Unicode?
>
> Ascii

Unicode is not a problem to be solved.

The inability to write standard human text in ASCII is a problem, e.g. one
cannot write

    “ASCII For Dummies” © 2014 by Zöe Smith, now on sale 99¢

so even *Americans* cannot represent all their common characters in ASCII,
let alone specialised characters from mathematics, science, the printing
industry, and law. And even Americans sometimes need to write text in
Foreign. Where is your ASCII now?

The solution is to have at least one encoding which contains the additional
characters needed.

The plethora of such additional encodings is a problem. The solution is a
single encoding that covers all needed characters, like Unicode, so that
there is no need to handle multiple encodings.

The inability of plain text files to record metadata of what encoding they
use is a problem. The solution is to standardize on a single, world-wide
encoding, like Unicode.

> Saying however that there is no headache in unicode does not make the
> headache go away:
>
> http://lucumr.pocoo.org/2014/1/5/unicode-in-2-and-3/
>
> No I am not saying that the contents/style/tone are right. However
> people are evidently suffering the transition. Denying it is not a help.

Transitions are always more painful than the period after things have
settled down. As I have said repeatedly, I look forward to the day when
nobody but document archivists and academics need care about legacy
encodings. But we're not there yet.

> And unicode consortium's ways are not exactly helpful to its own cause:
> Imagine the C standard committee deciding that adding mandatory garbage
> collection to C is a neat idea
>
> Unicode consortium's going from old BMP to current (6.0) SMPs to
> who-knows-what in the future is similar.

I don't see the connection.

--
Steven D'Aprano
http://import-that.dreamwidth.org/
--
https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On Thu, 01 May 2014 19:02:48 -0700, Rustom Mody wrote:

> I dont know how one causally connects the 'headaches' but Ive seen
> - mojibake

Mojibake is certainly more common with multiple encodings, but the solution
to that is Unicode, not ASCII.

In fact, in your blog post you even link to a post of mine where I explain
that ASCII has gone through multiple backwards incompatible changes over
the decades, which means you can have a limited form of mojibake even in
pure ASCII. Between changes over various versions of ASCII, and ambiguous
characters allowed by the standard, you needed some sort of out-of-band
metadata to tell you whether they intended an @ or a `, a | or a ¬, a £ or
a #, to mention only a few.

It's only since the 1980s that ASCII, actual 7-bit US ASCII, has become an
unambiguous standard. But that's okay, because that merely allowed people
to create dozens of 7-bit and 8-bit variations on ASCII, all incompatible
with each other, and *call them ASCII* regardless of the actual standard
name.

Between ambiguities in actual ASCII, and common practice to label non-ASCII
as ASCII, I can categorically say that mojibake has always been possible in
so-called "plain text". If you haven't noticed it, it was because you were
only exchanging documents with people who happened to use the same set of
characters as you.

> - unicode 'number-boxes' (what are these called?)

They are missing character glyphs, and they have nothing to do with
Unicode. They are due to deficiencies in the text font you are using.

Admittedly with Unicode's 0x10FFFF possible characters (actually more,
since a single code point can have multiple glyphs) it isn't surprising
that most font designers have neither the time, skill or desire to create
a glyph for every single code point. But then the same applies even for
more restrictive 8-bit encodings -- sometimes font designers don't even
bother providing glyphs for *ASCII* characters.

(E.g. they may only provide glyphs for uppercase A...Z, not lowercase.)

> - Worst of all what we
> *dont* see -- how many others dont see what we see?

Again, this is a deficiency of the font. There are very few code points in
Unicode which are intended to be invisible, e.g. space, newline, zero-width
joiner, control characters, etc., but they ought to be equally invisible to
everyone. No printable character should ever be invisible in any decent
font.

> I never knew of any of this in the good ol days of ASCII

You must have been happy with a very impoverished set of symbols, then.

> ¶ Passive voice is often the best choice in the interests of political
> correctness
>
> It would be a pleasant surprise if everyone sees a pilcrow at start of
> line above

I do.

--
Steven D'Aprano
http://import-that.dreamwidth.org/
--
https://mail.python.org/mailman/listinfo/python-list
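As a small illustration of the mojibake discussed above, here is a minimal
Python 3 sketch. The sample string is borrowed from this thread; the choice
of Latin-1 as the wrongly assumed encoding is only an assumption for the
demonstration, any mismatched 8-bit encoding would garble the text in a
similar way.

```python
# Mojibake sketch: UTF-8 bytes read back with the wrong encoding.
text = "Zoë Smith, 99¢"

utf8_bytes = text.encode("utf-8")

# A reader who wrongly assumes Latin-1 sees two characters per non-ASCII one.
garbled = utf8_bytes.decode("latin-1")
print(garbled)

# The damage is reversible only if you know which wrong guess was made.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == text)

# Plain ASCII cannot represent the text at all.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)
```

The point is that the bytes are fine; only the out-of-band knowledge of the
encoding went missing.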
Re: Off-topic circumnavigating the earth in a mile or less
On Thu, 01 May 2014 21:57:57 +0100, Adam Funk wrote: > On 2014-05-01, Terry Reedy wrote: > >> On 4/30/2014 7:46 PM, Ian Kelly wrote: >> >>> It also works if your starting point is (precisely) the north pole. I >>> believe that's the canonical answer to the riddle, since there are no >>> bears in Antarctica. >> >> For the most part, there are no bears within a mile of the North Pole >> either. "they are rare north of 88°" (ie, 140 miles from pole). >> https://en.wikipedia.org/wiki/Polar_bears They mostly hunt in or near >> open water, near the coastlines. >> >> I find it amusing that someone noticed and posted an alternate, >> non-canonical solution. How might a bear be near the south pole? As >> long as we are being creative, suppose some jokester mounts a near >> life-size stuffed black bear, made of cold-tolerant artificial >> materials, near but not at the South Pole. The intent is to give fright >> to naive newcomers. Someone walking in a radius 1/2pi circle about the >> pole might easily see it. > > OK, change bear to bird & the question to "What kind of bird is it?" Arctic Turn is a valid answer for all locations :-) -- Pardon me, but do you know what it means to be TRULY ONE with your BOOTH! -- https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On Fri, May 2, 2014 at 6:08 PM, Steven D'Aprano wrote: > ... even *Americans* cannot represent all their common characters in > ASCII, let alone specialised characters from mathematics, science, the > printing industry, and law. Aside: What additional characters does law use that aren't in ASCII? Section § and paragraph ¶ are used frequently, but you already mentioned the printing industry. Are there other symbols? ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On Fri, May 2, 2014 at 6:45 PM, Steven D'Aprano wrote:
>> - unicode 'number-boxes' (what are these called?)
>
> They are missing character glyphs, and they have nothing to do with
> Unicode. They are due to deficiencies in the text font you are using.
>
> Admittedly with Unicode's 0x10FFFF possible characters (actually more,
> since a single code point can have multiple glyphs) it isn't surprising
> that most font designers have neither the time, skill or desire to create
> a glyph for every single code point. But then the same applies even for
> more restrictive 8-bit encodings -- sometimes font designers don't even
> bother providing glyphs for *ASCII* characters.
>
> (E.g. they may only provide glyphs for uppercase A...Z, not lowercase.)

This is another area where Unicode has given us "a great improvement over
the old method of giving satisfaction". Back in the 1990s on OS/2, DOS, and
Windows, a missing glyph might be (a) blank, (b) a simple square with no
information, or (c) copied from some other font (common with dingbats
fonts). With Unicode, the standard is to show a little box *with the hex
digits in it*. Granted, those boxes are a LOT more readable for BMP
characters than SMP (unless your text is huge, six digits in the space of
one character will make them pretty tiny), and a "Unicode" font will
generally include all (or at least most) of the BMP, but it's still better
than having no information at all.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
Chris Angelico writes: > On Fri, May 2, 2014 at 6:08 PM, Steven D'Aprano > wrote: > > ... even *Americans* cannot represent all their common characters in > > ASCII, let alone specialised characters from mathematics, science, > > the printing industry, and law. > > Aside: What additional characters does law use that aren't in ASCII? > Section § and paragraph ¶ are used frequently, but you already > mentioned the printing industry. Are there other symbols? ASCII does not contain “©” (U+00A9 COPYRIGHT SIGN) nor “®” (U+00AE REGISTERED SIGN), for instance. -- \ “I got some new underwear the other day. Well, new to me.” —Emo | `\ Philips | _o__) | Ben Finney -- https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On Fri, May 2, 2014 at 7:16 PM, Ben Finney wrote: > Chris Angelico writes: > >> On Fri, May 2, 2014 at 6:08 PM, Steven D'Aprano >> wrote: >> > ... even *Americans* cannot represent all their common characters in >> > ASCII, let alone specialised characters from mathematics, science, >> > the printing industry, and law. >> >> Aside: What additional characters does law use that aren't in ASCII? >> Section § and paragraph ¶ are used frequently, but you already >> mentioned the printing industry. Are there other symbols? > > ASCII does not contain “©” (U+00A9 COPYRIGHT SIGN) nor “®” (U+00AE > REGISTERED SIGN), for instance. Heh! I forgot about those. U+00A9 in particular has gone so mainstream that it's easy to think of it not as "I'm going to switch to my 'British English + Legal' dictionary now" and just as "This is a critical part of the basic dictionary". ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Designing a network in Python
On Wednesday, 30 April 2014 20:38:07 UTC+2, Joseph L. Casale wrote: > > I don't know how to do that stuff in python. Basically, I'm trying to pull > > certain data from the > > xml file like the node-name, source, destination and the capacity. Since, I > > am done with that > > part, I now want to have a link between source and destination and assign > > capacity to it. > > I dont mind writing you an SQLite schema and accessor class, can you define > your data in a tabular > format and mail it to me offline, we add relationships etc as we go. > > Hopefully it inspires you to adopt this approach in the future as it often > proves powerful. > > jlc Thanks a lot for your help. But, how do I mail you? I can't find your mail id here -- https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
Chris Angelico writes: > (common with dingbats fonts). With Unicode, the standard is to show > a little box *with the hex digits in it*. Granted, those boxes are a > LOT more readable for BMP characters than SMP (unless your text is > huge, six digits in the space of one character will make them pretty > tiny), and a "Unicode" font will generally include all (or at least > most) of the BMP, but it's still better than having no information I needed to see such tiny numbers just today, just the four of them in the tiny box. So I pressed C-+ a few times to _make_ the text huge, obtained my information, and returned to my normal text size with C--. Perfect. Usually all I need to know is that I have a character for which I don't have a glyph, but this time I wanted to record the number because I was testing things rather than reading the text. -- https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
Ben Finney : >> Aside: What additional characters does law use that aren't in ASCII? >> Section § and paragraph ¶ are used frequently, but you already >> mentioned the printing industry. Are there other symbols? > > ASCII does not contain “©” (U+00A9 COPYRIGHT SIGN) nor “®” (U+00AE > REGISTERED SIGN), for instance. The em-dash is mapped on my keyboard — I use it quite often. Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On Friday, May 2, 2014 2:15:41 PM UTC+5:30, Steven D'Aprano wrote: > On Thu, 01 May 2014 19:02:48 -0700, Rustom Mody wrote: > > - Worst of all what we > > *dont* see -- how many others dont see what we see? > Again, this a deficiency of the font. There are very few code points in > Unicode which are intended to be invisible, e.g. space, newline, zero- > width joiner, control characters, etc., but they ought to be equally > invisible to everyone. No printable character should ever be invisible in > any decent font. Thats not what I meant. I wrote http://blog.languager.org/2014/04/unicoded-python.html – mostly on a debian box. Later on seeing it on a less heavily setup ubuntu box, I see ⟮ ⟯ ⟬ ⟭ ⦇ ⦈ ⦉ ⦊ have become 'missing-glyph' boxes. It leads me ask, how much else of what I am writing, some random reader has simply not seen? Quite simply we can never know – because most are going to go away saying "mojibaked/garbled rubbish" Speaking of what you understood of what I said: Yes invisible chars is another problem I was recently bitten by. I pasted something from google into emacs' org mode. Following that link again I kept getting a broken link. Until I found that the link had an invisible char The problem was that emacs was faithfully rendering that char according to standard, ie invisibly! -- https://mail.python.org/mailman/listinfo/python-list
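For the pasted-link problem described above, a rough Python 3 sketch of how
one might hunt for invisible characters. The helper name find_invisible and
the example URL are made up for illustration; the check relies on the
Unicode general categories Cf (format) and Cc (control), which is an
assumption about what counts as "invisible" and will not catch everything.

```python
import unicodedata


def find_invisible(text):
    """Return (index, name) pairs for format (Cf) and control (Cc) characters."""
    hits = []
    for index, char in enumerate(text):
        if unicodedata.category(char) in ("Cf", "Cc"):
            # Control characters have no official name, so supply a default.
            hits.append((index, unicodedata.name(char, "<unnamed control>")))
    return hits


# Hypothetical link with a zero-width space hiding after the slash.
pasted_link = "http://example.com/\u200bpage"
suspects = find_invisible(pasted_link)
print(suspects)
```

Running something like this over a suspicious string at least tells you
where to look, whatever the editor chooses to render.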
Re: Unicode 7
On Fri, 02 May 2014 19:01:44 +1000, Chris Angelico wrote:

> On Fri, May 2, 2014 at 6:08 PM, Steven D'Aprano wrote:
>> ... even *Americans* cannot represent all their common characters in
>> ASCII, let alone specialised characters from mathematics, science, the
>> printing industry, and law.
>
> Aside: What additional characters does law use that aren't in ASCII?
> Section § and paragraph ¶ are used frequently, but you already mentioned
> the printing industry. Are there other symbols?

I was thinking of copyright, trademark, registered mark, and similar. I
think these are all of the relevant characters:

py> import unicodedata
py> for c in '©®℗™':
...     unicodedata.name(c)
...
'COPYRIGHT SIGN'
'REGISTERED SIGN'
'SOUND RECORDING COPYRIGHT'
'TRADE MARK SIGN'

--
Steven D'Aprano
http://import-that.dreamwidth.org/
--
https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On Fri, 02 May 2014 03:39:34 -0700, Rustom Mody wrote: > On Friday, May 2, 2014 2:15:41 PM UTC+5:30, Steven D'Aprano wrote: >> On Thu, 01 May 2014 19:02:48 -0700, Rustom Mody wrote: >> > - Worst of all what we >> > *dont* see -- how many others dont see what we see? > >> Again, this a deficiency of the font. There are very few code points in >> Unicode which are intended to be invisible, e.g. space, newline, zero- >> width joiner, control characters, etc., but they ought to be equally >> invisible to everyone. No printable character should ever be invisible >> in any decent font. > > Thats not what I meant. > > I wrote http://blog.languager.org/2014/04/unicoded-python.html > – mostly on a debian box. > Later on seeing it on a less heavily setup ubuntu box, I see > ⟮ ⟯ ⟬ ⟭ ⦇ ⦈ ⦉ ⦊ > have become 'missing-glyph' boxes. > > It leads me ask, how much else of what I am writing, some random reader > has simply not seen? > Quite simply we can never know – because most are going to go away > saying "mojibaked/garbled rubbish" > > Speaking of what you understood of what I said: Yes invisible chars is > another problem I was recently bitten by. I pasted something from google > into emacs' org mode. Following that link again I kept getting a broken > link. > > Until I found that the link had an invisible char > > The problem was that emacs was faithfully rendering that char according > to standard, ie invisibly! And you've never been bitten by an invisible control character in ASCII text? You've lived a sheltered life! Nothing you are describing is unique to Unicode. -- Steven D'Aprano http://import-that.dreamwidth.org/ -- https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
Steven D'Aprano : > And you've never been bitten by an invisible control character in > ASCII text? You've lived a sheltered life! That reminds me: " " (nonbreakable space) is often used between numbers and units, for example. Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On 2014-05-02 19:08, Chris Angelico wrote: > This is another area where Unicode has given us "a great improvement > over the old method of giving satisfaction". Back in the 1990s on > OS/2, DOS, and Windows, a missing glyph might be (a) blank, (b) a > simple square with no information, or (c) copied from some other > font (common with dingbats fonts). With Unicode, the standard is to > show a little box *with the hex digits in it*. Granted, those boxes > are a LOT more readable for BMP characters than SMP (unless your > text is huge, six digits in the space of one character will make > them pretty tiny), and a "Unicode" font will generally include all > (or at least most) of the BMP, but it's still better than having no > information at all. I'm pleased when applications & fonts work properly, using both the placeholder fonts for "this character is legitimate but I can't display it with a font, so here, have a box with the codepoint numbers in it until I'm directed to use a more appropriate font at which point you'll see it correctly" and the "somebody crammed garbage in here, so I'll display it with "�" (U+FFFD) which is designated for exactly this purpose". -tkc -- https://mail.python.org/mailman/listinfo/python-list
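The "somebody crammed garbage in here" case can be reproduced in a few
lines of Python 3. This is only a sketch; the byte string is an invented
example of Latin-1 data mislabelled as UTF-8.

```python
import unicodedata

# Latin-1 bytes for "café"; the final byte is not valid UTF-8 on its own.
raw = b"caf\xe9"

# errors="replace" substitutes U+FFFD for the undecodable byte
# instead of raising UnicodeDecodeError.
decoded = raw.decode("utf-8", errors="replace")
replacement_name = unicodedata.name(decoded[-1])

print(decoded)
print(replacement_name)
```

A missing glyph, by contrast, is a perfectly valid code point that the font
cannot draw, so nothing in the string itself changes.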
Re: Unicode 7
On Friday, May 2, 2014 5:25:37 PM UTC+5:30, Steven D'Aprano wrote:
> On Fri, 02 May 2014 03:39:34 -0700, Rustom Mody wrote:
> > On Friday, May 2, 2014 2:15:41 PM UTC+5:30, Steven D'Aprano wrote:
> >> On Thu, 01 May 2014 19:02:48 -0700, Rustom Mody wrote:
> >> > - Worst of all what we
> >> > *dont* see -- how many others dont see what we see?
> >> Again, this a deficiency of the font. There are very few code points in
> >> Unicode which are intended to be invisible, e.g. space, newline,
> >> zero-width joiner, control characters, etc., but they ought to be
> >> equally invisible to everyone. No printable character should ever be
> >> invisible in any decent font.
> > Thats not what I meant.
> > I wrote http://blog.languager.org/2014/04/unicoded-python.html
> > – mostly on a debian box.
> > Later on seeing it on a less heavily setup ubuntu box, I see
> > ⟮ ⟯ ⟬ ⟭ ⦇ ⦈ ⦉ ⦊
> > have become 'missing-glyph' boxes.
> > It leads me ask, how much else of what I am writing, some random reader
> > has simply not seen?
> > Quite simply we can never know – because most are going to go away
> > saying "mojibaked/garbled rubbish"
> > Speaking of what you understood of what I said: Yes invisible chars is
> > another problem I was recently bitten by. I pasted something from google
> > into emacs' org mode. Following that link again I kept getting a broken
> > link.
> > Until I found that the link had an invisible char
> > The problem was that emacs was faithfully rendering that char according
> > to standard, ie invisibly!
> And you've never been bitten by an invisible control character in ASCII
> text? You've lived a sheltered life!

For control characters I've seen:
- garbage (the ASCII equiv of mojibake)
- Straight ^A^B^C
- Maybe their names NUL,SOH,STX,ETX,EOT,ENQ,ACK…
- Or maybe just a little dot .
- More pathological behavior: a control sequence putting the terminal into
  some other mode

But I don't ever remember seeing a control character become invisible
(except [ \t\n\f])
--
https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On 2014-05-02 03:39, Ben Finney wrote: Rustom Mody writes: Yes, the headaches go a little further back than Unicode. Okay, so can you change your article to reflect the fact that the headaches both pre-date Unicode, and are made much easier by Unicode? There is a certain large old book... Ah yes, the neo-Sumerian story “Enmerkar_and_the_Lord_of_Aratta” https://en.wikipedia.org/wiki/Enmerkar_and_the_Lord_of_Aratta>. Probably inspired by stories older than that, of course. In which is described the building of a 'tower that reached up to heaven'... At which point 'it was decided'¶ to do something to prevent that. And our headaches started. And other myths with fantastic reasons for the diversity of language https://en.wikipedia.org/wiki/Mythical_origins_of_language>. I never knew of any of this in the good ol days of ASCII Yes, by ignoring all other writing systems except one's own – and thereby excluding most of the world's people – the system can be made simpler. ASCII lacked even £. I can remember assembly listings in magazines containing lines such as: LDA £0 I even (vaguely) remember an advert with a character that looked like Ł, presumably because they didn't have £. In a UK magazine? Very strange! Hopefully the proportion of programmers who still feel they can make such a parochial choice is rapidly shrinking. -- https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On Friday, May 2, 2014 5:25:37 PM UTC+5:30, Steven D'Aprano wrote:
> On Fri, 02 May 2014 03:39:34 -0700, Rustom Mody wrote:
> > On Friday, May 2, 2014 2:15:41 PM UTC+5:30, Steven D'Aprano wrote:
> >> On Thu, 01 May 2014 19:02:48 -0700, Rustom Mody wrote:
> >> > - Worst of all what we
> >> > *dont* see -- how many others dont see what we see?
> >> Again, this a deficiency of the font. There are very few code points in
> >> Unicode which are intended to be invisible, e.g. space, newline,
> >> zero-width joiner, control characters, etc., but they ought to be
> >> equally invisible to everyone. No printable character should ever be
> >> invisible in any decent font.
> > Thats not what I meant.
> > I wrote http://blog.languager.org/2014/04/unicoded-python.html
> > – mostly on a debian box.
> > Later on seeing it on a less heavily setup ubuntu box, I see
> > ⟮ ⟯ ⟬ ⟭ ⦇ ⦈ ⦉ ⦊
> > have become 'missing-glyph' boxes.
> > It leads me ask, how much else of what I am writing, some random reader
> > has simply not seen?
> > Quite simply we can never know – because most are going to go away
> > saying "mojibaked/garbled rubbish"
> > Speaking of what you understood of what I said: Yes invisible chars is
> > another problem I was recently bitten by. I pasted something from google
> > into emacs' org mode. Following that link again I kept getting a broken
> > link.
> > Until I found that the link had an invisible char
> > The problem was that emacs was faithfully rendering that char according
> > to standard, ie invisibly!
> And you've never been bitten by an invisible control character in ASCII
> text? You've lived a sheltered life!
> Nothing you are describing is unique to Unicode.

Just noticed a small thing in which python does a bit better than haskell:

$ ghci
let (ﬁne, fine) = (1,2)
Prelude> (ﬁne, fine)
(1,2)
Prelude>

In case it's not apparent, the fi in the first fine is a ligature.

Python just barfs:

>>> ﬁne = 1
  File "", line 1
    ﬁne = 1
    ^
SyntaxError: invalid syntax
>>>

The point of that example is to show that unicode gives all kinds of
"Aaah! Gotcha!!" opportunities that just don't exist in the old world.
Python may have got this one right but there are surely dozens of others.

On the other hand I see more eagerness for unicode source-text there, eg.

https://github.com/i-tu/Hasklig
http://www.haskell.org/ghc/docs/latest/html/users_guide/syntax-extns.html#unicode-syntax
http://www.haskell.org/haskellwiki/Unicode-symbols
http://hackage.haskell.org/package/base-unicode-symbols

Some music 𝄞 𝄢 ♭ 𝄱 to appease the utf-8 gods
--
https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On 2014-05-02 09:08, Steven D'Aprano wrote: On Thu, 01 May 2014 21:42:21 -0700, Rustom Mody wrote: Whats the best cure for headache? Cut off the head o_O I don't think so. Whats the best cure for Unicode? Ascii Unicode is not a problem to be solved. The inability to write standard human text in ASCII is a problem, e.g. one cannot write “ASCII For Dummies” © 2014 by Zöe Smith, now on sale 99¢ [snip] Shouldn't that be "Zoë"? -- https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On 05/02/2014 10:50 AM, Rustom Mody wrote: > Python just barfs: > fine = 1 > File "", line 1 > fine = 1 > ^ > SyntaxError: invalid syntax > > The point of that example is to show that unicode gives all kind of > "Aaah! Gotcha!!" opportunities that just dont exist in the old world. > Python may have got this one right but there are surely dozens of others. Except that it doesn't. This has nothing to do with unicode handling. It has everything to do with what defines an identifier in Python. This is no different than someone wondering why they can't start an identifier in Python 1.x with a number or punctuation mark. -- https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On 5/2/14 12:50 PM, Rustom Mody wrote:
> Just noticed a small thing in which python does a bit better than haskell:
>
> $ ghci
> let (ﬁne, fine) = (1,2)
> Prelude> (ﬁne, fine)
> (1,2)
> Prelude>
>
> In case its not apparent, the fi in the first fine is a ligature.
>
> Python just barfs:
>
> >>> ﬁne = 1
>   File "", line 1
>     ﬁne = 1
>     ^
> SyntaxError: invalid syntax
> >>>

Surely by now we could at least be explicit about which version of Python
we are talking about?

$ python2.7
Python 2.7.2 (default, Oct 11 2012, 20:14:37)
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> ﬁne = 1
  File "", line 1
    ﬁne = 1
    ^
SyntaxError: invalid syntax
>>> ^D
$ python3.4
Python 3.4.0b1 (default, Dec 16 2013, 21:05:22)
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> ﬁne = 1
>>> fine
1

In Python 2 identifiers must be ASCII. Python 3 allows many Unicode
characters in identifiers (see PEP 3131 for details:
http://legacy.python.org/dev/peps/pep-3131/)

--
Ned Batchelder, http://nedbatchelder.com
--
https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
Rustom Mody wrote:
> Just noticed a small thing in which python does a bit better than haskell:
> $ ghci
> let (ﬁne, fine) = (1,2)
> Prelude> (ﬁne, fine)
> (1,2)
> Prelude>
>
> In case its not apparent, the fi in the first fine is a ligature.
>
> Python just barfs:
Not Python 3:
Python 3.3.2+ (default, Feb 28 2014, 00:52:16)
[GCC 4.8.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> (ﬁne, fine) = (1,2)
>>> (ﬁne, fine)
(2, 2)
No copy-and-paste errors involved:
>>> eval("\ufb01ne")
2
>>> eval(b"fine".decode("ascii"))
2
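The result (2, 2) above is consistent with the two spellings being treated
as the same name. A minimal Python 3 sketch of why, assuming (as PEP 3131
describes) that identifiers are compared after NFKC normalization; the
variable names here are only for illustration:

```python
import unicodedata

ligature_name = "\ufb01ne"   # starts with U+FB01 LATIN SMALL LIGATURE FI
plain_name = "fine"          # four ordinary ASCII letters

# As raw strings the two spellings differ...
same_as_strings = ligature_name == plain_name

# ...but NFKC normalization folds the ligature into "f" + "i".
normalized = unicodedata.normalize("NFKC", ligature_name)

print(same_as_strings, len(ligature_name), normalized)
```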
--
https://mail.python.org/mailman/listinfo/python-list
Hi. I want to create a script to read a file placed in a remote linux server using python..need help..?
I have created the script till here ..
import os
import re

os.chdir("/var/log")
fd = open("t1.txt", "r")
for line in fd:
    if re.match("(.*)(file1)(.*)", line):
        print line,
Output :
file1
this script i ran on the linux server, but now i want to run this script from
another linux server and get the output displayed there..how can i do that...
i tried to use : pexpect
but getting no help..
--
https://mail.python.org/mailman/listinfo/python-list
Re: Cookie not retrieving as it should in some cases
On Fri, 02 May 2014 01:11:05 -0700, Ferrous Cranus wrote:
> # retrieve cookie from client's browser otherwise set it
> try:
>     cookie = cookies.SimpleCookie( os.environ.get('HTTP_COOKIE', '') )
>     cookieID = cookie['ID'].value
> except:
>     cookieID = str( time.time() )
>     cookieID = cookieID[-3:]
>
> cookie['ID'] = cookieID
>
>
> Many times I noticed that the script, instead of retrieving the cookie ID
> value so as to identify each visitor uniquely, sets it again.
> The same thing also happens when someone comes to superhost.gr via a
> link from another webpage.
>
> Can somebody tell me why this is happening?
> Is there some flaw in my code? Perhaps it can be written more
> efficiently?
I had a similar issue when using Beaker middleware for WSGI which was
caused by me not specifying a location for the storage of the cookie
database.
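
Separately from the storage question, the bare except in the quoted code
hides the difference between "no cookie was sent" and any other failure. A
hedged Python 3 sketch of reading the ID explicitly; the helper name
get_visitor_id and the sample header values are invented for illustration,
and this is not a diagnosis of the original script:

```python
from http import cookies


def get_visitor_id(cookie_header):
    """Return the value of the 'ID' cookie, or None if the browser sent none."""
    jar = cookies.SimpleCookie(cookie_header)
    morsel = jar.get("ID")
    return morsel.value if morsel is not None else None


# In a CGI script the header would come from os.environ.get('HTTP_COOKIE', '').
returning_visitor = get_visitor_id("ID=123; theme=dark")
new_visitor = get_visitor_id("")
print(returning_visitor, new_visitor)
```

With the "missing" case made explicit, logging the raw header on each
request should show whether the browser is failing to send the cookie or
the script is failing to read it.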
--
There is a multi-legged creature crawling on your shoulder.
-- Spock, "A Taste of Armageddon", stardate 3193.9
--
https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
Marko Rauhamaa writes:

> That reminds me: " " [U+00A0 NON-BREAKING SPACE] is often used between
> numbers and units, for example.

The non-breaking space (“ ” U+00A0) is frequently used in text to keep
conceptually inseparable text such as “100 km” from automatic word breaks
<https://en.wikipedia.org/wiki/Non-breaking_space>.

Because of established, conflicting conventions for separating groups of
digits (“1,234.00” in many countries; “1.234,00” in many others)
<https://en.wikipedia.org/wiki/Thousands_separator#Digit_grouping>, the
“ ” U+2009 THIN SPACE <https://en.wikipedia.org/wiki/Thin_Space> is
recommended for separating digit groups (e.g. “1 234 567 m”)
<https://en.wikipedia.org/wiki/SI_units#General_rules>.

--
 \     “We spend the first twelve months of our children's lives |
  `\    teaching them to walk and talk and the next twelve years |
_o__)   telling them to sit down and shut up.” —Phyllis Diller   |
Ben Finney
--
https://mail.python.org/mailman/listinfo/python-list
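The two conventions above are easy to combine in code. A minimal Python 3
sketch, where format_quantity is a hypothetical helper written for this
example and integer input is assumed:

```python
THIN_SPACE = "\u2009"       # U+2009, separates digit groups
NO_BREAK_SPACE = "\u00a0"   # U+00A0, keeps the number and its unit together


def format_quantity(value, unit):
    """Group digits with thin spaces and glue the unit on with a no-break space."""
    # Let the format mini-language do the grouping, then swap the separator.
    grouped = f"{value:,}".replace(",", THIN_SPACE)
    return f"{grouped}{NO_BREAK_SPACE}{unit}"


distance = format_quantity(1234567, "m")
print(distance)
print(ascii(distance))
```

The ascii() call is there because the two kinds of space are impossible to
tell apart by eye in most fonts.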
Re: Unicode 7
In article , Ben Finney wrote:

> The non-breaking space (“ ” U+00A0) is frequently used in text to keep
> conceptually inseparable text such as “100 km” from automatic word
> breaks <https://en.wikipedia.org/wiki/Non-breaking_space>.

Which, by the way, argparse doesn't honor...

http://bugs.python.org/issue16623
--
https://mail.python.org/mailman/listinfo/python-list
Re: Hi. I want to create a script to read a file placed in a remote linux server using python..need help..?
On Fri, 02 May 2014 12:55:18 -0700, Bhawani Singh wrote:
> I have created the script till here ..
>
> import os
> import re
>
> os.chdir("/var/log")
> fd = open("t1.txt", "r")
> for line in fd:
>     if re.match("(.*)(file1)(.*)", line):
>         print line,
>
> Output :
>
> file1
>
>
> this script i ran on the linux server, but now i want to run this script
> from another linux server and get the output displayed there..how can i
> do that...
>
> i tried to use : pexpect but getting no help..
Method a:
Go and sit in front of the keyboard on the other linux server, run the
script and read the screen.
Method b:
Use telnet to login to your account on the other server, run the script.
To run your script on someone else's machine usually needs you to be able
to access their machine somehow. Either you are permitted to do it, in
which case you should already know how to do it, or you're not permitted
to do it, in which case we're not going to teach you how to do it here.
--
Denis McMahon, [email protected]
--
https://mail.python.org/mailman/listinfo/python-list
Re: Hi. I want to create a script to read a file placed in a remote linux server using python..need help..?
In article , Denis McMahon wrote:

> Method b:
>
> Use telnet to login to your account on the other server, run the script.

Ugh. I hope nobody is using telnet anymore. Passwords sent in plain text
over the network. Bad. All uses of telnet should have long since been
replaced with ssh.

One of the cool things about ssh is that not only does it give you remote
shell connectivity, but it can be used to execute commands remotely, over
the same secure channel. There is an awesome python package called fabric
(http://www.fabfile.org/) which makes it trivial to do this inside of a
python program. You can use it as a command-line tool, or as a library
embedded in another python script.
--
https://mail.python.org/mailman/listinfo/python-list
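To make "execute commands remotely over ssh" concrete without depending on
any particular fabric version, here is a rough Python 3.7+ sketch that
shells out to the system ssh client instead of using fabric. The helper
names and the host "user@logserver" are hypothetical, and key-based login
to that host is assumed to be set up already.

```python
import shlex
import subprocess


def build_grep_command(pattern, path):
    """Build the shell command to run on the remote side, safely quoted."""
    return "grep -- {} {}".format(shlex.quote(pattern), shlex.quote(path))


def build_ssh_command(host, remote_command):
    """Build the local argv: ssh runs remote_command on host and relays stdout."""
    return ["ssh", host, remote_command]


def run_remote(host, remote_command):
    """Run the command over ssh and return its output (needs network access)."""
    result = subprocess.run(
        build_ssh_command(host, remote_command),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Only build the command here; nothing is sent over the network.
argv = build_ssh_command(
    "user@logserver", build_grep_command("file1", "/var/log/t1.txt")
)
print(argv)
```

Calling run_remote("user@logserver", build_grep_command("file1",
"/var/log/t1.txt")) from the second server would then print the matching
lines there, which is what the original poster asked for.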
Re: Unicode 7
On Friday, May 2, 2014 11:37:02 PM UTC+5:30, Peter Otten wrote:
> Rustom Mody wrote:
> > Just noticed a small thing in which python does a bit better than haskell:
> > $ ghci
> > let (ﬁne, fine) = (1,2)
> > Prelude> (ﬁne, fine)
> > (1,2)
> > In case its not apparent, the fi in the first fine is a ligature.
> > Python just barfs:
> Not Python 3:
> Python 3.3.2+ (default, Feb 28 2014, 00:52:16)
> [GCC 4.8.1] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> (ﬁne, fine) = (1,2)
> >>> (ﬁne, fine)
> (2, 2)
> No copy-and-paste errors involved:
> >>> eval("\ufb01ne")
> 2
> >>> eval(b"fine".decode("ascii"))
> 2
Aah! Thanks Peter (and Ned and Michael) — 2-3 confusion — my bad.
I am confused about the tone however:
You think this
>>> (ﬁne, fine) = (1,2) # and no issue about it
is fine?
--
https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On Sat, May 3, 2014 at 10:58 AM, Rustom Mody wrote:
> You think this
>
> >>> (ﬁne, fine) = (1,2) # and no issue about it
>
> is fine?
Not sure which part you're objecting to. Are you saying that this
should be an error:
>>> a, a = 1, 2 # simple ASCII identifier used twice
or that Python should take the exact sequence of codepoints, rather
than normalizing?
Python 3.5.0a0 (default:6a0def54c63d, Mar 26 2014, 01:11:09)
[GCC 4.7.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> ﬁne = 1
>>> vars()
{'__package__': None, '__spec__': None, '__doc__': None, 'fine': 1,
'__loader__': <class '_frozen_importlib.BuiltinImporter'>,
'__builtins__': <module 'builtins' (built-in)>, '__name__':
'__main__'}
As regards normalization, I would be happy with either "keep it
exactly as you provided" or "normalize according to ", as long as it's consistent. It's like
what happens with SQL identifiers: according to the standard, an
unquoted name should be uppercased, but some databases instead
lowercase them. It doesn't break code (modulo quoted names, not
applicable here), as long as it's consistent.
(My reading of PEP 3131 is that NFKC is used; is that what's
implemented, or was that a temporary measure and/or something for Py2
to consider?)
ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
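The difference between "keep it exactly as you provided" and NFKC can be checked with the standard `unicodedata` module. This is a small sketch; the helper name `normalized_forms` is made up for illustration, and the ligature is written as an escape so it survives copy and paste.

```python
import unicodedata

# 'ﬁne' spelled with U+FB01 LATIN SMALL LIGATURE FI, three code points long.
ligature_name = "\ufb01ne"
plain_name = "fine"


def normalized_forms(text):
    """Return text under each of the four Unicode normalization forms."""
    return {
        form: unicodedata.normalize(form, text)
        for form in ("NFC", "NFD", "NFKC", "NFKD")
    }


forms = normalized_forms(ligature_name)
for form, value in forms.items():
    # The canonical forms keep the ligature; the compatibility forms
    # (NFKC, NFKD) expand it to the two letters 'f' and 'i'.
    print(form, ascii(value), value == plain_name)
```

Only the compatibility forms make the two spellings compare equal, which is why the choice of form matters for identifiers.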
Re: Unicode 7
On 5/2/14 8:58 PM, Rustom Mody wrote:
> On Friday, May 2, 2014 11:37:02 PM UTC+5:30, Peter Otten wrote:
>> Rustom Mody wrote:
>>> Just noticed a small thing in which python does a bit better than haskell:
>>> $ ghci
>>> let (ﬁne, fine) = (1,2)
>>> Prelude> (ﬁne, fine)
>>> (1,2)
>>> In case its not apparent, the fi in the first fine is a ligature.
>>> Python just barfs:
>> Not Python 3:
>> Python 3.3.2+ (default, Feb 28 2014, 00:52:16)
>> [GCC 4.8.1] on linux
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> (ﬁne, fine) = (1,2)
>> >>> (ﬁne, fine)
>> (2, 2)
>> No copy-and-paste errors involved:
>> >>> eval("\ufb01ne")
>> 2
>> >>> eval(b"fine".decode("ascii"))
>> 2
> Aah! Thanks Peter (and Ned and Michael) — 2-3 confusion — my bad.
> I am confused about the tone however:
> You think this
> >>> (ﬁne, fine) = (1,2) # and no issue about it
> is fine?
Can you be more explicit? It seems like you think it isn't fine. Why
not? What bothers you about it? Should there be an issue?
--
Ned Batchelder, http://nedbatchelder.com
--
https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On Saturday, May 3, 2014 6:48:21 AM UTC+5:30, Ned Batchelder wrote:
> On 5/2/14 8:58 PM, Rustom Mody wrote:
> > On Friday, May 2, 2014 11:37:02 PM UTC+5:30, Peter Otten wrote:
> >> Rustom Mody wrote:
> >>> Just noticed a small thing in which python does a bit better than haskell:
> >>> $ ghci
> >>> let (ﬁne, fine) = (1,2)
> >>> Prelude> (ﬁne, fine)
> >>> (1,2)
> >>> In case its not apparent, the fi in the first fine is a ligature.
> >>> Python just barfs:
> >> Not Python 3:
> >> Python 3.3.2+ (default, Feb 28 2014, 00:52:16)
> >> [GCC 4.8.1] on linux
> >> Type "help", "copyright", "credits" or "license" for more information.
> >> >>> (ﬁne, fine) = (1,2)
> >> >>> (ﬁne, fine)
> >> (2, 2)
> >> No copy-and-paste errors involved:
> >> >>> eval("\ufb01ne")
> >> 2
> >> >>> eval(b"fine".decode("ascii"))
> >> 2
> > Aah! Thanks Peter (and Ned and Michael) — 2-3 confusion — my bad.
> > I am confused about the tone however:
> > You think this
> > >>> (ﬁne, fine) = (1,2) # and no issue about it
> > is fine?
> Can you be more explicit? It seems like you think it isn't fine. Why
> not? What bothers you about it? Should there be an issue?
Two identifiers that to some programmers
- can look the same
- and not to others
- and that the language treats as different
is not fine (or ﬁne) to me.
Putting them together as I did is summarizing the problem.
Think of them textually widely separated.
And the code (un)serendipitously 'working' (ie not giving NameErrors)
--
https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On Sat, May 3, 2014 at 11:42 AM, Rustom Mody wrote:
> Two identifiers that to some programmers
> - can look the same
> - and not to others
> - and that the language treats as different
>
> is not fine (or ﬁne) to me.
The language treats them as the same, though.
ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On Fri, 02 May 2014 17:58:51 -0700, Rustom Mody wrote:
> I am confused about the tone however: You think this
> >>> (ﬁne, fine) = (1,2) # and no issue about it
>
> is fine?
It's no worse than any other obfuscated variable name:
MOOSE, MO0SE, M0OSE = 1, 2, 3
xl, x1 = 1, 2
If you know your victim is reading source code in Ariel font, "rn" and
"m" are virtually indistinguishable except at very large sizes.
--
Steven D'Aprano
http://import-that.dreamwidth.org/
--
https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On Saturday, May 3, 2014 7:24:08 AM UTC+5:30, Chris Angelico wrote:
> On Sat, May 3, 2014 at 11:42 AM, Rustom Mody wrote:
> > Two identifiers that to some programmers
> > - can look the same
> > - and not to others
> > - and that the language treats as different
> > is not fine (or ﬁne) to me.
> The language treats them as the same, though.
Whoops! I seem to be goofing a lot today
Saw Peter's
>>> (ﬁne, fine) = (1,2)
Didn't notice his next line
>>> (ﬁne, fine)
(2, 2)
So then I am back to my original point:
Python is giving better behavior than Haskell in this regard!
[Earlier reached this conclusion via a wrong path]
--
https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On Sat, 03 May 2014 02:02:32 +, Steven D'Aprano wrote:
> On Fri, 02 May 2014 17:58:51 -0700, Rustom Mody wrote:
>
>> I am confused about the tone however: You think this
>> >>> (ﬁne, fine) = (1,2) # and no issue about it
>>
>> is fine?
>
> It's no worse than any other obfuscated variable name:
>
> MOOSE, MO0SE, M0OSE = 1, 2, 3
> xl, x1 = 1, 2
>
> If you know your victim is reading source code in Ariel font, "rn" and
> "m" are virtually indistinguishable except at very large sizes.
Ooops! I too missed that Python normalises the name ﬁne to fine, so in
fact this is not a case of obfuscation.
--
Steven D'Aprano
http://import-that.dreamwidth.org/
--
https://mail.python.org/mailman/listinfo/python-list
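The distinction drawn in this exchange can be reproduced in Python 3 by executing both assignments and looking at which names end up bound. This is only an illustrative sketch; `bind_names` is a made-up helper, and the ligature is written as the escape `\ufb01` so that it cannot be lost in transit.

```python
def bind_names(source):
    """Execute source in a fresh namespace and return the names it bound."""
    namespace = {}
    exec(source, namespace)
    # exec() adds __builtins__ to the dict; drop it to leave user names only.
    namespace.pop("__builtins__", None)
    return namespace


# Ligature spelling and plain spelling: the compiler normalizes both to
# the same identifier, so a single name is bound and the last value wins.
ligature_ns = bind_names("\ufb01ne, fine = 1, 2")

# ASCII lookalikes (letter O versus digit 0) are untouched by
# normalization, so three separate names are bound.
lookalike_ns = bind_names("MOOSE, MO0SE, M0OSE = 1, 2, 3")

print(ligature_ns)
print(sorted(lookalike_ns))
```

So the ligature case is collapsed by the language, while the MOOSE family remains a genuine obfuscation risk that only a font or a linter can catch.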
Re: Unicode 7
On Sat, May 3, 2014 at 12:02 PM, Steven D'Aprano wrote:
> If you know your victim is reading source code in Ariel font, "rn" and
> "m" are virtually indistinguishable except at very large sizes.
I kinda like the idea of naming it after a bratty teenager who rebels
against her father and runs away from home, but normally the font's
called Arial. :)
ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: Unicode 7
On 5/2/2014 9:15 PM, Chris Angelico wrote:
> (My reading of PEP 3131 is that NFKC is used; is that what's
> implemented, or was that a temporary measure and/or something for Py2
> to consider?)
The 3.4 docs say
"The syntax of identifiers in Python is based on the Unicode standard
annex UAX-31, with elaboration and changes as defined below; see also
PEP 3131 for further details."
...
"All identifiers are converted into the normal form NFKC while parsing;
comparison of identifiers is based on NFKC."
Without reading UAX-31, I don't know how much was changed, but I
suspect not much. In any case, the current rules are intended and very
unlikely to change as that would break code going either forward or
back for little purpose.
--
Terry Jan Reedy
--
https://mail.python.org/mailman/listinfo/python-list
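Since the parser normalizes silently, anyone who would rather be told about such spellings could scan the source before it is compiled. The following is a rough sketch, not an existing tool: `non_nfkc_names` is a hypothetical helper, and it relies on the assumption that the standard `tokenize` module reports each name as it is spelled in the source text, before any normalization.

```python
import io
import tokenize
import unicodedata


def non_nfkc_names(source):
    """List (line, spelling, nfkc_spelling) for names that NFKC would change."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.NAME:
            continue
        normalized = unicodedata.normalize("NFKC", tok.string)
        if normalized != tok.string:
            # tok.start is a (row, column) pair; rows are 1-based.
            found.append((tok.start[0], tok.string, normalized))
    return found


# Sample source with the ligature written as an escape so it survives pasting.
sample = "\ufb01ne = 1\nfine = 2\nprint(\ufb01ne)\n"

for line_number, spelling, normalized in non_nfkc_names(sample):
    print(line_number, ascii(spelling), "is parsed as", normalized)
```

A check like this reports only spellings that the parser would rewrite; ASCII lookalikes such as MO0SE pass through it untouched.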
