Re: python backup script
On 06/05/2013 23:12, [email protected] wrote:
> On Monday, May 6, 2013 5:48:44 PM UTC-4, Enrico 'Henryx' Bianchi wrote:
> > Enrico 'Henryx' Bianchi wrote:
> > > cmd2 = subprocess.Popen(['gzip' '-c'],
> > >                         shell=False,
> > >                         stdout=filename)
> > Doh, my fault:
> > cmd2 = subprocess.Popen(['gzip' '-c'],
> >                         shell=False,
> >                         stdout=filename
> >                         stdin=cmd1.stdout)
> > Enrico
> Thank you Enrico. I've just tried your script and got this error:
> stdin=cmd1.stdout)
> ^
> SyntaxError: invalid syntax
> any idea?
Missing comma on the previous line.
--
http://mail.python.org/mailman/listinfo/python-list
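For what it's worth, here is a minimal sketch of chaining two processes with the commas in place. The two commands are hypothetical stand-ins (small Python one-liners rather than the real dump and gzip commands), chosen only so the sketch runs anywhere. It also shows a second pitfall in the quoted snippet: adjacent string literals are concatenated, so ['gzip' '-c'] is a one-element list.

```python
import subprocess
import sys

# Adjacent string literals are concatenated, so a missing comma
# silently produces one argument instead of two.
assert ['gzip' '-c'] == ['gzip-c']

# Hypothetical stand-ins for the real commands: the first process
# prints a line, the second upper-cases whatever it reads on stdin.
producer = [sys.executable, "-c", "print('backup data')"]
consumer = [sys.executable, "-c",
            "import sys; sys.stdout.write(sys.stdin.read().upper())"]

cmd1 = subprocess.Popen(producer, shell=False, stdout=subprocess.PIPE)
cmd2 = subprocess.Popen(consumer, shell=False,
                        stdin=cmd1.stdout,
                        stdout=subprocess.PIPE)
cmd1.stdout.close()  # lets cmd1 receive SIGPIPE if cmd2 exits early
output, _ = cmd2.communicate()
cmd1.wait()
result = output.decode().strip()
```

In the real script the second command would presumably be ['gzip', '-c'] with stdout pointing at the opened backup file.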
Re: python backup script
On 06/05/2013 23:40, MMZ wrote:
On Monday, May 6, 2013 6:12:28 PM UTC-4, Chris Angelico wrote:
On Tue, May 7, 2013 at 5:01 AM, MMZ wrote:
> username = config.get('client', 'mmz')
> password = config.get('client', 'pass1')
> hostname = config.get('client', 'localhost')
Are 'mmz', 'pass1', and 'localhost' the actual values you want for
username, password, and hostname? If so, don't pass them through
config.get() at all - just use them directly. In fact, I'd be inclined
to just stuff them straight into the Database_list_command literal;
that way, it's clear how they're used, and the fact that you aren't
escaping them in any way isn't going to be a problem (tip: an
apostrophe in your password would currently break your script).
It's also worth noting that the ~/ notation is a shell feature. You
may or may not be able to use it in config.read().
ChrisA
Thanks Chris. you are right.
So I used them directly and removed configParser. The new error is:
Traceback (most recent call last):
File "./bbk.py", line 11, in ?
for database in os.popen(database_list_command).readlines():
NameError: name 'database_list_command' is not defined
any idea?
Check the spelling (remember that the name is case-sensitive).
Re: Why sfml does not play the file inside a function in this python code?
On 07/05/2013 10:27, [email protected] wrote:
> from tkinter import *
> import sfml
>
> window = Tk()
> window.minsize( 640, 480 )
>
> def sonido():
>     file = sfml.Music.from_file('poco.ogg')
>     file.play()
>
> test = Button ( window, text = 'Sound test', command=sonido )
> test.place ( x = 10, y = 60)
>
> window.mainloop()
>
> Using Windows 7, Python 3.3, sfml 1.3.0 library, the file is played if I put it outside the function. What am I doing wrong? Thanks.
Perhaps what's happening is that sonido starts playing it and then returns, meaning that there's no longer a reference to it ('file' is local to the function), so it's collected by the garbage collector.
If that's the case, try keeping a reference to it, perhaps by making 'file' global (in a simple program like this one, using global should be OK).
Re: Why sfml does not play the file inside a function in this python code?
On 07/05/2013 14:56, [email protected] wrote:
> On Tuesday, 7 May 2013 12:53:25 UTC+2, MRAB wrote:
> > On 07/05/2013 10:27, [email protected] wrote:
> > > from tkinter import *
> > > import sfml
> > >
> > > window = Tk()
> > > window.minsize( 640, 480 )
> > >
> > > def sonido():
> > >     file = sfml.Music.from_file('poco.ogg')
> > >     file.play()
> > >
> > > test = Button ( window, text = 'Sound test', command=sonido )
> > > test.place ( x = 10, y = 60)
> > >
> > > window.mainloop()
> > >
> > > Using Windows 7, Python 3.3, sfml 1.3.0 library, the file is played if I put it outside the function. What am I doing wrong? Thanks.
> > Perhaps what's happening is that sonido starts playing it and then returns, meaning that there's no longer a reference to it ('file' is local to the function), so it's collected by the garbage collector.
> > If that's the case, try keeping a reference to it, perhaps by making 'file' global (in a simple program like this one, using global should be OK).
> Thanks. A global use of 'sonido' fixes the problem. The garbage collector must be the cause.
> But this code is part of a longer project. What can I do to fix it without using globals? I will use more functions like this, and I would like to keep learning Python as well as good programming methodology. Thanks.
Presumably the details of the window are (or will be) hidden away in a class, so you could make 'file' an attribute of an instance.
Also, please read this:
http://wiki.python.org/moin/GoogleGroupsPython
because gmail insists on adding extra linebreaks, which can be somewhat annoying.
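A minimal sketch of the instance-attribute idea, assuming a class that owns the sound object. FakeMusic and SoundPanel are hypothetical names; FakeMusic merely stands in for something like sfml.Music so the sketch runs without the sfml library or a sound file.

```python
class FakeMusic:
    """Hypothetical stand-in for a sound object such as sfml.Music."""

    def __init__(self, path):
        self.path = path
        self.playing = False

    def play(self):
        self.playing = True


class SoundPanel:
    """Owns the sound object, so it outlives the callback that starts it."""

    def __init__(self):
        self.music = None

    def sonido(self):
        # Stored on the instance rather than in a local variable, so a
        # reference remains after this method returns.
        self.music = FakeMusic('poco.ogg')
        self.music.play()


panel = SoundPanel()
panel.sonido()
```

With tkinter the button would then be created with command=panel.sonido, and the reference lives as long as 'panel' does.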
Re: Making safe file names
On 07/05/2013 20:58, Andrew Berg wrote:
> Currently, I keep Last.fm artist data caches to avoid unnecessary API calls and have been naming the files using the artist name. However, artist names can have characters that are not allowed in file names for most file systems (e.g., C/A/T has forward slashes). Are there any recommended strategies for naming such files while avoiding conflicts (I wouldn't want to run into problems for an artist named C-A-T or CAT, for example)? I'd like to make the files easily identifiable, and there really are no limits on what characters can be in an artist name.
Conflicts won't occur if:
1. All of the characters of the artist's name are mapped to an encoding.
2. Different characters map to different encodings.
3. No encoding is a prefix of another encoding.
In practice, you'll be mapping most characters to themselves.
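One encoding that should satisfy those three conditions is percent-encoding, sketched here with urllib.parse.quote. This is only one possible choice, not something the thread settled on, and safe_name is a hypothetical helper. It does not deal with other file system limits such as maximum name length, reserved names or a name consisting only of dots.

```python
from urllib.parse import quote, unquote


def safe_name(artist):
    """Map an artist name to a file-name-friendly string.

    Letters, digits and '_.-~' map to themselves; everything else,
    including '/' and '%', becomes one or more %XX escapes.
    """
    return quote(artist, safe="")


names = ["C/A/T", "C-A-T", "CAT"]
encoded = [safe_name(n) for n in names]
decoded = [unquote(e) for e in encoded]
```

Because '%' itself is escaped, the mapping can be reversed with unquote, which is a quick way to check that no two names collide.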
Re: MySQL Database
On 08/05/2013 19:52, Kevin Holleran wrote:
Hello,
I want to connect to a MySQL database, query for some records,
manipulate some data, and then update the database.
When I do something like this:
db_c.execute("SELECT a, b FROM Users")
for row in db_c.fetchall():
    (r,d) = row[0].split('|')
    (g,e) = domain.split('.')
    db_c.execute("UPDATE Users SET g = '"+ g + "' WHERE a ='"+ row[0])
Will using db_c to update the database mess up the loop that is cycling
through db_c.fetchall()?
You shouldn't be building an SQL string like that because it's
susceptible to SQL injection. You should be doing it more like this:
db_c.execute("UPDATE Users SET g = %s WHERE a = %s", (g, row[0]))
The values will then be handled safely for you.
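Here is a runnable sketch of the same idea. It uses the standard library's sqlite3 module with an in-memory database purely so it can run without a MySQL server; note that sqlite3 uses '?' placeholders where the MySQL drivers use '%s'. The table contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Users (a TEXT, g TEXT)")
# The apostrophe would break a query built by string concatenation.
cur.execute("INSERT INTO Users (a, g) VALUES (?, ?)",
            ("o'brien|example.com", ""))
conn.commit()

cur.execute("SELECT a, g FROM Users")
# fetchall() returns a list, so all rows are already in memory before
# the cursor is reused for the UPDATE statements inside the loop.
for row in cur.fetchall():
    name, domain = row[0].split('|')
    g, e = domain.split('.')
    cur.execute("UPDATE Users SET g = ? WHERE a = ?", (g, row[0]))
conn.commit()

cur.execute("SELECT g FROM Users")
updated = cur.fetchone()[0]
```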
Re: help on Implementing a list of dicts with no data pattern
On 09/05/2013 00:47, rlelis wrote:
> Hi guys,
> I'm working on this long file, where I have to keep reading and storing different excerpts of text (data) in different variables (lists). Once that's done, I want to store the data I got from those lists in dicts. I want them in a list of dicts for later RDBMS purposes. The data I'm working with doesn't have a fixed pattern (see example below), so for each row I want to store combinations of word/value (key-value) to keep track of all the data.
> My problem is that while iterating over the list (the original one, a.k.a. file_content in the link), I'm nesting several if clauses to match the keys I want. Having done that, I select the keys I want, give them values, and lastly append that dict to a new list. The problem is that I always end up with the last line repeated several times, once for each row it finds.
> Please take a look at what I have now:
> http://pastebin.com/A9eka7p9
You're creating a dict for highway_dict and a dict for aging_dict, and then using those dicts for every iteration of the 'for' loop.
You're also appending both of the dicts onto the 'queue_row' list for every iteration of the 'for' loop.
I think that what you meant to do was, for each match, to create a dict, populate it, and then append it to the list.
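To illustrate the difference, here is a small sketch with made-up input lines (the real data and key names are in the pastebin and are not reproduced here). The first loop reuses one dict, the second creates a new dict per match.

```python
# Hypothetical sample rows, standing in for the real file content.
lines = ["highway 1 speed 90", "aging 2 years 5", "highway 3 speed 70"]

# Reusing one dict: every list entry is the same object, so all of
# them end up showing the values from the last row.
shared = {}
buggy = []
for line in lines:
    parts = line.split()
    shared["kind"] = parts[0]
    shared["id"] = int(parts[1])
    buggy.append(shared)

# Creating a dict inside the loop: each entry keeps its own values.
fixed = []
for line in lines:
    parts = line.split()
    row = {"kind": parts[0], "id": int(parts[1])}
    fixed.append(row)
```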
Re: Urgent:Serial Port Read/Write
On 09/05/2013 16:35, chandan kumar wrote:
Hi all,
I'm new to python and facing issue using serial in python.I'm facing the
below error
ser.write(port,command)
NameError: global name 'ser' is not defined
Please find the attached script and let me know whats wrong in my script
and also how can i read data from serial port for the same script.
Best Regards,
Chandan.
RunScripts.py
import time
import os
import serial
import glob
ER_Address = [[0x0A,0x01,0x08,0x99,0xBB,0xBB,0xBB,0xBB,0xBB,0xBB,0xBB]]
"""
Function Name: RunSequence
Function Description:
-
A RunSequence function has Multiple calls to the RunSuite function, each call
is a single testcase consisting of all the parameters
required by the RunTesSuite.
-
"""
def WriteSerialData(command):
'ser' isn't a local variable (local to this function, that is), nor is
it a global variable (global in this file).
    ser.write(port,command)
def SendPacket(Packet):
    str = chr(len(Packet)) + Packet #Concatenation of Packet with the PacketLength
    print str
    WriteSerialData(str)
def CreateFrame(Edata):
It's more efficient to build a list of the characters and then join
them together into a string in one step than to build the string one
character at a time.
Also, indexing into the list is considered 'unPythonic'; it's much
simpler to do it this way:
    return chr(0x12) + "".join(chr(d) for d in Edata)
    evt = chr(0x12)
    evt = evt + chr(Edata[0])
    for i in range (1, len(Edata)):
        evt = evt + chr(Edata[i])
    return evt
def SetRequest(data):
    print data
    new = []
    new = sum(data, [])
    Addr = CreateFrame(new)
    SendPacket(Addr)
    print "SendPacket Done"
    ReadPacket()
def OpenPort(COMPort,BAUDRATE):
    """
    This function reads the serial port and writes it.
    """
    comport=COMPort
    BaudRate=BAUDRATE
    try:
        ser = serial.Serial(
            port=comport,
            baudrate=BaudRate,
            bytesize=serial.EIGHTBITS,
            parity=serial.PARITY_NONE,
            stopbits=serial.STOPBITS_ONE,
            timeout=10,
            xonxoff=0,
            rtscts=0,
            dsrdtr=0
        )
        if ser.isOpen():
            print "Port Opened"
            ser.write("Chandan")
            string1 = ser.read(8)
            print string1
This function returns either ser ...
            return ser
        else:
            print "Port CLosed"
            ser.close()
... or 3 ...
            return 3
    except serial.serialutil.SerialException:
        print "Exception"
        ser.close()
... or None!
if __name__ == "__main__":
    CurrDir=os.getcwd()
    files = glob.glob('./*pyc')
    for f in files:
        os.remove(f)
OpenPort returns either ser or 3 or None, but the result is just
discarded.
    OpenPort(26,9600)
    SetRequest(ER_Address)
    #SysAPI.SetRequest('ER',ER_Address)
    print "Test Scripts Execution complete"
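Putting those comments together, a sketch of one way to restructure it: build the frame in one step and pass the port object to the function that writes to it, instead of relying on a name that was never defined. FakePort is a hypothetical stand-in for an open serial.Serial object so the sketch runs without hardware, and the function names are illustrative only. The sketch is written for Python 3, where a real pySerial port expects bytes, so the string would need encoding (for example with "latin-1") before writing.

```python
def create_frame(edata):
    """Build the frame in one step: a 0x12 marker followed by the data."""
    return chr(0x12) + "".join(chr(d) for d in edata)


def send_packet(ser, packet):
    """Prefix the packet with its length and write it to the given port."""
    data = chr(len(packet)) + packet
    ser.write(data)
    return data


class FakePort:
    """Hypothetical stand-in for an open serial.Serial object."""

    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)


er_address = [[0x0A, 0x01, 0x08, 0x99]]
flat = sum(er_address, [])   # flatten the nested list, as SetRequest does
frame = create_frame(flat)
port = FakePort()            # the real script would keep OpenPort's result
sent = send_packet(port, frame)
```

The point is that whatever OpenPort returns is kept and handed on, and that it returns one kind of thing (a port, or an exception is raised) rather than ser, 3 or None.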
Re: object.enable() anti-pattern
On 09/05/2013 19:21, Steven D'Aprano wrote:
> On Thu, 09 May 2013 09:07:42 -0400, Roy Smith wrote:
> > In article <[email protected]>, Steven D'Aprano wrote:
> > > There is no sensible use-case for creating a file without opening it.
> > Sure there is. Sometimes just creating the name in the file system is all you want to do. That's why, for example, the unix "touch" command exists.
> Since I neglected to make it clear above that I was still talking about file objects, rather than files on disk, I take responsibility for this misunderstanding. I thought that since I kept talking about file *objects* and *constructors*, people would understand that I was talking about in-memory objects rather than on-disk files. Mea culpa.
> So, let me rephrase that sentence, and hopefully clear up any further misunderstandings.
> There is no sensible use-case for creating a file OBJECT unless it initially wraps an open file pointer.
You might want to do this:
    f = File(path)
    if f.exists():
        ...
This would be an alternative to:
    if os.path.exists(path):
        ...
> This principle doesn't just apply to OOP languages. The standard C I/O library doesn't support creating a file descriptor unless it is a file descriptor to an open file. open() has the semantics:
> "It shall create an open file description that refers to a file and a file descriptor that refers to that open file description."
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html
> and there is no corresponding function to create a *closed* file description. (Because such a thing would be pointless.)
[snip]
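The 'File' class above is hypothetical, but the standard library's pathlib module (added in Python 3.4, after this thread) offers an object that behaves much like it: a Path wraps a name without opening anything. A small sketch, using a temporary directory so it leaves nothing behind:

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    missing = Path(tmp) / "not-there.txt"   # no file is opened or created
    present = Path(tmp) / "there.txt"
    present.write_text("hello")

    missing_exists = missing.exists()
    present_exists = present.exists()
    # Same answer as the os.path form of the check.
    same_answer = present_exists == os.path.exists(str(present))
```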
Re: Unicode humor
On 10/05/2013 17:07, rusi wrote:
> On May 10, 8:32 pm, Chris Angelico wrote:
> > On Sat, May 11, 2013 at 1:24 AM, Ned Batchelder wrote:
> > > On 5/10/2013 11:06 AM, jmfauth wrote:
> > > > On 8 mai, 15:19, Roy Smith wrote:
> > > > > Apropos to any of the myriad unicode threads that have been going on recently:
> > > > > http://xkcd.com/1209/
> > > > This reflects a lack of understanding of Unicode.
> > > > jmf
> > > And this reflects a lack of a sense of humor. :)
> > Isn't that a crime in the UK?
> > ChrisA
> The problem with English humour (as against standard humor) is that its not unicode compliant
British humour includes "double entendre", which is not French-compliant.
Re: Getting ASCII encoding where unicode wanted under Py3k
On 13/05/2013 16:59, Jonathan Hayward wrote:
I have a Py3k script, pasted below. When I run it I get an error about
ASCII codecs that can't handle byte values that are too high.
The error that I am getting is:
UnicodeEncodeError: 'ascii' codec can't encode character '\u0161' in position 1442: ordinal not in range(128)
args = ('ascii', "Content-Type: text/html\n\n\n\n...ype='submit'>\n \n \n", 1442, 1443, 'ordinal not in range(128)')
encoding = 'ascii'
end = 1443
object = "Content-Type: text/html\n\n\n\n...ype='submit'>\n \n \n"
reason = 'ordinal not in range(128)'
start = 1442
with_traceback =
(And that was posted to StackOverflow--one shot in the dark answer so far.)
My code is below. What should I be doing differently to be, in the most
immediate sense, calls to '''%(foo)s''' % locals()?
[snip]
The 'print' function sends its output to sys.stdout, which, in your
case, is set up to encode to ASCII for output, but '\u0161' can't be
encoded to ASCII.
Try encoding to UTF-8 instead:
from codecs import getwriter
sys.stdout = getwriter("utf-8")(sys.stdout.buffer)
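To see what that wrapper does without touching the real sys.stdout, here is a small sketch that writes to an in-memory byte buffer instead. The BytesIO object is just a stand-in for sys.stdout.buffer.

```python
import io
from codecs import getwriter

raw = io.BytesIO()                  # stand-in for sys.stdout.buffer
writer = getwriter("utf-8")(raw)    # text in, UTF-8 encoded bytes out
writer.write("\u0161")
encoded = raw.getvalue()

# The same character cannot be encoded to ASCII, which is the error
# reported in the traceback.
try:
    "\u0161".encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```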
Re: Unicode humor
On 15/05/2013 14:19, Jean-Michel Pichavant wrote:
>> >> This reflects a lack of understanding of Unicode.
>>
>> >> jmf
>>
>> > And this reflects a lack of a sense of humor. :)
>>
>> Isn't that a crime in the UK?
>>
>> ChrisA
>
> The problem with English humour (as against standard humor) is that
> its not unicode compliant
>
British humour includes "double entendre", which is not
French-compliant.
I didn't get that one. Which possibly confirm MRAB's statement.
It's called "double entendre" in English (using French words, from "à
double entente"), but that isn't correct French ("double sens").
Re: Unicode humor
On 15/05/2013 18:04, Jean-Michel Pichavant wrote:
- Original Message -
On 15/05/2013 14:19, Jean-Michel Pichavant wrote:
This reflects a lack of understanding of Unicode.
jmf
And this reflects a lack of a sense of humor. :)
Isn't that a crime in the UK?
ChrisA
The problem with English humour (as against standard humor)
is that its not unicode compliant
British humour includes "double entendre", which is not
French-compliant.
I didn't get that one. Which possibly confirm MRAB's statement.
It's called "double entendre" in English (using French words, from
"à double entente"), but that isn't correct French ("double
sens").
Thanks for clarifying, I didn't know "double entendre" had actually a
meaning in english, it's obviously 2 french words but this is the
first time I see them used together.
Occasionally speakers of one language will borrow a word or phrase from
another language and use it in a way a native speaker wouldn't (or even
understand).
Re: How to write fast into a file in python?
On 19/05/2013 04:53, Carlos Nepomuceno wrote:
> > Date: Sat, 18 May 2013 22:41:32 -0400
> > From: [email protected]
> > To: [email protected]
> > Subject: Re: How to write fast into a file in python?
> > On 05/18/2013 01:00 PM, Carlos Nepomuceno wrote:
> > > Python really writes '\n\r' on Windows. Just check the files.
> > That's backwards. '\r\n' on Windows, IF you omit the b in the mode when creating the file.
> Indeed! My mistake just made me find out that Acorn used that inversion on Acorn MOS.
> According to this[1] (at page 449) the OSNEWL routine outputs '\n\r'.
> What the hell those guys were thinking??? :p
Doing it that way saved a few bytes.
Code was something like this:
FFE3 .OSASCI CMP #&0D
FFE5         BNE OSWRCH
FFE7 .OSNEWL LDA #&0A
FFE9         JSR OSWRCH
FFEC         LDA #&0D
FFEE .OSWRCH ...
This means that the contents of the accumulator would always be preserved by a call to OSASCI.
> "OSNEWL
> This call issues an LF CR (line feed, carriage return) to the currently selected output stream. The routine is entered at &FFE7."
> [1] http://regregex.bbcmicro.net/BPlusUserGuide-1.07.pdf
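On the Python side, the translation can be checked on any platform with a short sketch. It passes newline="\r\n" explicitly to mimic what text mode does by default on Windows, then reads the bytes back; the file name is arbitrary and lives in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")

    # Text mode: '\n' is translated on the way out. newline="\r\n"
    # forces the translation that Windows applies by default.
    with open(path, "w", newline="\r\n") as f:
        f.write("line one\n")
    with open(path, "rb") as f:
        translated = f.read()

    # Binary mode: bytes are written exactly as given.
    with open(path, "wb") as f:
        f.write(b"line one\n")
    with open(path, "rb") as f:
        untouched = f.read()
```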
Re: Accessing Json data (I think I am nearly there) complete beginner
On 23/05/2013 17:09, Andrew Edwards-Adams wrote:
Hey guys
I think its worth stating that I have been trying to code for 1 week.
I am trying to access some Json data. My initial code was the below:
import mechanize
import urllib
import re
def getData():
    post_url = "http://www.tweetnaps.co.uk/leaderboards/leaderboard_json/all_time"
    browser = mechanize.Browser()
    browser.set_handle_robots(False)
    browser.addheaders = [('User-agent', 'Firefox')]
    #These are the parameters you've got from checking with the aforementioned tools
    parameters = {'page' : '1',
                  'rp' : '10',
                  'sortname' : 'total_pl',
                  'sortorder' : 'desc'}
    #Encode the parameters
    data = urllib.urlencode(parameters)
    trans_array = browser.open(post_url,data).read().decode('UTF-8')
    #print trans_array
    myfile = open("test.txt", "w")
    myfile.write(trans_array)
    myfile.close()
getData()
raw_input("Complete")
I was recommended to use the following code to access the Json data directly,
however I cannot get it to return anything. I think the guy that recommended me
this method must have got something wrong? Or perhaps I am simply incompetent:
import mechanize
import urllib
import json
def getData():
    post_url = "http://www.tweetnaps.co.uk/leaderboards/leaderboard_json/current_week"
    browser = mechanize.Browser()
    browser.set_handle_robots(False)
    browser.addheaders = [('User-agent', 'Firefox')]
    #These are the parameters you've got from checking with the aforementioned tools
    parameters = {'page' : '1',
                  'rp' : '50',
                  'sortname' : 'total_pl',
                  'sortorder' : 'desc'
                  }
    #Encode the parameters
    data = urllib.urlencode(parameters)
    trans_array = browser.open(post_url,data).read().decode('UTF-8')
    text1 = json.loads(trans_array)
    print text1['rows'][0]['id'] #play around with these values to access different data..
getData()
He told me to "#play around with these values to access different data.."
really cant get anything out of this, any ideas?
Many thanks AEA
I've just tried it. It prints "1048".
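"Playing around with these values" just means indexing into the nested dicts and lists that json.loads returns. A sketch with a made-up payload: only the 'rows' list and the 'id' key come from the code above, the rest of the structure is an assumption for illustration.

```python
import json

# Hypothetical response text; the real service's fields may differ.
sample = (
    '{"page": 1, "total": 2, "rows": ['
    '{"id": "1048", "cell": {"total_pl": "12.5"}}, '
    '{"id": "2001", "cell": {"total_pl": "9.0"}}]}'
)

data = json.loads(sample)
first_id = data['rows'][0]['id']                 # first row's id
all_ids = [row['id'] for row in data['rows']]    # every row's id
top_level_keys = sorted(data.keys())             # what else is available
```

Printing sorted(data.keys()) and data['rows'][0] on the real response is a quick way to find out which fields are actually there.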
Re: Cutting a deck of cards
On 26/05/2013 18:52, RVic wrote:
> Suppose I have a deck of cards, and I shuffle them
> import random
> cards = []
> decks = 6
> cards = list(range(13 * 4 * decks))
> random.shuffle(cards)
> So now I have an array of cards. I would like to cut these cards at some random point (between 1 and 13 * 4 * decks - 1), moving the lower half of that to the top half of the cards array.
> For some reason, I can't see how this can be done (I know that it must be a simple line or two in Python, but I am really stuck here). Anyone have any direction they can give me on this? Thanks, RVic, python newbie
The list from its start up to, but excluding, index 'i' is cards[ : i], and the list from index 'i' to its end is cards[i : ].
Now concatenate those slices.
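Spelled out as a short sketch, with the cut point chosen in the range given in the question:

```python
import random

decks = 6
cards = list(range(13 * 4 * decks))
random.shuffle(cards)

# Pick a cut point between 1 and len(cards) - 1, inclusive.
i = random.randint(1, len(cards) - 1)

# The part from index i onwards goes on top, followed by the part
# that was in front of it.
cut = cards[i:] + cards[:i]
```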
Re: Encodign issue in Python 3.3.1 (once again)
On 28/05/2013 17:00, Νίκος Γκρ33κ wrote:
> I do not know where to find connections.py Michael.
> But I do not understand, since I am using the following 2 statements, why a unicode error remains.
> #needed line, script does *not* work without it
> sys.stdout = os.fdopen(1, 'w', encoding='utf-8')
> # connect to database
> con = pymysql.connect( db = 'pelatologio', host = 'localhost', user = 'myself', passwd = 'mypass', init_command='SET NAMES UTF8' )
> cur = con.cursor()
> Shall I change the connector from 'pymysql' => 'MySQLdb' ?
A quick look at the documentation tells me that the charset can be specified in the 'connect' call, something like this:
con = pymysql.connect( db = 'pelatologio', host = 'localhost', user = 'myself', passwd = 'mypass', init_command='SET NAMES UTF8', charset = 'utf8' )
Re: The state of pySerial
On 29/05/2013 22:38, Terry Jan Reedy wrote:
> On 5/29/2013 4:00 PM, William Ray Wing wrote:
> > On May 29, 2013, at 2:23 PM, Ma Xiaojun wrote:
> > > Hi, all.
> > > pySerial is probably "the solution" for serial port programming. Physical serial port is dead on PC but USB-to-Serial give it a second life. Serial port stuff won't interest end users at all. But it is still used in the EE world and so on. Arduino uses it to upload programs. Sensors may use serial port to communicate with PC. GSM Modem also uses serial port to communicate with PC.
> > > Unforunately, pySerial project doesn't seem to have a good state. I find pySerial + Python 3.3 broken on my machine (Python 2.7 is OK) . There are unanswered outstanding bugs, PyPI page has 2.6 while SF homepage still gives 2.5.
> > > Any idea?
> > Let me add another vote/request for pySerial support. I've been using it with python 2.7 on OS-X, unaware that there wasn't a path forward to python 3.x. If an external sensor absolutely positively has to be readable, then RS-232 is the only way to go. USB interfaces can and do lock up if recovery from a power failure puts power on the external side before the computer has finished initializing the CPU side. RS-232, bless its primitive heart, could care less.
> Then 'someone' should ask the author his intentions and offer to help or take over.
This page:
http://pyserial.sourceforge.net/pyserial.html#requirements
says:
"Python 2.3 or newer, including Python 3.x"
> I did some RS-232 interfacing in the 1980s, and once past the fiddly start/stop/parity bit, baud rate, and wiring issues, I had a program run connected to multiple machines for years with no more interface problems.
Re: User Input
On 30/05/2013 12:48, Eternaltheft wrote:
> On Thursday, May 30, 2013 7:33:41 PM UTC+8, Eternaltheft wrote:
> > Hi, I'm having trouble oh how prompt the user to enter a file name and how to set up conditions. For example, if there's no file name input by the user, a default is returned
> Thanks for such a fast reply! and no im not using raw input, im just using input.
> does raw_input work on python 3?
In Python 2 it's called "raw_input" and in Python 3 it's called "input".
Python 2 does have a function called "input", but it's not recommended (it's dangerous because it's equivalent to "eval(raw_input())", which will evaluate _whatever_ is entered).
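For the original question (returning a default when nothing is entered), a minimal Python 3 sketch. ask_filename and the default name "output.txt" are hypothetical; the prompt function is passed in as a parameter only so the sketch can be exercised without a terminal, and in normal use it is simply the built-in input.

```python
def ask_filename(default="output.txt", prompt_fn=input):
    """Prompt for a file name; fall back to `default` on empty input."""
    entered = prompt_fn("File name [%s]: " % default).strip()
    # An empty string is falsy, so 'or' selects the default.
    return entered or default


# Simulated answers, so the sketch runs without waiting for the keyboard.
empty_answer = ask_filename(prompt_fn=lambda prompt: "")
typed_answer = ask_filename(prompt_fn=lambda prompt: "  data.csv ")
```

Calling ask_filename() with no arguments prompts the user in the usual way.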
Re: The state of pySerial
On 30/05/2013 02:32, Ma Xiaojun wrote:
> I've already mailed the author, waiting for reply.
> For Windows people, downloading a exe get you pySerial 2.5, which list_ports and miniterm feature seems not included. To use 2.6, download the tar.gz and use standard "setup.py install" to install it (assume you have .py associated) . There is no C compiling involved in the installation process.
> For whether Python 3.3 is supported or not. I observed something like: http://paste.ubuntu.com/5715275/ .
> miniterm works for Python 3.3 at this time.
The problem there is that 'desc' is a bytestring, but the regex pattern can match only a Unicode string (Python 3 doesn't let you mix bytestrings and Unicode strings the way Python 2 does).
The simplest fix would probably be to decode 'desc' to Unicode.
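A small sketch of both the failure and the fix. The description text and the pattern are made up for illustration; they are not taken from pySerial's source or from the paste.

```python
import re

desc = b"USB Serial Port (COM3)"   # hypothetical bytestring description
pattern = r"COM(\d+)"              # a str (Unicode) pattern

# Mixing a str pattern with a bytes subject raises TypeError in Python 3.
try:
    re.search(pattern, desc)
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Decoding first makes both sides Unicode strings.
match = re.search(pattern, desc.decode("ascii", "replace"))
port_number = match.group(1)
```

Which codec to decode with depends on where the bytes come from; "ascii" with "replace" is just a cautious assumption here.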
Re: How clean/elegant is Python's syntax?
On 30/05/2013 19:44, Chris Angelico wrote:
> On Fri, May 31, 2013 at 4:36 AM, Ian Kelly wrote:
> > On Wed, May 29, 2013 at 8:49 PM, rusi wrote:
> > > On May 30, 6:14 am, Ma Xiaojun wrote:
> > > > What interest me is a one liner:
> > > > print '\n'.join(['\t'.join(['%d*%d=%d' % (j,i,i*j) for i in range(1,10)]) for j in range(1,10)])
> > > Ha,Ha! The join method is one of the (for me) ugly features of python.
> > > You can sweep it under the carpet with a one-line join function and then write clean and pretty code:
> > > #joinwith
> > > def joinw(l,sep): return sep.join(l)
> > I don't object to changing the join method (one of the more shoe-horned string methods) back into a function, but to my eyes you've got the arguments backward. It should be:
> > def join(sep, iterable): return sep.join(iterable)
> Trouble is, it makes some sense either way. I often put the larger argument first - for instance, I would write 123412341324*5 rather than the other way around - and in this instance, it hardly seems as clear-cut as you imply. But the function can't be written to take them in either order, because strings are iterable too. (And functions that take args either way around aren't better than those that make a decision.)
An additional argument (pun not intended) for putting sep second is that you can give it a default value:
def join(iterable, sep=""): return sep.join(iterable)
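As a quick sketch of how that version reads in use, here it is applied to the multiplication-table one-liner from earlier in the thread (written for Python 3, so the result is built as a string rather than printed with the print statement):

```python
def join(iterable, sep=""):
    """Function form of str.join, with the separator second and optional."""
    return sep.join(iterable)


plain = join(["a", "b", "c"])        # default separator: nothing
dashed = join(["a", "b", "c"], "-")

# The multiplication table: tabs within a row, newlines between rows.
table = join(
    (join(("%d*%d=%d" % (j, i, i * j) for i in range(1, 10)), "\t")
     for j in range(1, 10)),
    "\n",
)
```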
Re: lstrip problem - beginner question
On 04/06/2013 16:21, mstagliamonte wrote:
Hi everyone,
I am a beginner in python and trying to find my way through... :)
I am writing a script to get numbers from the headers of a text file.
If the header is something like:
h01 = ('>scaffold_1')
I just use:
h01.lstrip('>scaffold_')
and this returns me '1'
But, if the header is:
h02: ('>contig-100_0')
if I use:
h02.lstrip('>contig-100_')
this returns me with: ''
...basically nothing. What surprises me is that if I do in this other way:
h02b = h02.lstrip('>contig-100')
I get h02b = ('_1')
and subsequently:
h02b.lstrip('_')
returns me with: '1' which is what I wanted!
Why is this happening? What am I missing?
The methods 'lstrip', 'rstrip' and 'strip' don't strip a string, they
strip characters.
You should think of the argument as a set of characters to be removed.
This code:
h01.lstrip('>scaffold_')
will return the result of stripping the characters '>', '_', 'a', 'c',
'd', 'f', 'l', 'o' and 's' from the left-hand end of h01.
A simpler example:
>>> 'xyyxyabc'.lstrip('xy')
'abc'
It strips the characters 'x' and 'y' from the string, not the string
'xy' as such.
They are that way because they have been in Python for a long time,
long before sets and such like were added to the language.
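If the aim is to remove a fixed prefix rather than a set of characters, a sketch of two alternatives. remove_prefix is a hypothetical helper name (Python 3.9 later added a str.removeprefix method that does the same job).

```python
h02 = '>contig-100_0'

# Every character of h02, including the final '0', is in the set of
# characters to strip, so nothing is left.
stripped_set = h02.lstrip('>contig-100_')


def remove_prefix(text, prefix):
    """Remove `prefix` from the start of `text` if it is there."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


number = remove_prefix(h02, '>contig-100_')

# Alternatively, take whatever follows the last underscore.
last_field = h02.rsplit('_', 1)[-1]
```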
Re: Changing filenames from Greeklish => Greek (subprocess complain)
On 05/06/2013 06:40, Michael Torrie wrote:
On 06/04/2013 10:15 PM, Νικόλαος Κούρας wrote:
One of my Greek filenames is "Ευχή του Ιησού.mp3". Just a Greek
filename with spaces. Is there a problem when a filename contain both
english and greek letters? Isn't it still a unicode string?
All i did in my CentOS was 'mv "Euxi tou Ihsou.mp3" "Ευχή του
Ιησού.mp3"
and the displayed filename after 'ls -l' returned was:
is -rw-r--r-- 1 nikos nikos 3511233 Jun 4 14:11 \305\365\367\336\
\364\357\365\ \311\347\363\357\375.mp3
There is no way at all to check the charset used to store it in hdd?
It should be UTF-8, but it doesn't look like it. Is there some linxu
command or some python command that will print out the actual
encoding of '\305\365\367\336\ \364\357\365\
\311\347\363\357\375.mp3' ?
I can see that you are starting to understand things. I can't answer
your question (don't know the answer), but you're correct about one
thing. A filename is just a sequence of bytes. We'd hope it would be
utf-8, but it could be anything. Even worse, it's not possible to tell
from a byte stream what encoding it is unless we just try one and see
what happens. Text editors, for example, have to either make a guess
(utf-8 is a good one these days), or ask, or try to read from the first
line of the file using ascii and see if there's a source code character
set command to give it an idea.
From the previous posts I guessed that the filename might be encoded
using ISO-8859-7:
>>> s = b"\305\365\367\336\ \364\357\365\ \311\347\363\357\375.mp3"
>>> s.decode("iso-8859-7")
'Ευχή\\ του\\ Ιησού.mp3'
Yes, that looks the same.
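The "try one and see what happens" approach can be sketched like this, using just the first word of the file name. The list of candidate encodings is an assumption; a successful decode only shows that the bytes are valid in that encoding, not that it is the right one, so the result still has to be checked by eye.

```python
raw_name = b"\305\365\367\336"   # first four bytes of the file name

candidates = ["utf-8", "iso-8859-7"]
results = {}
for enc in candidates:
    try:
        results[enc] = raw_name.decode(enc)
    except UnicodeDecodeError:
        results[enc] = None      # not valid in this encoding

# Once the right encoding is known, re-encoding to UTF-8 is one step.
utf8_name = results["iso-8859-7"].encode("utf-8")
```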
Re: Changing filenames from Greeklish => Greek (subprocess complain)
On 05/06/2013 18:43, Νικόλαος Κούρας wrote:
> On Wednesday, 5 June 2013 8:56:36 AM UTC+3, Steven D'Aprano wrote:
> > Somehow, I don't know how because I didn't see it happen, you have one or more files in that directory where the file name as bytes is invalid when decoded as UTF-8, but your system is set to use UTF-8. So to fix this you need to rename the file using some tool that doesn't care quite so much about encodings. Use the bash command line to rename each file in turn until the problem goes away.
> But renaming via shell access like 'mv 'Euxi tou Ihsou.mp3' 'Ευχή του Ιησου.mp3' leads to that unknown encoding of this bytestream '\305\365\367\336\ \364\357\365\ \311\347\363\357\375.mp3'
> But please tell me Steven what linux tool you think can encode the weird filename to proper 'Ευχή του Ιησου.mp3' utf-8?
> Or we can write a script as I suggested to decode back the bytestream using all sorts of available decode charsets, boiling down to the original greek letters.
Using Python, I think you could get the filenames using os.listdir, passing the directory name as a bytestring so that it'll return the names as bytestrings.
Then, for each name, you could decode from its current encoding and encode to UTF-8 and rename the file, passing the old and new paths to os.rename as bytestrings.
Re: Changing filenames from Greeklish => Greek (subprocess complain)
On 06/06/2013 04:43, Νικόλαος Κούρας wrote:
On Wednesday, 5 June 2013 9:43:18 PM UTC+3, Νικόλαος Κούρας wrote:
> On Wednesday, 5 June 2013 9:32:15 PM UTC+3, MRAB wrote:
> > On 05/06/2013 18:43, Νικόλαος Κούρας wrote:
> > > On Wednesday, 5 June 2013 8:56:36 AM UTC+3, Steven D'Aprano wrote:
> > > Somehow, I don't know how because I didn't see it happen, you have one or
> > > more files in that directory where the file name as bytes is invalid when
> > > decoded as UTF-8, but your system is set to use UTF-8. So to fix this you
> > > need to rename the file using some tool that doesn't care quite so much
> > > about encodings. Use the bash command line to rename each file in turn
> > > until the problem goes away.
> ' leads to that unknown encoding of this bytestream '\305\365\367\336\ \364\357\365\ \311\347\363\357\375.mp3'
> > > But please tell me Steven what linux tool you think can encode the weird filename to proper 'Ευχή του Ιησου.mp3' utf-8?
> > > Or we can write a script as I suggested to decode back the bytestream using all sorts of available decode charsets, boiling down to the original greek letters.
> Actually you were correct, I was typing greek and I saw the filename here in google groups as:
> > > But renaming via shell access like 'mv 'Euxi tou Ihsou.mp3' '���� ��� �����.mp3
> so maybe the filenames have to be decoded to greek-iso, but then again they contain both greek letters and extensions in english chars like '.mp3'
> > Using Python, I think you could get the filenames using os.listdir,
> > passing the directory name as a bytestring so that it'll return the
> > names as bytestrings.
> > Then, for each name, you could decode from its current encoding and
> > encode to UTF-8 and rename the file, passing the old and new paths to
> > os.rename as bytestrings.
> I am not sure I follow:
> Change this:
> # Compute a set of current fullpaths
> fullpaths = set()
> path = "/home/nikos/public_html/data/apps/"
> for root, dirs, files in os.walk(path):
>     for fullpath in files:
>         fullpaths.add( os.path.join(root, fullpath) )
> to what to make the full url readable by files.py?
MRAB can you please explain in more clarity your idea of solution?
I was suggesting a way to rename the files so that their names are
encoded in UTF-8 (they appear to be encoded in ISO-8859-7).

You MUST TEST IT thoroughly first, of course, before trying it on the
actual files.

It could go something like this:

import os

# Give the path as a bytestring so that we'll get the names as bytestrings.
root_folder = b"/home/nikos/public_html/data/apps/"

# Setting TESTING to True will make it print out what renamings it will do, but
# not actually do them.
TESTING = True

# Walk through the files.
for root, dirs, files in os.walk(root_folder):
    for name in files:
        try:
            # Is this name encoded in UTF-8?
            name.decode("utf-8")
        except UnicodeDecodeError:
            # Decoding from UTF-8 failed, which means that the name is not valid
            # UTF-8.

            # It appears (from elsewhere) that the names are encoded in
            # ISO-8859-7, so decode from that and re-encode to UTF-8.
            new_name = name.decode("iso-8859-7").encode("utf-8")

            old_path = os.path.join(root, name)
            new_path = os.path.join(root, new_name)
            if TESTING:
                print("Will rename {!r} to {!r}".format(old_path, new_path))
            else:
                print("Renaming {!r} to {!r}".format(old_path, new_path))
                os.rename(old_path, new_path)
Re: Changing filenames from Greeklish => Greek (subprocess complain)
On 06/06/2013 13:04, Νικόλαος Κούρας wrote:
First of all, thank you for helping me, MRAB.
After making some alterations to your code I have this:

# Give the path as a bytestring so that we'll get the filenames as bytestrings
path = b"/home/nikos/public_html/data/apps/"

# Setting TESTING to True will make it print out what renamings it will do, but not actually do them
TESTING = True

# Walk through the files.
for root, dirs, files in os.walk( path ):
    for filename in files:
        try:
            # Is this name encoded in UTF-8?
            filename.decode('utf-8')
        except UnicodeDecodeError:
            # Decoding from UTF-8 failed, which means that the name is not valid UTF-8
            # It appears that the filenames are encoded in ISO-8859-7, so decode from that and re-encode to UTF-8
            new_filename = filename.decode('iso-8859-7').encode('utf-8')

            old_path = os.path.join(root, filename)
            new_path = os.path.join(root, new_filename)
            if TESTING:
                print( '''Will rename {!r} ---> {!r}'''.format( old_path, new_path ) )
            else:
                print( '''Renaming {!r} ---> {!r}'''.format( old_path, new_path ) )
                os.rename( old_path, new_path )
sys.exit(0)

and the output can be seen here: http://superhost.gr/cgi-bin/files.py

We are in test mode, so I don't know what the encodings will be when the
renaming actually takes place.
Shall I switch off test mode and try it for real?

The first one is '/home/nikos/public_html/data/apps/Ευχή του Ιησού.mp3'.
The second one is '/home/nikos/public_html/data/apps/Σκέψου έναν αριθμό.exe'.
These names are currently encoded in ISO-8859-7, but will be encoded in
UTF-8 if you turn off test mode.
If you're happy for that change to happen, then go ahead.
Re: How to store a variable when a script is executing for next time execution?
On 06/06/2013 16:37, Chris Angelico wrote:
On Thu, Jun 6, 2013 at 10:14 PM, Dave Angel wrote:
If you're planning on having the files densely populated (meaning no gaps in
the numbering), then you could use a binary search to find the last one.
Standard algorithm would converge with 10 existence checks if you have a
limit of 1000 files.
Or, if you can dedicate a directory to those files, you could go even simpler:
dataFile = open('filename0.0.%d.json'%len(os.listdir()), 'w')
The number of files currently existing equals the number of the next file.
Assuming no gaps.
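
The binary search mentioned above could be sketched roughly like this. It is only an illustration: `next_free_index`, its `exists` callback and `file_exists` are made-up names, and it relies on the same assumption as the rest of the thread, namely that the files are numbered from 0 with no gaps.

```python
import os


def next_free_index(exists, limit=1000):
    """Return the lowest index in [0, limit] for which exists(index) is false.

    Assumes dense numbering: if file n exists, so does every file below n.
    """
    low, high = 0, limit
    while low < high:
        middle = (low + high) // 2
        if exists(middle):
            low = middle + 1   # the first gap must be above middle
        else:
            high = middle      # middle itself might be the first gap
    return low


def file_exists(index):
    # Hypothetical naming scheme, borrowed from the one-liner above.
    return os.path.exists('filename0.0.%d.json' % index)
```

Calling next_free_index(file_exists) would then give the number for the next file, using about ten existence checks for a limit of 1000.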
Re: Changing filenames from Greeklish => Greek (subprocess complain)
On 06/06/2013 19:13, Νικόλαος Κούρας wrote:
On Thursday, 6 June 2013 at 3:50:52 PM UTC+3, MRAB wrote:
> If you're happy for that change to happen, then go ahead.
I have made some modifications to the code you provided, but I think something that doesn't occur to me needs fixing.
For example, I switched:

# Give the path as a bytestring so that we'll get the filenames as bytestrings
path = b"/home/nikos/public_html/data/apps/"

# Walk through the files.
for root, dirs, files in os.walk( path ):
    for filename in files:

to:

# Give the path as a bytestring so that we'll get the filenames as bytestrings
path = os.listdir( b'/home/nikos/public_html/data/apps/' )

os.listdir returns a list of the names of the objects in the given
directory.

# iterate over all filenames in the apps directory

Exactly, all the names.

for fullpath in path
    # Grabbing just the filename from path

The name is a bytestring. Note, name, NOT full path.

The following line will fail because the name is a bytestring,
and you can't mix bytestrings with Unicode strings:

    filename = fullpath.replace( '/home/nikos/public_html/data/apps/', '' )
               ^ bytestring
                                 ^ Unicode string
                                                                       ^ Unicode string

I don't know if it has the same effect.
Here is the whole snippet:

=
# Give the path as a bytestring so that we'll get the filenames as bytestrings
path = os.listdir( b'/home/nikos/public_html/data/apps/' )

# iterate over all filenames in the apps directory
for fullpath in path
    # Grabbing just the filename from path
    filename = fullpath.replace( '/home/nikos/public_html/data/apps/', '' )
    try:
        # Is this name encoded in utf-8?
        filename.decode('utf-8')
    except UnicodeDecodeError:
        # Decoding from UTF-8 failed, which means that the name is not valid utf-8
        # It appears that this filename is encoded in greek-iso, so decode from that and re-encode to utf-8
        new_filename = filename.decode('iso-8859-7').encode('utf-8')

        # rename filename from greek bytestream -> utf-8 bytestream
        old_path = os.path.join(root, filename)
        new_path = os.path.join(root, new_filename)
        os.rename( old_path, new_path )

#
# Compute a set of current fullpaths
path = os.listdir( '/home/nikos/public_html/data/apps/' )

# Load'em
for fullpath in path:
    try:
        # Check the presence of a file against the database and insert if it doesn't exist
        cur.execute('''SELECT url FROM files WHERE url = %s''', (fullpath,) )
        data = cur.fetchone()        #URL is unique, so should only be one

        if not data:
            # First time for file; primary key is automatic, hit is defaulted
            cur.execute('''INSERT INTO files (url, host, lastvisit) VALUES (%s, %s, %s)''', (fullpath, host, lastvisit) )
    except pymysql.ProgrammingError as e:
        print( repr(e) )
==
The error is:

[Thu Jun 06 21:10:23 2013] [error] [client 79.103.41.173] File "files.py", line 64
[Thu Jun 06 21:10:23 2013] [error] [client 79.103.41.173] for fullpath in path
[Thu Jun 06 21:10:23 2013] [error] [client 79.103.41.173] ^
[Thu Jun 06 21:10:23 2013] [error] [client 79.103.41.173] SyntaxError: invalid syntax

Doesn't os.listdir( ...) return a list with all filenames?
But then again, when the replacing takes place to shorten the fullpath to just the filename, I think it doesn't work because os.listdir was called with a bytestring and not a string.
What am I doing wrong?

You're changing things without checking what they do!
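
The two points made in this reply (os.listdir gives back names rather than full paths, and bytestrings cannot be mixed with Unicode strings) can be checked with a small experiment. This is only a sketch in a throwaway temporary folder; the file name 'example.mp3' and the variable names are invented for the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    open(os.path.join(folder, 'example.mp3'), 'w').close()

    # A bytes path in gives bytes names out; a str path in gives str names out.
    names_as_bytes = os.listdir(os.fsencode(folder))
    names_as_str = os.listdir(folder)

name = names_as_bytes[0]            # b'example.mp3': just the name, no directory

try:
    name.replace('.mp3', '')        # str arguments on a bytes object
except TypeError as error:
    mixing_error = error            # Python 3 refuses to mix the two types

stem = name.replace(b'.mp3', b'')   # bytes arguments are fine
```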
Re: Changing filenames from Greeklish => Greek (subprocess complain)
On 06/06/2013 22:07, Lele Gaifax wrote:
Νικόλαος Κούρας writes:
Thanks, here is what I have up until now with many corrections.

I'm afraid many more are needed :-)

...
# rename filename from greek bytestreams --> utf-8 bytestreams
old_path = b'/home/nikos/public_html/data/apps/' + b'filename')
new_path = b'/home/nikos/public_html/data/apps/' + b'new_filename')
os.rename( old_path, new_path )

a) there are two syntax errors, you have spurious close brackets there
b) you are basically assigning *constant* expressions to both
   variables, most probably not what you meant

Yet again, he's changed things unnecessarily, and the code was meant
only as a one-time fix to correct the encoding of some filenames. :-(
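
Point b) above is easy to see in isolation. In this small sketch the directory is the one from the thread, while the sample name (the ISO-8859-7 bytes for 'του.mp3') is made up for the illustration:

```python
import os

folder = b'/home/nikos/public_html/data/apps/'
filename = b'\xf4\xef\xf5.mp3'          # a sample name held in a variable

# b'filename' is a literal: the eight letters f-i-l-e-n-a-m-e, whatever
# the variable called filename happens to contain.
constant_path = folder + b'filename'

# Using the variable itself gives the path of the actual file.
real_path = os.path.join(folder, filename)
```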
Re: trigger at TDM/2 only
On 07/06/2013 01:03, cerr wrote:
Hi,
I have a process that I can trigger only at a certain time. Assume I have a TDM
period of 10min, that means, I can only fire my trigger at the 5th minute of
every 10min cycle i.e. at XX:05, XX:15, XX:25... For that I came up with the
following algorithm which only leaves the waiting while loop if minute % TDM/2
is 0 but not if minute % TDM is 0:

min = datetime.datetime.now().timetuple().tm_hour*60 + datetime.datetime.now().timetuple().tm_min
while not (min%tdm_timeslot != 0 ^ min%(int(tdm_timeslot/2)) != 0):
    time.sleep(10)
    logger.debug("WAIT "+str(datetime.datetime.now().timetuple().tm_hour*60 + datetime.datetime.now().timetuple().tm_min))
    logger.debug(str(min%(int(tdm_timeslot/2)))+" - "+str(min%tdm_timeslot))
    min = datetime.datetime.now().timetuple().tm_hour*60 + datetime.datetime.now().timetuple().tm_min
logger.debug("RUN UPDATE CHECK...")

But weird enough, the output I get is something like this:
I would expect my while to exit the loop as soon as the minute turns 1435...
why is it staying in? What am I doing wrong here?

WAIT 1434
3 - 3
WAIT 1434
4 - 4
WAIT 1434
4 - 4
WAIT 1434
4 - 4
WAIT 1434
4 - 4
WAIT 1434
4 - 4
WAIT 1435
4 - 4
WAIT 1435
0 - 5
WAIT 1435
0 - 5
WAIT 1435
0 - 5
WAIT 1435
0 - 5
WAIT 1435
0 - 5
WAIT 1436
0 - 5
RUN UPDATE CHECK...
Possibly it's due to operator precedence. The bitwise operators &, |
and ^ have a higher precedence than comparisons such as !=.
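
To see the precedence effect concretely, here is a small sketch that evaluates the posted condition at minute 1435, next to the grouping Python actually applies. The variable names other than tdm_timeslot are invented, and "what was probably intended" is a guess at the poster's intention.

```python
minute = 1435        # the minute at which the loop was expected to exit
tdm_timeslot = 10
half = tdm_timeslot // 2

# The condition as written in the post (without the leading "not").
as_written = minute % tdm_timeslot != 0 ^ minute % half != 0

# How Python groups it: ^ binds tighter than !=, which leaves a chained
# comparison  a != (0 ^ b) != 0.
middle = 0 ^ (minute % half)
as_parsed = (minute % tdm_timeslot != middle) and (middle != 0)

# What was probably intended: compare first, then exclusive-or the results.
as_intended = (minute % tdm_timeslot != 0) ^ (minute % half != 0)
```

At 1435 the written form is false while the intended form is true, so "while not (...)" keeps looping, which matches the log above.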
A better condition might be:
min % tdm_timeslot != tdm_timeslot // 2
or, better yet, work out how long before the next trigger time and then
sleep until then.
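
A minimal sketch of the "work out how long and sleep until then" idea follows. The function name and its parameters are made up for the illustration, and it assumes the TDM period divides evenly into a day (as 10 minutes does), so that counting from midnight lines up with the XX:05, XX:15, ... marks.

```python
import datetime


def seconds_until_trigger(now, tdm_minutes=10):
    """Seconds from `now` until the next mid-period mark (XX:05, XX:15, ...)."""
    period = tdm_minutes * 60
    offset = period // 2                      # trigger in the middle of each period
    elapsed = (now.hour * 3600 + now.minute * 60 + now.second
               + now.microsecond / 1e6)       # seconds since midnight
    return (offset - elapsed) % period
```

It could then be used as time.sleep(seconds_until_trigger(datetime.datetime.now())); exactly on a mark it returns 0, so the trigger would fire straight away.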
Re: Problems with serial port interface
On 07/06/2013 11:17, [email protected] wrote:
Sorry for my quote, but do you have any suggestion?

On Tuesday, 4 June 2013 at 23:25:21 UTC+2, [email protected] wrote:
Hi, I'm programming in Python for the first time: I want to create a serial
port reader. I'm using Python 3.3 and PyQt4; I'm also using pyserial.
Below is a snippet of the code:

class CReader(QThread):
    def start(self, ser, priority = QThread.InheritPriority):
        self.ser = ser
        QThread.start(self, priority)
        self._isRunning = True
        self.numData=0;

    def run(self):
        print("Enter Creader")
        while True:
            if self._isRunning:
                try:
                    data = self.ser.read(self.numData)
                    n = self.ser.inWaiting()
                    if n:
                        data = self.ser.read(n)
                        self.emit(SIGNAL("newData(QString)"), data.decode('cp1252', 'ignore'))
                        self.ser.flushInput()
                except:
                    pass
            else:
                return

    def stop(self):
        self._isRunning = False
        self.wait()

This code seems to work well, but I have problems in this test case:

+baud rate: 19200
+8/n/1
+data transmitted: 1 byte every 5ms

After 30 seconds (more or less) the program crashes: it seems to be a buffer
problem, but I'm not really sure.
What's wrong?

Using a "bare except" like this:

try:
    ...
except:
    ...

is virtually always a bad idea. The only time I'd ever do that would be,
say, to catch something, print a message, and then re-raise it:

try:
    ...
except:
    print("Something went wrong!")
    raise

Even then, catching Exception would be better than a bare except. A
bare except will catch _every_ exception, including NameError (which
would mean that it can't find a name, possibly due to a spelling error).

A bare except with pass, like you have, is _never_ a good idea. Python
might be trying to complain about a problem, but you're preventing it
from doing so.

Try removing the try...except: pass and let Python tell you if it has a
problem.
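
The difference between the two styles can be shown without any serial hardware. In this sketch, swallow, report and buggy are invented names, and buggy simply contains a misspelled name of the kind mentioned above:

```python
def buggy():
    return undefined_name            # a spelling error: raises NameError


def swallow(func):
    try:
        func()
    except:                          # bare except with pass: hides everything
        pass
    return 'no error reported'


def report(func):
    try:
        func()
    except Exception as error:       # narrower than a bare except
        print('Something went wrong:', repr(error))
        raise                        # re-raise so the traceback is not lost
```

With swallow(buggy) the NameError vanishes silently; with report(buggy) it is printed and then propagates as usual.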
Re: Changing filenames from Greeklish => Greek (subprocess complain)
On 07/06/2013 12:53, Νικόλαος Κούρας wrote:
[snip]
#
# Collect filenames of the path dir as bytes
greek_filenames = os.listdir( b'/home/nikos/public_html/data/apps/' )

for filename in greek_filenames:
    # Compute 'path/to/filename' in bytes
    greek_path = b'/home/nikos/public_html/data/apps/' + b'filename'
    try:

This is a worse way of doing it because the ISO-8859-7 encoding has 1
byte per codepoint, meaning that it's more 'tolerant' (if that's the
word) of errors. A sequence of bytes that is actually UTF-8 can be
decoded as ISO-8859-7, giving gibberish.

UTF-8 is less tolerant, and it's the encoding that ideally you should
be using everywhere, so it's better to assume UTF-8 and, if it fails,
try ISO-8859-7 and then rename so that any names that were ISO-8859-7
will be converted to UTF-8.

That's the reason I did it that way in the code I posted, but, yet
again, you've changed it without understanding why!

        filepath = greek_path.decode('iso-8859-7')

        # Rename current filename from greek bytes --> utf-8 bytes
        os.rename( greek_path, filepath.encode('utf-8') )
    except UnicodeDecodeError:
        # Since its not a greek bytestring then its a proper utf8 bytestring
        filepath = greek_path.decode('utf-8')
[snip]
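
The remark above about ISO-8859-7 being more 'tolerant' than UTF-8 can be tried out on a single word. This is only an illustration using the word 'του' from the file name in this thread; other names may behave differently, since a few byte values are unassigned in ISO-8859-7.

```python
greek_bytes = 'του'.encode('iso-8859-7')   # b'\xf4\xef\xf5', one byte per letter
utf8_bytes = 'του'.encode('utf-8')         # two bytes per letter

# ISO-8859-7 bytes are rejected by the stricter UTF-8 decoder...
try:
    greek_bytes.decode('utf-8')
    rejected = False
except UnicodeDecodeError:
    rejected = True

# ...but these UTF-8 bytes decode "successfully" as ISO-8859-7,
# producing six wrong characters instead of an error.
misread = utf8_bytes.decode('iso-8859-7')
```

That asymmetry is why trying UTF-8 first and falling back to ISO-8859-7 is the safer order.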
Re: Errin when executing a cgi script that sets a cookie in the browser
On 07/06/2013 08:51, Νικόλαος Κούρας wrote:
Finally, no suexec errors any more after chown-ing all log files to
nobody:nobody and their corresponding paths.
Now the error has been transformed to:

[Fri Jun 07 10:48:47 2013] [error] [client 79.103.41.173] (2)No such file or directory: exec of '/home/nikos/public_html/cgi-bin/koukos.py' failed
[Fri Jun 07 10:48:47 2013] [error] [client 79.103.41.173] Premature end of script headers: koukos.py
[Fri Jun 07 10:48:47 2013] [error] [client 79.103.41.173] File does not exist: /home/nikos/public_html/500.shtml

but from the interpreter's view:

[email protected] [~/www/cgi-bin]# python koukos.py
Set-Cookie: nikos=admin; expires=Mon, 02 Jun 2014 07:50:18 GMT; Path=/
Content-type: text/html; charset=utf-8

ΑΠΟ ΔΩ ΚΑΙ ΣΤΟ ΕΞΗΣ ΔΕΝ ΣΕ ΕΙΔΑ, ΔΕΝ ΣΕ ΞΕΡΩ, ΔΕΝ ΣΕ ΑΚΟΥΣΑ! ΘΑ ΕΙΣΑΙ ΠΛΕΟΝ Ο ΑΟΡΑΤΟΣ ΕΠΙΣΚΕΠΤΗΣ!!

(2)No such file or directory: exec of '/home/nikos/public_html/cgi-bin/koukos.py' failed

Can find what? koukos.py is there inside the cgi-bin dir with 755 perms.

It's looking for '/home/nikos/public_html/cgi-bin/koukos.py'.

Have a look in '/home/nikos/public_html/cgi-bin'. Is 'koukos.py' in there?
Re: Changing filenames from Greeklish => Greek (subprocess complain)
On 07/06/2013 20:31, Zero Piraeus wrote:
:

On 7 June 2013 14:52, Νικόλαος Κούρας wrote:
File "/home/nikos/public_html/cgi-bin/files.py", line 81
[Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] if( flag == 'greek' )
[Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] ^
[Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] SyntaxError: invalid syntax
[Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] Premature end of script headers: files.py
---
i dont know why that if statement errors.

Oh for f... READ SOME DOCUMENTATION, FOR THE LOVE OF BOB!!! READ YOUR OWN EFFING CODE!

Look at this:

http://docs.python.org/2/tutorial/controlflow.html

Read it now? Of course not. Go away and read it.

Now have you read it? GO AND READ IT.

What does an if statement end with?

Hint: yep, that's it.

Have you noticed how the line in the traceback doesn't match the line in
the post?
Re: Errin when executing a cgi script that sets a cookie in the browser
On 07/06/2013 19:24, Νικόλαος Κούρας wrote:
On Friday, 7 June 2013 at 5:32:09 PM UTC+3, MRAB wrote:
Can find what? koukos.py is there inside the cgi-bin dir with 755 perms.

It's looking for '/home/nikos/public_html/cgi-bin/koukos.py'.

It's looking for itself?!?!

Have a look in '/home/nikos/public_html/cgi-bin'. Is 'koukos.py' in there?

Yes it is.

[email protected] [~/www/cgi-bin]# ls -l
total 56
drwxr-xr-x 2 nikos nikos   4096 Jun  6 20:29 ./
drwxr-x--- 4 nikos nobody  4096 Jun  5 11:32 ../
-rwxr-xr-x 1 nikos nikos   1199 Apr 25 15:33 convert.py*
-rwxr-xr-x 1 nikos nikos   5434 Jun  7 14:51 files.py*
-rw-r--r-- 1 nikos nikos    170 May 30 15:18 .htaccess
-rwxr-xr-x 1 nikos nikos   1160 Jun  6 06:27 koukos.py*
-rwxr-xr-x 1 nikos nikos   9356 Jun  6 09:13 metrites.py*
-rwxr-xr-x 1 nikos nikos  13512 Jun  6 09:13 pelatologio.py*
[email protected] [~/www/cgi-bin]#

The prompt says "~/www/cgi-bin".

Is that the same as "/home/nikos/public_html/cgi-bin"?

Try:

ls -l /home/nikos/public_html/cgi-bin
Re: Changing filenames from Greeklish => Greek (subprocess complain)
On 08/06/2013 07:49, Νικόλαος Κούρας wrote:
On Saturday, 8 June 2013 at 5:52:22 AM UTC+3, Cameron Simpson wrote:
On 07Jun2013 11:52, Νίκος Γκρ33κ wrote:
| [email protected] [~/www/cgi-bin]# [Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] File "/home/nikos/public_html/cgi-bin/files.py", line 81
| [Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] if( flag == 'greek' )
| [Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] ^
| [Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] SyntaxError: invalid syntax
| [Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] Premature end of script headers: files.py
| ---
| i dont know why that if statement errors.

Python statements that continue (if, while, try etc) end in a colon, so:

Oh, I am very sorry. Oh my God, I can't believe I missed a colon *again*:

I have corrected this:

#
# Collect filenames of the path dir as bytes
filename_bytes = os.listdir( b'/home/nikos/public_html/data/apps/' )

for filename in filename_bytes:
    # Compute 'path/to/filename' into bytes
    filepath_bytes = b'/home/nikos/public_html/data/apps/' + b'filename'
    flag = False

    try:
        # Assume current file is utf8 encoded
        filepath = filepath_bytes.decode('utf-8')
        flag = 'utf8'
    except UnicodeDecodeError:
        try:
            # Since current filename is not utf8 encoded then it has to be greek-iso encoded
            filepath = filepath_bytes.decode('iso-8859-7')
            flag = 'greek'
        except UnicodeDecodeError:
            print( '''I give up! File name is unreadable!''' )

    if flag == 'greek':
        # Rename filename from greek bytes --> utf-8 bytes
        os.rename( filepath_bytes, filepath.encode('utf-8') )
==

Now everything was supposed to work, but instead I am getting this surrogate
error once more.
What is this surrogate thing?

Since I make use of error catching and handling like 'except
UnicodeDecodeError:', then if utf8's decode fails for some reason, it should
leave that file alone and try the next file?

try:
    # Assume current file is utf8 encoded
    filepath = filepath_bytes.decode('utf-8')
    flag = 'utf8'
except UnicodeDecodeError:

This is what it is supposed to do, correct?

==
[Sat Jun 08 09:39:34 2013] [error] [client 79.103.41.173] File "/home/nikos/public_html/cgi-bin/files.py", line 94, in <module>
[Sat Jun 08 09:39:34 2013] [error] [client 79.103.41.173] cur.execute('''SELECT url FROM files WHERE url = %s''', (filename,) )
[Sat Jun 08 09:39:34 2013] [error] [client 79.103.41.173] File "/usr/local/lib/python3.3/site-packages/PyMySQL3-0.5-py3.3.egg/pymysql/cursors.py", line 108, in execute
[Sat Jun 08 09:39:34 2013] [error] [client 79.103.41.173] query = query.encode(charset)
[Sat Jun 08 09:39:34 2013] [error] [client 79.103.41.173] UnicodeEncodeError: 'utf-8' codec can't encode character '\\udcce' in position 35: surrogates not allowed

Look at the traceback. It says that the exception was raised by:

    query = query.encode(charset)

which was called by:

    cur.execute('''SELECT url FROM files WHERE url = %s''', (filename,) )

But what is 'filename'? And what has it to do with the first code snippet?

Does the traceback have _anything_ to do with the first code snippet?
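
As background to the "surrogates not allowed" message: one way a character like '\udcce' can come into existence is the surrogateescape error handler, which Python 3 applies to file names that are not valid in the file system encoding (for example when os.listdir is given a str path). Whether that is what happened in files.py is only a guess; the sketch below just reproduces the error message in isolation, with a made-up file name.

```python
raw_name = b'\xce.mp3'              # a lone 0xCE byte is not valid UTF-8

# surrogateescape smuggles the undecodable byte through as U+DCCE.
as_text = raw_name.decode('utf-8', 'surrogateescape')

# A plain encode refuses to write that lone surrogate back out...
try:
    as_text.encode('utf-8')
    strict_encode_failed = False
except UnicodeEncodeError:
    strict_encode_failed = True

# ...while encoding with the same handler restores the original bytes.
round_trip = as_text.encode('utf-8', 'surrogateescape')
```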
Re: Changing filenames from Greeklish => Greek (subprocess complain)
On 08/06/2013 17:53, Νικόλαος Κούρας wrote:
Sorry for the delay guys, I was busy with other things today and I am still
reading your responses; I still haven't read them all, just Cameron's.

Here is what I have now, following Cameron's advice:

#
# Collect filenames of the path directory as bytes
path = b'/home/nikos/public_html/data/apps/'
filenames_bytes = os.listdir( path )

for filename_bytes in filenames_bytes:
    try:
        filename = filename_bytes.decode('utf-8')
    except UnicodeDecodeError:
        # Since its not a utf8 bytestring then its for sure a greek bytestring

        # Prepare arguments for rename to happen
        utf8_filename = filename_bytes.encode('utf-8')
        greek_filename = filename_bytes.encode('iso-8859-7')

        utf8_path = path + utf8_filename
        greek_path = path + greek_filename

        # Rename current filename from greek bytes --> utf8 bytes
        os.rename( greek_path, utf8_path )
==

I know this is wrong though.

Yet you did it anyway!

Since filename_bytes is the current filename encoded as utf8 or greek-iso,
I cannot just *encode* what is already encoded by doing this:

utf8_filename = filename_bytes.encode('utf-8')
greek_filename = filename_bytes.encode('iso-8859-7')

Try reading and understanding the code I originally posted.
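
For reference, the decode-then-encode step of the originally posted code can be isolated into a small function and tried on sample bytes before touching any real files. The function name and the fallback parameter are invented here; the logic is meant to mirror the earlier code, not to replace it.

```python
def to_utf8_name(name_bytes, fallback='iso-8859-7'):
    """Return name_bytes re-encoded as UTF-8, or unchanged if already UTF-8."""
    try:
        # bytes are *decoded* to text; there is nothing to encode yet.
        name_bytes.decode('utf-8')
    except UnicodeDecodeError:
        # Not valid UTF-8: decode with the fallback, then encode the text.
        return name_bytes.decode(fallback).encode('utf-8')
    return name_bytes
```

In Python 3 a bytes object has no encode method at all, which is why calling filename_bytes.encode(...) cannot work.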
Re: A certainl part of an if() structure never gets executed.
On 11/06/2013 21:20, Νικόλαος Κούρας wrote:
[code]
if not re.search( '=', name ) and not re.search( '=', month ) and not re.search( '=', year ):
    cur.execute( '''SELECT * FROM works WHERE clientsID = (SELECT id FROM clients WHERE name = %s) and MONTH(lastvisit) = %s and YEAR(lastvisit) = %s ORDER BY lastvisit ASC''', (name, month, year) )
elif not re.search( '=', month ) and not re.search( '=', year ):
    cur.execute( '''SELECT * FROM works WHERE MONTH(lastvisit) = %s and YEAR(lastvisit) = %s ORDER BY lastvisit ASC''', (month, year) )
elif not re.search( '=', year ):
    cur.execute( '''SELECT * FROM works WHERE YEAR(lastvisit) = %s ORDER BY lastvisit ASC''', year )
else:
    print('''Πώς να γίνει αναζήτηση αφού δεν επέλεξες ούτε πελάτη ούτε μήνα ή τουλάχιστον το έτος?''')
    print( '' )
    sys.exit(0)

data = cur.fetchall()

hits = money = 0

for row in data:
    hits += 1
    money = money + row[2]
..
..
selects based on either name, month, year or all of them
[/code]

The above if structure works correctly *only* if the user submits by form:
name, month, year
or
month, year
If he just enters a year in the form and submits, then I get no error, but no
results are displayed back.
Any ideas as to why this might happen?

What are the values of 'name', 'month' and 'year' in each of the cases?
Printing out ascii(name), ascii(month) and ascii(year), will be helpful.
Then try stepping through those lines in your head.
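
As a small illustration of why ascii() is suggested rather than a plain print: it makes invisible or non-ASCII characters show up explicitly. The sample values below are invented, not taken from the actual form.

```python
year = '2013\n'                  # a stray newline is invisible when printed
name = 'Νίκος'

visible_year = ascii(year)       # the newline shows up as \n
visible_name = ascii(name)       # non-ASCII letters show up as \u.... escapes

print('year is', visible_year)
print('name is', visible_name)
```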
Re: A certainl part of an if() structure never gets executed.
On 12/06/2013 02:25, [email protected] wrote:
On Wednesday, 12 June 2013 at 1:43:21 AM UTC+3, MRAB wrote:
On 11/06/2013 21:20, Νικόλαος Κούρας wrote:
[snip]
What are the values of 'name', 'month' and 'year' in each of the cases?
Printing out ascii(name), ascii(month) and ascii(year), will be helpful.
Then try stepping through those lines in your head.

I have printed all values of those variables and they are all correct.
I just don't see why it fails to enter the specific if case.
Is there a shorter and clearer way to write this?
I didn't understand what Rick tried to tell me.
Can you help me write it more easily?

What are the values that are printed?
Re: A certainl part of an if() structure never gets executed.
On 12/06/2013 12:17, Νικόλαος Κούρας wrote:
As with most of your problems you are barking up the wrong tree.
Why not use the actual value you get from the form to check whether you
have a valid month?
Do you understand why "0" is submitted instead of "=="?
Bye, Andreas

I have corrected the enumerate loop, but it seems that now the year works
and the selected name and month fail:

if '=' not in ( name and month and year ):
    cur.execute( '''SELECT * FROM works WHERE clientsID = (SELECT id FROM clients WHERE name = %s) and MONTH(lastvisit) = %s and YEAR(lastvisit) = %s ORDER BY lastvisit ASC''', (name, month, year) )
elif '=' not in ( month and year ):
    cur.execute( '''SELECT * FROM works WHERE MONTH(lastvisit) = %s and YEAR(lastvisit) = %s ORDER BY lastvisit ASC''', (month, year) )
elif '=' not in year:
    cur.execute( '''SELECT * FROM works WHERE YEAR(lastvisit) = %s ORDER BY lastvisit ASC''', year )
else:
    print( 'Πώς να γίνει αναζήτηση αφού δεν επέλεξες ούτε πελάτη ούτε μήνα ή τουλάχιστον το έτος?' )
    print( '' )
    sys.exit(0)

I tried 'in', 'not in' and all possible combinations, but somehow it
confuses me.

Doesn't that mean:

if '=' not in ( name and month and year ):

if '=' does not exist as a char inside the name and month and year
variables?

I think it does, but why does it fail then?

You think it does, but you're wrong.
Re: A certainl part of an if() structure never gets executed.
On 12/06/2013 18:13, Νικόλαος Κούρας wrote:
On 12/6/2013 7:40 PM, MRAB wrote:
On 12/06/2013 12:17, Νικόλαος Κούρας wrote:
As with most of your problems you are barking up the wrong tree.
Why not use the actual value you get from the form to check whether you
have a valid month?
Do you understand why "0" is submitted instead of "=="?
Bye, Andreas

I have corrected the enumerate loop, but it seems that now the year works
and the selected name and month fail:

if '=' not in ( name and month and year ):
    cur.execute( '''SELECT * FROM works WHERE clientsID = (SELECT id FROM clients WHERE name = %s) and MONTH(lastvisit) = %s and YEAR(lastvisit) = %s ORDER BY lastvisit ASC''', (name, month, year) )
elif '=' not in ( month and year ):
    cur.execute( '''SELECT * FROM works WHERE MONTH(lastvisit) = %s and YEAR(lastvisit) = %s ORDER BY lastvisit ASC''', (month, year) )
elif '=' not in year:
    cur.execute( '''SELECT * FROM works WHERE YEAR(lastvisit) = %s ORDER BY lastvisit ASC''', year )
else:
    print( 'Πώς να γίνει αναζήτηση αφού δεν επέλεξες ούτε πελάτη ούτε μήνα ή τουλάχιστον το έτος?' )
    print( '' )
    sys.exit(0)

I tried 'in', 'not in' and all possible combinations, but somehow it
confuses me.

Doesn't that mean:

if '=' not in ( name and month and year ):

if '=' does not exist as a char inside the name and month and year
variables?

I think it does, but why does it fail then?

You think it does, but you're wrong.

How would you tell in English words what this is doing?

if '=' not in ( name and month and year ):

In English, the result of:
x and y
is basically:
if bool(x) is false then the result is x, otherwise the result is y
For example:
>>> bool("")
False
>>> "" and "world"
''
>>> bool("Hello")
True
>>> "Hello" and "world"
'world'
and then what this is doing?
if '=' not in ( name or month or year ):
In English, the result of:
x or y
is basically:
if bool(x) is true then the result is x, otherwise the result is y
For example:
>>> bool("")
False
>>> "" or "world"
'world'
>>> bool("Hello")
True
>>> "Hello" or "world"
'Hello'
These can be strung together, so that:
x and y and z
is equivalent to:
(x and y) and z
and:
x or y or z
is equivalent to:
(x or y) or z
and so on, however many times you wish to do it.
I have never before used 'not in' with so many variables in parentheses; up
until now I specified it as not in var 1 and not in var 2 and not in
var 3 and so on.
Keep it simple:
if '=' not in name and '=' not in month and '=' not in year:
There may be a shorter way, but you seem confused enough as it is.
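
To tie the explanation together, here is a short sketch comparing the parenthesised form with the explicit one, plus one possible shorter spelling using all(). The sample values are made up (a month that still holds an '==' placeholder), and all() is just one option, not necessarily the shorter way that was hinted at.

```python
name, month, year = 'nikos', '==', '2013'   # sample form values

# "name and month and year" collapses to a single value: the last one,
# because every string here is non-empty.
combined = name and month and year          # '2013'
misleading = '=' not in combined            # True: month was never examined

# The explicit test looks at each variable in turn.
explicit = '=' not in name and '=' not in month and '=' not in year

# The same test written with all() over the three values.
shorter = all('=' not in value for value in (name, month, year))
```

Here misleading is True even though month contains '=', while explicit and shorter are both False.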
Re: Version Control Software
On 13/06/2013 07:00, cutems93 wrote:
Thank you everyone for such helpful responses! Actually, I have one more
question. Does anybody have experience with closed source version control
software? If so, why did you buy it instead of downloading open source
software? Does closed source VCS have some benefits over open source in
some respect?

I've used Microsoft SourceSafe. I didn't like it (does anyone? :-)).
Re: Eval of expr with 'or' and 'and' within
On 14/06/2013 18:28, Michael Torrie wrote:
On 06/14/2013 10:49 AM, Steven D'Aprano wrote:
Correct. In Python, all boolean expressions are duck-typed: they aren't
restricted to True and False, but to any "true-ish" and "false-ish"
value, or as the Javascript people call them, truthy and falsey values.

There are a couple of anomalies -- the timestamp representing midnight is
falsey, because it is implemented as a zero number of seconds; also
exhausted iterators and generators ought to be considered falsey, since
they are empty, but because they don't know they are empty until called,
they are actually treated as truthy. But otherwise, the model is very
clean.

Good explanation! Definitely enlightened me. Thank you.

The general rule is that an object is true-ish unless it's false-ish
(there are fewer false-ish objects than true-ish objects, e.g. zero vs
non-zero int).
Re: problem uploading docs to pypi
On 14/06/2013 23:53, Irmen de Jong wrote:
Hi,

I'm experiencing some trouble when trying to upload the documentation for
one of my projects on Pypi. I'm getting a Bad Gateway http error message.
Anyone else experiencing this? Is this an intermittent issue or is there a
problem with Pypi?

Downloading documentation (from pythonhosted.org) works fine.

About ten days ago I got the error:

Upload failed (503): backend write error

while trying to upload to PyPI, and it failed the same way the second
time, but worked some time later.
Re: Eval of expr with 'or' and 'and' within
On 15/06/2013 00:06, Nobody wrote:
On Fri, 14 Jun 2013 16:49:11 +, Steven D'Aprano wrote:
Unlike Javascript though, Python's idea of truthy and falsey is actually
quite consistent:

Beyond that, if a user-defined type implements a __nonzero__() method then
it determines whether an instance is true or false. If it implements a
__len__() method, then an instance is true if it has a non-zero length.

It's __nonzero__ in Python 2, __bool__ in Python 3.
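
A short sketch of both hooks, with made-up class names: Basket relies on __len__, while Switch defines __bool__ and, for code that must also run on Python 2, aliases it as __nonzero__.

```python
class Basket:
    """Truthiness falls back to __len__: empty means false."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class Switch:
    """Truthiness decided directly by __bool__ (Python 3)."""

    def __init__(self, on):
        self.on = on

    def __bool__(self):
        return self.on

    __nonzero__ = __bool__   # the Python 2 spelling of the same hook
```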
Re: Fatal Python error: Py_Initialize: can't initialize sys standard streams
On 15/06/2013 23:10, alex23 wrote:
On Jun 16, 7:29 am, [email protected] wrote:
I get this error when I try to save .dxf files in Inkscape:

Fatal Python error: Py_Initialize: can't initialize sys standard streams

Then it seems to recover but it doesn't really recover. It saves the files
and then DraftSite won't open them. Here is what the thing says when
Inkscape tried to fix the saving problem.

What do you mean by "Inkscape tried to fix the saving problem"?

File "D:\Program Files (x86)\Inkscape\python\Lib\encodings\__init__.py", line 123
raise CodecRegistryError,\
^
SyntaxError: invalid syntax

To me that traceback looks like it's Python 3 trying to run code written
for Python 2. Here's a report of a similar issue with Blender (which also
provides a local install of Python under Windows):

http://translate.google.com.au/translate?hl=en&sl=fr&u=http://blenderclan.tuxfamily.org/html/modules/newbb/viewtopic.php%3Ftopic_id%3D36497&prev=/search%3Fq%3Dinkscape%2BCodecRegistryError

(Sorry for the ugly url, it's a Google translation of a French language page)

Do you have a separate installation of Python? It's possible it may be
conflicting. If you rename its folder to something else (which will
temporarily break that install), do you still see this same issue in
Inkscape?
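
The "Python 3 running Python 2 code" diagnosis can be checked without Inkscape: the comma form of raise is valid only in Python 2. The sketch below, with an invented helper called compiles, asks the running interpreter whether each spelling is accepted.

```python
python2_style = 'raise ValueError, "bad value"'     # Python 2 only
python3_style = 'raise ValueError("bad value")'     # works in both


def compiles(source):
    """Return True if the running interpreter accepts the source text."""
    try:
        compile(source, '<example>', 'exec')
    except SyntaxError:
        return False
    return True
```

Under Python 3, compiles(python2_style) is False, which is the same SyntaxError that appears in the traceback above.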
Re: Updating a filename's counter value failed each time
On 17/06/2013 17:39, Simpleton wrote:
Hello again, something simple this time:

After a user selects a file from the form, that selection of his can be
found from reading the variable 'filename'.

If the filename already exists in the database I want to update its
counter, and that is what I'm trying to accomplish by:

---
if form.getvalue('filename'):
    cur.execute('''UPDATE files SET hits = hits + 1, host = %s, lastvisit = %s WHERE url = %s''', (host, lastvisit, filename) )
---

For some reason this never returns any data, because for troubleshooting
I have tried:

-
data = cur.fetchone()

if data:
    print("something been returned out of this")

Since for sure the filename the user selected is represented by a record
inside the 'files' table, why does its corresponding counter never seem to
get updated?

You say "for sure". Really? Then why isn't it working as you expect?
When it comes to debugging, """assumption is the mother of all
-ups""" [insert relevant expletive for ""].
Assume nothing.
What is the value of 'filename'?
What are the entries in the 'files' table?
Print them out, for example:
print("filename is", ascii(filename))
or write them into a log file and then look at them.
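A small sketch of why ascii() is handy here, assuming Python 3. Both values are hypothetical; the point is only that a stray newline or space is invisible in a plain print but obvious in the ascii() form.

```python
# Hypothetical values: one from the form, one as stored in the table.
filename = "report.pdf\n"
stored_url = "report.pdf"

print("filename is", ascii(filename))
print("stored   is", ascii(stored_url))
print("equal?", filename == stored_url)
```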
Re: Updating a filename's counter value failed each time
On 17/06/2013 19:32, Jens Thoms Toerring wrote:
Νίκος wrote:
On 17/6/2013 8:54 PM, Jens Thoms Toerring wrote:
> Also take care to check the filename you insert - a malicious
> user might cobble together a file name that is actually a SQL
> statement and then do nasty things to your database. I.e. never
> insert values you received from a user without checking them.
Yes, in general user input validation is always needed, but here
the filename being selected is from an html table list of filenames.
But I take it you mean that someone might try to pass a bogus
"filename" value from the url like:
http://superhost.gr/cgi-bin/files.py?filename="Select."
Is that what you mean?
Well, you never wrote where this filename is coming from,
so all I could assume was that the user can enter a more
or less random file name. If he only can select one from
a list you put together there's probably less of a problem.
But doesn't the comma inside the execute statement protect me from such
actions, as opposed to when I was using a substitution operator?
> I would guess because you forgot the quotes around string
> values in your SQL statement which thus wasn't executed.
I tried your suggestion:
cur.execute('''UPDATE files SET hits = hits + 1, host = %s, lastvisit =
%s WHERE url = "%s"''', (host, lastvisit, filename) )
seems the same as:
cur.execute('''UPDATE files SET hits = hits + 1, host = %s, lastvisit =
%s WHERE url = %s''', (host, lastvisit, filename) )
Since everything is triple-quoted already, what would the difference be
between "%s" and plain %s?
As I wrote you need *single* quotes around strings in
SQL statements. Double quotes won't do - this is SQL
and not Python so you're dealing with a different
language and thus different rules apply. The triple single
quotes are seen by Python, but SQL needs its own.
The query looks safe to me as he _is_ using a parametrised query.
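To illustrate the parametrised-query point, here is a sketch using the standard library's sqlite3 module as a stand-in. That is an assumption: the thread is about a MySQL driver, which uses %s as its placeholder where sqlite3 uses ?, and the table contents here are made up. The behaviour shown (the driver does the quoting, and a quoted placeholder stops being a placeholder) is how sqlite3 works; other drivers may react differently to the quoted form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE files (url TEXT, hits INTEGER, host TEXT)")
cur.execute("INSERT INTO files VALUES (?, 0, '')", ("o'reilly.pdf",))

# Parametrised: the driver handles quoting, so the apostrophe is harmless.
cur.execute("UPDATE files SET hits = hits + 1, host = ? WHERE url = ?",
            ("example.org", "o'reilly.pdf"))
updated = cur.rowcount

# Quoting the placeholder turns it into a literal string, so sqlite3 sees
# zero placeholders but one supplied value and refuses to run the statement.
try:
    cur.execute("UPDATE files SET hits = hits + 1 WHERE url = '?'",
                ("o'reilly.pdf",))
    quoted_placeholder_accepted = True
except sqlite3.ProgrammingError:
    quoted_placeholder_accepted = False

print(updated, quoted_placeholder_accepted)
```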
Re: Updating a filename's counter value failed each time
On 17/06/2013 21:44, John Gordon wrote:
In Alister writes:
> # update file's counter if cookie does not exist
> cur.execute('''UPDATE files SET hits = hits + 1, host = %s, lastvisit =
> %s WHERE url = %s''', (host, lastvisit, filename) )
>
> if cur.rowcount:
>     print( " database has been affected" )
>
> Indeed, every time I select a filename the message gets printed, but then
> again, checking the database via phpmyadmin, the filename counter
> always remains 0 and is not incremented by 1.
replace
if cur.rowcount:
    print( " database has been affected" )
with print cur.rowcount()
rowcount isn't a method call; it's just an attribute. You don't need
the parentheses.
Well, you do need parentheses, it's just that you need them around the
'print':
if cur.rowcount:
    print(cur.rowcount)
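A sketch of the rowcount point, again with sqlite3 standing in for the MySQL driver used in the thread (an assumption, as is the one-row table). It shows that rowcount is a plain attribute, that calling it fails, and that an UPDATE leaves nothing for fetchone() to return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE files (url TEXT, hits INTEGER)")
cur.execute("INSERT INTO files VALUES ('a.txt', 0)")

cur.execute("UPDATE files SET hits = hits + 1 WHERE url = ?", ("a.txt",))
affected = cur.rowcount   # an attribute, not a method
row = cur.fetchone()      # an UPDATE produces no result set

try:
    cur.rowcount()        # calling an int raises TypeError
    rowcount_is_callable = True
except TypeError:
    rowcount_is_callable = False

# Read the counter back with a SELECT to see the effect of the UPDATE.
hits = conn.execute("SELECT hits FROM files WHERE url = ?",
                    ("a.txt",)).fetchone()[0]
print(affected, row, rowcount_is_callable, hits)
```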
Re: Why is regex so slow?
On 18/06/2013 17:45, Roy Smith wrote:
I've got a 170 MB file I want to search for lines that look like:
[2010-10-20 16:47:50.339229 -04:00] INFO (6): songza.amie.history - ENQUEUEING:
/listen/the-station-one
This code runs in 1.3 seconds:
--
import re
pattern = re.compile(r'ENQUEUEING: /listen/(.*)')
count = 0
for line in open('error.log'):
    m = pattern.search(line)
    if m:
        count += 1
print count
--
If I add a pre-filter before the regex, it runs in 0.78 seconds (about
twice the speed!)
--
import re
pattern = re.compile(r'ENQUEUEING: /listen/(.*)')
count = 0
for line in open('error.log'):
    if 'ENQ' not in line:
        continue
    m = pattern.search(line)
    if m:
        count += 1
print count
--
Every line which contains 'ENQ' also matches the full regex (61425
lines match, out of 2.1 million total). I don't understand why the
first way is so much slower.
Once the regex is compiled, you should have a state machine pattern
matcher. It should be O(n) in the length of the input to figure out
that it doesn't match as far as "ENQ". And that's exactly how long it
should take for "if 'ENQ' not in line" to run as well. Why is doing
twice the work also twice the speed?
I'm running Python 2.7.3 on Ubuntu Precise, x86_64.
I'd be interested in how the 'regex' module
(http://pypi.python.org/pypi/regex) compares. :-)
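For anyone wanting to repeat the comparison without the 170 MB log, here is a rough harness, assuming Python 3. The synthetic lines are invented stand-ins for error.log, so the absolute timings mean little; the sketch only checks that both approaches count the same lines and prints their relative cost.

```python
import re
import timeit

pattern = re.compile(r'ENQUEUEING: /listen/(.*)')

# Synthetic stand-in for error.log: 30 non-matching lines per matching one.
miss = "[2010-10-20 16:47:50] INFO (6): songza.amie.history - nothing here\n"
hit = ("[2010-10-20 16:47:50] INFO (6): songza.amie.history - "
       "ENQUEUEING: /listen/the-station-one\n")
lines = ([miss] * 30 + [hit]) * 200


def count_regex_only():
    """Run the compiled regex on every line."""
    return sum(1 for line in lines if pattern.search(line))


def count_prefiltered():
    """Cheap substring test first; regex only on lines that pass it."""
    return sum(1 for line in lines
               if 'ENQ' in line and pattern.search(line))


print(count_regex_only(), count_prefiltered())
print("regex only :", timeit.timeit(count_regex_only, number=5))
print("prefiltered:", timeit.timeit(count_prefiltered, number=5))
```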
Re: Why is regex so slow?
On 18/06/2013 20:21, Roy Smith wrote:
In article , Mark Lawrence wrote:
Out of curiosity, have you tried the new regex module from PyPI rather
than the stdlib version? A heck of a lot of work has gone into it; see
http://bugs.python.org/issue2636
I just installed that and gave it a shot. It's *slower* (and has much
higher variation from run to run). I'm too exhausted fighting with
OpenOffice to get this into some sane spreadsheet format, so here's the
raw timings:
Built-in re module:
0:01.32 0:01.33 0:01.32 0:01.33 0:01.35 0:01.32 0:01.35 0:01.36 0:01.33 0:01.32
regex with flags=V0:
0:01.66 0:01.53 0:01.51 0:01.47 0:01.81 0:01.58 0:01.78 0:01.57 0:01.64 0:01.60
regex with flags=V1:
0:01.53 0:01.57 0:01.65 0:01.61 0:01.83 0:01.82 0:01.59 0:01.60 0:01.55 0:01.82
I reckon that about 1/3 of that time is spent in
PyArg_ParseTupleAndKeywords, just getting the arguments!
There's a higher initial overhead in using regex than string methods, so
working just a line at a time will take longer.
Re: A few questiosn about encoding
On 20/06/2013 07:26, Steven D'Aprano wrote:
On Wed, 19 Jun 2013 18:46:59 -0700, Rick Johnson wrote:
On Thursday, June 13, 2013 2:11:08 AM UTC-5, Steven D'Aprano wrote:
Gah! That's twice I've screwed that up. Sorry about that!
Yeah, and your difficulty explaining the Unicode implementation
reminds me of a passage from the Python zen:
"If the implementation is hard to explain, it's a bad idea."
The *implementation* is easy to explain. It's the names of the
encodings which I get tangled up in.
You're off by one below!
ASCII: Supports exactly 127 code points, each of which takes up
exactly 7 bits. Each code point represents a character.
128 codepoints.
Latin-1, Latin-2, MacRoman, MacGreek, ISO-8859-7, Big5, Windows-1251,
and about a gazillion other legacy charsets, all of which are mutually
incompatible: supports anything from 127 to 65535 different code
points, usually under 256.
128 to 65536 codepoints.
UCS-2: Supports exactly 65535 code points, each of which takes up
exactly two bytes. That's fewer than required, so it is obsoleted by:
65536 codepoints.
etc.
UTF-16: Supports all 1114111 code points in the Unicode charset, using
a variable-width system where the most popular characters use exactly
two bytes and the remaining ones use a pair of characters.
UCS-4: Supports exactly 4294967295 code points, each of which takes up
exactly four bytes. That is more than needed for the Unicode charset,
so this is obsoleted by:
UTF-32: Supports all 1114111 code points, using exactly four bytes
each. Code points outside of the range 0 through 1114111 inclusive are
an error.
UTF-8: Supports all 1114111 code points, using a variable-width system
where popular ASCII characters require 1 byte, and others use 2, 3 or
4 bytes as needed.
Ignoring the legacy charsets, only UTF-16 is a terribly complicated
implementation, due to the surrogate pairs. But even that is not too
bad. The real complication comes from the interactions between systems
which use different encodings, and that's nothing to do with Unicode.
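The counts and byte widths above can be checked directly. This is a small sketch assuming Python 3.3 or later (where sys.maxunicode is always 0x10FFFF); the three sample characters are arbitrary choices, and the "-le" codec variants are used only so that no byte-order mark is counted.

```python
import sys


def is_ascii_byte(value):
    """True if the single byte decodes as ASCII."""
    try:
        bytes([value]).decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


# Code points run from 0 up to and including the maximum,
# which is where the off-by-one comes from.
ascii_count = sum(is_ascii_byte(v) for v in range(256))
unicode_count = sys.maxunicode + 1
print(ascii_count, unicode_count)

samples = {"ASCII letter": "a", "euro sign": "\u20ac", "astral": "\U0001F600"}
widths = {}
for name, ch in samples.items():
    widths[name] = {enc: len(ch.encode(enc))
                    for enc in ("utf-8", "utf-16-le", "utf-32-le")}
    print(name, widths[name])
```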
Re: A few questiosn about encoding
On 20/06/2013 17:37, Chris Angelico wrote:
On Fri, Jun 21, 2013 at 2:27 AM, wrote:
And all these coding schemes have something in common: they all work
with a unique set of code points, more precisely a unique set of
encoded code points (not the set of implemented code points (byte)).
That is just what the flexible string representation is not doing; it
artificially divides Unicode into subsets and tries to handle each
subset differently.
UTF-16 divides Unicode into two subsets: BMP characters (encoded using
one 16-bit unit) and astral characters (encoded using two 16-bit units
in the D800::/5 netblock, or equivalent thereof). Your beloved narrow
builds are guilty of exactly the same crime as the hated 3.3.
UTF-8 divides Unicode into subsets which are encoded in 1, 2, 3, or 4
bytes, and those who previously used ASCII still need only 1 byte per
codepoint!
Re: Does upgrade from 2.7.3 to 2.7.5 require uninstall?
On 20/06/2013 19:35, Wanderer wrote:
Do I need to uninstall Python 2.7.3 before installing Python 2.7.5?
No.
Re: Default Value
On 21/06/2013 19:26, Rick Johnson wrote: On Friday, June 21, 2013 12:47:56 PM UTC-5, Rotwang wrote: It isn't clear to me from your posts what exactly you're proposing as an alternative to the way Python's default argument binding works. In your version of Python, what exactly would happen when I passed a mutable argument as a default value in a def statement? E.g. this: >>> a = [1, 2, 3] >>> a.append(a) >>> b = object() >>> def f(x = [None, b, [a, [4]]]): ... pass # do something What would you like to see the interpreter do in this case? Ignoring that this is a completely contrived example that has no use in the real world, here are one of three methods by which i can handle this: The Benevolent Approach: I could cast a "virtual net" over my poor lemmings before they jump off the cliff by throwing an exception: Traceback (most recent screw-up last): Line BLAH in SCRIPT def f(x = [None, b, [a, [4]]]): ArgumentError: No mutable default arguments allowed! What about this: def f(x=Foo()): pass # do something Should it raise an exception? Only if a Foo instance is mutable? How do you know whether such an instance is mutable? The Apathetic Approach: I could just assume that a programmer is responsible for the code he writes. If he passes mutables into a function as default arguments, and then mutates the mutable later, too bad, he'll understand the value of writing solid code after a few trips to exception Hell. The Malevolent Approach (disguised as beneva-loon-icy): I could use early binding to confuse the hell out of him and enjoy the laughs with all my ivory tower buddies as he falls into fits of confusion and rage. Then enjoy again when he reads the docs. Ahh, the gift that just keeps on giving! How does the "Apathetic Approach" differ from the "Malevolent Approach"? Conclusion: As you can probably guess the malevolent approach has some nice fringe benefits. 
You know, out of all these post, not one of you guys has presented a valid use-case that will give validity to the existence of this PyWart -- at least not one that CANNOT be reproduced by using my fine examples. All you can muster is some weak argument about protecting the lemmings. Is anyone up the challenge? Does anyone here have any real chops? PS: I won't be holding my breath. Speaking of which, on 11 January 2013, in the thread "PyWart: Import resolution order", you were asked: """Got any demonstrable code for Python 4000 yet?""" and you said: """I am working on it. Stay tuned. Rick is going to rock your little programming world /very/ soon.""" How soon is "/very/ soon" (clearly longer than 5 months), and how did you fix this "PyWart"? -- http://mail.python.org/mailman/listinfo/python-list
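For readers who have not met the behaviour being argued about, here is a minimal sketch of it. The function names are made up. The first function shows the early binding of a mutable default; the second shows the usual None-sentinel idiom for getting a fresh list per call.

```python
def append_shared(item, bucket=[]):
    """The default list is created once, when the def statement runs."""
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    """Common idiom: None as a sentinel, a new list built on each call."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)
print(first is second, second)    # the same list object, mutated twice

fresh_one = append_fresh(1)
fresh_two = append_fresh(2)
print(fresh_one, fresh_two)       # two independent lists
```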
Re: Default Value
On 21/06/2013 21:44, Rick Johnson wrote: On Friday, June 21, 2013 2:25:49 PM UTC-5, MRAB wrote: On 21/06/2013 19:26, Rick Johnson wrote: > > The Apathetic Approach: > > I could just assume that a programmer is responsible for the > code he writes. If he passes mutables into a function as > default arguments, and then mutates the mutable later, too > bad, he'll understand the value of writing solid code after > a few trips to exception Hell. > > The Malevolent Approach (disguised as beneva-loon-icy): > > I could use early binding to confuse the hell out of him and > enjoy the laughs with all my ivory tower buddies as he falls > into fits of confusion and rage. Then enjoy again when he > reads the docs. Ahh, the gift that just keeps on giving! How does the "Apathetic Approach" differ from the "Malevolent Approach"? In the apathetic approach i allow the programmer to be the sole proprietor of his own misfortunes. He lives by the sword, and thus, he can die by the sword. Alternatively the malevolent approach injects misfortunes for the programmer on the behalf of esoteric rules. In this case he will live by sword, and he could die by the sword, or he could be unexpectedly blown to pieces by a supersonic Howitzer shell. It's an Explicit death versus an Implicit death; and Explicit should ALWAYS win! The only way to strike a reasonable balance between the explicit death and implicit death is to throw up a warning: "INCOMING" Which in Python would be the "MutableArgumentWarning". *school-bell* I notice that you've omitted any mention of how you'd know that the argument was mutable. -- http://mail.python.org/mailman/listinfo/python-list
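On the question of how the interpreter could know that a default is mutable: the sketch below tries one naive heuristic, hashability. That heuristic is purely an assumption for illustration (nobody in the thread proposed it), and the example shows it giving the wrong answer in both directions. Foo is a made-up class.

```python
class Foo:
    """An ordinary user-defined class with no special methods."""


def is_hashable(obj):
    """Naive stand-in for 'is immutable'; shown here to be unreliable."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


foo = Foo()
foo.x = 1                       # instances are mutable...
foo_hashable = is_hashable(foo) # ...yet hashable by default

nested = ([1, 2],)              # a tuple cannot be changed itself...
nested_hashable = is_hashable(nested)  # ...but is unhashable here
nested[0].append(3)             # and its contents can still be mutated

print(foo_hashable, nested_hashable, nested)
```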
Re: Default Value
On 22/06/2013 00:51, Rick Johnson wrote: On Friday, June 21, 2013 5:49:51 PM UTC-5, MRAB wrote: I notice that you've omitted any mention of how you'd know that the argument was mutable. My argument has always been that mutables should not be passed into subroutines as default arguments because bad things can happen. And Python's excuse of saving the poor dummies is no excuse. It does not matter if we are passing the arguments into the current implementation of "python functions which maintain state of default mutables arguments between successive calls" or in a more desirable system of truly "stateless subroutines". I also believe that a programmer should not be prevented from passing mutable default arguments, but if he does, I'm not going to provide any sort of protection -- other than possibly throwing up a warning message. So, having mutables as default arguments is a bad idea, but a programmer should not be prevented from doing that, and a warning message should be printed on such occasions. Now, YOU, and everyone else, cannot destroy the main points of my argument because the points are in fact rock solid, however, what you will do is to focus in one small detail, one little tiny (perceived) weakness in the armor, and you will proceed to destroy that small detail (in this case how i will determine mutability), and hope that the destruction of this insignificant detail will start a chain-reaction that will propagate out and bring down my entire position. In order to print a warning, Python needs to know whether the object is mutable, so it's an important detail. So you want me to tell you how to query the mutability of an object... Ha Ha Ha! Sorry, but that's not going to happen! It's a detail that you're not going to help to solve. Why should i help the developers of this language. What have they done for me? They've developed this language, and provided it for free. They've even released the source code. 
You perceive flaws that you say must be fixed, but you're not going to help to fix them. WOULD YOU OFFER ASSISTANCE TO PEOPLE THAT HAVE TREATED YOU THIS WAY? And let's just be honest. You don't want my assistance. You just want me to fumble the ball. Then you can use that fumble as an excuse to write me off. Nice try! I _do_ want you to help to improve the language, and I don't care if you don't get it right first time. I didn't get it right first time when I worked on the regex module (I think that what I have on PyPI is my _third_ attempt!). You want to gain my respect? Then start engaging in honest debates. Start admitting that yes, somethings about Python are not only undesirable, they're just plain wrong. Python isn't perfect, but then no language is perfect. There will always be compromises, and the need to maintain backwards compatibility means that we're stuck with some "mis-features", but I think it's still worth using; I still much prefer it to other languages. Stop calling me a troll when i am not. And not just me, stop calling other people trolls too! Stop using the personal attacks and straw man arguments. ??? Finally, get the core devs to realize that this list matters and they need to participate (including you know who!) Everyone is a volunteer. The core devs contribute by developing the language, and whether they participate in this particular list is entirely up to them; how they choose to spend _their own_ free time is, again, entirely up to them. -- http://mail.python.org/mailman/listinfo/python-list
Re: Default Value
On 22/06/2013 02:40, Chris Angelico wrote: On Sat, Jun 22, 2013 at 11:31 AM, Steven D'Aprano wrote: Thinking about this, I think that the only safe thing to do in Rickython 4000 is to prohibit putting mutable objects inside tuples. Putting a list or a dict inside a tuple is just a bug waiting to happen! I think you're onto something here, but you really haven't gone far enough. Mutable objects *anywhere* are a problem. The solution? Abolish mutable objects. Strings (bytes and Unicode), integers, decimals (floats are a problem to many people), tuples of the above, and dictionaries mapping any of the above to any other of the above, should be enough to do everything. Pure functional languages don't have mutables, or even variables, but then we're not talking about a pure functional language, we're talking about Python. -- http://mail.python.org/mailman/listinfo/python-list
Re: Default Value
On 22/06/2013 03:32, Rick Johnson wrote: On Friday, June 21, 2013 8:54:50 PM UTC-5, MRAB wrote: On 22/06/2013 00:51, Rick Johnson wrote: > On Friday, June 21, 2013 5:49:51 PM UTC-5, MRAB wrote: > My argument has always been that mutables should not be > passed into subroutines as default arguments because bad > things can happen. [...] I also believe that a programmer > should not be prevented from passing mutable default > arguments [...] So, having mutables as default arguments is a bad idea, but a programmer should not be prevented from doing that, and a warning message should be printed on such occasions. Well i'll admit that does sound like a contradiction. Basically i meant, programmers should be *discouraged* from passing mutables as default arguments but not *prevented*. Of course, utilizing a stateless subroutine like i suggest, argument mutability would not matter. Sometimes when you're passionate about something your explanations become so verbose as to render your idea lost in the noise. Obviously i made that mistake here :) Yes, a more measured explanation tends to work better. :-) In my last reply to Rotwang i explained the functionality i seek to achieve in a set of three interactive examples. Take a look at those and let me know what you think. Hmm. Like they say, "The devil's in the details". As with the mutability thing, I need to think about it some more. Sometimes it seems straight-forward, until you try to do it! :-) > Why should i help the developers of this language. What have > they done for me? They've developed this language, and provided it for free. They've even released the source code. You perceive flaws that you say must be fixed, but you're not going to help to fix them. Agreed. And i am thankful for everyone's contributions. I can be a bit harsh sometimes but my intention has always been to improve Python. I _do_ want you to help to improve the language, and I don't care if you don't get it right first time. 
I didn't get it right first time when I worked on the regex module (I think that what I have on PyPI is my _third_ attempt!). Well thanks for admitting you are not perfect. I know i am not. We all had to start somewhere and anyone who believes he knows everything is most assuredly a fool. Learning is a perpetual process, same for software evolution. > You want to gain my respect? Then start engaging in honest > debates. Start admitting that yes, somethings about Python > are not only undesirable, they're just plain wrong. Python isn't perfect, but then no language is perfect. There will always be compromises, and the need to maintain backwards compatibility means that we're stuck with some "mis-features", but I think it's still worth using; I still much prefer it to other languages. I understand. We can't break backwards compatibility for everything, even breaking it for some large flaws could cause a fatal abandonment of the language by long time users. I just don't understand why i get so much hostility when i present the flaws for discussion. Part of my intention is to air the flaw, both for new users and old users, but a larger intention is to discover the validity of my, or others, possible solutions. The problem is in _how_ you do it, namely, very confrontationally. You call yourself "RantingRick". People don't like ranting! Instead of saying "This is obviously a flaw, and you're a fool if you don't agree", you should say "IMHO, this is a flaw, and this is how I think it could be fixed". Then, if someone points out a problem in your suggested fix, you can say "OK, I see your point, I'll try to see whether I can think of a way around that". Etc. And even if that solution involves a fork, that is not a bad thing. Creating a new fork and then garnering an acceptance of the new spinoff would lead to at worse, a waste of time and a huge learning experience, or at best, an evolution of the language. > Stop calling me a troll when i am not. 
And not just me, stop > calling other people trolls too! Stop using the personal > attacks and straw man arguments. Sorry. I failed to explain that this statement was meant not directly for you but as a general statement to all members. Sometimes i feel like my back is against the wall and i'm fighting several foes at once. That can lead to me getting defensive. -- http://mail.python.org/mailman/listinfo/python-list
Re: n00b question on spacing
On 23/06/2013 00:56, Dave Angel wrote:
On 06/22/2013 07:37 PM, Chris Angelico wrote:
On Sun, Jun 23, 2013 at 9:28 AM, Dave Angel wrote:
On 06/22/2013 07:12 PM, Chris Angelico wrote:
On Sun, Jun 23, 2013 at 1:24 AM, Rick Johnson
wrote:
_fmtstr = "Item wrote to MongoDB database {0}, {1}"
msg = _fmtstr.format(_arg1, _arg2)
As a general rule, I don't like separating format strings and their
arguments. That's one of the more annoying costs of i18n. Keep them in
a single expression if you possibly can.
On the contrary, i18n should be done with config files. The format string
**as specified in the physical program**
is the key to the actual string which is located in the file/dict.
Otherwise you're shipping separate source files for each language -- blecch.
What I was trying to say is that the programmereze format string in the
code is replaced at runtime by the French format string in the config file.
The simplest way to translate is to localize the format string; that's
the point of .format()'s named argument system (since it lets you
localize in a way that reorders the placeholders). What that does is
it puts the format string away in a config file, while the replaceable
parts are here in the source. That's why I say that's a cost of i18n -
it's a penalty that has to be paid in order to move text strings away.
Certainly the reorderability of the format string is significant. Not
only can it be reordered, but more than one instance of some of the
values is permissible if needed. (What's missing is a decent handling
of such things as singular/plural, where you want a different version
per country of one (or a few) words from the format string, based on
whether a value is exactly 1.)
[snip]
One vs not-one isn't good enough. Some languages use the singular with
any numbers ending in '1'. Some languages have singular, dual, and
plural. Etc. It's surprising how inventive people can be! :-)
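A sketch of both points, assuming Python 3 and no translation catalogue installed. The two format strings are hypothetical catalogue entries. gettext.ngettext is the standard library hook for plural selection; with no catalogue it falls back to the English one/not-one rule, while a real catalogue can define as many plural forms as the language needs.

```python
import gettext

# Hypothetical catalogue entries; a real project would load them from
# translation files rather than hard-code them here.
english = "Item written to database {db}, collection {coll}"
reordered = "Collection {coll} in database {db} received the item"

# Named placeholders let a translation reorder (or repeat) the values.
first = english.format(db="shop", coll="orders")
second = reordered.format(db="shop", coll="orders")
print(first)
print(second)

# Plural selection is delegated to gettext instead of an "n == 1" test.
messages = []
for n in (0, 1, 2):
    template = gettext.ngettext("{n} item written", "{n} items written", n)
    messages.append(template.format(n=n))
print(messages)
```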
Re: Looking for a name for a deployment framework...
On 24/06/2013 13:50, Roy Smith wrote: In article <[email protected]>, [email protected] wrote: Hi all, Any suggestions for a good name, for a framework that does automatic server deployments? It's like Fabric, but more powerful. It has some similarities with Puppet, Chef and Saltstack, but is written in Python. Key points are that it uses Python, but is still very declarative and supports introspection. It supports parallel deployments, and interactivity. And it has a nice commandline shell with autocompletion for traversing the deployment tree. The repository: https://github.com/jonathanslenders/python-deployer/tree/refactoring-a-lot-v2 Suggestions welcome :) Jonathan Without forming any opinion on the software itself, the best advice I can offer is that naming puns are very popular. If you're thinking of this as a fabric replacement, I would go with cloth, textile, material, gabardine, etc. Snakeskin? Oh, I see that's already taken. :-( -- http://mail.python.org/mailman/listinfo/python-list
Re: [SPAM] Re: Default Value
On 24/06/2013 15:22, Grant Edwards wrote:
On 2013-06-22, Ian Kelly wrote:
On Fri, Jun 21, 2013 at 7:15 PM, Steven D'Aprano
wrote:
On Fri, 21 Jun 2013 23:49:51 +0100, MRAB wrote:
On 21/06/2013 21:44, Rick Johnson wrote:
[...]
Which in Python would be the "MutableArgumentWarning".
*school-bell*
I notice that you've omitted any mention of how you'd know that the
argument was mutable.
That's easy. Just call ismutable(arg). The implementation of ismutable is
just an implementation detail, somebody else can work that out. A
language designer of the sheer genius of Rick can hardly be expected to
worry himself about such trivial details.
While we're at it, I would like to petition for a function
terminates(f, args) that I can use to determine whether a function
will terminate before I actually call it.
I think it should be terminate_time() -- so you can also find out how
long it's going to run. It can return None if it's not going to
terminate...
Surely that should be float("inf")! Anything else would be ridiculous!
:-)
Re: Is this PEP-able? fwhile
On 24/06/2013 23:35, Chris Angelico wrote:
On Tue, Jun 25, 2013 at 8:30 AM, Tim Chase wrote:
On 2013-06-25 07:38, Chris Angelico wrote:
Python has no issues with breaking out of loops, and even has syntax
specifically to complement it (the 'else:' clause). Use break/continue
when appropriate.
from minor_gripes import breaking_out_of_nested_loops_to_top_level
True. There are times I do wish for a 'goto'. But if goto were
implemented, I would also use it for jumping _into_ loops, and I'm not
sure that's going to make the feature popular :)
I think a better way would be to label the outer loop somehow and then
break out of it by name.
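Python has no labelled break, so for reference here are two common workarounds, sketched with a made-up grid. The first wraps the loops in a function and uses return; the second uses the for/else clause mentioned above.

```python
def find_cell(grid, target):
    """Return (row, col) of the first cell equal to target, else None."""
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value == target:
                return r, c   # leaves both loops at once
    return None


grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
position = find_cell(grid, 5)

# Same idea without a function, using for/else.
found = None
for r, row in enumerate(grid):
    for c, value in enumerate(row):
        if value == 8:
            found = (r, c)
            break
    else:
        continue   # inner loop ended without break: try the next row
    break          # inner loop broke out: leave the outer loop as well

print(position, found)
```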
Re: Python development tools
On 25/06/2013 03:24, rusi wrote: On Tuesday, June 25, 2013 4:41:22 AM UTC+5:30, Ben Finney wrote: rusi writes: > I dont however think that the two philosophies are the same. See > http://www.tcl.tk/doc/scripting.html That essay constrasts “scripting” versus “system programming”, a useful (though terminologically confusing) distinction. It's a mistake to think that essay contrasts “scripting“ versus “programming”. But the essay never justifies its aversion to “programming” as a term for what it's describing, so that mistake is easy to make. The essay is 15 years old. So a bit dated. Referred to it as it conveys the sense/philosophy of scripting. > On Monday, June 24, 2013 11:50:38 AM UTC+5:30, Ben Finney wrote: > > Any time someone has shown me a “Python script”, I don't see how > > it's different from what I'd call a “Python program”. So I just > > mentally replace “scripting with “programming”. > > If you are saying that python spans the scripting to programming > spectrum exceptionally well, I agree. I'm saying that “scripting” is a complete subset of “programming”, so it's nonsense to talk about “the scripting-to-programming spectrum”. Scripting is, always, programming. Scripts are, always, programs. (But not vice-versa; I do acknowledge there is more to programming than scripting.) I say this because anything anyone has said to me about the former is always something included already by the latter. So I don't see much need for treating scripts as somehow distinct from programs, or scripting as somehow distinct from programming. Whenever you're doing the former, you're doing the latter by definition. My personal associations with the word 'scripting' - Cavalier attitude towards efficiency And convenience for the programmer. """Manipulating long texts using variable-length strings? 
Yes, I know it's inefficient, but it's still faster than doing it by hand!""" - No interest (and maybe some scorn) towards over-engineering (hence OOP) - Heavy use of regular expressions, also sophistication of the command-line args - A sense (maybe vague) of being glue more than computation, eg. a bash script is almost certain to invoke something other than builtins alone and is more likely to invoke a non-bash script than a bash script. For a C program that likelihood is the other way round. For python it could be either Automating tasks, e.g. controlling other applications and stringing together tasks that you would otherwise be doing by hand. -- http://mail.python.org/mailman/listinfo/python-list
Re: io module and pdf question
On 25/06/2013 17:15, [email protected] wrote:
Thank you Rusi and Christian!
So it sounds like I should read the pdf data in as binary:
import os
pdfPath = '~/Desktop/test.pdf'
colorlistData = ''
with open(os.path.expanduser(pdfPath), 'rb') as f:
    for i in f:
        if 'XYZ:colorList' in i:
            colorlistData = i.split('XYZ:colorList')[1]
            break
print(colorlistData)
This gives me the error:
TypeError: Type str doesn't support the buffer API
I admit I know nothing about binary, except it's ones and zeroes. Is
there a way to read it in as binary, convert it to ascii/unicode, and
then somehow split it by newline characters so that I can pull the
appropriate metadata lines out? For example,
XYZ:colorList="DarkBlue,Yellow"
In Python 2, string literals like '' are by default bytestrings. If you
want a Unicode string you need to add the prefix u, so u''.
In Python 3, string literals like '' are by default Unicode. If you want
a bytestring you need to add the prefix b, so b''.
Python 2 was lax when mixing bytestrings with Unicode strings. Python 3,
on the other hand, insists that you know the difference: is it text
(Unicode) or binary data (bytestring)?
Thanks!
Jay
--
Most of the PDF objects are therefore not encoded. It is, however,
possible to include a PDF into another PDF and to encode it, but that's
a rare case. Therefore the metadata can usually be read in text mode.
However, to correctly find all objects, the xref-table indexes offsets
into the PDF. It must be treated binary in any case, and that's the
funny reason for the first 3 characters of the PDF - they must include
characters with the 8th bit set, such that FTP applications treat it as
binary.
Christian
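A sketch of the bytes-versus-text point applied to the code above, assuming Python 3. The raw bytes are an invented stand-in for the PDF file, io.BytesIO takes the place of open(path, 'rb'), and decoding the value as ASCII is an assumption about this particular metadata line.

```python
import io

# Hypothetical stand-in for the raw bytes of the PDF file.
raw = (b'%PDF-1.4\n%\xe2\xe3\xcf\xd3\n'
       b'XYZ:colorList="DarkBlue,Yellow"\nendobj\n')

color_list = ''
with io.BytesIO(raw) as f:   # behaves like a file opened with 'rb'
    for line in f:
        # Compare bytes with bytes: note the b'' prefix on the literals.
        if b'XYZ:colorList' in line:
            value = line.split(b'XYZ:colorList=')[1]
            # Decode only the piece that is known to be text.
            color_list = value.decode('ascii').strip().strip('"')
            break

print(color_list)
```

The file stays binary throughout; only the matched value is converted to a str.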
Re: Parsing soap/xml result
On 25/06/2013 23:28, miguel olivares varela wrote: I try to parse a soap/xml answer like: http://schemas.xmlsoap.org/soap/envelope/"; xmlns:xsd="http://www.w3.org/2001/XMLSchema"; xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";> http://schemas.xmlsoap.org/soap/encoding/"; xmlns:ns1="http://192.168.2.135:8490/gift-ws/services/SRV_GIFT_PKG";> http://schemas.xmlsoap.org/soap/encoding/";> xsi:type="xsd:string">0 xsi:type="xsd:string">OK xsi:type="xsd:string"> [snip] The XML contains this: That's the problem. -- http://mail.python.org/mailman/listinfo/python-list
Re: re.finditer() skips unicode into selection
On 26/06/2013 20:18, [email protected] wrote:
I am using the following Highlighter class for Spell Checking to work on my QTextEdit.
class Highlighter(QSyntaxHighlighter):
In Python 2.7, the re module has a somewhat limited idea of what a "word"
character is. It recognises 'DEVANAGARI LETTER NA' as a letter, but
'DEVANAGARI VOWEL SIGN E' as a diacritic. The pattern ur'(?u)\w+' will
therefore split "नेपाली" into 3 parts.
    pattern = ur'\w+'
    def __init__(self, *args):
        QSyntaxHighlighter.__init__(self, *args)
        self.dict = None
    def setDict(self, dict):
        self.dict = dict
    def highlightBlock(self, text):
        if not self.dict:
            return
        text = unicode(text)
        format = QTextCharFormat()
        format.setUnderlineColor(Qt.red)
        format.setUnderlineStyle(QTextCharFormat.SpellCheckUnderline)
The LOCALE flag is for locale-sensitive 1-byte per character
bytestrings. It's rarely useful.
The UNICODE flag is for dealing with Unicode strings, which is what you
need here. You shouldn't be using both at the same time!
        unicode_pattern=re.compile(self.pattern,re.UNICODE|re.LOCALE)
        for word_object in unicode_pattern.finditer(text):
            if not self.dict.spell(word_object.group()):
                print word_object.group()
                self.setFormat(word_object.start(), word_object.end() - word_object.start(), format)
But whenever I pass unicode values into my QTextEdit the re.finditer() does not seem to collect it.
When I pass "I am a नेपाली" into the QTextEdit. The output is like this:
I I I a I am I am I am a I am a I am a I am a I am a I am a I am a I am a
It is completely ignoring the unicode. What might be the issue. I am new to PyQt and regex. I'm using Python 2.7 and PyQt4.
There's an alternative regex implementation at:
http://pypi.python.org/pypi/regex
It's a drop-in replacement for the re module, but with a lot of
additions, including better handling of Unicode.
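A small sketch of why the word gets broken up, written for Python 3 (where every str is Unicode, so the ur'' prefix and the UNICODE flag are not needed). It only uses the standard unicodedata and re modules; as far as I know the stdlib re still treats the vowel signs as non-word characters, but the printed split is not guaranteed here.

```python
import re
import unicodedata

word = "\u0928\u0947\u092a\u093e\u0932\u0940"  # the word "नेपाली"

# Letters have a category starting with "L"; vowel signs are marks ("M").
categories = [unicodedata.category(ch) for ch in word]
print(categories)

# \w in the standard re module is based on letters and digits, so the
# marks are expected to act as separators between the pieces.
parts = re.findall(r"\w+", word)
print(len(parts), "part(s)")
```

The third-party regex module mentioned above is the suggested way round this; it is not used in the sketch so that nothing beyond the standard library is required.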
Re: Devnagari Unicode Conversion Issues
On 27/06/2013 16:05, darpan6aya wrote:
How can i convert text of the following type
नेपाली
into devnagari unicode in Python 2.7?
Is that a bytestring? In other words, is its type 'str'?
If so, you need to decode it. That particular string is UTF-8:
>>> print "नेपाली".decode("utf-8")
नेपाली
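For comparison, a Python 3 sketch of the same decoding step. In Python 3 a plain '' literal is already Unicode, so the bytes are written out explicitly here; they are the UTF-8 encoding of the word above.

```python
# UTF-8 bytes for the six code points of the word.
raw = b"\xe0\xa4\xa8\xe0\xa5\x87\xe0\xa4\xaa\xe0\xa4\xbe\xe0\xa4\xb2\xe0\xa5\x80"

text = raw.decode("utf-8")   # bytes -> str
print(text)
print(len(raw), "bytes ->", len(text), "code points")
```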
Re: Why is the argparse module so inflexible?
On 29/06/2013 06:28, Steven D'Aprano wrote: On Fri, 28 Jun 2013 18:36:37 -0700, Ethan Furman wrote: On 06/27/2013 03:49 PM, Steven D'Aprano wrote: [rant] I think it is lousy design for a framework like argparse to raise a custom ArgumentError in one part of the code, only to catch it elsewhere and call sys.exit. At the very least, that OUGHT TO BE A CONFIG OPTION, and OFF BY DEFAULT. [emphasis added] Libraries should not call sys.exit, or raise SystemExit. Whether to quit or not is not the library's decision to make, that decision belongs to the application layer. Yes, the application could always catch SystemExit, but it shouldn't have to. So a library that is explicitly designed to make command-line scripts easier and friendlier should quit with a traceback? Really? Yes, really. [snip] +1 It's the job of argparse to parse the arguments. What should happen if they're invalid is for its caller to decide. -- http://mail.python.org/mailman/listinfo/python-list
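One way an application can take that decision back today is to subclass the parser and override its error() method, which is the hook argparse calls before exiting. This is only a sketch: ArgumentParsingError and RaisingParser are names invented for the example, not part of argparse.

```python
import argparse

class ArgumentParsingError(Exception):
    """Hypothetical application-level exception for bad arguments."""

class RaisingParser(argparse.ArgumentParser):
    def error(self, message):
        # ArgumentParser.error() normally prints usage and exits;
        # raising instead hands the decision back to the caller.
        raise ArgumentParsingError(message)

parser = RaisingParser(prog="demo")
parser.add_argument("--count", type=int)

try:
    parser.parse_args(["--count", "many"])
except ArgumentParsingError as exc:
    outcome = "rejected: %s" % exc
else:
    outcome = "accepted"
print(outcome)
```

Catching SystemExit around parse_args() also works, as mentioned above, but it loses the error message.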
Re: MeCab UTF-8 Decoding Problem
On 29/06/2013 12:29, [email protected] wrote:
Hi,
I am trying to use a program called MeCab, which does syntax analysis on
Japanese text. The problem I am having is that it returns a byte string and
if I try to print it, it prints question marks for almost all characters.
However, if I try to use .decode, it throws an error. Here is my code:
#!/usr/bin/python
# -*- coding:utf-8 -*-
import MeCab
tagger = MeCab.Tagger("-Owakati")
This is a bytestring. Are you sure it shouldn't be a Unicode string
instead, i.e. u'MeCabで遊んでみよう!'?
text = 'MeCabで遊んでみよう!'
result = tagger.parse(text)
print result
result = result.decode('utf-8')
print result
And here is the output:
MeCab �� �� ��んで�� �� ��う!
Traceback (most recent call last):
File "test.py", line 11, in <module>
result = result.decode('utf-8')
File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 6-7: invalid continuation byte
--
(program exited with code: 1)
Press return to continue
Also my terminal is able to display Japanese characters properly. For
example print '日本語' works perfectly fine.
Any ideas?
Re: math functions with non numeric args
On 30/06/2013 19:53, Andrew Berg wrote:
On 2013.06.30 13:46, Andrew Z wrote:
Hello,
print max(-10, 10)
10
print max('-10', 10)
-10
My guess is that max converts the string to a number by decoding each of the
characters to its ASCII equivalent?
Where can I read more on exactly how situations like these are dealt with?
This behavior is fixed in Python 3:
max('10', 10)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: int() > str()
Python is strongly typed, so it shouldn't magically convert something from one
type to another.
Explicit is better than implicit.
It doesn't magically convert anyway.
In Python 2, comparing objects of different types like that gives a
consistent but arbitrary result: in this case, bytestrings ('str') are
greater than integers ('int'):
>>> max('-10', 10)
'-10'
>>> max('10', -10)
'10'
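Since Python 3 refuses the mixed comparison, the conversion has to be spelled out. A short sketch of two ways to do that; the list values are just the ones from the question.

```python
values = ['-10', 10]

# Convert everything to int first, then compare numbers with numbers.
largest = max(int(v) for v in values)
print(largest)

# key=int compares numerically but returns the original item unchanged.
print(repr(max(values, key=int)))
```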
Re: socket data sending problem
On 03/07/2013 23:38, [email protected] wrote:
I'm trying to do a simple socket test program for a school project using the
socket module, but I'm having difficulty sending data between the client and
host program.
So far all tutorials and examples have used something along the lines of:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
host = socket.gethostname()
port = 12345
s.connect((host, port))
and received it on the server end with:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
host = ''
port = 12345
s.bind((host, port))
s.listen(1)
conn, addr = s.accept()
print ('client is at', addr)
data = conn.recv(5)
print(data)
It all works fine, except when I try to use:
s.send("hello")
to send data between the client and server. I just get this error message:
>>>
Traceback (most recent call last):
File "C:/Users/Ollie/Documents/code/chatroom/client3.py", line 9, in <module>
s.send("hello")
TypeError: 'str' does not support the buffer interface
>>>
If anyone can either show me what I'm doing wrong, what this means and what's
causing it, or even better how to fix it, it would be greatly appreciated.
You didn't say which version of Python you're using, but I think that
you're using Python 3.
A socket handles bytes, not Unicode strings, so you need to encode the
Unicode strings to bytes before sending, and decode the bytes to
Unicode strings after receiving.
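A minimal Python 3 sketch of the encode-before-send, decode-after-receive step. To keep it runnable on its own, socket.socketpair() stands in for the separate client and server programs; UTF-8 is an assumed choice of encoding, and both ends simply have to agree on it.

```python
import socket

# socketpair() gives two already-connected sockets, standing in for the
# client and server ends so the sketch runs without a network.
client, server = socket.socketpair()
try:
    client.sendall("hello".encode("utf-8"))   # str -> bytes before sending
    data = server.recv(5)                      # bytes come out the other end
    message = data.decode("utf-8")             # bytes -> str after receiving
finally:
    client.close()
    server.close()
print(repr(data), repr(message))
```

In the original client, the equivalent change would be s.send("hello".encode("utf-8")) or simply s.send(b"hello").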
Re: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb6 in position 0: invalid start byte
On 04/07/2013 11:38, Νίκος wrote:
On 4/7/2013 12:50 PM, Ulrich Eckhardt wrote:
On 04.07.2013 10:37, Νίκος wrote:
I just started to have this error without changing nothing
Well, undo the nothing that you didn't change. ;)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb6 in position 0:
invalid start byte
[Thu Jul 04 11:35:14 2013] [error] [client 108.162.229.97] Premature end
of script headers: metrites.py
Why cant it decode the starting byte? what starting byte is that?
It's the 0xb6 but it's expecting the starting byte of a UTF-8 sequence.
Please do some research on UTF-8, that should clear it up. You could
also search for common causes of that error.
So you are also suggesting that what gethostbyaddr() returns is not
utf-8 encoded either?
What character is 0xb6 anyways?
Well, it's from a bytestring, so you'll have to specify what encoding
you're using! (It clearly isn't UTF-8.)
If it's ISO-8859-7 (what you've previously referred to as "greek-iso"),
then:
>>> import unicodedata
>>> unicodedata.name(b"\xb6".decode("ISO-8859-7"))
'GREEK CAPITAL LETTER ALPHA WITH TONOS'
You'll need to find out where that bytestring is coming from.
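A short Python 3 sketch of why the decoder objects to 0xb6 in first position: its top two bits are 10, which UTF-8 reserves for continuation bytes, so it can never start a sequence.

```python
byte = 0xB6
print(format(byte, "08b"))   # bit pattern of the byte

# Continuation bytes in UTF-8 always look like 10xxxxxx.
is_continuation = (byte & 0xC0) == 0x80
print("continuation byte:", is_continuation)

try:
    b"\xb6".decode("utf-8")
except UnicodeDecodeError as exc:
    reason = exc.reason
print(reason)
```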
Re: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb6 in position 0: invalid start byte
On 04/07/2013 12:29, Νίκος wrote:
On 4/7/2013 1:54 PM, Chris Angelico wrote:
On Thu, Jul 4, 2013 at 8:38 PM, Νίκος wrote:
So you are also suggesting that what gethostbyaddr() returns is not
utf-8 encoded either?
What character is 0xb6 anyways?
It isn't. It's a byte. Bytes are not characters.
http://www.joelonsoftware.com/articles/Unicode.html
Well in case of utf-8 encoding for the first 127 codepoints we can safely
say that a character equals a byte :)
Equals? No. Bytes are not characters. (Strictly speaking, they're
codepoints, not characters.)
And anyway, it's the first _128_ codepoints.
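The 128 figure is easy to check in Python 3. This sketch encodes a range of code points and keeps those that come out as a single byte; the upper bound 0x250 is arbitrary, chosen only to go well past the ASCII range.

```python
# Code points whose UTF-8 encoding is exactly one byte.
single = [cp for cp in range(0x250) if len(chr(cp).encode("utf-8")) == 1]
print(len(single), single[0], single[-1])

# GREEK CAPITAL LETTER ALPHA WITH TONOS needs more than one byte in UTF-8.
print(len("\u0386".encode("utf-8")))
```

Even for those 128 code points the byte and the code point are different kinds of object; they merely share a numeric value.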
Re: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb6 in position 0: invalid start byte
On 04/07/2013 12:36, Νίκος wrote:
On 4/7/2013 2:06 PM, MRAB wrote:
[snip]
Right.
But nowhere in my script (metrites.py) do I use an 'Ά', so I really have no
clue where this is coming from.
And you are right: if it was a byte that came from a utf-8 encoding scheme
then it would be automatically decoded.
The only thing I can say for sure is that this problem appears only when I
cloudflare my domain "superhost.gr".
If I un-cloudflare it, it ceases to display errors.
Can you tell me how to write the following properly:
host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0] or 'UnResolved'
so that even if the function fails, "UnResolved" is returned?
Somehow I need to capture the error.
Or doesn't it have to, and the "or" operand will be returned?
If gethostbyaddr fails, it raises socket.gaierror, (which, from Python
3.3 onwards, is a subclass of OSError), so try catching that, setting
'host' to 'UnResolved' if it's raised.
Also, try printing out ascii(os.environ['REMOTE_ADDR']).
Re: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb6 in position 0: invalid start byte
On 04/07/2013 13:52, Νίκος wrote:
On 4/7/2013 3:07 PM, MRAB wrote:
Also, try printing out ascii(os.environ['REMOTE_ADDR']).
'108.162.229.97' is the result of:
print( ascii(os.environ['REMOTE_ADDR']) )
Seems perfectly valid, and it also has a PTR record, so that leaves us
clueless about the internal server error.
For me, socket.gethostbyaddr('108.162.229.97') raises socket.herror,
which is also a subclass of OSError from Python 3.3 onwards.
Re: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb6 in position 0: invalid start byte
On 04/07/2013 13:47, Νίκος wrote:
On 4/7/2013 3:07 PM, MRAB wrote:
[snip]
I have followed your suggestion by trying this:
try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except socket.gaierror:
    host = "UnResolved"
and then re-cloudflared the "superhost.gr" domain.
http://superhost.gr/ gives an internal server error.
Try catching OSError instead. (As I said, from Python 3.3,
socket.gaierror is a subclass of it.)
Re: Important features for editors
On 04/07/2013 14:22, Tim Chase wrote: On 2013-07-04 05:02, Dave Angel wrote: [snip an excellent list of things to look for in an editor] Also, - the ability to perform changes in bulk, especially across files. Often, this is done with the ability to record/playback macros, though some editors have multiple insertion/edit cursors; others allow for performing a bulk-change command across the entire file or list of files. - folding (the ability to collapse multiple lines of text down to one line). Especially if there are various ways to do it (manual folding, language-block folding, folding by indentation) - multiple clipboard buffers/registers - multiple bookmarks - the ability to interact with external programs (piping a portion of a file through an external utility) - a good community around it in case you have questions - easy navigation to "important" things in your file (where "important" may vary based on file-type, but may include function definitions, paragraph boundaries, matching paren/bracket/brace/tag, etc) Other nice-to-haves include - split window editing - tabbed windows - Unicode support (including various encodings) It's 2013, yet Unicode support is merely a "nice-to-have"? - vimgolf.com ;-) Candidates? emacs - standard on most OS's, available for Windows from And I'll put in a plug for Vim. -- http://mail.python.org/mailman/listinfo/python-list
Re: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb6 in position 0: invalid start byte
On 04/07/2013 14:38, Νίκος Γκρ33κ wrote:
On 4/7/2013 4:34 PM, MRAB wrote:
[snip]
At least CloudFlare doesn't give me issues
if I try this:
try:
    host = os.environ['REMOTE_ADDR'][0]
except socket.gaierror:
    host = "UnResolved"
It's pointless trying to catch a socket exception here because you're
not using a socket, you're just getting a string from an environment
variable.
Then I get no errors and a valid IP back,
but the above fails.
I don't know how to catch the exception with OSError.
I know only these two:
except socket.gaierror:
except socket.herror
Both fail.
What do you mean "I don't know how to catch the exception with
OSError"? You've tried "except socket.gaierror" and "except
socket.herror", well just write "except OSError" instead!
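Putting that suggestion together as a Python 3.3+ sketch. The function name resolve_host and the lookup parameter are invented for the example; the parameter exists only so a failing stand-in can be passed in and the sketch runs without DNS.

```python
import socket

def resolve_host(address, lookup=socket.gethostbyaddr):
    """Return the host name for address, or 'UnResolved' if lookup fails."""
    try:
        return lookup(address)[0]
    except OSError:
        # From Python 3.3, socket.herror and socket.gaierror are both
        # subclasses of OSError, so this one clause covers them.
        return "UnResolved"

def failing_lookup(address):
    # Hypothetical stand-in that always fails, so no network is needed.
    raise socket.herror(1, "Unknown host")

print(resolve_host("108.162.229.97", lookup=failing_lookup))
```

In the CGI script this would be used as host = resolve_host(os.environ['REMOTE_ADDR']). Note that it only covers lookup failures; an exception of some other type would still propagate.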
Re: How to make this faster
On 05/07/2013 16:17, Helmut Jarausch wrote:
On Fri, 05 Jul 2013 15:45:25 +0100, Oscar Benjamin wrote:
Presumably then you're now down to the innermost loop as a bottle-neck:
Possibilities= 0
for d in range(1,10) :
    if Row_Digits[r,d] or Col_Digits[c,d] or Sqr_Digits[Sq_No,d] : continue
    Possibilities+= 1
If you make it so that e.g. Row_Digits[r] is a set of indices rather
than a list of bools then you can do this with something like
Possibilities = len(Row_Digits[r] | Col_Digits[c] | Sqr_Digits[Sq_No])
or perhaps
Possibilities = len(set.union(Row_Digits[r], Col_Digits[c],
Sqr_Digits[Sq_No]))
which I would expect to be a little faster than looping over range
since the loop is then performed under the hood by the builtin
set-type.
It just takes practice.
indeed
It's a little less obvious in Python than in
low-level languages where the bottlenecks will be and which operations
are faster/slower but optimisation always involves a certain amount of
trial and error anyway.
Oscar
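A small sketch of the set idea, with made-up contents for one cell. One thing to watch: the union holds the digits already used, so the number of possibilities is 9 minus its size, which is how find_good_cell below computes it.

```python
# Hypothetical state for one empty cell: digits already used in its
# row, column and 3x3 square, held as sets.
row_used = {1, 4, 7}
col_used = {2, 4}
sqr_used = {7, 9}

# Loop version: count the digits that are not used anywhere.
loop_count = 0
for d in range(1, 10):
    if d in row_used or d in col_used or d in sqr_used:
        continue
    loop_count += 1

# Set version: the union holds every used digit, so the free ones are
# whatever is left out of 9.
set_count = 9 - len(row_used | col_used | sqr_used)
print(loop_count, set_count)
```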
I've tried the following version
def find_good_cell() :
    Best= None
    minPoss= 10
    for r,c in Grid :
        if Grid[(r,c)] > 0 : continue
        Sq_No= (r//3)*3+c//3
        Possibilities= 9-len(Row_Digits[r] | Col_Digits[c] | Sqr_Digits[Sq_No])
        if ( Possibilities < minPoss ) :
            minPoss= Possibilities
            Best= (r,c)
    if minPoss == 0 : Best=(-1,-1)
    return Best
All_digits= set((1,2,3,4,5,6,7,8,9))
def Solve(R_Cells) :
    if R_Cells == 0 :
        print("\n\n++ S o l u t i o n ++\n")
        Print_Grid()
        return True
    r,c= find_good_cell()
    if r < 0 : return False
    Sq_No= (r//3)*3+c//3
    for d in All_digits - (Row_Digits[r] | Col_Digits[c] | Sqr_Digits[Sq_No]) :
        # put d into Grid
        Grid[(r,c)]= d
        Row_Digits[r].add(d)
        Col_Digits[c].add(d)
        Sqr_Digits[Sq_No].add(d)
        Success= Solve(R_Cells-1)
        # remove d again
        Grid[(r,c)]= 0
        Row_Digits[r].remove(d)
        Col_Digits[c].remove(d)
        Sqr_Digits[Sq_No].remove(d)
        if Success :
            Zuege.append((d,r,c))
            return True
    return False
which turns out to be as fast as the previous "dictionary only version".
Probably, set.remove is a bit slow
For comparison, here's my solution:
from collections import Counter
problem = '''
_
_3_85
__1_2
___5_7___
__4___1__
_9___
5__73
__2_1
4___9
'''
# Build the grid.
digits = "123456789"
grid = []
for row in problem.splitlines():
    if not row:
        continue
    new_row = []
    for cell in row:
        if cell.isdigit():
            new_row.append({cell})
        else:
            new_row.append(set(digits))
    grid.append(new_row)
# Solve the grid.
changed = True
while changed:
    changed = False
    # Look for cells that contain only one digit.
    for r in range(9):
        for c in range(9):
            if len(grid[r][c]) == 1:
                digit = list(grid[r][c])[0]
                # Remove from other cells in same row.
                for c2 in range(9):
                    if c2 != c and digit in grid[r][c2]:
                        grid[r][c2].remove(digit)
                        changed = True
                # Remove from other cells in same column.
                for r2 in range(9):
                    if r2 != r and digit in grid[r2][c]:
                        grid[r2][c].remove(digit)
                        changed = True
                # Remove from other cells in the same block of 9.
                start_row = r - r % 3
                start_column = c - c % 3
                for r2 in range(start_row, start_row + 3):
                    for c2 in range(start_column, start_column + 3):
                        if (r2, c2) != (r, c) and digit in grid[r2][c2]:
                            grid[r2][c2].remove(digit)
                            changed = True
    # Look for digits that occur in only one cell in a row.
    for r in range(9):
        counts = Counter()
        for c in range(9):
            counts += Counter(grid[r][c])
        unique = {digit for digit, times in counts.items() if times == 1}
        for c in range(9):
            if len(grid[r][c]) > 1 and len(grid[r][c] & unique) == 1:
                grid[r][c] &= unique
                changed = True
    # Look for digits that occur in only one cell in a column.
    for c in range(9):
        counts = Counter()
        for r in range(9):
            counts += Counter(grid[r][c])
        unique = {digit for digit, times in counts.items() if times == 1}
        for r in range(9):
            if len(grid[r][c]) > 1 and len(grid[r][c] & unique) == 1:
                grid[r][c] &= unique
                changed = True
    # Look for digits that occur in only one cell in a block of 9.
    for start_row in range(0, 9, 3):
        for start_column in range(0, 9, 3):
            counts = Counter()
            for r in range(start_row, start_row + 3):
                for c in range(start_column, start_column + 3):
                    counts += Counter(grid[r][c])
            unique = {digit for digit, times in counts.items() if times == 1}
            for r in range(start_row, start_row + 3):
                for c in range(start_column, start_column + 3):
                    if len(grid[r][c]) > 1 and len(grid[r][c] & unique) == 1:
                        grid[r][c] &= unique
Re: hex dump w/ or w/out utf-8 chars
On 08/07/2013 21:56, Dave Angel wrote:
On 07/08/2013 01:53 PM, [email protected] wrote:
Hi Steven,
thank you for your reply... I really needed another python guru which is
also an English teacher! Sorry if English is not my mother tongue...
"uncorrect" instead of "incorrect" (I misapplied the "similarity
principle" like "unpleasant...>...uncorrect").
Apart from these trifles, you said:
All characters are UTF-8, characters. "a" is a UTF-8 character. So is "ă".
Not using python 3, for me (a programmer which was present at the beginning
of computer science, badly interacting with many languages from assembler
to Fortran and from c to Pascal and so on) it was an hard job to arrange
the abrupt transition from characters only equal to bytes to some special
characters defined with 2, 3 bytes and even more.
Characters do not have a width.
[snip]
It depends what you mean by "width"! :-)
Try this (Python 3):
>>> print("A\N{FULLWIDTH LATIN CAPITAL LETTER A}")
AＡ
--
http://mail.python.org/mailman/listinfo/python-list
Re: hex dump w/ or w/out utf-8 chars
On 08/07/2013 23:02, Joshua Landau wrote:
On 8 July 2013 22:38, MRAB wrote:
On 08/07/2013 21:56, Dave Angel wrote:
Characters do not have a width.
[snip]
It depends what you mean by "width"! :-)
Try this (Python 3):
>>> print("A\N{FULLWIDTH LATIN CAPITAL LETTER A}")
AＡ
Serious question: How would one find the width of a character by that
definition?
>>> import unicodedata
>>> unicodedata.east_asian_width("A")
'Na'
>>> unicodedata.east_asian_width("\N{FULLWIDTH LATIN CAPITAL LETTER A}")
'F'
The possible widths are:
N = Neutral
A = Ambiguous
H = Halfwidth
W = Wide
F = Fullwidth
Na = Narrow
All you then need to do is find out what those actually mean...
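As a rough Python 3 sketch of putting those codes to use: count Wide and Fullwidth characters as two columns and everything else as one. That rule is a simplification assumed for the example; it ignores combining characters and treats Ambiguous as narrow, which is not right for every terminal. The function name display_width is made up.

```python
import unicodedata

def display_width(text):
    """Rough column count: Wide and Fullwidth characters count as 2."""
    width = 0
    for ch in text:
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width

print(display_width("A"))
print(display_width("A\N{FULLWIDTH LATIN CAPITAL LETTER A}"))
```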
Re: GeoIP2 for retrieving city and region ?
On 12/07/2013 17:32, Νικόλας wrote:
I know I have asked before, but what I get is the ISP's city, not the
visitor's precise city.
GeoLiteCity.dat isn't accurate; that's why it comes for free.
I must somehow get access to GeoIPCity.dat, which is the full version.
And of course it can be done, I don't want to believe that it can't.
When visiting http://www.geoiptool.com/en/__ip_info/ it pinpoints my
_exact_ city of living, not the ISP's.
Have you considered that your ISP might be in the same city as you?
According to geoiptool, my ISP is near Leeds, UK, but the important
point is that _I'm not_.
It did not even ask me to allow a GeoIP JavaScript to run; it presents it
instantly.
So, it certainly is possible if only one can find the correct database
to use.
So, my question now is whether there is some way we can get an accurate Geo
City database.
Re: RE Module Performance
On 12/07/2013 23:16, Tim Delaney wrote:
On 13 July 2013 03:58, Devyn Collier Johnson wrote:
Thanks for the thorough response. I learned a lot. You should write
articles on Python.
I plan to spend some time optimizing the re.py module for Unix
systems. I would love to amp up my programs that use that module.
If you are finding that regular expressions are taking too much time,
have a look at the https://pypi.python.org/pypi/re2/ and
https://pypi.python.org/pypi/regex/2013-06-26 modules to see if they
already give you enough of a speedup.
FYI, you're better off going to http://pypi.python.org/pypi/regex
because that will take you to the latest version.
Re: what thread-synch mech to use for clean exit from a thread
On 15/07/2013 04:04, Steven D'Aprano wrote:
On Mon, 15 Jul 2013 10:27:45 +0800, Gildor Oronar wrote:
A currency exchange thread updates the exchange rate once a minute. If the
thread failed to update the currency rate for 5 hours, it should inform
main() for a clean exit. This has to be done gracefully, because main()
could be doing something delicate.
I, a newbie, read about all the thread sync tools, and wasn't sure which one
to use. In fact I am not sure if there is a need for thread sync, because
there is no race condition. I thought of this naive way:
class CurrencyExchange():
    def __init__(in_case_callback):
        this.callback = in_case_callback
You need to declare the instance parameter, which is conventionally
called "self" not "this". Also, your class needs to inherit from Thread,
and critically it MUST call the superclass __init__.
So:
class CurrencyExchange(threading.Thread):
    def __init__(self, in_case_callback):
        super(CurrencyExchange, self).__init__()
        self.callback = in_case_callback
But I'm not sure that a callback is the right approach here. See below.
    def __run__():
Likewise, you need a "self" parameter.
        while time.time() - self.rate_timestamp < 5*3600:
            ... # update exchange rate
            if success:
The "==" in this line should, of course, be "=":
                self.rate_timestamp == time.time()
            time.sleep(60)
        this.callback() # rate not updated 5 hours, a crisis
I think that a cleaner way is to just set a flag on the thread instance.
Initiate it with:
    self.updates_seen = True
in the __init__ method, and then add this after the while loop:
    self.updates_seen = False
def main():
    def callback()
        Go_On = False
I don't believe this callback will work, because it will simply create a
local variable called "Go_On", not change the non-local variable.
In Python 3, you can use the nonlocal keyword to get what you want, but I
think a better approach is with a flag on the thread.
    agio = CurrencyExchange(in_case = callback)
    agio.start()
    Go_On = True
    while Go_On:
        do_something_delicate(rate_supplied_by=agio)
Change to:
    while agio.updates_seen:
        do_something_delicate...
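Pulling the corrections together, here is a runnable Python 3 sketch of the flag approach. Several things are assumptions made for the example: fetch_rate is a hypothetical callable returning a rate or None, and max_age and interval are parameters added so the demo can use tiny timings instead of 5 hours and 60 seconds. Note also that Thread calls a method named run(), not __run__().

```python
import threading
import time

class CurrencyExchange(threading.Thread):
    def __init__(self, fetch_rate, max_age=5 * 3600, interval=60):
        super().__init__()
        self.daemon = True
        self.fetch_rate = fetch_rate   # hypothetical: returns a rate or None
        self.max_age = max_age
        self.interval = interval
        self.rate = None
        self.rate_timestamp = time.time()
        self.updates_seen = True

    def run(self):
        while time.time() - self.rate_timestamp < self.max_age:
            rate = self.fetch_rate()
            if rate is not None:
                self.rate = rate
                self.rate_timestamp = time.time()
            time.sleep(self.interval)
        # No successful update for max_age seconds: tell main() via the flag.
        self.updates_seen = False

# Demo with tiny timings and a fetch that always fails.
agio = CurrencyExchange(lambda: None, max_age=0.05, interval=0.01)
agio.start()
while agio.updates_seen:
    time.sleep(0.01)   # stands in for do_something_delicate(...)
print("no updates seen, exiting cleanly")
```

A plain attribute is enough here because only the worker writes the flag and main() only reads it; threading.Event would be the usual choice if main() needed to block waiting for it.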
Re: Question regarding building Python Windows installer
On 15/07/2013 14:11, Mcadams, Philip W wrote:
I'm attempting to create a Python 64-bit Windows Installer. Following
the instructions here: http://docs.python.org/2/distutils/builtdist.html
I'm to navigate to my Python folder and use the command:
python setup.py build --plat-name=win-amd64 bdist_wininst
I get error:
COMPILED_WTH_PYDEBUG = ('--with-pydebug' in sysconfig.get_config_var("CONFIG_ARGS"))
TypeError: argument of type 'NoneType' is not iterable
I also have tried:
setup.py build --plat-name=win-amd64 bdist_wininst
and get error:
File "setup.py", line 263
Print "%-*s %-*s %-*s" % (longest, e, longet, f,
SyntaxError: invalid syntax
Does the line really start with "Print" (initial capital letter)?
Also, are you using Python 2 or Python 3? From the link above it looks
like Python 2.
I followed the instructions here: http://docs.python.org/devguide/setup.html
to create a PC build for Windows which allows me to run a Python prompt.
Now I need to create a Windows Installer to install this Python on a
Windows Server 2008 R2 box.
To explain why I'm attempting to do this instead of just using the
Windows Installer provided by Python:
I needed to modify a _ssl.c file in the Python source code to deal with a
Mercurial issue that I'm trying to resolve.
Any help on why I'm hitting these errors would be appreciated.
Re: help on python regular expression named group
On 16/07/2013 11:18, Mohan L wrote:
On Tue, Jul 16, 2013 at 2:12 PM, Joshua Landau wrote:
On 16 July 2013 07:55, Mohan L wrote:
> Dear All,
>
> Here is my script :
>
> #!/usr/bin/python
> import re
>
> # A string.
> logs = "date=2012-11-28 time=21:14:59"
>
> # Match with named groups.
> m = re.match("(?P<datetime>(date=(?P<date>[^\s]+))\s+(time=(?P<time>[^\s]+)))", logs)
>
> # print
> print m.groupdict()
>
> Output:
>
> {'date': '2012-11-28', 'datetime': 'date=2012-11-28 time=21:14:59', 'time': '21:14:59'}
>
> Required output :
>
> {'date': '2012-11-28', 'datetime': '2012-11-28 21:14:59', 'time': '21:14:59'}
>
> need help to correct the below regex
>
> (?P<datetime>(date=(?P<date>[^\s]+))\s+(time=(?P<time>[^\s]+)))
>
> so that It will have : 'datetime': '2012-11-28 21:14:59' instead of
> 'datetime': 'date=2012-11-28 time=21:14:59'
>
> any help would be greatly appreciated
Why do you need to do this in a single Regex? Can't you just "
".join(..) the date and time?
I am using another third party python script. It takes the regex from
a configuration file. I can't write any code. I have to do all this in a
single regex.
A capture group captures a single substring.
What you're asking is for it to capture 2 substrings (the date and
the time) and then join them together, or capture 1 substring and then
remove part of it.
I don't know of _any_ regex implementation that lets you do that.
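For anyone who is free to add a line of code, this Python 3 sketch shows both halves of the point: the outer group can only ever be one contiguous piece of the input, and the join suggested earlier builds the wanted value afterwards. The group names are taken from the keys in the posted output.

```python
import re

logs = "date=2012-11-28 time=21:14:59"
pattern = r"(?P<datetime>date=(?P<date>\S+)\s+time=(?P<time>\S+))"

m = re.match(pattern, logs)
fields = m.groupdict()

# The outer group is a single contiguous substring, so rebuild the
# combined value from the two inner groups instead.
fields["datetime"] = " ".join((fields["date"], fields["time"]))
print(fields)
```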
Re: Converting a list of lists to a single list
On 23/07/2013 22:52, [email protected] wrote:
I think that itertools may be able to do what I want but I have not
been able to figure out how.
I want to convert an arbitrary number of lists with an arbitrary number
of elements in each list into a single list as follows.
Say I have three lists:
[[A0,A1,A2], [B0,B1,B2], [C0,C1,C2]]
I would like to convert those to a single list that looks like this:
[A0,B0,C0,C1,C2,B1,C0,C1,C2,B2,C0,C1,C2,A1,B0,C0,C1,C2,B1,C0,C1,C2,B2,C0,C1,C2,A2,B0,C0,C1,C2,B1,C0,C1,C2,B2,C0,C1,C2]
An easier way to visualize the pattern I want is as a tree.
A0
    B0
        C0
        C1
        C2
    B1
        C0
        C1
        C2
    B2
        C0
        C1
        C2
A1
    B0
        C0
        C1
        C2
    B1
        C0
        C1
        C2
    B2
        C0
        C1
        C2
A2
    B0
        C0
        C1
        C2
    B1
        C0
        C1
        C2
    B2
        C0
        C1
        C2
Using recursion:
def tree_list(items):
    if len(items) == 1:
        return items[0]
    sublist = tree_list(items[1 : ])
    result = []
    for item in items[0]:
        result.append(item)
        result.extend(sublist)
    return result
items = [["A0","A1","A2"], ["B0","B1","B2"], ["C0","C1","C2"]]
print(tree_list(items))
--
http://mail.python.org/mailman/listinfo/python-list
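For comparison, here is a sketch of the same flattening done without recursion, by folding the lists from the last one back to the first. The name tree_list_iterative is made up for this example. Unlike the recursive version, it returns an empty list when given no lists at all:

```python
def tree_list_iterative(lists):
    """Flatten the tree by working from the innermost list outwards."""
    result = []
    for level in reversed(lists):
        # Every item at this level is followed by the whole subtree built so far.
        new_result = []
        for item in level:
            new_result.append(item)
            new_result.extend(result)
        result = new_result
    return result


items = [["A0", "A1"], ["B0", "B1"]]
print(tree_list_iterative(items))
# ['A0', 'B0', 'B1', 'A1', 'B0', 'B1']
```

The lists do not need to be the same length; each level is simply repeated under every item of the level above it.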
Re: Python Script Hashplings
On 25/07/2013 14:42, Devyn Collier Johnson wrote:
If I execute a Python3 script with this hashpling (#!/usr/bin/python3.3)
and Python3.3 is not installed, but Python3.2 is installed, would the
script still work? Would it fall back to Python3.2?
Why don't you try it?
I hope Dihedral is listening. I would like to see another response from HIM.
--
http://mail.python.org/mailman/listinfo/python-list
Re: Python Script Hashplings
On 26/07/2013 11:43, Chris Angelico wrote:
On Fri, Jul 26, 2013 at 11:37 AM, Devyn Collier Johnson wrote:
On 07/25/2013 09:54 AM, MRAB wrote:
On 25/07/2013 14:42, Devyn Collier Johnson wrote:
If I execute a Python3 script with this hashpling (#!/usr/bin/python3.3)
and Python3.3 is not installed, but Python3.2 is installed, would the
script still work? Would it fall back to Python3.2?
Why don't you try it?
I hope Dihedral is listening. I would like to see another response from HIM.
Good point, but if it falls back to Python3.2, how would I know? Plus, I
have Python3.3, 3.2, and 2.7 installed. I cannot uninstall them due to
dependencies.
Easy:
#!/usr/bin/python3.3
import sys
print(sys.version)
Now run that on lots of different computers (virtual computers work well
for this).
There's also sys.version_info:
>>> import sys
>>> sys.version_info
sys.version_info(major=3, minor=3, micro=2, releaselevel='final', serial=0)
If you want to test what would happen if that version wasn't installed,
set the shebang line to a future version, such as Python 3.4. I doubt
you have that installed! :-)
--
http://mail.python.org/mailman/listinfo/python-list
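A small sketch of how a script could report which interpreter it is actually running under, building on the sys.version_info suggestion. The function name describe_interpreter and the default minimum of 3.3 are assumptions made for this example. Note that on a typical Unix-like system a shebang naming a missing interpreter makes the script fail to start rather than fall back to another version, so a check like this only tells you about the interpreter that did start:

```python
import sys


def describe_interpreter(required=(3, 3)):
    """Report the running major.minor version and whether it meets `required`."""
    found = tuple(sys.version_info[:2])
    status = "ok" if found >= tuple(required) else "too old"
    return "running {}.{} (need >= {}.{}): {}".format(
        found[0], found[1], required[0], required[1], status
    )


print(describe_interpreter())
```

Comparing version tuples like this avoids parsing the free-form sys.version string.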
Re: RE Module Performance
On 28/07/2013 19:13, [email protected] wrote: Le dimanche 28 juillet 2013 05:53:22 UTC+2, Ian a écrit : On Sat, Jul 27, 2013 at 12:21 PM, wrote: > Back to utf. utfs are not only elements of a unique set of encoded > code points. They have an interesting feature. Each "utf chunk" > holds intrisically the character (in fact the code point) it is > supposed to represent. In utf-32, the obvious case, it is just > the code point. In utf-8, that's the first chunk which helps and > utf-16 is a mixed case (utf-8 / utf-32). In other words, in an > implementation using bytes, for any pointer position it is always > possible to find the corresponding encoded code point and from this > the corresponding character without any "programmed" information. See > my editor example, how to find the char under the caret? In fact, > a silly example, how can the caret can be positioned or moved, if > the underlying corresponding encoded code point can not be > dicerned! Yes, given a pointer location into a utf-8 or utf-16 string, it is easy to determine the identity of the code point at that location. But this is not often a useful operation, save for resynchronization in the case that the string data is corrupted. The caret of an editor does not conceptually correspond to a pointer location, but to a character index. Given a particular character index (e.g. 127504), an editor must be able to determine the identity and/or the memory location of the character at that index, and for UTF-8 and UTF-16 without an auxiliary data structure that is a O(n) operation. > 2) Take a look at this. Get rid of the overhead. > sys.getsizeof('b'*100 + 'c') > 126 sys.getsizeof('b'*100 + '€') > 240 > > What does it mean? It means that Python has to > reencode a str every time it is necessary because > it works with multiple codings. Large strings in practical usage do not need to be resized like this often. 
Python 3.3 has been in production use for months now, and you still have yet to produce any real-world application code that demonstrates a performance regression. If there is no real-world regression, then there is no problem. > 3) Unicode compliance. We know retrospectively, latin-1, > is was a bad choice. Unusable for 17 European languages. > Believe of not. 20 years of Unicode of incubation is not > long enough to learn it. When discussing once with a French > Python core dev, one with commit access, he did not know one > can not use latin-1 for the French language! Probably because for many French strings, one can. As far as I am aware, the only characters that are missing from Latin-1 are the Euro sign (an unfortunate victim of history), the ligature œ (I have no doubt that many users just type oe anyway), and the rare capital Ÿ (the miniscule version is present in Latin-1). All French strings that are fortunate enough to be absent these characters can be represented in Latin-1 and so will have a 1-byte width in the FSR. -- latin-1? that's not even truth. sys.getsizeof('a') 26 sys.getsizeof('ü') 38 sys.getsizeof('aa') 27 sys.getsizeof('aü') 39 >>> sys.getsizeof('aa') - sys.getsizeof('a') 1 One byte per codepoint. >>> sys.getsizeof('üü') - sys.getsizeof('ü') 1 Also one byte per codepoint. >>> sys.getsizeof('ü') - sys.getsizeof('a') 12 Clearly there's more going on here. FSR is an optimisation. You'll always be able to find some circumstances where an optimisation makes things worse, but what matters is the overall result. -- http://mail.python.org/mailman/listinfo/python-list
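The fixed per-string overhead can be taken out of the picture by measuring how much a string grows when one more character is appended. This is a rough sketch that assumes CPython 3.3 or later (other implementations may report different sizes); the helper name bytes_per_char is made up for the example. On such a build I would expect 1 byte per character for ASCII and Latin-1 strings, 2 for the rest of the BMP, and 4 for astral characters:

```python
import sys


def bytes_per_char(ch, n=1000):
    """Growth in sys.getsizeof when one more copy of `ch` is appended."""
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)


# 'a' (ASCII), u-umlaut (Latin-1), euro sign (BMP), U+10010 (astral).
for ch in ("a", "\u00fc", "\u20ac", "\U00010010"):
    print("U+{:04X}: {} byte(s) per character".format(ord(ch), bytes_per_char(ch)))
```

The 12-byte difference between 'a' and 'ü' seen in the thread is header overhead, not per-character cost.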
Re: FSR and unicode compliance - was Re: RE Module Performance
On 28/07/2013 20:23, [email protected] wrote: [snip] Compare these (a BDFL exemple, where I'using a non-ascii char) Py 3.2 (narrow build) Why are you using a narrow build of Python 3.2? It doesn't treat all codepoints equally (those outside the BMP can't be stored in one code unit) and, therefore, it isn't "Unicode compliant"! timeit.timeit("a = 'hundred'; 'x' in a") 0.09897159682121348 timeit.timeit("a = 'hundre€'; 'x' in a") 0.09079501961732461 sys.getsizeof('d') 32 sys.getsizeof('€') 32 sys.getsizeof('dd') 34 sys.getsizeof('d€') 34 Py3.3 timeit.timeit("a = 'hundred'; 'x' in a") 0.12183182740848858 timeit.timeit("a = 'hundre€'; 'x' in a") 0.2365732969632326 sys.getsizeof('d') 26 sys.getsizeof('€') 40 sys.getsizeof('dd') 27 sys.getsizeof('d€') 42 Tell me which one seems to be more "unicode compliant"? The goal of Unicode is to handle every char "equaly". Now, the problem: memory. Do not forget that à la "FSR" mechanism for a non-ascii user is *irrelevant*. As soon as one uses one single non-ascii, your ascii feature is lost. (That why we have all these dedicated coding schemes, utfs included). sys.getsizeof('abc' * 1000 + 'z') 3026 sys.getsizeof('abc' * 1000 + '\U00010010') 12044 A bit secret. The larger a repertoire of characters is, the more bits you needs. Secret #2. You can not escape from this. jmf -- http://mail.python.org/mailman/listinfo/python-list
Re: Unexpected results comparing float to Fraction
On 29/07/2013 16:43, Steven D'Aprano wrote:
Comparing floats to Fractions gives unexpected results:
# Python 3.3
py> from fractions import Fraction
py> 1/3 == Fraction(1, 3)
False
but:
py> 1/3 == float(Fraction(1, 3))
True
I expected that float-to-Fraction comparisons would convert the Fraction
to a float, but apparently they do the opposite: they convert the float
to a Fraction:
py> Fraction(1/3)
Fraction(6004799503160661, 18014398509481984)
Am I the only one who is surprised by this? Is there a general rule for
which way numeric coercions should go when doing such comparisons?
I'm surprised that Fraction(1/3) != Fraction(1, 3); after all, floats
are approximate anyway, and the float value 1/3 is more likely to be
Fraction(1, 3) than Fraction(6004799503160661, 18014398509481984).
--
http://mail.python.org/mailman/listinfo/python-list
Re: Unexpected results comparing float to Fraction
On 29/07/2013 17:20, Chris Angelico wrote: On Mon, Jul 29, 2013 at 5:09 PM, MRAB wrote: I'm surprised that Fraction(1/3) != Fraction(1, 3); after all, floats are approximate anyway, and the float value 1/3 is more likely to be Fraction(1, 3) than Fraction(6004799503160661, 18014398509481984). At what point should it become Fraction(1, 3)? When the error drops below a certain threshold. Fraction(0.3) Fraction(5404319552844595, 18014398509481984) Fraction(0.33) Fraction(5944751508129055, 18014398509481984) Fraction(0.333) Fraction(5998794703657501, 18014398509481984) Fraction(0.333) Fraction(6004798902680711, 18014398509481984) Fraction(0.33) Fraction(6004799502560181, 18014398509481984) Fraction(0.3) Fraction(6004799503160061, 18014398509481984) Fraction(0.3) Fraction(6004799503160661, 18014398509481984) Rounding off like that is a job for a cool library function (one of which was mentioned on this list a little while ago, I believe), but not IMO for the Fraction constructor. -- http://mail.python.org/mailman/listinfo/python-list
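One standard-library way to do that kind of rounding explicitly is Fraction.limit_denominator(). This is not necessarily the function alluded to above, and the bound of 1000 is an arbitrary choice for this sketch:

```python
from fractions import Fraction

# The constructor converts the float exactly, with no rounding.
exact = Fraction(1 / 3)
print(exact)                    # 6004799503160661/18014398509481984
print(exact == Fraction(1, 3))  # False

# Rounding to the nearest fraction with a small denominator is a separate step.
rounded = exact.limit_denominator(1000)
print(rounded)                  # 1/3
```

Keeping the rounding as an explicit call leaves it to the caller to decide how large a denominator still counts as "what the user meant".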
Re: Unexpected results comparing float to Fraction
On 29/07/2013 17:40, Ian Kelly wrote: On Mon, Jul 29, 2013 at 10:20 AM, Chris Angelico wrote: On Mon, Jul 29, 2013 at 5:09 PM, MRAB wrote: I'm surprised that Fraction(1/3) != Fraction(1, 3); after all, floats are approximate anyway, and the float value 1/3 is more likely to be Fraction(1, 3) than Fraction(6004799503160661, 18014398509481984). At what point should it become Fraction(1, 3)? At the point where the float is exactly equal to the value you get from the floating-point division 1/3. If it's some other float then the user didn't get there by entering 1/3, so it's not worth trying to pretend that they did. I thought that you're not meant to check for equality when using floats. We do a similar rounding when formatting floats to strings, but in that case one only has to worry about divisors that are powers of 10. I imagine it's going to take more time to find the correct fraction when any pair of relatively prime integers can be a candidate numerator and denominator. Additionally, the string rounding only occurs when the float is being formatted for display; we certainly don't do it as the result of numeric operations where it could result in loss of precision. -- http://mail.python.org/mailman/listinfo/python-list
Re: Bitwise Operations
On 30/07/2013 00:34, Devyn Collier Johnson wrote:
On 07/29/2013 05:53 PM, Grant Edwards wrote:
On 2013-07-29, Devyn Collier Johnson wrote:
On Python3, how can I perform bitwise operations? For instance, I want
something that will 'and', 'or', and 'xor' a binary integer.
http://www.google.com/search?q=python+bitwise+operations
I understand the symbols. I want to know how to perform the task in a
script or terminal. I have searched Google, but I never saw a command.
Typing "101 & 010" or "x = (int(101, 2) & int(010, 2))" only gives errors.
In Python 2, an integer with a leading 0, such as 0101, was octal (base
8). This was a feature borrowed from C but often confused newbies
because it looked like decimal ("Why does 0101 == 101 return False?").
In Python 3, octal is indicated by a leading 0o, such as 0o101 (==
1*64+0*8+1==65) and the old style raises an exception so that those who
have switched from Python 2 will get a clear message that something has
changed.
For binary you need a leading 0b and for hexadecimal you need a leading
0x, so doing something similar for octal makes sense.
0b101 == 1*4+0*2+1 == 5
0o101 == 1*64+0*8+1 == 65
0x101 == 1*256+0*16+1 == 257
--
http://mail.python.org/mailman/listinfo/python-list
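For the original question, here is a short Python 3 sketch of the operators in use. The attempts in the question fail for two separate reasons: 010 is not a valid literal in Python 3, and int() with an explicit base expects a string such as "101" rather than a number:

```python
a = 0b101          # 5
b = 0b010          # 2

print(a & b)       # 0   (and)
print(a | b)       # 7   (or)
print(a ^ b)       # 7   (xor)

# Parsing binary text: pass a string together with the base.
c = int("101", 2) & int("011", 2)
print(c)           # 1

# Showing results in binary again.
print(bin(a | b))                 # 0b111
print(format(a ^ 0b011, "03b"))   # 110
```

The same lines can be typed one at a time at the interactive prompt.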
Re: RE Module Performance
On 30/07/2013 15:38, Antoon Pardon wrote: Op 30-07-13 16:01, [email protected] schreef: I am pretty sure that once you have typed your 127504 ascii characters, you are very happy the buffer of your editor does not waste time in reencoding the buffer as soon as you enter an €, the 125505th char. Sorry, I wanted to say z instead of euro, just to show that backspacing the last char and reentering a new char implies twice a reencoding. Using a single string as an editor buffer is a bad idea in python for the simple reason that strings are immutable. Using a single string as an editor buffer is a bad idea in _any_ language because an insertion would require all the following characters to be moved. So adding characters would mean continuously copying the string buffer into a new string with the next character added. Copying 127504 characters into a new string will not make that much of a difference whether the octets are just copied to octets or are unpacked into 32 bit words. Somebody wrote "FSR" is just an optimization. Yes, but in case of an editor à la FSR, this optimization take place everytime you enter a char. Your poor editor, in fact the FSR, is finally spending its time in optimizing and finally it optimizes nothing. (It is even worse). Even if you would do it this way, it would *not* take place every time you enter a char. Once your buffer would contain a wide character, it would just need to convert the single character that is added after each keystroke. It would not need to convert the whole buffer after each key stroke. If you type correctly a z instead of an €, it is not necessary to reencode the buffer. Problem, you do you know that you do not have to reencode? simple just check it, and by just checking it wastes time to test it you have to optimized or not and hurt a little bit more what is supposed to be an optimization. Your scenario is totally unrealistic. 
First of all because of the immutable nature of python strings, second because you suggest that real time usage would result in frequent conversions which is highly unlikely. What you would have is a list of mutable chunks. Inserting into a chunk would be fast, and a chunk would be split if it's already full. Also, small adjacent chunks would be joined together. Finally, a chunk could use FSR to reduce memory usage. -- http://mail.python.org/mailman/listinfo/python-list
Re: RE Module Performance
On 30/07/2013 17:39, Antoon Pardon wrote: Op 30-07-13 18:13, MRAB schreef: On 30/07/2013 15:38, Antoon Pardon wrote: Op 30-07-13 16:01, [email protected] schreef: I am pretty sure that once you have typed your 127504 ascii characters, you are very happy the buffer of your editor does not waste time in reencoding the buffer as soon as you enter an €, the 125505th char. Sorry, I wanted to say z instead of euro, just to show that backspacing the last char and reentering a new char implies twice a reencoding. Using a single string as an editor buffer is a bad idea in python for the simple reason that strings are immutable. Using a single string as an editor buffer is a bad idea in _any_ language because an insertion would require all the following characters to be moved. Not if you use a gap buffer. The disadvantage there is that when you move the cursor you must move characters around. For example, what if the cursor was at the start and you wanted to move it to the end? Also, when the gap has been filled, you need to make a new one. -- http://mail.python.org/mailman/listinfo/python-list
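To make the trade-off concrete, here is a minimal gap buffer sketch. The class and method names are made up for this example and it is nowhere near a real editor buffer. Inserting at the cursor is cheap while the gap has room, but moving the cursor shifts one character per position moved, and a filled gap has to be reopened:

```python
class GapBuffer:
    """Text before the cursor, then a gap of free slots, then the rest."""

    def __init__(self, text="", gap_size=16):
        self._gap_size = gap_size
        self._buf = list(text) + [None] * gap_size
        self._gap_start = len(text)     # the cursor position
        self._gap_end = len(self._buf)  # one past the last free slot

    def __len__(self):
        return len(self._buf) - (self._gap_end - self._gap_start)

    def move_cursor(self, pos):
        """Move the gap to `pos`; the cost grows with the distance moved."""
        if not 0 <= pos <= len(self):
            raise IndexError("cursor position out of range")
        while self._gap_start > pos:
            # Shift one character from just before the gap to just after it.
            self._gap_start -= 1
            self._gap_end -= 1
            self._buf[self._gap_end] = self._buf[self._gap_start]
        while self._gap_start < pos:
            # Shift one character from just after the gap to just before it.
            self._buf[self._gap_start] = self._buf[self._gap_end]
            self._gap_start += 1
            self._gap_end += 1

    def insert(self, ch):
        """Insert one character at the cursor."""
        if self._gap_start == self._gap_end:
            # The gap is full: open a new one, which moves the whole tail.
            self._buf[self._gap_start:self._gap_start] = [None] * self._gap_size
            self._gap_end += self._gap_size
        self._buf[self._gap_start] = ch
        self._gap_start += 1

    def text(self):
        return "".join(self._buf[:self._gap_start] + self._buf[self._gap_end:])


buf = GapBuffer("hello", gap_size=2)
buf.move_cursor(0)      # moving from the end to the start shifts all 5 characters
buf.insert("X")
print(buf.text())       # Xhello
```

Jumping the cursor from one end of a large buffer to the other is the worst case for this structure, which is the disadvantage described above.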
Re: Conditional decoration
On 18/06/2012 23:16, Roy Smith wrote:
Is there any way to conditionally apply a decorator to a function?
For example, in django, I want to be able to control, via a run-time
config flag, if a view gets decorated with @login_required().
@login_required()
def my_view(request):
    pass
A decorator is just syntactic sugar for function application after the
definition.
This:
@deco
def func():
    pass
is just another way of writing:
def func():
    pass
func = deco(func)
Not as neat, but you can make it conditional.
--
http://mail.python.org/mailman/listinfo/python-list
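A sketch of one way to package that idea so the decorator syntax can still be used. Everything here is hypothetical: REQUIRE_LOGIN stands in for the run-time config flag, login_required is a toy stand-in rather than Django's real decorator, and requests are plain dicts:

```python
import functools

REQUIRE_LOGIN = True  # hypothetical run-time config flag


def login_required(view):
    """Toy stand-in for a real login check; not Django's implementation."""
    @functools.wraps(view)
    def wrapper(request):
        if not request.get("user"):
            return "redirect to login"
        return view(request)
    return wrapper


def conditional(decorator, enabled):
    """Return `decorator` when enabled, otherwise a decorator that does nothing."""
    return decorator if enabled else (lambda func: func)


@conditional(login_required, REQUIRE_LOGIN)
def my_view(request):
    return "hello"


print(my_view({}))               # redirect to login
print(my_view({"user": "roy"}))  # hello
```

The flag is read once, when the module is imported and the function is defined, which is the same moment at which func = deco(func) would run.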
Re: re.finditer with lookahead and lookbehind
On 20/06/2012 14:30, Christian wrote:
Hi,
I have some trouble splitting a string like s. I even have
problems with the first and last match. Some greedy problems?
Thanks in advance
Christian
import re
s='v1=pattern1&v2=pattern2&v3=pattern3&v4=pattern4&v5=pattern5&x1=patternx'
pattern =r'(?=[a-z0-9]+=)(.*?)(?<=&)'
regex = re.compile(pattern,re.IGNORECASE)
for match in regex.finditer(s):
    print match.group(1)
My intention:
pattern1
pattern2
pattern3
pattern4
pattern5
patternx
You could do it like this:
import re
s = 'v1=pattern1&v2=pattern2&v3=pattern3&v4=pattern4&v5=pattern5&x1=patternx'
pattern = r'=([^&]*)'
regex = re.compile(pattern, re.IGNORECASE)
for match in regex.finditer(s):
    print match.group(1)
or avoid regex entirely:
>>> values = [p.partition("=")[2] for p in s.split("&")]
>>> values
['pattern1', 'pattern2', 'pattern3', 'pattern4', 'pattern5', 'patternx']
--
http://mail.python.org/mailman/listinfo/python-list
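If the string really is a URL query string (an assumption, since the post does not say so), the standard library can do the splitting. This is a Python 3 sketch using urllib.parse.parse_qsl; note that by default it drops pairs with blank values and decodes percent-escapes, which the regex and partition versions do not:

```python
from urllib.parse import parse_qsl

s = 'v1=pattern1&v2=pattern2&v3=pattern3&v4=pattern4&v5=pattern5&x1=patternx'

# parse_qsl returns a list of (key, value) pairs in their original order.
values = [value for _key, value in parse_qsl(s)]
print(values)
```

For the sample string this should give the same six values as the two approaches above.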
