On Friday, 7 June 2013 at 5:29:25 PM UTC+3, MRAB wrote:
> This is a worse way of doing it because the ISO-8859-7 encoding has 1
> byte per codepoint, meaning that it's more 'tolerant' (if that's the
> word) of errors. A sequence of bytes that is actually UTF-8 can be
> decoded as ISO-8859-7, giving gibberish.
> UTF-8 is less tolerant, and it's the encoding that ideally you should
> be using everywhere, so it's better to assume UTF-8 and, if it fails,
> try ISO-8859-7 and then rename so that any names that were ISO-8859-7
> will be converted to UTF-8.
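
To make the quoted point concrete, here is a minimal sketch (not code from the thread). The helper name decode_filename and the sample word are hypothetical, chosen only for illustration. It shows that ISO-8859-7 accepts UTF-8 bytes without complaint and returns gibberish, while strict UTF-8 rejects ISO-8859-7 bytes, which is why UTF-8 should be tried first.

```python
def decode_filename(raw):
    """Decode filename bytes, trying strict UTF-8 first, then ISO-8859-7.

    Returns (text, encoding), or (None, None) if neither codec accepts the bytes.
    """
    for encoding in ('utf-8', 'iso-8859-7'):
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    return None, None


sample = 'Ελληνικά'                     # hypothetical sample name
as_utf8 = sample.encode('utf-8')        # two bytes per Greek letter
as_greek = sample.encode('iso-8859-7')  # one byte per Greek letter

# Wrong order: ISO-8859-7 decodes these UTF-8 bytes without error,
# but the result is gibberish rather than the original text
gibberish = as_utf8.decode('iso-8859-7')

# Right order: UTF-8 rejects the ISO-8859-7 bytes, so the fallback is used
print(decode_filename(as_utf8)[1])
print(decode_filename(as_greek)[1])
```

Trying the codecs in the opposite order would label almost every UTF-8 name as 'iso-8859-7', and the rename step would then bake the gibberish into the file name.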
Indeed, I wasn't aware of that at the time. I was under the impression that a
string encoded to bytes with some charset could only be decoded back with
that, and only that, charset. Since this is the case, here is my fix:
#========================================================
# Collect filenames of the path dir as bytes
filename_bytes = os.listdir( b'/home/nikos/public_html/data/apps/' )

for filename in filename_bytes:
    # Compute 'path/to/filename' into bytes
    filepath_bytes = b'/home/nikos/public_html/data/apps/' + filename
    flag = False

    try:
        # Assume current file is utf8 encoded
        filepath = filepath_bytes.decode('utf-8')
        flag = 'utf8'
    except UnicodeDecodeError:
        try:
            # Since current filename is not utf8 encoded then it has to be greek-iso encoded
            filepath = filepath_bytes.decode('iso-8859-7')
            flag = 'greek'
        except UnicodeDecodeError:
            print( '''I give up! File name is unreadable!''' )

    # This is line 81 of files.py, the line the SyntaxError in the log below points at
    if( flag == 'greek' )
        # Rename filename from greek bytes --> utf-8 bytes
        os.rename( filepath_bytes, filepath.encode('utf-8') )
filenames = os.listdir( '/home/nikos/public_html/data/apps/' )

# Load'em
for filename in filenames:
    try:
        # Check the presence of a file against the database and insert if it doesn't exist
        cur.execute('''SELECT url FROM files WHERE url = %s''', (filename,) )
        data = cur.fetchone()

        if not data:
            # First time for file; primary key is automatic, hit is defaulted
            cur.execute('''INSERT INTO files (url, host, lastvisit)
                           VALUES (%s, %s, %s)''', (filename, host, lastvisit) )
    except pymysql.ProgrammingError as e:
        print( repr(e) )
#========================================================
filenames = os.listdir( '/home/nikos/public_html/data/apps/' )
filepaths = set()

# Build a set of filenames based on the objects of path dir
for filename in filenames:
    filepaths.add( filename )

# Delete spurious
cur.execute('''SELECT url FROM files''')
data = cur.fetchall()

# Check database's filenames against path's filenames
for rec in data:
    # Each row is a tuple (assuming the default pymysql cursor), so compare its url column
    if rec[0] not in filepaths:
        cur.execute('''DELETE FROM files WHERE url = %s''', rec )
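
The insert and delete passes above amount to two set differences between the directory listing and the url column. A minimal, self-contained sketch of that idea follows. It is an illustration under assumptions, not the code from files.py: an in-memory sqlite3 database stands in for pymysql (so the placeholder is ? instead of %s), the table has only the url column, and the file names are made up.

```python
import sqlite3

# Assumption: sqlite3 stands in for pymysql; its placeholder is ? rather than %s
con = sqlite3.connect(':memory:')
cur = con.cursor()
cur.execute('CREATE TABLE files (url TEXT PRIMARY KEY)')
cur.executemany('INSERT INTO files (url) VALUES (?)',
                [('a.zip',), ('b.zip',), ('gone.zip',)])

# Hypothetical directory listing; in the post this comes from os.listdir()
filenames = ['a.zip', 'b.zip', 'new.zip']
on_disk = set(filenames)

# fetchall() returns each row as a tuple, so take the single column out of it
cur.execute('SELECT url FROM files')
in_db = {row[0] for row in cur.fetchall()}

# Files present on disk but missing from the table get inserted
for url in sorted(on_disk - in_db):
    cur.execute('INSERT INTO files (url) VALUES (?)', (url,))

# Rows whose file no longer exists on disk get deleted
for url in sorted(in_db - on_disk):
    cur.execute('DELETE FROM files WHERE url = ?', (url,))
con.commit()

cur.execute('SELECT url FROM files ORDER BY url')
remaining = [row[0] for row in cur.fetchall()]
print(remaining)
```

Comparing a whole row tuple such as ('a.zip',) against a set of plain strings never matches, which is why the column is unpacked before the membership test.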
=============================
[email protected] [~/www/cgi-bin]# [Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] File "/home/nikos/public_html/cgi-bin/files.py", line 81
[Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] if( flag == 'greek' )
[Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173]                      ^
[Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] SyntaxError: invalid syntax
[Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] Premature end of script headers: files.py
-------------------------------
I don't know why that if statement errors.
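
A possible lead, going only by the log above: the line it quotes has no colon at the end, and Python requires one to close an if header. The sketch below is a guess at the cause, not a confirmed diagnosis. The helper name compiles and the three test strings are hypothetical. It uses the built-in compile() to check which spellings of the line Python accepts.

```python
def compiles(source):
    """Return True if source is syntactically valid Python, else False."""
    try:
        compile(source, '<sketch>', 'exec')
        return True
    except SyntaxError:
        return False


# The line as quoted in the error log: no trailing colon
without_colon = "flag = 'greek'\nif( flag == 'greek' )\n    print('rename')\n"

# The same test with the colon added (the parentheses are optional)
with_colon = "flag = 'greek'\nif flag == 'greek':\n    print('rename')\n"

# A single = is an assignment, which is not allowed in an if condition
single_equals = "flag = 'greek'\nif flag = 'greek':\n    print('rename')\n"

print(compiles(without_colon))   # False
print(compiles(with_colon))      # True
print(compiles(single_equals))   # False
```

Because a SyntaxError stops the whole script before it prints anything, Apache also reports "Premature end of script headers" for the same request.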
--
http://mail.python.org/mailman/listinfo/python-list