Dear fellow Python enthusiasts, I recently wrote a script that grabs a file containing a list of ISO-defined countries and creates an HTML select element. That's all well and good, and everything seems to work fine, except for one little nagging problem:
http://en.wikipedia.org/wiki/Aland_Islands

I use the Wikipedia URL because I'm conscious of the fact that people reading this email might not be able to see the character I am having trouble displaying correctly, the LATIN CAPITAL LETTER A WITH RING ABOVE character.

After reading the following article:

http://www.joelonsoftware.com/articles/Unicode.html

I realize the following: It does not make sense to have a string without knowing what encoding it uses. There is no such thing as plain text.

Ok. Fine.

In Mozilla, by clicking on View, Character Encoding, I find out that the text in the file I grab from:

http://www.iso.ch/iso/en/prods-services/iso3166ma/02iso-3166-code-lists/index.html

is encoded in ISO-8859-1.

So I go about changing Python's default encoding according to:

http://www.diveintopython.org/xml_processing/unicode.html

and voila:

>>> import sys
>>> sys.getdefaultencoding()
'iso-8859-1'
>>>

BUT the LATIN CAPITAL LETTER A WITH RING ABOVE character still displays in IDLE as \xc5 !

I can get the character to display correctly if I type:

print "\xc5"

which is fine if I am simply going to copy and paste the select element into my HTML file. However, I want to be able to dynamically generate the HTML form page and have the character in question display correctly in the web browser.

In case you're wondering, I've already done my due diligence to ensure the character set is ISO-8859-1 in my web server as well as in the HTML file:

- in my HTML file, I put in: <head><meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"></head>
- I restarted Apache after changing httpd.conf to add the line: AddDefaultCharset ISO-8859-1

The problem, of course, is that if I run my script that creates the select element in IDLE I continue to see the output:

<option value='AX'>\xc5land Islands</option>

Am I doing something wrong?
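The symptom described above can be reproduced without downloading anything. What follows is only a minimal sketch, and it assumes Python 3 (where byte strings and text are separate types) rather than the interpreter used for the session above; the names raw, text, shown_by_repr and char_name are made up for illustration. It suggests that \xc5 is simply how repr() spells the byte 0xC5, and that decoding the byte as ISO-8859-1 yields the intended character.

```python
import unicodedata

# A sample byte string as it would appear in an ISO-8859-1 encoded file
# (hard-coded here for illustration, not fetched from the ISO site).
raw = b"\xc5land Islands"

# repr() spells non-ASCII bytes as escapes; an interactive shell echoes
# a bare expression using this form.
shown_by_repr = repr(raw)

# Decoding with the encoding the file actually uses gives real text.
text = raw.decode("iso-8859-1")

# Look up the official Unicode name of the first character.
char_name = unicodedata.name(text[0])

print(shown_by_repr)
print(char_name)
print(len(raw), len(text))
```

The escaped form and the displayed character are two views of the same data, so seeing \xc5 in an echoed value does not by itself mean the data is damaged.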
def create_bidirectional_dicts_from_latest_ISO_countries():
    # Python 2: urllib.urlopen returns a file-like object yielding byte strings.
    import urllib
    ISO3166_FILE_URL = "http://www.iso.ch/iso/en/prods-services/iso3166ma/02iso-3166-code-lists/list-en1-semic.txt"
    a2_to_name = {}
    name_to_a2 = {}
    file_obj = urllib.urlopen(ISO3166_FILE_URL)
    for line in file_obj:
        # Skip the header line and blank lines.
        if line.startswith("This list") or line.isspace():
            pass
        else:
            # Each data line has the form NAME;A2
            a_list = line.split(';')
            ISO_name = a_list[0].title()
            ISO_a2 = a_list[1].strip()
            a2_to_name[ISO_a2] = ISO_name
            name_to_a2[ISO_name] = ISO_a2
    file_obj.close()
    return a2_to_name, name_to_a2

def create_select_element_from_dict(name, a_dict, default_value=None):
    parent_wrapper = "<select name='%s'>%s</select>"
    child_wrapper = "\t<option value=''>Please select one</option>\n%s"
    element_template = "\t<option value='%s'>%s</option>\n"
    default_element = "\t<option value='%s' selected='yes'>%s</option>\n"
    a_str = ""
    for key in sorted(a_dict.keys()):
        if default_value and a_dict[key] == default_value:
            a_str = a_str + default_element % (default_value, key)
        else:
            # Emit the plain option only when it is not the default,
            # so the default entry is not listed twice.
            a_str = a_str + element_template % (a_dict[key], key)
    c_w_instance = child_wrapper % a_str
    return parent_wrapper % (name, c_w_instance)

a2_to_name, name_to_a2 = create_bidirectional_dicts_from_latest_ISO_countries()
a_select = create_select_element_from_dict("country", name_to_a2)
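For comparison, here is a rough sketch of how one line of the file could be turned into an option element with the encoding made explicit at both ends. It is not part of the script above and rests on several assumptions: it targets Python 3, the helper name option_from_line and the sample line are hypothetical, and the use of html.escape is an addition for safety. Only the NAME;A2 line layout and the ISO-8859-1 encoding are taken from the message.

```python
from html import escape


def option_from_line(raw_line, encoding="iso-8859-1"):
    """Build one <option> element from a 'NAME;A2' line of bytes.

    Hypothetical helper: decodes the bytes first, then works on text.
    """
    name, a2 = raw_line.decode(encoding).strip().split(";")
    return "<option value='%s'>%s</option>" % (
        escape(a2, quote=True),   # attribute value
        escape(name.title()),     # visible label, e.g. 'Aland' with ring
    )


# Sample input line (an assumption about what the file contains).
sample = b"\xc5LAND ISLANDS;AX\r\n"

option = option_from_line(sample)

# Encode explicitly when producing the bytes sent to the browser, so
# they match the charset declared in the page and in the server config.
page_bytes = option.encode("iso-8859-1")

print(page_bytes)
```

The idea is to decode once on input, work with text in between, and encode once on output, so the bytes reaching the browser agree with the declared ISO-8859-1 charset.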
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor