[issue5036] xml.parsers.expat make a dictionary which keys are broken if buffer_text is False.

2009-01-22 Thread Takeshi Matsuyama

New submission from Takeshi Matsuyama :

When I build a dictionary by parsing "legacy-icon-mapping.xml" (part of
icon-naming-utils [http://tango.freedesktop.org/Tango_Icon_Library]) with
the following script, three of the dictionary's keys come out truncated if
the "buffer_text" attribute is False.

=
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import with_statement
import sys
from xml.parsers.expat import ParserCreate
import codecs

class Database:
  """Make a dictionary which is accessible by Database.dict"""
  def __init__(self, buffer_text):
    self.cnt = None
    self.name = None
    self.data = None
    self.dict = {}
    p = ParserCreate()
    p.buffer_text = buffer_text

    p.StartElementHandler = self.start_element
    p.EndElementHandler = self.end_element
    p.CharacterDataHandler = self.char_data

    with open("/usr/share/icon-naming-utils/legacy-icon-mapping.xml",
              'r') as f:
      p.ParseFile(f)

  def start_element(self, name, attrs):
    if name == 'context':
      self.cnt = attrs["dir"]
    if name == 'icon':
      self.name = attrs["name"]

  def end_element(self, name):
    if name == 'link':
      self.dict[self.data] = (self.cnt, self.name)

  def char_data(self, data):
    self.data = data.strip()

def print_set(aset):
  for e in aset:
    print '\t' + e

if __name__ == '__main__':
  sys.stdout = codecs.getwriter('utf_8')(sys.stdout)
  map_false_dict = Database(False).dict
  map_true_dict = Database(True).dict
  print "The keys which exist if buffer_text=False but don't exist if buffer_text=True are"
  print_set(set(map_false_dict.keys()) - set(map_true_dict.keys()))
  print "The keys which exist if buffer_text=True but don't exist if buffer_text=False are"
  print_set(set(map_true_dict.keys()) - set(map_false_dict.keys()))
=

The result of running this script is
==
The keys which exist if buffer_text=False but don't exist if buffer_text=True are
	rt-descending
	ock_text_right
	lc
The keys which exist if buffer_text=True but don't exist if buffer_text=False are
	stock_text_right
	gnome-mime-application-vnd.stardivision.calc
	gtk-sort-descending
==
I confirmed this with Python 2.5.2 on Fedora 10.

--
components: XML
messages: 80398
nosy: tksmashiw
severity: normal
status: open
title: xml.parsers.expat make a dictionary which keys are broken if buffer_text 
is False.
type: behavior
versions: Python 2.5

___
Python tracker 
<http://bugs.python.org/issue5036>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com




2009-01-23 Thread Takeshi Matsuyama

Takeshi Matsuyama  added the comment:

Thanks for the reply!

>If the xml file is small enough, could you attach it to the issue? Or 
>provide a download location?
Sorry, I found it here:
http://webcvs.freedesktop.org/icon-theme/icon-naming-utils/legacy-icon-mapping.xml?revision=1.75&content-type=text%2Fplain&pathrev=1.75

>(Note that Python 2.5 only gets security fixes now, so unless this 
>still fails with 2.6 or later, this issue is likely to be closed)
I saw what looked like the same problem with Python 3.0 on MS Windows two
weeks ago, but I need to verify it more carefully...


2009-01-24 Thread Takeshi Matsuyama

Takeshi Matsuyama  added the comment:

Hi kawai,
I got the correct output by modifying the code as you suggested, but I
still cannot understand why this happens.
Could you explain more briefly, or point me to any documentation about it?
I can't find any note in the official documentation saying that
CharacterDataHandler should append the strings it receives rather than
assign them.
Does everyone else already know this? Am I the only one who didn't? (;;)


2009-01-24 Thread Takeshi Matsuyama

Takeshi Matsuyama  added the comment:

A correction to my previous message: briefly -> in detail

>Please read "The ContentHandler.characters() callback is missing data!" 
>http://www.saxproject.org/faq.html
I have just read the site above, and it is now very clear to me.
Thanks, kawai, and sorry for taking up your time, gagenellina.
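
For anyone else who lands on this thread, here is a minimal sketch of what that FAQ entry describes, written for Python 3 rather than the 2.5 used in the report (an assumption on my part). The helper name collect_chunks and the sample text are made up for illustration. The entity reference is just an easy way to make expat deliver one text node in several callbacks; in the original report the splits presumably fell somewhere else, such as an internal buffer boundary, but that is a guess and the exact split points are an expat implementation detail.

```python
from xml.parsers.expat import ParserCreate


def collect_chunks(xml_bytes, buffer_text):
    """Return every string that expat passes to CharacterDataHandler."""
    chunks = []
    parser = ParserCreate()
    parser.buffer_text = buffer_text
    parser.CharacterDataHandler = chunks.append
    parser.Parse(xml_bytes, True)
    return chunks


# Contrived input: the entity reference splits one text node into pieces.
doc = b"<link>Tom &amp; Jerry</link>"

unbuffered = collect_chunks(doc, buffer_text=False)
buffered = collect_chunks(doc, buffer_text=True)

print(unbuffered)           # several pieces; plain assignment keeps only the last
print(buffered)             # pieces coalesced by buffer_text
print("".join(unbuffered))  # appending recovers the full text
```

A handler that assigns (self.data = data) only ever sees the last piece, which matches the truncated keys in the report; joining the pieces gives the same text either way.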


2009-01-27 Thread Takeshi Matsuyama

Takeshi Matsuyama  added the comment:

From msg80438:
>You should reset it by self.data = '' at end_element().

It seems that we should reset it in start_element() instead, like this:

=
def start_element(self, name, attrs):
  ...abbr...
  if name == 'link':
    self.data = ''
=
Otherwise unwanted whitespace (\s, \t, and \n) gets mixed into "self.data".
That's all, thanks.
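
Putting the pieces of this thread together, the corrected handlers might look roughly like the sketch below. It assumes Python 3, parses an in-memory sample instead of the real file, and takes the document as a constructor argument; SAMPLE is a hypothetical stand-in in the shape the script expects (the "fonts&amp;themes" entry is contrived to force a split), not the real contents of legacy-icon-mapping.xml.

```python
from xml.parsers.expat import ParserCreate

# Hypothetical sample in the shape the script expects; not the real mapping file.
SAMPLE = b"""<mapping>
  <context dir="actions">
    <icon name="view-sort-descending">
      <link>gtk-sort-descending</link>
      <link>fonts&amp;themes</link>
    </icon>
  </context>
</mapping>"""


class Database:
    """Build a dictionary that is accessible as Database.dict."""

    def __init__(self, xml_bytes, buffer_text=False):
        self.cnt = None
        self.name = None
        self.data = ''
        self.dict = {}
        p = ParserCreate()
        p.buffer_text = buffer_text
        p.StartElementHandler = self.start_element
        p.EndElementHandler = self.end_element
        p.CharacterDataHandler = self.char_data
        p.Parse(xml_bytes, True)

    def start_element(self, name, attrs):
        if name == 'context':
            self.cnt = attrs["dir"]
        elif name == 'icon':
            self.name = attrs["name"]
        elif name == 'link':
            # Reset here so whitespace between elements is discarded.
            self.data = ''

    def end_element(self, name):
        if name == 'link':
            # Strip once, after all pieces of the text have arrived.
            self.dict[self.data.strip()] = (self.cnt, self.name)

    def char_data(self, data):
        # Append instead of assign: expat may deliver the text in pieces.
        self.data += data


db = Database(SAMPLE)
print(sorted(db.dict))
```

With the text appended in char_data() and reset in start_element(), the resulting keys should no longer depend on the buffer_text setting.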


2009-01-30 Thread Takeshi Matsuyama

Takeshi Matsuyama  added the comment:

Could someone close this?
