https://bugs.kde.org/show_bug.cgi?id=477533
--- Comment #3 from Tobias Leupold <t...@stonemx.de> ---
Meanwhile, I know what's going on here, and I also think I know how to fix it :-)

When the "compressed" file format is used, category names are used as XML attribute names. To make that possible, they are escaped. Our current escaping algorithm can produce invalid XML attribute names, depending on the input: among other flaws, it allows a digit to be the first character of the escaped output. This violates the XML spec (cf. https://www.w3.org/TR/xml/ ), which states that the first character of an XML attribute name must be a NameStartChar. Within ASCII, that is "a-z", "A-Z", ":" or "_". Digits are allowed later in the attribute name, but not as the first character.

When writing the XML file, the non-compliant attribute name is written nevertheless. When the database is re-opened later, though, the data can't be read anymore, because the parser finds a digit where it expects either the end of the tag ("/>" or ">") or a new attribute (a NameStartChar), and thus fails on the invalid XML. Cf. the posted error message: "Expected '>' or '/', but got '[0-9]'".

Just as a side note: the algorithm also can't escape non-Latin-1 characters correctly (they become "?"), and with the "readable" format, category names containing spaces and underscores aren't unescaped to what they initially were (all underscores are replaced by spaces, so the underscores are lost on the next read).

The only way to fix the root cause is to implement a new escaping algorithm for category names used as XML attribute names, one that respects the XML spec. My proposal for a compliant implementation can be found at https://invent.kde.org/graphics/kphotoalbum/-/tree/safe_xml_escaping?ref_type=heads . It uses a modified URL-style percent encoding based on QByteArray's built-in functionality. With this approach, not only is the digit issue fixed, but the whole Unicode range can also be used in a category name.
Also, the spaces and underscores issue is gone for the "readable" format. Needs testing though.
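To illustrate both the failure and the general idea of the fix, here is a minimal sketch in Python (standard library only). It is not the code on the branch linked above, which uses QByteArray. The function names are hypothetical, and the choice of "_" as the escape character is an assumption made for this sketch ("%" itself is not allowed in XML names, so some substitute is needed). The sketch encodes the name as UTF-8, keeps ASCII letters and digits, and writes every other byte as "_XX". A leading digit is escaped as well, so the result always starts with a letter or "_".

```python
import xml.etree.ElementTree as ET

# Assumption for this sketch: "_" replaces "%" as the escape character.
ESCAPE = "_"
SAFE = frozenset(b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789")
DIGITS = frozenset(b"0123456789")


def escape_attribute_name(name: str) -> str:
    """Escape a category name so it is a valid ASCII XML attribute name."""
    if not name:
        raise ValueError("an XML attribute name must not be empty")
    parts = []
    for index, byte in enumerate(name.encode("utf-8")):
        leading_digit = index == 0 and byte in DIGITS
        if byte in SAFE and not leading_digit:
            parts.append(chr(byte))
        else:
            # Covers "_" itself, spaces, non-ASCII bytes and a leading digit.
            parts.append(f"{ESCAPE}{byte:02X}")
    return "".join(parts)


def unescape_attribute_name(escaped: str) -> str:
    """Reverse escape_attribute_name()."""
    raw = bytearray()
    i = 0
    while i < len(escaped):
        if escaped[i] == ESCAPE:
            # The escape character is always followed by two hex digits.
            raw.append(int(escaped[i + 1:i + 3], 16))
            i += 3
        else:
            raw.append(ord(escaped[i]))
            i += 1
    return raw.decode("utf-8")


def parses(xml_text: str) -> bool:
    """Return True if xml_text is accepted by the XML parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    # The serializer does not validate attribute names, so the bad name is
    # written anyway, and the result cannot be read back.
    bad = ET.tostring(ET.Element("image", {"2023": "x"}), encoding="unicode")
    print(bad, "->", "parses" if parses(bad) else "parse error")

    # With escaping, the same category name survives a write/read round trip.
    good = ET.tostring(
        ET.Element("image", {escape_attribute_name("2023"): "x"}),
        encoding="unicode",
    )
    for key in ET.fromstring(good).attrib:
        print(good, "->", unescape_attribute_name(key))
```

Because "_" is always escaped and every escape sequence has a fixed length of three characters, unescaping is unambiguous, and spaces and underscores come back exactly as they went in.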