https://bugs.kde.org/show_bug.cgi?id=418873

            Bug ID: 418873
           Summary: For .torrent files with file names in cyrillic
                    win-cp1251 encoding, KTorrent creates files with UTF8
                    replacement symbols EF BF BD
           Product: ktorrent
           Version: 5.1
          Platform: Kubuntu Packages
                OS: Linux
            Status: REPORTED
          Severity: normal
          Priority: NOR
         Component: general
          Assignee: joris.guis...@gmail.com
          Reporter: m...@mwg.dp.ua
  Target Milestone: ---

Created attachment 126807
  --> https://bugs.kde.org/attachment.cgi?id=126807&action=edit
( copy of the .torrent file from
https://rutracker.org/forum/viewtopic.php?t=1587139 )

SUMMARY

I downloaded this .torrent file from
https://rutracker.org/forum/viewtopic.php?t=1587139 . I am attaching the copy
that KTorrent created in ~/.local/share/ktorrent/tor1/torrent . When I give it
to KTorrent to download, it creates all files and folders with every Cyrillic
character replaced by the UTF-8 sequence EF BF BD, the encoding of U+FFFD
REPLACEMENT CHARACTER (
https://apps.timwhitlock.info/unicode/inspect?s=%EF%BF%BD ). When I view the
torrent file with 'less', I see a UTF-8 Cyrillic name for every file, along
with (I suppose) the same name encoded in Windows cp1251.
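
For illustration, here is a minimal Python sketch of one way such output could
arise: decoding cp1251 bytes as if they were UTF-8, with invalid bytes
replaced. This is an assumption about the mechanism, not a confirmed
description of what KTorrent does, and the sample name is hypothetical (it is
not taken from the attached torrent).

```python
# Hypothetical sample name; not taken from the attached .torrent file.
name = "Привет"

# The bytes a cp1251-encoded copy of the name would contain.
raw_cp1251 = name.encode("cp1251")

# Assumed failure mode: the cp1251 bytes are decoded as UTF-8, and every
# byte that is not valid UTF-8 is replaced by U+FFFD.
mangled = raw_cp1251.decode("utf-8", errors="replace")

# Writing the mangled name to a UTF-8 filesystem stores EF BF BD per character.
on_disk = mangled.encode("utf-8")

print(raw_cp1251.hex(" "))
print(on_disk.hex(" "))
```

For this sample, each Cyrillic letter is a single cp1251 byte that is invalid
as UTF-8 in this context, so each one turns into one EF BF BD triple, which
matches the pattern described above.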

STEPS TO REPRODUCE
1. Get the .torrent file from the URL above, or use the attached copy
2. Ask KTorrent to download the files and folders described by this .torrent
3. Inspect the resulting file and folder tree

OBSERVED RESULT

When I run 'find', 'ls' or 'pwd' in the downloaded file and folder tree, I see
question marks inside diamond or hexagonal shapes. When I pipe the output
through 'xxd', I see the UTF-8 triple EF BF BD repeated for every such
character.
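
The same check can be scripted instead of reading 'xxd' output by eye. The
following is a rough sketch; the function name find_mangled is hypothetical,
and it only assumes that the names on disk contain the byte sequence
EF BF BD.

```python
import os

REPLACEMENT = b"\xef\xbf\xbd"  # UTF-8 encoding of U+FFFD

def find_mangled(root):
    """Return paths (as bytes) under root whose own name contains U+FFFD."""
    hits = []
    # Walking with a bytes path makes os.walk yield raw byte names,
    # so no decoding of the filenames is involved.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            if REPLACEMENT in name:
                hits.append(os.path.join(dirpath, name))
    return hits
```

Calling find_mangled on the download directory would list every affected file
and folder.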

EXPECTED RESULT

Files and folders should be named according to the UTF-8 names specified
inside the .torrent file. The cp1251-encoded names should either be ignored,
or converted to UTF-8, or (the least of evils) created in the filesystem as
raw cp1251 byte sequences, which could then be renamed to their UTF-8
equivalents with an additional script.
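
As a sketch of what such an additional script might look like (for the last
option only, where the names were created as raw cp1251 bytes), the following
Python renames entries that are not valid UTF-8. The function names are
hypothetical, and the "not valid UTF-8 means cp1251" test is a heuristic
assumption: a cp1251 name can occasionally also be valid UTF-8. It cannot
recover names that were already replaced by EF BF BD, since the original
bytes are lost in that case.

```python
import os

def to_utf8_name(raw: bytes) -> bytes:
    """Return raw unchanged if it is valid UTF-8, else re-encode from cp1251."""
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Assumption: anything that is not UTF-8 is cp1251. This raises
        # UnicodeDecodeError for bytes that cp1251 leaves undefined.
        return raw.decode("cp1251").encode("utf-8")

def rename_tree(root):
    """Rename cp1251-named files and folders under root to UTF-8 names."""
    # Bottom-up order: a directory's contents are handled before the
    # directory itself is renamed by its parent's iteration.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root), topdown=False):
        for name in dirnames + filenames:
            fixed = to_utf8_name(name)
            if fixed != name:
                os.rename(os.path.join(dirpath, name),
                          os.path.join(dirpath, fixed))
```

It would be prudent to try this on a copy of the tree first, since the sketch
does not check whether the target name already exists.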

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Ubuntu 20.04 (development version)
KDE Plasma Version: 5.18.3
KDE Frameworks Version: 5.67.0
Qt Version: 5.12.5

ADDITIONAL INFORMATION
