https://bugs.kde.org/show_bug.cgi?id=458516
Bug ID: 458516
Summary: Spaces in content filenames causes a second copy of the book's content to be shown; TOC points to the second copy.
Product: okular
Version: 22.08.0
Platform: Other
OS: Linux
Status: REPORTED
Severity: normal
Priority: NOR
Component: EPub backend
Assignee: okular-de...@kde.org
Reporter: duane-t...@evenson.ca
Target Milestone: ---

Created attachment 151708
  --> https://bugs.kde.org/attachment.cgi?id=151708&action=edit
source html file

SUMMARY
An epub file whose zip archive contains html content files with spaces in their names is shown with its content doubled, and the TOC points to the second copy.

STEPS TO REPRODUCE
1. Create an epub file with spaces in component filenames
   1.1. Create an html file named "te st.html" (with a space):
        nano "te st.html"

        <html><body>
        <h1>Chapter 1</h1>
        <h1>Chapter 2</h1>
        <h1>Chapter 3</h1>
        <h1>Chapter 4</h1>
        <h1>Chapter 5</h1>
        </body></html>
   1.2. Convert it to an epub file (with spaces in component filenames):
        ebook-convert "te st".{html,epub}
   1.3. Review the contents:
        unzip -l "te st.epub"
2. View with okular:
        okular "te st.epub"

OBSERVED RESULT
The reader shows Title Page, Chapter 1, Chapter 2, Chapter 3, Chapter 4, Chapter 5, Chapter 1, Chapter 2, Chapter 3, Chapter 4, Chapter 5.
The table of contents points to the second occurrence, so Chapter 1 is on page 7.

EXPECTED RESULT
The reader should show Chapter 1, Chapter 2, Chapter 3, Chapter 4, Chapter 5.
The TOC should place Chapter 1 on page 2.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: 5.19.4-arch1-1 x86_64 GNU/Linux
Window Manager: jwm 2.3.7-3

ADDITIONAL INFORMATION
Manually removing the spaces from the component file names (te st_split_000.html, etc.) and editing content.opf and toc.ncx to remove the space and %20 in references corrects the problem.
ebook-viewer does not share this problem.
There is no doubling of content references in either content.opf or toc.ncx.
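The manual workaround described under ADDITIONAL INFORMATION can be sketched in Python. This is only an illustration and not part of okular or calibre: the function names (`strip_spaces`, `fix_references`, `remove_spaces_from_epub`) are made up here. It assumes that references appear as `href="..."` or `src="..."` attributes in the .opf and .ncx files, that those files are UTF-8, and that removing spaces does not make two member names collide.

```python
import re
import zipfile

# Attribute values that carry file references in content.opf / toc.ncx
# (assumption: double-quoted href= and src= attributes only).
_REF = re.compile(r'\b(href|src)="([^"]*)"')


def strip_spaces(ref: str) -> str:
    """Remove literal spaces and percent-encoded spaces (%20) from a name."""
    return ref.replace("%20", "").replace(" ", "")


def fix_references(xml_text: str) -> str:
    """Rewrite href/src attribute values so they contain no space or %20."""
    return _REF.sub(
        lambda m: f'{m.group(1)}="{strip_spaces(m.group(2))}"', xml_text
    )


def remove_spaces_from_epub(src_path: str, dst_path: str) -> None:
    """Copy an EPUB, renaming members and fixing .opf/.ncx references."""
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        # infolist() preserves archive order, so "mimetype" stays first.
        for info in src.infolist():
            data = src.read(info)
            if info.filename.endswith((".opf", ".ncx")):
                data = fix_references(data.decode("utf-8")).encode("utf-8")
            # Keep each member's original compression method.
            dst.writestr(
                strip_spaces(info.filename), data,
                compress_type=info.compress_type,
            )
```

For the file from the steps above, this would be called as `remove_spaces_from_epub("te st.epub", "test.epub")`. Links inside the html files themselves are left untouched, since the report only mentions editing content.opf and toc.ncx.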
Playing around with it:

If I have:
    test_split_000.html
    te st_split_001.html
    te st_split_002.html
    test_split_003.html
    te st_split_004.html

and edit content.opf to have the lines:
    <item id="html5" href="test_split_000.html" media-type="application/xhtml+xml"/>
    <item id="html4" href="te st_split_001.html" media-type="application/xhtml+xml"/>
    <item id="html3" href="te st_split_002.html" media-type="application/xhtml+xml"/>
    <item id="html2" href="test_split_003.html" media-type="application/xhtml+xml"/>
    <item id="html1" href="te st_split_004.html" media-type="application/xhtml+xml"/>

and edit toc.ncx, changing the lines:
    <content src="te%20st_split_000.html"/>
    ...
    <content src="te%20st_split_003.html"/>
to:
    <content src="test_split_000.html"/>
    ...
    <content src="test_split_003.html"/>

The book shows Title Page, Chapter 1, Chapter 2, Chapter 3, Chapter 4, Chapter 5, Chapter 2, Chapter 3, Chapter 5.
Second copies of Chapters 1 and 4 are missing.
The TOC shows Chapters 1-5 pointing to pages 2, 7, 8, 4, 9.
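Assuming the five split files hold Chapters 1 to 5 in order, the result above suggests that only the files whose names still contain a space get a second copy. A small Python sketch for listing such references in content.opf or toc.ncx follows. The helper name `references_with_spaces` is hypothetical, and it assumes references are double-quoted `href`/`src` attributes.

```python
import re
from urllib.parse import unquote

# Assumption: references are double-quoted href= or src= attribute values.
_REF = re.compile(r'\b(?:href|src)="([^"]*)"')


def references_with_spaces(xml_text: str) -> list[str]:
    """Return the percent-decoded href/src targets that contain a space."""
    targets = (unquote(value) for value in _REF.findall(xml_text))
    return [target for target in targets if " " in target]


# The manifest lines from the experiment above.
manifest = """
<item id="html5" href="test_split_000.html" media-type="application/xhtml+xml"/>
<item id="html4" href="te st_split_001.html" media-type="application/xhtml+xml"/>
<item id="html3" href="te st_split_002.html" media-type="application/xhtml+xml"/>
<item id="html2" href="test_split_003.html" media-type="application/xhtml+xml"/>
<item id="html1" href="te st_split_004.html" media-type="application/xhtml+xml"/>
"""
print(references_with_spaces(manifest))
# ['te st_split_001.html', 'te st_split_002.html', 'te st_split_004.html']
```

The three names it reports would correspond to Chapters 2, 3 and 5 under the assumption above, the same chapters that show up twice in the book.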