This failing test seems to be an issue with Python itself, rather than Scrapy. Suggest just changing the test to match Python behavior.
This code calls through to w3lib.encoding.to_unicode, which just boils down to this:

    b"\xef\xbb\xbfWORD\xe3\xab".decode('utf-8', 'replace')

Running that directly gives the same results as the test.

On Python 2:

    >>> b"\xef\xbb\xbfWORD\xe3\xab".decode('utf-8', 'replace')
    u'\ufeffWORD\ufffd'

On Python 3:

    >>> b"\xef\xbb\xbfWORD\xe3\xab".decode('utf-8', 'replace')
    '\ufeffWORD�'

This bug is keeping python3-scrapy out of testing. Can we just update the test to accept this behavior?
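For illustration, here is a rough sketch of what a more tolerant check could look like. It is not the actual Scrapy test: the names RAW, decode_replacing and is_acceptable are made up for this example, and it assumes the only things that may vary between interpreters are the leading BOM and the number of U+FFFD characters produced for the truncated trailing sequence.

```python
# Hypothetical sketch, not taken from the Scrapy test suite.
# The byte string is the one from this report: a UTF-8 BOM, the text
# "WORD", then a truncated multi-byte sequence.
RAW = b"\xef\xbb\xbfWORD\xe3\xab"


def decode_replacing(data):
    """Decode as UTF-8, substituting U+FFFD for undecodable bytes."""
    return data.decode("utf-8", "replace")


def is_acceptable(text):
    """Return True if text is 'WORD' followed only by U+FFFD.

    Assumption: a leading BOM (U+FEFF) may or may not be present, and
    the truncated sequence may become one or more replacement
    characters depending on the decoder.
    """
    body = text.lstrip(u"\ufeff")
    if not body.startswith(u"WORD"):
        return False
    tail = body[len(u"WORD"):]
    return len(tail) >= 1 and set(tail) == {u"\ufffd"}


if __name__ == "__main__":
    print(is_acceptable(decode_replacing(RAW)))
```

A check along these lines would accept the output shown above on both interpreters, at the cost of no longer pinning down the exact number of replacement characters.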