On Wed, Apr 13, 2011 at 08:46, darklow <[email protected]> wrote:
> Maybe someone knows a quick and dirty fix, at least to skip strings with
> such invalid byte sequences while there is no official fix, so I can finish
> the import?
> Can we detect invalid byte characters?
Hi again,
actually my problem is that I'm unable to reproduce this bug. :-)
Using PostgreSQL and SQLObject, my run goes smoothly.
I have downloaded the 'actors.list.gz' file today, so it's possible that some
garbage was removed.
Anyway, the previously proposed solution was obviously flawed, since
the problem is in the _character_ names, not in the person names.
So, let's edit the imdbpy2sql.py file again and change the lines around
line 1540 so that they become:
    movieid = CACHE_MID.addUnique(title)
    if role is not None:
        roles = filter(None, [x.strip() for x in role.split('/')])
        for role in roles:
            role = role.replace('\xec\x8c\xa0', '') # TEMPORARY FIX
            cid = CACHE_CID.addUnique(role)
            sqldata.add((pid, movieid, cid, note, order))
Maybe this will help... who knows? :-)
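
As for detecting invalid bytes in general, here is a minimal sketch in
Python 3 syntax (the script itself is Python 2, so take it as an illustration
only). It assumes that the database expects UTF-8, which this thread does not
confirm; is_valid_utf8 and drop_invalid_utf8 are hypothetical helper names,
not part of IMDbPY. Note that the three bytes removed by the temporary fix
above happen to form a valid UTF-8 sequence, so a check like this would not
flag them; it only catches byte strings that cannot be decoded at all.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw can be decoded as UTF-8 (assumed target encoding)."""
    try:
        raw.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True


def drop_invalid_utf8(raw: bytes) -> bytes:
    """Return raw with every undecodable byte silently dropped."""
    # errors='ignore' skips the offending bytes instead of raising.
    return raw.decode('utf-8', errors='ignore').encode('utf-8')


if __name__ == '__main__':
    samples = [b'Caf\xc3\xa9 owner', b'Caf\xe9 owner', b'\xec\x8c\xa0']
    for sample in samples:
        print(repr(sample), is_valid_utf8(sample), repr(drop_invalid_utf8(sample)))
```

Dropping bytes silently loses data, so it is probably only acceptable as a
stopgap to get the import finished.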
--
Davide Alberani <[email protected]> [PGP KeyID: 0x465BFD47]
http://www.mimante.net/
_______________________________________________
Imdbpy-help mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/imdbpy-help