Thanks, Ariel.
I reran mwdumper with --progress=1, and here are the final lines of output:
33 pages (0.594/sec), 25,370 revs (456.336/sec)
33 pages (0.594/sec), 25,371 revs (456.329/sec)
33 pages (0.594/sec), 25,372 revs (456.315/sec)
33 pages (0.593/sec), 25,373 revs (455.718/sec)
33 pages (0.593/sec), 25,374 revs (455.695/sec)
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 2048
at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.scanContent(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at javax.xml.parsers.SAXParser.parse(SAXParser.java:392)
at javax.xml.parsers.SAXParser.parse(SAXParser.java:195)
at org.mediawiki.importer.XmlDumpReader.readDump(XmlDumpReader.java:88)
at org.mediawiki.dumper.Dumper.main(Dumper.java:142)
77.4%
Michael
On Mon, May 20, 2013 at 1:09 PM, Ariel T. Glenn <[email protected]> wrote:
> On Sun, 19-05-2013, at 23:43 +0200, Michael Tsikerdekis wrote:
>
>
> > $ 7za e -so enwiki-20130503-pages-meta-history1.xml-p000006887p000009316.7z |
> >   java -server -jar mwdumper-1.16.jar --format=sql:1.5 |
> >   gzip -vc > temp.sql.gz
>
> <snip>
>
> > 31 pages (0.647/sec), 24,000 revs (500.584/sec)
> > 33 pages (0.655/sec), 25,000 revs (495.835/sec)
> > Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 2048
> > at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
> > at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
> > at org.apache.xerces.impl.XMLEntityScanner.scanContent(Unknown Source)
>
>
> Can you please rerun mwdumper with the additional argument
> --progress=1
> which should tell us the exact number of revisions processed before it
> dies?
>
> Thanks,
>
> Ariel
>
> _______________________________________________
> MediaWiki-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
>