Hi everyone,

I am trying to restore the revision table from the Wikipedia dumps. As far
as I understand, the file I need is probably
enwiki-XX-pages-logging.xml.gz.

I've downloaded the file and I am using the 1.16 version of mwdumper from
https://integration.wikimedia.org/ci/job/MWDumper-package/org.wikimedia$mwdumper/

When I execute the following I get this error:
java -server -jar mwdumper.jar --format=sql:1.5 enwiki-20130503-pages-logging.xml.gz | gzip -vc > enwiki-latest-pages-articles.sql.gz
Exception in thread "main" java.lang.IllegalArgumentException: Unexpected <id> outside a <page>, <revision>, or <contributor>
        at org.mediawiki.importer.XmlDumpReader.readId(XmlDumpReader.java:329)
        at org.mediawiki.importer.XmlDumpReader.endElement(XmlDumpReader.java:204)
        at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEndElement(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
        at javax.xml.parsers.SAXParser.parse(SAXParser.java:392)
        at javax.xml.parsers.SAXParser.parse(SAXParser.java:195)
        at org.mediawiki.importer.XmlDumpReader.readDump(XmlDumpReader.java:88)
        at org.mediawiki.dumper.Dumper.main(Dumper.java:142)
  0.0%

mwdumper works well with other XML dump files (the 7z ones), but not with
this one. I also tried a couple of other pages-logging XML files, including
ones from other language Wikipedias.

Does anyone know what this error means and why it occurs with this specific
file?
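
To narrow this down, one thing that could be checked is which element names
sit directly under the root of the dump. Below is a minimal Python sketch
(not part of mwdumper; the names count_top_level, count_in_gzip and the
limit parameter are made up for illustration). If the logging dump wraps its
records in something other than <page>, which is only a guess based on the
error message, the counts would show it.

```python
import gzip
import io
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def count_top_level(stream, limit=1000):
    """Count element names directly under the root of an XML stream.

    Reads at most `limit` top-level elements so that a multi-gigabyte
    dump does not have to be parsed completely.
    """
    counts = Counter()
    depth = 0
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        if event == "start":
            depth += 1
            continue
        depth -= 1
        if depth == 1:  # elem is a direct child of the root element
            counts[local_name(elem.tag)] += 1
            elem.clear()  # drop the element's children to keep memory low
            if sum(counts.values()) >= limit:
                break
    return counts


def count_in_gzip(path, limit=1000):
    """Hypothetical convenience wrapper for a gzipped dump file."""
    with gzip.open(path, "rb") as stream:
        return count_top_level(stream, limit)
```

Calling count_in_gzip("enwiki-20130503-pages-logging.xml.gz") would print
nothing by itself; it returns a Counter that can be printed or compared
against the same call on a dump that mwdumper converts without problems.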

PS: I've also tried to build mwdumper:
git clone https://gerrit.wikimedia.org/r/p/mediawiki/tools/mwdumper.git mwdumper

However, I couldn't use make or ant, since there was no build.xml or
Makefile in the repository.

I appreciate any help you can give me with this.
_______________________________________________
MediaWiki-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
