Hi Rob,

I'm a committer and PMC member on the POI project, and I'm interested in integrating with the ODF Toolkit. Apache would be a good home for this codebase, and many ASF projects would benefit from it.

For POI I see the following potential benefits:

 - A common Java API for manipulating spreadsheet, document and presentation files independently of the format (binary MS Office files, OOXML files or ODF files).
 - Extending POI tools to support ODF. For example, POI's formula evaluation module lets you calculate the results of formulas in Excel sheets. I'm sure it would be a popular feature for ODF too.
 - Unified interfaces for text extractors. These are a conduit to content analysis toolkits like Apache Tika.

I think we should start a new TLP, with the possibility that the ODF Toolkit and POI converge into one project in the future. I thought about including the ODF Toolkit as part of Apache POI, but that is not a good idea at the moment. POI's niche is Java APIs for Office documents; the Conformance Tools and the C#/.NET library don't fit into it. Also, the ODF Toolkit is Maven-driven and POI is Ant-driven.

Yegor

On Mon, Jun 27, 2011 at 11:42 PM, Rob Weir <apa...@robweir.com> wrote:
> I'm cc'ing the POI and OpenOffice projects, inviting them to join this
> discussion on the Incubator general list: general@incubator.apache.org
>
> When we were discussing the OpenOffice proposal a few weeks ago I
> mentioned that there was another set of technology called the ODF
> Toolkit, that we might want to bring to Apache as well. I heard some
> enthusiasm for this at the time, but I didn't have the bandwidth to
> put together another proposal. Now I do. I'd like to pitch the idea,
> and see if there is still interest in having a formal incubation
> proposal submitted, and if so, identifying a Champion and Sponsor for
> the proposal.
>
> Note that this would not be a fork. The ODF Toolkit Union Steering
> Committee met this morning and agreed to propose moving to Apache.
>
> As you probably know, ODF == Open Document Format, an open standard
> document format for office documents.
> The ODF standard is created at
> OASIS and then sent to ISO/IEC JTC1 for transposition into an
> International Standard. ODF 1.0 was first published in 2005. ODF 1.1
> came out in 2007. And ODF 1.2 is "Candidate OASIS Standard" awaiting
> final approval in OASIS, probably by end of September. ODF 1.2 is
> what most applications are supporting today. OpenOffice,
> LibreOffice, Symphony, KOffice/Calligra Suite use ODF as their native
> format. Other applications, including Microsoft Office, Corel
> WordPerfect and Google Docs, offer some degree of import/export
> support. ODF 1.2 is the version also supported by the ODF Toolkit.
>
> The ODF Toolkit Union maintains the following toolkits, all of them
> under the Apache 2.0 license:
>
> 1) ODFDOM is a Java-based typed DOM API, relatively low level, a 1-to-1
> mapping to the ODF schema. In fact, much of the code is generated by
> processing the schema.
>
> http://odftoolkit.org/projects/odfdom/pages/Home
>
> 2) Simple Java API for ODF is a high level wrapper of ODFDOM. So
> operations that might require several DOM-level operations, like
> deleting a column in a spreadsheet, are a single operation in the
> Simple API. Search and replace, copying slides from one presentation
> to another, adding hyperlinks to a selection, etc., are top level
> operations.
>
> http://simple.odftoolkit.org/
>
> 3) The Conformance Tools project is also in Java, and includes an
> online conformance checker of ODF documents, which can also be run in
> command line mode.
>
> http://odftoolkit.org/projects/conformancetools/pages/Home
>
> 4) XSLTRunner and XSLT Runner Task allow easy use of XSLT transforms
> with ODF documents.
>
> http://odftoolkit.org/projects/conformancetools/pages/ODFXSLTRunner
>
> 5) AODL is a C#/.NET library for ODF
>
> http://odftoolkit.org/projects/aodl/pages/Home
>
> I think there is natural synergy with Apache, especially with the Java
> components.
> For example, I could see publishing pipelines involving
> the ODF Toolkit with PDFBox, Batik, FOP, and POI. Having these tools
> under a common license, in one place, has obvious benefits.
>
> Moving this project over would not be a large technical effort:
> Mercurial ==> SVN, some simple website/wiki migration, 30 or so
> pages, a few mailing lists and Bugzilla databases. It is currently on
> the Kenai infrastructure, so similar to OpenOffice, just much, much
> smaller in scale.
>
> I'm open as to whether this would be best eventually as a TLP or as
> part of an existing project, like POI or even OpenOffice. I'm leaning
> a little toward having this as a TLP, but I'm open to other ideas.
>
> Also, since this is already an open source project with all code under
> Apache 2.0, I assume no SGA is required?
>
> So please let me know if you agree that Apache would be a good
> location to further develop the ODF Toolkit libraries.
>
> Regards,
>
> -Rob