[Tomcat Wiki] Update of "Development/NestedFilesystem" by jboynes

Apache Wiki Sun, 05 Apr 2015 14:27:06 -0700

The "Development/NestedFilesystem" page has been changed by jboynes:
https://wiki.apache.org/tomcat/Development/NestedFilesystem

New page:
Java uses the JAR file format for packaging application components. Originally 
used simply for packaging classes and their associated resources, it is now 
used for package types that allow embedding of other packages such as:

 * Web applications (WAR files) that may contain JAR files with classes and/or 
web fragments
 * Resource adapters (J2CA RAR files) that may contain JAR files with classes 
or native libraries
 * Enterprise archives (EAR files) that may contain JAR files, WARs or RARs 
(with their embedded JARs and libraries)

This nesting is typically handled by expanding the packages onto the
filesystem, where they can be accessed using the standard JDK APIs. This has
the advantage that every resource contained in the package can be identified
by a URL using a scheme supported directly by the JDK (either the "file"
protocol or the "jar" protocol). However, it requires a writable filesystem
with space to hold the extracted packages, and the extraction takes time.

To avoid unpacking the archive, alternative mechanisms have been built that use 
custom URLs and !ClassLoader implementations to access their content. Examples 
of these are the "jndi" scheme used in previous versions of Tomcat or the 
"onejar" scheme used by the One-Jar project. These custom schemes may not be 
recognized by framework libraries and may be handled incorrectly or 
inefficiently. This is compounded by schemes deriving from the "jar" scheme 
with its use of non-hierarchical URIs that require special handling.

This proposal explores an alternative implementation based on the use of the 
NIO !FileSystem library introduced in Java 7.

A prototype implementation is available in Tomcat's sandbox at 
http://svn.apache.org/viewvc/tomcat/sandbox/niofs/

= Requirements =

The design is predicated on the ability to create a !FileSystem to provide a 
fully-functional view of an archive's content from a !Path referring to an 
archive. !Paths to entries in that !FileSystem may be used as the basis for 
other archive !FileSystems. Essentially, an archive can be mounted as a 
!FileSystem and any archives it contains can in turn be mounted to form a 
nested hierarchy of !FileSystems.

== Functional Requirements ==
 * A !FileSystem view of an archive may be created by calling the 
newFileSystem(Path) method on the provider.
   * The !FileSystem underlying the Path must support random access via the 
!SeekableByteChannel returned from newByteChannel()
 * The provider's newByteChannel() operation must return a !SeekableByteChannel 
that supports random access
 * A !FileSystem view of an archive may be created by calling the 
newFileSystem(URI) method on the provider.
   * The URI must be able to be converted to a Path using the Paths.get(URI) 
API.
   * The !FileSystem backing such a Path must meet the constraints defined for 
newFileSystem(Path)
 * The URIs for Paths returned by the provider must use standard URI syntax and 
support resolving of relative references

== Non-Functional Requirements ==
 * The provider will be identified by the URI scheme "archive"
 * The provider should avoid unnecessary buffering of data in memory or on disk
   * Buffering modes should be configurable by the user
 * Performance should be comparable to that achievable by extracting the 
archive to disk
   * Mount performance should be comparable to the time and resources taken to 
extract the archive's content
   * File open performance should be comparable to the time taken to open a 
file on the default filesystem
   * File read performance should be comparable to the time taken to read from 
a file on the default filesystem
   * File seek performance should be comparable to the time taken to position 
within a file on the default filesystem

= Implementation =

== Zip Structure ==
PKWARE's documentation on the format can be found at 
http://www.pkware.com/documents/casestudies/APPNOTE.TXT

A Zip file is organized as a series of file entries each consisting of a header 
followed by data, followed by a series of "central directory" entries that 
reference the individual file entries, followed by an "end of central directory" 
or EOCD record that can be used to reference the central directory. An 
application wishing to access a random entry must work backwards from the end 
of the file to locate the EOCD record, seek to and scan the central directory 
entries, then seek to the individual file entry.
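
As a rough illustration of the backwards search described above, the following
sketch locates the EOCD record through a !SeekableByteChannel and reads the
entry count and central directory offset from it. The class name
`EocdLocator` is hypothetical and is not taken from the prototype; the field
offsets follow PKWARE's APPNOTE, and Zip64 and multi-disk archives are
deliberately ignored.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration only; not part of the prototype.
final class EocdLocator {
    static final int EOCD_SIGNATURE = 0x06054b50; // bytes "PK\005\006" read little-endian
    static final int EOCD_FIXED_SIZE = 22;        // EOCD record without its trailing comment
    static final int MAX_COMMENT_SIZE = 0xFFFF;   // comment length is a 16-bit field

    /** Returns { total entry count, central directory offset } from the EOCD record. */
    static long[] locate(SeekableByteChannel channel) throws IOException {
        long size = channel.size();
        int tailSize = (int) Math.min(size, EOCD_FIXED_SIZE + MAX_COMMENT_SIZE);
        ByteBuffer tail = ByteBuffer.allocate(tailSize).order(ByteOrder.LITTLE_ENDIAN);
        channel.position(size - tailSize);
        while (tail.hasRemaining() && channel.read(tail) >= 0) {
            // keep reading until the tail of the archive is buffered
        }
        // Work backwards from the last position at which a record could start.
        for (int i = tailSize - EOCD_FIXED_SIZE; i >= 0; i--) {
            if (tail.getInt(i) != EOCD_SIGNATURE) {
                continue;
            }
            // Guard against a signature that merely appears inside the comment:
            // a real record's comment must run exactly to the end of the file.
            int commentLength = tail.getShort(i + 20) & 0xFFFF;
            if (i + EOCD_FIXED_SIZE + commentLength == tailSize) {
                long entries = tail.getShort(i + 10) & 0xFFFF;
                long directoryOffset = tail.getInt(i + 16) & 0xFFFFFFFFL;
                return new long[] { entries, directoryOffset };
            }
        }
        throw new IOException("End of central directory record not found");
    }

    public static void main(String[] args) throws IOException {
        Path archive = Files.createTempFile("eocd-demo", ".zip");
        try (ZipOutputStream zip = new ZipOutputStream(Files.newOutputStream(archive))) {
            zip.putNextEntry(new ZipEntry("a.txt"));
            zip.write(new byte[] { 1, 2, 3 });
            zip.closeEntry();
        }
        try (SeekableByteChannel channel = Files.newByteChannel(archive)) {
            long[] eocd = locate(channel);
            System.out.println(eocd[0] + " entries, central directory at offset " + eocd[1]);
        } finally {
            Files.delete(archive);
        }
    }
}
```

Because the search only needs `size()`, `position(long)` and `read()`, it
should work on any channel that supports random access, not just channels
backed by the default filesystem.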

Individual file entries may be uncompressed (i.e. STORED) or compressed using 
the DEFLATE algorithm (the Zip format allows other methods, but the JDK only 
supports DEFLATE). Data in STORED entries may be accessed directly once the 
entry's offset within the archive has been retrieved from the central directory 
entry. However, DEFLATE stores data as a series of blocks of unknown length so 
positioning within a deflated entry may involve following the block chain from 
the beginning.
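
To make the STORED case concrete, the sketch below reads a byte range from a
STORED entry by seeking directly to it. It assumes the caller already knows
the offset of the entry's local file header (normally taken from the central
directory entry); the class and method names are hypothetical, and the header
offsets follow PKWARE's APPNOTE.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.SeekableByteChannel;

// Hypothetical helper for illustration only; not part of the prototype.
final class StoredEntryReader {
    static final int LOCAL_HEADER_SIGNATURE = 0x04034b50; // bytes "PK\003\004"
    static final int LOCAL_HEADER_FIXED_SIZE = 30;        // header without name and extra field

    /**
     * Reads {@code length} bytes starting at {@code position} within the data of
     * the STORED entry whose local file header begins at {@code headerOffset}.
     */
    static byte[] read(SeekableByteChannel channel, long headerOffset, long position, int length)
            throws IOException {
        ByteBuffer header = ByteBuffer.allocate(LOCAL_HEADER_FIXED_SIZE).order(ByteOrder.LITTLE_ENDIAN);
        readFully(channel, headerOffset, header);
        if (header.getInt(0) != LOCAL_HEADER_SIGNATURE) {
            throw new IOException("No local file header at offset " + headerOffset);
        }
        if (header.getShort(8) != 0) { // compression method 0 means STORED
            throw new IOException("Entry is not STORED; direct positioning is not possible");
        }
        int nameLength = header.getShort(26) & 0xFFFF;
        int extraLength = header.getShort(28) & 0xFFFF;
        // The entry data follows the variable-length name and extra field.
        long dataStart = headerOffset + LOCAL_HEADER_FIXED_SIZE + nameLength + extraLength;
        ByteBuffer data = ByteBuffer.allocate(length);
        readFully(channel, dataStart + position, data);
        return data.array();
    }

    private static void readFully(SeekableByteChannel channel, long position, ByteBuffer buffer)
            throws IOException {
        channel.position(position);
        while (buffer.hasRemaining()) {
            if (channel.read(buffer) < 0) {
                throw new EOFException("Unexpected end of archive");
            }
        }
    }
}
```

A DEFLATE entry has no equivalent shortcut: reaching an arbitrary position
generally means inflating from the start of the entry, which is why the sketch
rejects anything that is not STORED.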

Zip files may or may not contain entries corresponding to folders in the 
filesystem. This is typically transparent to applications using a !ClassLoader 
to load classes or resources but to provide a !FileSystem view these nodes must 
be synthesized if not present.
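
One possible way to synthesize the missing nodes is to derive every ancestor
directory from the entry names and subtract those that are already present.
The sketch below is an assumption about how this could be done, not a
description of the prototype; `DirectorySynthesizer` is a hypothetical name,
and directory names are assumed to end in "/" as is conventional in Zip
archives.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of the prototype.
final class DirectorySynthesizer {

    /** Returns every directory (with trailing '/') implied by the entry names. */
    static SortedSet<String> impliedDirectories(Collection<String> entryNames) {
        SortedSet<String> directories = new TreeSet<>();
        for (String name : entryNames) {
            int slash = name.lastIndexOf('/');
            while (slash > 0) {
                String directory = name.substring(0, slash + 1);
                if (!directories.add(directory)) {
                    break; // already known, so all of its ancestors are known too
                }
                slash = name.lastIndexOf('/', slash - 1);
            }
        }
        return directories;
    }

    /** Returns the implied directories that have no explicit entry in the archive. */
    static SortedSet<String> missingDirectories(Collection<String> entryNames) {
        SortedSet<String> missing = impliedDirectories(entryNames);
        missing.removeAll(entryNames);
        return missing;
    }

    public static void main(String[] args) {
        System.out.println(missingDirectories(Arrays.asList(
                "META-INF/", "META-INF/MANIFEST.MF", "WEB-INF/lib/a.jar")));
        // prints [WEB-INF/, WEB-INF/lib/]
    }
}
```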

Zip files may contain "zombie" entries that are not located in the central 
directory. These can be created when a zip file is updated to replace or remove 
items. An application that sequentially scans a Zip file may handle this 
incorrectly (returning the older or deleted entry) unless it continues to scan 
the entire archive to verify that the entry still appears in the central 
directory; because that is inherently inefficient, most do not. In practice, 
application packages are generally not modified after the initial build, so 
this error is unlikely.

Zip files may contain data in addition to the archive entries such as 
executable code for self-extracting archives or text comments describing the 
archive.

== URI Structure ==



== ToDos ==
= Limitations in standard JDK APIs =

== Zip Handling ==

The JDK APIs dealing with Zip archives have not been updated to work with the 
NIO File APIs:
 * ZipFile's constructor only accepts a java.io.File or a String relating to a 
file on the default filesystem
 * A zip entry may only be accessed as a sequential !InputStream rather than a 
!SeekableByteChannel
 * A !ZipInputStream may only be constructed over an !InputStream rather than a 
!SeekableByteChannel
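
In practice these limitations mean that an archive reachable only through a
Path (for example one held inside another !FileSystem) has to be read with a
!ZipInputStream layered over `Files.newInputStream()`. The sketch below shows
what that looks like; the class name `SequentialZipReader` is hypothetical.
Note that it has to stream past every preceding entry to reach the one it
wants.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration only; not part of the prototype.
final class SequentialZipReader {

    /** Reads the named entry by scanning the archive from the beginning. */
    static byte[] readEntry(Path archive, String entryName) throws IOException {
        try (ZipInputStream in = new ZipInputStream(Files.newInputStream(archive))) {
            ZipEntry entry;
            // There is no way to seek: every earlier entry is read and discarded.
            while ((entry = in.getNextEntry()) != null) {
                if (!entry.getName().equals(entryName)) {
                    continue;
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int count;
                while ((count = in.read(buffer)) != -1) {
                    out.write(buffer, 0, count);
                }
                return out.toByteArray();
            }
        }
        throw new NoSuchFileException(archive.toString(), entryName, "entry not found");
    }
}
```

Since the scan follows the local file headers rather than the central
directory, this approach is also the one exposed to the "zombie" entry problem
described earlier.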

The JDK implementation of Zip support uses the native zlib library and maps the 
archive into memory for direct access and performance. This has implications:
 * The archive content must be accessible from native code
 * Memory mapping a file on some operating systems (e.g. Microsoft Windows) 
asserts a mandatory file lock which interferes with the "overwrite to 
re-deploy" mechanism often used in development environments

== URL Support ==

The jar scheme syntax is now 
[[https://www.iana.org/assignments/uri-schemes/prov/jar|formally defined]] as:
{{{
jar:<url>!/[<entry>]
}}}

The JDK libraries such as JarURLConnection do not permit the <url> component to 
be another jar: URL; nesting is specifically not supported.

As this does not comply with the syntax rules for standard hierarchical URIs, 
custom parsing code is required in order to perform URL manipulation. For 
example, to resolve a relative URI such as a class reference, the jar: URL must 
be parsed to extract and manipulate the <entry> component.
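
The difference can be seen with java.net.URI, which treats a jar: URI as
opaque and therefore cannot resolve a relative reference against it. The
sketch below compares that with an ordinary hierarchical "file" URI; the paths
used are made up for illustration and `UriResolutionDemo` is a hypothetical
name.

```java
import java.net.URI;

// Illustration only; the archive and class paths are invented examples.
final class UriResolutionDemo {

    /** Resolves a relative reference against a base URI using java.net.URI. */
    static URI resolve(String base, String relative) {
        return URI.create(base).resolve(relative);
    }

    public static void main(String[] args) {
        String jarBase = "jar:file:/app.war!/WEB-INF/classes/com/example/Foo.class";
        String fileBase = "file:/app/WEB-INF/classes/com/example/Foo.class";

        // The scheme-specific part of a jar: URI does not start with '/', so it is opaque.
        System.out.println(URI.create(jarBase).isOpaque());
        // prints true

        // Resolving against an opaque URI just returns the relative reference unchanged.
        System.out.println(resolve(jarBase, "Bar.class"));
        // prints Bar.class

        // A hierarchical URI resolves the sibling as expected.
        System.out.println(resolve(fileBase, "Bar.class"));
        // prints file:/app/WEB-INF/classes/com/example/Bar.class
    }
}
```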

JarURLConnection's getJarFile API returns a !JarFile which has the same issues 
described in [[#Zip Handling]].

== Built-in "jar" FileSystemProvider ==
To provide an illustrative example of a !FileSystemProvider, Sun/Oracle 
released a demo "ZipFS" for working with Zip archives and a version of this is 
included in the JDK. This implementation inherits some of the limitations from 
above:
 * The archive must be located on the default !FileSystem
 * It uses "jar:" URIs and does not support nesting
 * The !SeekableByteChannel returned by newByteChannel does not support seek 
operations
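
For reference, mounting an archive with the built-in provider looks roughly
like the sketch below, which creates an archive on the default !FileSystem,
writes one entry and reads it back. It assumes a JDK that bundles the zip
provider; `ZipFsDemo` is a hypothetical name, and the sketch does not exercise
nesting or seeking.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;

// Illustration only; uses the JDK's built-in "jar" FileSystemProvider.
final class ZipFsDemo {

    /** Writes text to an entry of a new archive, remounts the archive, and reads it back. */
    static String roundTrip(Path archive, String entry, String text) throws IOException {
        // The built-in provider is addressed with a "jar:" URI wrapping the file URI.
        URI uri = URI.create("jar:" + archive.toUri());

        // "create" asks the provider to create the archive if it does not exist yet.
        try (FileSystem fs = FileSystems.newFileSystem(uri, Collections.singletonMap("create", "true"))) {
            Files.write(fs.getPath(entry), text.getBytes(StandardCharsets.UTF_8));
        }
        // Closing the FileSystem above flushes the archive to disk; mount it again to read.
        try (FileSystem fs = FileSystems.newFileSystem(uri, Collections.<String, Object>emptyMap())) {
            return new String(Files.readAllBytes(fs.getPath(entry)), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path archive = Files.createTempDirectory("zipfs-demo").resolve("demo.zip");
        System.out.println(roundTrip(archive, "/hello.txt", "hello"));
    }
}
```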
