...
Note |
Note that this strategy currently requires either setting an IDLE value or setting the HdfsConstants.HDFS_CLOSE header to false in order to use the BYTES/MESSAGES configuration; otherwise, the file will be closed with each message. |
For example:
Code Block |
hdfs2://localhost/tmp/simple-file?splitStrategy=IDLE:1000,BYTES:5
|
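To make the shape of the splitStrategy value concrete, the following sketch splits a value such as IDLE:1000,BYTES:5 into its type/value pairs. It is an illustration only and not the component's own parser; the class SplitStrategyParser and its parse method are hypothetical names introduced here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of camel-hdfs2: illustrates the "TYPE:value,TYPE:value" format.
final class SplitStrategyParser {

    // Splits a splitStrategy value into an ordered map of strategy type to numeric value.
    static Map<String, Long> parse(String splitStrategy) {
        Map<String, Long> rules = new LinkedHashMap<>();
        for (String token : splitStrategy.split(",")) {
            String[] parts = token.trim().split(":");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected TYPE:value but got: " + token);
            }
            rules.put(parts[0].trim(), Long.parseLong(parts[1].trim()));
        }
        return rules;
    }

    public static void main(String[] args) {
        Map<String, Long> rules = parse("IDLE:1000,BYTES:5");
        System.out.println(rules);
        // Per the note above, BYTES/MESSAGES need an IDLE value (or HDFS_CLOSE set to false).
        System.out.println("IDLE configured: " + rules.containsKey("IDLE"));
    }
}
```

A check like the last line could be used to warn early when BYTES or MESSAGES is configured without IDLE.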
...
Using this component in OSGi
This component is fully functional in an OSGi environment; however, it requires some actions from the user. Hadoop uses the thread context class loader to load resources, and that class loader is usually the bundle class loader of the bundle that contains the routes. There are also some quirks related to the mechanism Hadoop 2.x uses to discover different org.apache.hadoop.fs.FileSystem implementations. Hadoop 2.x uses java.util.ServiceLoader, which looks for /META-INF/services/org.apache.hadoop.fs.FileSystem files defining the available filesystem types and implementations. These resources are not available when running inside OSGi.
As with the camel-hdfs component, the default configuration files need to be visible from the bundle class loader. A typical way to deal with this is to keep a copy of core-default.xml (and, e.g., hdfs-default.xml) in your bundle root. The core-default.xml file can be found in hadoop-common.jar.
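Both visibility requirements described above can be checked with plain JDK calls before any route starts. The following is a minimal sketch under that assumption; the class HadoopResourceCheck and its visibleResources method are hypothetical names, and inside a real bundle you would pass the bundle class loader rather than the thread context class loader used in main.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper: reports which resources a class loader can see.
final class HadoopResourceCheck {

    // Returns every copy of the named resource that the given class loader can see.
    static List<URL> visibleResources(ClassLoader loader, String resourceName) {
        try {
            return Collections.list(loader.getResources(resourceName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        String[] names = {
            "META-INF/services/org.apache.hadoop.fs.FileSystem", // ServiceLoader provider file
            "core-default.xml"                                   // default Hadoop configuration
        };
        for (String name : names) {
            List<URL> found = visibleResources(loader, name);
            System.out.println(name + " -> " + (found.isEmpty() ? "NOT visible" : found.toString()));
        }
    }
}
```

If either resource is reported as not visible, the options described in the following sections apply.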
Using this component with manually defined routes
There are two options:
- Package the /META-INF/services/org.apache.hadoop.fs.FileSystem resource with the bundle that defines the routes. This resource should list all the required Hadoop 2.x filesystem implementations.
- Provide boilerplate initialization code that populates the internal, static cache inside the org.apache.hadoop.fs.FileSystem class:
Code Block |
|
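// Register the filesystem implementations explicitly, because ServiceLoader discovery is unavailable inside OSGi
org.apache.hadoop.conf.Configuration conf = new org.apache.hadoop.conf.Configuration();
conf.setClass("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class, org.apache.hadoop.fs.FileSystem.class);
conf.setClass("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class, org.apache.hadoop.fs.FileSystem.class);
...
// FileSystem.get expects a java.net.URI, not a String; each call populates the static cache
org.apache.hadoop.fs.FileSystem.get(java.net.URI.create("file:///"), conf);
org.apache.hadoop.fs.FileSystem.get(java.net.URI.create("hdfs://localhost:9000/"), conf);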
...
|
Using this component with Blueprint container
There are two options:
- Package the /META-INF/services/org.apache.hadoop.fs.FileSystem resource with the bundle that contains the blueprint definition.
- Add the following to the blueprint definition file:
Code Block |
|
<bean id="hdfsOsgiHelper" class="org.apache.camel.component.hdfs2.HdfsOsgiHelper">
  <argument>
    <map>
      <entry key="file:///" value="org.apache.hadoop.fs.LocalFileSystem" />
      <entry key="hdfs://localhost:9000/" value="org.apache.hadoop.hdfs.DistributedFileSystem" />
      ...
    </map>
  </argument>
</bean>
<bean id="hdfs2" class="org.apache.camel.component.hdfs2.HdfsComponent" depends-on="hdfsOsgiHelper" />
|
This way, Hadoop 2.x will have the correct mapping of URI schemes to filesystem implementations.
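The relationship between the map entries in the blueprint definition and the fs.file.impl / fs.hdfs.impl keys in the manual initialization code can be sketched as follows. This assumes the configuration key is derived from the URI scheme, as the manual example suggests; the helper's actual implementation is not shown in this document, and the class SchemeMappingSketch and its implKey method are hypothetical.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: maps filesystem URIs to "fs.<scheme>.impl" configuration keys.
final class SchemeMappingSketch {

    // Derives the configuration key used in the manual example, e.g. "fs.hdfs.impl".
    static String implKey(String fileSystemUri) {
        String scheme = URI.create(fileSystemUri).getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("URI has no scheme: " + fileSystemUri);
        }
        return "fs." + scheme + ".impl";
    }

    public static void main(String[] args) {
        // Same entries as the blueprint <map> above.
        Map<String, String> fileSystems = new LinkedHashMap<>();
        fileSystems.put("file:///", "org.apache.hadoop.fs.LocalFileSystem");
        fileSystems.put("hdfs://localhost:9000/", "org.apache.hadoop.hdfs.DistributedFileSystem");
        for (Map.Entry<String, String> entry : fileSystems.entrySet()) {
            System.out.println(implKey(entry.getKey()) + " = " + entry.getValue());
        }
    }
}
```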