Hi All,

I have a Pig script that processes sequence files.

The contents of the input sequence files are loaded into a bag, with comma 
as the field delimiter. The problem is that there is no schema associated 
with this bag: each tuple in it has a different schema. There is an 
Input-template file that contains the schema for each kind of tuple in the 
bag.

My question: is there an existing UDF that can associate the schema with 
the tuples in the bag dynamically?

I can write a new UDF for this, but I wanted to check whether one already 
exists that can be reused instead of writing it from scratch.

I have attached two files to this email:

Input.txt:       the data corresponding to the bag
Input-template:  the schema for the various rows in Input.txt

The first column in Input.txt and Input-template is the same, and it is the 
key used for the mapping.


This is a dummy example I tried to create. Let me know if you need any 
additional details.


All I need is a suggestion on whether a UDF applicable to this use case 
exists or not.

Thanks in advance.

Happy Hadooping!!!!

--
Ankur

Input.txt:
FirstName,is,first,line,with,7,fields
Address,is,with,4
Location,am,with,6,fields,ff

Input-template:
FirstName,col1_name,col2_name,col3_name,col4_name,col5_name,col6_name,col7_name
Address,col1_name,col2_name,col3_name
Location,col1_name,col2_name,col3_name,col4_name,col5_name,col6_name
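
The mapping described above, keyed on the first column, can be sketched in 
plain Python to make the intended behaviour concrete. This is only an 
illustration, not an existing Pig UDF: the function names load_templates and 
apply_template are hypothetical, and the handling of length mismatches 
(missing values become None, surplus values are dropped) is an assumption, 
since the sample rows do not all have the same number of values as their 
templates have field names.

```python
def load_templates(lines):
    """Build {record_type: [field names]} from Input-template lines.

    The first comma-separated column is the record type (the mapping key);
    the remaining columns are that record type's field names.
    """
    templates = {}
    for line in lines:
        parts = line.strip().split(",")
        if parts[0]:  # skip blank lines
            templates[parts[0]] = parts[1:]
    return templates


def apply_template(row, templates):
    """Pair each value in a data row with the field name from its template.

    Assumption (not from the original post): if the row has fewer values
    than the template has names, the missing fields are set to None; if it
    has more, the surplus values are dropped.
    """
    parts = row.strip().split(",")
    key, values = parts[0], parts[1:]
    if key not in templates:
        raise KeyError("no template for record type %r" % key)
    names = templates[key]
    record = {}
    for i, name in enumerate(names):
        record[name] = values[i] if i < len(values) else None
    return key, record


if __name__ == "__main__":
    templates = load_templates([
        "Address,col1_name,col2_name,col3_name",
        "Location,col1_name,col2_name,col3_name,col4_name,col5_name,col6_name",
    ])
    for row in ["Address,is,with,4", "Location,am,with,6,fields,ff"]:
        print(apply_template(row, templates))
```

Pig can register Python functions as UDFs through Jython, so logic along 
these lines could in principle be wrapped as one that returns a map per 
tuple; that wiring is not shown here and would need to be checked against 
the Pig version in use.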
