Hi All,

I have a Pig script that processes sequence files. A bag holds the contents of all the input sequence files, with comma as the delimiter. The problem is that no schema is associated with this bag, and each tuple in it has a different schema. An Input-template file contains the schema for each kind of tuple in the bag.

My question: is there an existing UDF that can associate the schema with the tuples in the bag dynamically? I can write a new UDF for this, but I wanted to check whether one already exists that can be reused instead of writing it from scratch.

I have attached two files to this email:

Input.txt: the data corresponding to the bag
Input-template: the schema for the various rows in Input.txt

The first column of Input.txt and of Input-template is the same, and it is the key for the mapping. This is a dummy example I tried to create. Let me know if you need any additional details. All I need is a suggestion on whether a UDF applicable to this use case exists or not.

Thanks in advance. Happy Hadooping!

-- Ankur
Input.txt:
FirstName,is,first,line,with,7,fields
Address,is,with,4
Location,am,with,6,fields,ff

Input-template:
FirstName,col1_name,col2_name,col3_name,col4_name,col5_name,col6_name,col7_name
Address,col1_name,col2_name,col3_name
Location,col1_name,col2_name,col3_name,col4_name,col5_name,col6_name
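
To make the intended mapping concrete, here is a minimal sketch in plain Python of the lookup I have in mind. It is not an existing Pig UDF, and the function names load_template and apply_template are made up for illustration. It assumes the first column is the key in both files and that every line is comma-delimited. The same logic would still have to be wrapped as a Pig UDF (Java or Jython) to run inside the script.

```python
def load_template(lines):
    """Map the key in the first column to the list of column names after it."""
    template = {}
    for line in lines:
        parts = line.strip().split(",")
        if parts and parts[0]:  # skip blank lines
            template[parts[0]] = parts[1:]
    return template


def apply_template(row, template):
    """Pair the values of one comma-delimited row with its template's names."""
    parts = row.strip().split(",")
    key, values = parts[0], parts[1:]
    names = template.get(key)
    if names is None:
        return key, None  # no template entry for this row type
    # zip stops at the shorter list, so unmatched names or values are dropped
    return key, dict(zip(names, values))


# Hypothetical usage with one template line and one data row from the example
template = load_template(["Address,col1_name,col2_name,col3_name"])
print(apply_template("Address,is,with,4", template))
```

In this sketch, rows whose key is missing from the template come back with None instead of a mapping, and a row whose field count differs from its template is silently truncated to the shorter of the two. A real UDF would probably need a stricter policy for both cases.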
