Dear ASF Community,

I’m writing to propose a new project at the Apache Software Foundation. A 
little about myself: I’m currently a senior software engineer on the Affirm ML 
Platform team (possibly switching soon), and I have around 5 years of 
experience at a couple of startups, all in data and machine learning 
infrastructure.
Over the past few years I’ve seen a general pattern of companies building 
their own ML infrastructure, especially feature stores 
<https://www.featurestore.org/>. I have built similar products at all of my 
previous and current companies, including in-house solutions and solutions 
built on the third-party vendor Tecton <https://www.tecton.ai/> and on the 
open source projects Feast <https://github.com/feast-dev/feast> and Feathr 
<https://github.com/feathr-ai/feathr>.
But those products still do not adequately solve the most important part of a 
feature store: transformation, or as we can call it, featurization.
So I’m proposing a new podling project, Featurizer (name can change), to build 
an open source feature platform. It aims to address the challenges of feature 
engineering in machine learning by developing a software framework that can 
automatically extract relevant features from raw data. It will provide a wide 
range of featurization algorithms that can be customized and combined to fit 
the specific needs of different applications. Built around three types of 
features (request-based real-time features, stream features, and batch 
features), the framework is intended to run on different processing engines 
such as Apache Spark, Flink, and Beam, or as microservices.
I’m still new to the podling process. I have made a few contributions before, 
but this is my first time proposing a project. So I’m looking for suggestions, 
feedback, champions, and mentors to help start the project.

Thanks
