Hi KDE developers,

I created a small dataset of code-description pairs from KDE/C++ code to see how a large language model (LLM) can be used to describe KDE code. I fine-tuned a T5 model on this data, and the results are promising. The model is open source and can run on a laptop.

However, the model's accuracy would improve significantly with a larger dataset. I would therefore like to invite anyone who is interested to add to the dataset. You can find the model and a link to the dataset here:

https://github.com/tm243/CodeT5-KDE

Please note that this project is still in its early stages and the results may not be accurate in all cases.

Best,
Thomas