Hi all,

I am Silas Maughan, a Software Engineering BSc student at CODE University of Applied Sciences in Berlin. For the past year and a half I worked as a Computer Engineer at CERN, where I built and operated a production Apache Ozone and Hadoop storage cluster (~100 TB) supporting the High-Luminosity LHC. Part of that role involved directly integrating CERN's existing Spark infrastructure with Ozone, which gave me hands-on experience with the performance and compatibility challenges that arise at that boundary. I have submitted an abstract to Community over Code this October titled "Apache Ozone at CERN: Migrating a Large-Scale Hadoop Storage Infrastructure".

I have submitted a GSoC 2026 proposal for SPARK-55163 (client-side metadata caching for Spark Connect). I have read the SPIP and studied the Spark Connect Python client and AnalyzePlan RPC paths. I would be grateful to connect with the listed mentors and welcome any feedback before the deadline.

Apologies for the late contact! My research contract at CERN concluded last week and my priority was delivering a quality wrap-up of my work for Community over Code.

GitHub: github.com/silvanias
Proposal: submitted via the GSoC portal

Best,
Silas
