Inside Infra: Chris Lambertus --Part I

Sally Khudairi Mon, 11 Jan 2021 06:13:52 -0800

[this interview is available online at https://s.apache.org/InsideInfra-ChrisL ]


Part I of the last of the "Inside Infra" interview series with members of the 
ASF Infrastructure team features Chris Lambertus, who shares his experience 
with Sally Khudairi, ASF VP Marketing & Publicity.

- - -
"...The thing that we're fighting against is the safety and longevity of the 
old technology. For quite some time, our primary concern was that the hardware 
that was running this, which was 15 years old, was going to fail."
- - -

 - What's your name and how is it pronounced?

My name is Chris Lambertus (“Kris Lamb bert uhss”): it's pronounced exactly how 
it's spelled.


 - When and how did you get involved with the ASF?

I've been aware of the ASF probably at least since the inception of the ASF. 
I've been working in IT for quite a long while, and I've been very familiar 
with the ASF projects, because I use them daily in my career. I didn't actually 
get involved with the ASF until a buddy of mine who was working on the 
CloudStack project mentioned to me that (ASF VP Infrastructure) David Nalley 
was looking for somebody to do some contract Infra work. Long story short, I 
talked to David and I tossed in an application. I was eventually hired as a 
part time contractor.


 - ...So CloudStack, we're talking about 2012, when CloudStack first came into 
the Apache Incubator, or was it after that?

I've been aware of the ASF probably since HTTPd, since the original Web server 
came out. I joined the team late 2014.


 - Explain your role within the Infra team —how did you get here? Were they 
looking for someone who specializes in something particular?

My understanding was David was really looking for somebody that had a 
background in production systems engineering and had been doing it for a long 
time in a production environment. That's something that I had been doing since 
1992: I've been essentially a professional production systems administrator. I 
knew that skill set was definitely in line with what David was looking for. I 
think I brought that to the table pretty well. That's basically what I've been 
doing ever since as a contractor.


 - What are you responsible for specifically?

That's a complex question, because the ASF Infra sysadmins are essentially 
responsible for everything. All of us are all responsible for all of the 
things. We do tend to specialize a little bit. My current project is probably 
reengineering the mail system: it's the largest one I'm working on right now. I 
do tend to focus a lot of my efforts on backups. Beyond that, I do a lot of 
JIRA and Confluence work with Gavin (ASF Infra team member Gavin McDonald). But 
Puppet, configuration management, again, all these things are things that all 
the Infra guys support.


 - In past interviews everyone has basically said, “we do everything”. How does 
it work? There's no hierarchy. Everyone does everything. Do queries come in and 
everyone jumps on them? Do you have a round-robin way of getting stuff done? 
How do you manage with so much going on with Infra? How do you cope with that?

“Cope with it” is an accurate term. Each of us has... I don't really want to 
call it a specialty, but definitely a focus. If some question comes in about 
the new mail routing or something that I've been specifically working on, that 
would go into my bucket as a priority. Certain people have history with certain 
types of projects. Gavin (Infra team member Gavin McDonald), for example, has 
been heavily involved with the Continuous Integration infrastructure for many, 
many, many, many years. So, he tends to be the font of knowledge for all things 
CI-related.

We tend to break things up that way. Some of the team members definitely have 
skillsets above and beyond general system administration work. Humbedooh (Infra 
team member Daniel Gruno’s username) is a very skilled programmer. He then ends 
up owning a lot of the software that Infra has developed and he has developed. 
So, questions regarding that, and specialty configurations related to software 
that he's written tend to go into his bucket. Because of the nature of the 
team, and because of the nature of the time zones that we're all in, the 
responsibility of dealing with issues falls on whoever is on call, first of 
all, and then whoever is awake and available, to handle any situation that 
comes up regardless of who "owns" the technology.


 - Describe a typical workday for you.

Apache work for me is basically: I wake up in the morning, hopefully not at 
3:00 in the morning, but get out of bed and plop down in front of the computer. 
Essentially, my lifestyle is I've always been a computer guy. I've always been 
really focused on computer system administration, not only as my work but also 
as a hobby. So, I spend the vast majority of my day behind the computer, 
whether I'm working on Apache stuff or working on other projects, things like 
that. That'll go on until 11:00 at night. So, my "workday" is essentially me 
living my life and doing tasks as they arrive and doing projects as necessary 
and getting things done that need to get done.


 - ...All the things.

Regardless of the time of day, yeah.


 - How do you keep your workload organized? Folks have all sorts of different 
systems. Are you an Evernote type of person, or do you keep your own journal? 
Do you have a certain system to help manage your workload?

Jira is the primary basis for managing my workload with the ASF. We've done a 
lot of work in terms of building technologies around Jira. Our service level 
agreement reporting tools I find extremely useful for seeing what's in the 
queue, what needs to be done, what hasn't been touched in a while, things like 
that. That really drives a lot of my day-to-day efforts in terms of replying to 
tickets and servicing customers.

In addition to that, I also use Jira to track my projects. So, if I have a 
project going on, that's usually a Jira ticket. And then I can go back and 
refer to those and see where things are, what needs to be done. I've never been 
a big one for lists of notes. I do have notes that I keep, but by and large, 
the things that are on the top of my stack remain on the top of my brain at 
the same time. So, I don't feel like I forget a lot of things, but I don't take 
a lot of notes, which is what it is.
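
As an illustration of the kind of queue report described above ("what hasn't 
been touched in a while"), here is a minimal Python sketch. It is not the ASF's 
actual SLA tooling: the Ticket fields, the three-day threshold, and the 
EXAMPLE-n keys are all assumptions invented for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Ticket:
    # Hypothetical fields; a real Jira issue carries far more data.
    key: str
    summary: str
    last_updated: datetime
    resolved: bool = False


def stale_tickets(tickets, now, max_idle=timedelta(days=3)):
    """Return open tickets idle longer than max_idle, oldest first."""
    idle = [
        t for t in tickets
        if not t.resolved and now - t.last_updated > max_idle
    ]
    return sorted(idle, key=lambda t: t.last_updated)


# Made-up queue, evaluated at a fixed point in time for repeatability.
now = datetime(2021, 1, 11, 12, 0)
queue = [
    Ticket("EXAMPLE-1", "Mailing list request", now - timedelta(days=5)),
    Ticket("EXAMPLE-2", "VM request", now - timedelta(hours=6)),
    Ticket("EXAMPLE-3", "Jira permissions", now - timedelta(days=10), resolved=True),
    Ticket("EXAMPLE-4", "Build node offline", now - timedelta(days=4)),
]
stale_keys = [t.key for t in stale_tickets(queue, now)]
```

Sorting by last update puts the longest-neglected ticket first, which matches 
the idea of using the report to decide what to reply to next.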


 - …And then there's people who have everything except the monitors covered in 
Post-its.

I don't do a Post-it thing, but I have little text files everywhere with notes 
and things in them.


 - So, you all have day-to-day tasks that you manage, as well as things that 
require your immediate attention, as well as long term projects. In my earlier 
interviews with other Infra team members, everyone's been saying that I have to 
talk to you, because you're handling “The Email Project”. For those who aren't 
aware, standard operating procedure at the ASF is “if it didn't happen on-list, 
it didn't happen”. So, you have, if I'm understanding this correctly, 21 years’ 
worth of email archives that you're working on. What's going on with this 
project? What are you handling? Why is it so important?

Well, as you know, email is the lifeblood of the Foundation. Everything that 
happens here happens on a list. Because of that, the Foundation has amassed a 
very large quantity of email archives. Those archives are fundamental to the 
provenance of the Foundation. So, maintaining those and keeping those safe and 
available is really a top goal of the Infra team.

The mail project, such as it is, is essentially to upgrade and migrate our 
existing legacy email system to a modern, more supported system. The current 
email system as it stands was engineered by folks, volunteers, some staffers, I 
would guess, over 10 years ago, maybe 15 years ago, running on FreeBSD, which 
we don't really use too much anymore. Actually, we don't really use it at all. 
They used technologies that were interesting at the time, but are perhaps not 
so well supported today. So, a lot of it is modernization.

A lot of it is taking a lot of that old tribal knowledge that really doesn't 
exist anymore and bringing it into the modern era, documenting all the weird 
little settings that we have and all the edge cases that we manage in email, 
management of the list systems, mailing lists and their configuration, and 
making sure that gets upgraded, migrated, modernized. Doing that all in such a 
way that we don't a) lose anything, or b) suffer any downtime. So, it's a large 
project. That's really what I've been working on probably for the better part 
of the last two years, bringing that up to the present era.


 - You’re like the Titan Atlas: carrying the heavens on your shoulders. That's 
a massive, massive undertaking. Is there like a deadline for this —where's the 
end for this project? Is it never ending?

I feel more like Sisyphus than Atlas, but the deadline is as soon as possible. 
The thing that we're fighting against is the safety and longevity of the old 
technology. For quite some time, our primary concern was that the hardware that 
was running the old email system, which was 15 years old, was going to fail. In 
fact, it did. But fortunately, I basically copied the whole thing off to a 
separate colocation facility. So, we had an archive of it when it went down, 
and I was able to bring it all back up.

So, that wasn't a problem. I mean, it was a problem, but it wasn't a disaster 
as it could have been. So, the deadline is as soon as possible. But in reality, 
it's going to work until it stops working. I'm not sure how to better state 
that, because the technology is so old and we really need to get off of it and 
onto new technology. But there's no hard and fast timeline. Nobody's really 
cattle prodding me to get it done, but it's the absolute top priority that I 
have.


 - ...That was actually my follow-up question. Is the “as soon as possible” 
official, or is this something you're setting for yourself because you just 
want to get it done?

Oh, that's definitely an official timeline. Yeah.


 - ...I remember our first email servers were a machine under Brian 
Behlendorf’s desk at the Wired offices. So, we've come a long way since then.

We have, yes.


 - ...You're handling this behemoth. Are you also dealing with the day-to-day 
putting out the fires, as well as everybody else?

Absolutely, yes.


 - The volume and scale of this project seems so huge. Again, the word 'cope' 
keeps coming to mind, because knowing what I know —and I don't even know— it's 
just scratching the tip of the iceberg: it seems astronomical in terms of scale 
and scope. Are you building everything from scratch for this project? Are you 
using any kind of commercial packages? This is a huge overhaul. Tell us more 
about it.

Multitasking has been in my blood for my entire life. I don't typically have a 
problem of splitting my time and my attention and my energies between multiple 
projects. You are absolutely right: this is a titanic project. It's one of the 
reasons why it's taken so long. Like I said, we've been working on this for 
several years at this point. The reason it's taken so long is twofold: One is I 
can't spend 100% of my attention on it or else I would go absolutely crazy. So 
I partition that. I partition my mind and my time, if you will. Just a little 
bit of time here working on this, working on this particular aspect of it, then 
I'll go work on some tickets. Or I'll go work on something else. If I was only 
working on the mail, then other things wouldn't get done, right?

I have to partition it that way. I think the main way I've tackled this type of 
project... Again, my experience in system administration going back so far, 
I've worked on a lot of very large scale projects. So, this is in the middle in 
terms of the scale. But the biggest thing is to break it down into multiple 
components, each as small as you really can make it. The first thing to do is to 
analyze the existing system. "What is it? How is it running? How is it tied 
together? How are these things all related? Where are the pieces? Where are the 
tendrils? How far do they go?" “Write that down.”

I started developing documentation that explained a lot of stuff. There was 
some documentation that existed. I take that and I carry it forward then into 
the new system. Okay, "what things do I want to keep? What things do I HAVE to 
keep? What things are legacy? What things don't we use anymore?" That process 
of discovery, of understanding how it was built, why it was built and what 
we're still using, and what we don't need to use anymore, is probably the vast 
majority of the work--just to understand it. Once that's done, we say, "Can we 
use the old technology, or do we need to use a different technology?"

In the case of the Foundation, we're extremely tied to the way that ezmlm, our 
mailing list system, works. ezmlm is extremely tied to Qmail. So, converting 
those into other tools, basically, I'll say, it's too complicated. With the 
amount of data that we have and the amount of dependence that we have on those 
configurations, migrating it to a different system would be incredibly 
difficult. However, there are modern versions of (and updates for) 
these pieces of software, ezmlm and Qmail.

What I've done is take those packages and build them for modern 
operating systems. I've patched them with current technology, TLS and various 
modern email stuff, and put that into configuration management and built a 
system that deploys all those packages in a reproducible fashion. So, at any 
time, I can just turn on a new machine. I could type in, "This machine is the 
new mail router," and run Puppet, our configuration management software, on it. 
It'll deploy all that software automatically.

That's probably the second part of this huge phase of developing this. The 
phase that we're in right now is testing it to make sure that it works the same 
way as the old one works. Once that's verified, then we can actually look at 
migrating the old data onto the new system and deploying it into production. I 
think that answered your question.
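
The testing phase described above, checking that the new system "works the 
same way as the old one", can be sketched as a side-by-side comparison. This is 
only an illustration in Python: old_route and new_route are hypothetical 
stand-ins, not the real Qmail/ezmlm behaviour, and the addresses are invented.

```python
def compare_behaviour(cases, old_system, new_system):
    """Run each case through both systems; return the cases whose results differ."""
    mismatches = []
    for case in cases:
        expected, actual = old_system(case), new_system(case)
        if expected != actual:
            mismatches.append((case, expected, actual))
    return mismatches


def old_route(address):
    # Hypothetical legacy behaviour: addresses are matched case-insensitively.
    local, _, domain = address.lower().partition("@")
    return f"list:{local}" if domain.endswith("example.org") else "reject"


def new_route(address):
    # Hypothetical rebuilt system with a deliberate bug: it forgets to lowercase.
    local, _, domain = address.partition("@")
    return f"list:{local}" if domain.endswith("example.org") else "reject"


cases = ["dev@example.org", "Users@Example.org", "spam@elsewhere.test"]
mismatches = compare_behaviour(cases, old_route, new_route)
```

Here the only mismatch is the mixed-case address, the sort of undocumented 
edge case that a comparison like this is meant to surface before any data is 
migrated.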


 - I think so, but it made me think of another question: How did this wind up 
being "your" project? Was this assigned to you? Did you jump on it going, 
"Yeah, I'm taking it"? How did you wind up with this?

That is a very good question. I don't really know. I think probably just 
because I had been working with... Back in 2015, maybe, we were actually having 
this exact same discussion: "what do we need to do to migrate this ezmlm, all 
these mail archives, all this stuff to a new modern system?"

One of the things that we looked at was, "Can we transfer this? Can we 
translate this to something like Mailman or some newer type of mailing list 
management system?" We looked at a couple of options. The biggest problem we 
had was that the archivers were terrible. So, Humbedooh basically ended up 
writing this thing that became Pony Mail as the answer to that system. 
Ultimately, that turned out to be a great effort. I think it's going to take us 
a long way. But in the end, I was the one to continue to work on the email 
system. For whatever reason, I guess it just became my thing. Maybe because I 
was the only one willing to do it. I don't know.


 - ...Is the legacy system going to be powered by Apache Pony Mail (incubating) 
at some point, or is it already in the process?

So yeah, lists.apache.org is our primary advertised archive system. That is 
what we're telling people to use. In terms of what happens to the old system, 
that remains a little bit under discussion. I don't know the ultimate 
disposition of that, but the current plan is lists.apache.org will be the 
primary access to the mail archives.


 - I noticed that Pony Mail goes back quite a bit, but it didn't originally go 
back as far as it does now in terms of the archives. I’m curious to see if 
everything eventually is going to be migrated to it.

Yes, yes, we actually have a plan to load the previous archives in there. We 
loaded a subset when we first started it up. I believe they go back to 2012 
right now. So yes, we do have a plan to load the previous archives.


 - Great. I understand some Apache projects and their communities are always 
asking for new services. How does Infra decide which products you support? Who 
gets assigned to take the lead on introducing new services or new products? I 
understand that you develop your own custom solutions as well. How do these get 
divvied up? Is everything in queue? How does it get done?

When you're talking about a project requesting a service, I think the first 
thing we look at is, "Is this service extremely specific to this one project, 
or is it something that has broad appeal to the Foundation?" If it has broad 
appeal to the Foundation, we've got multiple requests for it, it's a service 
that we feel we can provide, given the amount of time that we have available, 
then it's something that we would consider doing.

Obviously, there's a lot of other thought that goes into that in terms of what 
it is, what it does, what it needs to do, who needs access to it, that we have 
to evaluate. But generally, if it has broad appeal to the Foundation, it would 
be something we would look into. If it doesn't, if it's something that's very 
specific to a certain project, what we typically recommend is that a project 
request their own VM. They can run the service themselves. That's typically how 
we’ve approached that in the past.
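
The rule of thumb in this answer can be written down as a small decision 
function. This Python sketch is a simplification under stated assumptions: the 
ServiceRequest fields and the returned labels are invented for illustration, 
and the real evaluation weighs more factors (access, purpose, and so on).

```python
from dataclasses import dataclass


@dataclass
class ServiceRequest:
    # Hypothetical fields summarising what the answer says Infra looks at.
    name: str
    requesting_projects: int
    broad_appeal: bool
    team_has_capacity: bool


def triage(req):
    """Apply the broad-appeal rule of thumb to a service request."""
    if not req.broad_appeal:
        # Project-specific services: the project requests and runs its own VM.
        return "recommend project-run VM"
    if req.requesting_projects > 1 and req.team_has_capacity:
        return "consider as Foundation-wide service"
    return "needs further evaluation"


shared = triage(ServiceRequest("shared-tool", 4, True, True))
niche = triage(ServiceRequest("niche-tool", 1, False, True))
unsure = triage(ServiceRequest("popular-tool", 3, True, False))
```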


 - Has the team been in a situation where you're like, "Hey, this is a really 
cool thing, let's bring it in," and then throw it on projects or see if anybody 
wants to do it? Does the converse happen also, where you guys have insight as 
to something that's hot and new and you think that would be a great fit for 
Infra, but you have to find a "problem" to connect it to; or is that not 
something that you deal with? Is your work all reactive, or do you ever come 
into a situation where you say, "Long-term planning: we want to introduce 
something brand new"?

I think probably up until maybe five years ago, the work was almost entirely 
reactive. But the team and the processes that David (Nalley) has now put together 
have really pushed us more in a direction of future planning, of taking the 
time and taking the mindset of, "What can we do long term to better support 
projects?" I think selfserve.apache.org is a great example of that. That's 
something that grew out of a small subset of tools. We got very positive 
feedback from Committers and Projects about using selfserve.apache.org.

That tool has grown extensively since it was developed. I think one of the best 
things that we provided recently is the .asf.yaml system, which allows projects 
to essentially set up 90% of their project metadata in GitHub. It lets them set 
labels. It lets them set notifications. It lets them set all kinds of things, 
all self-service. So, it's taken a huge load off of Infra in terms of 
responding to tickets, and also put a lot of that control in the hands of the 
projects. That's been incredibly well received. It's definitely, I think, one 
of the best things that we've done for projects in a while. I think it's a 
fantastic tool.


 - That's great. Now, it's also a new way of institutionalizing, so to speak, 
of "scratch your own itch", but in a way that's a common deployment. You 
can do your own thing, but there's a common mechanism or method of doing it, 
because before —it was like the Wild West, back in the '90s— everyone's just 
doing their own thing. It didn't really matter, but it wouldn't scale properly: 
you guys can’t really support them because everyone's doing something and it 
was a one-off. It's interesting to see that selfserve.apache.org has 
standardized or unified that process.

Yeah, and one of the things that I really like about it too is because we have 
so many different projects --they're so varied-- the people that work on them 
are so varied in their skill sets and their desires and their interest level 
and their skill level and all this. What we want to be able to do is empower 
projects to use the tooling and take advantage of the skill sets that they have 
available. So, we don't want to arbitrarily enforce, "Oh, you must use this 
particular technology," but we also don't want random technologies to 
proliferate, like you alluded to, the Wild West. So, it's a very refined 
balance between, "How do you allow projects to do their own thing in a way 
that's scalable and supportable?" That's a complex task. It's difficult to 
manage. I think Self-Serve (selfserve.apache.org) goes a long way to support 
that.


 - Speaking of Self-Serve and other solutions the team is providing, the 
strategic process of figuring out where to go —direction— I know you have David 
(Nalley), I know you have Greg (ASF Infrastructure Administrator Greg Stein). 
Does the entire team participate in this? How does this work: is it top-down, 
or is it bottom-up? Are you guys saying, "Hey, there's a new thing that we 
should do"? I presume you don't have an annual strategy, but rather an ongoing 
rolling process; how do strategic decisions get made?

It's a collaborative effort, for sure. I think we do have an annual —when we 
get together at ApacheCon, we do tend to have a lot of discussions about 
strategy and about future direction. That's one of the things that we try to do 
as Infra with our team meetups, and with ApacheCon as well, to get together in 
person in a room and talk about where we're going to go, what we want to do. I 
say the process is collaborative, because sometimes it comes from the direction 
of Greg, or the Board, or David, or whoever. Sometimes it comes from a staffer 
saying, "Hey, it'd be cool if we could do this."

Sometimes it comes from Projects, or Committers, and they say, "Hey, can we go 
in this direction? I think it would be useful for X reasons." It just depends. 
By and large, the decisions for a future strategy are brought up by whoever 
thinks of it and are discussed within the team at a peer-to-peer level, right? 
We have very few situations where Greg or David or somebody will come down and 
say, "Thou shalt do it this way." Yeah, very uncommon to have that happen. It's 
a very collaborative environment, which I appreciate and works well for me.


 - So, in light of the pandemic, you guys didn't have your face-to-face. Did 
you do a virtual annual meeting? Or did it just not happen?

Well, we have a weekly team meeting. Yeah, we didn't do any virtual thing 
beyond that.


 - With so many projects at the ASF now, with 350 projects and initiatives and 
growing and so few of you in Infra, you must be constantly learning new things. 
How do you keep abreast of what's new? How do you close your skills gap? How do 
you stay ahead of everything?

I follow a few mailing lists, discussion boards, Reddit, and other similar 
sources. I typically learn new things when I need to implement new technology 
to solve a problem. "How do we provide 'X'?" I’ll go research it and learn that 
way. I also find out about new things from my hobby projects or other work.  


 - ...It's not like "I want to take Blah University to become certified in X" 
or anything like that, right? I mean, you’d do that from your own interest, but 
it's not something that's required of the job unless it comes up, right?

No, that's never been required of the job. Personally, I'm very much a 
self-directed learner. If I'm interested in something, I will absolutely seek 
out the resources to learn it. I will say that there's not a lot of time for that 
stuff, at least not for me. I got a lot going on, right? So, having the time to 
sit down and take a class or go through that process, I find very difficult. I 
don't really learn that way very well either. So, class-based learning has 
never been for me.


 - ...Not linear. Yeah.

Yeah. So, typically, if I want to learn something new —I've been trying to 
learn Python, because it's definitely a gap for me— I find it incredibly 
difficult, because it's very hard for me to sit down and watch a video on 
programming, right? I got to have a reason. I got to have a thing to do. I need 
to have a project that requires it. And then I go and I figure it out.


 - ...Got it. So, it's purpose-driven education. You need an end result.

Yeah, exactly. That's how I've always operated.

[END OF PART I]
