This week we welcome Thomas Wouters (@Yhg1s) as our PyDev of the Week! Thomas is a core developer of the Python language. He is very active in open source in general and has been a director of the Python Software Foundation in the past. Let's spend some time getting to know him better!
Can you tell us a little about yourself (hobbies, education, etc.)?
I'm a self-taught programmer, a high school dropout, a core CPython developer, and a former PSF Board Director from Amsterdam, The Netherlands. I've been playing with computers for a long time, starting when my parents got a Commodore 64 with a couple books on BASIC, when I was 6 or 7. I learned a lot by just playing around on it. Then in 1994 I discovered the internet, while I was still in high school. This was before the days of the World Wide Web or (most) graphics, but I was sucked in by a programmable MUD, a text-based "adventure" environment, called LambdaMOO. LambdaMOO lets you create your own part of the world by making rooms and objects, and programming their behaviour, in a programming language that was similar to Python (albeit unrelated to it). One thing led to another and I dropped out of high school and got a job at a Dutch ISP (XS4ALL), doing tech support for customers. A year later I moved to the Sysadmin department, where I worked for ten years. I gradually moved from system administration to programming, even before I learned about Python.
Besides working with computers I also like playing computer games of all kinds, and non-computer games like board games or card games. I do kickboxing, and I have a bunch of lovely cats, about whom I sometimes tweet. I'm pretty active on IRC as well, and I'm a channel owner of #python on Freenode. I also keep ending up in administration-adjacent situations, like the PSF Board of Directors and the Python Steering Council, not so much because I like it but because I don't mind doing it, I'm apparently not bad at it, and it's important stuff that needs to be done well.
Why did you start using Python?
While working at XS4ALL, a friend with whom I worked on a TinyMUX-based MUD knew I preferred LambdaMOO, and mentioned that Python was a lot like the MOO language, at least conceptually. I knew BASIC, Perl and C at the time, but I wasn't particularly happy about any of them. The MOO language had always just seemed more logical, more natural to me. When I finally tried Python, it was an eye-opening experience. Mind you, this was in 1999, and it was Python 1.5.2; compared to Python 3.8, practically the stone age. Still, I fell in love with it instantly. It just fit my brain so nicely. That I was able to easily (compared to the state of the art at the time) use C libraries, or even embed Python in C programs, was an extra bonus. I didn't get to use Python much at work until I moved to Google, but I did all kinds of hobby programming with it.
Part of why I kept programming is that I found out how much fun it was to work on CPython itself. I had worked on a number of different C code bases at the time, and CPython's was the cleanest, most readable, most enjoyable by far. I learned a lot from just reading it, and implementing small features that people asked for. I took a proof-of-concept patch from Michael Hudson to add augmented assignment (+=, *=, etc.) and ran with it, getting guidance from Guido himself on a lot of the details. It took a lot longer than I expected, but that ended up becoming PEP 203, and made me a core Python developer. I was just in time to help found the Python Software Foundation as well, which we did in 2001, and I was on its Board of Directors the first three years (and again later).
My involvement with Python also meant I got offered a job at Google, working remotely from Amsterdam, to help maintain Python internally. The rest of my team is in California, and I get to visit them regularly. The work at Google is complex and diverse and challenging enough that after 13 years I'm still not bored.
What other programming languages do you know and which is your favorite?
I know C intimately, and C++ and Java fairly well. I use all three (along with Python) at my day job. I'm also somewhat familiar with Haskell, D, Objective-C, and Perl. I used to use Perl a lot at my previous job, but I never enjoyed it and I don't remember much of it now. My favourite language by far is Python, but C is in a firm second place. I'm familiar enough with its pitfalls that I know when I don't know something, and where to look it up. I'm also under no illusions about its drawbacks, and would be quite happy if everybody moved to more memory-safe languages. Modern C++ -- at least the set of features we're encouraged to use at work -- is also growing on me. The main issue I have with C++ is that it has so many features you *shouldn't* use. At work we have a lot of tooling to help us make those choices, which greatly improves the C++ experience.
What projects are you working on now?
Most of my projects are internal to Google, where we have a somewhat unusual environment. My main project is similar to Gregory Szorc's PyOxidizer, although a couple years older and tightly coupled with Google's build and deployment infrastructure. The open-source work I'm most excited about is my experimentation with replacing CPython's refcounting with a traditional mark-and-sweep GC. I haven't had much time in the last year to write the actual code for it, but I keep thinking it through, and I think it's a realistic way forward for CPython. It would allow Larry Hastings' Gilectomy to happen without too much extra overhead, and while it'll mean radically changing the CPython C API, I think we can do that gradually in a good way. It'll probably take ten years before it's all done, but it'll be worth it.
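(Editor's note: for readers unfamiliar with the distinction Thomas draws here, the sketch below shows how today's CPython behaves, not his experimental work. It assumes CPython specifically. The `Node` class and both function names are made up for illustration. An object whose reference count drops to zero is freed right away, while a reference cycle survives until the separate cyclic collector runs.)

```python
import gc
import weakref


class Node:
    """Hypothetical helper: an object that can point at another Node."""

    def __init__(self):
        self.other = None


def refcount_frees_immediately():
    """Return True if dropping the last reference frees the object at once."""
    node = Node()
    probe = weakref.ref(node)  # a weak reference does not keep the object alive
    del node                   # last strong reference gone: refcount reaches zero
    return probe() is None


def cycle_needs_collector():
    """Return (alive_before_collect, alive_after_collect) for a two-object cycle."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collection from interfering with the demo
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a    # a and b now reference each other
        probe = weakref.ref(a)
        del a, b                   # refcounts stay above zero because of the cycle
        alive_before = probe() is not None
        gc.collect()               # the cyclic collector finds and frees the cycle
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(refcount_frees_immediately())
    print(cycle_needs_collector())
```

Extension modules keep these counts up to date by hand through the C API, which is one reason moving away from refcounting would affect that API so deeply.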
Which Python libraries are your favorite (core or 3rd party)?
I'll have to shamelessly plug Google's pytype here. Like mypy, it's a PEP 484 type checker, but it actually infers types. You don't have to annotate any of your code for it to be useful, and it can generate type annotations for your code (so that you can then also check the code with mypy, for example). I'm not directly involved with its development, but it's one of the products of my team at Google and I had front-row seats at all stages of it. The people working on it are awesome -- everyone in my team is -- and pytype has caught lots of would-be runtime errors through static analysis at Google. Getting the benefits of PEP 484 type annotations without having to figure out all the annotations is a huge win.
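(Editor's note: a minimal sketch of the kind of unannotated code this is about. The function names are hypothetical and no real pytype output is reproduced here. The point is only that nothing below carries a type annotation, yet the types are fully determined by the code: `parse_port` returns an `int`, so `describe` ends up concatenating `str` and `int`. An inferring checker is designed to report that mismatch before the program runs; without one, it surfaces only as a `TypeError` at runtime.)

```python
def parse_port(text):
    # No annotations, but int(...) makes the return type inferable as int.
    return int(text)


def describe(port):
    # Bug: concatenating str and int raises TypeError when port is an int.
    return "port " + port


def describe_fixed(port):
    # Converting explicitly makes the types line up.
    return "port " + str(port)


def run():
    """Call the buggy function and report what happens at runtime."""
    try:
        return describe(parse_port("8080"))
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    print(run())
    print(describe_fixed(parse_port("8080")))
```

Written by hand, the equivalent PEP 484 annotations would be along the lines of `def parse_port(text: str) -> int`. Generating that kind of signature is the step the answer above says pytype can do for you.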
As a core developer of Python, what parts of the language do you maintain?
I'm one of the people most intimate with the parser and compiler. There's several of us, including Guido, so I wouldn't say I maintain them, though. I'm also very familiar with the C internals of CPython -- how objects and types are implemented -- but again, so are many others. And because I used to be a sysadmin I also involved myself with the build system early on, and some of the modules that deal with OS services (pty, ncurses, readline). Since those modules don't get a lot of attention and require fairly niche knowledge, I guess I maintain those?
Which library is the trickiest to work on and why?
The system-dependent ones, especially ones that have been around for a long time. CPython has been around for a long time, and has in the past supported more operating systems than it does now, at least actively. We used to have support for VMS and DOS as well as various flavours of UNIX that are no longer around. As a consequence, some of the C code is riddled with conditionals that we can no longer test. Often it's to work around a bug or issue that may be fixed in current versions of those OSes, but we have no way to test if that's the case.
Also, some of the older modules with (in hindsight) badly designed APIs and middling test coverage. If it's not clear how you should use a module, or the test cases we have don't reflect real usage, it can be very dangerous to make changes to them. Even bugfixes or minor improvements may end up breaking use-cases you didn't realise existed. Because all changes you make can only be fixed (or rolled back) by releasing an entire new Python version, you really want to avoid that. It's one of the reasons modules in the standard library change so little, and why it's often better to use third-party alternatives instead: they can evolve much faster, and they can be rolled back individually.
What's the best part of contributing to open source?
The best part of contributing to open source is the people. I came into this because I liked the technical challenge, the puzzles, the making things work -- but I stayed because of the people. Back in 1999-2000 it was the python-list and python-dev mailing lists that made me realise this, and in particular people like Guido van Rossum, Tim Peters, and Barry Warsaw. They were (and are) welcoming, friendly, playful, and painfully smart (except Barry, obviously), but they openly shared their knowledge, trying to elevate others as much as they could. I've met many, many others who do the same thing, and it's something I try to strive for as well. Diversity, inclusivity, openness, friendliness, it's all very important to me. I would not have stayed involved in the Python community if it hadn't been such a warm, welcoming blanket to wrap yourself in.
Thanks for doing the interview, Thomas!