Here’s an alternate wording for that quote: code is about communication, both of instructions to machines and of concepts to developers.
With that in mind, I find most developers underestimate the value of clean code. Clean code is a blessing because readers can focus on the task at hand. In contrast, messy code is full of distractions; readers waste mental clock cycles thinking about latent bugs or inefficiencies.
The ability to draft clean code is not necessarily a product of experience or of investing a bunch of time manually reviewing code diffs; it comes with the right developer toolchain.
Origins
While working on the biotech startup Synthego’s innovation team,
I often worked in a green field without code review.
Paired with the reality that immunotherapy research involves
multi-week experiments and costly reagents,
preventable software failures (e.g. TypeErrors) were excruciating.
I became interested in automated quality assurance,
both to prevent bugs and to receive feedback on my programs.
Modern conveniences such as Python typing,
pyproject.toml-centralized configs,
and GitHub Actions didn’t yet exist,
though flake8 plugins were aplenty.
I began building a toolchain just for myself,
discovering where the tools helped, missed, and hindered.
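To make the TypeError example concrete, here is a minimal sketch of the kind of failure that only surfaces at runtime but that a static type checker (for example mypy) would flag before an experiment starts. The function names and values are hypothetical illustrations, not code from Synthego, and the built-in generic annotations assume Python 3.9 or newer.

```python
def total_volume_ul(volumes_ul: list[float]) -> float:
    """Sum reagent volumes, in microliters (hypothetical helper)."""
    return sum(volumes_ul)


def run_experiment(raw_volumes: list[str]) -> float:
    # Bug: the volumes were read as strings and never converted to floats.
    # sum() starts from the integer 0, so adding a str raises TypeError,
    # but only once this line actually runs.
    # A static type checker reports the list[str] vs. list[float] mismatch
    # without running anything.
    return total_volume_ul(raw_volumes)


def run_experiment_fixed(raw_volumes: list[str]) -> float:
    # Convert at the boundary so the types line up.
    return total_volume_ul([float(v) for v in raw_volumes])
```

Calling run_experiment raises TypeError at the point of use, which in a long-running workflow can be weeks in; the annotated signature lets tooling surface the same mistake at edit time.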
Adoption
Over the years, I passively expanded the toolchain,
growing it for new discoveries like codespell or black preview rules,
disabling aspects found unhelpful,
and contracting it alongside consolidations into ruff.
Like a plumber’s toolbox, I repeatedly ported it to my current setting:
production software at Synthego,
group projects in Stanford computer science,
and open-source software facing easily prevented bugs.
I started receiving gratitude from colleagues
who had not experienced such a toolchain before,
and this began to give me ideas.
Leaving Synthego for FutureHouse in early February 2024,
I created the GitHub repo configurator
with a two-part vision:
1. Creating a single toolchain fluent for both early-stage exploratory research and production software.
2. Building an automated yet flexible system to propagate tooling updates across repos.
Fast-forward in time, and item 1 has come true. FutureHouse has adopted the configuration system org-wide. It’s been used for research projects such as PaperQA2, aviary, and ether0, and it has been adopted universally by our platform engineering team.
Item 2 has some weak competition from tools
like cookiecutter or nitpick,
but really the ultimate solution will be an AI agent.
Maybe someday there will be time to properly tackle item 2.