tuhuynh.com

The WET codebase

In the world of coding, there’s a common mantra: “Don’t Repeat Yourself”, or DRY. We’re often told that copying code is really bad and that we should always strive for elegant, reusable solutions. However, Dan Abramov introduces another perspective in his talk: the “WET” codebase, built on the idea that “wasting time is actually good!”.

DRY vs WET codebase

The DRY principle tells developers not to repeat things and to create smart, modular code. It’s logical; why write the same thing repeatedly when you can make a clever abstraction, right? Well, that’s where the tricky situation comes in.

Using DRY is good, but don’t overdo it. If you try too hard to remove every repetition and make things too fancy, it may seem great at first. However, as Dan says, being too focused on this can make your code not simpler but really hard to understand.

Instead of eagerly trying to remove every duplication, he suggests that sometimes it’s okay to take a step back, let go of the DRY obsession, and invest a bit of time in understanding the problem at hand thoroughly.

Abstraction is a core concept in programming that allows us to manage complexity through breaking problems down into reusable components. However, as with many powerful principles, it also introduces risks if not applied carefully.

In this post, I’ll explain how minor, incremental changes can cause abstractions to gradually “bloat” over time, using a real-world example. I’ll also go into the specific risks this introduces and share practices you can adopt to catch abstraction decay before it strangles your codebase.

Bloating Abstraction

Dan shared a story from early in his career where he fell into the trap of over-abstracting code through well-intentioned but incremental changes.

Two modules contained similar asynchronous logic. To avoid duplication, this was abstracted out. Later, a synchronous version was also needed. Rather than copy the logic, a flag was added to support both cases.

Subsequently, bugs and other small issues uncovered minor differences between the cases, which were patched with conditional logic. Over time, the abstraction accumulated special cases until its original purpose was obscured.

This demonstrates how abstractions intended to reduce duplication can unintentionally grow too “generic” through a series of otherwise reasonable modifications. Each change makes sense in isolation but collectively bloats the abstraction.

Example

To demonstrate this pattern, consider a simple reusable async task function:

function asyncTask(data) {
  // async logic
}

Adding a synchronous version through a flag seems like a minor change:

function task(data, async=true) {
  if (async) {
    // async logic
  } else { 
    // sync logic
  }
}

Later, a format option is added since some data is XML:

function task(data, async=true, format='json') {

  if (format === 'xml') {
    // xml logic
  }

  // existing logic
}

This demonstrates how well-intentioned modifications accumulate, distorting an abstraction’s initial purpose over time.
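One possible way out, sketched here under the assumption that the three cases really have diverged, is to inline the abstraction back into separate concrete functions. The names `processJsonAsync`, `processJsonSync` and `processXmlSync` are hypothetical, and the XML handling is a deliberately naive stand-in rather than a real parser:

```javascript
// Hypothetical concrete functions that replace task(data, async, format).
// Each one handles exactly one case, so a fix in one cannot break the others.

// Async JSON case: the original purpose of the abstraction.
async function processJsonAsync(data) {
  const parsed = JSON.parse(data);
  return { source: 'json-async', keys: Object.keys(parsed) };
}

// Sync JSON case: repeats the parsing line instead of sharing a flag.
function processJsonSync(data) {
  const parsed = JSON.parse(data);
  return { source: 'json-sync', keys: Object.keys(parsed) };
}

// XML case: a naive scan for opening tags, standing in for real XML logic.
function processXmlSync(data) {
  const tags = [...data.matchAll(/<([A-Za-z][\w-]*)[^>]*>/g)].map((m) => m[1]);
  return { source: 'xml-sync', keys: [...new Set(tags)] };
}
```

There is visible duplication between the two JSON functions, but each one can now change without a new flag or conditional leaking into the others.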

Cost of Abstraction

Accidental Coupling - Over-abstracted code creates unintended dependencies between modules, restricting the flexibility of code refactoring. Imagine you have a house with rooms that rely too much on each other. If you change something in one room, it unexpectedly affects another, making it tough to renovate without causing problems elsewhere.

Additional Indirection - Abstraction layers introduce unnecessary complexity, making code harder to understand and maintain due to excessive levels of redirection. Think of building a tower with many unnecessary floors. Climbing up and down becomes confusing, just like navigating through overly abstract code becomes confusing and takes more time.

Inertia - Abstraction also makes it hard to change things in your code. This isn’t just about technology, it’s more about how people work together. What often happens is that you start with an idea to make things simpler by introducing an abstraction. Over time, it becomes more and more complicated. However, no one really has the time or courage to fix or simplify this complexity, especially someone new to the team. You might think it’s easier to just copy and paste the code, but firstly, you might not know exactly how to do that because the code is unfamiliar. Secondly, you don’t want to be seen as suggesting bad practices. Who wants to be known as the person who suggests copying and pasting code? How long do you think you’ll be part of that team if that’s your approach?

Too much abstraction causes problems: parts of the code unexpectedly rely on each other, and the code becomes complicated and hard to change. It’s like carrying a heavy load that makes your codebase slow and tricky to work with.
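Accidental coupling, the first cost above, can be illustrated with a small sketch. Everything here is made up for illustration: `formatLabel` is a hypothetical shared helper, and `shippingLabel` and `greeting` are two unrelated callers that ended up depending on it only because their code once looked similar.

```javascript
// Hypothetical shared helper used by two unrelated modules.
function formatLabel(user) {
  return `${user.lastName}, ${user.firstName}`.toUpperCase();
}

// Module A: shipping labels genuinely need the upper-case "LAST, FIRST" form.
function shippingLabel(user) {
  return formatLabel(user);
}

// Module B: the greeting reuses the helper, so any change made to
// formatLabel for the sake of greetings also changes shipping labels.
function greeting(user) {
  return `Hello ${formatLabel(user)}`;
}

// Decoupled alternative: the greeting owns its own (tiny) formatting.
function greetingDecoupled(user) {
  return `Hello ${user.firstName}`;
}

const ada = { firstName: 'Ada', lastName: 'Lovelace' };
console.log(shippingLabel(ada));     // LOVELACE, ADA
console.log(greeting(ada));          // Hello LOVELACE, ADA
console.log(greetingDecoupled(ada)); // Hello Ada
```

With the decoupled version, the greeting can change freely while `shippingLabel` keeps the only dependency on `formatLabel`.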

Abstract Responsibly

Embracing responsible abstraction involves several key practices, as highlighted by Dan:

Test Concrete Code: Rather than focusing on how the code works behind the scenes, test what it actually achieves or performs. For example, if you’re building a banking app, test that the transfer money function works as intended, regardless of how it’s implemented internally.

Delay Adding Layers: Avoid making things overly complex by adding extra layers of abstraction too early. Wait until you see clear patterns or repetition before abstracting things.

Be Ready to Inline It: Sometimes, when the code becomes hard to understand due to too much abstraction, be prepared to inline the logic to maintain simplicity and clarity.

These rules are essential in ensuring that there is the right balance between simplification and abstraction layers. This ensures that the code remains simple to work with and can be changed easily as the project grows. It’s like building something where each part has a job without making things too complicated, so it’s easy to change stuff when required.
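The first practice, testing what the code achieves, can be sketched with the banking example. This is a minimal illustration, not a real banking API: `transfer`, the in-memory `accounts` object and `checkTransfer` are all hypothetical names.

```javascript
// Hypothetical in-memory transfer; its internals could be rewritten freely.
function transfer(accounts, fromId, toId, amount) {
  if (amount <= 0) throw new Error('amount must be positive');
  if (accounts[fromId] < amount) throw new Error('insufficient funds');
  // Return a new object so the caller's original balances are untouched.
  return {
    ...accounts,
    [fromId]: accounts[fromId] - amount,
    [toId]: accounts[toId] + amount,
  };
}

// A behavior-level check: it only looks at balances before and after,
// never at which helpers or layers transfer() uses internally.
function checkTransfer() {
  const before = { alice: 100, bob: 20 };
  const after = transfer(before, 'alice', 'bob', 30);
  return after.alice === 70 && after.bob === 50 && before.alice === 100;
}

console.log(checkTransfer()); // true
```

Because the check depends only on the outcome, it should keep passing whether `transfer` stays a single function or is later split up or inlined.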

Java ecosystem’s Over-Abstraction

stacktrace

Ever tried fixing a problem in Java and ended up staring at a stack trace that looks as long as a whole book?

In the landscape of programming languages, Java has long been praised for its robustness and versatility. However, that strength can turn out to be a double-edged sword when it leads to over-abstraction and, with it, excessive complexity.

Take, for instance, the handling of HTTP protocols in Java using Servlets. While Servlets are fundamental in Java for building web applications, their implementation often introduces an extensive abstraction layer that can complicate what should be a straightforward process.

Moreover, Java Servlets (and their ecosystem) often encourage a highly layered approach, where logic is hidden behind multiple abstractions. While abstraction is useful for coping with complexity, excess layering can make the actual flow of the code unclear and prevent developers from understanding what it is doing.

Java usually seeks to create reusable and maintainable code. However, even the simplest things, such as handling HTTP through Servlets, can become quite complicated because of the number of layers involved. This illustrates how adding complexity to trivial tasks makes the work harder for developers.

Go’s Code Generation

The DRY principle has been a guiding light in software development for years. This orthodoxy has started to be challenged by modern languages and frameworks, such as Go, which often prefer generating code over hiding logic behind multiple levels of abstraction (the old-school Java way).

Go stands out for its simplicity and focus on clarity. The famous rule “Clarity is better than cleverness” comes from the Basics of the Unix Philosophy, which the creators of Go endorse. This philosophy extends to Go’s approach to code organization and duplication.

In Go (and its ecosystem), developers often opt for a code-generating approach rather than relying on extensive abstractions. This means generating code explicitly for specific use cases, even if it involves duplicating certain parts. This decision is based on the notion that explicit code is more readable, comprehensible, and easy to manage.
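To make the idea of code generation concrete, here is a rough sketch. In Go this role is typically played by generator tools run through `go generate`; the sketch below is written in JavaScript only to match the earlier snippets in this post. The `generateClient` function and the `/api/...` paths are made up for illustration.

```javascript
// Hypothetical generator: emits one explicit function per resource name
// instead of a single generic fetchResource(kind, id) abstraction.
function generateClient(resources) {
  return resources
    .map((name) => {
      // e.g. "user" -> "getUser"
      const fn = `get${name[0].toUpperCase()}${name.slice(1)}`;
      return [
        `function ${fn}(id) {`,
        `  return '/api/${name}s/' + id;`,
        `}`,
      ].join('\n');
    })
    .join('\n\n');
}

// Prints the generated source text for two explicit, duplicated functions.
console.log(generateClient(['user', 'order']));
```

The generated functions repeat each other almost line for line, but each one is plain code that can be read, searched and debugged without following any indirection.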

Go’s success and adoption by major tech companies have influenced the industry’s perspective on code duplication. While the importance of abstraction and modularity is not dismissed, Go’s approach highlights that there are scenarios where explicit, duplicated code can be more beneficial.

Watch: Opening keynote: Clear is better than clever - GopherCon SG 2019

In conclusion, abstraction is a double-edged sword. It multiplies both the power and the danger in code. By detecting and addressing bloat early, a team can take advantage of the benefits of abstraction and avoid unwanted complexity. Embracing a little bit of “WETness” may just make you realise that not all code duplication is evil. In fact, it can be a pragmatic solution in particular cases. Spending a small amount of time on seemingly redundant work can result in a simpler, easier-to-manage codebase that future you and your colleagues can understand.
