Are you ready: April 2023

Saturday, 8 April 2023

Safe refactoring using Scientist

Refactoring is a critical yet often overlooked activity in the product development lifecycle. Despite its importance, teams tend to neglect it until they hit significant showstoppers during development. Several factors contribute to this neglect, including:

Pressure to meet product release dates
Concerns about production stability
Difficulty in writing high-quality unit and functional tests

When teams encounter these roadblocks, they often make a business case for refactoring, reengineering, or implementing new technology. Although there may be resistance, the product team typically agrees to it, and it is integrated into the development sprints. If you see a specific tech debt item in the sprint backlog, it is a clear indication that the code's health was not maintained due to business priorities, and now it is time to address it.

To put it in financial terms, failing to maintain code health is like missing an installment payment and incurring additional interest from your banker.

Integrated Development Environments (IDEs) have evolved to provide safe and efficient refactoring options such as renaming variables, extracting methods, simplifying branch conditions, inlining methods, and moving code. While these options are generally beneficial, some refactoring tasks are more complex and riskier. These tasks may involve changing the implementation of core components, such as modifying persistence, altering core algorithms, or adjusting underlying data structures to improve performance.

Feature flags and A/B testing are two common ways to test such refactoring.
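
As a rough illustration of the feature-flag approach, the sketch below routes each call to either the old or the new implementation of a bit-counting function. The class name FlaggedBitCounter and the BooleanSupplier flag source are assumptions made for this example; a real system would more likely read the flag from a configuration service.

```java
import java.util.function.BooleanSupplier;

final class FlaggedBitCounter {
    // Hypothetical flag source; in practice this might read a config service or an environment variable.
    private final BooleanSupplier useNewImplementation;

    FlaggedBitCounter(BooleanSupplier useNewImplementation) {
        this.useNewImplementation = useNewImplementation;
    }

    int bitCount(int x) {
        // The flag is evaluated on every call, so the new path can be switched off without a redeploy.
        return useNewImplementation.getAsBoolean() ? Integer.bitCount(x) : legacyBitCount(x);
    }

    // Existing implementation: count the '1' characters in the binary string.
    static int legacyBitCount(int x) {
        return (int) Integer.toBinaryString(x).chars().filter(c -> c == '1').count();
    }
}
```

A flag like this only chooses one path per call. It does not compare the two results, which is the gap the approach described next tries to close.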

While browsing through GitHub, I came across Scientist, a library that provides a way to verify critical refactoring. It offers an intuitive approach to code verification, based on experiment, observation, and verification.

Let's take a look at some code snippets.

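Experiment<Integer, Integer> experiment = new Experiment<>("Next Experiment");

// Control: the existing implementation, treated as the source of truth.
experiment
        .withControl("BitCount Using binary string", x ->
                (int) Integer.toBinaryString(x)
                        .chars()
                        .filter(y -> y == '1')
                        .count()
        );

// Candidate: the refactored implementation being verified.
experiment
        .withCandidate("BitCount using native", x -> Integer.bitCount(x));

// The parameter generator supplies the input; the comparator decides whether both results agree.
// equals() compares boxed Integer values, whereas == would compare object references.
experiment
        .withParamGenerator(() -> 100)
        .compareResult("bit length", (control, candidate) -> control.equals(candidate));

// Run the experiment and publish the outcome.
experiment
        .run()
        .publish();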

This library has several components, including:

Control function
Candidate function
Experiment parameters
Result comparator function
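
To make the role of each component concrete, here is a minimal, self-contained sketch of how the four pieces could fit together. MiniExperiment is a hypothetical stand-in written for this post's explanation; it is not the implementation of the Experiment class used above, whose internals may differ.

```java
import java.util.function.BiPredicate;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical stand-in for illustration only, not the library's Experiment class.
final class MiniExperiment<I, O> {
    private final Function<I, O> control;        // existing implementation
    private final Function<I, O> candidate;      // refactored implementation
    private final Supplier<I> paramGenerator;    // produces the input for a run
    private final BiPredicate<O, O> comparator;  // decides whether two results agree

    MiniExperiment(Function<I, O> control, Function<I, O> candidate,
                   Supplier<I> paramGenerator, BiPredicate<O, O> comparator) {
        this.control = control;
        this.candidate = candidate;
        this.paramGenerator = paramGenerator;
        this.comparator = comparator;
    }

    // Runs both functions on one generated input and reports whether the results agree.
    boolean run() {
        I input = paramGenerator.get();
        O expected = control.apply(input);
        O actual = candidate.apply(input);
        return comparator.test(expected, actual);
    }

    public static void main(String[] args) {
        MiniExperiment<Integer, Integer> experiment = new MiniExperiment<>(
                x -> (int) Integer.toBinaryString(x).chars().filter(c -> c == '1').count(),
                Integer::bitCount,
                () -> 100,
                Integer::equals);
        System.out.println("results match: " + experiment.run());
    }
}
```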

Once you specify these parameters, you can run experiments. As you begin to use the library for more complex problems, additional considerations arise, such as how many times the experiment should run, whether the runs should execute in parallel, and what timeouts to set:

experiment
        .withControl("BitCount Using binary string", x ->
                (int) Integer.toBinaryString(x)
                        .chars()
                        .filter(y -> y == '1')
                        .count()
        );

experiment
        .withCandidate("BitCount using native", x -> Integer.bitCount(x));

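// equals() compares boxed Integer values, whereas == would compare object references.
experiment
        .withParamGenerator(() -> 100)
        .compareResult("bit length", (control, candidate) -> control.equals(candidate));

// Repeat the experiment 100 times and run the repetitions in parallel.
experiment
        .times(100)
        .parallel()
        .run()
        .publish();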
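
For readers curious how repetition, parallelism, and a timeout might be wired together, the following is a rough sketch built on the standard ExecutorService. The RepeatedExperiment class and its parameters are assumptions for illustration and say nothing about how the library implements times() or parallel(). Unlike the snippet above, which always generates the input 100, this sketch draws a fresh random input for every trial, since repeating a deterministic comparison on the same value adds little information.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.function.IntUnaryOperator;

// Hypothetical helper for illustration only.
final class RepeatedExperiment {
    // Returns how many trials mismatched, threw, or were cancelled by the timeout.
    static int countFailures(IntUnaryOperator control, IntUnaryOperator candidate,
                             int times, int threads, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Boolean>> trials = new ArrayList<>();
            for (int i = 0; i < times; i++) {
                int input = ThreadLocalRandom.current().nextInt(); // fresh input per trial
                trials.add(() -> control.applyAsInt(input) == candidate.applyAsInt(input));
            }
            int failures = 0;
            // invokeAll waits up to the timeout, then cancels any trial that has not finished.
            for (Future<Boolean> trial : pool.invokeAll(trials, timeoutMillis, TimeUnit.MILLISECONDS)) {
                try {
                    if (!trial.get()) {
                        failures++; // results differed
                    }
                } catch (CancellationException | ExecutionException e) {
                    failures++; // timed out or threw an exception
                }
            }
            return failures;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("experiment interrupted", e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Note that the timeout here covers the whole batch of trials rather than each trial individually; a per-trial timeout would need a different structure.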

Other uses of Scientist

So far, we have looked at this library as a tool for safe refactoring, but it can also run experiments alongside real production code. The experiment then runs under the same constraints as the current code and produces useful feedback.
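
One way to picture an experiment running next to production code is the sketch below: the caller always receives the control result, while the candidate runs on the same input and any mismatch or exception is only reported. ShadowRunner and its report callback are hypothetical names chosen for this example, not part of the library.

```java
import java.util.function.BiPredicate;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical helper for illustration only.
final class ShadowRunner {
    static <I, O> O run(I input,
                        Function<I, O> control,
                        Function<I, O> candidate,
                        BiPredicate<O, O> same,
                        Consumer<String> report) {
        // Exceptions from the control propagate exactly as they would without the experiment.
        O expected = control.apply(input);
        try {
            O actual = candidate.apply(input);
            if (!same.test(expected, actual)) {
                report.accept("mismatch for input " + input
                        + ": control=" + expected + ", candidate=" + actual);
            }
        } catch (RuntimeException e) {
            // A failing candidate must never affect the production result.
            report.accept("candidate failed for input " + input + ": " + e);
        }
        return expected;
    }
}
```

The sketch runs the candidate synchronously, so it adds the candidate's latency to every call. Sampling only a fraction of requests or moving the candidate to a background thread are possible refinements.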

Since these are experiments, it makes sense to store the results in a database or another system that keeps a log of each run.
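
As a small sketch of what such a log could look like, the class below appends one line per observation to a file and counts recorded mismatches. ExperimentLog, the file layout, and the field order are assumptions for this example; a database table with the same columns would serve the same purpose.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.time.Instant;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical append-only log for illustration only.
final class ExperimentLog {
    private final Path file;

    ExperimentLog(Path file) {
        this.file = file;
    }

    // Appends one line: timestamp, experiment name, input, and whether the results matched.
    // Naive format: values containing commas would need escaping.
    void append(String experiment, String input, boolean matched) {
        String line = String.join(",",
                Instant.now().toString(), experiment, input, Boolean.toString(matched));
        try {
            Files.write(file, List.of(line), StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Counts the logged observations whose results did not match.
    long mismatches() {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.filter(l -> l.endsWith(",false")).count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```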

Running these experiments can be costly, so stored results from earlier runs can be reused to verify new code.

Furthermore, this library can be used to test multiple variations of new logic and select the best one.
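
The idea of comparing several variations can be sketched as follows: each candidate runs over the same inputs, candidates that disagree with the control are discarded, and the fastest remaining one is returned. CandidateSelector is a hypothetical helper, and the System.nanoTime() timing is deliberately naive; a benchmarking harness such as JMH would be a better fit for real measurements.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Optional;
import java.util.function.IntUnaryOperator;

// Hypothetical helper for illustration only.
final class CandidateSelector {
    // Returns the name of the fastest candidate that matches the control on every input, if any.
    static Optional<String> fastestCorrect(IntUnaryOperator control,
                                           Map<String, IntUnaryOperator> candidates,
                                           int[] inputs) {
        // Compute the expected results once so the control does not skew the timings.
        int[] expected = Arrays.stream(inputs).map(control).toArray();
        String best = null;
        long bestNanos = Long.MAX_VALUE;
        for (Map.Entry<String, IntUnaryOperator> entry : candidates.entrySet()) {
            int[] actual = new int[inputs.length];
            long start = System.nanoTime();
            for (int i = 0; i < inputs.length; i++) {
                actual[i] = entry.getValue().applyAsInt(inputs[i]);
            }
            long elapsed = System.nanoTime() - start;
            // Only candidates that reproduce every expected result are eligible.
            if (Arrays.equals(expected, actual) && elapsed < bestNanos) {
                best = entry.getKey();
                bestNanos = elapsed;
            }
        }
        return Optional.ofNullable(best);
    }
}
```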

The code used in this post is available on GitHub.