Showing posts with label Unit Test. Show all posts
Showing posts with label Unit Test. Show all posts

Wednesday, 14 August 2019

Need driven software development using Mocks

Excellent paper on mocking framework by jmock author. This paper was written in 2004 that is 18 years ago but has many tips of building maintainable software system.

Related image

In this post i will highlight key ideas from this paper but suggest you to read the paper to get big ideas behind mocking and programming practice.

 Mock objects are extension of test driven development.

Mock objects can be useful when we start thinking about writing test first as this allows to mock parts that is still not developed. Think like better way of building prototype system.

Mock object are less interesting as a technique for isolating tests from third-party libraries.

This is common misconception about mock and i have seen/written many codes using mock like this. This was really eye opening fact that comes from author of mocking framework.

Writing test is design activity

This is so much true but as engineer we take shortcut many time to throw away best part of writing test. Design that is driven from test also gives insights about real problem and it lead to invention because developer has to think hard about problem  and avoid over engineering

Coupling and cohesion 

As we start wiring test it gives good idea on coupling & cohesion decision we make. Good software will have low coupling and high cohesion. This also lead to functional decomposition of task.
Another benefit of well design system is that it does not have Law_of_Demeter, this is one of the common problem that gets introduced in system unknowingly. Lots of micro services suffer from this anti pattern.

Need driven development
As mocking requires explicit code/setup, so it comes from need/demand of test case. You don't code based on forecast that some feature will required after 6 months, so this allows to focus on need of customer. All the interfaces that is produce as result of test is narrow and fit for purpose. This type of development is also called top down development.

Quote from paper
"""
We find that Need-Driven Development helps us stay focused on the requirements in hand and to develop coherent objects.
"""


Programming by composition

Test first approach allows you to think about Composability of components, every thing is passed as constructor arguments or as method parameter.
Once system is build using such design principal it is very easy to test/replace part of system.
Mock objects allows to think about Composability so that some parts of system are mocked.

Mock test becomes too complicated
One observation in paper talks about complexity of Mock Test.
If system design is weak then mocking will be hard and complicated. It does amplification of problems like coupling, separation of concern.  I think this is best use of mock objects to get feedback on design and use it like motivator to make system better.

Don't add behavior to mock
As per paper we should never add behavior to stub and in case if you get the temptation to do that then it is sign of misplaced responsibility.

If you like the post then you can follow me on twitter to be notified about random stuff that i write.


Tuesday, 23 October 2018

Simple Testing Can Prevent Most Critical Failures

Error handling is one of the hardest and ignored part of software development and if system is distributed then this becomes even harder.

Nice paper is written on Simple Testing Can Prevent Most Critical Failures topic.
Every developer should read this paper. I will try to summarized key take away from this paper but will suggest to read the paper to get more details about it.

Image result for testing in production














Distributed system outage is common and some of the recent example are 

Youtube was down on Oct,2018 for around 1+ hour
Amazon was down during Prime day on July,2018
Google services like Map,Gmail,Youtube were down numerous time in 2018
Facebook was also down apart from many data leak issues they are facing.


This paper talks about catastrophic failure that happened in distributed system like Cassandra, Hbase , HDFS, Redis, Map Reduce.

As per paper most of the errors are due to 2 reason

 - Failure happens due to complex sequence of events
 - Catastrophic error are due to incorrect handling
 - I will include 3rd one on "ignoring of design pressure" which i wrote in design-pressure-on-engineering-team post

Example from HBase outage

1 - Load balancer Transfer region R from Slave A to Slave
2 - Slave B open region R
3 - Master delete current Zookeeper region R after it is owned by Slave B
4 - Slave B dies
5 - Region R is assigned to Slave C & Slave C open the region
6 - Master tries to delete Slave B znode on Zookeeper and because Slave b is down and whole cluster goes down due to wrong error handling code.

In above example sequence of event matters to reproduce issue.

HDFS failure when block is not replicated.


In this example also sequence of event and when new data node starts it exposes bug of system.

Paper has many more examples.

Root cause of error 

92% of the catastrophic error happens due to incorrect error handling.
What this means is that error was deducted but error handling code was not good, does this sound like lots of project you have worked on !

1 - Error are ignored
This is reason of 25% of the failure, i think number will be high in many live system.

eg of such error
catch(RebootException e) {
   log.info("Reboot occurred....")
}

Yes this harmless looking log statement is ignoring exception and is very common anti pattern of error handling.

2 - Overcatch exception
This is also very common like having generic catch block and bringing down the whole system

catch(Throwable e) {
     cluster.abort()
}

3 - TODO/FIXME in comments
Yes real distributed system in production also has lots of TODO/FIXME in critical section of code.

Some other example of error handling

} catch (IOException e) {
 // will never happen
 }

} catch (NoTransitionException e) {
 /* Why this can happen? Ask God not me. */
 }


try { tableLock.release(); }
catch (IOException e) {
 LOG("Can't release lock”, e);


4 - Feature development is prioritized
 I think all the software engineers will agree to it. This is also called Tech Debt and i can't think of better example than Knight Capital bankruptcy which was due to config & experimental code.

Conclusion

All the errors are complex to reproduce but better unit test will definitely catch these, this also shows that unit/integration test done in many system is not testing scenario like service going down and coming back again and how it impacts system.

Based on above example it will look like all error are due to java checked exception but it is not different in other system like C/C++ which does not have checked but everything is unchecked , it is developer responsibility to check for it at various places.

On side note language with no type system like Python makes it very easy to write code that will break at runtime and if you are really unlucky then error handling code will have some type error and it will get tested in production.

Also almost all product will have some static code tool(findbugs) integration but these tools does not give more importance to such error handling anti pattern.


Link to issues mention in paper
HDFS
MapReduce
HBase
Redis
Cassandra

Please share about more anti pattern you have seen in production system.
Till then Happy unit testing.