Friday, 8 April 2016

Binary Search

This blog is part of series that i wanted to start Preparation of Coding Interview

Binary search is one of the first search algorithm that we learn, it is simple to implement and very powerful, gives 0(log n) guarantee.

Quick recap of binary search


  


















It is simple to implement but nearly all binary search were broken due to one bug for long time, you can read about it @ Nearly All Binary Searches are Broken  

Enough of history lets get back to topic and discus some of the interview question on this 

 - Implement Binary Search

This question is still asked so you should practice it because all the logic around index manipulation is little tricky.


Look the line that is finding MID, it is the line that had the bug

 -Find rotation point

This is small variation of binary search, you have to find rotation point in ordered array, for e.g 
















Elements are in increasing order then it drops to min value and then again elements are in increasing order.

How do you define rotation point ?
for any element if both left & right element are greater then current element is at rotation point.

In above case 5th element is at rotation point because both left(4th) and right(6th) are greater than 5th

Another example 
char[] values = {'i', 'j', 'k', 'v', 'a', 'b', 'c', 'd'}

5th element(i.e 'a') is at rotation point

Simple Binary Search will not work on this because 2 sorted array are joined in this.
If you try to solve this by brute force then have to go through each element till you find element that is less than previous.

Brute force solution will look something like below



Since this array is sorted , so it is possible to write hybrid binary search that can take advantage of sorted data.

Now real question is how to decided when to go left or right ?

If we just apply normal binary search logic then mid point is 'v'(i.e 4), now which element should be compared to decided left/right ?

Ah it is the first one ('i') , if mid is greater than first then go to right or if it is less than first then go to left.

Lets put some picture to understand it.




 Step 1 - v > i , so we go to right
 Step 2 - b < i, so we go to left
 Step 3 - mid value (a) is at rotation point and program ends.

Code will look something like below



Hopefully visualization was ok enough to explain how to approach this problem.
My visualization skills are very bad & i have to work on it.


Wednesday, 6 April 2016

Preparation of Coding Interview


I want to share my coding interview experience, so i have decided to write blog series about questions that are asked and this is first one in that series.

Coding interview are very common these days and if you find company that does not have it as part of selection process then i think you should stay away from them.

Lets start with why coding interview is required and reason is very simple especially in times when every company is adopting agile and they don't want to wait for months after hiring to know about coding skills of developer.

So that is the reason why it is the first gate that you have to pass as developer. 

Jeff Atwood share interesting experience about why-cant-programmers-program and type of question that is asked to filter out candidate. FizzBuzz is very good question to start with. 

I would advise to try FizzBuzz to make sure you are not surprised and think about testing also because we are in world of TDD


What types of question are asked

Different types of question are asked in coding interview and it depends on type of company you apply

 - Algorithms & Data structures

This is very common for Silicon valley company or product development company and some serious preparations is required for these types of interview if your daily work involves only playing with frameworks(i.e hibernate/spring etc). 

Things become more difficult for such interview because may developers(like me) are focused so much on learning emerging framework to build bleeding edge Resume that we tend to forget basic algorithm and start tagging ourself as some XYZ framework developer and company also help to the cause by putting such job requirement.

- Simple problems using TDD

This type of questions is also very common and it is make sure that you don't leave testing to QA team and write a code that is defect free , easy to understand and maintainable.

- Refactoring problems

You don't work on green field project every day and most of the time is spent in reading others code and adding features to it, so employer want to know "how do you quickly understand code written by somebody else and add new feature to it".

TDD becomes more important in such questions because it can be used to build all the safety net around existing code before you start changing it.

Above question can be asked as pair programming question or take a home type of question, so you have to build solution based on time given to you. 

Some of the things employer wants to check are

 - Simplicity/Readability of code.
 - Maintainability of code. 
 - Design principal usage.

Bottom line is you have to prepare for coding interview to survive in fast changing market.

Stay tuned up for questions.


Saturday, 27 February 2016

How to win friends and influence people

Recently i went on long vacation and took couples of books to read while i enjoy time with family.

One of the book that i read is How-Win-Friends-Influence-People by Dale Carnegie. This book is about human engineering by reading this book i came to know simple things can make big difference.

I like this book so much that i read it twice in 2 week.

This book is practical hand book for real life. Some of the principle mention in this book are

 - Don't criticize or condemn or complain.
This is the first approach that is used to fix the problem and it should never be used.

- Give honest sincere appreciation.
  He gives many example where act of appreciation change the way people look at life. 
Appreciation should be be confused with flattery(i.e cheap praise), few people likes flattery but it will do no magic as sincere appreciation will do.

People starve for appreciation same as food and it is legal tender that all soul enjoy but still we forget or avoid to appreciate somebody for years.

Dale Carnegie makes reference to below quote and i think it is one of the quote that should be read every morning.

"I shall pass this way but once any good there for that i can do or any kindness can show to any human being let me do it now , let me not defer or neglect it for i shall not pass this way again."


- Talk in terms of other people wants.
It is based on simple rule of fishing, you don't put the food that you like in the hook for fishing.

Book has many example on how to do it, some of the areas covered are below. I like the Kids example because i need is desperately. 

 - Kid not going to school
 - Feeding issue
 - Smoking issue with kid
 - Business negotiation
 - Job interviews.

Idea to make the person want to do rather than forced to do it by talking about what other person want and how he can get it.

Another quote by Henry ford refereed in book

"If there is any one secret of success it lies in ability to get the other person point of view and see things from that person angle and as well as from you own"



- Get genuinely interested in other people.

People are only interested in themselves in morning, noon & night and this was confirmed by analysis of telephone conversation to find which word is used most. 
It was word "I", it was used thousands of time in just 500 calls.

So to make friends we have to change our interest. One of the quote from book says 

"You can make more friends in 2 months by genuinely interested in other people than in 2 years trying to get other people interested in you."

- Smile
Mr Carnegie says that Smile makes us human and shares various experiment result on smile.

People likes to see smile and they smile back at you, it make us feel so much better. 
Smile brings some thing that is more valuable than $$$$, it is in terms of happiness and friendship. You are much more rich.

Smile Philosophy 

"The value of smile , it cost nothing but creates much as it enriches those who receives without loosing anything for those who gives and it happens in flash and memory last forever , none are so rich that get along with it and none so poor for its benefit. It creates happiness and it is nature best antidote for trouble."


I have long way to go in Human engineering and this book is definitely first step .

These are only few of the principal mention in book. I highly recommend this book, if you want to read only one book in 2016 then this is the book.


Friday, 11 December 2015

Mr Cool - Bloom Filter


HashCode based data structure are basis of many algorithms, it is used for fast look up , membership queries, group by etc.

HashCode is also used to build some interesting class of data structure. In this blog i will share one of such data structure that has many useful application in real world.

Take a example that you have million or billions of item and you want to keep track of distinct items.
This is very simple problem and can be solved by using HashMap but that requires huge memory depending on number of distinct items because it has to stores actual element also.

To make this problem little interesting we can put memory constraints on solution, so effectively we need space efficient solution to test membership of item.

Nothing comes for free in this world , so with less memory we have to trade off accuracy and in some case it is fine for e.g Unique Visitor on website, Distinct Page visited etc

First intuition to solve this problem by allocating N buckets and mark the bucket based on hashcode, no need to store actual value just mark slot index.







Quick question that comes to mind is what will happen in case of collision and in case of that our result will be wrong but it will be wrong by some percentage only and there are couple of ways improve result.

 - Allocate more buckets but use single hash function.
 - Use multiple hash functions and each of the hash function has its own dedicated bucket.

Multiple hash function options works very well , using that result can be around 96% correct !

So if we use multiple hash function then data structure will look something like below.















This data structure is called Bloom Filter and it maintains X bit vector and apply X hash function on input value to mark bits.

for checking if value already exists just check if bits are set to true or not by using multiple hash function.

In terms of memory requirement bit vector of X length is required and get more accurate result multiple hash based bit vector can be used. Quick math will give fair idea about memory requirement


Bloom filter is very compact in terms of memory and size can be controlled depending on requirement.
Memory can be further reduced by using compressed bit vector.

Trade Off
It is important to know trade off done to achieve the efficiency .

 - It is probabilistic data structure means result are not 100% correct but interesting thing about this data structure is that it can give FALSE POSITIVE but never FALSE NEGATIVE. Which makes it good fit for many practical application

 - Multiple hash function are used , so read/write performance will be based on hash function.

Application
- DB Query filter
One of the most common use case. Useful in reducing false query to DB

- Joins
Yes bloom filter provide alternate way to do joins especially in distributed system. So on one of the dataset(i.e smaller) build bloom filter and send it to other nodes for joining, since it is very compact in memory so transport to remote system does not adds big overhead.
One of the disruptive Big data processing system Spark is using it for joins

- Alternate to B-Tree
B-Tree can be also used for membership queries but if size of B-Tree becomes large enough that it can't fit in the memory then all the benefit is lost.
Another big data system Cassandra is using it for read request by having bloom filter type data-structure on top of data block to avoid IO operation

Another Open source in big data space Parquet is using it for fast filters.

- Partition Search
This is just another variation of above use case, Apache druid.io which is again into big data space for fast analytic is using it by partitioning data on ingestion by time and in each partition column has bloom filter type of index for fast filters.

- Safe Website filter
This is an interesting one, google chrome uses bloom filter to find if site is safe or not.


Simple implementation of Bloom Filter is available @ github
In this implementation each hash function has independent bit vector , but single common vector can be used for all the hash function.

Monday, 5 October 2015

Soft Skills - How to teach yourself

I have started reading Soft-Skills-software-developers-manual book and liked it, it is has lot of valuable information that can help in better management/planning of software development career.

All the chapters are well written but i liked chapter - "How to teach yourself" so much that i wanted to write blog about it and keep it for my future reference.

This chapter starts with what is typical learning strategy and i have to confess that i also use this simple and no so effective learning technique

- Pick a book
- Read it cover to cover and practice examples/samples given in it.

Problem with this approach is that it takes time and we might focus on lot of things that is actual not required upfront when you start working on it.

Sounds so much like water fall method :-(

John Sonmez shares his approach and it is worth reading it.  His 10-step process is pragmatic approach to learning.

- Step 01 - Get the big picture
This is about having 50,000 feet overview of topic that you want to learn , this will help in knowing how big is the topic and what are the sub topics under it.
This can be done by doing internet searches .

- Step 02 - Determine Scope
Once you know how big/complex is the topic then decided scope of your learning

Step 03 - Define Success
Since you know your scope , so it must to define success criteria , this helps in having clear picture in mind of what it will look like when you are done.

Step 04 -Find Resource
This is pretty simple given we can get tons of information from internet, find all the resource that will help like blogs/books/videos/podcast/training material etc.
You don't have to get hung up on the quality of resource, it will be filtered later.

Step 05 - Create Learning Path
This is very important part because there is natural progression for learning, so learning path should be like that, some examples can be taken from books how topics are split based on complexity

Step 06 - Filter Resource
By this time we know what are we going to learn and in what order, so filter resource that you have collected to fit your need.

Step 07 - Learn enough to get started
This is one of the step in which most of us get too much involved and loose the track. You have to avoid the temptation of going too much deep in this part. Learn only enough so that you can get started and explore the things.

Step 08 - Play around
This step is about learning by trying/experimenting things and then referring reference material to get answers of your question.

Step 09 - Learn enough to do something useful
This step is for doing deep dive and learn as much your want so that. You have to keep eye on success criteria while you are learning, so that your are on right track.

- Step 10 - Teach

Books mention below quote for this step

Tell me and I forget. Teach me and I remember. Involve
me and I learn.
—Benjamin Franklin

I highly recommend this book , it is worth spending time on this book to learn stuff that will help in building successful career.

Thursday, 1 October 2015

Lazy Array Initialization

Lazy object initialization is one of the important strategy to avoid upfront allocation and it can become little tricky if multiple threads are involved.

Some of the language like scala has language level support for lazy variable and compiler generates thread safe code for you but in java some coding is required to achieve this.

ConcurrentHashMap of JDK7 does upfront allocation of segment object and it adds to memory footprint of empty collection but in JDK8 lazy val strategy is used for initialization to save some memory for empty collection.

So what code is used to do allocation in thread safe way ? It is simple but it has some fine details that is worth looking at, below is code snippet that is used in ConcurrentHashMap


It has check on array length because that is written after memory allocation is done.

Code used in blog is available @ github

Wednesday, 30 September 2015

Wrap around design pattern in java8

Wrap around pattern is not listed in in GOF book but is very useful for problem like below

 - Loop construct for e.g do while/while/for loop
 - Stopwatch around some code.
 - Wrap checked exception with run time exception
 - Initialization and cleanup for eg Threadpool creation/destruction or file open/close etc
 - Adding context info to threads for eg request context info for logging or passing security context etc

Lot of plumbing code is required in java to do such simple things. Java8 added lamdba support and that has answer for such problems.

Using Lambda behavior can be passed as argument to any function and it is very powerful thing if you want to solve above problems.

Wrap Around
Template of wrap around function is something like below

Pre Code
Actual behavior
Post Code

WrapAround for loop



Above code is very straight forward , it has 2 functional interface one for condition & another one for block of code to execute and those 2 behavior is passed to loop function using lambda.
This allows us to introduce new construct

Lets look at some more example

WrapAround for time/stopwatch



WrapAround Closable/Exception


Java 8 has tons of feature that can make code concise and i used just one the the feature implement really useful thing.

Code used in blog is available @ github