Thursday 22 November 2018

Spark Run local design pattern

Many spark application has now become legacy application and it becomes very hard to enhance, test & run locally.

Spark has very good testing support but still many spark application is not testable.
I will share one common error that you see when try to run some old spark application.




When you see such error you have 2 option
 - Forget it that it can't be run locally and continue work with this frustration.
 - Fix it to run locally and show example of The Boy Scout Rule to your team


I will show very simple pattern that will save from such frustration.

This code is using isLocalSpark function to decided how to handle local mode and you can use any technique to make that decision like have env parameter or command line parameter or any thing else.

Once you know it is run local then create spark context based on it.

Now this code can run locally or via Spark-Submit also.

Happy Spark Testing.
Image result for i love testing

Code used in this blog is available @ runlocal repo

1 comment:

  1. The comprehensive migration solutions and the automated mapping of data between the target system and the source provided by your company help in successive data migration. It makes your company be one of the data migration service companies

    ReplyDelete