Are you ready: Custom Logs in Apache Spark

Saturday, 26 May 2018

Custom Logs in Apache Spark

Have you ever felt the frustration of a Spark job that runs for hours and then fails due to an infrastructure issue?
You find out about the failure very late and waste a couple of hours on it, and it hurts even more when the Spark UI logs are not available for the postmortem.

You are not alone!

In this post I will go over how to enable your own custom logger that works well with the Spark logger.
This custom logger can collect whatever information is required to go from reactive to proactive monitoring.
There is no need to set up extra logging infrastructure for this.

Spark 2.X logs through the Slf4j abstraction, and the setup in this post uses the logback binding (stock Spark distributions ship with the log4j binding instead).

Let's start with the logging basics: how to get a logger instance in Spark jobs or applications.

import org.slf4j.LoggerFactory
val _LOG = LoggerFactory.getLogger(this.getClass.getName)

It is that simple, and now your application uses the same logging library and settings that Spark uses.

Now, to do something more meaningful, we have to inject our custom logger, which will collect information and write it to Elasticsearch, post it to a REST endpoint, or send alerts.

Let's go step by step.

Build custom log appender
Since this setup uses logback, we have to write a logback appender.

Code snippet for custom logback logger

This is a very simple logger that counts messages per thread; all you have to do is override the append function.

This type of logger can do anything, such as writing to a database, sending to a REST endpoint, or alerting.
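The appender described above can be sketched roughly as follows. This is a minimal illustration, not the code from the repo: the class and package names are taken from the logback.xml entries in this post, while the counter map and the countFor helper are hypothetical names introduced here. It assumes logback-classic is on the classpath.

```scala
package micro.logback

import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicLong

import ch.qos.logback.classic.spi.ILoggingEvent
import ch.qos.logback.core.AppenderBase

// Sketch only: counts log events per thread name.
class MetricsLogbackAppender extends AppenderBase[ILoggingEvent] {

  // Thread name -> number of events logged from that thread.
  private val counts = new ConcurrentHashMap[String, AtomicLong]()

  // logback calls append() for every event routed to a started appender.
  override protected def append(event: ILoggingEvent): Unit = {
    val fresh = new AtomicLong()
    val existing = counts.putIfAbsent(event.getThreadName, fresh)
    val counter = if (existing == null) fresh else existing
    counter.incrementAndGet()
    // A real appender could also write to a database or post to a REST endpoint here.
  }

  // Hypothetical helper for reading a counter; not part of logback.
  def countFor(threadName: String): Long = {
    val counter = counts.get(threadName)
    if (counter == null) 0L else counter.get()
  }
}
```

AppenderBase already serializes calls to append, so the concurrent map is there mainly so that other threads can read the counters safely.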

Enable logger
To use the new logger, create a logback.xml file and add an entry for it.
This file can be packaged in the shaded jar or specified as a runtime parameter (logback reads its location from the logback.configurationFile system property).

Sample logback.xml
This config file adds MetricsLogbackAppender as METRICS:

<appender name="METRICS" class="micro.logback.MetricsLogbackAppender"/>

Next, enable it for the packages/classes that should use it:

<logger level="info" name="micro" additivity="true">
    <appender-ref ref="METRICS" />
</logger>
<logger level="info" name="org.apache.spark.scheduler.DAGScheduler" additivity="true">
    <appender-ref ref="METRICS" />
</logger>
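For context, here is how these entries might sit inside a complete logback.xml. This is only a sketch: the STDOUT console appender, its pattern, and the root level are assumptions added for illustration and are not taken from the original sample file.

```xml
<configuration>
    <!-- Assumed console appender so regular log output is still visible -->
    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
        </encoder>
    </appender>

    <!-- Custom appender from this post -->
    <appender name="METRICS" class="micro.logback.MetricsLogbackAppender"/>

    <!-- additivity="true" means these events also reach the root appenders -->
    <logger level="info" name="micro" additivity="true">
        <appender-ref ref="METRICS" />
    </logger>
    <logger level="info" name="org.apache.spark.scheduler.DAGScheduler" additivity="true">
        <appender-ref ref="METRICS" />
    </logger>

    <root level="info">
        <appender-ref ref="STDOUT" />
    </root>
</configuration>
```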

You are done!

Any message logged from the 'micro' package or from the DAGScheduler class will go through the new logger.
Using this technique, executor logs can also be captured, which becomes very useful when a Spark job is running on hundreds or thousands of executors.

This opens up lots of options, such as BI dashboards that show all these messages in real time, letting the team ask interesting questions or subscribe to alerts when things are not going well.

Caution: Make sure that this new logger is not slowing down application execution; making it asynchronous is recommended.
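One way to make it asynchronous, sketched here as an assumption rather than the setup used in the repo, is to wrap the METRICS appender in logback's built-in AsyncAppender and point the loggers at the wrapper. The name ASYNC_METRICS and the queue size are illustrative.

```xml
<!-- Events are queued and handed to METRICS on a background thread -->
<appender name="ASYNC_METRICS" class="ch.qos.logback.classic.AsyncAppender">
    <queueSize>1024</queueSize>
    <!-- Drop events instead of blocking the caller when the queue is full -->
    <neverBlock>true</neverBlock>
    <appender-ref ref="METRICS" />
</appender>

<logger level="info" name="micro" additivity="true">
    <appender-ref ref="ASYNC_METRICS" />
</logger>
```

Keep in mind that AsyncAppender by default starts discarding TRACE, DEBUG and INFO events once the queue is 80% full, so the counts may be lossy under heavy load; the discardingThreshold setting controls this.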

Get the insight at the right time and turn it into action.

Code used in this blog is available in the sparkmicroservices repo on GitHub.

I am interested in knowing what logging patterns you are using for Spark.
