This week, we had the pleasure of sponsoring Monitorama, an annual conference focused on open source, monitoring and developer tools, all topics in Runscope’s wheelhouse. We were in impressive company, with speakers from Netflix, Google, Twitter and many more presenting lessons learned and best practices from monitoring their services. Monitorama’s first conference was in 2012, and the industry has grown by leaps and bounds in the short time since then. Much of this year’s conference looked back on how far we’ve come and, naturally, at the new challenges that have arisen as more businesses have realized the value of monitoring and integrated it into all of their systems, from legacy to microservices. From this year’s Monitorama, we identified five modern challenges in monitoring, along with solutions from industry experts for how to combat them.
Challenge 1: Prioritizing Issues as Your Platform Grows
As your business becomes successful, you’ll naturally have to deal with growing pains. The number of issues you encounter will grow, and prioritizing them will be key to staying efficient. Netflix handles upwards of 36% of all downstream Internet traffic in the United States, and the company had several speakers at Monitorama sharing their experiences in scaling its complex and widespread system. Sergey Fedorov, Senior Software Engineer at Netflix, shared that one way to prioritize issues is to understand that big problems start small. Customers notice when the small things are ignored, and those small issues often turn out to be early indicators of larger problems. Netflix looks at its problems from the customer’s perspective: issues that affect the customer experience get the highest priority.
Challenge 2: Scaling with an Increasingly Complex System
Another common issue that comes with success is that your system must become more complex to meet the needs of a vast and diverse user base. Tony Rippy, Site Reliability Engineer at Google, presented the entire evolution of monitoring at Google, including its most humble beginnings: a Perl script running on someone’s desktop computer. As Google experienced unparalleled growth, it swung to the opposite extreme and built an over-engineered system that was difficult to use. After many years and iterations, the company finally landed on an impressive system dubbed Borgmon that is highly redundant, built for scale and easier to manage.
From Twitter, Software Engineer Arun Kejariwal brought data science into the mix with a technical presentation on anomaly detection in server performance data, complete with an array of scientific algorithms. Kejariwal’s team at Twitter released an open source library so that other operations teams could incorporate Twitter’s research into their own monitoring projects.
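To make the idea of anomaly detection on server metrics a little more concrete, here is a minimal sketch in Python. It is not the algorithm from Kejariwal’s talk or from Twitter’s library. It assumes a much simpler technique, a robust z-score built from the median and the median absolute deviation (MAD). The function name, the threshold and the sample latencies are all made up for illustration.

```python
from statistics import median


def find_anomalies(values, threshold=3.5):
    """Return indices of points whose robust z-score exceeds the threshold.

    Illustrative only: uses the median and the median absolute deviation
    (MAD), which are less distorted by outliers than the mean and stddev.
    """
    if not values:
        return []
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # At least half of the points equal the median, so the MAD gives
        # no scale to score against.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard
    # z-score when the data is roughly normally distributed.
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - center) / mad) > threshold
    ]


# Hypothetical response times in milliseconds, with one obvious spike.
latencies_ms = [102, 98, 101, 99, 100, 103, 97, 480, 101, 100]
print(find_anomalies(latencies_ms))  # prints [7]
```

A median-based score is used here because a single large spike would inflate a mean and standard deviation enough to partly hide itself. Real server metrics usually also have daily or weekly seasonality, which this sketch does not attempt to handle.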
Challenge 3: Choosing the Right Tools for the Job
There is a plethora of creative companies out there building tools to solve even the most niche pain points. On selecting tools for monitoring a microservices architecture, Camille Fournier, CTO of Rent the Runway, recommended picking just a few rather than trying to implement every tool available to fit each individual need. In her experience, adopting too many tools that do essentially the same thing and collect tons of data is a waste of time and money because no one ends up using them, which leads to more confusion than utility.
In the same vein, Stephen Boak, CPO and Co-founder at Opsee, recommended bringing monitoring and actionable data to where the conversation is already happening. Oftentimes, new products come with their own applications, which only adds more to an engineer’s plate instead of streamlining the workflow. Instead, Boak suggested integrating your monitoring tool with platforms like Slack. Those communication platforms are tried and true, and Boak said that it’s better to go where your users are than to try to reinvent the wheel.
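As a rough illustration of what sending monitoring data into a chat platform can look like, the sketch below builds an alert message and posts it to a Slack incoming webhook, which accepts a JSON body with a `text` field over HTTP POST. This is not from Boak’s talk. The check name, the message format and the webhook URL are placeholders, and the helper functions are hypothetical.

```python
import json
import urllib.request


def build_alert_payload(check_name, status, detail):
    """Build the JSON body for a Slack incoming webhook message."""
    text = f"[{status.upper()}] {check_name}: {detail}"
    return {"text": text}


def post_alert(webhook_url, payload, timeout=5):
    """POST the payload to the webhook URL and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Hypothetical check name and failure detail.
payload = build_alert_payload(
    "checkout-api", "failing", "p95 latency above 2s for 5 minutes"
)
print(json.dumps(payload))

# To actually send it, supply your own webhook URL (placeholder shown here):
# post_alert("https://hooks.slack.com/services/XXX/YYY/ZZZ", payload)
```

Keeping the payload builder separate from the network call means the message format can be checked without touching the network, and the same text could be routed to a different chat tool later.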
Challenge 4: Maintaining a Human Touch
Going where your users are reinforces the idea that all technology needs a human touch. Chrissie Brodigan, UX Researcher at GitHub, spoke at length about her company’s efforts to build a human-focused experience around a very engineering-centric product. Even the most technical or futuristic product can find room for a human element that keeps users coming back.
Challenge 5: When All Else Fails, Give Them Swag
While swag won’t fix your monitoring system blues, it doesn’t hurt to get a few tokens of appreciation for your efforts. At Monitorama, attendees strolled down the buffet of swag and were outfitted from head to toe—literally—with hats, shirts and socks.
The right tools can help mitigate these challenges and get rid of the headaches that come with over-architecting and scaling. Give Runscope a try (it’s free!) and see how easy API monitoring can be, with no code required.