In a perfect post-#monitoringsucks world, operations teams would have a single, magical tool that provides capacity planning, self-healing, trending information, system alerts, and application monitoring across their entire infrastructure. In reality, Infrastructure Operations has to choose between forcing a single monitoring system into multiple roles and picking a best-of-breed solution for each component of the system. At Ping, we've chosen the latter.
We've done a few things to reduce some of the complexity associated with running multiple monitoring platforms.
The first, and most obvious, is to use cloud-based monitoring systems whenever possible. Cloud-based applications iterate quickly and are generally easy to deploy with common automation tools like Puppet or Chef. SaaS monitoring applications also do not require server resources, database and version upgrades, or a dedicated administrator. Teams can be trained easily and brought up to speed quickly. With the exception of Splunk, we've managed to migrate almost all of our monitoring systems to the cloud.
The second thing we did was to identify and organize which components of the system we wanted to monitor (duh). Breaking these down across all production and supporting systems, and drawing on our network engineering background, we came up with the 5 layers of our monitoring stack. The layers are organized much like the 7 layers of the OSI stack: they start at the networking level and progress up through the system to the external, public-facing endpoints. As you move up the stack, each layer becomes more and more immediately visible to the customer.