Although there are many x86 virtualization platforms (VMware ESX, Xen and Microsoft Virtual Server 2005 R2, to name a few), monitoring is much the same from one to the next. In this article, I will discuss what is involved in monitoring a virtual infrastructure, including monitoring the physical host servers, the virtual machine monitors (VMMs) / hypervisors, the virtual machines (VMs) and the applications running inside the VMs. I will conclude by looking at how to understand the performance metrics being gathered.
Physical host servers
Monitoring the physical host servers in a virtual infrastructure is extremely important. Because a single physical host server can host tens of VMs, it must remain healthy. My intent is not to scare people away from virtualizing their infrastructures but to remind everyone not to forget about the physical hardware on which the virtual infrastructure resides.
The first place to look for server monitoring tools is the vendor. Dell offers tools such as OpenManage and IT Assistant, and HP offers its OpenView software. In many cases, the hardware vendors’ monitoring solutions are the best choice for monitoring hardware, because these tools are obviously designed and supported by the same company that made the hardware.
But you’ll also find no shortage of third-party solutions at your disposal. Both Dell and HP provide management packs that plug in to Microsoft Operations Manager (MOM). If money is an issue, check out Nagios. Nagios is an open source monitoring program for hosts, services and networks. One of the environments I work in uses Nagios, and I am quite pleased with the program’s capabilities. Not only is Nagios free, but it gives many pay-for products a run for their money.
The process of monitoring physical hardware in a virtual infrastructure is nearly identical to that of monitoring the physical hardware in a traditional server infrastructure. But because of the tens of VMs that depend on them, maintaining the health of x86-based physical servers is more important than ever.
Virtual machine monitors / hypervisors
A lot of people ask me what the difference is between a VMM and a hypervisor. The answer is, “Nothing.” A VMM does exactly what the name suggests; it monitors and manages virtual machines. The term “hypervisor” is a play on the name of another computing component, the kernel. When kernels were a new thing, they were known as “supervisors” because they supervised the machine; hence, the term hypervisor refers to a VMM that supervises many machines, albeit virtual ones.
Unlike the software that monitors the underlying hardware, the software that monitors the hypervisor depends on the type of hypervisor you are using. If you are using VMware ESX, you have several options. Just as with monitoring hardware, the best place to start looking for virtual monitoring solutions is the vendor. VMware includes a Web-based management/monitoring interface to ESX called the Management User Interface (MUI) that, in addition to managing ESX, can tell you the current utilization of the VMM.
The MUI has a very nice availability-reporting feature. From the console in ESX, you can enable another Web-based reporting tool called vmkusage. While the MUI requires the user to authenticate, vmkusage provides a read-only, anonymous view of the state of the ESX VMM. While you are logged into the console, you can also run a tool called esxtop. Esxtop is similar to the standard top command, but unlike the top command, esxtop will also show the real-time utilization of the different ESX worlds, including the VMM.
VMware also produces a separate management/monitoring solution called VirtualCenter. Although VirtualCenter does not provide any additional monitoring information, it does let you set up events and alarms that can notify you when certain lower and upper resource limits are exceeded. Of the third-party ESX monitoring solutions, just one stands out: NetIQ AppManager for VMware.
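The idea behind alarms on lower and upper resource limits can be illustrated with a minimal Python sketch. This is not VirtualCenter's API or data model; the `Alarm` class, the `evaluate` function, the metric names and the sample readings are all hypothetical and exist only to show the threshold logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alarm:
    """A hypothetical alarm definition: a metric name plus optional limits."""
    metric: str
    lower: Optional[float] = None  # notify when the value drops below this
    upper: Optional[float] = None  # notify when the value rises above this

    def check(self, value: float) -> Optional[str]:
        """Return a notification message if a limit is exceeded, else None."""
        if self.lower is not None and value < self.lower:
            return f"{self.metric} below lower limit: {value} < {self.lower}"
        if self.upper is not None and value > self.upper:
            return f"{self.metric} above upper limit: {value} > {self.upper}"
        return None


def evaluate(alarms, samples):
    """Check each alarm against the latest sample for its metric."""
    messages = []
    for alarm in alarms:
        if alarm.metric in samples:
            message = alarm.check(samples[alarm.metric])
            if message is not None:
                messages.append(message)
    return messages


if __name__ == "__main__":
    # Made-up limits and readings, purely for illustration.
    alarms = [Alarm("cpu_percent", upper=90.0),
              Alarm("free_memory_mb", lower=512.0)]
    samples = {"cpu_percent": 97.5, "free_memory_mb": 2048.0}
    for line in evaluate(alarms, samples):
        print(line)
```

In a real deployment the samples would come from whatever the monitoring tool collects, and the messages would be routed to e-mail or a pager rather than printed.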
All of the monitoring solutions for Microsoft’s Virtual Server 2005 R2 VMM come from Microsoft. You can use the standard Windows event logs to monitor the VMM, an approach already used by many Windows systems administrators. Virtual Server 2005 R2 also installs Windows performance counters that can track the utilization of the VMM. If you do not want to develop a custom utilization monitor with Windows Management Instrumentation (WMI), Microsoft Operations Manager (MOM) already leverages the Virtual Server 2005 R2 performance counters to provide a robust monitoring solution.
A few open source Xen monitoring solutions are worth mentioning. Libvirt is an open source toolkit designed to interact with Xen and other open source virtualization platforms. Also, Argo the Xen Monitor is a framework for managing and monitoring Xen. Commercial Xen solutions provide their own monitoring tools. XenSource’s XenEnterprise has a monitoring solution that provides a real-time view of the VMM’s performance. VirtualIron’s Xen package comes with a management and monitoring solution called VirtualizationManager.
All current VMMs require some sort of host OS or privileged control OS. For VMware ESX and Xen, this is Linux, which means that the control OS can use native Linux monitoring tools to gauge the utilization and state of the VMM. A perfect example of the KISS methodology is the syslog daemon. You can configure the syslog process to copy its logs to a dedicated log server so that they are available in the event of a catastrophic failure. One of my favorite tools is a product called Splunk. The creators of Splunk had the simple yet ingenious idea that logs are more useful when they are compared with similar logs from around the world. The Unix/Linux system management tool monit can also be used to watch your VMM processes.
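Forwarding at the syslog daemon level is a matter of daemon configuration, but the same idea of shipping messages to a dedicated log server can also be sketched from a script running in the control OS. The following is a minimal sketch using Python's standard `logging.handlers.SysLogHandler`, which sends records over UDP by default. The function name `make_remote_logger` and the message format are assumptions made for illustration, and the log server address is whatever your environment uses.

```python
import logging
import logging.handlers


def make_remote_logger(name, host, port=514):
    """Return a logger that sends its records to a syslog server over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # SysLogHandler uses UDP unless told otherwise; 514 is the usual syslog port.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(
        logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

A watchdog script could then call `make_remote_logger("vmm-watch", "<your log server>")` once and use `log.warning(...)` to record events, so that a copy of each message survives on the log server even if the host itself fails.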
Think of the hypervisor as your brain. Your body (the VMs) can be perfectly healthy, but if your brain fails you, then your body does not know how to function. Even though hypervisors, like our brains, are designed to “just work,” active monitoring is necessary to prevent possible total system shutdown.
Virtual machines
The VMs are analogous to your old servers: they run software to fulfill a business purpose. Just because your servers are now virtual does not negate the need for adequate monitoring. Luckily, this is quite easy because the VMM monitoring solutions almost always have the capability to monitor the VMs. For a list of these solutions, please refer to the previous section.
Applications
Monitoring applications running inside VMs is no different than monitoring applications running on a physical server — the same software can be used and it is as necessary as ever. I have met too many IT professionals who are under the mistaken impression that an application hosted virtually is not subject to traditional stress and rigor. Although ideas about application monitoring should stay the same, ideas about application and system utilization must change, and that is the subject of the next and final section.
In conclusion, it is important to adequately monitor your virtual infrastructure to guarantee its health and ensure that you are not losing money due to underutilization.