Example Analysis and Reports
The following charts
present example analyses for the main detailed data categories and provide an
overview of how the
data can be used to manage your VMWare ESX installation.
Physical CPU Chart
The following chart illustrates the primary information that
is provided in this category – processor %Busy.
This is a quick and easy
way to get an overall view of processor usage for the physical platform. In
this case, there are eight physical processors and we are using the equivalent
of about two and a half of them at peak times on this day (so the system as a
whole is about 32% busy).
We note that CPU-0 is
busiest at about 32% across the day, followed by CPUs 4, 1 and 5 at about 18%
busy, then the remaining processors at about 15% busy. This information might
be useful when fine-tuning processor allocations and exploring anomalies.
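The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the per-CPU values are the approximate whole-day readings quoted above, and the function names are our own and are not part of the product.

```python
def equivalent_cpus(busy_pcts):
    """Sum per-CPU %Busy values and express the total as whole-CPU equivalents."""
    return sum(busy_pcts) / 100.0


def overall_busy_pct(busy_pcts):
    """Average %Busy across all physical CPUs."""
    return sum(busy_pcts) / len(busy_pcts)


# Approximate whole-day readings from the text, indexed by CPU number:
# CPU-0 at 32%, CPUs 1, 4 and 5 at 18%, the remaining four at 15%.
day_busy = [32, 18, 15, 15, 18, 18, 15, 15]

day_equiv = equivalent_cpus(day_busy)     # CPUs' worth of work across the day
day_overall = overall_busy_pct(day_busy)  # whole-system %Busy across the day

# Peak-time check: about two and a half busy CPUs out of eight.
peak_overall = 2.5 / 8 * 100
```

Two and a half CPUs out of eight works out at 31.25%, which is consistent with the approximate peak figure quoted above.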
Group CPU Data
The next chart provides
a summary of processor usage across the day for a specified server (in this case
it was VHost2 on 6 March). Usage levels are given as percentages of one single
physical CPU.
Usage:
We can see at a glance the cumulative usage by all virtual
machines:
- The report quantifies the
number of CPUs that are required to support the current workload.
- The peak period (which
dictates the required size of the system) has been identified.
- The report highlights and
quantifies “lost time” (Pct Ready), i.e. the percentage of time when a VM
is ready to run but cannot do so because of contention.
- We can see how usage
varies over the day. Since we must size the system to cope with peak usage,
we can potentially save costs if we can:
  - Reduce the peak by
  swapping one or more VMs to another, more suitable server.
  - Add other VMs that
  can take advantage of the troughs.
- The report can be used to establish the “normal”
profile so that you can easily spot anomalies (which can be investigated by
drilling down into the data) and track growth.
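As a rough illustration of the sizing points above, the following Python sketch finds the peak hour and the number of physical CPUs needed to cover it. The hourly totals are hypothetical and are not taken from the report; the function name is our own.

```python
import math


def peak_and_cpus_required(hourly_used_pct):
    """Return (peak_hour, peak_pct, cpus_required) for a {hour: %Used} mapping.

    %Used is the cumulative usage of all VMs as a percentage of ONE
    physical CPU, so 250 means two and a half CPUs' worth of work.
    """
    peak_hour = max(hourly_used_pct, key=hourly_used_pct.get)
    peak_pct = hourly_used_pct[peak_hour]
    return peak_hour, peak_pct, math.ceil(peak_pct / 100)


# Hypothetical hourly totals, for illustration only.
sample = {"09:00": 120.0, "11:00": 180.0, "14:00": 250.0, "17:00": 90.0}

peak_hour, peak_pct, cpus = peak_and_cpus_required(sample)
```

Rounding up reflects the fact that the system must be sized for the peak, not the average.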
There are other items
that could be included in the above chart (or companion charts), but %Used and
%Ready are generally the most significant metrics.
The following chart is
similar to the one immediately above, but this time we focus on the
peak hour and quantify the usage of individual VMs. Once again, we have
included Ready time (as well as normal processor usage) to emphasise its
importance in the context of virtual systems.
Usage:
- Two VMs clearly stand
out, because their usage is significantly higher than the rest. This does
not necessarily imply a problem, but it does suggest that these VMs might
need to be treated differently (in terms of allocated resources and
priorities), and perhaps that they should be moved to a different host.
- Consider Ready time. At first glance it looks
reasonable, but look a little closer: it seems fine for the two
dominant VMs, but what about the smaller ones? The amount of Ready time
accumulated by the smaller VMs is small in absolute terms, but in some cases
it appears to represent a significant proportion of their total used time.
At this point, you might want to take a closer look. There are several ways
of doing this, but perhaps the quickest and most usual
approach is to view the data directly in the browser.
In relative
terms, the amount of Ready time is still very small for the two heaviest users
(ACCOUNTS and PERFORMANCE), but for the other VMs, it represents about 30% of
the amount of useful processing (Pct Used). In practice, this is probably not a
problem, since demand is very low in these VMs. Nevertheless, we have
illustrated how easy it is to track and investigate these kinds of issues
with RG Solutions.
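The comparison of Ready time with useful processing can be expressed as a simple ratio. The sketch below is illustrative only: the (Pct Used, Pct Ready) values and the small-VM names are hypothetical, the 25% threshold is an arbitrary assumption, and only the names ACCOUNTS and PERFORMANCE come from the example above.

```python
def ready_to_used_ratio(pct_used, pct_ready):
    """Ready time as a fraction of useful processing (Pct Ready / Pct Used)."""
    if pct_used <= 0:
        return None  # no useful work recorded, so the ratio is undefined
    return pct_ready / pct_used


def flag_contended(vms, threshold=0.25):
    """Names of VMs whose ready/used ratio is at or above the threshold."""
    flagged = []
    for name, (used, ready) in vms.items():
        ratio = ready_to_used_ratio(used, ready)
        if ratio is not None and ratio >= threshold:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical (Pct Used, Pct Ready) pairs for the peak hour.
vms = {
    "ACCOUNTS": (60.0, 1.2),
    "PERFORMANCE": (45.0, 0.9),
    "SMALL-VM-A": (2.0, 0.6),
    "SMALL-VM-B": (1.0, 0.3),
}

flagged = flag_contended(vms)
```

A relative measure like this picks out the lightly loaded VMs, whose Ready time is tiny in absolute terms but large next to the work they actually do.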

In the next chart, we
examine usage over a complete day. The profile is similar, but not identical, to
that of Figure 5.
The same two VMs are dominant, but this time, the difference in usage levels
between them and the others is much less.
Virtual CPU
If we need to know more
about the resource consumption of one or more VMs, we can turn to the VCPU data
class.
To illustrate
the kind of information that is available, let us continue to investigate the
high-usage VMs that we referred to earlier.
Although we can use previously prepared reports to examine this data, in
practice most of our work in this area is likely to be ad hoc when there are
more than just a few VMs. RG Solutions® can help us to examine large volumes of
data from multiple perspectives relatively easily. Let’s start by looking at
the two “big” VMs in the browser.
The following
illustration shows the initial view of the VCPU data. Information is summarised
at hourly intervals, and we can use this to see how usage levels overall vary
across the entire day. Assuming that we didn’t already know that the peak hour
was at 14:00, we could just click twice on the “Pct Used” heading to sort the
information into descending order. This simple “trick” can be a real time-saver
when dealing with voluminous data.

We can now open up the
14:00 node and the two nodes associated with the VMs ACCOUNTS and PERFORMANCE as
shown above. There are
four items in each of these VMs: vmware-vmx, mks, vcpu-0 and vmm0 (vmware-vmx
itself has two sub-items). The names are the same as those that you will see at
the VMWare console. The major component in both VMs is vmm0, which is the single
virtual CPU assigned to each of these VMs.

If we wanted to determine how the
usage of the virtual CPU that is associated with the PERFORMANCE VM varied across
the day, we could use the RG Solutions® search facility to display records
associated with Group-Id 2225 (see the Group Id column). Note that we
can switch back and forth between the original data and the search results by
clicking on the tab at the top of the figure.

Alternatively, if we wanted to view all
of the processes associated with the PERFORMANCE VM, we could search for all
items in that Group.
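The two searches described above amount to filtering the VCPU records on one or more fields. The Python sketch below shows the idea outside the product; it is not the RG Solutions® search facility itself. The record layout, field names and usage values are hypothetical, and only the Group IDs, VM names and item names come from the example.

```python
def search(records, **criteria):
    """Return the records whose fields equal every given criterion."""
    return [r for r in records
            if all(r.get(field) == value for field, value in criteria.items())]


# Hypothetical hourly VCPU records.
records = [
    {"hour": "13:00", "group_id": 2225, "vm": "PERFORMANCE", "item": "vmm0", "pct_used": 30.0},
    {"hour": "14:00", "group_id": 2225, "vm": "PERFORMANCE", "item": "vmm0", "pct_used": 45.0},
    {"hour": "14:00", "group_id": 2225, "vm": "PERFORMANCE", "item": "mks", "pct_used": 0.5},
    {"hour": "14:00", "group_id": 2258, "vm": "ACCOUNTS", "item": "vmm0", "pct_used": 60.0},
]

# How the virtual CPU of one group varies across the day.
vcpu_over_day = search(records, group_id=2225, item="vmm0")

# Every item that belongs to the same group.
all_items = search(records, group_id=2225)
```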
Memory Data
The
following illustration presents a subset of the information that describes
memory usage. In our sample, most of the remaining data values were zero,
because the systems that we examined were not placing any strain on the
memory management function, so we have omitted a number of metrics. All
measurements are in megabytes.

The left hand chart shows Physical
memory: a relatively modest amount of memory is allocated to the Console and
Kernel; the rest is either free or allocated to VMs.
The middle chart shows memory as
either Reserved (the amount of memory that is committed to Resource Pools at
the time of the sample) or Unreserved (available for guaranteed allocations
to new VMs as they are powered on). It also shows the minimum amount of
memory that the Kernel will try to keep free (MinFree).
The right hand chart shows memory
usage from the perspective of memory page sharing: how much physical memory
is being shared (Shared), how much is common to all VMWare “Worlds” (Common)
and how much memory has been saved by maintaining just one copy of “common”
memory (Savings). Shared = Common + Savings.
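Given any two of the three page-sharing figures, the third follows from the identity above. A minimal sketch in Python, using hypothetical values in megabytes and a function name of our own:

```python
def page_sharing_savings(shared_mb, common_mb):
    """Memory saved by page sharing, from Shared = Common + Savings."""
    if common_mb > shared_mb:
        raise ValueError("Common cannot exceed Shared")
    return shared_mb - common_mb


# Hypothetical values, for illustration only.
savings = page_sharing_savings(shared_mb=1200, common_mb=300)
```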

The
following illustration provides an alternative view of the information
contained in the leftmost two panels of the chart above.
Group Memory Data
The
Group Memory data describes memory usage from the perspective of VMWare
“groups”, which in most cases refer to Virtual Machines and other Worlds
associated with them. A number of metrics are provided, but perhaps the
most useful are Memory Size (the amount of memory configured for a
given VM) and Target Size (the amount of memory to be allocated,
based upon recent usage, including the memory overheads of VMWare itself).
In our sample, the Target Size of every VM is less
than its specified Memory Size, but this is not always the case on VMWare
platforms. It is quite common to find that the Target Size exceeds the
Memory Size, because of the (generally small) additional memory overhead
imposed by VMWare.
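A quick way to check which VMs fall into each case is to subtract Memory Size from Target Size. The sketch below uses hypothetical VM names and sizes in megabytes; the function name is our own.

```python
def target_minus_size(groups):
    """Target Size minus Memory Size per VM, in MB (positive: target exceeds size)."""
    return {name: target - size for name, (size, target) in groups.items()}


# Hypothetical (Memory Size, Target Size) pairs in MB.
groups = {"VM-A": (1024, 890), "VM-B": (512, 540)}

diff = target_minus_size(groups)
over_target = sorted(name for name, delta in diff.items() if delta > 0)
```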
Network Port Data
VMWare Network statistics are
organised in a tree structure. Devices (typically virtual switches) sit at
the top of the tree, and below them we have Ports and Port-users. Ports may
be linked to either a physical or a virtual network interface card (NIC).
Port-users include VMs. The RG Solutions® Network Port Data Class reflects
this hierarchy.

We can use a report or the product’s
search capabilities to select information relating to a specific VM, but we
have also provided an optional ordering (Data Set) for this data that sorts
the information by User instead of Device.
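The alternative ordering can be pictured as regrouping the same records under a different key. The Python sketch below illustrates the idea only and does not describe how the product builds its Data Set; the device and port names are hypothetical.

```python
from collections import defaultdict


def by_user(port_records):
    """Regroup (device, port, user) records as {user: [(device, port), ...]}."""
    grouped = defaultdict(list)
    for device, port, user in port_records:
        grouped[user].append((device, port))
    # Sort users and their ports so the view is stable and easy to scan.
    return {user: sorted(ports) for user, ports in sorted(grouped.items())}


# Hypothetical device and port names; the VM names come from the example.
port_records = [
    ("vSwitch0", "port-1", "ACCOUNTS"),
    ("vSwitch0", "port-2", "PERFORMANCE"),
    ("vSwitch1", "port-1", "ACCOUNTS"),
]

view = by_user(port_records)
```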
Physical Disk Data
Like Network Port information,
Physical Disk information is organised according to the hierarchy that is
used by VMWare: Adapter, Channel ID (CID), Target ID (TID), Logical Unit ID
(LID), World ID (WID).

In the left hand screen shot, we observed that
the highest read rate occurred in LID 15 (134.67 reads per second), so we
opened up that node to find that all of the activity is associated with WID
2258. We can easily find out which World this is, by searching in the
Virtual CPU Data for a Group with a matching Group ID.
As we can see, the activity is
associated with the ACCOUNTS VM.
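The drill-down just described (find the busiest logical unit, then resolve its World ID against the Virtual CPU groups) can be sketched as follows. The records are deliberately flattened rather than held in the full Adapter/CID/TID/LID/WID tree. LID 15, the read rate of 134.67, WID 2258 and the VM names come from the example; the adapter name, the second record and the field names are hypothetical.

```python
def busiest_lun(disk_records):
    """Return the record with the highest read rate."""
    return max(disk_records, key=lambda r: r["reads_per_sec"])


def world_name(wid, vcpu_groups):
    """Look up a World ID among the Virtual CPU groups; None if not found."""
    return vcpu_groups.get(wid)


# Flattened disk records (one per LID/WID pair).
disk_records = [
    {"adapter": "vmhba0", "lid": 3, "wid": 2225, "reads_per_sec": 12.5},
    {"adapter": "vmhba0", "lid": 15, "wid": 2258, "reads_per_sec": 134.67},
]

# Group ID to VM name, as it would be read from the Virtual CPU data.
vcpu_groups = {2225: "PERFORMANCE", 2258: "ACCOUNTS"}

top = busiest_lun(disk_records)
owner = world_name(top["wid"], vcpu_groups)
```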
Summary
Use our new
RG Solutions® VMWare ESX product to examine the feasibility of migrating servers
to VMWare ESX and, subsequently, to manage your VMWare ESX environment. It allows you to
deliver fully optimised virtual environments and to monitor the ongoing performance of
the hardware and virtual layers.