Raxco’s Bob Nolan explains the role of the SAN, the storage controller and the VM workflow, how each affects virtualized system performance and what system admins can do to improve slow VMware/Hyper-V performance:
There is no doubt virtualization has brought enormous efficiencies to corporate IT, but better performance is not one of the benefits companies can count on. A survey of IT managers by ZK Research indicated more than 25% of virtualization projects were rolled back to physical servers for performance reasons. That is roughly a 1-in-4 failure rate. Queue contention, disk latency and sluggish virtual machine (VM) performance continue to plague virtualization implementations. Why do I/O-related performance problems persist as the industry moves steadily toward more virtualization and SAN technology?
VMware now recommends an activity that resolves these I/O issues and this blog post explains its importance in depth. To better understand these problems we first examine what is happening at the SAN, controller and virtual machine level and their respective roles in these performance issues.
Activity at The SAN Level
Disk latency is a key metric of I/O performance. Latency is defined as the elapsed time between the issuance of a disk I/O request and its completion.
VMware and EMC contend latency of 15 milliseconds (ms) bears watching and anything over 30ms is a problem. All disk I/O is ultimately satisfied by the storage controller. The controller receives SCSI commands from the virtual guests and breaks them down into smaller disk I/O requests.
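As a rough illustration of the definition and thresholds above, the following sketch computes latency from two timestamps and sorts it into bands. The function names and the "ok"/"watch"/"problem" labels are assumptions made for this example; only the 15ms and 30ms figures come from the text.

```python
# Thresholds quoted in the text; band names are illustrative assumptions.
WATCH_MS = 15.0    # latency at or above this "bears watching"
PROBLEM_MS = 30.0  # latency above this is considered a problem


def latency_ms(issued_at: float, completed_at: float) -> float:
    """Elapsed time between issuing a disk I/O request and its completion.

    Both timestamps are in seconds; the result is in milliseconds.
    """
    return (completed_at - issued_at) * 1000.0


def classify_latency(ms: float) -> str:
    """Sort a latency measurement into one of three bands."""
    if ms > PROBLEM_MS:
        return "problem"
    if ms >= WATCH_MS:
        return "watch"
    return "ok"


if __name__ == "__main__":
    for sample in (4.0, 18.5, 42.0):
        print(f"{sample:5.1f} ms -> {classify_latency(sample)}")
```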
Latency problems arise when the sheer volume of disk I/O exceeds the ability of the read/write heads to respond in an acceptable time. The disks are already operating at their maximum capacity so there is little that can be done at the SAN level to improve performance short of a major investment in Flash or caching technology. Even after investing in expensive Flash/caching technology, there is still a maximum capacity that cannot be exceeded; I/O performance issues are just delayed until some future time.
The Storage Controller
The storage controller is the traffic cop for the I/O path. A continuous stream of SCSI commands arrives from the VMs and is broken down into smaller disk I/O to the SAN. The total number of disk I/O is a multiple of the incoming SCSI workload. Caching and coalescence technology have been introduced to increase controller efficiency, but disk latency can still be an issue on busy systems. The controller’s role is to take the logical address information received with the incoming SCSI command and map it to a physical location in the array. All of this is under management of the controller software, so there is little an administrator can do at this level to influence I/O performance.
Virtual Machine (VM) Workflow
If an administrator can’t fix a performance issue at the SAN or controller level, that leaves the virtual machine. The VM includes the server and virtual disk, and this is where all the work is being done. Users request applications and data on one end, and SCSI commands go out across the hypervisor to the controller on the other end. The majority of what affects virtualization performance is determined by what happens inside the guest Windows file system (NTFS).
When we examine NTFS behavior we can appreciate why virtualization incurs serious I/O performance issues. To best illustrate this we will follow what happens in NTFS when a file is created:
A user saves a 2GB file as Sample.docx.
NTFS creates a record for the file in the Master File Table (MFT), the index to the volume.
This record holds some basic file information like file name and file ID. There is an Extent List in every file record in the MFT and it holds the logical address information Windows uses to internally locate a file.
Windows needs a way to keep track of the file so it now grabs 2GB of free clusters of logical address space from the $Bitmap File, an NTFS metadata file on each volume. The $Bitmap File identifies which clusters are free and which are in use. There are two important things to note here:
- this is logical address space and NOT disk space, and
- when NTFS is looking for free space nothing else happens on the server until the request for space is satisfied.
Each chunk of address space is recorded in the Extent List in the file record in the MFT. NTFS records the:
- Virtual Cluster Number (VCN) which identifies which piece of the file it is (1st, 2nd, etc.);
- Logical Cluster Number (LCN) which identifies the starting cluster number of the string; and
- Run Length which identifies the length of the string in clusters.
Every row in the Extent List now becomes a SCSI command to the hypervisor and controller. As far as Windows is concerned, if there is more than one entry in an Extent List it is a logically fragmented file. Keep in mind at this point nothing has been written to the disk.
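The Extent List described above can be sketched as a small data structure. This is a simplified model for illustration only, not the on-disk NTFS format; the class and function names are assumptions, and the "one SCSI command per row" count follows the model used in this post.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Extent:
    vcn: int         # Virtual Cluster Number: where this piece sits within the file
    lcn: int         # Logical Cluster Number: starting cluster of the run on the volume
    run_length: int  # length of the run, in clusters


def is_logically_fragmented(extents: List[Extent]) -> bool:
    """A file with more than one Extent List entry is logically fragmented."""
    return len(extents) > 1


def scsi_commands_needed(extents: List[Extent]) -> int:
    """Per the model in this post, each Extent List row becomes one SCSI command."""
    return len(extents)


if __name__ == "__main__":
    # Hypothetical file of 100 clusters stored in three separate runs.
    fragmented = [Extent(0, 1000, 50), Extent(50, 5000, 30), Extent(80, 200, 20)]
    # The same 100 clusters stored in a single run.
    contiguous = [Extent(0, 7000, 100)]
    for name, ext in (("fragmented", fragmented), ("contiguous", contiguous)):
        print(name, is_logically_fragmented(ext), scsi_commands_needed(ext))
```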
Since every logical file fragment is a SCSI command, it is easy to see how performance problems develop. If the sample 2GB file is saved in 100 chunks of about 20MB each, it will take 100 SCSI commands to move it to the controller. The HBA-LUN queue has 32 slots, so it will take 3.1 passes (100 ÷ 32) through the entire queue to move the file to the controller, with each SCSI command passing 20MB. This is not an efficient use of queue resources and this is why queue contention becomes a performance bottleneck.
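The queue arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the 32-slot queue depth and the 100-fragment file are the figures from the text, the function names are made up for the example, and 2GB is taken as 2048MB, which is why the chunk size comes out slightly above the rounded 20MB.

```python
import math

QUEUE_SLOTS = 32  # HBA-LUN queue depth used in the text


def chunk_size_mb(file_size_mb: float, fragments: int) -> float:
    """Average amount of data carried by each SCSI command."""
    return file_size_mb / fragments


def queue_passes(commands: int, slots: int = QUEUE_SLOTS) -> float:
    """Fractional number of trips through the queue needed for all commands."""
    return commands / slots


def queue_fills(commands: int, slots: int = QUEUE_SLOTS) -> int:
    """Whole number of times the queue must be loaded to drain all commands."""
    return math.ceil(commands / slots)


if __name__ == "__main__":
    print(chunk_size_mb(2048, 100))  # MB per command for the fragmented file
    print(queue_passes(100))         # fragmented: 100 commands
    print(queue_fills(100))
    print(queue_passes(1))           # contiguous: 1 command
```

Note that 3.1 passes is a fractional figure; in whole queue loads, the fragmented file needs four.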
The controller breaks each SCSI command down into smaller disk I/O in accordance with its software. We don’t know how many disk I/O the controller will generate for the 100 SCSI commands, but we do know the absolute minimum number of disk I/O is 100 (1 per SCSI command).
If we can control the number of SCSI commands coming from the guest systems we can reduce contention in the HBA-LUN queues and reduce the workload for the hypervisor and controller. This reduction in SCSI traffic in turn reduces the disk I/O workload. The net effect is lower disk latency and less queue contention. When this unnecessary workload is eliminated, throughput increases and additional resources are freed up.
So, how do you reduce the SCSI workload coming from the guests?
VMware Recommended Solution
In the V5.x vSphere documentation there is a section titled Disk I/O Performance Enhancement Advice that recommends 12 measures that can be taken to improve I/O. The second recommendation on the chart is “Defragment the file systems on all the guests.”
Guest based disk optimization software scans the MFT and identifies files with more than one entry in their Extent Lists. Logical cluster optimization works to consolidate these multiple entries down to a single entry. When this is accomplished, only one (1) SCSI command is needed to access any file. This in turn reduces the disk I/O workload proportionally.
The table below compares the number of SCSI commands and the absolute minimum number of disk I/O for the 2GB file when it was fragmented and when it was contiguous.
|File State||SCSI Commands||Absolute Min. Disk I/O|
|Logically Fragmented File||100||100|
|Logically Contiguous File||1||1|
|Workload Reduction in %||99||99|
Before optimization the 100 SCSI commands needed 3.1 passes through the 32-slot HBA-LUN queue to move the 2GB file to the controller. After optimization the one SCSI command needs only a single slot in the queue. It is much more efficient to move 2GB in one chunk than in 100 chunks of 20MB.
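The 99% figure in the table follows directly from the before and after counts. A small sketch of that calculation, with a hypothetical helper name:

```python
def workload_reduction_pct(before: int, after: int) -> float:
    """Percentage drop in command (or disk I/O) count after optimization."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100 * (before - after) / before


if __name__ == "__main__":
    # 100 SCSI commands for the fragmented file, 1 for the contiguous file.
    print(workload_reduction_pct(100, 1))
```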
The entire I/O path from guest to SAN is influenced by the behavior of NTFS inside the Windows guest system. While file fragmentation is often cited as a performance issue, the real culprit is often free space fragmentation. When NTFS can’t find sufficient contiguous free space it has no choice but to create a file in more than one piece of logical address space, producing a fragmented file.
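To make the link between free space fragmentation and file fragmentation concrete, here is a toy allocator over a cluster bitmap. It is a deliberately simplified sketch and not NTFS's actual allocation policy: the bitmap is a plain list of booleans (True means in use), the allocator does not mark clusters as used, and all names are assumptions for this example.

```python
from typing import List, Tuple

Run = Tuple[int, int]          # (starting cluster, length in clusters)
Extent = Tuple[int, int, int]  # (VCN, LCN, run length)


def free_runs(bitmap: List[bool]) -> List[Run]:
    """Return every run of consecutive free clusters (False = free)."""
    runs: List[Run] = []
    start = None
    for i, used in enumerate(bitmap):
        if not used and start is None:
            start = i                       # a free run begins here
        elif used and start is not None:
            runs.append((start, i - start))  # the free run just ended
            start = None
    if start is not None:                    # free run reaches end of volume
        runs.append((start, len(bitmap) - start))
    return runs


def allocate(bitmap: List[bool], clusters_needed: int) -> List[Extent]:
    """Pick address space for a new file and return its extent list.

    Uses the first single free run that is large enough; if none exists,
    the file is split across free runs in volume order (a fragmented file).
    """
    runs = free_runs(bitmap)
    if sum(length for _, length in runs) < clusters_needed:
        raise ValueError("not enough free clusters")
    for start, length in runs:
        if length >= clusters_needed:
            return [(0, start, clusters_needed)]
    extents: List[Extent] = []
    vcn = 0
    for start, length in runs:
        take = min(length, clusters_needed - vcn)
        extents.append((vcn, start, take))
        vcn += take
        if vcn == clusters_needed:
            break
    return extents


if __name__ == "__main__":
    # Ten clusters; free space is split into runs of 2, 3 and 2 clusters.
    bitmap = [True, False, False, True, False, False, False, True, False, False]
    print(allocate(bitmap, 3))  # fits in one free run: one extent
    print(allocate(bitmap, 5))  # no single run is big enough: two extents
```

In this model the 5-cluster file ends up with two extents purely because the free space was split, even though enough free clusters existed in total.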
VMware realized that a comprehensive solution to I/O performance issues had to start at the source, inside the Windows guest. Optimizing the guest file systems accomplishes several goals in remediating I/O performance problems:
- Reduces SCSI traffic which eliminates a large part of queue contention
- Improves disk latency by reducing disk I/O workload attendant to the SCSI traffic
- Increases average I/O size which means fewer IOPS to do the same work
- Improves throughput and productivity through the reduced SCSI and disk I/O workload
- Maximizes performance in response to actions VMware takes to enforce “fairness” by tweaking the number of outstanding I/O requests with VMKernel settings
- Eliminates the unnecessary I/O workloads that create bottlenecks and impede performance
- Frees resources that can support more VMs per host
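The point in the list above about average I/O size and IOPS is simple arithmetic: for a fixed amount of data per second, the number of I/O operations needed falls as each operation carries more data. The sketch below uses made-up throughput and I/O sizes purely for illustration; none of the figures come from the source.

```python
KIB = 1024
MIB = 1024 * KIB


def iops_required(throughput_bytes_per_s: int, avg_io_bytes: int) -> float:
    """I/O operations per second needed to sustain a given throughput."""
    if avg_io_bytes <= 0:
        raise ValueError("avg_io_bytes must be positive")
    return throughput_bytes_per_s / avg_io_bytes


if __name__ == "__main__":
    target = 200 * MIB  # hypothetical workload: 200 MiB moved per second
    for size in (64 * KIB, 1 * MIB):
        print(f"{size // KIB:5d} KiB per I/O -> {iops_required(target, size):.0f} IOPS")
```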
There are special issues that need to be considered when optimizing virtual guest systems. Different virtual platforms and their configurations can present situations where the optimization needs to be able to work without a negative impact on resources. You will want to make sure whatever solution you choose is adaptable to your environment. The following are some recognized conditions that can impact optimization on virtual guests:
- Thin-Provisioning: Optimizing can blow out thin-provisioned volumes unless special care is taken to avoid space expansion
- Snapshots and Clones: Virtual optimization software should avoid optimizing drives that have snapshots or are clones
- Resource Sensitivity: Optimization software should be aware of activity on other guests and the host
- Flexibility: Every site is different; optimization software should be adaptable to any virtual configuration
- Scheduling: Requirements change and you need to be sure you can easily modify how and when optimization occurs
- Central Management: The ability to control optimization across all virtual and physical systems is essential for best results
PerfectDisk Solutions for vSphere and Hyper-V take all of these conditions into consideration in order to optimize virtualized systems.
About The Author
Bob Nolan is an Officer of Raxco Software. Under Mr. Nolan’s leadership, Raxco has evolved into one of the leading providers of system utilities for the Windows operating systems. Recognizing the opportunity in the Windows utility market, Mr. Nolan and his team have transformed the company from the leading provider of OpenVMS system management software to a leader in providing solutions for the Windows market. Under his direction, Raxco’s PerfectDisk has become the recognized leader of disk defragmentation software for enterprises, small business, and consumers, with numerous awards to its credit and a fast-growing customer base. Mr. Nolan provides the strategic direction for Raxco product development and leads the development and expansion of strategic alliances and international operations.