How NTFS Causes IO Bottlenecks on Virtual Machines – Part 2

From a contributed article by Bob Nolan, Raxco President and CEO.

Part 1 of this article details the NTFS behavior that leads to IO bottlenecks and how this relates to performance issues. In a virtual environment, where multiple copies of NTFS are running on the same physical server, virtual machines compete for disk access. Too much IO results in disk latency issues, which in turn cause throughput problems.

What is happening inside NTFS on each VM? When a VM is created, the virtual disk is formatted with NTFS and the volume index ($MFT) and $Bitmap file are created. When a user creates a file, a record is created in the $MFT and space is allocated using $Bitmap. If $Bitmap shows enough free space in a single string of logical clusters, the starting address and length of that string are recorded in the file record's Extent List in the $MFT. If $Bitmap cannot find space in a single string of logical clusters, it looks for space wherever it can find it until the whole file is allocated. The starting address and length of each piece of allocated space are recorded in the file record's Extent List in the $MFT. Multiple extent entries in the $MFT mean NTFS can fragment a file before any user data is written to the disk.
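The allocation behavior described above can be sketched as a toy model. This is a simplified illustration, not the actual NTFS allocator: the `free_runs` and `allocate` functions and the list-of-booleans bitmap are assumptions made for the example.

```python
def free_runs(bitmap):
    """Yield (start, length) for each run of free clusters (False = free)."""
    start = None
    for i, used in enumerate(bitmap):
        if not used and start is None:
            start = i
        elif used and start is not None:
            yield (start, i - start)
            start = None
    if start is not None:
        yield (start, len(bitmap) - start)


def allocate(bitmap, clusters_needed):
    """Return a toy extent list [(start, length), ...] and mark it used."""
    runs = list(free_runs(bitmap))
    for start, length in runs:
        if length >= clusters_needed:
            # One contiguous string of clusters: a single extent.
            extents = [(start, clusters_needed)]
            break
    else:
        # No single run is large enough: take space wherever it is free.
        extents = []
        remaining = clusters_needed
        for start, length in runs:
            take = min(length, remaining)
            extents.append((start, take))
            remaining -= take
            if remaining == 0:
                break
        if remaining:
            raise OSError("not enough free clusters")
    for start, length in extents:
        for cluster in range(start, start + length):
            bitmap[cluster] = True
    return extents


empty_volume = [False] * 10
fragmented_volume = [True, False, False, True, False,
                     True, False, False, False, True]
print(allocate(empty_volume, 4))       # one extent
print(allocate(fragmented_volume, 4))  # several extents
```

On the empty volume the four-cluster file lands in one extent. On the fragmented volume the same request is split across three extents, even though enough free clusters exist in total.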

NTFS is unaware of its underlying disk technology (IDE, SCSI, RAIDX or SAN). NTFS allocates space for files based on availability as seen by the $Bitmap file. The information in the $MFT Extent List represents where $Bitmap found enough space to allocate the file.

When a user needs a file, the $MFT is read and the logical address data in the file's Extent List is passed to the storage controller. The controller maps these logical addresses to physical blocks on the disk. On a write, the storage controller writes the file and updates its own index. Any subsequent access to the file (read or write) requires passing the entries in the Extent List to the disk controller. If NTFS allocates the file in a single extent, it passes a single entry to the disk controller. If NTFS allocates the file in multiple extents, it must pass each extent to the disk controller, and the controller maps each extent as it is received. A file in 2,000 extents requires 2,000 logical IOs, while a file in a single extent requires one logical IO.
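The one-request-per-extent arithmetic above can be illustrated with a short sketch. The `CountingController` class and `read_file` function are hypothetical stand-ins for the storage controller and the file system read path. They only count requests and do not model real hardware or any Windows API.

```python
class CountingController:
    """Hypothetical controller that counts the requests it receives."""

    def __init__(self):
        self.requests = 0

    def read(self, start, length):
        # Each call stands for one logical IO handed to the controller.
        self.requests += 1
        return length


def read_file(controller, extent_list):
    """Pass every extent to the controller; return clusters read."""
    return sum(controller.read(start, length) for start, length in extent_list)


# The same 2,000 clusters, laid out two different ways.
contiguous = [(1000, 2000)]
fragmented = [(i * 10, 1) for i in range(2000)]

for name, extents in (("contiguous", contiguous), ("fragmented", fragmented)):
    controller = CountingController()
    clusters = read_file(controller, extents)
    print(name, "clusters:", clusters, "logical IOs:", controller.requests)
```

Both layouts return the same amount of data. Only the number of requests changes, and that number is what the rest of the article aims to reduce.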

Why does $Bitmap allocate space in multiple extents? $Bitmap is created when the disk is formatted and it indicates whether a logical cluster is used or free. Installing Windows Server and other applications for the VM uses a large portion of disk space. You will discover thousands of fragmented files if you look at a disk after installing software. Even the remaining free space on the disk is fragmented from installations cleaning up and deleting their temporary files. Once free space is fragmented, $Bitmap has no choice but to create multiple extents when it cannot locate sufficient contiguous space to allocate to a file. Free space plays a key role in how NTFS behavior affects VM performance.
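To see how deleted temporary files leave fragmented free space, consider the rough simulation below. The volume size, the pattern of three kept clusters followed by two temporary clusters, and the function names are all assumptions chosen for illustration. Real installers and the real NTFS allocator behave far less regularly.

```python
from itertools import groupby


def free_run_lengths(bitmap):
    """Length of each contiguous run of free clusters (False = free)."""
    return [len(list(group)) for used, group in groupby(bitmap) if not used]


def simulate_install(volume=100, installed=50, keep=3, temp=2):
    """Write alternating kept and temporary clusters, then delete the temps."""
    bitmap = [False] * volume
    temp_clusters = []
    pos = 0
    while pos + keep + temp <= installed:
        for cluster in range(pos, pos + keep):
            bitmap[cluster] = True            # files the installer keeps
        for cluster in range(pos + keep, pos + keep + temp):
            bitmap[cluster] = True            # temporary files
            temp_clusters.append(cluster)
        pos += keep + temp
    for cluster in temp_clusters:
        bitmap[cluster] = False               # cleanup leaves small holes
    return bitmap


runs = free_run_lengths(simulate_install())
print("free clusters:", sum(runs))
print("free runs:", len(runs))
print("largest free run:", max(runs))
```

With these default values the toy volume ends up with 70 free clusters spread over 10 runs, the largest being 52 clusters. A 60-cluster file would fit on the volume but could not be allocated in a single extent.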

The remainder of this article explains how to reduce the number of logical IOs NTFS is sending to the storage controller.

Category: PerfectDisk, Storage, Virtualization