About the speaker
Rick Vanover (vExpert, MCITP, VCP) is a product strategy specialist for Veeam Software based in Columbus, Ohio. Rick is a popular blogger, podcaster and active member of the virtualization community.
vSphere 5 Changes for Backups and Administration
In this session you will learn:
- About VMware vSphere 5
- Major changes in scale
- Virtual machine hardware version 8
- Major changes in storage
- Storage DRS
- VDDK 5
- New version of HA
- How backups are impacted with vSphere 5
Rick Vanover: Hello and welcome to this next session of Backup Academy: vSphere 5 Changes for Backups and Administration. I'm Rick Vanover, and I'm going to be your instructor for this session. Before we get started on this material, I want to take a second and tell you a little bit about myself. I work at Veeam Software, makers of virtualization data protection software and virtualization management tools, but this content, as you know, is very product neutral in nature. I'm a blogger and podcaster, and I've contributed to a number of industry publications. I write on the Veeam blog. I contribute to the Everyday Virtualization blog at Virtualization Review. I also write in the Data Center blog at TechRepublic.com. I hold both VMware vExpert and VMware Certified Professional credentials as well as Microsoft Certified IT Professional status.
So let's go through the agenda of what we're going to talk about here today. I'm going to start off talking a little bit about VMware vSphere 5. From there, we're going to go into some of the specific topics of how it's different for both virtualization administration and backup. Specifically, we're going to talk about how the scale has changed. VMware vSphere 5 has really done a good job of scaling upward for virtual infrastructures. We're going to talk about what virtual machine hardware version 8 is. Then we're going to talk about storage. vSphere 5 is often referred to as the storage release. We're going to go through a number of features, such as storage DRS, VDDK 5, VMFS 5, as well as the new version of HA. Then we're going to wrap up talking about how backups are impacted with vSphere 5.
So let's talk about why vSphere 5 is important to our backup strategy. As we go onward in our versions of vSphere and virtualization in general, we need to be aware of the changes in the platform. Specifically, how we went about backups in the ESX 2.0 and VMware GSX days does not apply today to our virtualization tools and strategy for vSphere 5. As we've changed through the years with VI3 and vSphere 4 and now vSphere 5, with all of those different changes, we need to really reassess how we do every little step each time with the new versions.
So when it comes to the technology involved, a number of different areas are impacted. Specifically, what makes up a virtual machine? I mentioned earlier new limits of scaling for VMware virtual machines, and how those virtual machines are configured is very important. Specifically, older hardware versions of virtual machines complicate the ongoing administration of virtual machines as they are intermixed with newer virtual machines. So we have a huge push to adopt the features that we need for the platform we are using. That is because everything is related, and it can impact our performance, our backup configuration, as well as the design of our virtual infrastructure.
So let's talk a little bit about VMware vSphere 5. Specifically VMware vSphere 5 is the leading virtualization platform. It's evolved from VMware vSphere 4, which evolved from VI3, and it delivers the most robust features for virtualization today. In this figure, you see a generic representation of a vSphere environment. Basically, we'll start at the bottom. The disk looking icons, those are shared storage resources. Direct attached storage is supported as well for vSphere, but you really don't reap the benefits of the higher levels of availability until you introduce shared storage.
Then you have your ESXi hosts, which are general servers, and any mainstream server is supported. It really comes down to processor and storage interface and network interface, but that usually follows the normal inventory of servers that are available today. Then on the pool of infrastructure, that hardware, we can stack multiple operating systems. In this particular figure, they are all Windows VMs. That's really representative of my experience, but Linux VMs as well as a number of other operating systems are supported as virtual machines with VMware vSphere 5. Here, you could have a number of robust features.
DRS, VMware's Distributed Resource Scheduler, for example, will move virtual machines from one host to another based on a number of factors, including how busy the first host is. VMware HA is another feature that allows a virtual machine to be restarted on another host in the event of a hardware failure of the first host. VMFS is a shared file system, a clustered file system to be more specific, provided by VMware specifically for virtual machines. Then there are advanced storage features, which we'll talk about more in a bit.
VMware vSphere 5 is also the first release of this platform, this evolution of this product, that ships only as ESXi, which does not have a service console. So, as of VMware vSphere 5, ESX, the full installed version, is no longer available. So now, we only have ESXi. This is really important when we talk about host based agents. When we first started with ESX and VMware vSphere and VI3, that was something to consider. Maybe I'll put in an agent on the host to do this particular function. Specifically when we're talking about backups, we have definitely reached the end of the line. The end of the line has been defined for agent based installs on the host. So, when it comes to a backup product, we need to make sure that we have a product that fully supports all of the APIs and programmatic interfaces into the ESXi or vSphere stack. ESXi is arranged like this. There are really three components: hardware monitoring agents, system management agents, and host process and third party agents. Those sit on the VMkernel, which is the hypervisor that we call ESXi.
One of the cornerstone features of VMware vSphere 5 is the incredible change in scalability. Specifically, we can now provision the dreaded jumbo or monster VM with 32 virtual CPUs and up to one terabyte of RAM. Other scale points are available with vSphere 5, but those are the big ones: 32 virtual CPUs and one terabyte of RAM. Just because we can doesn't mean we should, but the takeaway is that the platform is not limiting. We can really provision incredible amounts of resources with vSphere 5. The same goes for the host. We can put up to two terabytes of RAM inside of one ESXi host, along with an incredible number of other resources.
The best way to get your head around the changes in scale for VMware vSphere 5 is to read the configuration maximums document. This might sound familiar if you've been on any VMware certification path, as this document is central to your study practice. I've included a short URL, vee.am/vsphere5max. That will take you right to the configuration maximums document for VMware vSphere 5. Simply put, it is the authoritative resource for configuration. So, if you want to know if a virtual machine configuration is possible, this is the resource to check out.
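As a rough illustration of using those maximums as a sanity check before provisioning, here is a minimal Python sketch. It encodes only the two per-VM limits quoted in this session (32 virtual CPUs and one terabyte of RAM). The names VSPHERE5_VM_LIMITS and check_vm_request are made up for this example, and the configuration maximums document remains the authoritative source.

```python
# Per-VM limits quoted in this session only. The configuration maximums
# document is the authoritative list; these names are hypothetical.
VSPHERE5_VM_LIMITS = {
    "vcpus": 32,      # virtual CPUs per VM
    "ram_gb": 1024,   # 1 TB of RAM per VM
}


def check_vm_request(vcpus, ram_gb, limits=VSPHERE5_VM_LIMITS):
    """Return a list of problems; an empty list means the request fits."""
    problems = []
    if vcpus > limits["vcpus"]:
        problems.append(f"{vcpus} vCPUs exceeds the limit of {limits['vcpus']}")
    if ram_gb > limits["ram_gb"]:
        problems.append(f"{ram_gb} GB RAM exceeds the limit of {limits['ram_gb']} GB")
    return problems


if __name__ == "__main__":
    print(check_vm_request(32, 1024))   # a "monster VM" right at the limits
    print(check_vm_request(64, 2048))   # too large on both counts
```

A request that sits exactly at the limits returns an empty list, which matches the "just because we can doesn't mean we should" point: the check only says what is possible, not what is wise.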
The first thing I want to talk about is virtual machine hardware version 8. Specifically, virtual machine hardware version 4 was available in VI3. Then with vSphere 4, we introduced virtual machine hardware version 7. With vSphere 5, the virtual machine hardware is up to version 8. You can update this in a number of different ways. You can do it in a centrally managed fashion with Update Manager, or you can use the vSphere Client.
As you see on the screenshot, it can be that easy. Upgrade virtual hardware. The virtual machine does have to be powered off to do that, but it is possible to do it one at a time that way. Update manager is going to be your best way to do it on scale with a number of virtual machines. This is important specifically for backups. Simply put, all the features may not work correctly or may not be working in an optimized fashion on older versions of hardware on newer hypervisors. So it's important to make sure that each virtual machine is up to the current hardware version of that vSphere platform.
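To make the idea of spotting out-of-date virtual hardware concrete, here is a small Python sketch. The version numbers per platform are the ones given in this session; the inventory dictionary and the function name vms_needing_upgrade are hypothetical, and a real environment would pull this data from vCenter rather than from a hard-coded dictionary.

```python
# Virtual machine hardware version per platform, as described in this session.
CURRENT_HW_VERSION = {"VI3": 4, "vSphere 4": 7, "vSphere 5": 8}


def vms_needing_upgrade(inventory, platform="vSphere 5"):
    """Return names of VMs whose hardware version is below the platform's."""
    target = CURRENT_HW_VERSION[platform]
    return [name for name, hw_version in inventory.items() if hw_version < target]


if __name__ == "__main__":
    # Hypothetical inventory: VM name -> virtual hardware version.
    inventory = {"web01": 8, "sql01": 7, "legacy-app": 4}
    print(vms_needing_upgrade(inventory))
```

A list like this is the kind of input you would want before scheduling the power-off window that the upgrade requires.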
vSphere 5, I mentioned earlier, is also known as the storage release. There are a number of new storage features that are presented with vSphere 5. We're going to talk about only some of them in this content. Basically, they really are set up to deliver additional performance and additional management. They're not just limited to VMFS. We talked earlier that VMFS is a proprietary clustered file system for VMware virtual machines. But now some of these features are starting to show up on NFS resources.
Specifically, Storage I/O Control, which allows throughput to a certain device to be managed through vSphere, is now available for NFS, and some of the VAAI offloads are available for NFS now, too. VAAI stands for vStorage APIs for Array Integration. Those are a very important set of primitive commands that, instead of being done by the ESXi host, are sent to the array. The array owns the disks, whether it be block storage like Fibre Channel or iSCSI, or a NAS storage protocol like NFS. Now we can send some of that work to the array to do for us.
So in the graphic below, we've taken a simplified version. The storage lives in between the host and the disks. Now, direct attached storage is a little bit different in that the disks are directly in the servers, but we still could use a VMFS volume for local attached disks. Other storage protocols include NFS, Network File System, which is a NAS based protocol. Then the two block storage protocols would be iSCSI and Fibre Channel.
One of the next important features of vSphere 5 is storage DRS. Storage DRS, simply put, is a technique to pool VMFS volumes logically. What I mean by that is we present one large clustered data store, which is composed of individual explicit data stores beneath it. So, this logical data store is a pool, and from there, we can set a few configuration parameters, which I'm going to show you on the next screen, that allow us to set some of our basic management of these storage resources. From practical real world experience, I can tell you that vSphere administrators have historically spent a lot of time looking at data stores. How are they performing? How full are they? That type of stuff. Then they are constantly, from an operational standpoint, moving things around. Storage DRS can help us manage that a lot better.
Storage DRS really gives us two very important things to monitor: utilized space and I/O latency. So, I've snipped out part of the screenshot here. Basically, these two slider bars will allow you to set your storage DRS policy based on how much free space to watch and how much latency to monitor. Basically, the administrator can craft it to be very aggressive or not so aggressive. What this will do is trigger Storage vMotion tasks. Again, all the individual data stores are still regular VMFS data stores, and they can hold VMs and run VMs, but they're members of this pool. So storage DRS will position these VMs based on this policy so that the utilized space and I/O latency requirements can be met, assuming that there's free space.
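The two-slider policy can be sketched in a few lines of Python. This is only a toy model of the decision, not how storage DRS is actually implemented: the Datastore class, the function names, and the threshold values of 80 percent used space and 15 milliseconds of latency are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Datastore:
    name: str
    capacity_gb: float
    used_gb: float
    latency_ms: float


def over_threshold(ds, max_used_pct=80.0, max_latency_ms=15.0):
    """Return the reasons a datastore violates the policy (empty if none)."""
    reasons = []
    used_pct = 100.0 * ds.used_gb / ds.capacity_gb
    if used_pct > max_used_pct:
        reasons.append("space")
    if ds.latency_ms > max_latency_ms:
        reasons.append("latency")
    return reasons


def pick_target(pool, source, vm_size_gb, max_used_pct=80.0, max_latency_ms=15.0):
    """Pick the emptiest datastore that stays within policy after the move.

    Returns None when no member of the pool can take the VM, which mirrors
    the "assuming that there's free space" caveat.
    """
    candidates = []
    for ds in pool:
        if ds is source:
            continue
        used_pct_after = 100.0 * (ds.used_gb + vm_size_gb) / ds.capacity_gb
        if used_pct_after <= max_used_pct and ds.latency_ms <= max_latency_ms:
            candidates.append((used_pct_after, ds))
    if not candidates:
        return None
    return min(candidates, key=lambda pair: pair[0])[1]
```

Lowering the two thresholds corresponds to dragging the sliders toward the aggressive end: more data stores count as over threshold, so more moves get recommended.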
The next thing I want to talk about in relation to VMware storage is VDDK 5. VDDK stands for Virtual Disk Development Kit, and it has been updated with vSphere 5 from the initial release for vSphere 4. Specifically, it's used to hot add a virtual disk, which is a subset of the hot add capability. So, we may have been able to do hot add for memory and CPU and maybe network cards. Hot add for disks, specifically as it applies to backups, really is built around the snapshot. So when you do a snapshot of a virtual machine, there's a sequencing where you can co-map a virtual machine's VMDK disk to two virtual machines, because at that point, that VMDK is read only.
So in the figure here, I have two VMs, one on the left and one on the right. Let's just say the one on the left is a regular virtual machine that has been snapshotted. While the snapshot is in place, that VMDK is read only. During that time, through the VDDK, the virtual machine that is running the backup application can be given direct access to that virtual machine disk file, that VMDK file. That would be a virtual appliance hot add or VMDK hot add, as it might be called conversationally. What happens is that the backup VM has direct access to that disk for the purpose of a backup.
Then following that sequence out, once the backup task is done, the hot add would then be removed, and then the snapshot removed. At the end of the line, the original virtual machine is still mapped exclusively to that VMDK, but the virtual machine has been backed up in a very efficient manner. This is one of those things where we really kind of zero back in on the agentless backup, because this can be a very, very fast way to back up virtual machines. We're leveraging our platform, but most importantly, we don't have to have the management burden of agents in the guest. From a preference standpoint, agents increase your management burden, but understanding these capabilities can help you reduce that management burden.
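The ordering of that sequence, and the fact that cleanup has to happen in reverse order even when the backup fails, can be shown with a short Python sketch. Nothing here calls the real VDDK or vSphere APIs: FakePlatform and its method names are hypothetical stand-ins that only record what would happen.

```python
class FakePlatform:
    """Stand-in for a real backup integration; it only records calls."""

    def __init__(self):
        self.log = []

    def create_snapshot(self, vm):
        self.log.append(f"snapshot {vm}")

    def hot_add_disk(self, vm, proxy):
        self.log.append(f"hot add {vm} disk to {proxy}")

    def read_disk(self, vm, proxy):
        self.log.append(f"{proxy} reads {vm} disk")

    def remove_hot_add(self, vm, proxy):
        self.log.append(f"remove {vm} disk from {proxy}")

    def remove_snapshot(self, vm):
        self.log.append(f"remove snapshot {vm}")


def hot_add_backup(platform, vm, proxy):
    """Snapshot, hot add, back up, then undo both steps in reverse order."""
    platform.create_snapshot(vm)          # base VMDK becomes read only
    try:
        platform.hot_add_disk(vm, proxy)  # backup VM gets direct disk access
        try:
            platform.read_disk(vm, proxy)
        finally:
            platform.remove_hot_add(vm, proxy)
    finally:
        platform.remove_snapshot(vm)      # VM is again the exclusive owner


if __name__ == "__main__":
    platform = FakePlatform()
    hot_add_backup(platform, "app-vm", "backup-proxy")
    for entry in platform.log:
        print(entry)
```

The nested try/finally blocks are the design point: a failed read still removes the hot add first and the snapshot second, so the original virtual machine does not end up running on a leftover snapshot.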
Now I want to talk about VMFS 5. I've mentioned it earlier, but basically, VMFS 5 is the Virtual Machine File System version 5 that was released with vSphere 5. This is available for block storage protocols, specifically Fibre Channel and iSCSI. Now, when I mentioned that VMFS is a clustered file system, let me take a moment to explain what I mean by that. I did say earlier that it's specifically built for virtual machines, and it's a VMware proprietary technology. Further, it's a unique file system among clustered file systems in that there's no separate software coordination that goes on. The file system itself manages the concurrent access by multiple hosts. So, that's a good thing in the sense that it is easy to scale and easy to manage. But beyond that, we need to make sure that we have the compatibility we need.
In terms of compatibility, let's talk about its backwards and forward compatible nature. VMFS 3 did a really good job of supporting older hypervisors, all the way from ESX 3.0 up to ESXi 4.1, and vSphere 5's ESXi can read those data stores as well. Further, ESXi 5 can actually format new storage resources as VMFS 3. So if you still need additional data stores that are VMFS 3 to move stuff around, ESXi 5 can do that for you. One note is that the version of that data store, which you can see by right clicking on the data store and selecting properties, would be 3.54. That's an easy way to tell that an ESXi 5 host made that data store. But ESXi 4.1 hosts and earlier can get into that volume just fine.
Upgrading to a new ESXi environment is one thing, but upgrading the storage is another. So, we have to make sure that we don't leave our data stores behind. We could go ahead and just leave them all as VMFS 3, but there are some other features I'm going to talk about in a second that are very important to the VMFS 5 volume. So we want to make sure that these volumes get upgraded. The easy option is just to right click on it and say upgrade to VMFS 5. That'll work, and that'll even work online, believe it or not. But you may want to consider creating a new volume formatted as VMFS 5 on the storage system and then moving everything off of the VMFS 3 volume. Then remove that VMFS 3 volume and reclaim that space on the storage controller. I'll be honest, it's a lot more work, but it's going to be cleaner, and you really remove the risk of any leftovers from the years and many different versions of ESX and VMs on these data stores. It can be just a cleaner solution.
VMFS 5 also brings a number of new features that might help entice us to move to VMFS 5 via a new format. Specifically, there's a unified one megabyte block size. Previously, when you formatted a VMFS 3 volume, you were given a choice. Do you want to do it as a 1 meg, a 2 meg, a 4 meg, or an 8 meg block size? That was an important choice, because that determined the largest size of a VMDK disk that you could put on that volume. Now with VMFS 5, there's a unified one megabyte block size that takes away that guesswork so that you don't have the issues downstream. Specifically, in VMFS 3, a one meg block size limited us to only a 256 gigabyte VMDK file. Now we don't have that limit, so you can put VMDK files of up to 2048 gigabytes on a one megabyte VMFS 5 volume.
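The old block size guesswork can be written down as a small Python sketch. The session only states the one megabyte case (256 gigabytes); the other rows assume the limit scales linearly with block size, which is an assumption of this example and should be checked against VMware's documentation. The function names are made up.

```python
# VMFS 3: the session states that 1 MB blocks cap a file at 256 GB.
# The 2, 4 and 8 MB rows assume the cap scales linearly with block size
# (an assumption made for this sketch, not a figure from the session).
GB_PER_MB_OF_BLOCK = 256
VMFS3_BLOCK_SIZES_MB = (1, 2, 4, 8)


def vmfs3_max_file_gb(block_size_mb):
    """Largest file, in GB, for a VMFS 3 volume with this block size."""
    if block_size_mb not in VMFS3_BLOCK_SIZES_MB:
        raise ValueError("VMFS 3 block size must be 1, 2, 4, or 8 MB")
    return block_size_mb * GB_PER_MB_OF_BLOCK


def vmfs3_min_block_mb(vmdk_gb):
    """Smallest VMFS 3 block size that could hold a VMDK of this size."""
    for block_size_mb in VMFS3_BLOCK_SIZES_MB:
        if vmdk_gb <= vmfs3_max_file_gb(block_size_mb):
            return block_size_mb
    raise ValueError("too large for any VMFS 3 block size in this model")


if __name__ == "__main__":
    for size_mb in VMFS3_BLOCK_SIZES_MB:
        print(f"{size_mb} MB block -> up to {vmfs3_max_file_gb(size_mb)} GB per file")
```

Under this model, a 300 gigabyte VMDK would not fit on a one megabyte VMFS 3 volume and the volume would have needed a larger block size at format time, which is exactly the downstream issue the unified block size removes.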
VMFS 5 also has improved its sub-block algorithm. So if you take a look at the VMDK disk file, that's not the only thing on the volume. There are other things, like a VMX file and a log file, and what you'll see is that there's a very polar disparity in the I/O profile. So there are some really, really big files, the VMDK files, maybe CD-ROM ISO files. Then there are some really, really small files, like VMX files and log files. So what we have going on here is a potentially really inefficient write mechanism where everything, even the little 1K VMX file, takes up a whole megabyte on disk. That's where the VMFS 5 file system has a sub-block algorithm that will allow those smaller files to fit into a sub-block. With vSphere 5, that sub-block is 8K. Further, the other big deal with VMFS 5 is that we've introduced atomic test and set primitive commands. This is a very welcome change from the previous metadata updates that were distributed across volumes.
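The space savings from sub-blocks come down to simple arithmetic, sketched here in Python. This is a deliberately simplified model that assumes a file either fits in a single 8K sub-block or is rounded up to whole one megabyte blocks; real VMFS allocation has more cases, and the function name is made up.

```python
KB = 1024
MB = 1024 * KB


def space_on_disk(file_bytes, block_bytes=1 * MB, sub_block_bytes=8 * KB):
    """Bytes allocated for a file in this simplified model.

    Files that fit in one sub-block get a sub-block; everything else is
    rounded up to whole blocks. Pass sub_block_bytes=0 to model a file
    system with no sub-block support at all.
    """
    if 0 < file_bytes <= sub_block_bytes:
        return sub_block_bytes
    whole_blocks = -(-file_bytes // block_bytes)  # integer ceiling division
    return whole_blocks * block_bytes


if __name__ == "__main__":
    vmx_size = 1 * KB
    print("without sub-blocks:", space_on_disk(vmx_size, sub_block_bytes=0), "bytes")
    print("with 8K sub-blocks:", space_on_disk(vmx_size), "bytes")
```

In this model the 1K VMX file drops from a full megabyte to 8K on disk, a 128-fold reduction for that one file.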
Another important feature of VMware vSphere 5 is that HA has been totally rewritten. This HA feature is really important for protecting against a hardware failure of your host. So basically, if you have a host running in a cluster and that host has the power removed, hits the purple screen of death, or suffers some sort of catastrophic event so that it is not able to run the virtual machines, HA will actually restart those virtual machines on another host. Now you still need to make sure the storage is available, so the virtual machine's storage doesn't move, but the network presence, the compute cycles, and the memory would be assigned to a new host via a virtual machine restart, and that's all done automatically. The new mechanism for HA is called Fault Domain Manager, which replaces the previous version that had been carried through VI3 and vSphere 4. This is a new build for vSphere 5.
One important new feature of vSphere 5 is that there's no intrinsic HA DNS requirement. Now, that's not to say we don't want to make sure DNS is still properly configured and working well for vSphere. It's a very critical part. But it won't cause some of the false positives in HA like we had before. There's a new master and slave node concept, so it's a little bit easier to see how these hosts are contributing to the cluster in terms of their role, right in the vSphere Client. All of these features and more.
I have to assign some homework: check out this book, "vSphere 5.0 Clustering Technical Deepdive". This is by two of the best people in the world on the topic, Frank Denneman and Duncan Epping, both of whom work at VMware and are based in the Netherlands. They've written this book, and it's a very affordable purchase on Amazon. In Kindle form, I think it's less than 10 US dollars. So if you haven't read this book, I highly recommend it. Or if you attend a Veeam featured webinar, we frequently give away Amazon Kindle copies of this book for free to select winners of a little question that we do at the webinar. If you want to check that out, feel free to do so.
So specifically with backups and vSphere 5, what do we need to know? We've touched on a number of core features. Take VDDK, for example. I did not go into the inner workings of VDDK. My goal is really to just tell you it's out there and basically what it does. The VDDK was a big change from vSphere 4 to vSphere 5. So if your backup application is using something like virtual appliance hot add, you want to make sure that it fully supports VDDK for vSphere 5. One of those decision points you make about upgrading or switching backup products is to make sure that all of these types of features work correctly. So when you're evaluating a product, try all these different modes. If there's a network based option, try that. If it can use virtual appliance hot add, use that. If it can use a direct storage type of mechanism, use that. And if it can use any other type of mechanism, use that. Go through the full cycle to make sure that all of the different components work correctly and to your liking.
It's also important... I think a lot of people miss the fact that if you use a backup product that is agentless, so something specific for virtual machines, everything might not work right from the start. That's because it might uncover issues with the virtual environment. So, if there are non-optimal configurations or components that are out of date, chances are the virtual machine backup tool may not work correctly, or it may throw warnings or errors or identify some of these issues, like the virtual machine is not up to date. VMware Tools is out of date, for example. Or the storage is not provisioned correctly. I can't get to it, or whatever. There are a lot of things that can go wrong. When we try to use these APIs and come in very clean, one of the hidden benefits is that we can be informed of what type of issues might be within our virtual environment.
So what I want to talk about now are some of the additional vSphere 5 resources. I really can't call this a full vSphere 5 class. This is an overview specifically about what you need to know about how the things that impact backups have changed with vSphere 5. But the good news is that there are an incredible number of resources to help you out.
So let's start with the vSphere 5 documentation. This is a publication at the pubs.vmware.com site. I've got a short URL at vee.am/vsphere5. This has every aspect of the product documented to your liking. Check it out. The next thing I'll talk about is a series of video blogs that are available at blogs.vmware.com from the VMware KBTV team. That can be found, again, at a short URL at vee.am/kbtv. The last additional resource I want to share with you is the vLaunchpad. Blogger Eric Siebert created this outstanding resource that really connects the vSphere community to each other. There's no area of IT that has a better sense of community in my opinion than the virtualization community, and specifically the VMware community. The vLaunchpad is a webpage that tracks all kinds of great stuff: blogs, product documentation, different podcasts, storage resources, and news. Everything you need to stay up to date with the VMware community can be found on this page. Further, it also has Twitter links, RSS feeds, everything you need right there. Great work, Eric. Check it out.
So let's wrap up what we've covered here real quick. We've talked about a big picture overview of VMware vSphere 5. I talked about the changes in scale and how that's important for provisioning workloads in a vSphere environment. We then talked about virtual machine hardware version 8 as well as some of the major changes in storage, such as storage DRS, VDDK 5, VMFS 5, and then the new version of HA to ensure availability of virtual machines and hosts. We wrapped up talking about how backups are impacted with vSphere 5 and some additional vSphere resources. I'm Rick Vanover, and thank you for attending this Backup Academy session.