About the speaker
Rick Vanover (vExpert, MCITP, VCP) is a product strategy specialist for Veeam Software based in Columbus, Ohio. Rick is a popular blogger, podcaster and active member of the virtualization community.
Physical vs Virtual Backups
In this session you will learn:
- How backups for physical systems work
- How backups for virtual machines work
- What frameworks are available for backups
- How restore situations are impacted
- Service catalogs and virtualization’s impact
- Opportunities by platform
- Considerations for licensing, management and administration
More sessions by Rick Vanover
Welcome to this Backup Academy session, Physical versus Virtual Backups. My name is Rick Vanover, Product Strategy Specialist with Veeam Software, and I'm happy to present to you this Backup Academy Session. You can follow me on Twitter @RickVanover or @Veeam, and you can also read my blogging at veeam.com/blog.
Before we get started with this content, I want to take a moment and tell you a little about myself. I'm a blogger and a podcaster, and I'm involved in a lot of other activities in the virtualization community. I write at the Veeam blog at veeam.com. I also contribute to the Everyday Virtualization blog at Virtualization Review, and I also write at TechRepublic.com. I'm also the host of the Veeam Community podcast, as well as a co-host of the Infosmack podcast. In regards to technical certifications, I have the vExpert designation as well as the VMware Certified Professional credential for VMware Virtual Infrastructure 3, vSphere 4, and vSphere 5. Lastly, I'm a Microsoft Certified IT Professional for Server Administration.
So let's talk a little bit about this session. Specifically, I'm going to go through how a virtual machine backup is different than a physical machine backup. We're going to start with the physical system, how do those work? Then we're going to go to the virtual machine backup, how do those work? What frameworks are available to make these tasks succeed and become a successful effort in our data protection strategy? Then we're going to talk about the restore, probably the most important part of your backup strategy. We're going to talk about some of the business impact that goes along with physical versus virtual machines in your environment, specifically, things like a service catalog. We're going to talk about how a different platform brings different opportunities. Then we're going to round out the content with some of the considerations related to the licensing, management, and administration of a backup solution, both in terms of installing it, maintaining it, as well as making sure you can do what you need to do.
So let's start with the physical system. Like many IT professionals, I started working with IT when there were only physical systems. Mainstream virtualization wasn't an option when I started. Basically, backups for physical systems are generally applicable to a wide array of hardware: there's no particular server requirement, and commodity equipment from all the common brands is fine for physical servers. The operating system is what's really important; the supported configuration usually revolves around the operating system. Usually, we go about our backups with an agent, a piece of software installed in that operating system as a supported configuration, to do the backups and keep that physical machine protected.
So when it comes to a physical backup, there are a lot of pros and cons that I want to talk about here. Let's start with the good things. It's pretty easy to do: you install the software and you can back it up. There's no real hassle associated with that, though the details depend on the product. There are a lot of different agents available for specific applications, and that can give you granular control, specifically to administer the backups and ensure that every layer works together.
Backup solutions for a physical system are a pretty easy way to go about protecting that system. As I mentioned earlier, the operating system is really the most important consideration, but in general, it's a pretty easy task to go about. Usually agents are involved to ensure that that operating system is backed up properly. Those can lead to granular control and administration of each layer of the backup infrastructure, specifically within the application and within the operating system as well as other things like files and retention policies. All of those different things can be managed within a backup solution.
Now, one good thing about tools that are optimized for a physical machine is that in most situations, they can be installed within a virtual machine. Simply put, the operating system, again, is really the biggest indicator of what the supported configuration would be.
When it comes to backing up physical machines, there are also some drawbacks to think about. Specifically, we may miss out on some features that are delivered to us by specific APIs. There may be APIs that we're not able to use on a physical system that are available on other platforms, such as virtual machines, which we'll talk about later.
I mentioned an agent earlier. If that's not running, a backup may not work correctly. There might be considerations related to permissions and authorizations for the agent to run, such as service accounts. Things like that might complicate or increase the overhead of managing your backup environment if multiple agents are in use. And these agents aren't necessarily optimized for a virtual machine: sure, as I mentioned on the left, you could install a tool optimized for a physical machine within a virtual machine, but that tool isn't necessarily optimized for the platform.
Now let's talk about a virtual machine. A lot of new systems are deployed as virtual machines, and chances are, in your own environment you might be seeing your growth area be virtual machines. In the scope of what I'm going to talk about today, we're really going to be discussing Hyper-V and vSphere. Both of these virtualization platforms have APIs, or Application Programming Interfaces, that allow additional things to be done to the virtual machine that may not be available in the equivalent fashion to a physical machine.
Specifically, commands are usually sent to a higher-level object. We don't necessarily interact with a virtual machine's guest operating system directly - that is what we have in the agent-based world - but if a backup solution is built aware of the virtual machine, these commands might be sent to a higher-level object in the management realm, if you will. Specifically, a Hyper-V or ESXi host which contains those virtual machines might be where the commands are sent, or a cluster, such as a DRS cluster in vSphere or a failover cluster managed by SCVMM in Hyper-V, might be where we organize our backup infrastructure. Or we can just leverage the management server, vCenter or SCVMM, to do these tasks at the highest level.
So let's talk about some of the good things associated with a backup that is optimized for a virtual machine. In the simplest form, it's usually pretty easy. Just like installing an agent is easy on a physical system, it might be even a little bit easier with a virtual machine. Specifically, we generally don't have to engage too much with the guest operating system, and when we do it that way, we might be able to move data pretty quickly, specifically leveraging APIs such as VMware's vStorage APIs or Microsoft's Volume Shadow Copy Services. We can access the virtual machine, which is just a file, right? It's a container that has that operating system, and we can access it directly on the storage. So in the case of the vStorage APIs, a backup solution can talk directly to the LUN or the disk resource. Or, in the case of a shadow copy, it can copy the VHD directly from the shadow copy on the volume.
So in either situation, we can access the virtual machine at the container level, at the disk level, underneath the operating system. Not necessarily without interference, because this is an aggregated environment - in both cases it's the same storage that the other virtual machines are using - but we can potentially move this data very quickly because we're not traversing millions of files and communicating with different operating systems. A virtual machine backup solution can probably move that data pretty quickly.
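To make the contrast concrete, here's a toy Python sketch - purely illustrative, not any vendor's API - of why image-level access can move data quickly: an agent inside the guest walks the file system item by item, while an image-level backup reads the VM's disk container as a single object.

```python
# Illustrative only: contrast an in-guest file walk with an image-level copy.

def agent_based_backup(guest_files):
    """Agent inside the guest OS: touch every file individually."""
    return [("copied", path) for path in guest_files]

def image_level_backup(vm_disk_path):
    """Virtualization-aware backup: read the one container file (VMDK/VHD)."""
    return [("copied", vm_disk_path)]

guest_files = [f"/data/file{i}" for i in range(1000)]

print(len(agent_based_backup(guest_files)))   # 1000 per-file operations
print(len(image_level_backup("vm01.vmdk")))   # 1 large sequential read
```

The real gain is that the single read is sequential at the storage layer, instead of thousands of small operations negotiated through the guest operating system.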
On the administration side, backup solutions that are made for virtual machines might be agentless. That can really help out on your administrative burden of ensuring that backups run successfully and have what they need to keep running, and that will keep your management burden down. Whenever you go about data protection, you want to make sure it's something that you can fully support. What I mean by that is you can manage it, you know what's going on, you have the visibility you need to ensure that these backups are being done, and most importantly, that what you need backed up is backed up.
When you think about a tool for backing up virtual machines, there might be some negative things that come about. At first, you might think an agentless backup would miss out on application consistency or granular recovery. We're going to talk about that in a little bit, but that's not necessarily the case. A lot of early, first-generation backup products that were made for virtual machines may have had those limitations. In fact, they might have gravitated towards the crash-consistent realm. A number of technologies have come out to really help with that, the first of which is Volume Shadow Copy Services. I recommend that you check out the other Backup Academy session by Elias Khnaser; it's a great session. Basically, Volume Shadow Copy Services is a framework. Technically, it's not an agent, but if you're leveraging Volume Shadow Copy Services, you can actually take care of all the good stuff: maintaining application consistency and delivering granular recovery, because the system - a domain controller, for example - is backed up correctly and restores correctly. If you prepare a virtual machine correctly and it's backed up using a technology like VSS, life can be a lot easier.
Now, early generation tools may have used the term 'crash consistent', and that's actually a misnomer. The consistent part might make you think that it's okay, but the truth is that the consistency of that type of backup is really equivalent to pulling the power cord out of a physical server. No consideration was necessarily given to the applications involved - maybe to the file system, but application consistency might have been omitted. So at the highest level, the two different tools bring two different things to the table. When we look at them together, we need to make sure that everything is addressed, from application consistency to recovery requirements to retention requirements. And as the number of virtual machines increases, our burden to ensure that every requirement is met increases with it.
I've mentioned earlier that VSS is a key technology for both types of backups. Any type of backup can deliver varying degrees of consistency. I mentioned crash consistent, and I can't emphasize enough that that is actually not a good thing. So let's talk about VSS - and again, the previous session goes into it at much greater length - VSS is a framework that can deliver application consistency for backups. VSS has three components: a requester, a provider, and a writer, which can be installed or provided by the applications on the system.
So let's start with the writer. Your big applications like SQL, Exchange, Active Directory, and SharePoint - all the applications that make up what we have in the enterprise, or even the SMB for that matter - especially in the Microsoft realm, are going to provide a VSS writer. The operating system, in the case of most virtual machines, or a hardware device such as a storage array, provides a VSS provider. That is where the coordination occurs during a backup. The last component, the VSS requester, requests that backup, and that's generally a backup software tool. Both physical and virtual machines can request a backup in this framework. It usually ends up interacting with the Windows registry, but VSS components don't sit in the taskbar of Windows, and they don't have a shortcut on your desktop, that type of thing. It's leveraging the framework of Windows to perform application-consistent backups.
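As a conceptual sketch only - this is not the actual Win32 VSS API, and the class names are hypothetical - the coordination between the three roles could be modeled like this: the requester asks for a backup, the writers freeze their applications so the on-disk state is consistent, the provider takes the shadow copy, and the writers thaw.

```python
# Conceptual model of the three VSS roles; not the real Windows VSS API.

class VssWriter:
    """Installed by an application (SQL, Exchange, AD): quiesces its I/O."""
    def __init__(self, app):
        self.app = app
        self.frozen = False

    def freeze(self):
        # Flush in-flight transactions so on-disk state is consistent.
        self.frozen = True

    def thaw(self):
        self.frozen = False


class VssProvider:
    """Supplied by the OS or a storage array: creates the shadow copy."""
    def snapshot(self, volume):
        return f"shadow-copy-of-{volume}"


class VssRequester:
    """The backup software: drives freeze -> snapshot -> thaw."""
    def __init__(self, provider, writers):
        self.provider = provider
        self.writers = writers

    def backup(self, volume):
        for w in self.writers:                   # 1. writers quiesce their apps
            w.freeze()
        shadow = self.provider.snapshot(volume)  # 2. provider snapshots the volume
        for w in self.writers:                   # 3. writers resume normal I/O
            w.thaw()
        return shadow                            # 4. requester reads the shadow copy


writers = [VssWriter("SQL"), VssWriter("Exchange")]
requester = VssRequester(VssProvider(), writers)
print(requester.backup("C:"))  # -> shadow-copy-of-C:
```

The key point the model shows is ordering: applications are frozen only for the instant the snapshot is taken, and the backup then reads from the shadow copy, not the live volume.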
Virtual machines can leverage VSS, but they also have an advantage in that they can leverage additional platforms - specifically, APIs and frameworks such as the vStorage APIs for Data Protection, web services, PowerShell, and WMI, which is Windows Management Instrumentation. These technologies can allow backups to be done at a higher and more efficient level, specifically for virtual machines. Now, I'll be honest with you, I'm pro-virtual machine. Whenever I have the choice, I'm going to deploy a new system as a virtual machine. But I understand that physical machines are still out there in today's environment, so we need to make sure that whatever platform we're backing up, we're able to do the best backup we can. I personally really like the APIs available in the VMware and Hyper-V realms. They can make your backups incredibly more efficient and do things that just aren't available in the physical world.
I'll take one example. In the other session, we talked about changed block tracking. That's an incredible benefit, especially for an agentless backup of a virtual machine. It can really increase your backup efficiency without increasing your management burden. Great stuff. I've listed a number of great books that cover some of the resources available to help you get started with these APIs - it can be very complicated. 'Maximum vSphere' by Eric Siebert, for one, is a good starting point. Hyper-V and vSphere both have these resources; it's not limited to either technology family. The Resource Kit is a great place to look. 'vSphere Clustering' is another great book, as well as the VMware vSphere PowerCLI reference. PowerCLI is a PowerShell extension for vSphere, so you're kind of meeting in the middle with Windows techniques such as PowerShell, bringing vSphere into that realm to automate and see some of the different features that are available. It's all good stuff.
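To illustrate the changed block tracking idea - a simplified sketch, not the actual vStorage API calls - the platform flags which disk blocks changed since the last backup, so an incremental only copies those blocks, and a restore replays the full backup plus the increments.

```python
# Simplified model of changed block tracking (CBT); the hypervisor, not the
# backup tool, records which blocks changed since the last backup.

def full_backup(disk):
    """First pass: copy every block."""
    return list(disk)

def incremental_backup(disk, changed_blocks):
    """Later passes: copy only the blocks the platform flagged as changed."""
    return {i: disk[i] for i in changed_blocks}

def restore(full, incrementals):
    """Replay the full backup, then apply each incremental in order."""
    disk = list(full)
    for inc in incrementals:
        for i, block in inc.items():
            disk[i] = block
    return disk

disk = ["a", "b", "c", "d"]
base = full_backup(disk)

disk[1] = "B"                                    # guest writes; block 1 flagged dirty
inc1 = incremental_backup(disk, changed_blocks={1})

assert restore(base, [inc1]) == disk             # recovered image matches the live disk
print(inc1)                                      # only 1 of 4 blocks moved: {1: 'B'}
```

The efficiency win is that the incremental pass never reads the unchanged blocks at all, which is what an in-guest agent would have to do to discover what changed.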
All of these technologies will impact every part of our data center, and the conversation will quickly revolve around the stakeholders. This graphic below kind of summarizes it I think in a way that we can all relate to. It doesn't really matter what the arrows and the circles are, but we all deal with a lot of different technologies and they all have interconnections and they have requirements and they have all these different steps that need to happen to ensure that our physical and virtual machines are protected, and it really impacts the entire data center.
As we go about into the future and as we've gone virtual, all of these different things have impacted what we need to do and how we need to do it. Specifically, are we keeping our stakeholders informed? Sure, we've all had the stories where we've virtualized a system that used to be physical and we didn't even tell the application owner. They might not have even noticed, but we probably should let them know. We probably should coordinate that. If you have change controls in your IT practices, those types of things, of course, should be done. But we need to make sure that for the entire infrastructure we have visibility, we have management, we have protection. All of those different things need to happen for both physical and virtual, but chances are, we have a mix of technologies.
This mix of technologies really comes down to two things: physical and virtual. And then within that realm, we'll have a lot of other things. Maybe we deal with mainframes, or AS/400s, or RISC-based platforms. In the x86 world - the VMware, Hyper-V, Windows, and Linux types of compute platforms - we don't have too many obstacles, but when it comes to these other solutions and platforms, we have to make sure we do everything needed to maintain the best result for our data protection solution.
I'll give you an example based on my experience. As my career has gone on, I've found myself working more and more with virtual machines and less and less with physical machines. Go back a couple of years, and it was the other way around. What I mean by that is, let's say all of my systems are physical machines, 100% physical. The moment I introduced that first virtual machine, it was effectively a one-off, an anomaly in my infrastructure. Then I liked that virtual machine and I added another, and the process continued. I may have still added physical servers, but as time went on, I found myself, like many others in the IT field, adding more and more virtual machines. As the environment became more and more virtual, my practices changed, and in my experience, they especially changed once virtualization became the norm.
With all the changes in our data center practices, specifically, the increase in the number of virtual machines, one thing remains the same. We need to make sure that all of these tools and all of these practices and all these features point to the very specific ability to restore what we need. As I mentioned earlier, the ability to restore hasn't really changed when it comes to our change from physical to virtual machines. We still need to be able to restore a number of different things.
The first thing I want to talk about is a file restore. It's likely the most common restore situation: hey, I lost a file, can you restore it? That could be everything from user data - an Excel document, a Word document, an Access database - to PDFs, text files, exports, logs, you name it. File content, especially unstructured data, can be a very difficult thing to manage, so the ability to restore it is very important.
Other recovery situations involve the operating system and file content: a corrupted file, a file accidentally deleted from a Windows system, or maybe I changed a file in Linux and don't remember what I changed, but I need to bring it back. Whatever the case, that's an important recovery situation, and again, regardless of physical or virtual, we need to make sure we have that ability.
Another important restore scenario is specific application restores. This would be something within a structured data type: something within a database or an email system - an email message, a user mailbox, a SQL table. Those types of restore scenarios happen as well. They're a little bit more complicated than a file restore, because you can't just bring back an Excel file that was an email attachment: the attachment isn't stored as a separate file within an email system such as Microsoft Exchange. So for structured data like Exchange, on both physical and virtual machines, we need to make sure it's backed up in the correct way. We can't just back up the Exchange database files, or the SQL MDF file in the database situation. We need to make sure that the application backup takes the full application, but more specifically, that we're able to restore individual items.
How we go about doing that is up for discussion. A number of different solutions are out there, everything from agent-based item recovery to native tools within these applications, as well as solutions that work with virtual machines or physical machines, to broker connectivity from different environments to do restores.
Then there's the odd-ball restore situation: I don't really know what I need, or what I deleted, or what I broke, but I need some sort of restore - everything from restoring a database to maybe a SQL stored procedure. The way I'd explain this example is: I don't really know what I need, but I don't want you to restore the whole virtual machine or the whole physical machine. I kind of want to look left window, right window - specifically, I want to look at the live system and I want to look at the backup. Sometimes people need windows into the past to see what's changed, because they don't really know what's happened. It could be an application administrator making that type of request, or a system administrator, but you really have to think about every type of restore.
The last option is the whole VM restore, which is somewhat of a catch-all: it allows a whole virtual machine or a whole physical machine to be restored, to simply bring the entire thing back. Maybe we only need a certain amount of data, but maybe it's just easier to do it that way. Basically, all of these different restore scenarios need to be met by both physical and virtual backup solutions.
When it comes to a full restore, which we mentioned earlier, there is no textbook full restore. It could be everything from installing a blank operating system and loading an agent, after which the full restore takes over and reconfigures everything from that base installation. It could be bare metal recovery options in the physical realm - these are pretty cool, actually. They might even do a PXE boot, a network-based boot without any local disk configuration, to perform a restore. There also may be more complicated situations which require the same hardware configuration. That can be difficult, specifically if you have a mixed hardware environment with a number of different systems, some older, some newer. It might be hard to come up with the hardware if you actually have a hardware failure, rather than a soft failure like an OS failure or losing some hard drives. If you need to come up with a new system, that might be a challenge.
In the virtual machine realm, some of these issues are removed. A virtual machine is a container - an image, if you will, or most specifically, a file - and that file is a full system image of the operating system running on it. So a Windows or Linux VM can be contained within it, and that image can be dumped right back onto the hypervisor. Now, some virtual machines might be able to be backed up on one platform and restored on another, from Hyper-V to vSphere or something like that, or simply backed up from vSphere and restored to vSphere, or backed up from Hyper-V and restored to Hyper-V. Whatever the situation may be, a full restore in the virtual machine realm can take a number of different forms.
Physical restore techniques may also work on virtual machines. I mentioned PXE boot; I mentioned a base operating system plus an agent and then a full restore. Those types of arrangements are possible in the virtual realm. The fact that virtual machines are abstracted - removed conceptually from the hardware - allows us to do a lot of different restore types. When it comes to a full system restore, off the top of my head we could rebuild a virtual machine, we could install an agent within an operating system and restore that, or we could clone from a template and then launch a restore task. You name it, there are a number of different ways out of a problem with a virtual machine. Again, I'm pro-virtual machine, so I would say that we have more options in the virtual realm than in the physical realm for restores.
Aside from the technology that goes with backing up both physical and virtual machines, especially as we find ourselves virtualizing more and more and presumably backing up more and more virtual machines, there is also some impact on the business areas of our technology operations - specifically, a service catalog. If you don't have a service catalog, you might have one under a different name: it might be an SLA, it might be a run book, it might be as simple as someone's expectation. Those can be impacted as we have more and more virtual machines. At the highest level, this could cover everything from the technologies involved - VMware vSphere, physical servers, clusters, storage systems, all those different components - to the requirements, right? What are our RPOs and RTOs, our recovery point objectives and recovery time objectives? What are our retention requirements?
Another critical area might be any compliance or regulation situations that we might have to deal with. How long do we have to keep data? How does it have to be stored? Those types of things. But probably the hardest one to deal with is the expectation. The users or the stakeholders of our technology might have an expectation that anything can be restored quickly, and that might not be the case in the physical world or that might not be the case in the virtual world.
It is worth taking the time to ensure that all of the different components of a service catalog are updated for virtual machines. It could go so far as to say that virtual machines can generally be recovered in one hour, while physical machines can be recovered in one hour plus up to four hours for parts and service. A lot of hardware vendors have warranties where you can select up to four-hour delivery time for spare parts, or 24-hour or next-business-day, those types of options. When it comes to the service catalog, it might not always make sense to take the same approach for both, especially with technologies in the Hyper-V and vSphere realm such as HA, DRS, and Storage vMotion - all those protection techniques that can move us off of failed hardware. FT is another one. All of these different situations might be worth a change to our business process, to effectively modernize the expectations we have.
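To make the idea tangible, here's a minimal sketch of such a catalog expressed as data. The tiers, numbers, and function names here are hypothetical examples, not recommendations; the point is that a stakeholder's expectation can be checked against what's actually committed.

```python
# A hypothetical data-protection service catalog: per platform and tier,
# the committed RPO (max acceptable data loss) and RTO (max restore time),
# both in hours. Numbers mirror the example above, not real commitments.

CATALOG = {
    ("virtual", "standard"):  {"rpo_hours": 24, "rto_hours": 1},
    ("physical", "standard"): {"rpo_hours": 24, "rto_hours": 1 + 4},  # + parts/service
    ("virtual", "critical"):  {"rpo_hours": 1,  "rto_hours": 1},
}

def lookup(platform, tier):
    """Return the committed RPO/RTO, or None if no offering covers it."""
    return CATALOG.get((platform, tier))

def meets_expectation(platform, tier, needed_rto_hours):
    """Does the catalog's commitment satisfy a stakeholder's RTO expectation?"""
    entry = lookup(platform, tier)
    return entry is not None and entry["rto_hours"] <= needed_rto_hours

# A 2-hour RTO expectation is fine for a standard VM, but exposes a gap
# for a standard physical server - exactly the conversation worth having.
print(meets_expectation("virtual", "standard", 2))   # True
print(meets_expectation("physical", "standard", 2))  # False
```

Even a one-page catalog written this plainly makes the gap between expectation and capability visible before a restore request arrives.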
Taking that the other direction, when it comes to deploying new systems, stakeholders might be made very happy by the ability to deploy virtual machines very quickly. That doesn't necessarily mean that a physical server can be deployed at the same rate, so it's worth communicating, you know what, I like virtual machines, I can deploy them quicker. I can restore them quicker, you name it. Maybe the conversations with the stakeholders should revolve more around, does your application fit best in a virtual machine? Do we need to design it a certain way to work best in a virtual machine?
Then some other tertiary issues arise - specifically, will the solutions scale? Virtual machines grow. Everyone I know ends up with more virtual machines at the end than they had total physical servers when they started. What capabilities do we have? Are we able to support a complex environment? Are we able to support all these different retention requirements? Do we have the storage? Do we have the tape infrastructure? Are we able to do what we want to do, and do we have the money to do it? That's a very complicated thing. My advice to people is generally that you can protect to any requirement; it's just a matter of having the right resources to do it.
The last factor is what we have been doing: what do I know how to do, and what do we have in place? All of these things, as the virtualization impact is felt in our environment, might warrant an update to the service catalog, should you have one. If anything, it can level set expectations and define an SLA. There are a lot of templates on the Internet that can really help you get started with that. It could even be a one-page document that you share internally, nothing official. Without it, there's going to be an expectation that might be hard to satisfy. Having this conversation ahead of time might raise some very important points within the environment.
Maybe an application is determined to need higher protection than the current offering. Granted, maybe the organization hasn't provided any money, but if it's really required, then maybe we need to identify the steps it will take to get that application protected to the requested level. This is one of those business soft skills - communication - that IT pros need, and it's worth having. You don't want a situation where the expectation is one way and the capabilities of a data protection service catalog are another, so it's worth the time to develop even a simple service catalog for data protection.
When it comes to backups and data protection in general, there are other non-technical discussions that need to happen, specifically, licensing. How is the solution licensed? Or multiple solutions? How are they licensed as the solution scales? If I add more virtual machines? If I contract physical servers? If I add physical servers? All those different scenarios, how does that impact my price for my backup infrastructure? Chances are, there's some amount of growth, even if it's just new operating systems. In my experience, at the end of the day, people end up with more virtual machines than when they started. Further, it's safe to assume that there is some amount of virtualization with everyone. You probably wouldn't be here if you didn't have an interest in virtual machines.
Another important point is how these solutions are going to be administered, specifically, how do we update them? I mentioned earlier an agent. If we have complicated operating system environments, maybe multiple Windows domains, that can increase our administrative burden, so we want to think about agent versus agentless in the physical or virtual machine realm. Our data durability is another important point. Do we need it accessible? Do we need it on disk for fast access?
Lastly, when it comes to the management of our backup environment, it's very important to ensure proper visibility of our backups: everything from ensuring that they are happening to ensuring that new virtual machines are included in backup jobs without being overlooked. This happens all the time. How many times have you deployed a system - physical machine or virtual machine, for that matter - where the application owner or stakeholder says, oh, this will be a development system, we don't need to back it up, I just need it quickly? Okay, sure, no problem. If I'm in an environment where I deploy systems for application owners, boom, there we go, we have a system. Little do I know that that system may have graduated into production silently. Nobody told us, right? There might be change controls needed to move things from development to production. There might be all kinds of different things that need to happen, including a backup. So if we have a backup solution, specifically in the virtual machine realm, that can automatically include new virtual machines in our backup jobs, that's a great technique.
So back to my earlier comments about a tool that is virtualization aware: this is an opportunity. Specifically, if our backup jobs can back up by object, such as a host, a cluster, or a datastore - think of those as containers of virtual machines - then every time we run that job, every virtual machine that meets that criteria would be backed up. So the moment a new virtual machine comes in, it might be backed up automatically.
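A simplified sketch of that container-based job idea (hypothetical names, not any vendor's actual API): because the job enumerates its container at run time rather than holding a fixed list of VMs, a virtual machine deployed after the job was created is picked up on the next run.

```python
# Hypothetical model of a backup job targeting a container, not a VM list.

class Container:
    """A host, cluster, or datastore that holds virtual machines."""
    def __init__(self, name):
        self.name = name
        self.vms = []

    def add_vm(self, vm):
        self.vms.append(vm)


class BackupJob:
    def __init__(self, container):
        self.container = container

    def run(self):
        # Enumerate the container at run time, not at job-creation time,
        # so newly added VMs are included automatically.
        return [f"backed up {vm}" for vm in self.container.vms]


cluster = Container("ProdCluster")
cluster.add_vm("web01")
job = BackupJob(cluster)

print(job.run())          # ['backed up web01']
cluster.add_vm("web02")   # a new VM is deployed after the job already exists
print(job.run())          # ['backed up web01', 'backed up web02']
```

Contrast this with a job holding a static VM list, where "web02" would silently go unprotected until someone remembered to edit the job.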
Whether or not we back up development systems is a different discussion. In my professional experience, I did not: if a system was development, we didn't need to back it up, and if it was production, we did. But then there are multiple layers, and every organization has different requirements. Maybe you have QA, test, production, development - you name it, there are a lot of different environments. Make sure those business discussions, back to the service catalog, specifically say what is and isn't backed up - or maybe you just back everything up; that might be good enough. Have those discussions. Make sure it's a situation that can be easily managed, to ensure that your virtual and physical machines are kept available.
I hope you've enjoyed this session of Backup Academy. We've talked about, at the highest level, how different backup solutions work for both physical and virtual machines. We talked about some of the frameworks that are available in those platforms. The restores are the most important part of the backup. We talked about the different restore scenarios and how they might work, even specifically for a full restore both for physical and virtual machines. I can't emphasize enough the importance of a service catalog. When it comes to our backups, we need to ensure that the stakeholders are going to get what they expect with our data protection solution.
I talked a little bit about different platforms that are available for vSphere and Hyper-V in terms of features such as the vStorage APIs, PowerShell, WMI for Hyper-V. It's all good stuff, and when we go about data protection for virtual machines, we want to make sure that we include that.
Then I talked about some of the management aspects of backups for virtual and physical machines, and it's again important to have that service catalog to have a solution that we can manage and deliver our virtual machines and know that if we have an issue, we're able to recover them.
I'm Rick Vanover, and I hope you’ve enjoyed this Backup Academy track.