7 Critical Reasons for Kubernetes-Native Backup

Prodatix, November 26, 2022

If it has not already, Kubernetes will soon expand its footprint within your infrastructure and become the foundation for your next generation of applications. Putting a cloud-native backup solution in place for your critical applications lets your DevOps teams move quickly with less manual work, and gives your infrastructure and ops teams greater confidence that they can recover from failure.

It might be tempting to delegate backup and restore to legacy tools built for virtualization-based infrastructure or, in extreme cases, to assign the responsibility to each application team. However, multiple customer journeys have shown that, however well-intentioned, this is the wrong path to follow. Disaggregating backup responsibility invites data loss, and the long recovery times that come with the manual, error-prone playbooks of legacy approaches add excessive risk. When the average hourly cost of critical application failure is $500,000, this is not a risk organizations can afford or should tolerate! Here are seven reasons why a Kubernetes-native backup solution will be the best fit for your expanding Kubernetes environment.

1. Kubernetes Deployment Patterns

The most significant reasons motivating the need for a Kubernetes-native backup solution are the fundamental differences between the Kubernetes platform and all computer infrastructure that has come before it.

One of the biggest deployment changes for containerized applications is that there is no mapping of applications to servers or VMs. Kubernetes uses its own placement policy to distribute application components across all servers for fault tolerance and performance. Further, different applications are often co-located on the same server. Traditional data management systems, which work at the server or VM level, therefore cannot capture the state of a single application without also pulling in unrelated applications.
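To make the difference concrete, here is a minimal Python sketch that groups resources by an application label rather than by the node they run on. The inventory, the resource names, and the `app` label key are hypothetical; in a real cluster this information would come from the Kubernetes API.

```python
from collections import defaultdict

# Hypothetical inventory. "node" shows where a component happens to run;
# "labels" carries the application identity that a backup should follow.
RESOURCES = [
    {"kind": "Pod", "name": "shop-web-1", "node": "node-a", "labels": {"app": "shop"}},
    {"kind": "Pod", "name": "blog-web-1", "node": "node-a", "labels": {"app": "blog"}},
    {"kind": "Pod", "name": "shop-db-0", "node": "node-b", "labels": {"app": "shop"}},
    {"kind": "PersistentVolumeClaim", "name": "shop-db-data", "node": None, "labels": {"app": "shop"}},
    {"kind": "Secret", "name": "blog-tls", "node": None, "labels": {"app": "blog"}},
]


def group_by_application(resources, label_key="app"):
    """Group (kind, name) pairs by application label, ignoring node placement."""
    groups = defaultdict(list)
    for res in resources:
        app = res["labels"].get(label_key)
        if app is not None:
            groups[app].append((res["kind"], res["name"]))
    return dict(groups)


for app, members in group_by_application(RESOURCES).items():
    print(app, members)
```

Note that `node-a` hosts components of both applications, so a node-level backup would mix them, while the label-based grouping keeps each application's pods, volume claim, and secret together.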

Another constant for cloud-native applications is the dynamic nature of their environment. Containers can be dynamically rescheduled or scaled on different nodes for better load balancing. New deployments, which can happen on an hourly basis, may involve rolling upgrades, and application components can be added or removed at any time. In short, the definition of a cloud-native application is constantly shifting. A backup solution needs to understand this cloud-native architectural pattern, be able to work with a lack of IP address stability, and be able to deal with continuous change.

With these major changes in the computing environment, trying to adapt appliance-centric solutions designed for VMs and use them with cloud-native platforms will fit as well as a square peg in a round hole. A Kubernetes-native backup solution is needed so that requirements such as dynamic application discovery, instantaneous backup, platform-integrated recovery, and capturing all application context can be met.

2. DevOps and “Shift Left”

The success of Kubernetes has been driven by its focus on developers, their applications, and high-velocity application development cycles. While seemingly subtle, this focus permeates the platform’s design and consequently requires that backup solutions be application-centric and not infrastructure-focused. The application is what both the developer and operator ultimately care about.

The DevOps philosophy adopted in parallel with Kubernetes also cedes control over both infrastructure and deployments to the developer (known as “shift left”). Developers define both application components and infrastructure requirements (e.g., storage or load balancers) as code. These programmatic requests are provisioned dynamically, via a CI/CD pipeline, and without an extensive change management process.

This great power comes with the increased risk that a simple configuration error could delete critical data! Any backup system today needs not just to integrate into CI/CD systems but, more importantly, to discover new and changed applications automatically in order to protect them instantly. This process also needs to be completely transparent to developers, and it shouldn’t require them to make changes to their applications, tools, packaging, or deployment pipeline.

Finally, as developers take on more responsibility, a backup platform not only needs to be API-first, but its API needs to be cloud-native itself. Exposing the backup platform through the Kubernetes API, instead of through a separate proprietary REST or SOAP API, allows for seamless authentication and authorization, simplifies application and workflow integration, and allows for the use of tools (e.g., kubectl) that developers and operators are already intimately familiar with.
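As an illustration of what a Kubernetes-native backup API could look like, the Python sketch below builds a manifest for a custom resource. The API group `backup.example.com`, the kind `BackupPolicy`, and every field under `spec` are hypothetical and do not describe any particular product.

```python
import json


def backup_policy_manifest(app_name, namespace, schedule):
    """Build a manifest for a hypothetical BackupPolicy custom resource."""
    return {
        "apiVersion": "backup.example.com/v1alpha1",  # hypothetical API group
        "kind": "BackupPolicy",  # hypothetical kind
        "metadata": {"name": f"{app_name}-backup", "namespace": namespace},
        "spec": {
            # Select the application by label instead of by server or VM.
            "selector": {"matchLabels": {"app": app_name}},
            # Cron-style schedule string; the field name is an assumption.
            "schedule": schedule,
        },
    }


manifest = backup_policy_manifest("shop", "shop-prod", "0 * * * *")
print(json.dumps(manifest, indent=2))
```

If a matching custom resource definition were installed, a manifest like this could be applied and inspected with kubectl like any other Kubernetes object, and access to it would be governed by the cluster's existing authentication and authorization.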

3. Kubernetes Operator Challenges

Infrastructure teams moving past early Kubernetes adoption and being tasked with providing Kubernetes infrastructure at scale are frequently running into a skills gap. For those teams migrating to Kubernetes from a Linux or vSphere background, a backup tool that provides CLI access and a clean API along with a powerful yet easy-to-use dashboard is critical. A well-designed backup solution will accelerate an IT team’s production journey by providing a bridge to greater Kubernetes understanding. Deep Kubernetes integration will hide underlying platform complexity, and a sharp focus on UX and revisiting backup workflows for cloud-native applications will reduce or eliminate manual or integration work.

A backup solution for containerized applications also needs to be application-centric and understand Kubernetes constructs instead of being infrastructure-focused. A single application that might once have been composed of a few VMs is now, on average, composed of hundreds of distinct Kubernetes resources (configuration, disks, secrets, etc.). Multiplied across all applications in a cluster, this leaves an operator to understand and protect millions of components unless the operational unit for backup becomes the application. Ignoring Kubernetes resources is not an option: a legacy backup solution that restricts itself to infrastructure such as disks and volumes loses the relationships between them, which leads to error-prone recovery playbooks and very high recovery times.

Even with an unrealistic assumption that there has been no drift in Kubernetes objects between backup and restore, the initial manual process to determine the backups required for restore will have to be followed by another complicated manual process to graft these restored volumes back into the Kubernetes application, placing undue burden on operations teams.

4. Application Scale

With the rise of microservices and first-class Kubernetes support for functions such as configuration and secret handling, applications have been broken up into hundreds of discrete components that have independent lifecycles and are only visible to Kubernetes. Only a cloud-native backup solution built to handle the millions of components found in large clusters will understand the relationships between applications, their data, and related Kubernetes state, and be able to consistently capture all of it together at scale.

Additionally, both Kubernetes and cloud-native applications have been architected to scale up (or down) in response to load. An effective backup solution must: adopt the same cloud-native architectural pattern to scale with application and cluster changes, be able to effectively “scale to zero” when not in use, and do it automatically without manual operator input. This will result in better performance as the backup platform grows with the cluster, plus cost savings as the backup system’s resource footprint is correlated to instantaneous requirements and not peak load. Scaling of the backup system will also be linear with cluster and application growth and will not exhibit the step-function jumps seen with an appliance-based model.
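The "scale to zero" behavior described above can be sketched as a small sizing function. This is a minimal sketch under stated assumptions: the parameters `jobs_per_worker` and `max_workers` and their defaults are illustrative, not taken from any real backup platform.

```python
import math


def desired_workers(pending_jobs, jobs_per_worker=4, max_workers=50):
    """Return how many backup workers to run for the current queue depth.

    With no pending work the answer is zero, so the backup system holds no
    resources while idle. Otherwise the count grows with the queue, up to a cap.
    """
    if pending_jobs <= 0:
        return 0
    return min(max_workers, math.ceil(pending_jobs / jobs_per_worker))


for queue_depth in (0, 1, 9, 1000):
    print(queue_depth, desired_workers(queue_depth))
```

Because the worker count follows the queue depth in small increments, the resource footprint tracks instantaneous demand instead of being sized once for peak load.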

The above scale challenges are further compounded by the growth of Kubernetes multi-cluster use. Multiple clusters are found not just across environments (dev, staging, prod, etc.) but are often split across application, security, and team boundaries, and deployed across multiple availability zones, regions, clouds, and on-premises data centers. Reducing the operational burden for ops teams is only possible with a cloud-native backup platform that seamlessly handles multi-cluster operations and provides global visibility.

5. Protection Gaps

Kubernetes is architected for fault-tolerance, which makes it dramatically easier to ensure application uptime when faced with partial infrastructure outages. However, high availability or replication is still not a backup! Data corruption or deletion, accidental or malicious, will spread to all replicas and cause catastrophic data loss.
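A toy Python model shows why replication alone does not protect against deletion. The `ReplicatedStore` class is purely illustrative and does not represent any real storage system.

```python
import copy


class ReplicatedStore:
    """Toy key-value store that applies every write to all replicas."""

    def __init__(self, replicas=3):
        self.replicas = [dict() for _ in range(replicas)]

    def put(self, key, value):
        for replica in self.replicas:
            replica[key] = value

    def delete(self, key):
        # Deletions are replicated just as faithfully as writes.
        for replica in self.replicas:
            replica.pop(key, None)

    def snapshot(self):
        # An independent point-in-time copy, detached from later changes.
        return copy.deepcopy(self.replicas[0])


store = ReplicatedStore()
store.put("orders", ["order-1", "order-2"])
backup = store.snapshot()

store.delete("orders")  # accidental or malicious deletion

print(any("orders" in replica for replica in store.replicas))  # False
print(backup["orders"])  # ['order-1', 'order-2']
```

Every replica loses the data at the same moment, while the independent point-in-time copy still holds it.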

Given the popularity of running Kubernetes on public clouds, there is often the belief that the underlying storage is failure-proof. However, this is not true, and even AWS’s battle-hardened Elastic Block Store (EBS) advertises a non-zero annual failure rate! On-premises storage vendors, for their part, provide volume snapshots. However, these volume snapshots are often not resistant to hardware failure and, even worse, the deletion of a volume usually leads to the simultaneous and automatic deletion of all related snapshots.

Looking under the hood, actions such as quiescing file system activity require elevated Kubernetes security privileges and aren’t normally available. A Kubernetes-native backup platform can provide database and Kubernetes workload quiescing hooks, achieving the same result without sacrificing security.

Finally, while it is tempting to push the backup and recovery responsibility to development teams, we find that not all developers understand backup and disaster recovery. Their justified focus on their application has the unfortunate side effect of creating custom and poorly maintained backup solutions. Further compounding this issue, these ad-hoc systems often suffer from failures that are either silent or noisy but ignored as the application and infrastructure continue to evolve and complexity grows. To mitigate this risk, a backup solution needs to work transparently against a wide range of Kubernetes application stacks and deployment methods while integrating into development workflows whenever required.

6. Security

Kubernetes includes a number of security features, such as network policies, that deny access to internal application components and their associated data services not just from outside the cluster but also from other untrusted applications running in the same cluster. Running backup solutions outside of your Kubernetes clusters becomes a non-starter as they cannot discover, let alone access (e.g., to quiesce), applications without weakening isolation policies. A well-architected Kubernetes-native solution that can embed itself into the control plane will not suffer from these limitations.

Further, with developers taking on more infrastructure responsibilities, traditional IT is transitioning to an “ITOps” model and needs to provide self-service capabilities. A backup platform now needs to allow scoped control to be handed off to developers for their applications. Not only is fine-grained Role-Based Access Control (RBAC) a requirement, but this scoped access needs to be granted using the same roles and tools defined by Kubernetes instead of introducing additional role management systems that operators and developers need to learn.
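As a sketch of what scoped access through Kubernetes RBAC might look like, the Python snippet below builds a namespaced Role and RoleBinding. The `rbac.authorization.k8s.io/v1` objects are standard Kubernetes; the backup API group `backup.example.com`, the resource names `backups` and `restores`, and the group name are hypothetical.

```python
def scoped_backup_role(namespace, team_group):
    """Return a (Role, RoleBinding) pair limiting a team to its own namespace."""
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": "backup-operator", "namespace": namespace},
        "rules": [
            {
                "apiGroups": ["backup.example.com"],  # hypothetical API group
                "resources": ["backups", "restores"],  # hypothetical resources
                # No "delete" verb: the team can create and view, not remove.
                "verbs": ["get", "list", "create"],
            }
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": "backup-operator-binding", "namespace": namespace},
        "subjects": [
            {
                "kind": "Group",
                "name": team_group,
                "apiGroup": "rbac.authorization.k8s.io",
            }
        ],
        "roleRef": {
            "kind": "Role",
            "name": "backup-operator",
            "apiGroup": "rbac.authorization.k8s.io",
        },
    }
    return role, binding


role, binding = scoped_backup_role("shop-prod", "shop-developers")
print(role["rules"][0]["verbs"])
```

Because the Role is namespaced, the grant applies only to backup objects in that team's namespace, and it is managed with the same tooling as every other permission in the cluster.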

Kubernetes also delegates data encryption to the underlying storage system and backup platform. To ensure that application data is never transferred or stored in plain text, a backup platform needs to understand Kubernetes certificate management, work with storage-integrated Key Management Systems (KMSs), and support Customer Managed Encryption Keys (CMEKs) through the Kubernetes Secrets interface.

Finally, given the popularity of Kubernetes and the fact that a number of Kubernetes applications are external customer-facing, Kubernetes-targeted malware and ransomware attacks will remain a persistent threat. A Kubernetes-native solution that creates reliable backups independent of Kubernetes and the storage system and has deep platform integrations for quick automated restores will be essential.

7. Ecosystem Integration

The rise of polyglot persistence, where multiple data services (e.g., MongoDB, MySQL, and Cassandra) are used within the same application, has coincided with the growth of Kubernetes. Richer backups for these workloads are now possible by integrating against Kubernetes for automated workload discovery. Workload knowledge enables the backup solution to select the capture primitive (e.g., one or more of volume snapshots, application consistent backups, logical dumps) best suited to the application’s requirements.
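The selection step can be sketched as a lookup from a discovered workload type to an ordered list of capture primitives. The preference table below is an illustrative assumption, not a recommendation for any of these databases; the sketch also assumes the workload is identified through the `app.kubernetes.io/name` label.

```python
# Illustrative preferences only: which capture primitives to try, in order,
# for each discovered data service. These pairings are assumptions.
CAPTURE_PREFERENCES = {
    "mysql": ["logical_dump", "volume_snapshot"],
    "mongodb": ["application_consistent_backup", "volume_snapshot"],
    "cassandra": ["application_consistent_backup", "volume_snapshot"],
}

# Fallback when the workload type is unknown.
DEFAULT_PRIMITIVES = ["volume_snapshot"]


def detect_workload(labels):
    """Read the workload type from a label (assumed to be set on the workload)."""
    return labels.get("app.kubernetes.io/name", "").lower()


def select_capture_primitives(labels):
    """Return the ordered capture primitives for a workload's labels."""
    return list(CAPTURE_PREFERENCES.get(detect_workload(labels), DEFAULT_PRIMITIVES))


print(select_capture_primitives({"app.kubernetes.io/name": "MySQL"}))
print(select_capture_primitives({"app.kubernetes.io/name": "custom-service"}))
```

Unknown workloads fall back to a plain volume snapshot, so discovery of a new data service never leaves an application unprotected.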

With the shift from a single data service to multiple independent ones, the relationship between these workloads can be derived automatically from Kubernetes metadata. With this application topology in hand, a Kubernetes-native backup solution can capture a consistent copy (both within and across services) of the entire application stack, identify and gather data from replicas to reduce application impact and improve performance and efficiency, optimize restore performance by leveraging Kubernetes parallelism, and much more!

Similarly, with an increasing number of organizations running a large number of Kubernetes clusters across different environments, it is crucial for a backup platform to interoperate with the rest of the cloud-native infrastructure ecosystem. This not only improves the UX, reduces costs, and improves the efficiency of ops teams, it also provides integration with the cloud-native tools both developers and operators have become accustomed to. Examples include integration with Prometheus for monitoring and alerting or with the Kubernetes APIs for RBAC, logging, and auditing for root-cause analysis.
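As one example of such integration, a backup platform could expose its state in the Prometheus text exposition format so that existing monitoring and alerting can consume it. In this minimal sketch the metric name `backup_last_success_timestamp_seconds` and the `app` label are hypothetical.

```python
def render_backup_metrics(last_success):
    """Render per-application backup timestamps in Prometheus text format.

    last_success maps an application name to the Unix time of its last
    successful backup.
    """
    lines = [
        "# HELP backup_last_success_timestamp_seconds Unix time of the last successful backup.",
        "# TYPE backup_last_success_timestamp_seconds gauge",
    ]
    for app in sorted(last_success):
        lines.append(
            f'backup_last_success_timestamp_seconds{{app="{app}"}} {last_success[app]}'
        )
    return "\n".join(lines) + "\n"


print(render_backup_metrics({"shop": 1669420800, "blog": 1669417200}), end="")
```

An alerting rule could then compare this timestamp with the current time to flag applications whose backups have gone stale.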

Prodatix Are Data Backup Experts

Need help with data backup? We’re here to help. Contact us online or give us a call today for a free consultation.

About Matt Bullock

Matt is the VP of technical sales for Prodatix and Accelera IT Solutions. He's been an entrepreneur for the past 30 years in the technology (hardware and software) industry. He's mainly focused on educating clients on the opportunities that technology presents and reinforcing the importance of data management to ensure business continuity.