Insights: Building a Virtualisation Platform and Tape‑Based Backup System Under Real‑World Constraints

Summary

Over more than a decade, Kent ITS designed and operated a full virtualisation and backup platform using refurbished enterprise hardware, open‑source hypervisors, and a fully CLI‑driven tape system. Despite limited storage, strict budgets, and ageing hardware, the environment delivered continuous uptime, reliable recovery, and long‑term retention through a combination of XCP‑ng, Xen Orchestra, and dual‑drive LTO tape rotation.

The Challenge

The platform had to evolve over many years while operating within a set of practical constraints:

  • VMware’s free edition restricted backups, API access, and live migration.
  • Initial hosts were DL380 G5 and DL360 G6 servers sourced from the secondary market.
  • The QNAP NAS provided limited iSCSI/NFS capacity, restricting VDI retention depth.
  • No budget existed for commercial backup tools or proprietary tape software.
  • LTO4 and LTO5 tapes could not hold 30 days of VDI snapshots as the environment grew.
  • Backups needed to be air‑gapped and physically rotated off‑site.
  • Hypervisor upgrades and hardware refreshes had to be performed live with no downtime.

The challenge was to maintain a reliable, recoverable virtualisation platform with long‑term retention while working entirely within these constraints.

The Solution

1. Migrating from VMware to XenServer and XCP‑ng

To escape VMware’s limitations, the platform was migrated to XenServer 7 and later upgraded in‑place to XCP‑ng while workloads remained online. This provided:

  • Live migration
  • Open storage formats
  • Flexible upgrade paths
  • Full CLI access
  • No licensing restrictions

A three‑node DL360 G6 cluster was built, backed by a QNAP NAS providing iSCSI and NFS storage for VM disks and backups.

2. Hardware Evolution to DL360 G9

As workloads increased, two G6 nodes were replaced with DL360 G9 servers. The mixed‑generation pool remained stable thanks to XCP‑ng’s broad hardware support, and the newer hardware improved:

  • CPU performance
  • RAM capacity
  • Power efficiency
  • Overall reliability

3. A Fully CLI‑Driven Tape Backup Architecture

A dedicated DL360 G6 backup server was equipped with two tape drives:

  • LTO4 on /dev/st0
  • LTO5 on /dev/st1

The backup system was built entirely using Linux CLI tools:

  • tar
  • mt
  • cron
  • Xen Orchestra streaming
  • NFS/iSCSI storage

The design focused on throughput, simplicity, and recoverability.

4. Streaming VDI Backups Directly to Tape

Xen Orchestra streamed full and incremental VDI chains at ~110 MB/s directly into tar, avoiding filesystem overhead and keeping the tape drive streaming continuously rather than stopping to reposition.

A typical VDI streaming command looked like:

Code

tar -c -v -b 128 -f /dev/st1 /mnt/nfs/backups/xoa-backups/ /mnt/nfs/backups/xen-meta/

This ensured:

  • no tiny‑file performance issues
  • no tape shoe‑shining
  • minimal local disk usage
  • predictable restores
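
The blocking factor in the command above is worth spelling out: tar counts -b in 512‑byte blocks, so -b 128 writes 64 KiB records. The following is a minimal sketch, assuming GNU tar. The helper names are hypothetical and are not taken from the original scripts. A plain file can stand in for /dev/st1, so the sketch can be tried without a tape drive.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not the original backup scripts.

# tar's -b value counts 512-byte blocks, so the record size in bytes is b * 512.
record_bytes() {
    echo $(( $1 * 512 ))
}

# Write the given paths as one tar stream using 64 KiB records.
# $1 is the destination: a tape device such as /dev/st1, or a plain file for testing.
write_stream() {
    dest=$1
    shift
    tar -c -b 128 -f "$dest" "$@"
}

# List the archive back with the same blocking factor to confirm it is readable.
list_stream() {
    tar -t -b 128 -f "$1"
}
```

Listing the stream back with the same blocking factor is a cheap readability check before a tape leaves the building.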

5. Compressing File‑Level Data Locally

Zimbra and Samba data were compressed into a single archive to maximise tape efficiency:

Code

cd /mnt/nfs/backups/tape/
rm -f backup.tar.gz
tar -cvzf backup.tar.gz /mnt/nfs/backups/samba /mnt/nfs/backups/zimbra/

This avoided writing hundreds of thousands of small files to tape and kept restores simple.
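
One possible refinement, not part of the original workflow: the commands above delete the previous archive before the new one is complete. The sketch below, assuming GNU tar with gzip support, builds the archive under a temporary name and replaces the old file only after the new one lists cleanly. The function name build_archive and the .partial suffix are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: build the archive under a temporary name and only
# replace the previous archive once the new file has been read back.
build_archive() {
    out=$1
    shift
    tmp="$out.partial"
    rm -f "$tmp"
    # Create the compressed archive from the remaining arguments.
    tar -czf "$tmp" "$@" || return 1
    # Read the archive back; a truncated or corrupt file fails here.
    tar -tzf "$tmp" > /dev/null || return 1
    # Rename the verified file over the old archive.
    mv -f "$tmp" "$out"
}
```

With the paths from this section, the call would be: build_archive /mnt/nfs/backups/tape/backup.tar.gz /mnt/nfs/backups/samba /mnt/nfs/backups/zimbra/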

6. Splitting Workloads Across Two Tape Drives

To maximise throughput and avoid contention:

  • LTO4 handled the compressed archive
  • LTO5 handled VDI streams and metadata

The second drive used an identical workflow:

Code

mt -f /dev/st1 setblk 0
tar -c -v -b 128 -f /dev/st1 /mnt/nfs/backups/xoa-backups/ /mnt/nfs/backups/tape/ /mnt/nfs/backups/xen-meta/
mt -f /dev/st1 eject

Both drives ran simultaneously at full speed, roughly doubling aggregate throughput and allowing deeper retention.
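
Running the two drive jobs side by side can be sketched with ordinary shell job control. This is a minimal sketch, not the original script: the function name run_both_drives is hypothetical, and each job is passed in as a command string.

```shell
#!/bin/sh
# Hypothetical wrapper: start one job per drive in the background and
# return non-zero if either job fails.
run_both_drives() {
    job_a=$1
    job_b=$2
    sh -c "$job_a" &
    pid_a=$!
    sh -c "$job_b" &
    pid_b=$!
    status=0
    # wait <pid> returns the exit status of that background job.
    wait "$pid_a" || status=1
    wait "$pid_b" || status=1
    return "$status"
}
```

In practice each argument would be the full command sequence for one drive, for example the path to a per‑drive script. Waiting on each process ID separately means a failure on either drive is still reported.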

7. Automated Tape Rotation

Cron scripts handled:

  • block size configuration
  • archive generation
  • tape streaming
  • automatic eject

On‑site staff simply removed the ejected tape and took it off‑site, providing a reliable air gap.
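
The steps listed above could be wired together roughly as follows. This is a sketch under stated assumptions rather than the original cron script: the function names, the DRY_RUN switch, the schedule, and the script path in the crontab comment are all hypothetical, and archive generation is assumed to have finished before this job starts.

```shell
#!/bin/sh
# Hypothetical nightly job: set block size, stream to tape, eject.
# With DRY_RUN=1 the commands are printed instead of executed.
TAPE=${TAPE:-/dev/st1}

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

nightly_tape_job() {
    # 1. Put the drive in variable block mode.
    run mt -f "$TAPE" setblk 0 || return 1
    # 2. Stream the given paths to tape in 64 KiB records.
    run tar -c -b 128 -f "$TAPE" "$@" || return 1
    # 3. Eject only after a successful write, so a failed run leaves the tape loaded.
    run mt -f "$TAPE" eject
}

# Example crontab entry (schedule and script path are assumptions):
# 30 1 * * * /usr/local/sbin/nightly-tape.sh
```

Stopping at the first failed step means an ejected tape is a simple visual signal that the night's write completed.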

8. Retention Strategy

The system maintained:

  • Daily full backups on tape
  • Incremental VDI chains in Xen Orchestra
  • 14‑day hypervisor snapshots
  • Months of VDI history across rotated tapes

Retention depth was adjusted as VDI chains grew and QNAP storage became constrained, ensuring tapes remained within capacity while still providing long‑term recovery.
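
The capacity check behind this adjustment is simple arithmetic. The sketch below uses the native (uncompressed) capacities of LTO4 (800 GB) and LTO5 (1,500 GB). The function names are hypothetical, and the original environment's backup sizes are not stated in this write‑up.

```shell
#!/bin/sh
# Native (uncompressed) tape capacities in GB. Data that is already
# compressed should be budgeted against these figures.
LTO4_GB=800
LTO5_GB=1500

# How many backup sets of a given size fit on one tape (integer division).
sets_per_tape() {
    capacity_gb=$1
    set_gb=$2
    echo $(( capacity_gb / set_gb ))
}

# Succeeds if a single backup set fits on a tape of the given capacity.
fits_on_tape() {
    [ "$2" -le "$1" ]
}
```

A check such as fits_on_tape "$LTO5_GB" followed by the size of that night's backup set could run before the tape job and raise a warning when retention depth needs trimming.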