BLOG

Author
Denrich Sananda

Date
14-05-2026

Industrial Cybersecurity

OT Backup and Recovery: Why Your IT Backup Solution Will Fail in an Industrial Environment

When a manufacturing plant ran a ransomware tabletop exercise, its IT team pointed to the cloud backup system with confidence. Their OT engineer then asked one question: Can that backup restore a Siemens S7-1500 PLC configuration in the middle of a live production run?

The answer was no. The backup existed. The recovery did not.

This is the central problem with OT backup and recovery. Most industrial organizations have some form of data backup running, but very few have a program designed for the operational reality of industrial control systems. The consequences are visible in incident data. According to Dragos' 2025 OT Cybersecurity Year in Review, ransomware groups targeted industrial organizations 1,693 times in 2024, an 87% increase from the year prior, and recovery time remained the most significant operational impact, often measured in weeks rather than hours.

 

Why IT Backup Solutions Fail in OT Environments

Real-Time Control Requirements

OT systems operate on deterministic timing cycles. A PLC controlling a turbine governor or a DCS managing a chemical reactor cannot pause for a backup agent to run a snapshot. Standard IT backup software initiates network connections, reads memory states, and transfers data. In a control loop with a 10-millisecond cycle time, even a minor delay can trigger a watchdog timeout or a scan cycle overrun. The result is not just a failed backup. It can be an unplanned process shutdown.

Protocol and Architecture Incompatibilities

IT backup tools are built for file systems, databases, and virtual machines. OT environments communicate over industrial protocols such as Modbus, DNP3, PROFIBUS, and EtherNet/IP, and store configuration data in ladder logic programs, function block diagrams, and vendor-specific project files. A backup of an HMI running Windows may capture the operating system files but miss the SCADA application configuration entirely. That configuration can represent years of commissioning work that a general-purpose IT backup tool does not know how to capture.

Air-Gapped and Isolated Network Segments

Properly segmented OT networks have no direct path to enterprise backup infrastructure. The control device level and field device level are intentionally isolated. Backup procedures that depend on network connectivity cannot reach these segments without crossing security boundaries, which introduces a different risk: creating a persistent connection between IT and OT networks just to run backups.

 

What OT Backup Actually Covers

Effective OT backup covers five distinct asset categories. Each has different backup methods and different recovery requirements.

 

| Asset Category | What to Back Up | Backup Method | Recovery Priority |
|---|---|---|---|
| PLCs and Controllers | Ladder logic, function blocks, I/O config, firmware version | Offline export via vendor software (e.g., TIA Portal, RSLogix) | Critical - 1st priority |
| HMIs | Screen configs, tag databases, historian connections, application files | File-level backup and application export | High - 2nd priority |
| DCS and SCADA | Control strategy, setpoints, alarm configs, historian schemas | Vendor-specific backup utilities | Critical - 1st priority |
| Engineering Workstations | Project files, vendor software licenses, device driver configs | Full disk image and incremental | High - 2nd priority |
| Historian and Data Servers | Process data, event logs, and trend configurations | Database backup and file-level | Medium - 3rd priority |

 

The Specific Challenges of OT Recovery

Recovery Time Objectives in Process Environments

In IT, a four-hour recovery time objective for a non-critical system is acceptable. In OT, a four-hour outage at a refinery or food processing facility can mean millions of dollars in spoiled product, regulatory notification requirements, and potential safety consequences. Recovery time objectives for OT systems need to be defined against process impact, not IT service level categories. An RTO built around business continuity for email has no relevance to a process that cannot tolerate an unplanned shutdown.

Testing Recovery Without Stopping Production

IT teams regularly test backups by restoring to a test virtual machine. OT recovery testing is harder. You cannot take a live PLC offline to test a restore procedure without stopping the process it controls. Most industrial organizations do not have shadow test environments that mirror live configurations. This means recovery procedures are often untested until they are needed, and the first real test happens during an active incident.

Vendor Software and License Dependencies

Restoring a PLC configuration requires more than the backup file. It requires the correct version of the vendor's engineering software, the correct software license (often hardware-dongled), and, in some cases, a live connection to the physical device to push the restored configuration. If the engineering workstation itself is compromised or destroyed in the same incident, the toolchain needed to perform the restore may also be unavailable.

 

Building an OT Backup and Recovery Program

Step 1 - Complete OT Asset Inventory

You cannot back up what you have not cataloged. The starting point is a complete asset inventory covering every PLC, HMI, DCS controller, SCADA server, historian, and engineering workstation. Each asset record should include the firmware version, the engineering software version, the last backup date, and the personnel responsible for performing the backup.
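As a minimal sketch of what one inventory record could look like, the fields listed above can be captured in a small data structure. The class name, field names, and sample values are illustrative assumptions, not a standard schema or any vendor's format.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class OTAsset:
    """One inventory record. Field names are hypothetical, chosen to mirror the text."""
    asset_id: str
    asset_type: str                    # e.g. "PLC", "HMI", "SCADA server"
    firmware_version: str
    engineering_software_version: str
    last_backup_date: Optional[date]   # None means no backup has been recorded
    backup_owner: str                  # person responsible for performing the backup


def never_backed_up(inventory: list[OTAsset]) -> list[str]:
    """Return the IDs of assets that have no recorded backup at all."""
    return [a.asset_id for a in inventory if a.last_backup_date is None]


# Placeholder records; all values are invented for illustration.
inventory = [
    OTAsset("PLC-01", "PLC", "fw-1.0", "eng-sw-1.0", date(2026, 3, 1), "OT engineer A"),
    OTAsset("HMI-04", "HMI", "fw-2.3", "eng-sw-1.0", None, "OT engineer B"),
]

print(never_backed_up(inventory))
```

Keeping the last backup date as an optional field makes the gap explicit: an asset that has never been backed up is distinguishable from one whose backup is merely old.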

Step 2 - Define Recovery Priorities by Process Impact

Work with operations and process engineering to rank OT assets by process criticality. A PLC controlling a safety instrumented system has a different recovery priority than a batch historian. Define recovery time objectives for each tier based on what the process can tolerate, not what the IT backup infrastructure can deliver.
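One way to make the ranking concrete is to record a criticality tier and an agreed recovery time objective per asset, then sort on both. The sketch below assumes a simple numeric tier (1 is most critical); the asset names and RTO values are hypothetical placeholders that operations would replace with their own figures.

```python
from dataclasses import dataclass


@dataclass
class RecoveryTarget:
    asset_id: str
    tier: int         # 1 = most critical to the process (assumed convention)
    rto_hours: float  # what the process can tolerate, agreed with operations


def recovery_order(targets: list[RecoveryTarget]) -> list[RecoveryTarget]:
    """Order by tier first, then by the tightest RTO within a tier."""
    return sorted(targets, key=lambda t: (t.tier, t.rto_hours))


# Hypothetical example values, not recommendations.
targets = [
    RecoveryTarget("batch-historian", tier=3, rto_hours=48.0),
    RecoveryTarget("sis-plc", tier=1, rto_hours=2.0),
    RecoveryTarget("line-hmi", tier=2, rto_hours=8.0),
]

for t in recovery_order(targets):
    print(t.tier, t.asset_id, t.rto_hours)
```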

Step 3 - Implement Offline Backup Procedures

For critical control assets, offline backup is the only reliable method. This means:

  • Exporting PLC programs to project files using vendor engineering software on a schedule: at a minimum quarterly, and after every change
  • Storing backup files on removable media kept in a physically secure, offline location
  • Maintaining a spare parts inventory with pre-configured replacement hardware for critical devices where vendor lead times make emergency procurement impractical
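The "quarterly and after every change" rule in the first bullet can be expressed as a simple check. This is a sketch under stated assumptions: a quarter is approximated as 92 days, and the function name and parameters are hypothetical rather than part of any backup product.

```python
from datetime import datetime, timedelta
from typing import Optional

QUARTER = timedelta(days=92)  # assumption: "quarterly" approximated as 92 days


def backup_due(last_backup: Optional[datetime],
               last_change: Optional[datetime],
               now: datetime) -> bool:
    """Return True if a new offline export should be taken."""
    if last_backup is None:
        return True                       # never backed up
    if last_change is not None and last_change > last_backup:
        return True                       # changed since the last export
    return now - last_backup > QUARTER    # scheduled interval has elapsed


now = datetime(2026, 5, 1, 8, 0)
print(backup_due(datetime(2026, 1, 10, 9, 0), None, now))
print(backup_due(datetime(2026, 4, 20, 9, 0), datetime(2026, 4, 25, 14, 0), now))
print(backup_due(datetime(2026, 4, 20, 9, 0), None, now))
```

The check only flags that an export is due. The export itself still goes through the vendor engineering software, as described above.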

Step 4 - Maintain Tested Recovery Runbooks

A backup without a recovery runbook is an archive. For each critical asset, document the complete step-by-step recovery procedure, including the tools required, the personnel required, and the estimated completion time. Review and update runbooks every time a change is made to the target system.
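A runbook can also be tracked as a structured record so that stale ones are easy to find. The sketch below is one possible shape, assuming hypothetical field names and sample steps. It flags a runbook as stale when the target system has changed since the last review.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RecoveryRunbook:
    """Hypothetical runbook record mirroring the items listed in the text."""
    asset_id: str
    steps: list[str]
    tools_required: list[str]
    personnel_required: list[str]
    estimated_minutes: int
    last_reviewed: date

    def is_stale(self, last_system_change: date) -> bool:
        """A runbook reviewed before the latest system change needs another review."""
        return self.last_reviewed < last_system_change


# Placeholder content for illustration only.
runbook = RecoveryRunbook(
    asset_id="PLC-01",
    steps=["Retrieve offline media", "Open project in engineering software",
           "Download to controller", "Verify with operations"],
    tools_required=["Engineering laptop", "License dongle", "Programming cable"],
    personnel_required=["Controls engineer", "Shift supervisor"],
    estimated_minutes=90,
    last_reviewed=date(2026, 2, 1),
)

print(runbook.is_stale(last_system_change=date(2026, 3, 15)))
```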

Step 5 - Schedule and Track Backup Execution

OT backups are regularly skipped because they require manual execution by OT engineers who have competing operational priorities. Build backup schedules into the plant's management of change process so that every change to a control system configuration automatically triggers a backup task.

 

OT-Specific Considerations for Disaster Recovery

Segmentation-Aware Recovery Architecture

Recovery procedures need to account for network segmentation. Each security zone in the industrial network should have its own defined recovery sequence. Recovering SCADA servers before PLCs are verified as clean creates a risk of reinfection from a compromised control layer. Recovery sequences need to follow the trust hierarchy of the network architecture, not just restore the fastest systems first.
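A recovery sequence with ordering constraints can be modeled as a dependency graph and resolved with a topological sort. In the sketch below, only the rule that PLCs are verified clean before SCADA servers are restored comes from the text above. The other step names and dependencies are hypothetical placeholders for a site-specific plan.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must finish before it starts.
dependencies = {
    "verify_plcs_clean": set(),
    "restore_scada_servers": {"verify_plcs_clean"},   # constraint from the text
    "restore_hmis": {"restore_scada_servers"},        # assumed for illustration
    "restore_historian": {"restore_scada_servers"},   # assumed for illustration
}

# static_order() yields every step after all of its prerequisites.
order = list(TopologicalSorter(dependencies).static_order())

for position, step in enumerate(order, start=1):
    print(position, step)
```

Steps with no dependency between them, such as the HMI and historian restores here, may come out in either order. `TopologicalSorter` raises `CycleError` if the plan contains a circular dependency, which is a useful sanity check on the sequence itself.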

Coordination with Safety Systems

Any recovery scenario affecting process control systems must be coordinated with the safety team. Safety instrumented systems must remain in a safe state throughout the recovery process. Recovery sequences that inadvertently bypass safety logic, even temporarily, can create process safety hazards that outlast the cybersecurity incident itself. This is the intersection of cybersecurity and functional safety that most IT-led incident response plans do not account for.

 

Frequently Asked Questions

How often should OT backup procedures run?

At a minimum, OT configuration backups should run quarterly and after any change to a control system. For high-criticality systems such as PLCs controlling safety functions, the post-change backup should be taken as part of the change itself rather than deferred to the next scheduled cycle.

Can we use cloud backup for OT systems?

For some OT assets, particularly historian data and engineering project files stored on standard servers with network access, cloud backup can work. For Level 1 and Level 0 control devices in air-gapped segments, a cloud backup service cannot reach the devices without a network path that violates segmentation requirements. These assets require offline or locally stored backups.

What is the difference between OT backup and OT disaster recovery?

OT backup is the process of capturing and storing configuration and data snapshots. OT disaster recovery is the end-to-end program that defines how you restore operations following a disruption, including recovery sequences, personnel assignments, vendor coordination, and testing. Backup is an input to disaster recovery, not the same thing.

How do we back up a PLC that has no network connection?

Air-gapped PLCs require manual backup using the vendor engineering software on a portable engineering workstation or laptop that is physically connected to the device's programming port. This is a scheduled manual activity, not an automated process. It is slower, but it is often the only available method for legacy or deeply isolated devices.