OT/ICS and IoT Incident Response Plan
2024-7-24 21:54:25 Author: securityboulevard.com

What is an Incident Response Plan?

Cybersecurity threats and risks are a part of everyday business for modern-day enterprises. Protecting business assets therefore requires pre-emptive and proactive measures, and an Incident Response Plan (IRP) is one such measure that assists security teams in handling a security event.

A network security breach can throw an enterprise into chaos. A breach that exposes sensitive data and networks can push security teams into panic, especially inexperienced ones. Even an expert security team might fail to neutralize a threat optimally if it is unprepared. The Incident Response Plan (IRP) helps ensure that threats are handled well even in crunch situations, irrespective of the team’s experience. An Incident Response Plan is a document that assists IT and OT security professionals in responding to cyberattacks effectively and promptly.

The IRP includes details, procedures, and tools for detecting and identifying an attack or malfunction, analyzing it and determining its severity, mitigating and eliminating it, and restoring operations to normalcy on IT, IIoT, and OT networks. The IRP also plays a crucial role in ensuring an attack does not recur. The amalgamation of IT, IIoT, and OT networks has put cyberattacks at the core of security breaches, along with other challenges such as modification of control systems and restricted interfaces with operational systems.

Attacks on IT, IIoT, and OT Networks:

Cyberattacks:

The cyberattacks can originate in the following manner, targeting the corporate and operational divisions of an enterprise:

  • Ransomware attack
  • Data breach
  • Loss of sensitive information
  • Data leak
  • Social Engineering (Spear Phishing & Phishing)
  • Email Spoofing
  • Typo-squatting
  • Operations Security (OPSEC) failure, and others.

Modification to control systems:

From disabling safety sensors to triggering a chain reaction of failures, modification to control systems can have drastic effects. The situation is worse in OT networks, where there is often little to no security and a single event is capable of impacting the whole supply chain ecosystem.

The physical infrastructure at manufacturing plants comprises thousands of PLCs, multi-layered SCADA systems, and DCS. Any process malfunction or anomaly occurring at the plant level can affect the OT infrastructure. The following signs raise red flags about a malfunction or an attack on an OT network:

  • Disappearance and appearance of assets from the OT network map
  • Failing of automation systems
  • Safety protocol alarm activation
  • Failure of SCADA systems, forcing complete shutdown at times
  • Constant alerts about security and safety protocols
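
The first sign in the list, assets disappearing from or appearing on the OT network map, can be checked by comparing two inventory snapshots. The following is a minimal sketch in Python, assuming each snapshot is simply a list of asset identifiers; the function name and asset names are hypothetical and not taken from any particular tool.

```python
def diff_asset_inventory(baseline, current):
    """Compare two snapshots of asset identifiers from the OT network map."""
    baseline, current = set(baseline), set(current)
    return {
        "disappeared": sorted(baseline - current),  # known assets no longer seen
        "appeared": sorted(current - baseline),     # assets not in the baseline
    }


# Hypothetical asset names, for illustration only.
baseline_snapshot = ["plc-01", "plc-02", "hmi-01", "rtu-07"]
current_snapshot = ["plc-01", "hmi-01", "rtu-07", "laptop-x"]

changes = diff_asset_inventory(baseline_snapshot, current_snapshot)
print(changes)
```

Either kind of change is only a prompt for investigation: a missing PLC may be under planned maintenance, and a new device may be an approved addition.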

It is crucial to acknowledge that threats can take any form and shape, and a comprehensive IRP should be able to address the challenges above thoroughly. There have been numerous instances of cyberattacks crippling OT networks and affecting related infrastructure.

An IRP reflects how an organization safeguards the integrity of its personal and corporate information. Many IRPs define roles and responsibilities, establish communication channels between teams (the IR team and the rest of the organization), and set out standard protocols to follow during a security event. An Incident Response Plan remains useful even after a security event has been handled effectively: it provides a window into historical data, helping auditors review the risk assessment process.

Evaluating the effectiveness of IRP

A set of metrics needs to be established to track the effectiveness of an IRP. A few of the metrics are as follows:

  • Security rating of the enterprise and the competitor
  • Number of vendors and their average security rating
  • Lowest-rated and least-improved vendors
  • Highest-rated and most improved vendors
  • Number of incidents detected, missed, and mitigated
  • MTTD and MTTR
  • Number of repeat incidents
  • Number of known and unknown vectors

These metrics help understand and estimate the risk weighing on the IRP and pave the way to improve it further.
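
Two of the metrics above, MTTD and MTTR, are simple averages over incident timestamps. The sketch below assumes each incident record carries three timestamps and that MTTR is measured from detection to resolution; the field names and sample values are hypothetical, and an organization may define the start and end points differently.

```python
from datetime import datetime
from statistics import mean


def mean_hours(incidents, start_key, end_key):
    """Mean elapsed time in hours between two timestamps across incidents."""
    return mean(
        (inc[end_key] - inc[start_key]).total_seconds() / 3600
        for inc in incidents
    )


# Hypothetical incident records; field names are assumptions.
incidents = [
    {
        "occurred": datetime(2024, 1, 1, 8, 0),
        "detected": datetime(2024, 1, 1, 12, 0),
        "resolved": datetime(2024, 1, 1, 18, 0),
    },
    {
        "occurred": datetime(2024, 2, 10, 9, 0),
        "detected": datetime(2024, 2, 10, 11, 0),
        "resolved": datetime(2024, 2, 10, 13, 0),
    },
]

mttd_hours = mean_hours(incidents, "occurred", "detected")  # mean time to detect
mttr_hours = mean_hours(incidents, "detected", "resolved")  # mean time to respond
print(f"MTTD: {mttd_hours:.1f} h, MTTR: {mttr_hours:.1f} h")
```

Tracking both values per quarter, rather than as a single lifetime figure, makes it easier to see whether changes to the IRP are having an effect.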

Importance of Incident Response Plans in IT, IoT, & OT establishments

Technology and automation are woven into our daily lives. Industrial plants run on integrated and sensitive IT and OT networks, pushing the world forward. However, the evolution of IIoT has added another layer of complexity, calling for stricter security measures, given its level of social, government, and military penetration.

Need for Incident Response Plan in IT & IoT

A security event has the muscle to shake the foundations of a business. The highly publicized 2013 Target data breach led to the CEO’s departure. In addition, numerous SMBs (Small and Medium Businesses) went bankrupt after a data breach was made public. Unauthorized access hampers an enterprise’s IT ecosystem and affects every device on the network, putting thousands of IoT devices connected to the breached IT network at risk.

It is not possible to completely secure a given IT & OT network from cyberattacks. In such an atmosphere, an IRP can help minimize the damage to a good extent. It limits the threat radius and can help recover the systems at a swift pace. Alongside this, it plays a crucial role in meeting numerous industry and government compliance requirements, protecting the company’s brand, and paving the way for agencies to better collaborate in tackling the threats.

  1. Preparedness is vital – Cybersecurity attacks have risen in complexity, effectiveness, and volume in the modern world. Under these situations, taking a pre-emptive stance and preparing for an attack helps minimize the damage and restore normalcy quickly. The network’s health should be the priority for everyone working with the enterprise, with constant vigilance the only way to prevent a cybersecurity attack.
  2. Enhancing Data Security Practices – Data security gaps are often the reason for data breaches. Testing Incident Response plans has shown promising results in enhancing data security practices. IRPs help identify data security gaps, thereby strengthening security posture.
  3. Patching Cybersecurity infrastructure gaps – Running breach scenario simulations on sensitive networks can help security teams detect cybersecurity infrastructure gaps. Often, bad actors exploit the insecure points to breach the network. Patching these insecure points can thoroughly address the security concerns before an actual security event arises.
  4. Holding the company’s trust among shareholders – Having a strong IRP affects everyone in the enterprise – from the customers consuming the services to the C-suite making critical decisions. In addition, an IRP infuses confidence in clients, investors, and other personnel and helps strengthen the brand’s reputation in the public’s glare.
  5. Minimizing lost work hours – Cybersecurity attacks can stall daily operations for weeks and even months, leading to lost work hours and lost revenue. An Incident Response plan can help minimize these losses by guiding IT Security Teams to act swiftly and effectively in handling a threat – from its entry to its neutralization. The sooner one identifies a threat, the faster it can be offset.
  6. Curtailing damages – Data breaches can push an enterprise into bankruptcy. The average cost of a data breach in 2022 was around $4.54 Million, an increase of 17.5% from 2020 ($3.86 Million). 65% of enterprises say there is an increase in the severity of attacks, and over 55% of enterprises confirm that threat resolution times have increased. The damages can get worse in enterprises without an IRP. Additionally, soft costs due to reputation loss can quickly put the enterprise out of operation, and companies without an IRP end up paying penalties after a data breach. Therefore, preventing a data breach is essential for an enterprise’s existence in the market.
  7. Transformation into Knowledge Hub – Every successful encounter handling a threat following an IRP helps build a knowledge hub. The security teams will have access to vast amounts of fertile data for future reference to optimally handle emerging threats. In addition, the lessons security teams learn over time, and the best practices they implement become highly fertile knowledge for future teams.
  8. A repeatable process that aids coordination – An IRP is a repeatable, clear, and reliable method to handle security events. This repetition improves coordination, in both communication and technology, between the security team and other teams in the organization over time. The improved coordination helps achieve higher efficacy with every security event.
  9. Reducing MTTD and MTTR – Time is the most critical component when it comes to handling a security event. Threat detection time affects threat mitigation time. Unfortunately, it often takes months and even years (in some instances) to detect specific threats. The attackers continue to infiltrate the network with each passing second. Having an IRP, coupled with constant network monitoring, can help detect threats quickly, significantly reducing MTTD (Mean Time to Detect) and MTTR (Mean Time to Respond). Over 70% of incident response cases are attributed to BEC (Business E-mail Compromise) and ransomware.
  10. Compliance and Documentation – Government agencies, compliance frameworks like PCI DSS (Payment Card Industry Data Security Standard), GDPR, and HIPAA (Health Insurance Portability and Accountability Act), and cybersecurity insurance agencies require a fully functional IRP and a testing regime. In addition, most Government agencies make an IRP mandatory for enterprises to be eligible for contracts. Likewise, insurance underwriting premiums change noticeably depending on whether an enterprise has an IRP in action or not. Moreover, enterprises face heavy legal and penalty fees from forensic and auditing agencies for failing to have an IRP.

Need for Incident Response Plan in the OT Sector

A robust Incident Response Plan in manufacturing, pharmaceuticals, and energy sectors where IoT, IIoT, OT, ICS, and SCADA systems are vital is indispensable. OT networks are the backbone of modern society, and any lapse in their functioning can have cascading effects. Given the quantum of resources (human and other assets) and the inter-dependency of additional infrastructure in OT networks, the stakes are quite high. Hence, it is important to understand why IRP plays a key role in defining the security of IIoT and OT, thereby shaping society.

  1. Cybersecurity threats: Most OT infrastructure is at least a decade old, isolating the network from modern security protocols. These sectors are soft and attractive targets for cyberattacks due to the critical nature of their operations and the potential impact on national security, economic stability, and public safety. Incident response plans help detect, mitigate, and recover from cyber incidents effectively.
  2. Operational disruption: A process malfunction, anomaly, or cyber-incident can disrupt critical processes and operations in these sectors, leading to significant financial losses and potentially endangering human lives. Such circumstances require a well-thought-out protocol that can avert the risk quickly. A comprehensive incident response plan ensures a timely and coordinated response to prevent threats, minimize disruptions, and restore operations swiftly.
  3. Asset protection: Typically, a manufacturing plant has thousands of physical and digital assets (machines, ICS, PLCs, and others) interconnected across various levels. The manufacturing, pharmaceuticals, and energy sectors rely heavily on these physical and digital assets for optimum functioning. Failure of any critical component can cause unplanned downtimes, plant shutdowns, and even loss of life in extreme cases. An incident response plan helps protect these valuable assets from unauthorized access, sabotage, or theft, ensuring business continuity and preventing financial losses.
  4. Regulatory compliance: Enterprises are subject to stringent regulations and compliance requirements, such as the FDA regulations for the pharmaceutical industry or NERC-CIP standards for the energy sector. Other compliances include NIST SP 800-82 for securing Industrial Control Systems, IEC 62443 for Industrial Automation and Control Systems, ISO/IEC 27001 for managing security incidents, and HIPAA for IoT devices in healthcare, to mention a few. A well-defined incident response plan helps meet these compliance obligations and demonstrate due diligence to regulatory authorities.
  5. Incident containment and mitigation: Isolating a compromised section on an IT network is not difficult. But the same cannot be said when it comes to IoT, IIoT, OT, ICS, and SCADA networks. These systems are interconnected at various levels and are highly vulnerable to cyber threats, literally mandating security measures like DMZ (Demilitarized Zone). An intrusion into power plants and electricity management OT networks can threaten national security. Given the critical importance of the infrastructure, an incident response plan provides a structured approach to contain and mitigate the impact of an incident, preventing its escalation, and its spread across the network.
  6. Supply chain resilience: Production and quality control depend on OT networks. Inventory management is also connected to OT networks, making it possible to track and manage stocks in real time. An attack on the OT networks can significantly affect production, Quality Check mechanisms, and inventory management, thereby comprehensively impacting the supply chains. The manufacturing and pharmaceutical sectors rely heavily on complex supply chains. Likewise, an attack on the transportation systems and communication channels can delay the shipment of goods. A single cyber incident in one part of the supply chain can have cascading effects on the entire ecosystem. An incident response plan facilitates collaboration and communication among partners, ensuring supply chain resilience and minimizing disruptions.
  7. Reputation management: In the age of ever-growing cybersecurity threats, no enterprise is completely safe, whatever its level of cybersecurity. Cyber incidents still occur and continue to damage the reputation of the organizations operating in these sectors. In these turbulent times, having an Incident Response Plan reflects an enterprise’s commitment to its business operations and helps it retain trust. Prompt and effective incident response helps manage communication with stakeholders, customers, and the public, mitigating reputational damage and maintaining trust.
  8. Safety and environmental concerns: Every enterprise, especially in the energy sector (power plants and oil refinery facilities), is taking steps to minimize its carbon footprint. Measures are being implemented to reduce their impact on the ecosystem. However, these critical OT & IT infrastructures come with potential safety and environmental risks due to unauthorized access, cybersecurity incidents, or malfunctioning. A robust incident response plan addresses the steps to avoid accidents or ecological disasters during mishaps.
  9. Legal and financial implications: Organizations without an Incident Response Plan can be prosecuted for negligence and breach of duty, leading to legal liabilities, fines, lawsuits, and financial losses. Insurance companies take the existence and effectiveness of a Response Plan into account before underwriting policies. As a result, OT enterprises without a comprehensive IRP might pay higher insurance premiums. Further, an insurer can deny liability if a company does not have an IRP in place. So, a comprehensive Incident Response Plan, like the one from Sectrio, can minimize legal and financial risks, demonstrate compliance with data protection laws, and potentially reduce liability in case of an incident.
  10. Continuous improvement and lessons learned: ‘Learn, evolve, and upgrade’ – this motto forms the core of any Incident Response Plan. While an enterprise cannot foresee every possible intrusion or malfunction, it can learn from the past.

The past learnings are incorporated into the IRPs, making them dynamic and living processes. By having an incident response plan, organizations can learn from past incidents, conduct post-incident analyses, and continuously improve their security posture to protect their systems and assets better.

Drafting an efficient Incident Response Policy for OT, IoT, and IT Networks

Irrespective of the size of the enterprise, an effective Incident Response Policy is the need of the hour amid snowballing cybersecurity threats. A comprehensive and efficient IRP helps an enterprise respond effectively to a cybersecurity incident, malfunction, or any mishap during the course of operations, and minimize the consequences. Therefore, following strict measures while drafting an efficient Incident Response Policy is obligatory.

  1. Identifying an exclusive Incident Response Team
    The foremost measure an enterprise should take while drafting an Incident Response Policy is identifying the team that looks after the setup. The IR team should comprise experts from various domains – technical, HR, legal, public relations, etc. The team’s core strength and ability to work in sync determine the efficiency in handling a security incident. The policy should define every team member’s roles and responsibilities. Often OT infrastructure like industrial plants and units have thousands of individual components working in tandem under tens of different sections. Having experts from each section is highly recommended over a general expert, as the former will possess a superior skill set, helping to minimize the response time.
  2. Defining security incidents
    Allocation of resources – capital, time, and workforce – determines how effectively and promptly the IR team can handle a security incident. Classifying security incidents into various categories is essential for allocating these resources. Such an exercise allows the enterprise to prioritize the allocation of resources. For instance, common alerts on the cybersecurity (IT) front include network and systems breaches, data theft, malware attacks, and unauthorized access. Likewise, in a manufacturing plant (OT Infrastructure), disruption in the assembly unit, compromise of critical access control, and failure of PLCs, among others, can affect production and safety. Therefore, classifying an incident into the above and other categories is essential to optimize resource utilization.
  3. Refining alert mechanism
    Refining security alerts and security incident thresholds is a continuous process in Cybersecurity, Operational Technology, and IoT. If an incident is marked as a false positive, it is essential to understand what led to such an alert and tune the alert mechanism accordingly. During production, a faulty machine can impact the production line. This may be a genuine failure of a given machine rather than a third-party intervention. For IoT devices, a software glitch can cause a device to malfunction. This needs to be identified as soon as possible so that threat actors do not exploit the glitch. Alert mechanisms should classify alerts by collecting information from various sources rather than from a single point. This saves time and ensures the security team focuses on alerts that require its attention.
  4. Updating the Incident Response Policy
    Enterprises should update their Incident Response Policies regularly. New threat actors, techniques, and technologies emerge daily. Threat actors leverage novel techniques and novel tools to breach secure spaces. This calls for constant updating of the threat library, the procedures to remediate threats, and the steps to protect the secured establishment. Doing this helps security teams act swiftly, constrain the threat perimeter, and bring the systems back to normalcy in the shortest time possible. Adding lessons learned from handling a security incident into the IRP is essential in improving the enterprise’s security.
  5. Monitoring performance
    Time is the most important metric when measuring the performance of any cybersecurity solution, and the IRP is no exception. MTTD (Mean Time to Detect) and MTTR (Mean Time to Respond) are the critical metrics in judging the efficiency of an Incident Response Policy. In parallel, taking feedback from the team and customers is a good practice in further refining the IRP. Likewise, the number of incidents, successful breaches, closed incidents, thwarted attempts, and the loss or damage due to incidents over a specific period are qualitative and quantitative metrics to determine the efficacy of the Incident Response Policy.
  6. Developing procedures related to:
    • Incident Reporting:
      Defining an incident report template is critical. The procedure covers incident reporting within the enterprise and by third-party vendors, and establishes an exclusive communication channel such as an email address and a dedicated telephone line. In addition, the Incident Report template should prompt individuals for the information the IR team needs to act swiftly. Timely and comprehensive incident reporting is crucial.
    • Escalation Reporting
      While not all threat alerts are worrisome, there might be a few requiring additional attention, prompting the monitoring team to escalate. For instance, an assembly unit might be affected in an industrial plant due to a minor glitch. The employee responsible for that section can rectify the glitch instantly. Such minor threats need no further escalation. However, a SCADA system might fail causing a temporary halt on the manufacturing line in some cases. Severe threat alerts require instant escalation to the higher authority. By establishing a thorough escalation reporting procedure, there would not be a time-lapse, with threat alert reports reaching the right person at the earliest. At times forensic specialists and law enforcement agencies come into the picture, depending on the threat and resources available at the enterprise.
    • Training Programs
      Developing and conducting training programs for the workforce is a crucial exercise to achieve maximum efficiency among the Incident Response team. Detailed and comprehensive hand-outs can help employees identify an incident readily. The training should include incident reporting, escalation reporting, identifying and preventing security incidents, and categorizing threats based on severity. A holistic approach is quintessential for developing an efficient IRP.
  7. Testing and Reviewing
    Annual testing and reviewing of the IRP are mandatory. In fact, at Sectrio, we suggest bi-annual testing and reviewing. Testing and Reviewing include conducting mock drills and threat simulations of emergencies (intrusions, system failures, network jamming, and data loss, among others), to test the Incident Response policy, plan, and procedures in practice. Likewise, evaluating the policy depending on the updated threat library, techniques, technologies, guidelines, and regulations is critical. Regularly implementing testing and reviewing fortifies the security of the enterprise or the manufacturing plant.
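
The classification step (item 2) and the escalation procedure (item 6) can be sketched together as a lookup from incident category to severity and from severity to an escalation target. This is only an illustration in Python: the categories are drawn from the examples in the text, while the severity levels, escalation targets, and all names in the code are hypothetical and would be defined by each enterprise's own policy.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical mapping from incident category to severity.
CATEGORY_SEVERITY = {
    "assembly_unit_glitch": Severity.LOW,
    "unauthorized_access": Severity.MEDIUM,
    "malware": Severity.MEDIUM,
    "data_theft": Severity.HIGH,
    "plc_failure": Severity.HIGH,
    "scada_failure": Severity.HIGH,
}

# Hypothetical escalation targets per severity level.
ESCALATION_TARGET = {
    Severity.LOW: "section operator",
    Severity.MEDIUM: "incident response team",
    Severity.HIGH: "incident response lead and plant management",
}


@dataclass
class Alert:
    source: str
    category: str


def route(alert):
    """Return (severity, escalation target) for an alert.

    Unknown categories default to MEDIUM so that they reach the IR team
    for manual review instead of being dropped.
    """
    severity = CATEGORY_SEVERITY.get(alert.category, Severity.MEDIUM)
    return severity, ESCALATION_TARGET[severity]


print(route(Alert("line-3", "assembly_unit_glitch")))
print(route(Alert("control-room", "scada_failure")))
```

Keeping the two tables as data rather than branching logic makes them easy to revise when the policy is updated (item 4) or after a test exercise (item 7).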

Breakdown of NIST CS IR Team Incident Response Plan – OT & IT Infrastructure

The Incident Handling Guide from NIST (National Institute of Standards and Technology) proposes a four-phase approach for a successful IRP. It involves:

  • Preparation
  • Detection and Analysis
  • Containment, Eradication, and Recovery
  • Post-incident Activity

Preparation phase:

The initial phase of the Incident Response Plan deals with the prevention of threats arising from various reasons and causes. At this phase, most threats are flagged, dealt with, and analyzed to evaluate the extent of threat they pose to the enterprise. The threats that meet specific criteria based on threat intelligence inputs and other data are notified as incidents, and a defense plan is created accordingly. The preparation phase involves the following:

  • Establishing an incident response team
    • Offering exercise training through security incident simulations
    • Access to Incident Analysis Hardware and Software, Incident Mitigation Software, and Incident Analysis Resources.
  • Prevention of incidents using:
    • Risk assessment analysis – Understanding threats, organization-specific threats, identifying critical resources, and increasing focus on monitoring and response activities towards respective resources.
    • Hardened host security by using SCAP (Security Content Automation Protocol)
    • Real-time detection of incident and security events
    • Using a VPN to strengthen network security and installing malware protection tools
    • Organizing periodic training and awareness sessions for the workforce
    • Implementing DMZ architecture in manufacturing and production plants
    • Multi-layered authentication controls
    • Establishing multiple coordination and communication mechanisms to eliminate SPOF (Single Point of Failure)

Detection and Analysis (and documentation):

Understanding anomalies and cyber intrusions is essential in the early detection of a threat. Analyzing system data with comprehensive toolsets helps to identify whether a network breach or an intrusion has occurred. The data sources and tools include logs, firewall alerts, error messages, Intrusion Detection and Prevention Systems (IDPS), SIEMs (Security Information and Event Management tools), network monitoring tools, file integrity checking software, and others.

Incident Detection:

Classification of an anomaly as an incident is critical. False classification can waste valuable work hours and resources. The following signs can indicate an incident:

  • Suspicious activity on the server or loss of data can point toward a data breach.
  • Malfunctioning of sensors, PLCs, detectors, and other devices on a massive scale along a particular section of the plant
  • A sudden spike in alerts on a given day
  • Malware detection by security systems
  • Logs indicating usage of a vulnerability scanner on servers
  • Unauthorized activity footprints in the command center
  • Buffer overflow against a database server

Identifying how and when a threat has crept into the network is crucial, as threats can take different forms and channels. For instance, threat vectors can be:

  • Removable media – Using USB to spread malicious code
  • Unpatched workstations on OT infrastructure
  • DDoS Attack against authentication mechanisms
  • SQL injection, man-in-the-middle attacks, and rogue wireless attacks
  • Phishing attacks via email
  • Inherited vulnerabilities for PLCs, RTUs, and DCS
  • Poor encryption
  • ICS-focused malware
  • Rogue employee installing insecure software to siphon off data

Incident Analysis:

Generally, most alerts from IDPS, RBVM, and other monitoring tools are flagged. However, it is painstakingly difficult to analyze each alert, evaluate it, and then classify it as a false positive or an intrusion, with alerts usually numbering from thousands to millions (in larger enterprises) daily. Additionally, even if a given indicator is accurate, determining the exact cause behind the alert requires deeper investigation, which is heavily time- and resource-consuming.

Only a well-versed team, capable of correlating data from indicators and precursors, can correctly identify whether an alert is false or legitimate. Therefore, following a predefined procedure to analyze an event and documenting every step in the process is crucial. Security teams should practice the following techniques to analyze and validate a threat:

  • Synchronizing all host clocks – An essential exercise that helps to validate a threat, given the time stamps and its presence across the network
  • Data filtering – Many analysts filter data with indicators of critical significance, leaving out data reflected in insignificant indicators. While considerable risk comes with this approach, we can minimize the risk factor by using other validation techniques.
  • Installing RBVM Systems – Risk-Based Vulnerability Management systems provide accurate insights into the kind of threats, their severity, and the threat actors involved. RBVM systems help in analyzing the level of acceptable risk and how urgently the security teams need to act on a given risk.
  • Performing correlation – Largely affected by the skill of the security personnel, correlating data from various indicators and tools is highly effective in validating an event’s occurrence.
  • Defining normal behavior – A solid understanding of what normal and abnormal behavior look like on applications, networks, and systems is critical. This skill is perfected over time, and it helps analysts quickly identify ‘unexplained entries’ in the logs.
  • Profiling of systems and networks – Profiling means recognizing changes to a particular activity based on its expected characteristics. Though analysts cannot depend solely on profiling techniques, they can combine them with the above-mentioned techniques to zero in on an alert. Understanding the impact of the threat on the network is essential in crafting effective defensive strategies.
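
Two of the techniques above, synchronizing host clocks and performing correlation, depend on each other: events from different tools can only be lined up once their timestamps share a common reference. The following Python sketch normalizes timestamps to UTC and then groups events that fall close together in time, keeping only groups reported by more than one source. The five-minute window, the event fields, and the sample data are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def to_utc(timestamp, utc_offset_hours):
    """Convert a naive local timestamp to an aware UTC timestamp."""
    local = timestamp.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return local.astimezone(timezone.utc)


def correlate(events, window=timedelta(minutes=5)):
    """Group time-sorted events into clusters separated by gaps > window,
    keeping only clusters that are confirmed by more than one source."""
    clusters, current = [], []
    for event in sorted(events, key=lambda e: e["time"]):
        if current and event["time"] - current[-1]["time"] > window:
            clusters.append(current)
            current = []
        current.append(event)
    if current:
        clusters.append(current)
    return [c for c in clusters if len({e["source"] for e in c}) > 1]


# Hypothetical events; sources, offsets, and messages are illustrative only.
events = [
    {"source": "firewall", "time": to_utc(datetime(2024, 7, 1, 10, 0), 0),
     "msg": "blocked inbound connection"},
    {"source": "ids", "time": to_utc(datetime(2024, 7, 1, 12, 2), 2),
     "msg": "port scan signature"},
    {"source": "plc-gateway", "time": to_utc(datetime(2024, 7, 1, 15, 0), 0),
     "msg": "configuration read"},
]

for cluster in correlate(events):
    print([(e["source"], e["time"].isoformat()) for e in cluster])
```

In this sample the IDS host logs in a UTC+2 local time, so its 12:02 entry is actually two minutes after the firewall entry. Without the conversion the two events would appear two hours apart and would not be correlated.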

Documentation

Documenting every step in detecting and analyzing a security alert is prudent. Many CISOs and plant managers see this exercise as the most significant step in acquiring knowledge that is critical to strengthening the security posture. The documentation should thoroughly answer the following:

  • A complete profile of the event occurrence – When and how it occurred
  • Threat vector used for the attack
  • Who identified and reported the incident
  • Information regarding its discovery, validation, and analysis techniques
  • Network segments that are affected
  • Impact of the Threat on Daily Operations
  • Are critical systems affected? If so, which systems and to what degree?
  • Details of the team that responded to the threat
  • What action has been taken by the Incident Response team? And why?
  • The outcome of the respective action
  • Logs about every detail throughout the procedure
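
The questions above map naturally onto a structured incident record. The sketch below shows one possible shape for such a record in Python; the class name, field names, and sample values are hypothetical and simply mirror the list, so a real template would follow the enterprise's own reporting policy.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime
import json


@dataclass
class IncidentRecord:
    occurred_at: datetime
    how_it_occurred: str
    threat_vector: str
    reported_by: str
    discovery_and_validation: str
    affected_segments: list
    operational_impact: str
    critical_systems_affected: dict  # system name -> degree of impact
    response_team: list
    actions: list = field(default_factory=list)  # each entry: action, reason, outcome
    log_references: list = field(default_factory=list)

    def add_action(self, action, reason, outcome):
        """Record what was done, why it was done, and what it achieved."""
        self.actions.append({"action": action, "reason": reason, "outcome": outcome})

    def to_json(self):
        """Serialize the record; datetimes are written as strings."""
        return json.dumps(asdict(self), default=str, indent=2)


# Hypothetical sample values, for illustration only.
record = IncidentRecord(
    occurred_at=datetime(2024, 7, 1, 10, 0),
    how_it_occurred="infected USB drive connected to an engineering workstation",
    threat_vector="removable media",
    reported_by="shift supervisor",
    discovery_and_validation="IDPS alert confirmed by log correlation",
    affected_segments=["packaging line network"],
    operational_impact="packaging line paused",
    critical_systems_affected={"packaging PLC": "partial"},
    response_team=["IR lead", "OT engineer"],
)
record.add_action(
    action="isolated the workstation",
    reason="stop lateral movement",
    outcome="no further alerts",
)
print(record.to_json())
```

Storing the action, the reason, and the outcome together keeps the "what" and the "why" of each response step in one place for later review.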

Incident Prioritization

Identifying the order in which to address threats is critical in protecting the network. Addressing threats that have minimal to zero impact on network functioning while missing those that can critically affect the network can be catastrophic. A first-come, first-served approach does more harm than good when it comes to handling security threats.

Prioritization in addressing a security threat relies on its:
  • Functional impact on the system: Threats are prioritized by the operational implications they can have on the network or the system. Often, threats attacking IT systems paralyze part or all of the functionality available to administrators and end users. The effect could be temporary or could persist well into the future. Meanwhile, on an OT infrastructure, the threat could incapacitate safety protocols, putting the lives of the workforce in danger. At the same time, it can bring entire operations to a standstill. Incident handlers must consider all of these factors when deciding which threat to contain first.
  • Recoverability from the event: A security breach is bad enough, but pursuing a prolonged incident cycle can cost more than the threat’s effect on the network. Unless the effort can prevent such threats from recurring, IR teams should decide carefully how far to pursue an IRP in containing a threat. For instance, it can be more sensible to completely detach and replace the affected portions of a manufacturing plant than to keep searching for ways to prevent a recurrence. The value of the resources poured into the IRP might exceed the threat’s effect on the system.
  • The extent of data compromise: Some security events affect the integrity and confidentiality of an organization’s data. At times, a data breach may not touch sensitive or confidential information. Likewise, understanding how a data breach affects a partner organization is also crucial. The incident response team should weigh these possibilities when addressing security events involving data exfiltration.
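
The three factors above can be combined into a simple triage score. This is a minimal sketch and not a method taken from any standard: the 0 to 3 rating scale, the weights, the safety override, and the names `Incident`, `priority_score`, and `triage` are all illustrative assumptions. In particular, how recoverability should influence the score is a judgment call; here higher recovery effort is assumed to raise the priority.

```python
from dataclasses import dataclass

# Hypothetical weights; each organization would tune these to its own risk appetite.
WEIGHTS = {"functional": 0.5, "recoverability": 0.2, "data": 0.3}


@dataclass
class Incident:
    name: str
    functional_impact: int     # 0 (none) to 3 (operations halted)
    recovery_effort: int       # 0 (trivial) to 3 (rebuild or replace required)
    data_compromise: int       # 0 (none) to 3 (sensitive or partner data exposed)
    safety_risk: bool = False  # True when an OT safety function is affected


def priority_score(incident: Incident) -> float:
    """Return a score where a higher value means 'handle sooner'."""
    score = (
        WEIGHTS["functional"] * incident.functional_impact
        + WEIGHTS["recoverability"] * incident.recovery_effort
        + WEIGHTS["data"] * incident.data_compromise
    )
    # Assumption: anything that endangers people outranks every other incident.
    if incident.safety_risk:
        score += 10
    return score


def triage(incidents: list[Incident]) -> list[Incident]:
    """Order incidents by priority instead of by arrival time."""
    return sorted(incidents, key=priority_score, reverse=True)
```

Sorting by score rather than by arrival time is the point of the sketch: a disabled safety interlock reported last would still move to the front of the queue.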

Containment, Eradication, and Recovery

The last series of steps after an intrusion is Containment, Eradication, and Recovery. Quick and effective containment and eradication are vital in minimizing the damage. While containment restricts the intruder from moving to other parts of the network, eradication involves deleting the malware, disabling affected user accounts, and thoroughly addressing all the vulnerabilities. The speed at which security teams deploy containment measures determines the extent of the damage and the ease of eradication on a network. In addition, faster identification of a threat helps contain it quickly, thereby preventing it from spreading deeper into the network.

In OT networks and infrastructure, bringing the entire operation to a standstill at times helps in faster containment and eradication of the threat. Likewise, depending on the extent of threat infiltration, IT network administrators can temporarily disconnect the affected part of the network or take other measures as deemed necessary.

  • Means and ways to secure the network
    • Understanding the resources that are at risk
    • Mitigating the threat to limit the damage
    • Resources required for containment
    • Putting a workaround solution in place until everything is secured
    • If required, allow information to flow in only one direction, either into or out of the network.
  • Documenting the entire procedure
    • Details of the procedures adopted in containing the threat
    • Information about human resources and machines in use
  • Identifying the attacking hosts
    • Taking the aid of incident databases to determine the type of attack
    • IP Addresses of the host (if possible)
    • If the threat is due to a physical security lapse, gather enough evidence
    • Scanning communication channels

While identifying the attacking host is useful, it may not be the wisest use of effort while the team is still trying to contain a threat. The primary goals of the IRP are to secure the system and limit the damage of a successful intrusion. In certain situations, security teams might fall short of resources if they try to trace the path back to the attacking host. The security team’s lead should decide based on the situation. For example, there might be cases where access to a specific section also opens access to other critical areas. Under such conditions, pursuing the attacking host is futile, given the quantum of resources at risk of unauthorized access.

Eradication:

Once a threat or an intrusion is positively identified, security teams move fast to contain the threat and the intruder by deploying various techniques. Upon containment, the eradication procedure kicks in. Having a well-established plan to eradicate different types of threats comes in handy and saves valuable time. Security experts can immediately deploy suitable techniques depending on the threat profile.

From removing malware to identifying the root cause of the successful intrusion, eradication teams work with other groups. For example, where user accounts are breached, they are often disabled, and network access through those accounts is cut off immediately. After deleting the malware from software, hardware, and networking components, security teams try to plug all the vulnerabilities they can find. In the case of OT infrastructure like manufacturing plants and oil refineries, operations can be brought to a standstill, or necessary changes might be made to the supply chain to cause minimal disruption. These measures not only foil similar attacks but also help improve the security posture.

Security teams might put temporary workarounds in place if a function is disabled or a network section is rendered inoperable. However, in rare cases, an operation might be permanently disabled for security reasons.

Recovery:

The critical component that brings businesses back to the table is recovery. After successful containment and eradication, the security teams must return the systems to a fully functional state. The process of recovery can include the following:

  • Ensuring all the channels of communication are time-synced.
  • Restoring systems from clean (malware-free) backups
  • Installing security patches and scanning the networks
  • Changing passwords of all user accounts and systems
  • Removing compromised files and replacing them with clean files
  • Strengthening network security by altering boundary router access lists and firewall rulesets
  • Increasing network monitoring
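
One way to check that compromised files really have been replaced with clean ones is to compare the restored files against known-good hashes. The following is a minimal sketch under stated assumptions: it assumes a manifest of SHA-256 digests was captured from a trusted baseline before the incident, and the function names `sha256_of` and `find_mismatches` and the manifest layout (relative path mapped to hex digest) are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or differ from the known-good hash."""
    mismatches = []
    for relative_path, expected in manifest.items():
        candidate = root / relative_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            mismatches.append(relative_path)
    return sorted(mismatches)
```

An empty result from `find_mismatches` only shows that the listed files match the baseline; files that are not in the manifest are not examined, so this complements rather than replaces malware scanning.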

Recovery of the complete network to the pre-attack state can take anywhere from a few days to months. The recovery timeline depends heavily on the extent of the damage, the complexity of the threat, and the availability of resources. Enterprises might need to deploy eradication and recovery procedures in a phased manner, ensuring remediation steps lead the process. In the wake of a successful attack, enterprises should first improve the OT and IT security posture on short notice as a temporary fix and then work toward building a more robust and secure network.

What is an Incident Response Policy?

An Incident Response Policy is a formal document explaining the measures to follow before, during, and after an incident (a cybersecurity incident, operational failure, mechanical failure, or other event) so that an enterprise or a plant can return to normal functioning. The senior leadership team approves the policy in consultation with the security and operations teams. The Incident Response Policy also lists the responsibilities, activities, workforce, and other guidelines to be followed by the people responsible for securing an enterprise. An Incident Response Policy should not be confused with an Incident Response Plan. The former is a framework that helps prevent a situation, while the latter is a step-by-step protocol/procedure that comes in handy when a cybersecurity incident occurs.

The Chief Information Security Officer (CISO) plays a pivotal role in drafting an enterprise’s Incident Response Policy. The IR Policy framework helps handle and manage cybersecurity risks effectively and efficiently. Precise planning and communication in the policy are vital in ensuring similar threats don’t recur and that the teams are strategically and resourcefully equipped. Every byte of data, network connection, device, controller, and system that is part of the network comes under the purview of the Incident Response Policy. To make the most of an IR Policy, one must adopt a framework that covers the periods before, during, and after an incident. This helps improve the efficiency of the IR Policy.

What to do:

Before a cybersecurity incident (IT & OT Networks):

  • Give printed hand-outs of the IR Policy to everyone involved, because digital forms of communication might be down during a cybersecurity incident and other communication channels can be disrupted when cables are cut.
  • Train the workforce to identify cybersecurity threats and follow measures to prevent them.
  • Conduct attack simulation exercises to test the preparedness of the team.
  • Define the role description of every individual involved.
  • Involve an attorney in the discussion before drafting the IR Policy.

During a cybersecurity incident:

  • Deploying incident managers
  • Ensuring communication takes place through secured channels
  • Separating compromised sections of the network
  • Bringing subject matter experts on board for technical assistance
  • Establishing clear communication with media houses via a communication manager

After the cybersecurity incident:

  • In-depth analysis of what led to the intrusion
  • Examining processes, technologies, and people who are a part of the team
  • Suggesting places for improvement
  • Updating the IR Policy with new learnings and reviewing it quarterly
  • Establishing communication and transparency with the team

Incident Response Policy Template:

An IRP Template serves as a checklist of the various aspects of the IRP framework. Every enterprise has its own IRP Template, though these do not differ much from one another. Depending on the industry and the expertise, enterprises customize their IRP templates to suit their needs. At Sectrio, we have developed a comprehensive IRP Template that fits any enterprise and allows extensive customization. An ideal IRP Template is as follows:

  1. Incident Response team
  2. Defining roles, responsibilities, and furnishing contact information
  3. External contracts
  4. Compliance and Legal obligations
  5. Definition and preparation
  6. Threat classification 
  7. Identifying a security incident 
  8. Types of incidents
  9. Incident detection and analysis
    • Containment
    • Eradication
    • Recovery
    • Post-incident analysis
  10. Incident-related data elements
  11. Incident log and evidence register
  12. Periodic testing
  13. Remediation

The length of an Incident Response Plan can range from a 5-page document to a 50-page document. Despite the increase in cyberattacks, only 45% of enterprises have Incident Response Plans. Of the companies that do have an IRP, more than 35% rely on a managed service provider without a dedicated IRP team.

*** This is a Security Bloggers Network syndicated blog from Sectrio authored by Sectrio. Read the original post at: https://sectrio.com/blog/guide-to-incident-response-plan/


Article source: https://securityboulevard.com/2024/07/ot-ics-and-iot-incident-response-plan/