As the operator of a large server fleet, you are responsible for ensuring that the infrastructure running business-critical application workloads is secure and available. A number of security control frameworks and best practices can help, such as the NIST Cybersecurity Framework (CSF). In this blog post, we’ll introduce an aspect of data center security that few people other than attackers are thinking about, along with how Eclypsium can help you close this gap.
Each of your servers is actually a constellation of hardware and firmware components from various manufacturers. A typical data center server has up to 30 different components that have some sort of updatable microcode or firmware.
The U.S. Government Accountability Office (GAO) noted that a major OEM had 65 direct suppliers and more than 200 second-tier suppliers. Altogether, the components that went into that OEM’s computer products were manufactured in factories located in 39 countries! The point here is that the supply chain for your servers and other IT infrastructure is incredibly complex.
Each of these components is vulnerable to attack, and the attack surface of each one grows as functionality is added. For example, CPU microcode today is vastly larger and more complex than it was before multicore processing became common. This complexity inevitably leads to vulnerabilities, such as the speculative execution flaws Spectre, Meltdown, Inception, and a long list of variants. New GPU hardware used in AI infrastructure similarly includes a range of new code that contains vulnerabilities, and attackers have developed exploits to hide malware in GPUs.
A related eye-opening fact: the baseboard management controller (BMC) included in servers for lights-out management in data centers is often based on some type of embedded Linux operating system, so it will often contain open-source components such as OpenSSH or the XZ Utils library, which was targeted by nation-state attackers as a supply chain vector. This includes OpenBMC, iDRAC, iLO, and other BMCs.
Attacking these server components does not require physical access. Many exploits can be executed over the local network, or even from the internet if management interfaces are not segmented properly. This is a serious threat to both traditional data centers and new AI infrastructure. For example, in January NVIDIA disclosed CVE-2023-31029 and CVE-2023-31030, two vulnerabilities with a CVSS score of 9.3 in the NVIDIA DGX A100 baseboard management controller that an unauthenticated attacker could use for arbitrary code execution, denial of service, information disclosure, and data tampering.
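As a minimal sketch of what checking for segmentation might look like, the Python snippet below flags BMC addresses that sit outside a dedicated management network. The subnet, hostnames, and addresses are all hypothetical placeholders, and the check only looks at address assignment. It says nothing about firewall rules or actual reachability, which would need to be verified separately.

```python
import ipaddress

# Hypothetical dedicated management network for BMC interfaces.
MGMT_NETWORK = ipaddress.ip_network("10.20.0.0/24")

# Hypothetical inventory of BMC addresses, keyed by server hostname.
bmc_addresses = {
    "rack1-node01": "10.20.0.11",
    "rack1-node02": "10.30.4.17",
    "rack1-node03": "203.0.113.8",
}


def outside_management_network(addresses, mgmt_network):
    """Return hostnames whose BMC address is not inside mgmt_network."""
    return sorted(
        host
        for host, addr in addresses.items()
        if ipaddress.ip_address(addr) not in mgmt_network
    )


misplaced = outside_management_network(bmc_addresses, MGMT_NETWORK)
for host in misplaced:
    print(f"{host}: BMC address {bmc_addresses[host]} is outside {MGMT_NETWORK}")
```

In practice the address list would come from your own inventory or DHCP records rather than a hard-coded dictionary.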
An information sheet released by the U.S. NSA and CISA last year explains the risk of firmware threats that evade OS-level security controls. In a section titled “Malicious actors target overlooked firmware,” the paper says, “A vulnerable BMC broadens the attack vector by providing malicious actors the opportunity to employ tactics such as establishing a beachhead with preboot execution potential. Additionally, a malicious actor could disable security solutions such as the trusted platform module (TPM) or UEFI secure boot, manipulate data on any attached storage media, or propagate implants or disruptive instructions across a network infrastructure.” In other words, compromised BMCs help attackers establish persistence and spread laterally throughout the network.
The problem is not easily solved. The NSA and CISA information sheet continues: “Traditional tools and security features including endpoint detection and response (EDR) software, intrusion detection/prevention systems (IDS/IPS), anti-malware suites, kernel security enhancements, virtualization capabilities, and TPM attestation are ineffective at mitigating a compromised BMC.”
What security controls address this risk? The CIS Critical Security Controls are a well-respected resource for prioritizing security controls. The first two are Inventory and Control of Enterprise Assets and Inventory and Control of Software Assets. The components inside your servers fall under both of these controls, but how many organizations include components in their asset inventory and management processes? This is a serious gap in most data center operations.
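To make the idea of a component-level inventory concrete, here is a small Python sketch. It is an illustration under assumed names, not a description of how any particular product works: the `Component` and `Server` classes, the field names, the vendor, and the version strings are all hypothetical. The point is only that once components are tracked per server, a question like "which hosts run this BMC firmware version?" becomes a simple query.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    """One updatable part inside a server (field names are illustrative)."""
    kind: str              # e.g. "BMC", "CPU microcode", "GPU", "NIC"
    vendor: str
    firmware_version: str


@dataclass
class Server:
    hostname: str
    components: list = field(default_factory=list)


def find_components(fleet, kind, versions):
    """Return (hostname, component) pairs of the given kind whose
    firmware version is in the `versions` set."""
    return [
        (server.hostname, comp)
        for server in fleet
        for comp in server.components
        if comp.kind == kind and comp.firmware_version in versions
    ]


# Hypothetical fleet data; real data would come from discovery tooling.
fleet = [
    Server("rack1-node01", [Component("BMC", "ExampleVendor", "1.0.2"),
                            Component("NIC", "ExampleVendor", "22.1")]),
    Server("rack1-node02", [Component("BMC", "ExampleVendor", "1.1.0")]),
]

# Suppose an advisory names BMC firmware 1.0.2 as affected (made-up version).
flagged = find_components(fleet, "BMC", {"1.0.2"})
for hostname, comp in flagged:
    print(f"{hostname}: {comp.kind} firmware {comp.firmware_version} needs review")
```

A real inventory would also need to record how each version was discovered and when, since firmware versions change with every update.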
The Eclypsium supply chain security platform fills this gap in data center cybersecurity programs, providing not only component-level asset management but also vulnerability management, compliance monitoring, and threat detection. With one solution, you have these component-level security capabilities in place.
Component-level threats are serious and the risk needs to be mitigated. Traditional OS-level security tools cannot address these types of threats. Eclypsium’s supply chain security platform provides the visibility and security mechanisms necessary to protect your data center’s soft underbelly.
To learn more about how our solution works, download the solution brief Eclypsium for Data Centers and AI Infrastructure. If you’re ready to chat, we’d love to show you a demo.
The post Securing Your Data Center Servers at the Component Level appeared first on Eclypsium | Supply Chain Security for the Modern Enterprise.
*** This is a Security Bloggers Network syndicated blog from Eclypsium | Supply Chain Security for the Modern Enterprise authored by Chris Garland. Read the original post at: https://eclypsium.com/blog/securing-your-data-center-servers-at-the-component-level/