Software Bill of Materials (SBOMs)
Understanding Software Bill of Materials (SBOMs): The Ultimate Guide
What are SBOMs?
Software Bill of Materials (SBOMs) document the source code, library and package dependencies, and licenses consumed by the compiler or packager used to create a distributable artifact. SBOMs are typically JSON or YAML files that live side-by-side with the built artifact. An SBOM is required to take the next step in supply chain security, the vulnerability scan. To know your vulnerabilities, you need to know what code, library, and package dependencies to scan. SBOMs are therefore the first level of defense and the basis for all downstream reporting.
Why are SBOMs Important?
Software Bill of Materials, or SBOMs, expose the software libraries that your developers consume from open-source and third-party packages, including compilers and languages. You cannot defend yourself against source code hacks if you don't know all of the packages that your software consumes. This level of insight into your application dependencies is why SBOMs are an absolute must for understanding the software supply chain you deliver to your end users. SBOMs have finally been recognized as an essential tool for cybersecurity, and the two go hand in hand. In this article, we will review what an SBOM is and why SBOMs are required for a strong cybersecurity strategy.
SBOM Format Examples – SPDX and CycloneDX
As software grows more complex, the adoption of SBOMs is on the rise. Despite this growth, many organizations are still unfamiliar with the current SBOM formats.
What Are SBOM Formats?
SBOM formats are standards that define a unified structure for generating SBOMs and sharing them with end users. These formats describe the composition of software in a common format that other tools can understand.
CycloneDX and SPDX are two popular formats for Software Bill of Materials.
- CycloneDX has a viewpoint of SBOMs from the OWASP world.
- SPDX has a viewpoint originating from a license consumption model.
In terms of SBOM formats, CycloneDX and SPDX achieve the same result. Both document what is used to create an artifact by digging into hidden transitive open-source dependencies.
CycloneDX is a modern standard for SBOM with a viewpoint of SBOMs from the OWASP world. It is a lightweight Software Bill of Materials (SBOM) standard for application security contexts and supply chain component analysis.
Using CycloneDX, you can record the supplier, manufacturer, and target component, the tools used to create the BOM, and the license information for the BOM.
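To make those fields concrete, here is a minimal sketch that builds a CycloneDX-style JSON document in Python. The application name, the library, the supplier, and the tool name are all made up for illustration, and the field layout follows our reading of the CycloneDX 1.4 JSON schema, so validate any real output against the official schema.

```python
import json
from datetime import datetime, timezone


def build_cyclonedx_bom(app_name, app_version, components):
    """Return a minimal CycloneDX-style BOM as a dict (illustrative only)."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.4",
        "version": 1,
        "metadata": {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            # Tool that produced the BOM (hypothetical name).
            "tools": [{"vendor": "example", "name": "hypothetical-sbom-tool", "version": "0.1.0"}],
            # The target component that this BOM describes.
            "component": {"type": "application", "name": app_name, "version": app_version},
            "supplier": {"name": "Example Corp"},
        },
        "components": [
            {
                "type": "library",
                "name": c["name"],
                "version": c["version"],
                # Package URL; assumes an unscoped npm package name.
                "purl": f"pkg:npm/{c['name']}@{c['version']}",
                "licenses": [{"license": {"id": c["license"]}}],
            }
            for c in components
        ],
    }


bom = build_cyclonedx_bom(
    "store-frontend",
    "1.2.0",
    [{"name": "example-lib", "version": "1.3.0", "license": "MIT"}],
)
print(json.dumps(bom, indent=2))
```

The point of the sketch is the shape: one metadata block that says who and what produced the BOM, followed by a flat list of components.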
SPDX is another SBOM format with a viewpoint originating from a license consumption model. It is an open standard for communicating SBOM information. This information includes components, licenses, copyrights, and security references. It is designed to streamline work and improve compliance by providing a common format to share data.
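For comparison, the following sketch assembles a small SPDX-style JSON document with the component, license, and copyright fields mentioned above. The document name, namespace URL, package, and tool name are invented, and the field names reflect our understanding of the SPDX 2.3 JSON representation, so treat it as an outline and not a validated document.

```python
import json


def build_spdx_document(doc_name, namespace, packages):
    """Return a minimal SPDX-style document as a dict (illustrative only)."""
    doc = {
        "spdxVersion": "SPDX-2.3",
        "dataLicense": "CC0-1.0",
        "SPDXID": "SPDXRef-DOCUMENT",
        "name": doc_name,
        "documentNamespace": namespace,
        "creationInfo": {
            "created": "2024-01-01T00:00:00Z",  # fixed example timestamp
            "creators": ["Tool: hypothetical-sbom-tool-0.1.0"],
        },
        "packages": [],
        "relationships": [],
    }
    for index, pkg in enumerate(packages):
        spdx_id = f"SPDXRef-Package-{index}"
        doc["packages"].append({
            "name": pkg["name"],
            "SPDXID": spdx_id,
            "versionInfo": pkg["version"],
            "downloadLocation": "NOASSERTION",
            "licenseConcluded": pkg["license"],
            "licenseDeclared": pkg["license"],
            "copyrightText": "NOASSERTION",
        })
        # Tie each package back to the document that describes it.
        doc["relationships"].append({
            "spdxElementId": "SPDXRef-DOCUMENT",
            "relationshipType": "DESCRIBES",
            "relatedSpdxElement": spdx_id,
        })
    return doc


spdx_doc = build_spdx_document(
    "store-frontend-1.2.0",
    "https://example.com/spdx/store-frontend-1.2.0",
    [{"name": "example-lib", "version": "1.3.0", "license": "MIT"}],
)
print(json.dumps(spdx_doc, indent=2))
```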
Differences Between CycloneDX vs. SPDX
How do these two formats stack up? Understanding the difference between these common SBOM formats is important. Here is a breakdown of each format and how they are different.
| |CycloneDX|SPDX|
|---|---|---|
|Description|CycloneDX is a lightweight SBOM standard for application security contexts and supply chain component analysis.|SPDX was formed with the intent of creating a common data exchange format for sharing and collecting information related to software packages.|
|Formats|CycloneDX supports XML, JSON, and protocol buffers. The source code is available on GitHub.|SPDX supports RDFa, .xlsx, and .spdx, and expands into other formats such as .xml, .json, and .yaml. There is an online tool and a GitHub repository.|
|What it is|BOM Metadata, Components, Coordinates (group, name, version), Package URL, Common Platform Enumeration (CPE), SWID (ISO/IEC 19770-2:2015), Cryptographic hash functions (SHA-1, SHA-2, SHA-3, BLAKE2b, BLAKE3). Extensions: extension points to support future use cases and functionality. It also supports referencing components, services, and vulnerabilities in other systems and BOMs.|SPDX Document Creation Information, Package Information, File Information, Snippet Information, Other Licensing Information Detected, Relationships Between SPDX Elements, Annotations.|
|Use|CycloneDX can only be used to track the software components in a project.|SPDX tracks the software components in a product and shares information about the software components with other developers, buyers, and sellers.|
Between CycloneDX and SPDX formats, you will recognize an overlap in terminology with a different overall structure in the file formats. See https://cyclonedx.org/specification/overview/ and https://spdx.dev/use/specifications/
These schema definitions equate to a series of tables and relationships between the tables in the RDBMS world. The CycloneDX JSON structure and documentation are easier to understand and follow. SPDX goes into much more detail, but the examples are XML snippets. With SPDX, there is no single view of the whole spec represented as a single YAML or JSON file.
Challenges with SBOMs
While SBOMs are critical to our overall DevOps practices, they are imperfect. Here are the top challenges.
- SBOM data is not consumed or federated.
- Vulnerabilities of tools
- Lack of signing of the SBOM
1. SBOM Data is Not Consumed or Federated
SBOMs are not useful if the data they generate is not consumed or aggregated up to the organizational level. They become more complex in modern architecture, where hundreds of components are used to create a single application. In this case, an SBOM is generated for each unique component. For this reason, there is no easy way to show a ‘federated SBOM‘ of all components used in a single application.
2. Vulnerabilities of Tools
Another issue is that the tools used to generate SBOMs introduce a vulnerability of their own. The vulnerability lies in when the tools run: post-build or post-packaging. SBOM generators use files that the compiler or packaging program creates, such as Rust’s Cargo.lock, NPM’s package-lock.json, or an apt-cache. The gap between the build and SBOM creation provides a hacking opportunity where the SBOM report can be manipulated, resulting in untrue SBOMs.
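One way to narrow that gap, sketched below, is to record a digest of the lock file at build time and check it again just before the SBOM generator runs. This is only an illustration: the file contents are made up, the function names are ours, and the check helps only if the recorded digest is stored somewhere an attacker cannot also rewrite.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def lockfile_unchanged(path, digest_at_build):
    """True if the lock file still matches the digest recorded at build time."""
    return sha256_of(path) == digest_at_build


# Demo with a throwaway lock file (contents are invented).
with tempfile.TemporaryDirectory() as workdir:
    lockfile = Path(workdir) / "package-lock.json"
    lockfile.write_text('{"lockfileVersion": 3, "packages": {}}')

    recorded = sha256_of(lockfile)  # captured by the build step
    ok_before = lockfile_unchanged(lockfile, recorded)

    # Simulate tampering between the build and SBOM generation.
    lockfile.write_text('{"lockfileVersion": 3, "packages": {"node_modules/evil": {}}}')
    ok_after = lockfile_unchanged(lockfile, recorded)

print(ok_before, ok_after)
```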
3. Lack of Signing of the SBOM
Another challenge is the lack of signing of the SBOM. Without a signature, we cannot know the provenance of the SBOM report. The signing process is critical for SBOMs and cybersecurity.
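As a simplified illustration of what signing buys you, the sketch below attaches an HMAC-SHA256 tag to an SBOM and verifies it later. Note the assumption: a shared-secret HMAC is used here only because it fits in the standard library. Real pipelines would more likely use asymmetric signatures so that consumers can verify without holding the signing key. The key and SBOM contents are invented.

```python
import hashlib
import hmac
import json


def canonical_bytes(sbom):
    """Serialize the SBOM deterministically so the same content gives the same bytes."""
    return json.dumps(sbom, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_sbom(sbom, key):
    """Return an HMAC-SHA256 tag over the canonical SBOM bytes."""
    return hmac.new(key, canonical_bytes(sbom), hashlib.sha256).hexdigest()


def verify_sbom(sbom, key, signature):
    """Constant-time comparison of the expected and supplied tags."""
    return hmac.compare_digest(sign_sbom(sbom, key), signature)


key = b"example-secret-key"  # placeholder; never hard-code real keys
sbom = {"bomFormat": "CycloneDX", "components": [{"name": "example-lib", "version": "1.3.0"}]}

signature = sign_sbom(sbom, key)
valid_original = verify_sbom(sbom, key, signature)

# An edited SBOM no longer matches the signature.
tampered = {"bomFormat": "CycloneDX", "components": []}
valid_tampered = verify_sbom(tampered, key, signature)

print(valid_original, valid_tampered)
```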
The biggest challenge is the mutability of the SBOM files. It is important to show history over time without the ability for someone to manually update an SBOM. Software Bill of Materials files can be edited without any trace, allowing the data to be manipulated. For example, hackers can easily update an NPM package-lock.json prior to your NPM SBOM generation. By doing so, they can disguise any hack in the SBOM. Immutability becomes an issue at all levels of SBOM generation.
What was the SBOM generated for, a microservice or an entire application? In traditional development, an SBOM is created at the point in time when the application version’s artifacts are created, giving us an SBOM based on that application version. In microservices, each service is built and deployed independently. An application is a logical collection of microservices. Application versions are created when an underlying microservice version is updated. So how do we track application-level SBOMs when the application is only a logical representation and never built as a complete unit?
Understanding the Different Levels of SBOMs
Multiple Software Bill of Materials are produced at different levels of the software stack. The following levels are needed to have a complete audit trail of the entire supply chain used to create a single artifact (microservice, binary, library, etc.):
Hardware SBOM
At the lowest level, the supply chain starts at the underlying hardware used, for example, the processor type (amd64 vs arm64). Yes, this level of the stack can also have vulnerabilities. However, this information is not currently gathered by default, even though the SBOM schema can store it. In CycloneDX this data is stored as Components.
Knowing the hardware used by your build is essential in recreating an artifact. For example, tools in the future may use a ‘consensus build network’ where all machines must be identical, starting at the lowest level. The hardware SBOM is the way to confirm this level of parity. Check out JFrog’s Pyrsia open source project for a consensus build network.
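A build script could capture this with very little code. The sketch below records the build host's processor architecture as a CycloneDX-style component. Using the "device" component type for this purpose is our assumption about a reasonable mapping, and the description text is invented.

```python
import platform


def hardware_component():
    """Describe the build host's processor architecture as a component dict."""
    machine = platform.machine() or "unknown"   # e.g. an amd64 or arm64 identifier
    system = platform.system() or "unknown"
    return {
        "type": "device",  # assumed mapping for hardware entries
        "name": machine,
        "description": f"Build host processor architecture (reported by {system})",
    }


hw = hardware_component()
print(hw)
```

Comparing this entry across two builds is a cheap first check that both ran on the same class of hardware.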
Operating System SBOM
The next level of the supply chain is the OS. We need to know the underlying operating system packages installed on the build machine. These operating system packages affect how the compiler and packagers produce their artifacts and impact the ability to reproduce a build. They could also introduce vulnerabilities from old OS packages still being consumed. OS SBOMs are not currently gathered by default, even though the SBOM schema can store them. In CycloneDX format, this data is stored as Components.
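As a sketch of how OS packages could be turned into components, the code below parses a listing of package name, version, and architecture, one package per line. We assume the listing came from something like `dpkg-query -W -f='${Package} ${Version} ${Architecture}\n'` on a Debian-based build machine. The sample text is invented, and the "library" type and the "example:arch" property name are illustrative choices.

```python
def os_package_components(listing):
    """Convert 'name version arch' lines into component dicts."""
    components = []
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, version, arch = parts
        components.append({
            "type": "library",
            "name": name,
            "version": version,
            "properties": [{"name": "example:arch", "value": arch}],
        })
    return components


# Invented sample; a real run would capture the package manager's output.
sample_listing = """bash 5.1-2 amd64
libc6 2.31-13 amd64
"""

os_components = os_package_components(sample_listing)
for component in os_components:
    print(component["name"], component["version"])
```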
Compiler SBOM
The compiler translates source code into objects. To create source-to-object parity, we must know what the compiler consumed as input and produced as output. Unfortunately, SBOMs at this level do not exist outside of commercial build tools such as OpenMake Meister and IBM Rational ClearCase (ClearMake).
Most build scripts that produce a ‘build audit’ report on the files found in the ‘local’ build directory, not on what the compiler actually used. These files are pulled from a source repository (git directory) and listed as the source SBOM. However, the ‘local’ build directory is not the only place where the compiler will find files. If we are to be accurate in our SBOMs, we cannot miss files. An SBOM must include all files, even when they are managed outside your versioning tool. In other words, it must be produced at compile/link time to be accurate.
In the future, compilers must be updated to produce SBOMs as standard output, with the ability to control the inputs and outputs carefully. OpenMake Meister achieved this control using a Search Path and generated Build Control Files. Another approach is to monitor the file system for reads and writes by the compiler, as ClearMake does. The concept of the GNU Make VPATH, with its ‘first found’ reference, is as relevant today as it was 30 years ago.
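The ‘first found’ idea is simple to sketch: walk an ordered list of directories and record the first location where each input file is found. The code below is our own minimal illustration of that lookup, not how Meister, ClearMake, or GNU Make implement it. The directory and file names are invented.

```python
import tempfile
from pathlib import Path


def resolve_first_found(filename, search_path):
    """Return the first existing path for filename in the ordered search path."""
    for directory in search_path:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    return None  # not found anywhere on the search path


with tempfile.TemporaryDirectory() as root:
    local = Path(root) / "local"
    shared = Path(root) / "shared"
    local.mkdir()
    shared.mkdir()

    # The header exists only outside the 'local' build directory.
    (shared / "config.h").write_text("/* shared copy */")
    first_dir = resolve_first_found("config.h", [local, shared]).parent.name

    # Once a local copy exists, it wins because 'local' is searched first.
    (local / "config.h").write_text("/* local copy */")
    second_dir = resolve_first_found("config.h", [local, shared]).parent.name

    missing = resolve_first_found("absent.h", [local, shared])

print(first_dir, second_dir, missing)
```

Recording the resolved path for every input, not just the files in the local directory, is what makes the resulting SBOM match what the compiler actually read.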
Transitive Dependencies SBOM
Transitive dependencies happen when A depends upon B and B depends upon C. C is then a transitive dependency of A. SBOM tools rely on the compiler or packager to output the transitive dependencies, or they determine them using a recursive lookup. Most SBOMs at this level are accurate, and this is where most teams have focused their attention. Transitive Dependency SBOMs are critical as they expose open-source packages.
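The recursive lookup can be sketched in a few lines. The example below assumes the direct dependencies are already available as a simple mapping from package name to a list of names. The package names A, B, C, and D are placeholders.

```python
def transitive_dependencies(package, direct_deps):
    """Return every package reachable from `package` through direct_deps."""
    seen = set()

    def visit(name):
        for dep in direct_deps.get(name, []):
            if dep not in seen:  # guards against cycles and repeated work
                seen.add(dep)
                visit(dep)

    visit(package)
    seen.discard(package)  # a cycle could otherwise list the package as its own dependency
    return seen


direct_deps = {
    "A": ["B"],
    "B": ["C", "D"],
    "C": [],
    "D": ["C"],
}

deps_of_a = transitive_dependencies("A", direct_deps)
print(sorted(deps_of_a))
```

Here A declares only B, yet the lookup also surfaces C and D, which is exactly the set an SBOM needs to expose.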
Package SBOM
Packagers, such as NPM, docker build, and dpkg-deb, pull together files and artifacts into an installable package based on a scripted configuration file. They produce two outputs: the package and the package dependency file. Package dependency files such as package-lock.json or Cargo.lock are example outputs of the packager.
Some Package SBOMs at this level may contain source code dependencies, depending on the programming language, packager, and SBOM tools used. Like the compiler SBOMs, the package SBOM may only reference files from the ‘local’ build directory, ignoring all other file locations.
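To show how a package dependency file feeds a Package SBOM, the sketch below reads the "packages" map of a package-lock.json and emits one component per installed package. It assumes the lockfileVersion 2 or 3 layout, where keys look like "node_modules/name". The lock file contents are invented, and workspace or linked entries are not handled.

```python
import json


def components_from_package_lock(lock_text):
    """Extract name/version/integrity entries from a package-lock.json string."""
    lock = json.loads(lock_text)
    components = []
    for path, entry in lock.get("packages", {}).items():
        if path == "":
            continue  # the empty key is the root project itself
        # Nested installs look like node_modules/a/node_modules/b; keep the last name.
        name = path.rsplit("node_modules/", 1)[-1]
        components.append({
            "name": name,
            "version": entry.get("version"),
            "integrity": entry.get("integrity"),
        })
    return components


sample_lock = json.dumps({
    "name": "store-frontend",
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "store-frontend", "version": "1.2.0"},
        "node_modules/example-lib": {"version": "1.3.0", "integrity": "sha512-EXAMPLE"},
        "node_modules/example-lib/node_modules/nested-dep": {"version": "0.4.1"},
    },
})

lock_components = components_from_package_lock(sample_lock)
for component in lock_components:
    print(component["name"], component["version"])
```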
Application SBOM
Application SBOMs aggregate all of the lower-level SBOMs into a single view. Tools such as DeployHub and Ortelius snapshot all available SBOMs for each artifact in the supply chain. When consumed by an application, the SBOM data is aggregated to the logical application level, creating an Application SBOM.
Without this aggregation ability, we lose the application level SBOM.
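The aggregation step itself can be pictured as a merge of component lists keyed by package identity. The sketch below is our own illustration, not how DeployHub or Ortelius implement it. The service names and components are invented, and the "usedBy" field is a made-up addition that is not part of CycloneDX or SPDX.

```python
def aggregate_application_sbom(app_name, app_version, service_sboms):
    """Merge per-service component lists into one deduplicated application view."""
    merged = {}
    for service, sbom in service_sboms.items():
        for comp in sbom.get("components", []):
            # Prefer the Package URL as the identity; fall back to name@version.
            key = comp.get("purl") or f"{comp['name']}@{comp['version']}"
            entry = merged.setdefault(key, {**comp, "usedBy": []})
            entry["usedBy"].append(service)
    return {
        "application": app_name,
        "version": app_version,
        "components": [merged[key] for key in sorted(merged)],
    }


service_sboms = {
    "cart-service": {"components": [
        {"name": "example-lib", "version": "1.3.0", "purl": "pkg:npm/example-lib@1.3.0"},
    ]},
    "checkout-service": {"components": [
        {"name": "example-lib", "version": "1.3.0", "purl": "pkg:npm/example-lib@1.3.0"},
        {"name": "other-lib", "version": "2.0.0", "purl": "pkg:npm/other-lib@2.0.0"},
    ]},
}

app_sbom = aggregate_application_sbom("online-store", "7", service_sboms)
for component in app_sbom["components"]:
    print(component["name"], component["usedBy"])
```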
Tooling to Make Software Bill of Materials Accessible
Tooling makes Software Bill of Materials accessible. While your team may be doing a great job of generating the different levels of SBOMs for each build, we are far from finished. We need the ability to see the information so that it can be used and acted upon by multiple team members. And the SBOM information must have immediate relevance. After all, what good is a microservice’s SBOM if we have no insight into which applications are using the service?
Maturing in the use of SBOMs will drive the need to aggregate the data and display the information in a way that is easy to read and useful to all. The first step is finding an easy way to associate the generated SBOMs based on the artifact in a simple dashboard view. Unfortunately, we cannot expect everyone to know where the last build was executed and where the SBOM data lies.
Secondly, in a microservice architecture, we need a method of aggregating SBOM data up to the ‘logical’ application level. This aggregation requires tooling that can define a ‘logical’ application and build relationships between the microservices (or any component) and each ‘logical’ application version.
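Once those relationships exist, the question of which applications are affected by a given package becomes a simple lookup. The sketch below assumes two mappings that such tooling would maintain: logical applications to their microservices, and microservices to the package identifiers found in their SBOMs. All names are invented.

```python
def applications_using(component_key, applications, service_components):
    """Return the logical applications that consume a component via any service."""
    affected = []
    for app, services in applications.items():
        if any(component_key in service_components.get(s, set()) for s in services):
            affected.append(app)
    return sorted(affected)


# Logical applications defined as collections of microservices.
applications = {
    "online-store": ["cart-service", "checkout-service"],
    "admin-portal": ["reporting-service"],
    "mobile-api": ["checkout-service"],
}

# Package identifiers taken from each microservice's SBOM.
service_components = {
    "cart-service": {"pkg:npm/example-lib@1.3.0"},
    "checkout-service": {"pkg:npm/other-lib@2.0.0"},
    "reporting-service": {"pkg:npm/example-lib@1.3.0"},
}

impacted = applications_using("pkg:npm/other-lib@2.0.0", applications, service_components)
print(impacted)
```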
Understanding SBOMs is key to understanding why they should be an important step in the DevOps process. Both SPDX and CycloneDX provide what is needed to expose the open-source packages delivered to end users. This knowledge is critical when a high-risk vulnerability in open source is found and a patch is required immediately. For additional information on SBOMs, read the report from the National Telecommunications and Information Administration (NTIA).
Federating all SBOM Data with DeployHub
In a cloud-native microservices architecture, your SBOMs are generated and managed at the microservice level. Microservices are pushed across your continuous delivery pipeline independently and frequently. Every time a new microservice is updated, all of the consuming ‘logical applications’ have a new version with a new SBOM and CVE report. Developers, DevOps Engineers, and Security teams struggle to keep up with the changes and cannot easily provide SBOM and CVE reporting for all impacted applications. The result is the absence of governance or a historical audit trail of the changes pushed to end users.
DeployHub’s SBOM automation tool solves this problem by centralizing the ‘evidence store’ data and continuously aggregating the information to the critical level, the ‘logical application.’ DeployHub provides you with a working federated SBOM report every time a new application ‘release candidate’ is created. DeployHub provides the insights needed to harden the security of the software your end users consume.
SBOM Key Concepts
SBOM automation automatically generates a list of all software components, libraries, and dependencies that make up a software application as part of your DevOps pipeline, and then consumes that data.
Aggregating SBOM data to the ‘logical’ Application level is required if you need to produce an Application SBOM in a decoupled architecture. Learn how DeployHub provides aggregated SBOM reports from hundreds of component SBOMs.
- Software Supply Chain Management Catalogs Explored Whitepaper
- Aggregated SBOM Reports
- Open-Source Inventory and Risk Management
- Supply Chain Versioning with Historical Trends
- Component Impact and Blast Radius
- Logical Application Views in a Decoupled Architecture
- SBOMs and Cybersecurity
- Federated Software Composition Analysis Data