RFC: Potential EOL Schema? #3484

noqcks · 2023-08-28T03:47:53Z


Hello everyone! I'm the creator of the https://github.com/xeol-io/xeol tool, which uses endoflife.date for EOL information to match EOL software in containers. First of all, thank you tremendously for the project and for leading the efforts around EOL.

I have been thinking about an improved schema for EOL lifecycles recently and wanted to open a discussion about it. I know that we have somewhat differing use cases for this schema/data. We would like to use it programmatically, while the main use case for endoflife.date right now is displaying it on the web page. This schema is most useful for the programmatic use case, and I tend to believe the two use cases are somewhat at odds (without doing some data transformations).

For our use case, we need a highly accurate database of EOL products, so much so that using ./well-known from vendors would not work as there would likely be too much variance in quality between vendors. I would like to work together as much as possible (for example sharing this RFC), but also understand that there may be different goals we each have in mind. In either case, hope this ignites some helpful discussion and has some ideas you can draw on.

This document builds off of some of the work from the releases.json RFC and is related to the discussion Distinct PURLs/Identifiers for each release cycle.

cc: @captn3m0 @marcwrobel and also @witchcraze as I know you have been thinking about this area for quite some time.

An Open Source EOL Schema

Aug 27, 2023

Purpose

The goal of this document is to describe a schema for end-of-life (EOL) dates of software products. The schema should be extremely accurate and yield low false positive rates. We believe that false positives are in fact more harmful than false negatives because they actively degrade the user's trust. This must be kept in mind. There has been some great work in the space with https://endoflife.date and we hope to build upon that.

This format is a work in progress. Please feel free to add comments.

Definitions

We must first define some critical terminology, specifically around end-of-life (EOL). In the schema there are different dates surrounding when a project is no longer supported:

end-of-support (eos): this is the end of active support. This is when bug fixes are no longer being made, but there are still security bugs being fixed.
end-of-life (eol): this is the end of bugfixes and security support. All support stops.

There may be other definitions that have been made historically, but these are the two we think matter.
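The two dates above split a release's life into three phases. A minimal sketch of that classification follows; the function name, the phase labels, and the choice to treat the boundary date itself as already past are assumptions made for illustration, not part of the RFC.

```python
from datetime import date
from typing import Optional


def support_phase(today: date, eol: date, eos: Optional[date] = None) -> str:
    """Classify a release by the eos/eol definitions above.

    Assumption: a release counts as past a boundary on the boundary date itself.
    """
    if today >= eol:
        return "eol"            # no bug fixes and no security fixes
    if eos is not None and today >= eos:
        return "security-only"  # bug fixes stopped, security fixes continue
    return "active"             # bug fixes and security fixes still made
```

For example, `support_phase(date(2024, 1, 1), eol=date(2025, 1, 1), eos=date(2023, 6, 1))` falls between the two dates and is classified as security-only.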

Background

The biggest problem with creating a lifecycle schema is the sheer complexity of the different things we need to identify. We need to first define the things with which we can attach a lifecycle. Let's give some examples of each:

service - A service is a solution provided by a vendor, such as Google Kubernetes Engine. It can be identified by Name and Vendor.
device - A device is a physical device like an iPhone. It can be identified by Name and Vendor.
software - A software is a collection of packages like Nginx or Kubernetes. It can be identified by a CPE or by any one of the package URLs contained within it.
package - A package is a standalone item that can be identified by a Package URL. For example, packages on NuGet may be marked as deprecated.
os - An operating system like RHEL or Ubuntu. It can be identified by a CPE or a SWID.

Since a software contains packages, you might assume that the lifecycle associated with them is always the same, but this is not the case. For example, .NET 7.0 is a software that has a lifecycle attached to it, but the packages in the .NET ecosystem hosted on NuGet may be deprecated before or after .NET 7.0 itself is EOL.

The schema must also have support for vendors in some way. A vendor may support software or packages in their ecosystem that are not their own. For example, Red Hat may support the lifecycle of .NET 6.0 in their Application Streams for RHEL even though .NET 6.0 is produced by Microsoft.

Project Schema

The format is a JSON structure.

{
  "schema_version": string,
  "id": string,
  "modified": string,
  "published": string,
  "entity": {
    "name": string,
    "version": string,
    "type": string,
    "released": string,
    "vendor": string,
    "lts": bool,
    "discontinued": bool
  },
  "references": [{
    "type": string,
    "url": string
  }],
  "identification": [{
    "versions": [string],
    "cpes": [string],
    "purls": [string],
    "custom": {
      "name": string,
      "vendor": string
    }
  }],
  "support": [{
    "level": {
      "name": string,
      "id": int
    },
    "eos": string,
    "eol": string
  }]
}
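As a rough illustration of how a consumer might sanity-check an entry in this shape, the sketch below reports which required fields are absent. It relies only on the requirements stated in Field Details (id and modified are required; each support object needs level and eol). The function name and the dotted-path output format are hypothetical.

```python
from typing import Any, Dict, List

# The RFC marks these two top-level fields as required.
REQUIRED_TOP_LEVEL = ("id", "modified")


def missing_required_fields(entry: Dict[str, Any]) -> List[str]:
    """Return paths of required fields that are absent from a lifecycle entry."""
    missing = [name for name in REQUIRED_TOP_LEVEL if name not in entry]
    # Each support object must carry a level and an eol date; eos is optional.
    for index, support in enumerate(entry.get("support", [])):
        for name in ("level", "eol"):
            if name not in support:
                missing.append(f"support[{index}].{name}")
    return missing
```

An empty list means the entry carries every field the RFC calls mandatory; it says nothing about whether the values are well formed.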

Field Details

schema_version field

The schema_version field is used to indicate which version of the Lifecycle schema a particular lifecycle was created with. This can help consumer applications decide how to import the data for their own systems and offer some protection against future breaking changes. The value should be a string following the SemVer 2.0.0 format, with no leading "v" prefix. Clients can assume that new minor and patch versions of the schema only add new fields, without changing the meaning of old fields.
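Under the compatibility rule above, a client only needs to compare major versions. The following is a minimal sketch of that check; `can_read` is a hypothetical helper name, and it deliberately ignores SemVer pre-release and build metadata.

```python
def can_read(entry_schema_version: str, client_schema_version: str) -> bool:
    """Return True if a client written for client_schema_version can read the entry.

    Relies on the rule that minor and patch releases only add fields,
    so only a differing major version signals a breaking change.
    """
    if entry_schema_version.startswith("v"):
        raise ValueError("schema_version must not carry a leading 'v'")
    entry_major = int(entry_schema_version.split(".", 1)[0])
    client_major = int(client_schema_version.split(".", 1)[0])
    return entry_major == client_major
```

A client built against 1.0.0 would accept an entry at 1.4.2 and ignore any fields it does not know, but would refuse an entry at 2.0.0.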

id, modified fields

The id field is a unique identifier for a lifecycle entry. It follows the format LC-xxxx-xxxx-xxxx, where:

x is a letter or number from the set 23456789cfghjmpqrvwx. We use a set of characters that aren't easily confused with others (e.g. 0 and O, 1 and I) to reduce errors in transmission and interpretation.
The numbers and letters are randomly assigned (aside from the LC prefix).
All letters are lowercase to simplify case-insensitive matching, making it easier to search.
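The id rules above can be sketched as a generator plus a validator. This assumes the groups are separated by plain ASCII hyphens; the function names are illustrative only.

```python
import re
import secrets

# Character set from the RFC: no easily confused characters such as 0/O or 1/I.
ALPHABET = "23456789cfghjmpqrvwx"
# "LC" followed by three hyphen-separated groups of four characters.
ID_PATTERN = re.compile(rf"LC(-[{ALPHABET}]{{4}}){{3}}")


def new_lifecycle_id() -> str:
    """Generate a random id of the form LC-xxxx-xxxx-xxxx."""
    groups = ["".join(secrets.choice(ALPHABET) for _ in range(4)) for _ in range(3)]
    return "LC-" + "-".join(groups)


def is_valid_lifecycle_id(value: str) -> bool:
    """Check that a string follows the id format exactly."""
    return ID_PATTERN.fullmatch(value) is not None
```

With 20 characters and 12 positions there are 20^12 possible ids, so random assignment makes collisions unlikely, though a real registry would still need to check for duplicates.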

The modified field gives the time the entry was last modified as an RFC3339-formatted timestamp in UTC (ending in "Z"). We have chosen RFC3339 over ISO8601 because it is simpler to parse and has fewer possible variations.

Given two different entries claiming to describe the same id field, the one with the later modification time is considered authoritative. The id and modified fields are required.
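The "later modification wins" rule can be sketched as follows. The helper names are hypothetical, and the parser only accepts the "Z"-suffixed UTC form the RFC requires.

```python
from datetime import datetime
from typing import Any, Dict


def parse_rfc3339_utc(value: str) -> datetime:
    """Parse an RFC3339 UTC timestamp such as 2023-08-28T03:47:53Z."""
    if not value.endswith("Z"):
        raise ValueError("timestamp must be in UTC and end in 'Z'")
    # Replace the trailing Z with an explicit offset so fromisoformat accepts it.
    return datetime.fromisoformat(value[:-1] + "+00:00")


def authoritative(first: Dict[str, Any], second: Dict[str, Any]) -> Dict[str, Any]:
    """Return whichever of two entries with the same id was modified later."""
    if first["id"] != second["id"]:
        raise ValueError("entries describe different ids")
    return max(first, second, key=lambda entry: parse_rfc3339_utc(entry["modified"]))
```

If both entries carry the same timestamp, `max` returns the first argument; the RFC does not say how such ties should be broken.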

published field

The published field gives the time the entry should be considered to have been published, as an RFC3339-formatted timestamp in UTC (ending in "Z").

withdrawn field

The withdrawn field gives the time the entry should be considered to have been withdrawn, as an RFC3339-formatted timestamp in UTC (ending in "Z"). If the field is missing, then the entry has not been withdrawn. Any rationale for why a lifecycle has been withdrawn should go into the summary text.

entity field

The entity field describes the thing the lifecycle entry applies to. The combination of entity.name, entity.version, entity.type, and entity.vendor must be unique across all records.

entity.name

The entity object's name field is a pretty name for the package or software. It should be sufficiently close to the names used within the identifiers (such as the PURL identifier), but does not need to be exact.

entity.version

The entity object's version field is the cycle version for the software or package. In the case of a software like MongoDB Server, which ties lifecycle events to the minor version of the software, it would be "6.0" or "6.1" etc. In the case of a software like Vue, which ties lifecycle events to the major version, it would be "2" or "3".

entity.type

The entity object's type field is the type of object described. It may be one of software, package, os, or product.

entity.released

The entity object's released field is when the entity was first released, as an RFC3339-formatted timestamp in UTC (ending in "Z"). This is an optional field, as finding the real release date of a software or package may be difficult in some cases.

entity.lts

The entity object's lts field is an optional boolean to describe whether the entity is a Long Term Support (LTS) release.

entity.discontinued

The entity object's discontinued field is an optional boolean that describes whether the entity is still available. For software, this means it's available via a non-archive source. For devices, it means the device is still available for sale.

references field

The optional references field contains a list of JSON objects describing references. This could be a link to the vendor's support page or an explanation of their versioning schema. Each object has a string field type specifying the type of reference and a string field url. The url is the fully qualified URL (including the scheme) linking to additional information about the entity.

The known reference type values are:

PACKAGE: A home web page for the package such as NuGet
SOURCE: A source page containing the package source code on GitHub or another VCS.
WEB: A web page of some unspecified kind.
ARTICLE: An article or blog post describing why the entity is EOL.
CHANGELOG: A web page containing the project's change log information.

identification field

identification.versions

The identification object's versions field is a JSON array of version strings that are matched together with identification.cpes, identification.purls, or identification.custom. This field is optional, since some software has no version associated with it and only needs to be identified using the identification.custom object.

An entry in identification.versions may include a wildcard. It may be a total wildcard like "*" (see the RHEL .NET example below) or a wildcard for a specific semver version like "7.*", in which case versions such as 7.1.2 and 7.1 are all matched.

In the case where semver is not used for a software or package, a complete list of versions should be used.
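A minimal sketch of the wildcard rules above is shown here. It treats a trailing ".*" as a dotted-prefix match; whether "7.*" should also match a bare "7" is not stated in the RFC, and this sketch assumes it does not. The function names are illustrative.

```python
from typing import Iterable


def version_matches(pattern: str, version: str) -> bool:
    """Match one identification.versions pattern against a concrete version."""
    if pattern == "*":
        return True  # total wildcard
    if pattern.endswith(".*"):
        # "7.*" matches "7.1" and "7.1.2" but not "70.1".
        prefix = pattern[:-2]
        return version.startswith(prefix + ".")
    return pattern == version  # plain entries must match exactly


def any_version_matches(patterns: Iterable[str], version: str) -> bool:
    """True if at least one pattern in the list matches the version."""
    return any(version_matches(pattern, version) for pattern in patterns)
```

Comparing on the prefix plus a dot, rather than the bare prefix, keeps "7.*" from accidentally matching versions like 70.1.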

identification.cpes

The identification object's cpes field is a JSON array containing Common Platform Enumeration (CPE) strings used to identify the software. These may be CPE 2.2 or 2.3 strings. These CPE strings are used in combination with identification.versions to identify a software.

We must support CPE if we are to identify things such as operating systems, as there is no PURL support for operating systems. SWIDs are another possibility, but native support across the different operating systems is much less reliable than for CPEs.

identification.purls

The identification object's purls field is a JSON array containing strings following the Package URL specification that identifies the package. This PURL should include qualifiers when possible to ensure accuracy in matching. The PURLs are used in combination with identification.versions to identify the package.
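To illustrate how qualifiers affect matching, the sketch below compares an entry's PURL with one reported by a scanner. It uses plain string splitting rather than a PURL library, ignores the version (which would be checked separately against identification.versions), and requires every qualifier on the entry's PURL to be present with the same value on the scanned PURL. The helper names and the scanned PURL in the usage note are hypothetical.

```python
from typing import Dict, Tuple


def split_purl(purl: str) -> Tuple[str, str, Dict[str, str]]:
    """Split a PURL string into (base, version, qualifiers)."""
    rest, _, _subpath = purl.partition("#")     # drop an optional subpath
    rest, _, query = rest.partition("?")        # qualifiers follow "?"
    base, sep, version = rest.rpartition("@")   # version follows the last "@"
    if not sep:                                 # no "@": the PURL has no version
        base, version = rest, ""
    qualifiers = dict(pair.split("=", 1) for pair in query.split("&") if "=" in pair)
    return base, version, qualifiers


def purl_matches(entry_purl: str, scanned_purl: str) -> bool:
    """True if the scanned PURL has the same base and all of the entry's qualifiers."""
    entry_base, _, entry_qualifiers = split_purl(entry_purl)
    scanned_base, _, scanned_qualifiers = split_purl(scanned_purl)
    if entry_base != scanned_base:
        return False
    return all(scanned_qualifiers.get(key) == value
               for key, value in entry_qualifiers.items())
```

With this rule, an entry PURL of pkg:rpm/rhel/dotnet-sdk-7.0?distro=rhel-8.7 would match a scanned pkg:rpm/rhel/dotnet-sdk-7.0@7.0.107-1.el8_7?arch=x86_64&distro=rhel-8.7 but not the same package reported with distro=rhel-8.8.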

identification.custom

The identification object's custom field is a JSON object that contains Name and Vendor, which may be used to identify a software. This is used for cases where a CPE is not available or the type is a software where PURL cannot be used. An example would be AWS EKS, which is a software, not a package, and where no CPE exists.

support field

The support field is a JSON array that is used to describe the end of life dates associated with a software or package. Vendors may provide support above their standard support levels for a fee and in exchange the customer is provided with a longer support period.

A support object has the fields support.level, support.eol, and support.eos. support.level identifies the support tier being described, for example Standard Support or Extended Support. support.eol is the date after which security updates are no longer made; it is mandatory, along with support.level. support.eos is the date after which bug fixes are no longer made; it is optional.

The support.level.name gives a pretty name to the support level from a vendor. This name should match the real name given by the vendor when possible. When a vendor has only one support level and gives it no name, "Standard" will be used.

The support.level.id is the support level as an increasing integer starting from 0. For example, RHEL has the support tiers "Standard Support" - 0, "Extended Update Support (EUS)" - 1, and "Enhanced Extended Update Support (Enhanced EUS)" - 2, with increasing support time for each level. This integer may be used in consumer app configuration, such as automated scanners, to set the level for a product without needing to understand the naming for each product.
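A scanner configured with a numeric support level might resolve the applicable EOL date as sketched below. This assumes the integer lives under the key id, as in the Project Schema, and that eol values are plain YYYY-MM-DD dates as in the examples. Falling back to the highest tier at or below the configured level, and treating the eol date itself as already EOL, are design assumptions rather than anything the RFC specifies.

```python
from datetime import date
from typing import Any, Dict, List, Optional


def eol_for_level(support: List[Dict[str, Any]], level_id: int) -> Optional[date]:
    """Return the eol date of the highest support tier not above level_id."""
    candidates = [tier for tier in support if tier["level"]["id"] <= level_id]
    if not candidates:
        return None
    best = max(candidates, key=lambda tier: tier["level"]["id"])
    return date.fromisoformat(best["eol"])


def is_eol(support: List[Dict[str, Any]], level_id: int, today: date) -> bool:
    """True if the configured tier's eol date has been reached."""
    eol = eol_for_level(support, level_id)
    return eol is not None and today >= eol
```

A user who pays for tier 1 on a product that only defines tier 0 would thus be evaluated against tier 0 instead of getting no answer.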

Examples

RHEL

Red Hat Enterprise Linux is an operating system that can be identified with a CPE.

{
  ...
  "entity": {
    "name": "Red Hat Enterprise Linux",
    "version": "9",
    "released": "2022-05-17",
    "type": "os",
    "vendor": "Red Hat"
  },
  "identification": [{
    "versions": [
      "9.*"
    ],
    "cpes": [
      "cpe:2.3:o:redhat:enterprise_linux",
      "cpe:/o:redhat:enterprise_linux"
    ]
  }],
  "support": [
    {
      "level": {
        "name": "Standard Support",
        "id": 0
      },
      "eol": "2023-07-31"
    },
    {
      "level": {
        "name": "Extended Update Support (EUS)",
        "id": 1
      },
      "eol": "2024-05-17"
    },
    {
      "level": {
        "name": "Enhanced Extended Update Support (Enhanced EUS)",
        "id": 2
      },
      "eol": "2026-05-17"
    }
  ]
}

RHEL .NET

Red Hat Enterprise Linux versions 8 & 9 have introduced "Application Streams": versions of user-space components that are delivered and updated more frequently than the core operating system packages. Each Application Stream component has a given lifecycle, either the same as the RHEL release or shorter.

In this example, we will use the .NET 7.0 software. We use the .NET 7.0 SDK package to identify the .NET 7.0 software. There are a couple of unique things to note here:

The .NET 7.0 PURL already has a version embedded in the name, so we use the * wildcard for the version.
The PURL has a qualifier for the distro, which is used to match the RHEL version. A new PURL should be created for each RHEL 8 minor release to ensure accurate matching.
There is only one support level for Application Streams, even though the user may have an extended support package for their RHEL subscription.

{
  ...
  "entity": {
    "name": ".NET",
    "version": "7.0",
    "released": "2022-11-30",
    "type": "software",
    "vendor": "Red Hat"
  },
  "identification": [{
    "versions": [
      "*"
    ],
    "purls": [
      "pkg:rpm/rhel/dotnet-sdk-7.0?distro=rhel-8.7",
      "pkg:rpm/rhel/dotnet-sdk-7.0?distro=rhel-8.8",
      "pkg:rpm/rhel/dotnet-sdk-7.0?distro=rhel-8.9",
      ...
    ]
  }],
  "support": [
    {
      "level": {
        "name": "Standard Support",
        "id": 0
      },
      "eol": "2024-05-31"
    }
  ]
}

Microsoft .NET

Microsoft has its own support for .NET. Please note that the PURLs here include some platform information (linux-arm64, linux-arm, linux-x64), so a new PURL should be created for each platform. We use the wildcard in the versions list to match any version like 7.0.1, 7.0.2, etc. We use the NETCore runtime package to identify .NET 7.0.

{
  ...
  "entity": {
    "name": ".NET",
    "version": "7",
    "released": "2022-11-22",
    "type": "software",
    "vendor": "Microsoft"
  },
  "identification": [{
    "versions": [
      "7.0.*"
    ],
    "purls": [
      "pkg:nuget/Microsoft.NETCore.App.Runtime.linux-arm64",
      "pkg:nuget/Microsoft.NETCore.App.Runtime.linux-arm",
      "pkg:nuget/Microsoft.NETCore.App.Runtime.linux-x64",
      ...
    ]
  }],
  "support": [
    {
      "level": {
        "name": "Standard Term Support (STS)",
        "id": 0
      },
      "eol": "2024-05-14"
    }
  ]
}

However, the Microsoft.NETCore.App.Runtime.linux-arm64 package might have its own lifecycle and need to be identified separately. Note that this is just an example: NuGet does not include the date when a package was marked as deprecated, so we currently have no way to set an EOL date.

{
  ...
  "entity": {
    "name": "Microsoft.NETCore.App.Runtime.linux-arm64",
    "version": "5.0.13",
    "released": "2021-12-14",
    "type": "package",
    "vendor": "Microsoft"
  },
  "identification": [{
    "versions": [
      "5.0.13"
    ],
    "purls": [
      "pkg:nuget/Microsoft.NETCore.App.Runtime.linux-arm64"
    ]
  }],
  "support": [
    {
      "level": {
        "name": "Standard Term Support (STS)",
        "id": 0
      },
      "eol": "2023-05-14"
    }
  ]
}

Google Kubernetes Engine

For Google Kubernetes Engine, we have neither a PURL nor a CPE to identify the software. We will identify by the product name and vendor.

{
  ...
  "entity": {
    "name": "Google Kubernetes Engine",
    "version": "1.27",
    "released": "2023-06-15",
    "type": "service",
    "vendor": "Google"
  },
  "identification": [{
    "versions": [
      "1.27.*"
    ],
    "custom": {
      "name": "Google Kubernetes Engine",
      "vendor": "Google"
    }
  }],
  "support": [
    {
      "level": {
        "name": "Standard",
        "id": 0
      },
      "eol": "2024-09-31"
    }
  ]
}

adriens · 2023-08-28T23:37:57Z


💭 This could be an opportunity to look at pydantic

0 replies

witchcraze · 2023-08-29T12:58:46Z


Thank you for the mention. I think you have covered my past comments.

If we had an EOL dataset like this together with a perfect asset dataset, matching EOL would be very easy.
I feel it would solve some recent special cases in my work:

various package providers (epel, ppa, elrepo, unknown...)
EOL change of application streams with no announcement
OpenJDK with unknown providers (Syft binary detector)
Licensed OracleJDK in rare situations

As I have not investigated PURL deeply, I do not know whether we can use PURL or not.
If anyone has some knowledge, I would like to hear it.

Does each scan tool show the same PURL?
Does software with various install methods show proper PURLs?

5 replies

noqcks Aug 29, 2023
Author

What are you using aside from PURL to match?

There is no guarantee that different tools will show the same PURL. We use syft in https://github.com/xeol-io/xeol to generate an SBOM with PURLs, but another SBOM generator may generate slightly different PURLs.

For most packages, generating the PURL is quite easy. For example, for https://www.nuget.org/packages/Microsoft.AspNetCore.Server.IIS/2.2.6 it would be pkg:nuget/Microsoft.AspNetCore.Server.IIS@2.2.6?qualifier=x. The biggest thing that might change is the qualifiers.

For a package with various install methods, let's say Redis: Syft can do binary matching, so the PURL generated for the binary would be like pkg:binary/redis, while the PURL for the deb installation would be pkg:deb/redis.

This is probably the biggest problem in ensuring no false negatives, since you may not appropriately cover the whole range of possible permutations of PURL for a package.

witchcraze Aug 30, 2023

Thank you for your comment.

What are you using aside from PURL to match?

As I do not have enough asset data in my company, I use command result data, package lists, OS data, license lists, and so on.
This data is not created by one particular tool, and the type gathered differs by platform.

In many cases, a similar (or the same) procedure is used (Infrastructure as Code), and EOL can be identified in my dataflow.
But sometimes software is installed with a rare method, and EOL is not identified or the results are strange. In that case, I check the operation ticket, procedure, or design document. Communicating directly with the host owner is the last resort.
Identifying whether something is an OS package is basically done by matching against a version (x.y.z) list (and using the information above for rare cases).

Recently I tried getting assets from a container platform with Syft.
For Syft SBOMs, I applied the same dataflow, and I confirmed that the matching process goes well in most cases (there are also rare strange cases).
As you know, the binary detector of Syft is powerful. I got many binary (not package-installed) items.
I also noticed that many frameworks are detected. I do not have such data in my current asset data. I felt the detection of Syft is good.

There is no guarantee that different tools will show the same PURL.

Ah, OK.

Redis

I checked roughly with syft.
I feel that identifying OS packages requires a lot of knowledge...

Ubuntu 22.04 + OS package
pkg:deb/ubuntu/redis-server@5:6.0.16-1ubuntu1?arch=amd64&upstream=redis&distro=ubuntu-22.04
Ubuntu 22.04 + Official repository
pkg:deb/ubuntu/redis-server@6:7.2.0-1rl1~jammy1?arch=amd64&upstream=redis&distro=ubuntu-22.04
Ubuntu 22.04 + ppa : “Redis Labs” team
pkg:deb/ubuntu/redis-server@6:7.2.0-1rl1~jammy1?arch=amd64&upstream=redis&distro=ubuntu-22.04
RockyLinux8 + Application stream 6.x
pkg:rpm/rocky/redis@6.2.7-1.module+el8.7.0+1105+8815ce78?arch=x86_64&upstream=redis-6.2.7-1.module+el8.7.0+1105+8815ce78.src.rpm&distro=rocky-8.8
RockyLinux8 + Remi repository 6.x
pkg:rpm/rocky/redis@6.0.20-1.el8.remi?arch=x86_64&upstream=redis-6.0.20-1.el8.remi.src.rpm&distro=rocky-8.8

terotil Sep 7, 2023

Would repology package name consolidation data help in any way in matching between different identifiers given by different scan tools? Or am I misunderstanding something?

witchcraze Sep 7, 2023

I did not know the site repology, but it seems very interesting.
I do not think it solves all cases immediately, but I see big potential.

From my rough check, OS packages seem to be registered. I think this is enough for checking EOL (OS EOL or upstream EOL).
If repology supports the remi repository and the redis official repository, the provider of redis packages will be identified in most cases.
https://repology.org/project/redis/packages
https://repology.org/repositories/statistics

In my understanding, identifying the package vendor/community will be an important topic in the scanner world.
Similar talk: anchore/syft#1607

Recently the focus has been on PURL, but the legacy method of matching by package name may still be useful.

witchcraze Feb 21, 2024

Recently, I have been trying to gather the OS packages for major software.
By matching against this data, I think I can judge whether installed packages are OS packages or not.

https://github.com/witchcraze/pkg_check

terotil · 2023-09-05T11:38:49Z


Absolutely lovely timing! I've been digging into component lifecycle management as part of my job, and on our side we keep circling back to the problem of matching collected estate data with EoL data.

A robust way to communicate evolution events is needed. It feels a bit sad (although totally understandable) that communicating vulnerabilities is pretty well established, but communicating component evolution/lifecycle is far from it.

Only partially related to the topic: how does an endoflife.date product name get decided? I couldn't find any guidelines, and the current names don't seem to consistently match any of the identifier (CPE or PURL) parts.

I see SWID has been a consideration, but not included, not even as optional. Is that by explicit choice?

1 reply

noqcks Sep 5, 2023
Author

No reason not to include SWID as another method of identification. But as mentioned, from personal experience SWID support across operating systems is spotty. For github.com/xeol-io/xeol we just use CPEs, since they're supported for all operating systems in SBOMs generated by https://github.com/anchore/syft

lfrancke · 2023-09-09T06:22:26Z


FYI, OASIS is in the process of defining a standard for EoL information as well.
The charter is up for comments now:
https://docs.google.com/document/u/0/d/1OnBJUjmiJduoPQS_es5DdJSf5H8sSXhKHg5v-v1zt6U/mobilebasic

But it requires payment to participate (https://www.oasis-open.org/join-a-tc/), so I believe it's worth having this discussion here anyway.

0 replies
