Discussion:
Storing package metadata in ELF objects
Luca Boccassi
2021-04-10 12:29:16 UTC
Hello,

Cross-posting to the mailing lists of a few relevant projects.

After an initial discussion [0], recently we have been working on a new
specification [1] to encode rich package-level metadata inside ELF
objects, so that it can be included automatically in generated coredump
files. The prototype to parse this in systemd-coredump and store the
information in systemd-journal is ready for testing and merged
upstream. We are now seeking further comments/opinions/suggestions, as
we have a few months before the next release and thus there's plenty of
time to make incompatible changes to the format and implementation, if
required.

A proposal to use this by default for all packages built in Fedora 35
has been submitted [2].

The Fedora Wiki and the systemd.io document have more details, but to
make a long story short, a new .note.package section with a JSON
payload will be included in ELF objects, encoding various package
build-time information like distro name & version, package name &
version, etc.
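
As a rough illustration of the idea (a sketch, not the reference implementation), the payload can be thought of as a small JSON document wrapped in a standard ELF note record: three 32-bit words (name size, descriptor size, type), then the owner name and the descriptor, each padded to 4 bytes. The owner string "FDO", the type value 0xcafe1a7e, the trailing NUL on the JSON, the little-endian byte order and the key names below are assumptions based on a reading of the systemd.io document and should be checked against it; the package values are made up.

```python
import json
import struct

# Assumed constants: verify against the systemd.io specification.
NOTE_OWNER = b"FDO\0"
NOTE_TYPE = 0xCAFE1A7E


def _pad4(data: bytes) -> bytes:
    """Pad data with NUL bytes up to the next 4-byte boundary."""
    return data + b"\0" * (-len(data) % 4)


def build_package_note(metadata: dict) -> bytes:
    """Wrap a metadata dict as JSON inside a little-endian ELF note record."""
    desc = json.dumps(metadata).encode("utf-8") + b"\0"
    header = struct.pack("<III", len(NOTE_OWNER), len(desc), NOTE_TYPE)
    return header + _pad4(NOTE_OWNER) + _pad4(desc)


def parse_package_note(blob: bytes) -> dict:
    """Decode a note record produced by build_package_note()."""
    namesz, descsz, note_type = struct.unpack_from("<III", blob, 0)
    name_start = 12
    desc_start = name_start + namesz + (-namesz % 4)
    owner = blob[name_start:name_start + namesz]
    if owner != NOTE_OWNER or note_type != NOTE_TYPE:
        raise ValueError("not a package metadata note")
    desc = blob[desc_start:desc_start + descsz]
    return json.loads(desc.rstrip(b"\0").decode("utf-8"))


# Hypothetical package values, for illustration only.
example = {
    "type": "rpm",
    "name": "coreutils",
    "version": "8.32-21.fc34",
    "architecture": "x86_64",
    "osCpe": "cpe:/o:fedoraproject:fedora:34",
}
note = build_package_note(example)
```

A consumer such as a coredump handler would locate the note in the ELF object first; that lookup is omitted here, and only the record layout is shown.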

To summarize from the discussion, the main reasons why we believe this
is useful are as follows:

1) minimal containers: the rpm database is not installed in the
containers. The information about build-ids needs to be stored
externally, so package name information is not available immediately,
but only after offline processing. The new note doesn't depend on the
rpm db in any way.

2) handling of a core from a container, where the container and host
have different distros

3) self-built and external packages: unless a lot of care is taken to
keep access to the debuginfo packages, this information may be lost.
The new note is available even if the repository metadata gets lost.
Users can easily provide equivalent information in a format that makes
sense in their own environment. It should work even when rpms and debs
and other formats are mixed, e.g. during container image creation.

Other than in Fedora, we are already making the required code changes
at Microsoft to use the same format&specification for internally-built
binaries, and for tools that parse core files and logs.

Tools for RPM and DEB (debhelper) integration are also available [3].
--
Kind regards,
Luca Boccassi

[0] https://github.com/systemd/systemd/issues/18433
[1] https://systemd.io/COREDUMP_PACKAGE_METADATA/
[2] https://fedoraproject.org/wiki/Changes/Package_information_on_ELF_objects
[3] https://github.com/systemd/package-notes
Luca Boccassi
2021-04-10 12:38:31 UTC
Wrong Fedora list address - off to a great start already :-)
(fixed now)
--
Kind regards,
Luca Boccassi
Zbigniew Jędrzejewski-Szmek
2021-04-10 18:44:07 UTC
[I'm forwarding the mail from Luca who is not subscribed to fedora-devel]

On Sat, Apr 10, 2021 at 01:38:31PM +0100, Luca Boccassi wrote:

Luca Boccassi
2021-05-04 13:43:32 UTC
Hi,
Post by Zbigniew Jędrzejewski-Szmek
[I'm forwarding the mail from Luca who is not subscribed to fedora-devel]
Cross-posting to the mailing lists of a few relevant projects.
Note that in this version of the email the [N] references in your email
don't seem to point anywhere. I found an older variant of the same
[0] https://github.com/systemd/systemd/issues/18433
[1] https://systemd.io/COREDUMP_PACKAGE_METADATA/
[2] https://fedoraproject.org/wiki/Changes/Package_information_on_ELF_objects
[3] https://github.com/systemd/package-notes
Sorry about that! Must have messed up the copy&paste.
Is there a list of default keys (and their canonical spelling: upper
case, lower case, Camel_Case, etc.)? If there is, could we have a
"debuginfod" key whose value is the URL of the debuginfod server where
the executable, debuginfo and sources for the embedded build-id can be
found?
https://sourceware.org/elfutils/Debuginfod.html
The "Implementation" section of the spec lists the "main" fields:

https://systemd.io/COREDUMP_PACKAGE_METADATA/

(source for that is https://github.com/systemd/systemd/blob/main/docs/COREDUMP_PACKAGE_METADATA.md )

Would you like to send a PR to update it and add that field?
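
To make the request concrete, here is a minimal sketch of how a tool could combine such a server URL with a build-id. It relies on the debuginfod web API paths /buildid/BUILDID/executable, /buildid/BUILDID/debuginfo and /buildid/BUILDID/source/PATH; the function name, the example server and the build-id are hypothetical, and the final key name in the metadata may differ from "debuginfod".

```python
from urllib.parse import quote


def debuginfod_urls(server, build_id, source_path=None):
    """Return the debuginfod request URLs for one build-id.

    server is the base URL read from the metadata, build_id is the hex
    build-id of the ELF object, source_path is an optional absolute path
    of a source file.
    """
    base = server.rstrip("/")
    bid = build_id.lower()
    if not bid or any(c not in "0123456789abcdef" for c in bid):
        raise ValueError("build-id must be a non-empty hex string")
    urls = {
        "executable": f"{base}/buildid/{bid}/executable",
        "debuginfo": f"{base}/buildid/{bid}/debuginfo",
    }
    if source_path is not None:
        if not source_path.startswith("/"):
            raise ValueError("source path must be absolute")
        # The absolute path is appended directly after "/source".
        urls["source"] = f"{base}/buildid/{bid}/source{quote(source_path)}"
    return urls


# Hypothetical server and build-id, for illustration only.
urls = debuginfod_urls(
    "https://debuginfod.example.org/",
    "ABCDEF0123456789",
    source_path="/usr/src/debug/foo/main.c",
)
```

Keeping only the base URL in the metadata leaves the choice of artifact (executable, debuginfo or a source file) to the consumer.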
--
Kind regards,
Luca Boccassi
Luca Boccassi
2021-05-06 11:43:13 UTC
Hi Luca,
Post by Luca Boccassi
Is there a list of default keys (and their canonical spelling, upper-
lower-Camel_Case, etc.)? If there is, could we have a "debuginfod" key
with as value an URL pointing to the debuginfod server URL where the
embedded build-id executable, debuginfo and sources can be found?
https://sourceware.org/elfutils/Debuginfod.html
https://systemd.io/COREDUMP_PACKAGE_METADATA/
(source for that is https://github.com/systemd/systemd/blob/main/docs/COREDUMP_PACKAGE_METADATA.md )
Would you like to send a PR to update it and add that field?
Sorry, I don't have a github account. But attached is a patch to
document it and one for the package-notes generator to add a
--debuginfod argument (maybe the distro should set a default value for
that?). Hopefully those patches could be applied somehow.
Hi,

Thanks, opened PRs with your commits:

https://github.com/systemd/systemd/pull/19523
https://github.com/systemd/package-notes/pull/8

Yes, if the distro has a debuginfod server, it should definitely be
included.
--
Kind regards,
Luca Boccassi