Rework of Reference Hints #1213

Open
mandelsoft opened this issue Dec 27, 2024 · 1 comment · May be fixed by #1212
Labels
area/ipcei Important Project of Common European Interest kind/feature new feature, enhancement, improvement, extension

Comments

@mandelsoft

mandelsoft commented Dec 27, 2024

ADR-0003 - Rework of Reference Hints

Meaning of Reference Hints

During the transport of software artifacts referenced from external artifact repositories like
OCI registries, they might be stored as blobs along with the component version (access method
localBlob). If those component versions are transported again into a repository landscape they
might be uploaded again to external repositories.

To provide useful identities for storing those artifacts in external repositories again,
the original identity of the external artifact must be stored along with the local blob.

Current Solution

The access method originally used to reference the external artifact provides a reference hint,
which can later be used by the blob uploaders to reconstruct a useful identity.
Therefore, the localBlob access method is able to keep track of this hint value.
The hint is just a string, which needs to be interpreted by the uploader.

Problems with Current Solution

The assumption behind the current solution is that the uploader will always upload the
artifact into a similar repository, again. Therefore, there would be a one-to-one relation
between access method and uploader.

Unfortunately this is not true in all cases:

  • There are access methods now (like wget), which are able to handle any kind of artifact blob
    with different natural repository types and identity schemes.
  • Therefore,
    • such an access method can no longer provide an implicit reference hint,
    • nor is there a one-to-one relation to a typed uploader anymore.
  • Artifacts might be uploadable to different repository types using different
    identity schemes.

This problem is partly covered by allowing to specify a hint along with those access methods
similar to the localBlob access method. But this can only be a workaround, because

  • the hint is not typed, and potential target repositories might use different identity schemes
  • only a single hint can be stored.

Proposed Solution

Hints have to be typed, so that uploaders know which identities are provided, how the
hint string has to be interpreted, and which hint to use. Additionally,
it must be possible to store multiple hints for an artifact to support multiple possible upload
repository types.

To stay compatible, a serialization format is defined for a list of typed hints, which maps such
a list to a single string.

The library provides a new type ReferenceHint, which provides access to
a formal hint, by providing access to a set of string attributes. The semantic type is a string map
of strings. There are three predefined attributes:

  • type: the formal type of the hint (may be empty to cover old component versions).
    The types are defined like the resource types. The following types are defined for now:
    • oci: OCI identity given by the attribute reference with the currently used format
    • maven: Maven identity (GAV) given by the attribute reference with the currently used format
    • npm: Node package identity given by the attribute reference with the currently used
      format
  • reference: the standard attribute to hold a string representation for the identity.
  • implicit: the value true indicates an implicit hint (as used today) provided by an access method.

Typeless former hints are represented by the sole attribute reference.
New Hint types may use other attributes.
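
The description above can be read as the following minimal Go sketch. Only the name ReferenceHint and the attribute names type, reference and implicit come from the text; the constants, accessor methods and example values are assumptions made for illustration and are not the library's actual API.

```go
package main

import "fmt"

// Attribute names taken from the ADR text.
const (
	AttrType      = "type"
	AttrReference = "reference"
	AttrImplicit  = "implicit"
)

// ReferenceHint is sketched as a string map of strings, as described
// in the ADR. The accessor methods below are hypothetical.
type ReferenceHint map[string]string

// Type returns the formal hint type; it is empty for old, typeless hints.
func (h ReferenceHint) Type() string { return h[AttrType] }

// Reference returns the string representation of the identity.
func (h ReferenceHint) Reference() string { return h[AttrReference] }

// IsImplicit reports whether the hint is marked as implicit.
func (h ReferenceHint) IsImplicit() bool { return h[AttrImplicit] == "true" }

func main() {
	// Example values are made up for illustration.
	typed := ReferenceHint{AttrType: "oci", AttrReference: "ghcr.io/acme/app:1.0.0"}
	legacy := ReferenceHint{AttrReference: "acme/app:1.0.0"}

	fmt.Println(typed.Type(), typed.Reference(), typed.IsImplicit())
	fmt.Println(legacy.Type() == "", legacy.Reference())
}
```

A typeless former hint is simply a map with the sole attribute reference, so no separate representation is needed for old component versions.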

Access Methods

An access method can provide (and store) implicit hints as before. Those hints are indicated
to be implicit. When composing an access method it is only allowed to store implicit hints.
This is completely compatible with the current solution.

Additionally, multiple hints can be stored and delivered.

To support any kind of hint for any scenario, the artifact metadata (resources and sources)
is extended to store explicit hints, which will be part of the normalized form.
This is done by an additional attribute referenceHints. It is a list of string maps
holding the hint attributes (including the hint type).
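
To make the shape of the new attribute more concrete, here is a hedged Go sketch that encodes a reduced artifact metadata record with a referenceHints list. Only the attribute name referenceHints and its shape (a list of string maps) come from the text; the ResourceMeta struct, the Encode helper and all values are hypothetical stand-ins and do not reproduce the real OCM descriptor types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ReferenceHint is sketched as a string map of strings.
type ReferenceHint map[string]string

// ResourceMeta is a hypothetical, heavily reduced stand-in for the
// artifact metadata of a resource or source.
type ResourceMeta struct {
	Name           string          `json:"name"`
	Version        string          `json:"version"`
	Type           string          `json:"type"`
	ReferenceHints []ReferenceHint `json:"referenceHints,omitempty"`
}

// Encode renders the metadata as compact JSON.
func Encode(res ResourceMeta) (string, error) {
	data, err := json.Marshal(res)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	res := ResourceMeta{
		Name:    "app",
		Version: "1.0.0",
		Type:    "ociImage",
		ReferenceHints: []ReferenceHint{
			{"type": "oci", "reference": "ghcr.io/acme/app:1.0.0"},
		},
	}
	out, err := Encode(res)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Because the explicit hints are part of the normalized form, adding or changing them alters what gets signed, which is the reason for the verification incompatibility described later in this ADR.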

Uploaders

Uploaders are called with the aggregation of explicit (from metadata) and implicit (from
access methods) hints. Hereby, the explicit hints have precedence.
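
The aggregation could look like the following sketch. The text only states that explicit hints have precedence; this example assumes that precedence is decided per hint type, so an implicit hint is dropped when an explicit hint of the same type exists. The function name AggregateHints is hypothetical.

```go
package main

import "fmt"

// ReferenceHint is sketched as a string map of strings.
type ReferenceHint map[string]string

// AggregateHints returns all explicit hints first, followed by those
// implicit hints whose type is not yet covered.
// Assumption: "precedence" is interpreted per hint type.
func AggregateHints(explicit, implicit []ReferenceHint) []ReferenceHint {
	seen := map[string]bool{}
	var result []ReferenceHint
	for _, h := range explicit {
		seen[h["type"]] = true
		result = append(result, h)
	}
	for _, h := range implicit {
		if !seen[h["type"]] {
			seen[h["type"]] = true
			result = append(result, h)
		}
	}
	return result
}

func main() {
	explicit := []ReferenceHint{
		{"type": "oci", "reference": "ghcr.io/acme/app:1.0.0"},
	}
	implicit := []ReferenceHint{
		{"type": "oci", "reference": "acme/app:1.0.0", "implicit": "true"},
		{"type": "npm", "reference": "app@1.0.0", "implicit": "true"},
	}
	for _, h := range AggregateHints(explicit, implicit) {
		fmt.Println(h["type"], h["reference"])
	}
}
```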

If an uploader creates a local access specification, only implicit hints may be stored there.

There is a new composition option (--refhint) now for composing resources
and sources for the CLI. It accepts an attribute setting. Multiple such options starting with the type attribute are used to compose a single hint.
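
The grouping rule for repeated --refhint options can be sketched as follows. This is not the CLI's implementation: the function ComposeHints, the name=value syntax of an attribute setting, and the choice to reject attributes given before any type setting are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// ReferenceHint is sketched as a string map of strings.
type ReferenceHint map[string]string

// ComposeHints groups attribute settings of the assumed form
// "name=value" into hints. Every "type=..." setting starts a new hint;
// the settings that follow are added to that hint.
func ComposeHints(settings []string) ([]ReferenceHint, error) {
	var hints []ReferenceHint
	for _, s := range settings {
		name, value, ok := strings.Cut(s, "=")
		if !ok || name == "" {
			return nil, fmt.Errorf("invalid attribute setting %q", s)
		}
		if name == "type" {
			hints = append(hints, ReferenceHint{})
		} else if len(hints) == 0 {
			// Assumption: a hint must start with its type attribute.
			return nil, fmt.Errorf("attribute %q given before any type", name)
		}
		hints[len(hints)-1][name] = value
	}
	return hints, nil
}

func main() {
	// Values as they might arrive from repeated --refhint options.
	hints, err := ComposeHints([]string{
		"type=oci", "reference=ghcr.io/acme/app:1.0.0",
		"type=maven", "reference=com.acme:app:1.0.0",
	})
	if err != nil {
		panic(err)
	}
	for _, h := range hints {
		fmt.Println(h["type"], h["reference"])
	}
}
```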

Inputs

Inputs may now provide explicit or implicit hints. All file-based inputs now allow implicit hints to be specified as before.
The implicit hints (if not conflicting with explicit hints) are stored in localBlob
access methods. The explicit hints are used as defaults for the explicit artifact hints.

Hints used in a component version must be unique. This check is extended to consider implicit
and explicit hints provided by inputs, access methods and artifact hints.

Hints may either be specified as a list of string maps (canonical form) or as a string using the serialized form.

Hint Serialization Format

In general a hint is serialized to the following string

[<type>::]<attribute>=<value>{,<attribute>=<value>}

The type is not serialized as attribute, but as prefix separated by a ::. The implicit attribute is never serialized if the string is stored in an access specification.
If no type is known the type part is omitted.

A list of hints is serialized to

<hint>{;<hint>}

Attribute names consist of alphanumeric characters only.
A value may not contain a ::. If it contains a semicolon (;), a comma (,) or a double quote (")
character, it must be given in double quotes.
In the double-quoted form any " or \ character has to be escaped by
a preceding \ character.

To be as compatible as possible, a single attribute hint with the attribute
reference is serialized as naked value (as before) if there are no special
characters enforcing a quoted form.
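
The serialization rules of this section can be sketched in Go as shown below. The function names, the forAccessSpec flag and the alphabetical attribute order are assumptions for illustration; the text does not define an attribute order. The sketch also assumes that the naked form applies only to typeless hints with the sole attribute reference, and it does not reject values containing ::.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ReferenceHint is sketched as a string map of strings.
type ReferenceHint map[string]string

// quoteValue puts a value into double quotes if it contains ';', ','
// or '"', escaping '"' and '\' with a preceding '\'.
func quoteValue(v string) string {
	if !strings.ContainsAny(v, ";,\"") {
		return v
	}
	r := strings.NewReplacer(`\`, `\\`, `"`, `\"`)
	return `"` + r.Replace(v) + `"`
}

// SerializeHint renders one hint as [<type>::]<attr>=<value>{,<attr>=<value>}.
// With forAccessSpec set, the implicit attribute is not serialized.
func SerializeHint(h ReferenceHint, forAccessSpec bool) string {
	var keys []string
	for k := range h {
		if k == "type" || (forAccessSpec && k == "implicit") {
			continue
		}
		keys = append(keys, k)
	}
	sort.Strings(keys) // assumption: a stable, alphabetical order

	typ := h["type"]
	ref := h["reference"]
	// Naked legacy form: typeless, sole attribute reference, no quoting needed.
	if typ == "" && len(keys) == 1 && keys[0] == "reference" && quoteValue(ref) == ref {
		return ref
	}
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+quoteValue(h[k]))
	}
	s := strings.Join(parts, ",")
	if typ != "" {
		s = typ + "::" + s
	}
	return s
}

// SerializeHints renders a hint list as <hint>{;<hint>}.
func SerializeHints(hints []ReferenceHint, forAccessSpec bool) string {
	parts := make([]string, 0, len(hints))
	for _, h := range hints {
		parts = append(parts, SerializeHint(h, forAccessSpec))
	}
	return strings.Join(parts, ";")
}

func main() {
	hints := []ReferenceHint{
		{"type": "oci", "reference": "ghcr.io/acme/app:1.0.0", "implicit": "true"},
		{"reference": "acme/app:1.0.0"},
	}
	fmt.Println(SerializeHints(hints, true))
}
```

Keeping the naked form for old-style hints means that a component version containing only legacy hints serializes to the same string as before.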

Incompatible Changes:

Component Version Representation

  • The new library always creates typed hints for new elements. Old hints are
    left as they are. This means that old versions of the OCM tooling
    cannot work correctly with component versions with persisted hints in
    access specifications.
  • If explicit hints are created, they are not observed by old tool versions.
    Those component versions cannot be verified by an older tool version.

OCM Library

  • The SetResourceBlob and SetSourceBlob API methods now accept
    a hint specification instead of a string. To be as compatible as possible,
    it still accepts a string (as before), which is mapped by a hint parser to a hint list.
    Therefore, the technical type is interface{}, which accepts
    various effective types (single hints, string and hint lists).

  • Uploaders provided by a plugin now get a serialized hint list
    instead of a simple untyped reference format.

  • There are new options for creating resource/source access objects.
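
How an interface{} parameter can accept a string, a single hint or a hint list, as described for SetResourceBlob and SetSourceBlob, is sketched below. NormalizeHints and parseHints are hypothetical names, not library functions, and the parser is deliberately simplified: it does not handle double-quoted values.

```go
package main

import (
	"fmt"
	"strings"
)

// ReferenceHint is sketched as a string map of strings.
type ReferenceHint map[string]string

// parseHints is a simplified parser for the serialized hint list.
// Limitation of this sketch: double-quoted values are not supported.
func parseHints(s string) []ReferenceHint {
	var hints []ReferenceHint
	for _, part := range strings.Split(s, ";") {
		if part == "" {
			continue
		}
		h := ReferenceHint{}
		if typ, rest, ok := strings.Cut(part, "::"); ok {
			h["type"] = typ
			part = rest
		}
		for _, attr := range strings.Split(part, ",") {
			if name, value, ok := strings.Cut(attr, "="); ok {
				h[name] = value
			} else {
				h["reference"] = attr // naked legacy value
			}
		}
		hints = append(hints, h)
	}
	return hints
}

// NormalizeHints maps the accepted effective types to a hint list.
func NormalizeHints(spec interface{}) ([]ReferenceHint, error) {
	switch v := spec.(type) {
	case nil:
		return nil, nil
	case string:
		return parseHints(v), nil
	case ReferenceHint:
		return []ReferenceHint{v}, nil
	case []ReferenceHint:
		return v, nil
	default:
		return nil, fmt.Errorf("unsupported hint specification type %T", spec)
	}
}

func main() {
	for _, spec := range []interface{}{
		"acme/app:1.0.0",
		"oci::reference=ghcr.io/acme/app:1.0.0",
		ReferenceHint{"type": "npm", "reference": "app@1.0.0"},
	} {
		hints, err := NormalizeHints(spec)
		fmt.Println(hints, err)
	}
}
```

The type switch keeps existing callers that pass a plain string working, at the cost of moving type errors from compile time to run time.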

@mandelsoft mandelsoft added the kind/feature new feature, enhancement, improvement, extension label Dec 27, 2024
@github-actions github-actions bot added the area/ipcei Important Project of Common European Interest label Dec 27, 2024
@mandelsoft mandelsoft linked a pull request Dec 27, 2024 that will close this issue
@jakobmoellerdev

IMO This Rework tries to force in typed configurations into hints that should not be there in the first place. The plugins were designed in a way in which the hints are needed a lot, but I believe this is fundamentally flawed. Instead we should rethink how an identity is generated.

I believe we have other problems with uploaders too, for example that they do not get to touch the digests in the access after they return a new access spec.

IMO the whole system is too static and we need to rework it. An access should not only be defined by its access spec, but also by the entire resource alongside it (including digests, labels, etc.)

Just reworking a string to accept an arbitrary JSON shows that we hit the limit here too.

I think, considering the gravity of this issue, I vote to accept the issue as a problem as is, freeze it and think of a new system that is more flexible.
