initial MMP protocol draft #899

Draft. Wants to merge 7 commits into base: main. Changes from 4 commits.
129 changes: 129 additions & 0 deletions zips/zip-tbd1.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,129 @@
ZIP: Unassigned
Title: Media Memos Protocol (MMP)
Owners: Kyle Den Hartog <[email protected]>
Status: Draft
Category: Wallet
Created: 2024-08-27
License: MPL 2.0
Pull-Request: <https://github.com/zcash/zips/pull/899>


# Abstract

This specification defines how to include an on-chain pointer to an off-chain rich media message that is too large for the 512-byte memo field, while maintaining the confidentiality guarantees of the memos defined in [^ZIP-0302].

# Motivation

Today it's not possible to send a message larger than 512 bytes in a memo. This ZIP defines how larger messages can be sent and received while still maintaining confidentiality between the two communicating parties, without storing all of the data on chain. This allows for use cases such as including a PDF invoice with a transaction, or sending large personalized messages or videos directly to a Zcash address.

# Terminology

The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", "MAY",
"RECOMMENDED", "OPTIONAL", and "REQUIRED" in this document are to
be interpreted as described in BCP 14 [^BCP14] when, and only when,
they appear in all capitals.

Definitions of terms used throughout this ZIP:
- Scheme - The URI scheme that identifies an MMP message. It SHALL match the definition of a scheme in [^RFC-3986] section 3.1.
- Version - The version of the Media Memos Protocol in use. It exists to support extensions of the protocol.
- Message Location - A URL (or derived URL as defined by a separate version spec) which can be used to locate the ciphertext of the message.
- Query Parameters - An extensible mechanism for including additional parameters directly within the MMP message URI. It SHALL match the definition of a query in [^RFC-3986] section 3.4.

# Requirements

This spec should allow Zcash users to include rich media messages in their transactions through the use of memos as defined in [^ZIP-0302]. Additionally, to maintain cryptographic agility and account for different means of sending messages, it should be extensible by design while still maintaining interoperability.

# Specification

## Message URI syntax

The current memo ZIP [^ZIP-0302] limits the memo content to 512 bytes, which is enough to hold a URI containing the details needed to locate and decrypt the message. The format of the pointer is as follows:

## ABNF definition of message URI syntax
mmp-uri = scheme ":" version ":" message-location query-params
mmp-uri = /1*512VCHAR; Limit to 512 ASCII VCHARs
Contributor

Is VCHAR the correct choice here? Should this be a smaller character set? Perhaps pchar / "?"

Author

To avoid parsing issues with URI libraries, it does make sense to keep this URI safe. However, a version needs maximum extensibility, including the freedom to use the various reserved characters, so I'm thinking it makes sense to make this unreserved / reserved. I think that would be correct here: any URI-safe character can be used as long as the total length is less than 512 characters.

Does that sound good to you?

Contributor

The important thing is that parsing ambiguities aren't introduced here. Also, since this isn't valid ABNF (to have two productions with different constraints) there should be some other mechanism used to express this constraint.

Author

@kdenhartog Oct 2, 2024

> The important thing is that parsing ambiguities aren't introduced here.

Agreed, will make sure that this is complying with the URI spec.

> Also, since this isn't valid ABNF (to have two productions with different constraints) there should be some other mechanism used to express this constraint.

I believe this would be valid ABNF under: https://datatracker.ietf.org/doc/html/rfc5234#section-3.3

The reason I used this method was to make it clear the maximum length of the entire URI is meant to be 512 characters. If there's a way to do that within a single rule, I'm happy to change it but with the variability of length for the message-location and query-params sections it wasn't clear to me how to do it other than this way.

scheme = "mmp"
version = 1*2DIGIT
message-location = 1*256VCHAR
query-params = "?" encrypt-key-param "&" param *("&" param)
param = param-name "=" param-value
param-name = 1*ALPHA
param-value = 1*VCHAR

## Example of message URI

mmp:v1:bafybeihdwdcefgh4dqkjv67uzcmw7ojee6xedzdetojuzjevtenxquvyku?ttl=2024-07-02T14:30:00Z&key=Hy9X_k2mLpQrZtNbVc5hA7sDxEuFoP-iQnWyG4M6OjBv
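
As a rough illustration of how a wallet might take such a URI apart, here is a minimal Python sketch. It is not part of the draft: the names `MmpUri` and `parse_mmp_uri` are hypothetical, it follows the shape of the example above (which writes the version as `v1`, whereas the ABNF currently specifies digits only), and it does not enforce parameter order or any version-specific rules.

```python
from dataclasses import dataclass

MAX_URI_LENGTH = 512  # overall length limit stated in this draft


@dataclass
class MmpUri:
    """Hypothetical container for the parts of an MMP URI."""
    version: str
    location: str
    params: dict


def parse_mmp_uri(uri: str) -> MmpUri:
    """Split an MMP URI into version, message location, and query parameters."""
    if len(uri) > MAX_URI_LENGTH or not uri.isascii():
        raise ValueError("URI must be at most 512 ASCII characters")
    # Split only on the first two colons so that colons inside
    # parameter values (e.g. timestamps) are left untouched.
    parts = uri.split(":", 2)
    if len(parts) != 3 or parts[0] != "mmp":
        raise ValueError("expected mmp:<version>:<location>[?params]")
    _, version, rest = parts
    location, _, query = rest.partition("?")
    if not version or not location:
        raise ValueError("version and message location are required")
    params = {}
    for pair in filter(None, query.split("&")):
        name, sep, value = pair.partition("=")
        if not sep or not name or not value:
            raise ValueError(f"malformed query parameter: {pair!r}")
        params[name] = value
    return MmpUri(version, location, params)
```

Calling `parse_mmp_uri` on the example URI above would yield version `v1`, the CID as the location, and a dictionary holding the `ttl` and `key` values.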

## Protocol Versioning

Since the version number determines the associated properties of the protocol (including cryptographic properties, to support cryptographic agility), different versions can have a wide range of capabilities that might not be interoperable. As such, all valid implementations of this protocol MUST implement at least version 1 to ensure a fallback for messaging.

## Message Location

The message location is a URL used to locate the encrypted ciphertext of the media, which is stored at that location. It MUST be defined by each protocol version specification. For example, in version 1 it is defined as a CID that can be used to get the message from IPFS, but in future iterations it could be a URL that points to an S3 bucket, a torrent file, or elsewhere. It MAY contain additional details about how a path as defined by [^RFC-3986] can be used if a version wants to support this. If the URL contains additional information such as a different scheme (e.g. `https://`), the protocol version specification MUST define how it is encoded, since this URI

## Query Parameters

All query parameters are optional and MAY be included. Any required query parameters MUST be defined by the version specification. The exception is the key parameter, which is defined in this document because of its expected ubiquitous use across versions.

### Encryption Key Query Parameter
In all versions of the Media Memos Protocol that require passing a secret key between sender and recipients, the key parameter MUST be used to do so. If used, the key parameter MUST be the first query parameter, so it follows directly after the `?` symbol. Reusing this query parameter across versions will ensure the greatest possible interoperability.

### ABNF for Encryption Key Query Parameter
encrypt-key-param = "key=" 1*64VCHAR
Contributor

This doesn't define the ttl field that you show in your example; under the current ABNF, it would need to be part of the encrypt-key-param production, which is required to appear first. Instead of defining an explicit initial encrypt-key-param production that must appear before other query parameter pairs, permit an arbitrary order and then just require the key and potentially ttl (if intended) keys be present in prose.

Contributor

Also: the key should appear in a fragment, not as a query parameter, as it should NOT be sent to the server (if processed by a naive parser.)

Author

The various options I considered here were:

  1. Add an additional : delimited section after version, but before the message location, that is the key. This becomes weird if someone opts to not have a key altogether in a version. This could be solved with an empty section that produces :: if we want, and it saves a few bytes.
  2. Reserve use of fragments for passing the key only and prevent usage of other fragments from being used
  3. query parameter specific so that it can be dropped or added as needed.

The reason I opted to go with the query parameter option is that I expected that the parsers would handle this fine, but as soon as someone passed this to a URL resolver it would fail in which case it wouldn't make a network call at all. The one edge case that I can think of here is the registration of custom scheme handlers in operating systems and some browsers which may end up passing the URI around locally on device, but would likely still fail to resolve them in which case I couldn't see a reason that the query parameter didn't make sense.

However, in looking into the behavior of fragments a bit more for different media types, it does look possible to define multiple such as the example for PDFs.

In this case, I'm convinced that using fragments is the better option here but will keep the syntax matching what's described (well with some modifications to the characters that can be used as the questions above around VCHAR apply here too).

Author

Also, I opted to just drop the ttl example from this spec. Instead it seems more relevant that this is defined in the actual v1 spec.

## Protocol flow

To respond to a message, it's as simple as sending a new message. There isn't a concept of threading built into the protocol itself, although this could be handled at a higher layer if desired. In general, messaging should be assumed to follow a “broadcast and forget” model: once a message has been published, it’s not necessary to maintain threading or any further state about message order. More complex threading, protocol features, and error handling should be defined in a new message format spec. However, this protocol does support some basic error handling so that a recipient can inform the counterparty that a message could not be received for some reason.


## Basic Error Handling

While version-specific methods of error handling can be defined, the default error codes below MUST be understood by every implementation. Using these rather than protocol-specific definitions will allow for more interoperable support. Further error codes can be defined in a protocol version document, in which case a version should be included in the definition. Error codes defined by an MMP Version Specification MUST NOT override the semantics or values of these error codes.

### Default Error Codes
The default error codes are defined as follows:

- Unsupported Version: `0001`
  - Please resend using version 1. Version 1 is specifically chosen here to avoid needing a negotiation protocol to determine a supported version.
- Unable to Retrieve: `0100`
  - This should include a TTL parameter that the recipient would prefer. The included TTL is only a recommended time: if the sender resends the message, they MAY use a different TTL value. Doing so is advisable in order to avoid resource exhaustion attacks, where the recipient sends many errors to try to get the sender to re-encrypt and re-pin the same message multiple times.
- Unable to Retrieve - decryption error: `0101`
  - This error code MAY be returned when the recipient attempts to decrypt a message but the decryption process fails. This typically occurs in two scenarios:
    - The provided decryption key is incorrect.
    - The message has been tampered with, causing the authentication check to fail.
- Unable to Retrieve - content unavailable: `0102`
  - The recipient attempted to retrieve the encrypted message and was unable to. This may mean that the recipient was unable to parse the URL or retrieve the message from it, for example because the message was no longer pinned or because they received a 404 on pickup. It may also mean that the protocol needed to pick the message up is not supported. For example, if the URL provided uses the .onion TLD and the Tor protocol isn’t supported, the recipient would not be able to retrieve the message.
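
One way an implementation might represent these codes internally is sketched below in Python. The draft does not yet say how an error code is carried back to the sender, so this only models the four values listed above; the names `MmpError` and `is_retrieval_error` are hypothetical. Codes are kept as strings so that leading zeros are preserved.

```python
from enum import Enum


class MmpError(Enum):
    """Default MMP error codes as listed in this draft."""
    UNSUPPORTED_VERSION = "0001"
    UNABLE_TO_RETRIEVE = "0100"
    DECRYPTION_ERROR = "0101"
    CONTENT_UNAVAILABLE = "0102"


# The generic "Unable to Retrieve" code and its two more specific variants.
_RETRIEVAL_ERRORS = {
    MmpError.UNABLE_TO_RETRIEVE,
    MmpError.DECRYPTION_ERROR,
    MmpError.CONTENT_UNAVAILABLE,
}


def is_retrieval_error(code: str) -> bool:
    """Return True if the code is one of the 'Unable to Retrieve' codes.

    Raises ValueError for a code that is not one of the defaults.
    """
    return MmpError(code) in _RETRIEVAL_ERRORS
```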

# Security and Privacy Considerations

## Encryption Key Message Security

This still requires further review, but in general we should be able to rely on the confidentiality guarantees of the memo field to securely send a symmetric key without needing a separate key agreement protocol. This section should be updated with greater detail before finalization.

## Protocol Version Specifications

Since the cryptographic and protocol security will be defined within separate protocol specification documents, these MUST include further security and privacy consideration sections which highlight specific tradeoffs that have been made. It’s generally expected that these sections will follow the guidelines of [^ZIP-0000] which suggests the usage of [^RFC-3552] and [^RFC-6973] as a starting point.

# Protocol Versions Registry

To support additional protocol versions and make it easy for implementers to find the documentation of a protocol version, we’ll establish a first-come, first-served registration process. The version number MUST be the next integer available in the registry table. Minor or patch versions won't be used. The requirements to register a new version are the following:

1. It MUST be published as a ZIP and have reached “Final” Status.
2. Once the ZIP has been finalized, a version identifier can be obtained by registering it with a table entry. The table entry should include the ZIP number defining the version, the ZIP’s title, the next version number, and the date of registration. The
3. Once this is done, the version number MUST NOT be changed or reused.

## Registry Table
| MMP Version | ZIP Number | Title | Registration Date |
| --- | --- | --- | --- |
| 1 | [^ZIP-tbd2] | Media Memos Protocol (MMP) Version 1 | 2024-08-27 |
| ... | ... | ... | ... |

# References

[^BCP14]: [Information on BCP 14 — "RFC 2119: Key words for use in RFCs to Indicate Requirement Levels" and "RFC 8174: Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words"](https://www.rfc-editor.org/info/bcp14)
[^ZIP-0302]: [ZIP 302: Standardized Memo Field Format](zip-0302.rst)
[^RFC-3986]: [Uniform Resource Identifier (URI): Generic Syntax](https://datatracker.ietf.org/doc/html/rfc3986)
[^ZIP-0000]: [ZIP 0: ZIP Process](zip-0000.rst)
[^RFC-3552]: [Guidelines for Writing RFC Text on Security Considerations](https://datatracker.ietf.org/doc/html/rfc3552)
[^RFC-6973]: [Privacy Considerations for Internet Protocols](https://datatracker.ietf.org/doc/html/rfc6973)
[^ZIP-tbd2]: [ZIP TBD2: Media Memos Protocol (MMP) Version 1](zip-tbd2.md)
96 changes: 96 additions & 0 deletions zips/zip-tbd2.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,96 @@
ZIP: Unassigned
Title: Media Memos Protocol (MMP) Version 1
Owners: [email protected]
Status: Draft
Category: Wallet
Created: 2024-08-27
License: MPL 2.0
Pull-Request: <https://github.com/zcash/zips/pull/899>
# Abstract

This is the first version of the [^zip-tbd1] specification. It’s intended to be a basic implementation that serves as a first iteration to build on, and as a fallback protocol in the future. As such, there may be certain capabilities or features that are intentionally excluded and that can be added in later versions.

# Motivation

In order to define an interoperable version of the specification while allowing for extensibility a base version needs to be defined along with the [^zip-tbd1] specification. This first version is intended to be a simple method to define how messages are encrypted and stored on the IPFS network and then subsequently retrieved by a recipient.

# Terminology

The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", "MAY",
"RECOMMENDED", "OPTIONAL", and "REQUIRED" in this document are to
be interpreted as described in BCP 14 [^BCP14] when, and only when,
they appear in all capitals.

# Specification

## Message Location Encoding Scheme

In this version, the intention is to remain focused on storing messages on the IPFS network so that the protocol can operate in a maximally decentralized fashion. To accomplish this, the message location in the MMP URI MUST be a CID of either version 0 or version 1. Both versions SHOULD be supported. Since CIDv1s rely on the multicodec specification, which allows multiple different encodings via multibase, there’s a possibility that an encoding is chosen that cannot be decoded. If this is the case, the message MUST be ignored. An implementation MAY return an error to the recipient if they choose. Additionally, for this reason implementers SHOULD use the base64url encoding by default.
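
To illustrate the "can this CID be decoded" check, the following Python sketch classifies a CID string by its textual form only. It assumes the usual conventions (a CIDv0 is a 46-character base58btc string starting with `Qm`; a CIDv1 string begins with a one-character multibase prefix) and covers only four prefixes. The function name `classify_cid` is hypothetical, and the sketch does not decode or validate the CID itself.

```python
# A few common multibase prefixes; a real implementation would likely
# support more of the multibase table or use a CID library.
MULTIBASE_PREFIXES = {
    "f": "base16",
    "b": "base32",
    "z": "base58btc",
    "u": "base64url",
}


def classify_cid(cid: str):
    """Return (cid_version, encoding), or None if the encoding is unknown.

    Only the textual shape is inspected; the bytes are not decoded.
    """
    if len(cid) == 46 and cid.startswith("Qm"):
        return (0, "base58btc")
    encoding = MULTIBASE_PREFIXES.get(cid[:1])
    if encoding is None:
        return None  # per the text above, such a message would be ignored
    return (1, encoding)
```

Under these assumptions, a CID beginning with `bafy` is classified as a base32 CIDv1, while a string with an unrecognized prefix returns `None`.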

## Message Encryption

In order to properly end-to-end encrypt the media contents and not need to redefine a new encryption process in detail this specification will reuse commonly implemented and defined cryptographic constructions. In this case, since it’s considered out of scope to support multi-messaging capabilities or extremely large file sizes the secretbox construction from libsodium will be sufficient. This makes it easier to implement this functionality safely without requiring re-implementation of the lower level cryptographic functionality which is important for a default fallback protocol.

## Encryption process

The Message Encryption process MUST utilize the secretbox construction from libsodium with the authentication tag attached, which uses XSalsa20 for encryption and Poly1305 for authentication (XSalsa20-Poly1305). To ensure proper usage, the key and nonce MUST NOT be reused. For this reason it is RECOMMENDED that implementers leverage the `crypto_secretbox_easy` API in libsodium.

## Decryption process

The Message Decryption process MUST utilize the secretbox construction from libsodium with the authentication tag attached (XSalsa20-Poly1305). The recipient MUST use the key provided in the 'key' query parameter to decrypt the message. It is RECOMMENDED that implementers leverage the `crypto_secretbox_open_easy` API in libsodium to ensure proper decryption and verification of the authentication tag. If decryption fails because the key is incorrect, or because the ciphertext or authentication tag has been tampered with (or the tag is missing), the implementation MAY return a `0101` ("Unable to Retrieve - decryption error") error to the sender and let the user know the message failed to be received.

## Message Storage

### Message Size

The size of the media content MUST NOT exceed 1 GB after encryption, in order to prevent excessive decryption times and denial of service attacks against the recipient. Implementations MAY set a lower limit when receiving messages based on the limits of their hardware. If they do, they SHOULD return a “message too large” error code to the sender when they opt not to decrypt a message due to its size. To limit further bloat, the message MUST NOT be further encoded beyond the bytes output by the secretbox construction.
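
The size rule can be sketched as a simple pre-check in Python. Two assumptions are made here that the draft does not settle: "1 GB" is read as 10^9 bytes, and the stored object is taken to be exactly the `crypto_secretbox_easy` output (the plaintext plus libsodium's 16-byte authentication tag), leaving open how the nonce is conveyed. The function names are hypothetical.

```python
SECRETBOX_MAC_BYTES = 16      # libsodium's crypto_secretbox_MACBYTES
MAX_CIPHERTEXT_BYTES = 10**9  # assumption: "1 GB" read as 10^9 bytes


def max_plaintext_bytes(limit: int = MAX_CIPHERTEXT_BYTES) -> int:
    """Largest plaintext whose secretbox output still fits within the limit."""
    return limit - SECRETBOX_MAC_BYTES


def accept_ciphertext(size: int, local_limit: int = MAX_CIPHERTEXT_BYTES) -> bool:
    """Decide whether a ciphertext of `size` bytes should be fetched and decrypted.

    A recipient may pass a lower `local_limit`, but never raises the
    protocol-wide maximum.
    """
    limit = min(local_limit, MAX_CIPHERTEXT_BYTES)
    return SECRETBOX_MAC_BYTES <= size <= limit
```

Checking the size before downloading or decrypting is what keeps an oversized object from consuming the recipient's resources in the first place.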

### Message Availability

Due to the tradeoffs of the IPFS network, there is no guarantee that content will remain pinned or be replicated by others on the network, especially since the content is encrypted and therefore cannot be read by anyone other than the sender or recipient. For this reason, senders SHOULD expect to pin the message on the network for at least 7 days so that the recipient has enough time to retrieve it. Implementations may opt to use a separate pinning service to extend this time period further. The time period for which the sender intends to pin the message MUST be communicated to the recipient via the ttl query parameter.

### Message Retention

Once the recipient has received the message they SHOULD store it locally as well to avoid loss of the message. Similarly, the sender SHOULD store the message to retain a message sent history. Implementers are generally expected to encrypt their message histories at rest.

## Query Parameters

### Encryption Key Query Parameter Usage

The encryption key MUST be provided as a query parameter named 'key' in the URL. This key MUST be a 32-byte (256-bit) random value encoded as a base64url string, resulting in a 43 character string (that is, without `=` padding). The key MUST be generated using a cryptographically secure random number generator and MUST be unique to each message sent, including when the same message is re-encrypted in order to re-send it. The 'key' parameter MUST NOT be included anywhere other than the MMP URI, which MUST be sent via the memo attachment field to ensure the key is not leaked.
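
A minimal Python sketch of generating and decoding such a key value, using only the standard library, might look like the following. The function names are hypothetical; the 43-character length follows from encoding 32 bytes as base64url with the padding removed.

```python
import base64
import re
import secrets

_KEY_PATTERN = re.compile(r"[A-Za-z0-9_-]{43}")


def generate_key_param() -> str:
    """Create a fresh 32-byte key and return it as unpadded base64url."""
    key = secrets.token_bytes(32)  # cryptographically secure randomness
    return base64.urlsafe_b64encode(key).rstrip(b"=").decode("ascii")


def decode_key_param(value: str) -> bytes:
    """Decode a 43-character base64url key value back into 32 bytes."""
    if not _KEY_PATTERN.fullmatch(value):
        raise ValueError("key must be 43 base64url characters")
    # Restore the single '=' of padding that was stripped on encoding.
    return base64.urlsafe_b64decode(value + "=")
```

A sender would call `generate_key_param()` once per message, including when re-sending the same content, and place the result after `key=` in the URI.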

### ABNF for Encryption Key Query Parameter

```
encrypt-key-param = "key=" 43VCHAR ; exactly 43 characters, per the text above
```

### Time To Live Query Parameter Usage
Since we’re dealing with decentralized storage of media, we often have no guarantee that the message will be available when the recipient goes to retrieve it. As such, we need a way for the sender to communicate that a message will be available up to a given point in time, and what that time is. While it’s expected that the message will be around at least until the expiration of the TTL, there can never be a guarantee that the message will be retrievable even within that period. Therefore this parameter should convey a reasonable time frame for which the sender (or a provider of their choosing) is willing to store the message on the IPFS network. Beyond the TTL, the recipient SHOULD NOT expect the message to be stored and the sender MAY delete it. The value of the query parameter MUST adhere to [^RFC-3339].

### ABNF for TTL Query Parameter
```
timestamp = "ttl=" date-time ; see RFC 3339 for date-time definition
```
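
As an illustration, a sender could derive the TTL value from the pinning period and a recipient could check it before attempting retrieval. The Python sketch below assumes the UTC `Z` form of an RFC 3339 timestamp with whole seconds, which is only one of the forms RFC 3339 permits, and expects timezone-aware `datetime` values. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

TTL_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # UTC, whole seconds only


def make_ttl(now: datetime, days: int = 7) -> str:
    """Return the ttl value for a message pinned for `days` days from `now`."""
    expires = now.astimezone(timezone.utc) + timedelta(days=days)
    return expires.strftime(TTL_FORMAT)


def is_expired(ttl: str, now: datetime) -> bool:
    """Return True once `now` has reached the time given in the ttl value."""
    expires = datetime.strptime(ttl, TTL_FORMAT).replace(tzinfo=timezone.utc)
    return now >= expires
```

Even when `is_expired` returns False, retrieval can still fail, since the TTL expresses the sender's intent and not a guarantee.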

# Security and Privacy Considerations

This section discusses potential security and privacy issues related to the Version 1 protocol.

## Resource Exhaustion Attacks
Implementers should be aware of the potential for resource exhaustion attacks. This is best prevented by first checking the size of a message to make sure it conforms to the implementation's maximum limits. It’s also a good idea to set user limits on both the sender and recipient. Additionally, senders should account for the additional costs incurred by setting longer TTL values.

## Key Leakage Attacks
The encryption key is a critical component of the protocol's security. Care should be taken to transmit the key securely and avoid logging or caching of the URI which contains the key. If logging is necessary implementers are expected to drop the key query parameter to avoid key leakage. Additionally, care should be taken when sharing MMP URIs to avoid other forms of key leakage.
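
For example, a logging helper could strip the key before the URI is written anywhere. The Python sketch below assumes the key is carried as a `key` query parameter, as in this draft; the function name `redact_key` is hypothetical.

```python
def redact_key(uri: str) -> str:
    """Return the URI with any 'key' query parameter removed, for logging."""
    base, sep, query = uri.partition("?")
    if not sep:
        return uri  # no query string, nothing to redact
    kept = [p for p in query.split("&") if p.partition("=")[0] != "key"]
    return base + ("?" + "&".join(kept) if kept else "")
```

Other parameters such as `ttl` are preserved, so the redacted form remains useful for debugging retrieval problems.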

## IP address and Zcash Address Correlation Attacks
Senders who serve messages directly from a self-hosted IPFS node will likely expose their IP address to the recipient inadvertently, since it’s highly unlikely that any other node on the network will have the encrypted message pinned. To avoid correlation of IP addresses and Zcash addresses, a pinning service or a secure proxy can help protect the sender’s privacy. Similarly, recipients who retrieve messages directly from an IPFS node may expose their IP address when retrieving a message. To avoid this, recipients can use a gateway or a secure proxy for message retrieval.

# MMP Version Registry entry

| MMP Version | ZIP Number | Title | Registration Date |
| --- | --- | --- | --- |
| 1 | [^ZIP-tbd2] | Media Memos Protocol (MMP) Version 1 | 2024-08-27 |

# References

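[^BCP14]: [Information on BCP 14: "RFC 2119: Key words for use in RFCs to Indicate Requirement Levels" and "RFC 8174: Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words"](https://www.rfc-editor.org/info/bcp14)
[^ZIP-tbd1]: [ZIP TBD: Media Memos Protocol (MMP)](zip-tbd1.md)
[^RFC-3339]: [Date and Time on the Internet: Timestamps](https://datatracker.ietf.org/doc/html/rfc3339)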