A URI schema for attestations

There are a few competing designs.

Go for under-design

Recognising the danger of over-design, we first attempt to under-design and then build up from there, so as not to over-design. First, the minimum-design approach.

The shortest secp384r1 signed attestation I can construct (based on the data object in the example in our document) is about 281 bytes†.

Prefixing it with the contract address and joining the two with a ? gives this result:

attestation:0xD76b5c2A23ef78368d8E34288B5b65D616B746aE?MIIBFTCBnKADAgECAggsK
ZxbFu0FlTAKBggqhkjOPQQDAjB0MBAGByqGSM49AgEGBSuBBAAiA2AABEVuqVDEpiM2nl8ojRfLl
iJkP9x6jh3MCLOicSS6jkm5BBtHllirLZXI7Z4INcgn64mMU1jrYor-8FsPazFSY0E7ic3s7LaNG
dM0B9y7xgZ_wkWV7Mt_qCPgCemB-vMwCQIBAQoBAQIBATAKBggqhkjOPQQDAgNoADBlAjEAiuZAi
Tfr6dUT2crUayTzsD2HRlga7LHfb_tWunBrxzjM6LGMTw_38Wd2DoPQHlGPAjA99iMoJkzGYIeTJ
puyNR661vc80RzO-iU8phqBFVvzEg9s7mWKyYeo-QfgYpqMXEo=

It is 431 characters long. Users without an ASN.1 parser (or software capable of parsing ASN.1) cannot tell what's inside by looking at it. (Old-time cryptic links like news:3B48D5FC.3030004@sc.rr.com didn't work out well.)
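The construction above can be sketched in a few lines of Python. This is only a minimal sketch: it assumes the attestation is already available as DER bytes and that the payload is base64url-encoded with padding kept, as in the example. The function name build_attestation_uri is made up for illustration.

```python
import base64

def build_attestation_uri(contract: str, attestation_der: bytes) -> str:
    """Join a contract address and a DER-encoded attestation with '?'."""
    # base64url uses '-' and '_' instead of '+' and '/'; padding is kept here.
    payload = base64.urlsafe_b64encode(attestation_der).decode("ascii")
    return f"attestation:{contract}?{payload}"

# 281 placeholder bytes stand in for the real DER attestation.
uri = build_attestation_uri("0xD76b5c2A23ef78368d8E34288B5b65D616B746aE", bytes(281))
print(len(uri))  # 431
```

The length works out as 12 characters of scheme, 42 of address, 1 for the ?, and 376 of base64url for 281 bytes.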

Sidenote: There are ~161 bytes of duplicated data since the public key appears twice. Tore commented that the curve parameter is part of the public key, so I kept it.

The methods by which it could be shorter

Sticking to DER (not PER†), we have:

  • Use secp256k1 instead of secp384r1 (reduction of 64 bytes)
  • Use a compressed public key (reduction of roughly 32 bytes)
  • Drop the information that can be implied or recovered (e.g. the public key, as it can be recovered from the signature)
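
The second item (public key compression) can be sketched as follows. This is an illustrative sketch, assuming the key is given as a SEC1 uncompressed point (04 || X || Y); the helper name compress_public_key is hypothetical, and the secp256k1 generator point is used as a stand-in key.

```python
def compress_public_key(uncompressed: bytes) -> bytes:
    """Convert a SEC1 uncompressed point (04 || X || Y) to compressed form (02/03 || X)."""
    if len(uncompressed) % 2 != 1 or uncompressed[0] != 0x04:
        raise ValueError("expected an uncompressed SEC1 point")
    coord_len = (len(uncompressed) - 1) // 2
    x = uncompressed[1:1 + coord_len]
    y = uncompressed[1 + coord_len:]
    # The prefix records the parity of Y: 02 for even, 03 for odd.
    prefix = b"\x03" if y[-1] & 1 else b"\x02"
    return prefix + x

# The secp256k1 generator point, used here only as a stand-in public key.
G_UNCOMPRESSED = bytes.fromhex(
    "04"
    "79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798"
    "483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8"
)
compressed = compress_public_key(G_UNCOMPRESSED)
print(len(G_UNCOMPRESSED), len(compressed))  # 65 33
```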

† PER would reduce the length, but

  • It works at the syntax level and does not cure the redundancy in the schema itself. E.g. both the public key and the signature are encoded (when the latter can be used to recover the former); AlgorithmIdentifier appears twice, and Subject repeats a part of (in the case of RDN, a subset of) the fields in the data object.
  • Would the reduction in the structural data allow an attacker to create some data that means A in the current schema, yet means B in an older contract that didn't get updated? It looks like something designed for speed rather than security. More on this with Tore.
  • The majority of the data is the public key and signature. PER wouldn't make a dent in them.

Drop out the implied/recoverable information

Let's observe the data itself. Here is the same data with the JSON encoding rules applied:

{
   "signedInfo":{
      "version":2,
      "serialNumber":3182246526754555285,
      "signature":[
         "1.2.840.10045.4.3.2"
      ],
      "subjectPublicKeyInfo":{
         "algorithm":[
            "1.2.840.10045.2.1",
            "1.3.132.0.34"
         ],
         "subjectPublicKey":{
            "value":"04456EA950C4A623369E5F288D17CB9622643FDC7A8E1DCC08B3A27124BA8E49B9041B479658AB2D95C8ED9E0835C827EB898C5358EB628AFEF05B0F6B315263413B89CDECECB68D19D33407DCBBC6067FC24595ECCB7FA823E009E981FAF3",
            "length":760
         }
      },
      "dataObject":{
         "match":1,
         "class":"lounge",
         "admission":1
      }
   },
   "signatureAlgorithm":[
      "1.2.840.10045.4.3.2"
   ],
   "signatureValue":{
      "value":"30650231008AE6408937EBE9D513D9CAD46B24F3B03D8746581AECB1DF6FFB56BA706BC738CCE8B18C4F0FF7F167760E83D01E518F02303DF62328264CC6608793269BB2351EBAD6F73CD11CCEFA253CA61A81155BF3120F6CEE658AC987A8F907E0629A8C5C4A",
      "length":824
   }
}

In the attestation, there are only 2 parts of data that a reader can't guess. First, the object class:

  "dataObject":{
     "match":1,
     "class":"lounge",
     "admission":1
  }

Second, the signature. (The public key can be recovered from it.)

If we only keep the two parts, following the contract address, we get this:

attestation:0xD76b5c2A23ef78368d8E34288B5b65D616B746aE?match=1;class=lounge;
admission=1;30650231008AE6408937EBE9D513D9CAD46B24F3B03D8746581AECB1DF6FFB56
BA706BC738CCE8B18C4F0FF7F167760E83D01E518F02303DF62328264CC6608793269BB2351E
BAD6F73CD11CCEFA253CA61A81155BF3120F6CEE658AC987A8F907E0629A8C5C4A

Still pretty long. If we use a shorter signature (secp256k1), we get this:

attestation:0xD76b5c2A23ef78368d8E34288B5b65D616B746aE?match=1;class=lounge;
admission=1;304402202cb265bf10707bf49346c3515dd3d16fc454618c58ec0a0ff448a676
c54ff71302206c6624d762a1fcef4618284ead8f08678ac05b13c84235f1654e6ad168233e82

If we use base64 encoding, we get this:

attestation:0xD76b5c2A23ef78368d8E34288B5b65D616B746aE?match=1;class=lounge;
admission=1;MEQCICyyZb8QcHv0k0bDUV3T0W/EVGGMWOwKD/RIpnbFT/cTAiBsZiTXYqH870YY
KE6tjwhnisBbE8hCNfFlTmrRaCM+gg==

It's short enough to be considered a URI now.
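
The final URI above can be reproduced with a short Python sketch. It assumes the data object is flat (string or integer values only) and that the signature is the DER-encoded secp256k1 signature from the earlier hex example; the function name and the urlsafe flag are made up for illustration.

```python
import base64

def compact_attestation_uri(contract, data_object, signature_der, urlsafe=False):
    """Build attestation:<contract>?<key=value;...>;<base64 signature>."""
    fields = ";".join(f"{key}={value}" for key, value in data_object.items())
    encode = base64.urlsafe_b64encode if urlsafe else base64.b64encode
    signature = encode(signature_der).decode("ascii")
    return f"attestation:{contract}?{fields};{signature}"

CONTRACT = "0xD76b5c2A23ef78368d8E34288B5b65D616B746aE"
SIGNATURE = bytes.fromhex(
    "304402202cb265bf10707bf49346c3515dd3d16fc454618c58ec0a0ff448a676c54ff713"
    "02206c6624d762a1fcef4618284ead8f08678ac05b13c84235f1654e6ad168233e82"
)
DATA_OBJECT = {"match": 1, "class": "lounge", "admission": 1}

uri = compact_attestation_uri(CONTRACT, DATA_OBJECT, SIGNATURE)
uri_urlsafe = compact_attestation_uri(CONTRACT, DATA_OBJECT, SIGNATURE, urlsafe=True)
print(uri)
```

Standard base64 emits + and /, which are awkward inside a URI; passing urlsafe=True yields the - and _ alphabet instead.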

Side note: the following version can be configured to launch a specific mobile app:

http://attestation.id/0xD76b5c2A23ef78368d8E34288B5b65D616B746aE?match=1;cla
ss=lounge;admission=1;MEQCICyyZb8QcHv0k0bDUV3T0W_EVGGMWOwKD_RIpnbFT_cTAiBsZi
TXYqH870YYKE6tjwhnisBbE8hCNfFlTmrRaCM-gg==

Make it useful

1. Encode 𝑣 value

Two public keys are recoverable from a signature. Vitalik addressed the possibilities with a 3rd parameter 𝑣 acting as a selector. However, the signature above is encoded in the same format used for Bitcoin signatures, the ECDSA-Sig-Value of RFC 5480 (Elliptic Curve Cryptography Subject Public Key Information), which doesn't encode the 𝑣 value:

ECDSA-Sig-Value ::= SEQUENCE {
     r  INTEGER,
     s  INTEGER 
}

Typically, a signature is 70 (0x46) bytes long (all values below are in hex):

total 46
sequence 30
length 44
integer 02
length 20
r 2cb265bf10707bf49346c3515dd3d16fc454618c58ec0a0ff448a676c54ff713
integer 02
length 20
s 6c6624d762a1fcef4618284ead8f08678ac05b13c84235f1654e6ad168233e82

There are three options for dealing with the missing 𝑣:

  1. We can deviate from RFC 5480 and add one more element to the SEQUENCE to identify the public key. It's a bad idea because when we reconstruct the attestation in its full form, the signature will have a different DER encoding from the one in the URI.
  2. We can vary the separator character in the URI that separates the data object and the signature, say, using @ as the separator to represent 𝑣=1. This is bad encapsulation.
  3. We can leave it as is. Since the public key is part of the data being signed, we can enumerate the possible public keys from the signature and check which one, when used in the signed data, makes the signature validate.
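
Whichever option is chosen, the receiver first has to pull r and s out of the ECDSA-Sig-Value. A minimal sketch of that step, assuming short-form DER lengths only (below 128 bytes, which covers secp256k1 signatures); the function name is made up for illustration.

```python
from typing import Tuple

def parse_ecdsa_sig_value(der: bytes) -> Tuple[int, int]:
    """Parse a DER ECDSA-Sig-Value (SEQUENCE of two INTEGERs) into (r, s)."""
    if len(der) < 2 or der[0] != 0x30 or der[1] >= 0x80 or der[1] != len(der) - 2:
        raise ValueError("not a short-form DER SEQUENCE")
    values = []
    pos = 2
    for _ in range(2):
        if pos + 2 > len(der) or der[pos] != 0x02 or der[pos + 1] >= 0x80:
            raise ValueError("expected a short-form DER INTEGER")
        end = pos + 2 + der[pos + 1]
        if end > len(der):
            raise ValueError("INTEGER runs past the end of the SEQUENCE")
        values.append(int.from_bytes(der[pos + 2:end], "big"))
        pos = end
    if pos != len(der):
        raise ValueError("trailing bytes after s")
    return values[0], values[1]

signature = bytes.fromhex(
    "304402202cb265bf10707bf49346c3515dd3d16fc454618c58ec0a0ff448a676c54ff713"
    "02206c6624d762a1fcef4618284ead8f08678ac05b13c84235f1654e6ad168233e82"
)
r, s = parse_ecdsa_sig_value(signature)
print(hex(r))
print(hex(s))
```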

2. Encoding the action

Let's say the action related to an attestation is 'buy'. You might expect an attestation like this:

attestation:0xD76b5c2A23ef78368d8E34288B5b65D616B746aE/buy?match=1;class=lou
nge;admission=1;MEQCICyyZb8QcHv0k0bDUV3T0W/EVGGMWOwKD/RIpnbFT/cTAiBsZiTXYqH8
70YYKE6tjwhnisBbE8hCNfFlTmrRaCM+gg==

But I actually hope that we don't need to do that since, in a typical scenario, the action you can take with an attestation is pretty evident. For example, if you have an attestation like the UEFA ticket used in this example, the obvious action is to save it in your user agent. If, instead, the attestation is a bid for the ticket at a certain price, then it must refer to the ticket in the offer, plus the price and expiry. The bid will be the attestation, and the action here is quite clear ("sell").

3. Encoding attestations that have a nested structure

This method, of course, can't handle complicated data objects. For example, the following dataObject is an academic transcript:

{
   "year": 2020,
   "studentId": 123,
   "score": {
      "physics": {
         "project": 4.3,
         "test": 5
      },
      "art": {
         "project": 4.3,
         "test": 5
      }
   }
}

Let's assume that the schema specifies that the subject of this data object is year=2020, and that it is encoded in the subject of SignedInfo outside of the data object. Then we have:

attestation:0xD76b5c2A23ef78368d8E34288B5b65D616B746aE/RPZytRrJPOwPYdGWBrssd
9v-1a6cGvHOMzosYxPD?year=2020;MGUCMQCK5kCJN-vp1RPZytRrJPOwPYdGWBrssd9v-1a6cG
vHOMzosYxPD_fxZ3YOg9AeUY8CMD32IygmTMZgh5Mmm7I1Hj71

Where the RPZytRrJPOwPYdGWBrssd9v-1a6cGvHOMzosYxPD part is the DER-encoded data object and ?year=2020; is the subject. (There might be multiple key-value pairs, as the subject is constructed as a Relative Distinguished Name.)

If the RDN contains more than one key-value pair (as specified by its schema), for example:

year=2020,studentId=123

Then the URI would be:

attestation:0xD76b5c2A23ef78368d8E34288B5b65D616B746aE/RPZytRrJPOwPYdGWBrssd
9v-1a6cGvHOMzosYxPD?year=2020,studentId=123;MGUCMQCK5kCJN-vp1RPZytRrJPOwPYdG
WBrssd9v-1a6cGvHOMzosYxPD_fxZ3YOg9AeUY8CMD32IygmTMZgh5Mmm7I1Hj71

Notice that the separator for different key-value pairs in an RDN is a comma, while the separator for the key-value pairs in an objectClass is a semicolon. This distinction is important because an RDN is often itself a value inside an attestation (a reference to another data object).
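
Splitting such a URI back into its parts could look roughly like this. It is a sketch under assumptions: the last ;-separated segment of the query is taken to be the signature, RDN values are assumed to contain no , ; or = characters (no escaping is handled), and the function name and result keys are made up for illustration.

```python
def parse_nested_attestation(uri: str) -> dict:
    """Split attestation:<contract>/<dataObject>?<k=v,k=v>;<signature> into its parts."""
    scheme, _, rest = uri.partition(":")
    if scheme != "attestation":
        raise ValueError("not an attestation URI")
    location, _, query = rest.partition("?")
    contract, _, data_object = location.partition("/")
    # Everything after the last ';' is the signature; before it is the subject RDN.
    subject, _, signature = query.rpartition(";")
    rdn = dict(pair.split("=", 1) for pair in subject.split(","))
    return {
        "contract": contract,
        "dataObject": data_object,
        "subject": rdn,
        "signature": signature,
    }

example = (
    "attestation:0xD76b5c2A23ef78368d8E34288B5b65D616B746aE/RPZytRrJPOwPYdGWBrssd"
    "9v-1a6cGvHOMzosYxPD?year=2020,studentId=123;MGUCMQCK5kCJN-vp1RPZytRrJPOwPYdG"
    "WBrssd9v-1a6cGvHOMzosYxPD_fxZ3YOg9AeUY8CMD32IygmTMZgh5Mmm7I1Hj71"
)
parsed = parse_nested_attestation(example)
print(parsed["subject"])  # {'year': '2020', 'studentId': '123'}
```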

If we always use secp256k1, then I don't think it is necessary to include the curve parameters, as they are static for this (or any other) specific curve choice.

I believe that it should be possible to craft such a malicious message, since PER requires knowledge of the encoding rules/schema used when encoding the message. Thus a PER message will be decoded into different data structures depending on which schema the decoder is holding. I think this problem could be avoided by including a hash digest of the decoded message in the PER-encoded message. Thus, when the PER message is decoded, the decoding (reflecting the full schema description) is hashed and verified to ensure that the client uses the same, unique schema used by the server.

There might be some issues with still allowing '=' in the URL encoding, since it already has a semantic meaning in a URL.
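
One possible way around this (an assumption on my part, not something decided above) is to drop the base64 padding when encoding and restore it when decoding, so that '=' only ever appears in key=value pairs. A minimal sketch with made-up helper names:

```python
import base64

def b64url_encode_nopad(data: bytes) -> str:
    """base64url-encode and strip the trailing '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode_nopad(text: str) -> bytes:
    """Restore the padding implied by the length, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

signature = bytes.fromhex(
    "304402202cb265bf10707bf49346c3515dd3d16fc454618c58ec0a0ff448a676c54ff713"
    "02206c6624d762a1fcef4618284ead8f08678ac05b13c84235f1654e6ad168233e82"
)
encoded = b64url_encode_nopad(signature)
print(encoded)
print(b64url_decode_nopad(encoded) == signature)  # True
```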

As far as I can see there are only two public keys, because an elliptic curve has an equation y^2=x^3+ax+b, meaning that for a given x there are two possible values of y satisfying the equation. However, there might be more than one choice of x. To be precise, there is a single choice of 0 < r < n, where n is the curve order, but x can be larger than n, so there are two key options for each non-negative integer c such that x=r+cn. But as far as I can see, the verification should still pass no matter which x has been used, because the rest of the math involved works modulo n. To conclude: we can assume there are at most two choices, and we are fine simply testing the two possible options. Finally, for public key extraction to work we must still have a digest of the real public key available externally to verify the extraction against. Otherwise we cannot verify where the signature came from.
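
A rough sketch of the "test the two options" approach, in pure Python (3.8+ for pow(x, -1, m)). It assumes secp256k1, takes x = r (ignoring the rare x = r + n case), and assumes the digest z of the signed data is already known. The fingerprint format (SHA-256 over X || Y) and all function names are made up for illustration; this is not constant-time and not meant for production use.

```python
import hashlib

# secp256k1 domain parameters: y^2 = x^3 + 7 over GF(P), group order N.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)

def point_mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result

def recover_candidates(z, r, s):
    """Return the public keys Q for which (r, s) signs digest z, taking x = r."""
    y_squared = (pow(r, 3, P) + 7) % P
    y = pow(y_squared, (P + 1) // 4, P)  # square root; valid because P % 4 == 3
    if y * y % P != y_squared:
        return []  # r is not the x-coordinate of a curve point
    z_g = point_mul(z % N, G)
    neg_z_g = None if z_g is None else (z_g[0], (-z_g[1]) % P)
    r_inv = pow(r, -1, N)
    candidates = []
    for r_y in (y, P - y):
        # Q = r^-1 * (s*R - z*G), where R = (r, r_y)
        candidates.append(point_mul(r_inv, point_add(point_mul(s, (r, r_y)), neg_z_g)))
    return candidates

def fingerprint(point):
    """Hypothetical external key digest: SHA-256 over X || Y."""
    return hashlib.sha256(point[0].to_bytes(32, "big") + point[1].to_bytes(32, "big")).hexdigest()

def recover_public_key(z, r, s, expected_fingerprint):
    """Pick the candidate whose fingerprint matches the externally known one."""
    for candidate in recover_candidates(z, r, s):
        if candidate is not None and fingerprint(candidate) == expected_fingerprint:
            return candidate
    return None
```

If neither candidate matches the external fingerprint, recover_public_key returns None and the attestation should be rejected.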