Devil in the details - Unlinkability and selective disclosure
In which we discuss how selective disclosure and unlinkability interact and why the resulting digital identity stacks differ strongly in terms of privacy guarantees.
Introductory note: I originally wrote this a couple of years ago and then moved on before finishing it. I recently had to talk about these topics again and figured this article is not as dated as it seemed. I'm releasing it now with no changes to the original. Of course, the environment and my own knowledge of things have evolved in the meantime, but I believe this piece still stands on its own pretty well.
In a context of surveillance capitalism and “you don’t own anything” services, communities working on digital identity systems make refreshing promises to their users: own your own data, use it at your own discretion.
With some concrete propositions shaping up in Europe and elsewhere, it is important to take a close look at the proposed technologies and architectures to understand how they fare against these objectives.
In this article, we'll talk at both a high and a low level about one of the most publicized techniques aiming to deliver on user empowerment and privacy: selective disclosure. We'll see why it is important to think about it with unlinkability in mind.
Basic glossary of self-sovereign identity (SSI)
If you've never heard of holders, issuers and verifiers, this small glossary is a quick and relatively painless way to get into the lingo of SSI.
Verifiable credential, often shortened to credential - a chunk of data, any data, that was cryptographically signed and whose signature can be verified by other parties.
Holder - sometimes referred to as user. A holder is in possession of credentials that concern them. Think about your passport.
Issuer - a trusted party that has the authority to deliver verifiable credentials. Think about the authority that delivers your passport to you.
Verifier - a party that needs verifiable and reliable information provided by the holder. Think about border control checking your passport.
The basic premise is that holders can prove information about themselves without the verifiers having to directly contact the issuer of that information.
Note that these roles are not rigid and can depend on the context of an interaction. A party can in turn be holder, verifier and issuer. Think about border control checking your passport and delivering a travel authorization in return.
Selective disclosure
In the context of digital credentials, selective disclosure is the ability to divulge only a subset of one's credential's content to a given verifier. Different credential formats vary in the way they enable this.

Role and purpose of selective disclosure
The Issuer-Holder-Verifier triangle is the design that empowers users to decide with whom they share identifying information. No direct contact is required between the issuing authority (say, a city) and the verifying party (say, a library), and the holder can - in theory - accept or reject interactions with a verifier.
Selective disclosure is a crucial piece of digital credential systems that enables a holder to decide what to share with verifiers. Reusing subsets of a credential provides a lot of flexibility when it comes to using one's credentials without requesting “specialized” credentials from the original issuer (more on that later).
Unlinkability
Linkability, defined in words rather than math, is the ability for a set of adversaries to come together and correlate any number of credential presentations as originating from the same credential in a "reasonable computing time". We call this coming together and sharing of data “collusion”.
We can refine this definition into a few "grades" by defining who the adversaries are:
If the set consists of colluding verifiers, we can talk about "verifier linkability".
If the set consists of colluding verifiers and the issuer of the original credential, we can talk about "issuer linkability".
The strongest version defines the adversaries as anybody who can observe the presentation, except for the holder itself.
Unlinkability is the property of a system that does not allow for such credential presentation correlation.
Important note: linkability in this context refers to the ability to uniquely identify a credential through its cryptographic peculiarities, not through the holder information being shared.
Example: linkable presentations
When presentations are linkable, they offer verifiers an easy or even trivial way to exchange information and "fill in the blanks" in what they have been presented.
It is the case when presenting an SD-JWT credential. SD-JWT authenticity and integrity rely on signing a JWT. The JWT signature has to be revealed for each presentation for the verifier to ensure the authenticity and integrity of the presented information - the signature represents a cryptographically unique correlator for the credential.
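To make the "fill in the blanks" risk concrete, here is a minimal sketch of two colluding verifiers joining their records on the credential signature. Everything in it is hypothetical: the log layout, the verifier names and the signature strings are made up for illustration, and a real signature would be a long opaque value rather than a readable label.

```python
from collections import defaultdict

# Hypothetical logs kept by two verifiers. Each entry stores the credential
# signature seen during a presentation and the claims that were disclosed.
library_log = [
    {"signature": "sig-abc123", "disclosed": {"city": "Lyon"}},
    {"signature": "sig-zzz999", "disclosed": {"city": "Nantes"}},
]
pharmacy_log = [
    {"signature": "sig-abc123", "disclosed": {"birth_date": "1990-04-01"}},
]


def correlate(*logs):
    """Merge disclosed claims from all logs, keyed by the shared signature."""
    profiles = defaultdict(dict)
    for log in logs:
        for entry in log:
            profiles[entry["signature"]].update(entry["disclosed"])
    return dict(profiles)


profiles = correlate(library_log, pharmacy_log)
```

Each verifier only ever saw one attribute, yet the merged profile for the first credential contains both. No cryptography had to be broken: the join key was handed over with every presentation.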

Example: unlinkable presentations
When presentations are unlinkable, presentations cannot be tied back to the original credential (even for a single verifier receiving successive presentations of the same credential) and it becomes much safer to share some pieces of information.
AnonCreds and W3C verifiable credentials with BBS+ are credential formats that use cryptographic schemes enabling both selective disclosure and issuer unlinkability.

Selective disclosure and unlinkability - interactions and why it matters
Each layer of an identity ecosystem comes with strings attached. Selective disclosure techniques and linkability properties are no exception.
Collusion of verifiers is extremely common and even “normal” in the digital industry. Advertising might be the most infamous example: enormous sets of data on application and website users are exchanged and consolidated for the purpose of selling targeted ads.
This is of course not the only industry where correlation is interesting, and advertisers are not the only ones interested in data. It is easy to think of sectors where customer insight is fundamental: insurance, banking, hiring, etc.
Unlinkability is crucial when it comes to user empowerment and privacy. A digital identity system that provides selective disclosure but not unlinkability achieves close to nothing in protecting its users from abuse. Once presentations share a unique correlator, it is trivial to put seemingly unrelated pieces of information together.
Issuer linkability is relevant to more specific cases where issuers are likely to participate in data sharing (think commercial actors such as online stores), as they can be the key to linking any presentation to the full credential that was issued. Another important use case is that of e-voting. E-voting is one of the holy grails of digital government. But an e-voting scheme cannot rely on issuer-linkable presentations, as the issuer and verifier are the same political entity.
Salted-hash selective disclosure - SD-JWT
Let's briefly talk about SD-JWT credentials, as it is the format championed by the European digital identity project and a trendy one. If you want some technical references on what an SD-JWT is and how it works, I've got you covered.
The specification for SD-JWT can be found here: SD-JWT specification
In general, the arguments applying to SD-JWT also apply to credential formats that rely on salted hashes to enable selective disclosure. The most relevant formats are currently SD-JWT and mDL/mDoc.
It is not absolutely necessary to understand every detail of SD-JWT to make sense of the discussion to follow. But understanding how salted-hash selective disclosure works is essential to understanding how it impacts the greater design goals.
How does this work
A regular JWT is issued to the holder. It contains:
Claims that the holder MUST disclose during every presentation
An array of cryptographic hashes
Additionally, the holder receives a list of claims and salt values. When disclosed to verifiers, these values can be used to generate the hashes signed in the JWT to ensure their authenticity.
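The mechanism above can be sketched in a few lines of Python. This is a simplified illustration, not a conforming SD-JWT implementation: the HMAC with a shared `ISSUER_KEY` is a stand-in for the issuer's real asymmetric JWT signature, the payload is a plain dict rather than an encoded JWT, and the function names (`issue`, `verify`, `make_disclosure`) are my own. The shape of a disclosure (a base64url-encoded JSON array of salt, claim name and claim value) and the `_sd` array of digests follow the general idea of the specification.

```python
import base64
import hashlib
import hmac
import json
import secrets

# Assumption: a shared demo key standing in for the issuer's real signing key.
ISSUER_KEY = b"demo-issuer-key"


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_disclosure(name, value) -> str:
    """Encode [salt, claim name, claim value]; the salt hides the value from guessing."""
    salt = b64url(secrets.token_bytes(16))
    return b64url(json.dumps([salt, name, value]).encode("utf-8"))


def digest(disclosure: str) -> str:
    return b64url(hashlib.sha256(disclosure.encode("ascii")).digest())


def sign(payload: dict) -> str:
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return b64url(hmac.new(ISSUER_KEY, body, hashlib.sha256).digest())


def issue(always_disclosed: dict, selectable: dict):
    """Return the signed payload, its signature, and the holder's disclosures."""
    disclosures = {n: make_disclosure(n, v) for n, v in selectable.items()}
    payload = dict(always_disclosed, _sd=sorted(digest(d) for d in disclosures.values()))
    return payload, sign(payload), disclosures


def verify(payload: dict, signature: str, presented: list) -> dict:
    """Check the signature, then check each disclosure against the signed digests."""
    if not hmac.compare_digest(sign(payload), signature):
        raise ValueError("bad signature")
    revealed = {}
    for d in presented:
        if digest(d) not in payload["_sd"]:
            raise ValueError("disclosure not covered by the signed payload")
        padded = d + "=" * (-len(d) % 4)
        _salt, name, value = json.loads(base64.urlsafe_b64decode(padded))
        revealed[name] = value
    return revealed


payload, signature, disclosures = issue(
    {"iss": "https://issuer.example"},
    {"given_name": "Alice", "birth_date": "1990-04-01"},
)
# The holder reveals only the birth date; the given name stays a hidden digest.
revealed = verify(payload, signature, [disclosures["birth_date"]])
```

Note what `verify` needs in every call: the same `payload` and the same `signature`, whatever subset of disclosures the holder picks.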
This technique enables the reuse of well-established technology - JWTs and all the associated specifications for signature - while enabling selective disclosure. There's a catch, however. For the verifier to be able to verify the authenticity of a credential, it has to be shown the original JWT. The holder has to present the original credential, no other choice there, and this leaks the ultimate unique correlator: the cryptographic signature.
TL;DR: salted-hash selective disclosure requires the holder to disclose the complete signed credential at presentation time, and its cryptographic signature is the ideal unique correlator.
Here come the band-aids
Ad-hoc solutions that have been proposed to solve SD-JWT's linkability issue revolve around how and when credentials are issued:
Multiple issuance: issuing multiple credentials for the same claims results in distinct credential signatures. The salt used in SD-JWT ensures the disclosures and the JWT signature are unique for each of the issued credentials. This has the drawback of putting a lot of responsibility on the holder and its wallet to ensure a credential is used only for a given verifier, or even only once. This also leaks some information to the issuer about the holder's credential use patterns.
Just-in-time issuance: issue the credential on demand to the holder, who should consider it one-time use. This would even allow for issuing only the required attributes, removing the need to support complex selective disclosure schemes. In my opinion, however, this crosses the line from self-sovereign identity back to the existing identity provider schemes where the credential subject is merely a convenient enabler in data exchange between parties.
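As a rough sketch of what multiple issuance asks of a wallet, the snippet below issues a batch of credentials for the same claim and hands out a different one per verifier. It is an illustration under assumptions: the `Wallet` class and `issue_batch` function are hypothetical, the credential is reduced to a salted digest, and an HMAC stands in for the issuer's real signature.

```python
import hashlib
import hmac
import secrets

# Assumption: a shared demo key standing in for the issuer's real signing key.
ISSUER_KEY = b"demo-issuer-key"


def issue_batch(claim: str, count: int) -> list:
    """Issue `count` credentials for the same claim, each with a fresh salt."""
    batch = []
    for _ in range(count):
        salt = secrets.token_hex(16)
        digest = hashlib.sha256(f"{salt}:{claim}".encode()).hexdigest()
        signature = hmac.new(ISSUER_KEY, digest.encode(), hashlib.sha256).hexdigest()
        batch.append({"salt": salt, "digest": digest, "signature": signature})
    return batch


class Wallet:
    """Hypothetical wallet that never shows the same credential to two verifiers."""

    def __init__(self, batch):
        self._unused = list(batch)
        self._by_verifier = {}

    def credential_for(self, verifier_id: str) -> dict:
        if verifier_id not in self._by_verifier:
            if not self._unused:
                # Going back to the issuer tells it how fast the batch was used.
                raise RuntimeError("batch exhausted, the issuer must be contacted again")
            self._by_verifier[verifier_id] = self._unused.pop()
        return self._by_verifier[verifier_id]


wallet = Wallet(issue_batch("birth_date=1990-04-01", 2))
library = wallet.credential_for("library")
pharmacy = wallet.credential_for("pharmacy")
```

The two verifiers receive different signatures, so they cannot join on them. The cost shows up in the bookkeeping: the wallet has to track which credential went where, and an exhausted batch forces a trip back to the issuer. The issuer, having signed every credential in the batch, can still link all of them.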
Notably, there's no easy fix for issuer linkability for SD-JWT.
Conclusion
User empowerment and privacy are primary objectives in digital identity propositions. But the devil is in the details. SD-JWT shapes up as the main contender for European digital credentials; and in the details, SD-JWT has linkability issues that are not easy to mitigate. SD-JWT looks like a convenience choice and it looks like there is no intention to adopt a timeline allowing for development of alternatives.
In the interest of intellectual honesty, it must be mentioned that the European project allows for other types of credentials depending on the level of assurance required by the interaction. However, the existence of an easy-to-implement default option leaves little incentive for the adoption of other technologies.
Before we part ways it is important to quickly address a last issue: linkability does not only happen at the credential format layer.
Despite technological safeguards, other techniques can still compromise user privacy.
And technology alone cannot prevent the natural formation of uniquely identifying records such as combinations of names, surnames and birth dates.
Regardless, it is still critical for the digital identity ecosystem to provide a safe and reliable foundational layer for its users, and not to leave the entirety of its aspirations and promises in the hands of governance.
Addendum: Because I left this article forgotten at the bottom of my drawers for almost two years, I must add a note on the fact that some research has emerged around making SD-JWT credentials unlinkable using zero-knowledge techniques. Most notably, I will cite Longfellow-zk by Google and Crescent by Microsoft. What they are, how they work, and why only big tech names are publishing research are questions that would fill much more space and that I hope to address later.


