How AI can perform key image analysis and map TXOs by looking for patterns onchain without help from any centralized third parties
In my previous article I explained why Monero is broken beyond repair. Considering the size of exchanges such as Binance, it’s only reasonable to assume that their data have contributed in good part to filtering out decoys in Monero transactions so far. But what if an exchange like Binance refused to hand over data, and investigators had to rely exclusively on what’s published onchain? Would they still be able to map hundreds of thousands of TXOs to their key images? The answer is yes.
TXO metadata
The Monero blockchain, when seen from the outside, consists of many TXOs: transaction outputs that belong to various users. In many cases a user controls more than one TXO. Aside from TXOs we also have transactions. Each Monero transaction contains one or more key images, depending on how many TXOs are being spent. For each key image there is a ring of 16 TXOs, in which the TXO actually being spent is mixed with 15 decoy TXOs. On the output side, the transaction creates at least 2 new TXOs.
From the moment a TXO is produced, it carries some metadata with it. While it’s impossible to know exactly which metadata are attached to each TXO today, we can come up with a list of basic metadata that don’t require deep surveillance. Say we are observing a random TXO, TXO-R. We can immediately derive some metadata from the transaction that produced it:
Sibling TXOs: a Monero transaction always has at least 2 outputs, so TXO-R has had at least one sibling from the moment it was produced.
Time when the transaction was first seen/broadcast.
Fee structure of the transaction that produced it.
Wallet-related metadata (IP address, wallet-specific fee structure): different wallets take different approaches to broadcasting transactions and managing TXOs, depending on whether they prioritise UX or privacy.
Time to next confirmation: the time between the first and the second time a transaction is broadcast to us.
These metadata, even though quite basic, allow us to look for patterns that create sets of related TXOs. Related TXOs are TXOs that are likely to be owned/controlled by the same or related entities.
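To make the list above concrete, here is a minimal sketch of how an observer could record these metadata per TXO. All names (`ObservedTx`, `TxoMetadata`, `metadata_for_outputs`) are hypothetical, and "fee structure" is reduced to a simple fee-per-byte number purely for illustration. Wallet-related metadata and time to next confirmation are left out to keep the example small.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ObservedTx:
    """A transaction as seen by a passive observer (all fields hypothetical)."""
    tx_id: str
    output_ids: Tuple[str, ...]   # TXOs created by this transaction
    first_seen: int               # Unix time we first saw the broadcast
    fee: int                      # total fee, in atomic units
    size_bytes: int               # serialized size of the transaction


@dataclass(frozen=True)
class TxoMetadata:
    """Basic metadata attached to one TXO at creation time."""
    txo_id: str
    siblings: Tuple[str, ...]     # other outputs of the same transaction
    first_seen: int
    fee_per_byte: float           # assumption: a simple stand-in for "fee structure"


def metadata_for_outputs(tx: ObservedTx) -> List[TxoMetadata]:
    """Derive one metadata record for every TXO the transaction creates."""
    rate = tx.fee / tx.size_bytes
    return [
        TxoMetadata(
            txo_id=out,
            siblings=tuple(o for o in tx.output_ids if o != out),
            first_seen=tx.first_seen,
            fee_per_byte=rate,
        )
        for out in tx.output_ids
    ]
```

For a transaction with two outputs, each record lists the other output as its only sibling, which is exactly the starting point for the patterns discussed next.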
AI for the creation of SUS sets
Now let’s assume we have access to AI data processing capabilities, and we train our AI to recognize the following TXOs as related TXOs (likely to belong to the same entity or related entities):
TXOs have a single sibling, and those siblings regularly appear in transactions with more than 10 key images (inputs). This is because such transactions are most likely consolidation transactions, which would mean that the TXO that went to the user originated from a centralized third party that periodically consolidates its change TXOs. The same periodicity could mean the same third party (we don’t need to know which third party, and we don’t care).
TXOs have the same fee structure (centralized services use similar fee structures to process payments).
TXOs are created in the same time bracket (indicating user habit or time zone).
By following these rules we can create many TXO pools for different service providers, and for different sets of users of these service providers, by selecting for different consolidation patterns, fee structures and time zones. We call each set created this way a SUS set (unknown centralized Service User Set). Each SUS set consists of TXOs belonging to a specific subgroup of users of a certain centralized service provider, for example users in a certain time zone or users who share behavioral patterns with respect to that provider. It is clear that the TXOs in a set created this way are related, even though being related doesn’t necessarily mean that they belong to the same party or to the same centralized service provider. With different fee structures and consolidation patterns we would eventually be fishing out sets of TXOs from different services. But, as we will see in the next section, being related is already enough to expose their key images.
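The grouping described above can be sketched without any machine learning at all. The snippet below is a plain rule-based stand-in for the trained AI, not the AI itself: it keeps only TXOs whose single sibling was later seen in a consolidation transaction, then buckets them by fee and by time of day. The names `TxoFeatures`, `sus_key` and `build_sus_sets`, the bucket widths, and the fee-per-byte representation are all assumptions made for illustration. The periodicity of consolidations is ignored here.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class TxoFeatures(NamedTuple):
    txo_id: str
    sibling_consolidated: bool  # single sibling later seen in a tx with > 10 key images
    fee_per_byte: float         # assumption: stand-in for "fee structure"
    first_seen: int             # Unix time the creating transaction was first seen


SusKey = Tuple[int, int]        # (fee bucket, time-of-day bracket)


def sus_key(f: TxoFeatures, fee_step: float = 5.0, hours_per_bracket: int = 6) -> SusKey:
    """Bucket a TXO by fee level and by the UTC time-of-day bracket of its creation."""
    fee_bucket = int(f.fee_per_byte // fee_step)
    hour_of_day = (f.first_seen % 86400) // 3600
    return (fee_bucket, hour_of_day // hours_per_bracket)


def build_sus_sets(features: Iterable[TxoFeatures], min_size: int = 2) -> Dict[SusKey, Set[str]]:
    """Group qualifying TXOs into candidate SUS sets, dropping groups that are too small."""
    groups: Dict[SusKey, Set[str]] = defaultdict(set)
    for f in features:
        if not f.sibling_consolidated:
            continue  # rule 1: the sibling must feed consolidation transactions
        groups[sus_key(f)].add(f.txo_id)
    return {key: ids for key, ids in groups.items() if len(ids) >= min_size}
```

Narrower buckets produce smaller, more specific sets, while wider buckets produce larger sets that cover more users. A trained model would presumably learn such boundaries rather than have them fixed by hand.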
Detector AI: Mapping Key Images
So we trained our AI to play with fee structures, consolidation patterns and time brackets/patterns, and as a result it has created many SUS sets. Each of these SUS sets consists of many TXOs belonging either to the same user or to users with similar habits, geolocation and behavioral profiles. SUS sets constitute huge cracks in Monero’s privacy: whenever 2 or more TXOs belonging to the same SUS set appear among the inputs of the same transaction, it’s highly likely that those TXOs are being spent by one user belonging to the SUS set. Depending on the size of the SUS set, the certainty can already be extremely high even for pairs of TXOs. In fact, the smaller the set, the less likely it is that different users in the same SUS set pick each other’s TXOs as decoys in a multi-input transaction. So now we tell our AI to pick up the following transactions:
Transactions with more than 1 key image, where K is the number of key images,
in which K TXOs from the same SUS set appear,
and in which each of these TXOs appears in a separate ring (no two of them in the ring of the same key image).
For every transaction picked this way, we can map each SUS TXO to the key image of the ring in which it appears.
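The selection rule above can be sketched as a small function. This is a minimal illustration under a strict reading of the rule: a transaction matches only if every one of its K rings contains exactly one member of the SUS set, so rings with zero or several SUS members cause the transaction to be skipped. The names `Ring` and `map_key_images` are hypothetical, and the toy rings in the usage example have 3 members instead of 16.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class Ring(NamedTuple):
    key_image: str
    members: Tuple[str, ...]    # TXO ids in the ring (16 in a real transaction)


def map_key_images(rings: List[Ring], sus_set: Set[str]) -> Dict[str, str]:
    """Return {txo_id: key_image} if the transaction matches the rule, else {}."""
    k = len(rings)
    if k < 2:
        return {}               # rule: more than 1 key image
    mapping: Dict[str, str] = {}
    for ring in rings:
        hits = [t for t in ring.members if t in sus_set]
        if len(hits) != 1:
            return {}           # no SUS TXO in this ring, or an ambiguous ring
        mapping[hits[0]] = ring.key_image
    if len(mapping) != k:
        return {}               # the same SUS TXO showed up in two rings
    return mapping


# Usage with toy data: two rings, each containing one TXO from the SUS set.
rings = [Ring("ki1", ("a", "x1", "x2")), Ring("ki2", ("y1", "b", "y2"))]
print(map_key_images(rings, {"a", "b", "c"}))
```

The output of each matching transaction is a batch of TXO to key image pairs, which is what accumulates into the TXO-KI database.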
Conclusion
As shown in the above example, we can train an AI to create sets of related TXOs based on very simple metadata that do not include IP addresses. This allows us to map TXOs to their key images without exchanging data with any centralized third party whatsoever, simply by looking for patterns. This is possible because Monero uses the UTXO accounting model and key images. Because Monero user balances are fragmented, we can look for moments when related TXOs are spent together and thereby unmask their key images. This means an AI can independently create a TXO-KI database. Since each Monero TXO is spent only once, in the transaction where its key image appears, this database can then be used to forward trace other TXOs by filtering out decoys.
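As a rough illustration of that last step, the sketch below shows how a TXO-KI database could be used to filter decoys. Any ring member already mapped to a different key image must be a decoy, because a TXO is spent only once. If only one candidate remains, the ring is resolved and the new mapping is fed back into the database. The function name `trace_forward` and the data layout are assumptions, and the toy rings are far smaller than real ones.

```python
from typing import Dict, Tuple


def trace_forward(
    rings: Dict[str, Tuple[str, ...]],   # key image -> ring members (TXO ids)
    txo_to_ki: Dict[str, str],           # known TXO -> key image mappings
) -> Dict[str, str]:
    """Repeatedly eliminate known-spent decoys until no further ring resolves."""
    known = dict(txo_to_ki)
    resolved = set(known.values())
    progress = True
    while progress:
        progress = False
        for key_image, members in rings.items():
            if key_image in resolved:
                continue
            # Members already mapped elsewhere were spent under another key image.
            candidates = [t for t in members if t not in known]
            if len(candidates) == 1:
                known[candidates[0]] = key_image
                resolved.add(key_image)
                progress = True
    return known


# Usage with toy data: knowing "a" and "b" resolves ki3, which then resolves ki4.
database = {"a": "ki1", "b": "ki2"}
rings = {"ki3": ("a", "b", "c"), "ki4": ("c", "d")}
print(trace_forward(rings, database))
```

Rings that still have several unknown members stay unresolved in this sketch. They only shrink as the database grows.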