Countermeasure against linking IP to address based on first hop analysis: random delays #136

Open
kristovatlas opened this issue Oct 9, 2016 · 5 comments

Comments

@kristovatlas
Member

I don't immediately follow how analytics attackers are measuring time to link IP address to first hop, but @TheBlueMatt suggested that they are using this currently, and that a valid countermeasure is to use mixnets with random delays in the routing portion of P2P clients' network protocol.

The discussion starts a little bit earlier than this timestamp in this presentation: https://youtu.be/8BLWUUPfh2Q?t=30m32s

If I could guess, I think he's saying:

  1. Even if transaction relay traffic between P2P nodes is encrypted (which it generally isn't so far), network attackers can still observe when messages are relayed and use that timing to infer which encrypted message was the first hop of a given message. (Unless encrypted messages are padded to a standardized size, message size can most likely be used as well.)
  2. Under that condition, you can introduce random delays to attenuate the usefulness of such time-based analysis.
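
As a rough illustration of point 2 (a toy model, not something from the presentation): suppose an observer sees only the arrival and departure times of encrypted messages at a node and links them by assuming first-in, first-out order. The function name `linkage_accuracy` and the parameters `max_delay` and `spacing` are hypothetical and chosen only for this sketch.

```python
import random

def linkage_accuracy(num_msgs, max_delay, spacing=1.0, seed=0):
    """Fraction of messages a timing observer links correctly.

    Toy model: messages arrive `spacing` seconds apart, each is forwarded
    after a uniform random delay in [0, max_delay], and the observer
    guesses that the k-th departure belongs to the k-th arrival.
    """
    rng = random.Random(seed)
    arrivals = [i * spacing for i in range(num_msgs)]
    # Pair every departure time with the index of the message it carries.
    departures = sorted(
        (t + rng.uniform(0.0, max_delay), i) for i, t in enumerate(arrivals)
    )
    # Count how often the FIFO guess matches the true message index.
    correct = sum(1 for k, (_, i) in enumerate(departures) if k == i)
    return correct / num_msgs

no_delay = linkage_accuracy(1000, max_delay=0.0)
with_delay = linkage_accuracy(1000, max_delay=30.0)
```

With no added delay the departure order matches the arrival order, so the FIFO guess is always right. Once the random delay is large compared with the spacing between messages, most guesses are wrong. A real attacker would use better statistics than a FIFO guess, so this only shows the direction of the effect.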

It would be helpful if @TheBlueMatt could weigh in to clarify.

@kristovatlas
Member Author

kristovatlas commented Oct 9, 2016

The relevant parts of the 3rd edition threat model are currently:

  • II D: Link a transaction's input address(es) to a specific IP address by observing the first relay of a broadcasted transaction (96d8d118a8a65aa8dbad15dbfd78365f7cda16e4bfa7ceeabdc1b5cd7e92a2f9)
    • Neither CM11 (c0933c692a76d18c3c20f6346eb4a02ba703b5ac12cb6c060a05e2a29cd6000f) nor CM20 (797090dd00d11157408f5c11bc4d55bcf95a281fae925f732900edafbb53e096) seems to fit the proposed countermeasure/criteria.

@TheBlueMatt

One of the primary goals of analytics attackers is to group transactions that were likely generated by the same group or entity. A lot of your guide focuses on preventing grouping based on the wallet that generated a transaction. Equally, however, attackers connect to the entire network in an attempt to identify which services or nodes transactions came from, potentially identifying either the exact node a transaction came from or groups of transactions that likely originated from the same source. This has been used successfully many times against individual services (e.g. dice sites often end up with a public node), and it can also be used to deanonymize certain pseudo-centralized services that relay to lots of nodes at once from one or two addresses.

Today on the network, when you restart a node that has been online for a while, you immediately receive ~50 connections from one service (badly) pretending to be various SPV wallets based on bitcoinj. This seems to be an attack against Bitcoin Core's current transaction-relay-privacy technique of using per-node timers for transaction relay. By connecting many times, it is much more likely that there will be less latency between when a node hears about a transaction and when it inv's you.
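
A small simulation can make the effect of connecting many times concrete. It assumes, purely for illustration, that each connection gets its own independent, exponentially distributed announcement delay. The function name `mean_first_announcement` and the 5-second mean are hypothetical and not taken from Bitcoin Core.

```python
import random

def mean_first_announcement(num_connections, mean_delay=5.0, trials=5000, seed=1):
    """Average time until the earliest of several connections is announced to.

    Assumption: every connection has an independent exponential delay with
    mean `mean_delay` seconds. The attacker only cares about the first
    announcement on any of its connections, i.e. the minimum delay.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += min(
            rng.expovariate(1.0 / mean_delay) for _ in range(num_connections)
        )
    return total / trials

single = mean_first_announcement(1)
many = mean_first_announcement(50)
```

Under this assumption the expected minimum of n such delays is the mean divided by n, so 50 connections cut the average wait by roughly a factor of 50 and the per-connection timer hides much less.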

The easiest fix is to limit and bucket outbound relay. One strategy we're hoping to use in Bitcoin Core is, when we receive transactions, to only relay on to one or two statically-selected nodes, and then only relay to the remaining nodes in discrete buckets (eg once every minute we relay all the transactions we received between 30 and 90 seconds ago). This provides pretty good privacy by ensuring that, when a globally-connected adversary first hears about a transaction, it could have propagated randomly around the network significantly already. Note that much of the privacy provided by solutions such as this comes as the network upgrades, not as individuals upgrade.
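
The bucketing idea above can be sketched as follows. This is a minimal sketch under stated assumptions, not Bitcoin Core code: the class `BucketedRelay`, its method names, and the way static peers are picked are all hypothetical.

```python
import random

class BucketedRelay:
    """Relay to a few fixed peers at once, and to everyone else in batches."""

    def __init__(self, peers, num_static=2, min_age=30.0, seed=None):
        rng = random.Random(seed)
        # Pick the static peers once and keep them for the node's lifetime.
        self.static_peers = rng.sample(peers, min(num_static, len(peers)))
        self.other_peers = [p for p in peers if p not in self.static_peers]
        self.min_age = min_age
        self.pending = []  # (received_at, txid) pairs awaiting batch relay

    def receive(self, txid, now):
        """Record a transaction; return the immediate (peer, txid) sends."""
        self.pending.append((now, txid))
        return [(peer, txid) for peer in self.static_peers]

    def flush(self, now):
        """Call once per interval (e.g. every 60 s); return batched sends.

        Only transactions at least `min_age` seconds old are released, so
        with a 60 s interval each batch covers ages of 30 to 90 seconds.
        """
        ready, waiting = [], []
        for received_at, txid in self.pending:
            if now - received_at >= self.min_age:
                ready.append(txid)
            else:
                waiting.append((received_at, txid))
        self.pending = waiting
        return [(peer, txid) for peer in self.other_peers for txid in ready]

relay = BucketedRelay(["a", "b", "c", "d"], seed=0)
immediate = relay.receive("tx1", now=10.0)
relay.receive("tx2", now=45.0)
first_batch = relay.flush(now=60.0)    # tx1 is 50 s old, tx2 only 15 s
second_batch = relay.flush(now=120.0)  # tx2 is now 75 s old
```

In this sketch a peer outside the static set learns about a transaction only at a bucket boundary, together with everything else in the batch, which is what coarsens the timing signal.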

This, however, isn't perfect; there is still information leakage. Ideally we'd use something like a high-latency mixnet (minutes, not days), whereby no participant can identify which piece of data was added by which other participant.

@kristovatlas
Member Author

@TheBlueMatt

Thanks for explaining more about the delay strategy you have planned for Core and how mixnets will be involved.

> By connecting many times, it is much more likely that there will be less latency between when a node hears about a transaction and when it inv's you.

Are you able to explain why this is? Is it because nodes broadcasting transactions only send them to a subset of their connected peers?

@TheBlueMatt

Oops, just realized I hadn't responded to this. The idea here isn't new: the easy way to attack someone who introduces random noise into your statistics is to gather more data. We are even worse off because we sometimes, at random, do fast-relay (IIRC), so with enough connections you will likely always get fast-relay over at least one of them.
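
To put a rough number on the fast-relay point: if each connection independently gets fast-relay with some probability, the chance that at least one of n connections gets it is 1 - (1 - p)^n. The value p = 0.1 below is a made-up figure for illustration only, and independence across connections is an assumption.

```python
def prob_any_fast_relay(num_connections, p_fast):
    """Chance that at least one connection gets fast-relay.

    Assumes each connection is chosen for fast-relay independently with
    probability `p_fast` (a hypothetical parameter, not a Core constant).
    """
    return 1.0 - (1.0 - p_fast) ** num_connections

one_conn = prob_any_fast_relay(1, 0.1)
fifty_conns = prob_any_fast_relay(50, 0.1)
```

With these illustrative numbers a single connection sees fast-relay one time in ten, while 50 connections see it on at least one link more than 99% of the time.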

@kristovatlas
Member Author

@TheBlueMatt thank you
