Breaking Stealth (or, I Can Name That Algo in Four Notes)
Traders Magazine Online News, May 10, 2018
If I know four of your credit card transactions, I know all of your credit card transactions. That is the scary but true conclusion of a research paper titled “Unique in the shopping mall: On the Re-identifiability of Credit Card Metadata.”[1] Another study, “Health Data in an Open World”[2], discussed the unintended re-identification of patients in a heavily masked dataset of 2.9 million Australian citizens that had been made public with the goal of transparency for the communal good.
In the former instance, a group of researchers developed an algorithm that can identify an individual’s credit card transactions with 90% accuracy in a database of masked purchases made by 1.1 million people, knowing only four of that person’s transactions. The researchers didn’t even have full transaction data, only location and date. When relatively wide price ranges were added, the accuracy increased to 94%.
From just this skimpy amount of input, these analysts were able to pick out a single consumer’s activity across tens of millions of masked transactions in this data set.
The ease with which a user can be identified in a larger, anonymized data set is called unicity. The higher the unicity of a data set, the easier it is to identify any one user within it. The unicity of a credit card data set is extremely high. In this blog post, we will do a basic exploration of the unicity of algorithmic trading.
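To make the idea concrete, here is a minimal sketch of how unicity could be estimated on a toy data set. It is not the researchers’ algorithm. The function name `unicity`, the `traces` structure, and the sample shoppers are all hypothetical, made up for illustration. The sketch assumes each user’s activity is a set of (place, day) points. It repeatedly picks a user, reveals k of that user’s points at random, and checks whether anyone else in the data set matches all k.

```python
import random


def unicity(traces, k, trials=1000, seed=0):
    """Estimate the share of users uniquely identified by k known points.

    traces: dict mapping a user id to a set of points, e.g. (place, day).
    All names and data here are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Only users with at least k points can be tested.
    eligible = sorted(u for u, pts in traces.items() if len(pts) >= k)
    if not eligible:
        raise ValueError("no user has at least k points")

    unique = 0
    for _ in range(trials):
        user = rng.choice(eligible)
        # Reveal k of this user's points, as an outside observer might learn them.
        known = set(rng.sample(sorted(traces[user]), k))
        # Count every user whose trace contains all of the known points.
        matches = sum(1 for pts in traces.values() if known <= pts)
        if matches == 1:
            unique += 1
    return unique / trials


# Hypothetical toy data: three shoppers with partly overlapping purchases.
toy_traces = {
    "u1": {("bakery", 1), ("cafe", 2), ("gym", 3)},
    "u2": {("bakery", 1), ("cafe", 2), ("bar", 4)},
    "u3": {("bakery", 1), ("shoes", 5), ("bar", 4)},
}

for k in (1, 2, 3):
    print(k, unicity(toy_traces, k))
```

In this toy data every shopper visits the bakery on day 1, so a single known point often matches more than one person. With all three points known, each shopper is unique. The same pattern, where a handful of points suffices, is what the credit card study reports at a much larger scale.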
Our Test