On Monday, the UN Global Working Group (GWG) on Big Data published UN Handbook on Privacy-Preserving Computation Techniques. This book talks about the emerging privacy-preserving computation techniques and also outlines the key challenges in making these techniques more mainstream.
— UN Big Data (@UNBigData) April 1, 2019
Motivation behind writing this handbook
In recent years, we have come across several data breaches. Companies collect users’ personal data without their consent to show them targeted content. The aggregated personal data can be misused to identify individuals and localize their whereabouts. Individuals can be singled out with the help of just a small set of attributes.
This large collections of data are very often an easy target for cybercriminals. Previously, when cyber threats were not that advanced, people used to focus mostly on protecting the privacy of data at rest. This led to development of technologies like symmetric key encryption. Later, when sharing data on unprotected networks became common, technologies like Transport Layer Security (TLS) came into the picture.
Today, when attackers are capable of penetrating servers worldwide, it is important to be aware of the technologies that help in ensuring data privacy during computation. This handbook focuses on technologies that protect the privacy of data during and after computation, which are called privacy-preserving computation techniques.
Privacy Enhancing Technologies (PET) for statistics
This book lists five Privacy Enhancing Technologies for statistics that will help reduce the risk of data leakage. I say “reduce” because there is, in fact, no known technique that can give a complete solution to the privacy question.
#1 Secure multi-party computation
Secure multi-party computation is also known as secure computation, multi-party computation (MPC), or privacy-preserving computation. A subfield of cryptography, this technology deals with scenarios where multiple parties are jointly working on a function. It aims to prevent any participant from learning anything about the input provided by other parties.
MPC is based on secret sharing, in which data is divided into shares that are random themselves, but when combined it gives the original data. Each data input is shared into two or more shares and distributed among the parties involved. These when combined produce the correct output of the computed function.
#2 Homomorphic encryption
Homomorphic encryption is an encryption technique using which you can perform computations on encrypted data without the need for a decryption key. The advantage of this encryption scheme is that it enables computation on encrypted data without revealing the input data or result to the computing party. The result can only be decrypted by a specific party that has access to the secret key, typically it is the owner of the input data.
#3 Differential Privacy (DP)
DP is a statistical technique that makes it possible to collect and share aggregate information about users, while also ensuring that the privacy of individual users is maintained. This technique was designed to address the pitfalls that previous attempts to define privacy suffered, especially in the context of multiple releases and when adversaries have access to side knowledge.
#4 Zero-knowledge proofs
Zero-knowledge proofs involve two parties: prover and verifier. The prover has to prove statements to the verifier based on secret information known only to the prover. ZKP allows you to prove that you know a secret or secrets to the other party without actually revealing it. This is why this technology is called “zero knowledge”, as in, “zero” information about the secret is revealed. But, the verifier is convinced that the prover knows the secret in question.
#5 Trusted Execution Environments (TEEs)
This last technique on the list is different from the above four as it uses both hardware and software to protect data and code. It provides users secure computation capability by combining special-purpose hardware and software built to use those hardware features. In this technique, a process is run on a processor without its memory or execution state being exposed to any other process on the processor.
This free 50-pager handbook is targeted towards statisticians and data scientists, data curators and architects, IT specialists, and security and information assurance specialists. So, go ahead and have a read: UN Handbook for Privacy-Preserving Techniques!