De-anonymisation

Paul Ohm, Carnegie Mellon research, "The Massachusetts Group Insurance Commission had a bright idea back in the mid-1990s—it decided to release "anonymized" data on state employees that showed every single hospital visit.

The goal was to help researchers, and the state spent time removing all obvious identifiers such as name, address, and Social Security number. But a graduate student in computer science saw a chance to make a point about the limits of anonymization."
Study of Airbnb customers de-identified
Study of mobile phone data
Professor Sambuddho Chakravarty, "a former researcher at Columbia University’s Network Security Lab and now researching Network Anonymity and Privacy at the Indraprastha Institute of Information Technology in Delhi, has co-published a series of papers over the last six years outlining the attack vector, and claims a 100% ‘decloaking’ success rate under laboratory conditions, and 81.4% in the actual wilds of the Tor network."

"Recently, thanks to a Freedom of Information request, Chris Whong received and made public a complete dump of historical trip and fare logs from NYC taxis. It’s pretty incredible: there are over 20GB of uncompressed data comprising more than 173 million individual trips. Each trip record includes the pickup and dropoff location and time, anonymized hack licence number and medallion number (i.e. the taxi’s unique id number, 3F38, in my photo above), and other metadata."

Yves-Alexandre de Montjoye : "We study fifteen months of human mobility data for one and a half million individuals and find that human mobility traces are highly unique. In fact, in a dataset where the location of an individual is specified hourly, and with a spatial resolution equal to that given by the carrier's antennas, four spatio-temporal points are enough to uniquely identify 95% of the individuals. We coarsen the data spatially and temporally to find a formula for the uniqueness of human mobility traces given their resolution and the available outside information. This formula shows that the uniqueness of mobility traces decays approximately as the 1/10 power of their resolution. Hence, even coarse datasets provide little anonymity. These findings represent fundamental constraints to an individual's privacy and have important implications for the design of frameworks and institutions dedicated to protect the privacy of individuals."
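A rough worked example of what a "1/10 power" decay implies, assuming a pure power law with exponent exactly -1/10 and a hypothetical coarsening factor r (the paper's fitted formula is more detailed than this):

```latex
% Assumption: uniqueness E follows a pure power law in the coarsening factor r.
% The symbols E and r are illustrative, not taken from the paper.
\[
  \mathcal{E}(r) \propto r^{-1/10}
  \quad\Longrightarrow\quad
  \frac{\mathcal{E}(10\,r)}{\mathcal{E}(r)} = 10^{-1/10} \approx 0.79,
  \qquad
  \frac{\mathcal{E}(100\,r)}{\mathcal{E}(r)} = 100^{-1/10} \approx 0.63 .
\]
```

Under that assumption, making the data 100 times coarser would remove only a little over a third of the uniqueness, which is one way to read "even coarse datasets provide little anonymity."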

Open Data Institute: de-anonymising Titanic passengers data

What is Privacy? New Research Reveals We May Need a New Definition. If you still believe your personal credit information is truly private, newly released research by a Rutgers professor may lead you to reconsider. Vivek Singh's research paper on credit card metadata dispels the belief that anonymous personal data is truly private. Rutgers Today spoke with Vivek Singh, assistant professor of Library and Information Science in the School of Communication and Information, about his paper, “Unique in the Shopping Mall: On the Reidentifiability of Credit Card Metadata.”

Published Jan. 30, 2015 in Science, the study focused on the question: “Out of an anonymous set of credit card data from millions of people, how easily can you find one person?” The main finding: Information gleaned from just four transactions can uniquely identify a person most of the time (90 percent). The paper, as Singh explains, shows why “we need to rethink the ideas we have about privacy.”

This finding has large implications for what we trust are private sources of information about us. Eu-council-dp-reg-pseudonymisation-14705-rev1-14.pdf.

TOR network de-anonymization

NYC taxi GPS data. MIT research. Paul Ohm presentation anonymisation. The anonymization/de-identification debate moves to the FCC. Unique in the Crowd: The privacy bounds of human mobility : Scientific Reports. Uniqueness of human mobility. In 1930, Edmond Locard showed that 12 points are needed to uniquely identify a fingerprint [30]. Our unicity test estimates the number of points p needed to uniquely identify the mobility trace of an individual. The fewer points needed, the more unique the traces are and the easier they would be to re-identify using outside information.

For re-identification purposes, outside observations could come from any publicly available information, such as an individual's home address, workplace address, or geo-localized tweets or pictures. To the best of our knowledge, this is the first quantification of the uniqueness of human mobility traces with random points in a sparse, simply anonymized mobility dataset of the scale of a small country.
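The unicity test described above can be sketched in a few lines. This is a toy illustration, not the authors' code: the function name `unicity`, the `(zone, hour)` representation of a point, and the sample data are all assumptions made for the example.

```python
import random


def unicity(traces, p, trials=500, seed=0):
    """Estimate the share of sampled users whose trace is the only one
    containing p points drawn at random from that same trace.

    traces: dict mapping a user id to a set of (zone, hour) tuples
            (a hypothetical encoding of spatio-temporal points).
    """
    rng = random.Random(seed)
    # Only users with at least p recorded points can be tested.
    candidates = sorted(u for u, pts in traces.items() if len(pts) >= p)
    if not candidates:
        raise ValueError("no trace has at least p points")
    unique = 0
    for _ in range(trials):
        target = rng.choice(candidates)
        # The attacker's outside information: p points from the target's trace.
        known = set(rng.sample(sorted(traces[target]), p))
        # Every user whose trace contains all known points is a possible match.
        matches = sum(1 for pts in traces.values() if known <= pts)
        if matches == 1:
            unique += 1
    return unique / trials


# Made-up traces: each single point is shared by two users,
# but each pair of points belongs to exactly one user.
toy_traces = {
    "u1": {("zone1", 9), ("zone2", 12)},
    "u2": {("zone1", 9), ("zone3", 12)},
    "u3": {("zone4", 9), ("zone2", 12)},
    "u4": {("zone4", 9), ("zone3", 12)},
}
print(unicity(toy_traces, p=1))  # 0.0
print(unicity(toy_traces, p=2))  # 1.0
```

In this toy dataset one known point never singles anyone out, while two known points always do, which mirrors the idea that the fewer points are needed, the more unique the traces are.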

(A) Ip = 2 means that the information available to the attacker consists of two 7am-8am spatio-temporal points (I and II). In this case, the target was in zone I between 9am to 10am and in zone II between 12pm to 1pm. There’s No Such Thing as Anonymous Data. About a decade ago, a hacker said to me, flatly, “Assume every card in your wallet is compromised, and proceed accordingly.” He was right. Consumers have adapted to a steady thrum of data breach notifications, random credit card charges, and out-of-the-blue card replacements.

A privacy-industrial complex has sprung up from this — technology, services, and policies all aimed at trying to protect data while allowing it to flow freely enough to keep the modern electronic bazaar thriving. A key strategy in this has been to “scrub” data, which means removing personally identifiable information (PII) so that even if someone did access it, they couldn’t connect it to an individual. So much for all that. De Montjoye and colleagues examined three months of credit card transactions for 1.1 million people, all of which had been scrubbed of any PII. What’s more, de Montjoye showed that even “coarse” data provides “little anonymity.” » Goodbye Internet Anonymity: Swedish Tabloids Publically Shame Forum Users.

The De-identification Maturity Model

People search - deanonymisation. Anonymised DNA re-identified. You’re not anonymous. I know your name, email, and company. Sumit Suman recently visited a site, did not sign up for anything, did not connect via social media, but got a personal email from the site the next day. Here’s how they did it. I’ve learned that there is a “website intelligence” network that tracks form submissions across their customer network. So, if a visitor fills out a form on Site A with their name and email, Site B knows their name and email too as soon as they land on the site. It all started 2 weeks ago when I got a promotional email (anonymized to avoid promotion) offering to discreetly integrate with your existing web site to identify visitors to your website. I get B2B marketing emails all the time but what caught my eye was the inclusion of a report snapshot for 42Floors.com showing names, companies, and emails of site visitors and the information seemed plausible.

Note the last sentence: I was still skeptical. A real-world analogue would be this scenario: You drive to Home Depot and walk in. Farfetched? About Darren Nix. Protecting the anonymity of cyberbullying victims. A recent decision of the Supreme Court of Canada has upheld a girl’s right to keep her identity concealed while she seeks the identity of a cyberbully and pursues a defamation claim against that individual. The court characterized her request as follows: “It is not merely a question of her privacy, but of her privacy from the relentlessly intrusive humiliation of sexualized online bullying.” The case is important because it recognizes that children and youth have a greater need for privacy than adults based on their inherent vulnerability, which is the result of their age and not the personal factors of the particular child or youth.

The case also confirms that, in some circumstances when a child or youth is seeking to have their identity remain anonymous, it is not necessary that s/he provide subjective evidence of the harm that would occur if their identity were released. Rather, a court can rely on objective evidence and “the court can find harm by applying reason and logic” [par 16]. Save the Titanic: Hands-on anonymisation and risk control of publishing open data | Guides.

Save the Titanic: Hands-on anonymisation and risk control of publishing open data

This guide on anonymisation is based on a presentation at the OK festival on the 16th July 2014. It is presented as a walkthrough. In the session participants were asked to perform a series of tasks with the Titanic passenger data. You can download the data below and follow some of the steps. The aim of the session was to turn personal data into data that you can publish. Anonymisation means that you cannot find (“identify”) a single person in the data.

Titanic passenger data

The Titanic carried 1,309 passengers and the crew. You can download the data or access it as a Google spreadsheet. The original is a CSV-file. If you want to anonymise a dataset, it is crucial that you understand the data.

Two open data use cases

Start with the end in mind: what are people potentially doing with the data?
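One common way to make "you cannot find a single person in the data" testable is k-anonymity: count how many rows share each combination of identifying attributes. This is not necessarily the method the guide uses. The sketch below assumes hypothetical column names (`pclass`, `sex`, `age`) and uses made-up rows, not real passenger records.

```python
from collections import Counter


def group_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[c] for c in quasi_identifiers) for row in rows)


def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group; k = 1 means someone is unique."""
    sizes = group_sizes(rows, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def singled_out(rows, quasi_identifiers):
    """Return the rows that are alone in their group and so can be singled out."""
    sizes = group_sizes(rows, quasi_identifiers)
    return [r for r in rows if sizes[tuple(r[c] for c in quasi_identifiers)] == 1]


# Made-up rows for illustration only.
rows = [
    {"pclass": "1", "sex": "female", "age": "29"},
    {"pclass": "1", "sex": "female", "age": "29"},
    {"pclass": "3", "sex": "male", "age": "40"},
    {"pclass": "3", "sex": "male", "age": "22"},
]
print(k_anonymity(rows, ["pclass", "sex", "age"]))  # 1: two men are unique by age
print(k_anonymity(rows, ["pclass", "sex"]))         # 2: dropping age hides them
```

Dropping or coarsening a column (here, removing the exact age) is the kind of step that raises k. The trade-off is that the published data becomes less useful for whatever people want to do with it.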

For the Titanic passenger data we discuss two putative use cases. What does it mean to be ‘identified’? Why You Don’t Need to Hide Online Anymore...Or Do You? Me and my data: how much do the internet giants really know? | Technology. To briefly state the obvious, the internet giants are seriously big: Google is not only the world's largest search engine, it's one of the top three email providers, a social network, and owner of the Blogger platform and the world's largest video site, YouTube. Facebook has the social contacts, messages, wallposts and photos of more than 750 million people. Given that such information could be used to sell us stuff, accessed by government or law enforcement bodies (perhaps without warrants, under legal changes), or – theoretically, at least – picked up by hackers or others, it's not unreasonable to wonder exactly how much the internet giants know about us.

US users of the sites are out of luck: there's no legal right under US law to ask a company to hand over all the information it holds on you. Users do have some say in how much companies are allowed to take, usually contained in the terms of service. Things didn't get off to a great start with Google. Or does it?

Can you really remain anonymous on the Internet? Anonymouth: A stylometry tool. And you think you're anonymous? Argh, soon we won't be so smug when it comes to trolling some guy on the worldwide Internet while hiding like cowards behind Tor or a VPN... I can see your eyes widening in panic! Yes, yes, yes... According to Aylin Caliskan Islam and Sadia Afroz, two researchers specialising in linguistics, it may be possible to identify 80% of the people who post anonymously on the net... How?

Quite simply, by analysing the words people use and the way they write. For their presentation at the 29th CCC (Chaos Communication Congress) in Germany, the researchers put their theory into practice on more than 300 threads from underground forums where people trade and sell credit card numbers, stolen identities, password-cracking services and other delights from black hat SEO and encryption specialists.
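To give a feel for what "analysing the words used and the way they write" can mean, here is a deliberately tiny stylometry sketch. It is not the researchers' method and not how Anonymouth works: the word list, the function names and the cosine-similarity matching are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter

# A short, hypothetical list of common "function words" used as style markers.
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that",
                  "it", "is", "i", "but", "not", "so", "very"]


def profile(text):
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def closest_author(known_texts, anonymous_text):
    """Return the known author whose profile is most similar to the anonymous text."""
    target = profile(anonymous_text)
    return max(known_texts, key=lambda name: cosine(profile(known_texts[name]), target))


# Made-up writing samples for illustration.
known = {
    "alice": "the the the the of of",
    "bob": "i i i so so very",
}
print(closest_author(known, "the of the"))  # alice
```

Real stylometry uses far richer features and much more text per author. The point here is only that habits of word choice can act as a fingerprint even when the name is hidden.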

Scary, in any case! Anonymity does not exist. Saturday 12 January, 11:04. A while ago I attended the event "Pas Sages en Seine 2012". A gathering of hackers, in every sense that word can take, it was a chance for citizens passionate about computing and the merely curious to meet in that remarkable place, La Cantine in Paris. Beyond the event itself, which was rich in many ways, one specific point aroused my curiosity: several speakers repeated the idea that "the good thing about the Internet is that it allows anonymity." My sardonic mind and my scant contact with the news at the time produced a collision in my head resembling a billion-billionth of an asteroid hitting the Earth on a very windy day.

In reality the agreement is logical, because both lines of reasoning rest on the same paradigm: name = identity card. Second, the name is multiple. Fourth, anonymity does not exist. Disruptions: Privacy Fades in Facebook Era. David Paul Morris/Bloomberg News. Privacy is a rare commodity today with the high amount of information being posted on social networking sites such as Facebook. As much as it pains me to say this: privacy is on its deathbed. I came to this sad realization recently when a stranger began leaving comments on photos I had uploaded to Instagram, the iPhone photo-sharing app. After several comments — all of which were nice — I began wondering who this person was.

Now the catch here is that she had used only a first name on her Instagram profile. Trust me, it’s not. So I set out, innocently and curiously, to figure who she was. I knew this person lived in San Francisco, from her own photos. There it was: a full name. Creepy, right? Nearly everyone has done something like this. A friend who works in technology recently told me I would never be able to figure out her age online. So who is at fault for this lack of privacy protection? Ms. Now which one of us is going to do that? How anonymous is NHS patient data? | Healthcare Network | Guardian Professional. A claim by the Department of Health that patient data shared with private firms for medical research would be anonymised has been challenged by privacy campaigners. The prime minister said last week that plans to share records and other NHS data would make it easier to develop and test new drugs and treatments.

The DH says all necessary safeguards would be in place to ensure protection of patients' details. But Ethics and Genetics, a social and technology campaign group, says freedom of information requests show that under certain circumstances data anonymity would not always be guaranteed. Data accessed under the secondary uses service, which is jointly delivered by the NHS Information Centre and Connecting for Health (CfH), for the NHS and its partners, is not always anonymised.

When asked whether the data accessed under the service was always anonymised, the DH said it was "not always accessed in an anonymised format." Connecting for Health Section 251 The Spine. Security Clan Editor's Blog - Personal Data Mining: Government & Business Share Blame.

Take Control of Your Online Image - PCWorld Business Center. The rise of online networking sites has made it easier to connect with colleagues and learn about job openings. It's also part of a much larger trend in which more information about you may be available to anyone who's interested -- including hiring managers, who often perform Internet searches on job candidates. If you want to advance in your career, you need to make sure that both your online networking efforts and your overall Web presence are working for -- not against -- you. A good way to do so is by treating all of your online activity as part of a public relations campaign that presents a professional image for potential employers and colleagues alike.

Use networking sites with care. Professional networking sites such as LinkedIn make it easy to expand your web of business contacts, an essential element of any successful IT career. Valuable professional connections can also come from more socially oriented sites such as Facebook, MySpace and Google's Orkut. OCR issues guidance on methods for de-identification of PHI.

As mandated by the American Recovery and Reinvestment Act of 2009, on November 26, 2012, the U.S. Department of Health and Human Services’ Office for Civil Rights (OCR) issued guidance on the de-identification of protected health information (PHI) in accordance with the HIPAA Privacy Rule (Privacy Rule). This tool, which is set forth in a navigable Q&A format, is the result of stakeholders with practical, technical, and policy experience contributing to a Washington, D.C. workshop in March 2010. The resulting guidance does not change the Privacy Rule, but rather clarifies the existing regulation as it pertains to de-identification of PHI. The guidance specifically addresses the two acceptable de-identification methodologies under the Privacy Rule: (1) Expert Determination and (2) Safe Harbor. De-identified information does not constitute PHI and, therefore, it is not governed by the Privacy Rule. Expert Determination The guidance also notably provides the following clarifications:

What You Don't Know about Your Online Reputation Can Hurt You - PCWorld Business Center. Social networking, and the broader concept of online privacy, have been under some rather intense scrutiny over the past couple of weeks. The issues at Google--voracious indexer of all things Internet, and Facebook--the largest social network and number one most visited site (according to Google) have made many users more acutely aware of what information is available about them on the Internet. However, your online reputation is being used in ways you may not be aware of, and could cost you. Users don't necessarily need to be concerned, but should at least be aware of who they are connected with online, and what they say. No, Big Brother isn't watching, but potential employers and lenders are. Increasingly, your online reputation is becoming a deciding factor in whether you get that job, or get approved for that car loan. Companies and lenders are turning to services like those offered by Rapleaf, a San Francisco-based company focused on social media monitoring.