PPDM Blog
From DDMWiki
I hope this blog can be a resource for discussing issues relating to privacy and privacy-preserving data mining (PPDM). Topics will span everything from scholarly reports to commercial endeavors that express opinions on, or employ technology developed through, the study of this topic.
A Face Is Exposed for AOL Searcher No. 4417749
By MICHAEL BARBARO and TOM ZELLER Jr.
Published: August 9, 2006
Buried in a list of 20 million Web search queries collected by AOL and recently released on the Internet is user No. 4417749. The number was assigned by the company to protect the searcher’s anonymity, but it was not much of a shield.
No. 4417749 conducted hundreds of searches over a three-month period on topics ranging from “numb fingers” to “60 single men” to “dog that urinates on everything.”
And search by search, click by click, the identity of AOL user No. 4417749 became easier to discern. There are queries for “landscapers in Lilburn, Ga,” several people with the last name Arnold and “homes sold in shadow lake subdivision gwinnett county georgia.”
It did not take much investigating to follow that data trail to Thelma Arnold, a 62-year-old widow who lives in Lilburn, Ga., frequently researches her friends’ medical ailments and loves her three dogs. “Those are my searches,” she said, after a reporter read part of the list to her.
full text http://www.nytimes.com/2006/08/09/technology/09aol.html?ex=1156046400&en=c06ff6a2c7708bb4&ei=5070
Financial Search Raises Privacy Fears
By Paul Blustein
Washington Post Staff Writer
Saturday, June 24, 2006; Page A12
For most Americans, the confidentiality of their bank accounts and other financial holdings is a right to be cherished. The idea that government agents might be secretly scrutinizing the records of individuals arouses discomfort in people who view their wealth, income and other financial information as nobody's business but their own.
So questions of privacy arose yesterday after revelations that the Bush administration has been tracking clues about terrorists by searching the records of a Belgium-based banking consortium that handles millions of financial transactions daily across national borders...
full text http://www.washingtonpost.com/wp-dyn/content/article/2006/06/23/AR2006062301689.html
The Death of Privacy
John Lloyd writes "Money, the internet and a new type of 'public-private person' have conspired to destroy old notions of confidentiality."
full text http://commentisfree.guardian.co.uk/john_lloyd/2006/06/post_151.html
--Kun Liu 00:51, 28 June 2006 (EDT)
ICDM Workshop on Privacy Aspects of Data Mining
Submission deadline: July 30
Workshop: December 18
Address: Hong Kong, China
link: http://www-kdd.isti.cnr.it/padm06/calls.html
The workshop will seek submissions that cover aspects of privacy protection solutions and threats as they pertain to various data mining endeavors. The following comprises a sample, but not complete, listing of topics:
Biomedical and healthcare data mining research privacy
Cryptographic tools for privacy preserving data mining
Inference and disclosure control for data mining
Learning algorithms for randomized/perturbed data
Legal and regulatory frameworks for data mining and privacy
Privacy and anonymity in e-commerce and user profiling
Privacy aspects of business processes and enterprise management
Privacy aspects of geographic, spatial, and temporal data
Privacy aspects of ubiquitous computing systems
Privacy enhancement technologies in web environments
Privacy policy infrastructure, enforcement, and analysis
Privacy preserving link and social network analysis
Privacy preserving applications for homeland security
Privacy preserving data integration
Privacy protection in fraud and identity theft prevention
Privacy threats due to data mining
Query systems and access control
Trust management for data mining
--Kun Liu 16:31, 20 June 2006 (EDT)
U.S. Gov't Spent 30M Dollars On Citizens' Personal Info
infosec_spaz@slashdot.org writes "According to a news story on Yahoo! News, the U.S. Government has spent US$30 million in the last year on buying citizens' personal phone records from online brokers...The very ones who Congress is trying to put out of business."
--Kun Liu 16:31, 20 June 2006 (EDT)
Cryptographic Hashes
Some of you have probably seen an MD5 signature (or digest) listed next to files you downloaded from the Internet. So what does it mean?
Simply speaking, this signature is used to check the integrity of the file. If some malicious party modified the file before or during the download, the signature computed from the modified file would be different, and you would know the file has been altered. The signature is generated by a cryptographic hash function, which is designed so that the following are computationally infeasible:
- finding a (previously unseen) file that matches a given signature
- finding collisions, wherein two different files have the same signature
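To make the integrity check concrete, here is a minimal Python sketch using the standard `hashlib` module. The helper name `md5_signature` and the sample byte strings are my own illustrations, not taken from any particular download site; in practice the published signature would come from the page you downloaded the file from.

```python
import hashlib


def md5_signature(data: bytes) -> str:
    """Return the MD5 digest of data as 32 hex characters (128 bits)."""
    return hashlib.md5(data).hexdigest()


# Hypothetical example: the publisher hashes the original file contents
# and lists the result next to the download link.
original = b"contents of the file as published"
published_signature = md5_signature(original)

# A copy that arrived intact produces the same signature.
print(md5_signature(original) == published_signature)   # True

# A copy with a single changed byte produces a different signature.
tampered = b"contents of the file as publishex"
print(md5_signature(tampered) == published_signature)   # False
```

Note that this only helps if the published signature itself comes from a source you trust; someone who can replace the file on the server may be able to replace the listed signature too.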
In cryptography, MD5 (Message-Digest algorithm 5) is a widely used cryptographic hash function with a 128-bit hash value. As an Internet standard, MD5 has been employed in a wide variety of security applications, and is also commonly used to check the integrity of files. However, in 2004, serious security flaws were identified by Xiaoyun Wang et al. from Shandong University, China. In 2005, these researchers also broke another cryptographic hash function, SHA-1, which was designed by the National Security Agency, published in 1995, and has been called "the most common cryptographic primitive" on the Internet. (see 1 2 for more details)
But this doesn't mean the government or criminals will be able to spy on your encrypted communications immediately. The attacks published so far find collisions (two different files with the same signature); they do not make it practical to craft a file that matches a given signature. For regular computer users, the break has no sudden repercussions. That is why I think regular people like you and me can still use an MD5 signature to check file integrity.
The following are two good articles on cryptographic hashes.
- http://en.wikipedia.org/wiki/Cryptographic_hash
- http://www.unixwiz.net/techtips/iguide-crypto-hashes.html
There is also HkSFV 2.01, a tiny but powerful free program that uses CRC-32 and MD5 to validate the integrity of files that you have downloaded, moved, or published through a potentially unreliable medium (burned to CD, transferred over the Internet/LAN).
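If you would rather script this kind of check yourself, the sketch below computes both a CRC-32 and an MD5 value for a file using Python's standard `zlib` and `hashlib` modules. It is not how HkSFV is implemented (I have not seen its internals); the function names `file_checksums` and `verify` and the chunk size are assumptions chosen for illustration.

```python
import hashlib
import zlib


def file_checksums(path, chunk_size=65536):
    """Return (crc32_hex, md5_hex) for the file at path.

    The file is read in chunks so large downloads need not fit in memory.
    """
    crc = 0
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)  # continue the running CRC-32
            md5.update(chunk)
    return format(crc & 0xFFFFFFFF, "08x"), md5.hexdigest()


def verify(path, expected_crc32=None, expected_md5=None):
    """Return True if the file matches every expected value that was given."""
    crc_hex, md5_hex = file_checksums(path)
    if expected_crc32 is not None and crc_hex != expected_crc32.lower():
        return False
    if expected_md5 is not None and md5_hex != expected_md5.lower():
        return False
    return True
```

Keep in mind that CRC-32 only catches accidental corruption, such as a bad burn or a flaky transfer. It is not a cryptographic hash, so it offers no protection against deliberate tampering.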
--Kun Liu 00:46, 14 May 2006 (EDT)
NSA has massive database of Americans' phone calls
Leslie Cauley from USA TODAY writes "The National Security Agency has been secretly collecting the phone call records of tens of millions of Americans, using data provided by AT&T, Verizon and BellSouth, people with direct knowledge of the arrangement told USA TODAY.
The NSA program reaches into homes and businesses across the nation by amassing information about the calls of ordinary Americans — most of whom aren't suspected of any crime. This program does not involve the NSA listening to or recording conversations. But the spy agency is using the data to analyze calling patterns in an effort to detect terrorist activity, sources said in separate interviews."
full text http://www.usatoday.com/news/washington/2006-05-10-nsa_x.htm
--Kun Liu 15:16, 11 May 2006 (EDT)
Protecting Your Medical Privacy
By Karen Pallarito
HealthDay Reporter
FRIDAY, May 5 (HealthDay News) -- As a patient, you've probably scribbled your name on countless forms over the years. One of them is an acknowledgement that you have received a copy of your medical provider's privacy policy.
But do you have any inkling what that so-called "Notice of Privacy Practices" actually says?
full text at http://www.wkyt.com/Global/story.asp?S=4867775
--Kun Liu 23:26, 7 May 2006 (EDT)
Encryption Tool Maps Database Fields
By Laurie Sullivan
TechWeb.com
Wed May 3, 4:38 PM ET
Pennsylvania State University researchers say they have developed a software tool that dramatically improves the process of automatically mapping database fields. They say it also encrypts the information with access controls and tokens to share the data safely across multiple organizations.
The Access Control Toolkit (PACT) built on Java and open source tools, such as SQL and OWL parsers, acts as a filter for data being electronically shared between companies, a Penn State professor said Wednesday.
full text at http://news.yahoo.com/s/cmp/20060504/tc_cmp/187003293
--Kun Liu 23:26, 7 May 2006 (EDT)