PII vs. preference data

In my last two posts, I kept coming back to questions concerning personal data: is it anonymous, who has control over it, and how is it used? The cable company knows who you are, so if they’re going to track your viewing preferences, that data has to stay local in order to stay anonymous (see here). Cookies are anonymous, unless they’re matched to data that says who you are.

One issue here is one of language: clearly and consistently distinguishing between personal data that is anonymous and personal data that is not. The latter does have a somewhat standardized name: Personally Identifiable Information (PII). This includes any data that can be used to communicate with you or find out more about you, like name, address, email address, social security number, or bank account number. Completely anonymous data, such as interests, demographics, and recent browsing history, does not seem to have a standard name, but is sometimes called “preference data.”

An important part of the reasoning behind keeping PII private is avoiding the scenario of a central record being created under your name. There are two problems with such a record: (1) you may not want someone with your email address to automatically know your physical address; i.e. each piece of PII should remain separately under your control to disclose, and (2) you may not want someone with your name to automatically know your detailed interests and browsing history; i.e. preference data should not be merged with PII.
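To make the two problems above concrete, here is a minimal sketch in Python of how a site might keep the two kinds of data apart. Everything in it is my own illustration rather than any real system: the names PII_FIELDS, PiiVault, PreferenceProfile, and split_record are hypothetical, and the list of PII fields is just the examples given earlier. The idea is that each piece of PII is released only to a recipient you have explicitly granted it to, and preference data is stored under a random ID that carries no PII.

```python
import uuid
from dataclasses import dataclass, field

# Assumption: the PII examples named in this post, not an official list.
PII_FIELDS = {"name", "address", "email", "ssn", "bank_account"}


@dataclass
class PreferenceProfile:
    """Preference data, keyed only by a random ID with no PII in it."""
    interests: list = field(default_factory=list)
    anon_id: str = field(default_factory=lambda: uuid.uuid4().hex)


@dataclass
class PiiVault:
    """PII fields that are disclosed one at a time, per recipient."""
    fields: dict
    grants: dict = field(default_factory=dict)  # recipient -> set of field names

    def grant(self, recipient, *names):
        unknown = set(names) - set(self.fields)
        if unknown:
            raise KeyError(f"no such PII field(s): {sorted(unknown)}")
        self.grants.setdefault(recipient, set()).update(names)

    def disclose(self, recipient):
        # Only the fields granted to this recipient are returned.
        allowed = self.grants.get(recipient, set())
        return {k: v for k, v in self.fields.items() if k in allowed}


def split_record(record):
    """Split one mixed record into (pii, preferences) so they are stored apart."""
    pii = {k: v for k, v in record.items() if k in PII_FIELDS}
    prefs = {k: v for k, v in record.items() if k not in PII_FIELDS}
    return pii, prefs


if __name__ == "__main__":
    pii, prefs = split_record(
        {"name": "Pat Example", "email": "pat@example.com", "interests": ["cycling"]}
    )
    vault = PiiVault(pii)
    vault.grant("newsletter", "email")
    print(vault.disclose("newsletter"))  # only the email address
    print(vault.disclose("advertiser"))  # nothing was granted
    print(PreferenceProfile(**prefs))    # interests under a random ID
```

Having someone's email address does not hand over their name in this sketch, and the preference profile has no field that links it back to the vault. Whether a real vendor keeps things this separate is, of course, exactly the question.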

In general, we all want as much control as is reasonably possible over our personal data. Of course, normal business transactions almost always involve some loss of this control: if you visit a store or a web site, you automatically provide preference data associated with your behavior at the store or site; and if you buy something, you have to provide PII in order to pay. What generates more concern is when this data is shared with third parties (beyond the necessary, e.g. sending your PII to the bank to effect payment).

One aspect of this concern was expressed in Walt Mossberg’s recent WSJ column. After following a bit of the conversation around it, I can see some themes emerging:

– Some folks may not have noticed Walt’s distinction between “helpful cookies” and “tracking cookies”
– Cookies aren’t the problem, but misuse of them might be
– Blocking of third party cookies can be done, but for many is not practical

One problem here is that third party cookies are not necessarily tracking cookies (they can be associated with any outsourced function such as analytics), and in any case, other ways exist to monitor user actions on a site (Flash, IP address, etc.). In my opinion, it might be best to set aside these distracting technical issues and focus on what’s really important: personal data and what happens to it.
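To illustrate why “third party” and “tracking” are separate questions, here is a rough Python sketch of what “third party” means mechanically. It rests on assumptions I should flag: the function names site_of and is_third_party are made up for this post, and the “last two labels of the hostname” rule is a simplification. Real browsers consult the Public Suffix List to decide where one site ends, so this naive rule gets domains like example.co.uk wrong.

```python
from urllib.parse import urlsplit


def site_of(host):
    """Naive 'site' of a hostname: its last two labels.

    Assumption: good enough for simple .com-style names only; real
    browsers use the Public Suffix List instead of this shortcut.
    """
    labels = host.lower().strip(".").split(".")
    return ".".join(labels[-2:])


def is_third_party(page_url, resource_url):
    """True if the resource is served from a different site than the page."""
    page_host = urlsplit(page_url).hostname or ""
    resource_host = urlsplit(resource_url).hostname or ""
    return site_of(page_host) != site_of(resource_host)


if __name__ == "__main__":
    page = "https://www.shop.example/cart"
    # Same site, different subdomain: first party.
    print(is_third_party(page, "https://cdn.shop.example/app.js"))
    # Different site: third party, whether it is analytics or ad tracking.
    print(is_third_party(page, "https://stats.metrics.test/pixel.gif"))
```

Note that nothing in this check says what the other site does with the request. An outsourced analytics service and an ad network look identical here, which is why I think the useful question is what happens to the data, not which domain set the cookie.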

As stated above, by visiting a site you already provide preference data, and if you register, subscribe, or buy something, you provide PII. Thus it seems inevitable that vendors know something about you, their customer. My feeling is that Walt’s problem isn’t with tracking cookies per se; it’s with third party tracking cookies, i.e. he doesn’t like vendors sharing his data or providing it to advertisers.

A good point made by others is that this happens offline all the time, and that the laws governing this sharing are inconsistent. Putting this aside for the moment, I think one thing everyone would agree on is that sharing of PII is certainly a bigger concern than sharing of preference data. At least for me, I’m not that concerned with the sharing of my anonymous, and therefore transient, preference data, but I sure don’t want anyone sharing my PII with anyone they don’t have to.

Well, it seems as though the companies that place third party tracking cookies (“network advertisers”) have responded to this concern: a glance at the blogosphere leads you to the Network Advertising Initiative (NAI), which seems to include all the major players. The NAI requires adherence to a set of principles and provides for complaints and enforcement. Unfortunately, these principles allow for merging of tracking data with PII, if the user is “properly notified.” Fortunately, at present not one member network takes advantage of this.

The NAI also requires that all publishers using tracking cookies provide a link to an “opt out” page at the NAI. However, this opt-out has to be performed for each ad network separately, and I’ve never seen such a link; it’s probably buried in the privacy policy no one reads. My feeling is that the NAI is a good proactive step by the industry in the right direction, but that it should be tightened up, and in particular provide an iron-clad guarantee that PII won’t be shared. Otherwise, people’s concerns will lead to actions that are not necessarily informed by a good understanding of this complex issue, and that might even degrade the utility of the web itself by focusing on specific technical aspects like cookies.

UPDATE: Or alternatively, people could focus on specific companies instead of on the underlying personal data issues…

