Thursday, August 30, 2007

Slowing down: 17 minutes for privacy

In the era of the soundbite and the tabloid headline, it's almost startling to be invited to talk on radio about privacy at Google for 17 minutes. I don't normally believe in cross-posting media stuff into this blog, but it's not every day that you get a chance to talk about things slowly, in depth. The audio link is here:

http://oe1.orf.at/highlights/107732.html

All this is in connection with the Ars Electronica Privacy Symposium in Linz, Austria.

IP Geolocation: knowing where users are – very roughly

A lot of Internet services take IP-based geolocation into account. In other words, they look at a user's IP address to try to guess the user's location, in order to provide a more relevant service. In privacy terms, it's important to understand the extent to which a person's location is captured by these services. Below are some insights into how precise these systems are (or rather, are not), how geolocation is done, and how it's used in some Google services.

The IP geolocation system Google uses (similar to the approach used by most web sites) is based primarily on third-party data, from an IP-to-geo index. These systems are reasonably accurate for classifying countries, particularly large ones and in areas far from borders, but weaker at city-level and regional-level classification. As measured by one truth set, these systems are off by about 21 miles for the typical U.S. user (median), and 20% of the time they cannot place the user to within 250 miles. The imprecision of geolocation is one of the reasons that it is a flawed model to use for legal compliance purposes. Take, for example, a YouTube video with political discourse that is deemed to be “illegal” content in one country, but completely legal in others. Any IP-based filtering for the country that considers this content illegal will always be over- or under-inclusive, given the imprecision of geolocation.

IP address-based geolocation is used at Google in a variety of applications to guess the approximate location of the user. Here are examples of the use of IP geolocation at Google:

Ads quality: restricting locally-targeted campaigns to relevant users
Google Analytics: letting website owners slice usage reports by geography
Google Trends: identifying top and rising queries within specific regions
Adspam team: using the distribution of clicks by city as an offline click-spam signal
AdWords Frontend: powering the geo reports feature in the Report Center

So, an IP-to-geo index is a function from an IP address to a guessed location. The guessed location for a given IP address can be as precise as a city or as vague as just a country, or there can be no guess at all if no IP range in the index contains the address. There are many efforts underway to improve the accuracy of these systems. But for now, IP-based geolocation is significantly less precise than zip codes, to take an analogy from the physical world.
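To make the mechanics concrete, here is a minimal sketch of such an index in Python. The ranges, locations, and helper names are invented for illustration; a real index holds millions of ranges with far richer (and messier) metadata.

```python
import bisect
import ipaddress

# Each entry: (range_start, range_end, guessed_location), with IPs as integers.
# A real IP-to-geo index holds millions of such ranges from third-party data.
_INDEX = sorted([
    (int(ipaddress.ip_address("81.57.0.0")),   int(ipaddress.ip_address("81.57.255.255")),  "Paris, France"),
    (int(ipaddress.ip_address("66.249.64.0")), int(ipaddress.ip_address("66.249.95.255")),  "United States"),  # country-level only
])

_STARTS = [start for start, _, _ in _INDEX]

def guess_location(ip: str):
    """Return a guessed location for an IP, or None if no range contains it."""
    n = int(ipaddress.ip_address(ip))
    i = bisect.bisect_right(_STARTS, n) - 1   # candidate range: last start <= n
    if i >= 0 and _INDEX[i][1] >= n:          # check the address falls inside it
        return _INDEX[i][2]
    return None                               # no guess at all

print(guess_location("81.57.12.34"))   # "Paris, France" (city-level guess)
print(guess_location("10.0.0.1"))      # None (not in the index)
```

Note how the same lookup can return a city, only a country, or nothing at all, which is exactly the variable precision described above.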

Tuesday, August 28, 2007

Do you read privacy policies, c'mon, really?

What’s the best way to communicate information about privacy to consumers? Virtually all companies do this in writing, via privacy policies. But many are not easy to read, because they are trying to do two (sometimes contradictory) things, namely, provide consumers with information in a comprehensible format, while meeting legal obligations for full privacy disclosure. So, should privacy policies be short (universally preferred by consumers) or long (universally preferred by lawyers worried about regulatory obligations)? Perhaps a combination of the two is the best compromise: a short summary on top of a long complete privacy policy, the so-called “layered” approach. This is the approach recommended in a thoughtful study by the Center for Information Policy Leadership:
http://www.hunton.com/files/tbl_s47Details/FileUpload265/1405/Ten_Steps_whitepaper.pdf

But then I’m reminded of what Woody Allen said: “I took a speed reading course and read ‘War and Peace’ in twenty minutes. It involves Russia.” Yes, privacy summaries can be too short to be meaningful.

Indeed, maybe written policies aren’t the best format for communicating with consumers, regardless of whether they’re long or short. Maybe consumers prefer watching videos. Intellectually, privacy professionals might want consumers to read privacy policies, but in practice, most consumers don’t. We should face that reality. So, I think we have an obligation to be creative, to explore other media for communicating with consumers about privacy. That’s why Google is exploring video formats. We’ve just gotten started, and so far, we’ve only launched one. We’re working on more. Take a look and let me know what you think. Remember, we’re trying to communicate with “average” consumers, so don’t expect a detailed tech tutorial.

http://googleblog.blogspot.com/2007/08/google-search-privacy-plain-and-simple.html

Personally, I’ve also been trying to talk about privacy through other video formats, with the media. Below is just one example. I don’t know if all these videos are the right approach, but I do think it’s right to be experimenting.

http://www.reuters.com/news/video/videoStory?videoId=57250

Did you read the book, or watch the movie?

Monday, August 27, 2007

Data Protection Officers according to German law

Some of you might be interested in German law on data protection officers. I’m going to give this to you in factual terms. [This isn’t legal advice, and it’s not commentary: so, I’m not commenting on how much or little sense I think this makes in practice.]

Since August 2006, according to the German Data Protection Act, the appointment of a Data Protection Officer (“DPO”) is compulsory for any company or organization employing more than nine employees in its automated personal data processing operations.

Anyone appointed as DPO must have the required technical and legal knowledge and reliability (Fachkunde und Zuverlässigkeit). He or she need not be an employee, but can also be an outside expert (i.e., the work of the official can be outsourced). Either way, the official reports directly to the CEO (Leiter) of the company; must be allowed to carry out his or her function free of interference (weisungsfrei); may not be penalized for his or her actions; and can only be fired in exceptional circumstances, subject to special safeguards (but note that this includes being removed as DPO at the suggestion of the relevant DPA). The company is furthermore required by law to provide the official with adequate facilities in terms of office space, personnel, etc.

The main task of the DPO is to ensure compliance with the law and any other data protection-relevant legal provisions in all the personal data processing operations of his or her employer or principal. To this end, the company must provide the DPO with an overview of its processing operations, which must include the information that would otherwise (had the company not appointed a DPO) have had to be notified to the authorities, as well as a list of persons who are granted access to the various processing facilities. In practice, it is often the first task of the DPO to compile a register of this information and suggest appropriate amendments (e.g., clearer definitions of the purpose(s) of specific operations, or stricter rules on who has access to which data). Once a DPO has been appointed, new planned automated processing operations must be reported to him or her before they are put into effect.

The DPO’s tasks also include verifying the computer programs used and training the staff working with personal data. More generally, he or she has to advise the company on relevant operations, and to suggest changes where necessary. This is a delicate matter, especially if the legal requirements are open to different interpretations. The Act therefore adds that the official may, “in cases of doubt”, contact the relevant DPA. However, except in the special context of a “prior check”, the Act does not make this obligatory.

It is important to note that the DPO in Germany is not just a cosmetic function, and it is important for both the company and the DPO to take the role seriously. Thus, the DPO must be given sufficient training and resources to do the job properly. Failure to take the DPO function seriously can have serious legal consequences, both for the company and the DPO.

When appointing a DPO, it is important to identify potential incompatibilities and conflicts of interest between this position and the person's other positions within the company. Non-compliance with the law is an administrative offense punishable by a fine of up to €25,000. Moreover, the DPA can order the dismissal of the DPO if he or she also holds a position which is incompatible with the role of DPO. Finally, non-compliance may give rise to liability under the Act.

Unfortunately, there is no clear picture with regard to conflicts of interest, and much depends on local requirements and the views of local DPAs. In general, the following positions are considered to be incompatible with the position of a DPO:

CEO, Director, corporate administrator, or other managerial positions that are legally or statutorily required
Head of IT/ IT Administrator
Head of HR
Head of Marketing
Head of Sales
Head of Legal
Executives of corporate units processing large volumes of personal data or sensitive personal data

Employees in the administrative department and employees in the legal department are generally considered to have no conflict of interest. Finally, views differ considerably with regard to the positions of internal auditor and head of corporate security. An IT security manager can be appointed if he or she is organizationally independent of the IT department.

Finally, German law does not provide for a “Group DPO” that oversees a group of companies or a holding (Konzerndatenschutzbeauftragter). Such a DPO needs to be appointed separately by every single entity, and also has to put local data protection coordinators in place.

Tuesday, July 17, 2007

Safe Harbor: the verification problem

A company that signs up to comply with the provisions of the Safe Harbor Agreement for the transfer of personal data from Europe to the US must have a process to verify its compliance. There’s very little in the way of “official” guidance on this question. I’ve spent some time trying to figure out how companies can verify compliance. Here are three options, and companies should choose the model that fits best with their corporate culture and structure.

Traditional Audits

A company can conduct a traditional audit of privacy practices company-wide. The problem with company-wide audits based on traditional checklists, however, is that no two people read the checklist the same way, and all the incentives are to be brief and forgetful when filling out a form. If the checklist is used by an interviewer, the return on investment of time goes up in terms of quality of information, but only insofar as the interviewer knows the product and the law well enough to ask the right questions. The bigger and more diverse the company, the more daunting the task and the less consistent the information collected.

The traditional auditor approach to verification usually includes massive checklists, compiled and completed by a large team of consultants, usually driven by outputs that require formal corrective action reporting and documented procedures, and cost a fortune. To an auditor, verification means proof, not process; it means formal procedures that can be tested to show no deviation from the standard, and corrective action steps for procedures that fail to consistently deliver compliance.

Alternative Model – Data Flow Analysis

An alternative model involves a simpler procedure focusing on risk. It starts from the observation that a company is least at risk when it collects information, and that the risk increases as it uses, stores and discloses personal information to third parties. The collection risk is mitigated through notice of the company’s privacy practices; IT security policies that include authorizations for access and use of information mitigate the risks associated with storage and use; and strong contractual safeguards mitigate the risk on disclosure of personal information.

A sound privacy policy is built around understanding how data flows through an organization. Simply put, you ask the following four questions:

What personal information do you collect?
What do you use it for?
Where is it stored and how is access granted to it?
To whom is it disclosed?

The results must then be compared to the existing privacy policy for accuracy and completeness. The best way to do that is on the front-end of the interview, not after the fact. In other words, preparation for each interview should include a review and analysis of the product and the accompanying policy.

A disadvantage of the above approach is that it is somewhat labor-intensive and time-consuming. Note, however, that this procedure is not a traditional audit, which can take far longer, cost much more and generally is backward-looking (i.e., what did you do with data yesterday?). Instead, the data flow analysis identifies what the company does with data on an ongoing basis and, armed with that knowledge, permits the company to continuously improve its privacy policies – it is a forward-looking approach that permits new internal tools or products to be developed around the output. For example, one side benefit of this approach is that every service would yield up the data elements captured and where they are stored.
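As a toy illustration of the data flow analysis, here is a sketch in Python. The record fields mirror the four questions above; the product, data elements, and the comparison against a stated privacy policy are all invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class DataFlowRecord:
    """Answers to the four data-flow questions for one product or service."""
    product: str
    collected: set[str]          # what personal information is collected
    purposes: set[str]           # what it is used for
    storage: str                 # where it is stored and how access is granted
    disclosed_to: set[str] = field(default_factory=set)  # to whom it is disclosed

# Hypothetical inventory compiled from product interviews.
inventory = [
    DataFlowRecord("web_store", {"name", "email", "purchase_history"},
                   {"order fulfillment", "marketing"},
                   "EU datacenter, role-based access",
                   {"payment processor"}),
]

# Data elements and recipients the published privacy policy actually discloses.
policy_collected = {"name", "email"}
policy_recipients = {"payment processor"}

# Compare practice against policy: anything collected or shared but not
# disclosed is a gap to fix (in the policy or in the product).
for rec in inventory:
    undisclosed = rec.collected - policy_collected
    unlisted = rec.disclosed_to - policy_recipients
    if undisclosed or unlisted:
        print(rec.product, "policy gap:",
              "collects", sorted(undisclosed),
              "shares with", sorted(unlisted))
```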

Sub-Certification Method

There is yet one more alternative – the use of SOX-like sub-certifications to verify the accuracy and completeness of product or service privacy statements. Sarbanes-Oxley requires that the company’s CFO and CEO certify that the information provided to the public regarding the company’s financial matters is true. In order to make that certification, most companies have established a system of sub-certifications, in which those officers and employees with direct, personal knowledge of the underlying facts certify up the chain that the information is correct.

The same could be done in regard to privacy. There is a two-fold advantage from this approach. First, it emphasizes the importance of the information collection by attaching to it the formality of a certification. Second, it can inform a training program as it forces periodic review of the policy and therefore attention to its existence and relevance.

How granular should the inquiry be at the product level? In a distributed model of verification, the manner and means of confirming the accuracy of the content can be left to the entrepreneurial talents of the managers. The key is to ensure that the information provided is complete and accurate, and that the product lead and/or counsel are willing to certify the results.

There is very little guidance publicly available that informs the process of an in-house review, but it is hard to criticize the very same process accepted for validation of a company’s financial statements upon which individual consumers and investors rely for financial decision-making.

Monday, July 16, 2007

Safe Harbor Privacy Principles

Some privacy advocacy groups have made the claim (and others have repeated it) that Google doesn’t comply with any "well-established government and industry standards such as the OECD Privacy Guidelines." That’s just plain incorrect. Google complies with the robust privacy requirements of the US-EU Safe Harbor Agreement, as disclosed in its Privacy Policy. http://www.google.com/intl/en/privacy.html

The Safe Harbor privacy principles are generally considered to exceed the requirements of the OECD Privacy Guidelines, since they were designed to provide an equivalent level of privacy protection to the laws of the European Union. http://www.export.gov/safeharbor/
As a reminder, here are the privacy principles of the Safe Harbor Agreement:

WHAT DO THE SAFE HARBOR PRINCIPLES REQUIRE?

Organizations must comply with the seven safe harbor principles. The principles require the following:

Notice
Organizations must notify individuals about the purposes for which they collect and use information about them. They must provide information about how individuals can contact the organization with any inquiries or complaints, the types of third parties to which it discloses the information and the choices and means the organization offers for limiting its use and disclosure.
Choice
Organizations must give individuals the opportunity to choose (opt out) whether their personal information will be disclosed to a third party or used for a purpose incompatible with the purpose for which it was originally collected or subsequently authorized by the individual. For sensitive information, affirmative or explicit (opt in) choice must be given if the information is to be disclosed to a third party or used for a purpose other than its original purpose or the purpose authorized subsequently by the individual.
Onward Transfer (Transfers to Third Parties)
To disclose information to a third party, organizations must apply the notice and choice principles. Where an organization wishes to transfer information to a third party that is acting as an agent, it may do so if it makes sure that the third party subscribes to the safe harbor principles or is subject to the Directive or another adequacy finding. As an alternative, the organization can enter into a written agreement with such third party requiring that the third party provide at least the same level of privacy protection as is required by the relevant principles.
Access
Individuals must have access to personal information about them that an organization holds and be able to correct, amend, or delete that information where it is inaccurate, except where the burden or expense of providing access would be disproportionate to the risks to the individual's privacy in the case in question, or where the rights of persons other than the individual would be violated.
Security
Organizations must take reasonable precautions to protect personal information from loss, misuse and unauthorized access, disclosure, alteration and destruction.
Data integrity
Personal information must be relevant for the purposes for which it is to be used. An organization should take reasonable steps to ensure that data is reliable for its intended use, accurate, complete, and current.
Enforcement
In order to ensure compliance with the safe harbor principles, there must be (a) readily available and affordable independent recourse mechanisms so that each individual's complaints and disputes can be investigated and resolved and damages awarded where the applicable law or private sector initiatives so provide; (b) procedures for verifying that the commitments companies make to adhere to the safe harbor principles have been implemented; and (c) obligations to remedy problems arising out of a failure to comply with the principles. Sanctions must be sufficiently rigorous to ensure compliance by the organization. Organizations that fail to provide annual self certification letters will no longer appear in the list of participants and safe harbor benefits will no longer be assured.


While the Safe Harbor Agreement principles were designed as a framework for companies to comply with European-inspired privacy laws, the OECD Guidelines from the year 1980 were designed as a framework for governments to create privacy legislation. http://www.oecd.org/document/18/0,2340,en_2649_34255_1815186_1_1_1_1,00.html
The US has chosen to not (yet) implement those principles into its Federal legislation. As a public policy matter, in the US, Google is working with other leading companies to encourage the development of robust Federal consumer privacy legislation. http://googleblog.blogspot.com/2006/06/calling-for-federal-consumer-privacy.html
I’ll come back to the issue of US Federal and global privacy standards again soon. The global nature of data flows on the Internet requires renewed focus on the need for global privacy standards. I hope privacy advocates will work with us on that.

Monday, July 9, 2007

I know people who spent their entire childhood hiding from the German government

Governments around the world are asking whether they should restrict anonymity on the Internet in the name of security. Take Germany as an example. Should Internet service providers be required to verify the identity of their users? Germany recently proposed – and then retreated from – a requirement that providers of email services verify the identity of their account holders. However, Germany is on the path to requiring that providers of VoIP services verify the identity of their users. The debates about the proper limits of anonymity on the Internet are profound. In case you’re interested in the details, here is a history of the proposals in Germany, from the drafts of the telecommunication surveillance act. German outside counsel summarized these for me.

* 8 Nov. 2006 - First draft submitted to the Government. The German Ministry of Justice put together the first draft of a law designed to reform telecommunications monitoring and to implement the directive adopted by the European Union on the retention of traffic and location data. This draft contained the proposal that email service providers should be obliged to COLLECT and to STORE account data: name, address, date of birth, start date of the contractual relationship (proposed changes to § 111 TKG).

* 18 April 2007 - First Draft of the German Government - "Regierungsentwurf". The draft of the German Government did not include an obligation for email service providers to COLLECT personal information. It contained, however, the obligation to STORE a personal identifier as well as the name and address of the account holder IF the provider collects such data (proposed changes to § 111 TKG).
Text: http://www.bmj.bund.de/files/-/2047/RegE%20TK%DC.pdf

* 29 May 2007 - Recommendation ("Empfehlung") of various working groups to the German Federal Council (Bundesrat). The text did not propose additional requirements for email service providers to collect or to store personal data. However, it recommended that telecommunication service providers should be obliged to verify, via the official ID card, that the telecommunication user is the person who signed up for the service (proposed changes to § 95 sec. 4 sent. 1 TKG). German legal experts expressed the opinion that this might also be applicable to email services.

* 8 June 2007 - Statement of the German Federal Council (Bundesrat) - "Stellungnahme des Bundesrates". The Bundesrat did not follow the recommended wording and did not suggest any changes to the First Draft of the German Government of 18 April 2007 with regard to email services.

So, in conclusion, anonymous use of Internet services is very much up in the air in Germany as regards certain services, such as VoIP services like Google Talk, even if the proposal to limit anonymity for email users appears to be off the table. Fundamental rights are in play. The age-old trade-off between government security and privacy is being re-debated. I know people who spent their entire childhood hiding from the German government.

Saturday, June 23, 2007

The Working Party


The Working Party is a group of representatives from every European country’s data protection authority plus the European Commission, dedicated to working on the harmonized application of data protection across Europe. I think I have the (perhaps dubious) distinction of being the private sector privacy professional who has worked the most with this group in the last decade. Most of my peers avoid the Working Party like the plague, but I agree with Mae West, who said, “Too much of a good thing is wonderful.”

In my many years of privacy practice, I’ve always thought the best strategy is to work constructively with the Working Party. They are thoughtful privacy regulators, trying to improve privacy practices and to enforce often-unclear data protection laws. The companies I worked for are committed to improving their privacy practices and to complying with European laws. And the Working Party itself is committed to becoming more effective at working with the private sector, and in particular with the technology sector. So, based on my many years of experience, how could this all work better? And by the way, if you think I’ll be biased and self-serving in making these observations, feel free to stop reading here.

Here’s my golden rule: when regulators want to change practices across an entire industry, then they shouldn’t just work with one company. To make the point, here’s a little timeline summary of the recent Working Party exchanges with Google.

November 2006: the international data protection authorities issued a resolution calling on all search companies to limit the time periods during which they retain personally-identifiable data. No leading search company publicly disclosed a finite retention period at this time.

March 2007: Google chose to lead the industry by announcing it would anonymize its search server logs after 18-24 months.

This generated considerable positive press, in my opinion quite justified, as the first such move by a leading search company.

May 2007: the Working Party sent Google a letter asking it to explain its retention decisions, and to justify whether this period was “too long” under European data protection principles. This set off a worldwide press storm, as hundreds of newspapers ran headlines like: “Google violates EU data protection laws.” And many of the EU privacy regulators added fuel to the media flames, as they issued comments expressing their concerns about “Google”, or even declaring Google’s practices to be “illegal”, without even waiting for Google to respond to their letter.

June 2007: Various privacy advocates jumped on the publicity bandwagon. One even went so far as to declare Google to be the “worst” in terms of privacy, due to the vagueness of its data collection and data retention practices. But since Google was the only one of the entire list of companies to have publicly stated a finite retention period, I would have thought Google should have been declared the “best.” Of course, that report was thoroughly de-bunked by more thoughtful industry observers, such as Danny Sullivan: “Google Bad on Privacy? Maybe it’s Privacy International’s Report that Sucks.” http://searchengineland.com/070610-100246.php

Nonetheless, the press damage was done. Even my dad called me after reading his small-town Florida newspaper to ask me why I was so bad at my job. Argh.

Then, I published a long open letter explaining the factors Google took into account while announcing a new retention period of 18 months: privacy, security, innovation, retention obligations. http://googleblog.blogspot.com/2007/06/how-long-should-google-remember.html
I wanted us to be transparent about our analysis and the factors that guided it. Of course, I couldn’t really describe all the security reasons for log retention: you can’t describe all your security practices publicly without undermining your security. And you can’t describe all your uses of data for search algorithm improvements without revealing trade secrets to your competitors. But nonetheless, I think we have been remarkably transparent throughout this process. Meanwhile, our competitors have been completely, studiously silent.

Finally, the Working Party realized how unfair all this had become for Google, and told the press that its sub-group, called the Internet Task Force, would consider these issues further in July, and include other search companies in the review.

I’m quite eager to hear from other search companies. I undertook a thorough and thoughtful analysis of Google’s need for logs for these various (sometimes conflicting) purposes. I am intellectually curious to understand whether our peer companies balance these factors in the same way as we did, or differently. Will they announce retention periods too? And will they announce periods that are longer or shorter than ours?

Privacy on the Internet concerns everyone, and all companies. The Working Party has got to learn how to engage with the industry. I remain committed to working with the Working Party, but I fear that other companies in the industry will draw the opposite lesson: keep a low profile and try as hard as possible not to make it onto its radar screen. That would be bad for privacy. Well, the Working Party is a work in progress. And I hope someone tells my dad I’m not doing such a bad job… Or maybe my studiously-silent peers were right, and I was wrong…?

Thursday, June 14, 2007

Server Logs and Security

I recently posted a blog entry to explain why Google retains search server logs for 18 months before anonymizing them.
http://googleblog.blogspot.com/2007/06/how-long-should-google-remember.html
Security is one of the important factors that went into that decision. Google uses logs to help defend its systems from malicious access and exploitation attempts. You cannot have privacy without adequate security. I've heard from many people, all agreeing that server logs are useful tools for security, but some asking why 18 months of logs are necessary. One of my colleagues at Google, Daniel Dulitz, explained it this way:

"1. Some variations are due to cyclical patterns. Some patterns operate on hourly cycles, some daily, some monthly, and others...yearly. In order to detect a pattern, you need more data than the length of the pattern.

2. It is always difficult to detect illicit behavior when bad actors go to great lengths to avoid detection. One method of detecting _new_ illicit behaviors is to compare old data with new data. If at time t all their known characteristics are similar, then you know that there are no _new_ illicit behaviors visible in the characteristics known at time t. So you need "old" data that is old enough to not include the new illicit behaviors. The older the better, because in the distant past illicit behaviors weren't at all sophisticated.

3. Another way of detecting illicit behaviors is to look at old data along new axes of comparison, new characteristics, that you didn't know before. But the "old" data needs to run for a long interval because of (1). So its oldest sample needs to be Quite Old. The older the data, the more previously undetected illicit behaviors you can detect.

4. Some facts can be learned from new data, because they weren't true before. Other facts have been true all along, but you didn't know they were facts because you couldn't distinguish them from noise. Noise comes in various forms. Random noise can be averaged out if you have more data in the same time interval. That's nice, because our traffic grows over time; we don't need old data for that. But some noise is periodic. If there is an annual pattern, but there's a lot of noise that also has an annual period, then the only way you'll see the pattern over the noise is if you have a lot of instances of the period: i.e. a lot of years.

This probably isn't very surprising. If you're trying to learn about whether it's a good idea to buy or rent your house, you don't look only at the last 24 months of data. If you're trying to figure out what to pay for a house you're buying, you don't just look at the price it sold for in the last 24 months. If you have a dataset of house prices associated with cities over time, and someone comes along and scrubs the cities out of the data, it hasn't lost all its value, but it's less useful than it was."
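Point 4 is easy to see with a toy simulation (my illustration, not Google's actual analysis): generate a weak annual pattern buried in noise, and compare how well one year of data versus ten averaged years reveal it.

```python
import math
import random

random.seed(42)
WEEKS = 52

def one_year():
    """A weak annual pattern (amplitude 1) buried in noise (std dev 3)."""
    return [math.sin(2 * math.pi * w / WEEKS) + random.gauss(0, 3)
            for w in range(WEEKS)]

def correlation_with_pattern(series):
    """Crude signal check: correlation of the series with the true pattern."""
    pattern = [math.sin(2 * math.pi * w / WEEKS) for w in range(WEEKS)]
    mean_s = sum(series) / WEEKS
    cov = sum((s - mean_s) * p for s, p in zip(series, pattern))
    norm_s = math.sqrt(sum((s - mean_s) ** 2 for s in series))
    norm_p = math.sqrt(sum(p * p for p in pattern))
    return cov / (norm_s * norm_p)

# One year of data: the pattern is typically nearly invisible under the noise.
print(round(correlation_with_pattern(one_year()), 2))

# Average ten years week-by-week: the noise cancels and the pattern emerges.
years = [one_year() for _ in range(10)]
averaged = [sum(y[w] for y in years) / len(years) for w in range(WEEKS)]
print(round(correlation_with_pattern(averaged), 2))
```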

Monday, June 4, 2007

Did you mean Paris France or Paris Hilton?

Here's an OpEd I contributed to the Financial Times.
http://www.ft.com/cms/s/560c6a06-0a63-11dc-93ae-000b5df10621.html

Published: May 25 2007

There was a survey conducted in America in the 1980s that asked people a deceptively simple question: "Who was shot in Dallas?" For many who had lived through the national trauma of 1963, the deliberations of the Warren Commission, the theories about the grassy knoll and the magic bullet, there was only one answer: JFK. For others, who followed every twist of the Ewing family, the oil barons' ball and Cliff Barnes's drink problem, there was also only one answer: JR.

The point of the survey was to show how the same words can have very different meanings to different people depending on their background and their interests. It is the same idea that is driving Google's personal search service.

Our search algorithm is pretty sophisticated and most people end up with what they want. But there is inevitably an element of guesswork involved. When someone searches for "Paris" are they looking for a guide to the French capital or for celebrity gossip? When someone types in "golf" are they looking to play a round on the nearest course or to buy a Volkswagen car? An algorithm cannot provide all the answers.

But if an algorithm is built to take into account an individual's preferences it has much more chance of guessing what that person is looking for. Personalised search uses previous queries to give more weight to what each user finds relevant to them in its rankings. If you have searched for information about handicaps or clubs before, a search for "golf" is more likely to return results about the game than the car. If you have been checking out the Louvre, you are less likely to have to wade through all the details of a particular heiress's personal life.
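As a toy sketch of this idea (invented scores and topics, not Google's actual ranking algorithm), a personalized ranker can give extra weight to results whose topic matches the topics that dominate the user's past queries:

```python
from collections import Counter

# Topics inferred from the user's past queries (invented for illustration).
history_topics = Counter(["sports", "sports", "travel"])

# Candidate results for the ambiguous query "golf": (title, topic, base_score).
results = [
    ("Volkswagen Golf review",    "cars",   0.90),
    ("Beginner golf swing tips",  "sports", 0.85),
    ("Golf courses near you",     "sports", 0.80),
]

def personalized_score(topic, base_score, history, boost=0.3):
    """Weight the base relevance score by the topic's share of past queries."""
    share = history[topic] / max(1, sum(history.values()))
    return base_score * (1 + boost * share)

ranked = sorted(results,
                key=lambda r: personalized_score(r[1], r[2], history_topics),
                reverse=True)
for title, topic, score in ranked:
    print(title, round(personalized_score(topic, score, history_topics), 3))
# With a sports-heavy history, golf-the-game now outranks golf-the-car.
```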

This makes search more relevant, more useful and much quicker. But it is not for everybody. As the Financial Times has pointed out this week, personalised search does raise privacy issues. In order for it to work, search engines must have access to your web search history. And there are some people who may not want to share that information because they believe it is too personal. For them, the improved results that personalised search brings are not matched by the "cost" of revealing their web history.

The question is how do we deal with this challenge? Stop all progress on personalised search or give people a choice? We believe that the responsible way to handle this privacy issue is to ask users if they want to opt in to the service. That is why Google requires people to open an account and turn on their personalised search functionality. They do not have to give a real name to open a Google account, but even if they cannot be identified, we think they should have to give explicit consent before their web history is used. Unless they do, they will simply have the standard Google search service.

Our policy puts the user in charge. It is not something Google seeks to control. At any time they can turn off personal search, pause it, remove specific web history items or remove the whole lot. If they want, they can take the whole lot to another search engine. In other words personalised search is only available with the consent of the user.

If you think of search as a 300-chapter book, we are probably still only on chapter three. There are enormous advances to be made. In the future users will have a much greater choice of service with better, more targeted results. For example, a search engine should be able to recommend books or news articles that are particularly relevant - or jobs that an individual user would be especially well suited to.

Developing more personalised search results is crucial given how much new data is coming online every day. The University of California, Berkeley estimates that humankind created five exabytes of information in 2002 - double the amount generated in 1999. An exabyte is a one followed by 18 noughts. In a world of unlimited information and limited time, more targeted and personal results can really add to people's quality of life.

If you type "Who was shot in Dallas?" into Google today, the results are as divided as the survey's respondents a quarter of a century ago. But with personalised search you are more likely to get the "right" result for you. Giving users the chance to choose a search that is better for them as individuals is something we are proud of and will continue to build on. After all, the web is all about giving people - you and me - more choice and more information.

Thursday, May 31, 2007

Sweden and government surveillance

All democratic governments need to maintain a delicate balance between 1) respect for the private lives of their citizens, and 2) police and government surveillance to combat crime. The Swedish government has proposed legislation to shift the balance radically towards government surveillance. These measures would have a huge impact on the daily life of every citizen, living inside or outside Sweden. By introducing these new measures, the Swedish government is following the examples set by governments ranging from China and Saudi Arabia to the US government’s widely criticised eavesdropping programme. Do Swedish citizens really want their country to have the most aggressive government surveillance laws in Europe?

Recently, a new bill was introduced allowing the National Defence Radio Establishment (Försvarets radioanstalt, FRA) to intercept internet traffic and telephone conversations that cross Sweden's borders at some point. The FRA claims this additional surveillance power to be essential because terrorists and fraudsters now mainly rely on the internet to communicate. Operators will be obliged to co-operate with the legal authorities by channelling the data about their users to the FRA through so-called collection nodes (samverkanspunkter). While the FRA claims it is not interested in intercepting each citizen's emails and telephone conversations, it will nevertheless have the capability to do so once the bill is adopted. Citizens will not need to be suspected of fraud or any other illegal activity for their communications to be intercepted.

Apart from these stringent surveillance measures, the Minister of Justice also wants to introduce a monitoring duty for internet access providers. Minister Beatrice Ask has indicated that she wants access providers to be responsible for blocking illegal internet content, and that strict legislation would be adopted if the internet service providers do not live up to this responsibility. The Minister's position is remarkable, as European eCommerce legislation explicitly forbids imposing this type of general monitoring obligation on access providers. It also raises the question of which types of content should be considered illegal enough to warrant blocking, and runs the risk of crippling freedom of speech.

Technical experts are not convinced that massively storing and monitoring communication data will indeed aid in the fight against terrorism and fraud. For one thing, terrorists and fraudsters can easily use special tools (such as encryption) to circumvent any wiretapping. Moreover, when telephone companies and internet access providers are required to monitor, filter and store communication data, costly investments are required. In Sweden, as in most European countries, the law provides no proper government compensation for these investments. Obviously, end-users will – literally – pay the price for having their conversations monitored.

Technical feasibility and high costs aside, I think the most important objection against wiretapping and storing data is that they interfere with every citizen's private life, communications and freedom of speech. By storing, and being capable of monitoring, data about every single phone call, fax, email message and website visited, the state effectively undermines the safeguards provided by the European Convention on Human Rights and the European Data Protection Directive.

Sometimes, a government has to make difficult choices. It would be a sad day for Sweden if it passed the most privacy-invasive legislation in Europe, and thereby put itself outside of the mainstream of the global Internet economy. And don't get me wrong, I love Sweden. That's why I care.

Monday, May 7, 2007

Some rules of thumb for online privacy


Here's a short opinion piece that I contributed to this month's edition of .net magazine:
http://www.netmag.co.uk/zine/latest-issue/issue163

Privacy is one of the key legal and social issues of our time. Mobile phones pinpoint where we are to within a few hundred meters. Credit cards record what we like to eat, where we shop and the hotels we stay in. Search engines track what we are looking for, and when. This places a huge duty on business to act responsibly and treat personal data with the sensitivity it deserves.

The Internet is where privacy issues are the most challenging. Any website that collects personal data about its visitors is confronted with an array of legal compliance obligations, as well as ethical responsibilities. I deal with these every day, and here are some of my rules of thumb.

First, be very clear about whether your site needs to collect “personal data” or not. “Personal data” is information about an identifiable human being. You may wish to construct your site to avoid collecting personal data, and instead only collect anonymous and statistical information, thereby avoiding all the compliance obligations of privacy law. For example, we designed Google Analytics to provide anonymous and statistical reports to the websites that use it, giving them information about their visitors in ways that do not implicate privacy laws (e.g., the geographic distribution of their visitors). Even the UK Information Commissioner’s website uses Google Analytics, and I think the disclosure that they put on their site is a best practice in terms of transparency to end users: http://www.ico.gov.uk/Global/privacy_statement.aspx
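Here is a minimal sketch of that aggregate-only approach in Python (invented for illustration; this is not how Google Analytics is actually implemented): counters are incremented per country, and nothing about any identifiable visitor is ever stored.

```python
from collections import Counter

# Aggregate-only analytics: we increment counters and never store anything
# about an identifiable visitor (no IP, no cookie ID, no per-user records).
visits_by_country = Counter()

def lookup_country(ip_address: str) -> str:
    # Stand-in for a real IP-to-geo index (see the geolocation discussion
    # earlier in this blog); invented rule for illustration.
    return "FR" if ip_address.startswith("81.") else "US"

def record_visit(ip_address: str) -> None:
    country = lookup_country(ip_address)   # derive the statistic ...
    visits_by_country[country] += 1        # ... and discard the IP itself

for ip in ["81.57.12.34", "66.249.64.1", "81.20.1.1"]:
    record_visit(ip)

print(visits_by_country.most_common())    # [('FR', 2), ('US', 1)]
```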

Second, if your site collects “personal data”, then you must post a privacy policy. Most sites choose to display it as a link on the bottom of each page. A privacy policy is a legal document, in which you provide “notice” to your visitors about how your site will collect and use their personal data, as well as obtain their “consent”. Because it’s a legal document, it needs to be drafted carefully. But that doesn’t mean that it needs to sound like it was written by lawyers. I think the best privacy policies are short, simple, and easy to read. If you have a complicated site, like Google’s, then it’s a good idea to present the privacy policy in a layered architecture, with a short, one-page summary on top, with links to the fuller policy, and/or with links to privacy policies for specific products or services within your site. Take a look and see if you like our model: http://www.google.com/privacy.html

Third, if your site collects “sensitive” personal data, such as information about a person’s health, sex life, or political beliefs, then you will have to obtain their explicit opt-in consent. In fact, it’s usually a good idea to obtain a user’s opt-in consent anytime your site collects personal data in an unusual, or particularly broad way that the average Internet user might not be aware of. Remember, the privacy legal standard for using a person’s personal data is “consent”, so deciding on the right level of consent will always depend on the facts and circumstances of what your site does.

Fourth, EU data protection law places restrictions on the transfer of personal data from Europe to much of the rest of the world, to places that are deemed not to have “adequate” data protection, such as the US. So, if your site operates across borders, then you should find a legal mechanism for this transfer. Google has signed up to the terms of the US-EU Safe Harbor Agreement, which legitimizes the transfers of personal data from Europe to the US, as long as the company certifies that it will continue to apply the Safe Harbor’s standard of privacy protections to the data. You can read more about that here: http://www.export.gov/safeharbor/
But the Safe Harbor is only one of various alternative methods, including: 1) the explicit consent of the data subject, or 2) “binding corporate rules”, which obligate the company to apply consistent, EU-style privacy practices worldwide, to name just two.

Finally, privacy is about more than legal compliance; it’s fundamentally about user trust. Be transparent with your users about your privacy practices. If your users don’t trust you, you’re out of business.

Tuesday, April 24, 2007

Pour vivre heureux, vivons cachés

It used to be said that “pour vivre heureux, vivons cachés.” If only life were still that simple. But today all of us regularly trust other people with our personal information. Mobile phones pinpoint where we are to within a few hundred meters. Credit cards record what we like to eat, where we shop and the hotels we stay in. Search engines log what we are looking for, and when.

This places a huge duty on business to act responsibly and treat personal data with the sensitivity it deserves – of which more later. But it also raises important questions for governments, which increasingly see the information companies hold on their customers as a valuable weapon in the fight against terrorism.

For decades politicians have had to strike a balance between personal privacy and the power of the police when drafting criminal justice legislation – and generally they have erred on the side of caution, aligning themselves with the rights of the individual. But in the aftermath of the atrocities on 9/11 and the horrendous bombings in Madrid and London, governments globally have sought to redress that balance – giving more power to the police and in the process starting a fierce debate about where the boundary between security and privacy lies.

The Patriot Act in the United States, for example, made it easier for law enforcement agencies to access people’s personal data so that they could more quickly investigate acts of terrorism. It has been widely criticized for over-riding longstanding safeguards designed to protect individual liberty. In Europe politicians have taken a different approach – although the consequences look as if they will be the same: an erosion of personal privacy. The EU Data Retention Directive requires phone operators and Internet companies to store data on their users – such as the emails they send and receive – for between six and 24 months so that the police can use it to investigate serious crimes.

Many people will see nothing wrong with this approach, arguing that it will impact only terrorists and that the innocent have nothing to hide. However, as is so often the case, the problem lies in the detail, which will vary country by country as different governments intend to implement the Directive in different ways. In Italy, for example, the 2005 Act on Urgent Measures to Fight International Terrorism – which effectively anticipated the Directive – led to the suspension of certain privacy provisions in the Italian Data Protection Code. The Act also requires companies to store Internet traffic data for twelve months to help investigate terrorism and serious crime. In Germany the Ministry of Justice has decided that anyone who provides an email service must verify the identity of their customers before giving them an account – effectively ending the use of anonymous email.

The Data Retention Directive is being challenged on many fronts. Some question whether it will actually help in the fight against terrorism when tech savvy people will be able to use the Internet in such a way as to ensure they do not leave tracks that can be traced. Nor is it at all clear that the benefits outweigh the additional security risks posed by the creation of such massive databases. And then there is the whole question of whether this Directive actually applies to non-European-based companies.

Take Google for example. We do not ask our users to prove their identity before giving them an email address – and we think it would be wrong to do so because we believe that people should have the right to use email anonymously. Just think about dissidents. We would therefore challenge any government attempt to try and make us do this. Of course we recognize our responsibility to help the police with their inquiries where they have been through the proper legal process. While most people use the Internet for the purposes it was intended – to help humankind communicate and find information – a tiny minority do not. And it’s important that when criminals break the law they are caught.

But we think personal privacy matters too. From the start Google has tried to build privacy protections into our products at the design stage – for example we have an “off the record” button on our instant messaging services so that people cannot store each other’s messages without permission. And we allow people to use many of our services without registration. Like our search engine, we want our privacy policies to be simple and easy to understand – they are not the usual legal yada yada.

Nor do we believe that there are always right and wrong answers to these complex issues. That’s why we keep our policies under constant review and discuss them regularly with data protection specialists. For example we have recently decided to change our policy on retaining users’ old log data. We will make this data anonymous after 18 to 24 months – though if users want us to keep their logs for longer so that they can benefit from personalized services we will. This change in policy will add additional safeguards for users’ privacy while enabling us to comply with future data retention requirements.
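To illustrate what making log data anonymous can mean in practice, here is a sketch of the general technique (my illustration; the field layout and exact method are assumptions, not Google's actual pipeline): truncate the IP address and drop the cookie identifier.

```python
# Sketch of log anonymization: coarsen the IP by zeroing its last octet and
# remove the cookie ID. Illustrative only; field layout is an assumption.
def anonymize_log_line(line: str) -> str:
    ip, cookie_id, timestamp, query = line.split("\t")
    octets = ip.split(".")
    octets[-1] = "0"                 # 81.57.12.34 -> 81.57.12.0
    return "\t".join([".".join(octets), "-", timestamp, query])

raw = "81.57.12.34\tCOOKIE123\t2007-06-14T10:00:00\thotel in Rio"
print(anonymize_log_line(raw))
# -> 81.57.12.0    -    2007-06-14T10:00:00    hotel in Rio
```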

In the meantime we expect to see the debate on privacy intensify as the Data Retention Directive is passed into law across Europe. The European Union has written both privacy and security into its Charter of Fundamental Rights. Important principles are at stake here – and an open and honest discussion is important if we are to balance these two, often conflicting, principles.

Tuesday, April 17, 2007

Online Ad Targeting

Google’s plan to acquire DoubleClick has refocused attention on the privacy issues in online ad targeting. Let’s be frank: in privacy terms, there are practices in the industry of online ad targeting that are good, others that are bad, and some that could be improved. I am convinced that this acquisition will start a process to improve privacy practices across the ad targeting industry. To improve them, we need to start by understanding them.

We live in an age when vast amounts of content and services are available for free to consumers. And that has been made possible by the growth of online ad targeting, which provides the economic foundations for all this. Given the enormous economic role that ad targeting now plays in sustaining the web, it’s important to analyze it very carefully for privacy implications. Of course, advertising has historically subsidized lots of services before the Internet, such as TV, radio, newspapers etc. And advertisements in those media have always been targeted at their audiences: a TV program on gardening carries different types of ads than a football match, because the advertisers assume their audiences fit different demographic profiles. Although the advertisements are “targeted” based on demographics, they remain anonymous, and hence raise no real privacy issues.

Online, the issues of ad targeting are more complicated, and in terms of privacy practices, there is a wide spectrum. On the responsible end, ad targeting respects the core privacy principles: providing notice to end-users and respecting their privacy choices. On the bad end of the spectrum, “adware”, a type of spyware, is malicious software which engages in unfair and deceptive practices, such as hijacking and changing settings on a user’s machine, and making itself hard to uninstall. Below are thoughts about how to keep ad targeting on the responsible end of the spectrum.

Ad targeting is based on “signals”, and these signals can be either anonymous or “personally-identifiable information” (known as PII). To analyze privacy implications, the first question to ask about ad targeting is whether it is based on anonymous signals or on PII. Moreover, there are roughly two categories of signals (demographic and behavioral), and each of them can be either anonymous or PII.

Anonymous ad targeting is the most common form of ad targeting on the Internet. There are many different types of demographic signals, such as location, language, or age. For example, ads are routinely targeted to people who live in a particular location: an advertiser may wish to target people who live in Paris, which can be done based on the geolocation code in the IP address of end-users. Or an advertiser may wish to target people who speak a particular language, such as French, which can be done based on the language settings in end-users’ browsers or based on language preferences in their cookies. Or an advertiser may wish to target a young demographic, which might be done by targeting ads to sites where young people congregate, such as social networking sites. Anonymous ad targeting can also be based on an end-user’s behavior, such as the keyword search term that someone types. If I type the search “hotel in Rio”, Google may show me an ad for a hotel in Rio. This is a contextual ad, related to the search term, and based on the “behavior” of the person who typed it. It can be done without knowing the identity of the person typing the search.
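Here is a toy sketch of anonymous targeting in Python (the ad inventory, signals, and scoring rule are invented for illustration): every signal comes from the request itself, and nothing identifies the user.

```python
# Toy anonymous ad targeting: signals are the IP-derived location, the
# Accept-Language header, and the query keywords; no identity is involved.
ADS = [
    {"text": "Hôtel pas cher à Paris", "lang": "fr", "geo": "FR", "keywords": {"hotel"}},
    {"text": "Rio beach hotels",       "lang": "en", "geo": None, "keywords": {"hotel", "rio"}},
    {"text": "Golf clubs sale",        "lang": "en", "geo": None, "keywords": {"golf"}},
]

def pick_ad(geo: str, lang: str, query: str) -> str:
    """Pick the ad matching the most anonymous signals from this request."""
    words = set(query.lower().split())
    def score(ad):
        return ((ad["geo"] in (None, geo))       # demographic: location
                + (ad["lang"] == lang)           # demographic: language
                + len(ad["keywords"] & words))   # behavioral: query context
    return max(ADS, key=score)["text"]

print(pick_ad(geo="BR", lang="en", query="hotel in Rio"))  # "Rio beach hotels"
```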

Ad targeting can also be based on PII. For example, a retailer may target ads to me, as an identifiable person, because I have bought particular books from them in the past, and they have developed a profile of my likely interests. The key privacy principles which govern the collection and use of PII are “notice” and “choice”. So, any ad targeting based on PII needs to be transparent to end-users and to respect their privacy preferences.

The use of third-party cookies for ad targeting requires special care. If an end-user visits a site, xyz.com, the browser may receive a cookie from that site; this is known as a first-party cookie, since it was set by the site the end-user was visiting. When a website uses an advertising network to serve ads on its site, the advertising network may place its own cookies on end-users’ machines to help target ads. Because the end-user receives a cookie from the advertising network while on the website xyz.com, the advertising network’s cookies are known as third-party cookies.
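In simplified form (domains and cookie values are invented), the first-party/third-party distinction depends only on whether the domain setting the cookie matches the site being visited:

```python
# Simplified model of first- vs third-party cookies. The cookie jar is
# keyed by the domain that set each cookie.
cookie_jar = {}

def fetch(url_domain: str, set_cookie: str, top_level_site: str) -> None:
    """Record a cookie and classify it relative to the site being visited."""
    cookie_jar.setdefault(url_domain, []).append(set_cookie)
    party = "first-party" if url_domain == top_level_site else "third-party"
    print(f"{url_domain} set a {party} cookie while visiting {top_level_site}")

# The user visits xyz.com; the page itself sets a cookie ...
fetch("xyz.com", "session=abc", top_level_site="xyz.com")
# ... and the embedded ad, served by an ad network, sets its own.
fetch("adnetwork.example", "id=12345", top_level_site="xyz.com")

# Later, on a different site using the same network, the network reads back
# its cookie (id=12345) and can recognize the browser across sites.
print(cookie_jar["adnetwork.example"])   # ['id=12345']
```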

Third-party cookies present particular challenges in terms of transparency and choice to end-users. Some users may not be aware that they are receiving cookies from third-parties at all. Others may be aware of receiving them, but they may not be aware of how to accept or to reject them.

The Network Advertising Initiative (“NAI”) has published a set of privacy principles in conjunction with the Federal Trade Commission. http://www.networkadvertising.org/industry/principles.asp
Among other things, they set standards for notice and choice in the context of ad targeting based on third-party cookies, which have been adopted by many of its member companies, including DoubleClick. These principles require that all websites served by these networks inform their end-users that, to quote:
1) “The advertising networks may place a 3rd party cookie on your computer;
2) Such a cookie may be used to tailor ad content both on the site you are visiting as well as other sites within that network that you may visit in the future.”
In addition to requiring notice to consumers about the use of third-party cookies, the NAI principles mandate that member advertising networks provide an opt-out mechanism for the targeted ad programs they provide.

It seems to me that these NAI principles are right to focus on notice and consent to end-users. As so often, there’s room to scrutinize the individual implementations of these principles. Amongst privacy advocates, we will continue to debate the meaning of “anonymity”, and whether or not the types of unique identifying numbers used in the cookies of advertising networks can be linked with identifiable users under particular circumstances. There is a wide spectrum from “anonymity” to “identifiability”, so there is also a need for a constructive policy debate about the level of anonymity to be expected in online ad targeting. Similarly, there is room for a debate about the way choices are presented to end-users: Are the notices clear? Does the end-user have meaningful choices? Are the end-user’s choices respected?

Most companies facilitating online ad targeting, like DoubleClick, have operated in the background. Because they have generally not been consumer-facing sites, many consumers do not understand how they work. Google only recently announced its plans to acquire DoubleClick, so it’s too early to list any specific privacy improvements that it might try to make, although it’s not too early to start thinking about them.

I think it’s a good thing for people to become more aware of online ad targeting. It’s an industry that has operated in the shadows for too long. The attention that this deal may generate can do a lot of good. In the weeks and months ahead, I’ll be speaking with lots of privacy stakeholders, to solicit their ideas about how privacy practices could be improved in this industry. I’m optimistic that the process to improve transparency and user choice in online ad targeting has gotten a fresh impetus.

Friday, April 6, 2007

Protecting privacy on the Internet


LE MONDE 05.04.07
There’s an old saying: “to live happily, live hidden.” If only life were that simple... Today, we entrust our personal information to third parties. Mobile phones can locate us to within a few hundred meters; credit cards record our favorite dishes, our favorite shops and the hotels we stay in. Search engines remember the date and subject of our searches.

Companies therefore bear a heavy responsibility to treat our personal data with the respect it deserves. But this also raises important questions for governments, which increasingly see the information companies hold about their customers as a valuable weapon in the fight against terrorism.

In the wake of September 11 and the horrific attacks in Madrid and London, governments have generally sought to redefine the balance between privacy protection and police powers by giving more power to the latter. This has sparked a lively debate about the line between security and privacy. In the United States, for example, the Patriot Act made it easier for public authorities to access citizens’ personal data in order to speed up terrorism investigations. The law has been criticized for calling into question long-established safeguards designed to protect individual liberties.

In Europe, public authorities have taken a different approach, but one whose consequences risk being the same: an erosion of privacy protection. In France, the decree of March 24, 2006 sets a one-year retention period for electronic communications data, to assist the police in their criminal investigations. More generally, the EU Data Retention Directive requires telephone operators and Internet service providers to retain all of their subscribers’ connection data for six to twenty-four months, so that the police can use it in investigations of serious crimes.

Few people will object, on the assumption that this will only ever affect terrorists, since the innocent have nothing to hide. But, as so often, the problems will arise in the implementation details, which will vary from country to country. In Germany, for example, the Ministry of Justice has decided that every email service provider must verify the identity of its customers before opening an account for them - in practice prohibiting any anonymous use of email.

The Data Retention Directive has drawn criticism. Some doubt that it can contribute to the fight against terrorism, since computer whizzes will be able to use the Internet without leaving traces. Moreover, it is not clear that the benefits of this legislation outweigh the security risks created by assembling such vast databases of personal information. Finally, many questions arise about the international application of the Directive - particularly for companies established outside the EU.

Take the example of Google. We do not ask our users for identity documents before giving them an email address - and we believe such a requirement would be unjustified, because citizens should retain the right to use email anonymously (just think of political dissidents). That is why we would oppose any government initiative along those lines. We are, however, fully aware of our obligation to assist the police in their investigations, provided the legal framework is respected. While the vast majority of Internet users use the Internet for the purpose for which it was designed - to communicate and to find information - some do not, and it is important that criminals operating on the Net can be prosecuted.

Nevertheless, it seems just as important to us that privacy be protected. From the start, Google has sought to build privacy into its services from the design stage. Our instant messaging service, for example, has a private mode that makes it impossible to record conversations without authorization. We also let people use many of our services without having to register first. And just like our search engine, we want our privacy policy to be simple and clear, so we avoid the usual legal jargon.

Nor do we believe there are simply right or wrong answers to these complex problems. Our policies are therefore reviewed and submitted to data protection specialists. We have decided to change our retention policy for users’ connection data (server logs including the IP address, the date and time of the connection, the search terms, and cookies). These data will be anonymized after eighteen months, or twenty-four months at the latest, except where the law requires longer retention. But users will be able to benefit from personalized services and have these data kept longer if they wish. This new policy will further strengthen users’ privacy, while allowing us to anticipate our data retention obligations. (A sketch of what such anonymization might look like follows this article.)

In the meantime, the debate will no doubt intensify as the Data Retention Directive is implemented across the countries of Europe. The EU has enshrined both privacy and security in its Charter of Fundamental Rights. Major principles are at stake here, and an open and honest discussion is essential if we are to strike the right balance between these two essential, and often contradictory, principles.

Peter Fleischer is head of data protection for Google Europe
Article published in the April 6, 2007 edition
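To make the log-anonymization idea above concrete, here is a minimal sketch, assuming the method is truncating the final octet of the IP address and unlinking the cookie identifier once a record ages past the retention horizon. This is purely illustrative — the field names, the truncation rule, and the cutoff are assumptions, not a description of Google’s actual process.

```python
# Minimal sketch of one way a search log record could be anonymized
# after the retention period: truncate the IP address and drop the
# cookie identifier. Illustrative only; not Google's actual process.

from datetime import datetime, timedelta

RETENTION = timedelta(days=18 * 30)   # roughly the eighteen-month horizon

def anonymize(record, now):
    """Strip identifying fields from a log record once it ages past RETENTION."""
    if now - record["timestamp"] < RETENTION:
        return record                              # still within retention
    anon = dict(record)
    octets = anon["ip"].split(".")
    anon["ip"] = ".".join(octets[:3] + ["0"])      # 203.0.113.7 -> 203.0.113.0
    anon["cookie_id"] = None                       # unlink the cookie identifier
    return anon

record = {"ip": "203.0.113.7", "timestamp": datetime(2005, 6, 1),
          "query": "hotel in rio", "cookie_id": "abc123"}
print(anonymize(record, datetime(2007, 4, 5)))
```

The point of the sketch is that the query itself can survive for statistical purposes while the fields that tie it to a particular machine or browser are removed.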

Saturday, March 31, 2007

Stop! Make sure you’re on the white list!

The European Data Protection Directive divides the countries of the world into two lists: the white list (with “adequate” data protection) and the black list (without “adequate” data protection). All the EU countries automatically get on the white list. The European privacy regulators have the unenviable task of assigning other countries to that list, and they have taken a very conservative approach, only putting countries on that list that have a clone of EU-style data protection. So, Argentina and the Channel Islands are deemed to have “adequate” data protection, but the USA is not. In other words, data flows from Europe to such places as Bulgaria, Romania and Argentina are unimpeded by regulatory constraints, but similar flows to the USA are subject to considerable regulatory process. Of course, all this exists in a parallel universe, rather divorced from reality. I doubt many people in Europe would honestly believe that their data is more protected in Argentina or Bulgaria than in the USA.

It’s time to scrap these artificial concepts. White lists and black lists are inherently unfair, and they simply do not reflect the realities of privacy protection, especially when they rest on arbitrary legalistic distinctions far divorced from how data actually moves. Such concepts might have been defensible in the days before the Internet, when global transfers of data were rare, but they are patently absurd in the era of the World Wide Web, when data zips around the planet with the click of a mouse.

I’m all for robust data protection legal obligations. What we really need are global standards. You don’t get those by creating silly white lists and black lists. And if you don’t agree, you can always choose to move all your sensitive data to Argentina. It’s on the white list.

Binding Corporate Rules: Data Protection for the Rich

Yes, the rich are different. They can afford to spend millions in fees and years in regulatory process, all in the hope that their “binding corporate rules” will be approved by 27 different EU regulators, all applying slightly different rules. Whether all this money results in better privacy is dubious. I have never believed that regulatory paperwork by itself improves privacy practices. Indeed, every euro of a privacy professional’s budget that is spent on such paperwork is not being spent on other things, like privacy training for employees or improving privacy systems. Even the rich have budgets.

The concept of “binding corporate rules” is rather weird: a company makes a promise to itself, or rather its various affiliates make a promise to their parent company, or the other way around. And the promise is essentially to respect the law. In other words, to respect EU data protection concepts governing the transfer of personal data outside of the EU to countries that are not deemed to have “adequate” data protection. In case you’re wondering, Bulgaria and Romania have “adequate” data protection, but the USA does not... I’ll come back to that in another blog post. In essence, “binding corporate rules” are a solution to an artificial problem: the legal presumption that any data transfer outside of Europe will not have “adequate” privacy protection unless it fits into some sort of exception, like “binding corporate rules” or the Safe Harbor Agreement.

The reason that “binding corporate rules” are so expensive, and so well-loved by the legions of outside counsel who help their clients try to complete them, is that they require a company to document its data handling processes to the satisfaction of every data protection regulator in every European jurisdiction in which it operates. A recent effort to streamline the process adopted the concept of “lead regulator” – a concept well known in many other regulatory fields in EU law – but still retained the legal obligation to obtain approval from all the other regulators. You can read the recommendation of the WP29 from January 10, 2007 here:
http://ec.europa.eu/justice_home/fsj/privacy/docs/wpdocs/2007/wp133_en.doc
Since all these independent regulators are free to have a different opinion than the lead regulator, it’s hard to see how companies with complicated data processing practices are ever going to obtain the unanimity required to have their “binding corporate rules” approved. And in fact, almost none have. GE famously obtained approval of its “binding corporate rules” after spending tons of time and money, but only for its human resources data. And GE and its regulators spent considerable effort publicizing this “success” across Europe. Considering that human resources data is only a small part of the data handling operations of any corporation, I can only wonder at the modesty of the achievement, at least in the real world of privacy protection. Since GE is one of the most sophisticated companies on the planet, what does that portend for the rest of us? Realistically, most companies that enter the process of “binding corporate rules” are going to be stuck in a sort of regulatory limbo, for years, and perhaps permanently. And in the unlikely event that any company obtains such approval, what would it mean in an era when companies are constantly changing their data processing practices?

The business world knows a flop when it sees one. Unless you’re so rich you don’t care.

Thursday, March 29, 2007

“We can lick gravity, but sometimes the paperwork is overwhelming”

Wernher von Braun was not speaking about the paperwork of European data protection filings, but he might as well have been. Having worked for two large companies with operations all across Europe, I’ve probably done more data protection notification filings than just about anyone, and I’m exhausted. I wouldn’t mind, if it weren’t such a waste of time and money.

Every European country requires that companies file data protection notifications with the local data protection authority. While most other European regulatory fields allow companies to file their regulatory paperwork in their country of origin only, EU data protection requires this to be duplicated in every country. And every country takes a completely different approach, magnifying the work considerably. Some countries require filings on a “per-controller” basis (e.g., the UK requires one filing per company), others require filings on a “per-database” basis (e.g., France requires filings for all “databases”, whatever that means). Some countries provide exemptions from some or all filing requirements if the company appoints a data protection officer (e.g., Germany). In case you’re interested, see this helpful “vademecum”, a summary of the filing requirements across Europe. The summary runs to 76 pages:
http://ec.europa.eu/justice_home/fsj/privacy/docs/wpdocs/others/2006-07-03-vademecum.doc

Companies in Europe take one of two approaches. The vast majority essentially ignore the filing requirements completely, or fill them out with cursory and meaningless generalities (e.g., "yes, I have a database with my employees' names"). The minority spend a lot of time and money trying to complete these filings conscientiously. Having worked in the latter category, I have some ideas for a radical revision of the entire process.

1) Filings should be required in a company’s Country of Origin only. This is a classic Common Market concept, and if it works in so many other areas of European regulatory law, I think it should work for data protection too.
2) Filings should be required only once for each Controller (i.e., company). The concept of multiple filings for each database is archaic, and makes no sense in the modern world of IT, where “databases” can be created by any employee with a few keystrokes.
3) Delete all requirements for “prior approval” for international transfers. Data protection authorities already cannot meet the requirements to review and provide the theoretical “prior approval” required for international data transfers. In the era of the Internet, such transfers are routine, instantaneous and unproblematic.
4) Re-allocate all the money that will be saved from this simplification of data protection filings to more productive purposes: companies can spend it on real improvements to their privacy practices, and the data protection authorities can spend it on higher priorities, like education, advocacy, and enforcement.

The current European maze of data protection filing requirements makes less and less sense every day. Leading privacy thinkers, like the UK Information Commissioner Richard Thomas, are starting to call for a re-think: “There may be scope for less bureaucracy, less emphasis on prior authorisation and more concrete focus on preventing real harm.” (ICO press release of March 9, 2007, www.ico.gov.uk)
Enterprise and Industry Commissioner Günter Verheugen has repeatedly called on the Commission to cut the burden of red tape. Simplifying and improving the EU regulatory environment is one of the Commission’s key instruments under the Lisbon Strategy to revitalize Europe's economy. Let’s start here!

Saturday, March 3, 2007

Are there things you only tell your dog?


There’s a lot of hope that Privacy Enhancing Technologies (called PETs) will restore the privacy that technology took away. When you speak with someone on the telephone, you can be reasonably assured that there is no record of the contents of your communications, since it’s generally illegal to record a phone call without notice. But the evolution of communications technologies has unfortunately undermined that sense of confidentiality. When you send an email, you know that the contents of your communications may be permanently retained by the recipient, forwarded, or read by third parties. And online chatting raises the same privacy issues, in a medium where people tend to ramble on with even less thought.

I am therefore heartened by a PET in Google’s instant messaging service, called Talk. With a simple click, you can take the chat “off the record,” preventing the person with whom you’re chatting from retaining a written copy of the communication. In fairness, the confidentiality is not absolute, since someone could always take a screenshot of the message to retain it. You can read more about how it works, and its limitations, here:
http://www.google.com/talk/chathistory.html#offrecord
But, for everyday purposes, the “off the record” functionality restores some of the evanescence that communications have lost. I don’t think the Internet will ever offer the same level of anonymity as talking to your dog, so there are things you may only want to tell your Rover. But then, as Andy Rooney said: “If dogs could talk, it would take a lot of the fun out of owning one.”

Tuesday, February 27, 2007

The Slippery Slope of Data Retention


The Article 29 Working Party issued a blunt Opinion in March 2006 about data retention: “The decision to retain communication data for the purpose of combating serious crime is an unprecedented one with a historical dimension. It encroaches into the daily life of every citizen and may endanger the fundamental values and freedoms all European citizens enjoy and cherish.”
http://ec.europa.eu/justice_home/fsj/privacy/docs/wpdocs/2006/wp119_en.pdf

The Working Party went on to make some concrete, practical recommendations for Member States to address when they implement the Directive. As someone who will likely be on the receiving end of law enforcement requests, and will likely struggle with the ambiguities of the law, I’d like to highlight four of their recommendations, all of which present slippery slopes indeed.

1) Since the Directive mandates retaining data for the purposes of investigating “serious crime”, that term should be defined. What is a “serious crime”? And which crimes are not “serious”? I’m sure terrorism and child pornography are “serious”. But is defamation “serious”? And if the law doesn’t define them, who is going to decide: law enforcement, the companies receiving these orders, or independent arbiters?

2) The data should only be available to specifically designated law enforcement authorities. The Working Party opined that a list of such designated law enforcement authorities should be made public. In the absence of such a public list, I’m sure that lots of officials will make requests for data. To take just one European country, France: are we talking about the gendarmerie, the police, the CRS, investigative magistrates, military personnel, diplomatic officials, or any of many other officials? And for companies dealing with cross-border issues, how could they know which officials are “designated” in 27 different countries, each with different languages and legal systems?

3) Investigations should not entail large-scale data-mining. But in practice, who is going to enforce limitations on data mining - the companies themselves, by refusing to provide large amounts of data? Google famously went to court to challenge a DOJ subpoena in the US for large amounts of data, but 34 other companies receiving requests from the DOJ around the same time did not.


4) Access should be authorized on a case-by-case basis by judicial authorities or subject to other independent scrutiny. If this Working Party recommendation were implemented, it would indeed insert a level of independent review. In the absence of such a process, who ensures that the requests are indeed valid under the law? It’s optimistic to assume that all the recipient companies in Europe will exercise independent scrutiny, and only answer the types of requests that a judge or independent authority would have authorized.


We’re on a slippery slope, and we need much clearer rules. Or, as W. Somerset Maugham put it: “There are three rules for writing a novel. Unfortunately, no one knows what they are.”