Data Ownership – Media Tech Law

Legal Issues in Ad Tech: De-Identified vs. Anonymized in a World of Big Data

Posted May 31, 2017 by Andrew Mirsky

In the booming world of Big Data, consumers, governments, and even companies are rightfully concerned about the protection and security of their data and about how to keep personal and potentially embarrassing details of their lives from falling into nefarious hands. At the same time, most would recognize that Big Data can serve a valuable purpose, such as supporting lifesaving medical research and improving commercial products. A question at the center of this discussion is therefore whether, and how, data can be effectively “de-identified” or even “anonymized” to limit privacy concerns – and whether the distinction between the two terms is more theoretical than practical. (As I mentioned in a prior post, “de-identified” data is data that can potentially be re-identified, while anonymized data, at least in theory, cannot be re-identified.)

Privacy of health data is particularly important, and so the U.S. Health Insurance Portability and Accountability Act (HIPAA) includes strict rules on the use and disclosure of protected health information. These privacy constraints do not apply if the health data has been de-identified – either through a safe harbor-blessed process that removes 18 key identifiers or through a formal determination by a qualified expert, in either case presumably because these mechanisms are seen as a reasonable way to make it difficult to re-identify the data.

“Do Not Track” and Cookies – European Commission Proposes New ePrivacy Regulations

Posted May 2, 2017 by Andrew Mirsky

The European Commission recently proposed new regulations that will align privacy rules for electronic communications with the much-anticipated General Data Protection Regulation (GDPR) (the GDPR was adopted in April 2016, entered into force in May 2016 and applies from May 2018). Referred to as the Regulation on Privacy and Electronic Communications or “ePrivacy” regulation, these final additions to the EU’s new data protection framework make a number of important changes, including expanding privacy protections to over-the-top applications (like WhatsApp and Skype), requiring consent before metadata can be processed, and providing additional restrictions on spam. But the provisions relating to “cookies” and tracking of consumers’ online activity are particularly interesting and applicable to a wide range of companies.

Cookies are small data files stored on a user’s computer or mobile device by a web browser. The files help websites remember information about the user and track a user’s online activity. Under the EU’s current ePrivacy Directive, a company must get a user’s specific consent before a cookie can be stored and accessed. While well-intentioned, this provision has caused frustration and resulted in consumers facing frequent pop-up windows (requesting consent) as they surf the Internet.
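To make the mechanics concrete, here is a minimal sketch of the consent-then-cookie sequence described above, using Python's standard `http.cookies` module. The function names, the `visitor_id` cookie name and the 30-day lifetime are assumptions chosen for illustration, and the consent check is a simplification, not a statement of what the ePrivacy rules require.

```python
from http.cookies import SimpleCookie
from typing import Optional

THIRTY_DAYS = 60 * 60 * 24 * 30  # cookie lifetime in seconds (illustrative choice)


def set_cookie_header(visitor_id: str, has_consented: bool) -> Optional[str]:
    """Return a Set-Cookie header value, or None if the user has not consented."""
    if not has_consented:
        # No consent recorded: do not ask the browser to store anything.
        return None
    cookie = SimpleCookie()
    cookie["visitor_id"] = visitor_id            # hypothetical cookie name
    cookie["visitor_id"]["max-age"] = THIRTY_DAYS
    cookie["visitor_id"]["path"] = "/"
    return cookie["visitor_id"].OutputString()


def read_cookie_header(header_value: str) -> dict:
    """Parse the Cookie header the browser sends back on later visits."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {name: morsel.value for name, morsel in cookie.items()}


header = set_cookie_header("abc123", has_consented=True)
returned = read_cookie_header("visitor_id=abc123; theme=dark")
```

The second function is what lets a site "remember" the user: the browser returns the stored value on each later request, and the site can tie those requests together.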

Dataveillance Protection: The E.U.-U.S. Privacy Shield

Posted December 20, 2016 by Andrew Mirsky

For many years, technology outpaced policy when it came to standards and protections around ownership of and access to personal data. Privacy policies are not set by governments but rather by technology companies that created the digital world as it is experienced today. Many if not all of the dominant players in this space are American technology companies that include Alphabet (i.e. Google), Apple, Amazon, Facebook and Microsoft. These companies have more say about a user’s online life than any individual local, state or national government.

Legal Issues in Ad Tech: IP Addresses Are Personal Data, Says the EU (well … sort of)

Posted November 20, 2016 by Andrew Mirsky

Much has been written in the past 2 weeks about the U.S. Presidential election. Time now for a diversion into the exciting world of data privacy and “personal data”. Because in the highly refined world of privacy and data security law, important news actually happened in the past few weeks. Yes, I speak breathlessly of the European Court of Justice (ECJ) decision on October 19th that IP (internet protocol) addresses are “Personal Data” for purposes of the EU Data Directive. This is bigly news (in the data privacy world, at least).

First, what the decision actually said, which leads immediately into a riveting discussion of the distinction between static and dynamic IP addresses.

The decision ruled on a case brought by a German politician named Patrick Breyer, who sought an injunction preventing a website and its owner – here, publicly available websites operated by the German government – from collecting and storing his IP address when he lawfully accessed the sites. Breyer claimed that the government’s actions were in violation of his privacy rights under the EU Directive 95/46/EC – The Data Protection Directive (Data Protection Directive). As the ECJ reported in its opinion, the government websites “register and store the IP addresses of visitors to those sites, together with the date and time when a site was accessed, with the aim of preventing cybernetic attacks and to make it possible to bring criminal proceedings.”

The case is Patrick Breyer v Bundesrepublik Deutschland, Case C-582/14, and the ECJ’s opinion was published on October 19th.

Private Metadata is Doublethink

Posted October 21, 2016 by Andrew Mirsky

It is commonly said that if you do not want digital information found, then you should not create any. A challenge however is that there is a class of digital information, called metadata, that often is not created by you but is nonetheless associated with you. While content is any information that can be published or posted online, such as tweets, photos, videos, ebooks, Facebook posts and games, to name a few, metadata is something entirely different.

At first metadata was used to help describe content. It was automatically assigned by the software or app developer to help make the content easier to sort, index and later find. Location tags for example, are a common form of metadata. When content is posted online, the location of the sender may be attached to the content. A person posting a picture of a beach to his Facebook account will reveal to his network that he is in Cabo San Lucas, Mexico and that the picture was taken two days ago at 3:30 PM.
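The split between content and metadata can be sketched as a small data structure. This is only an illustration: the `Post` class, the `publish` function and the field names are hypothetical, and the date is invented to match the beach-photo example.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Post:
    content: bytes                                  # what the user chose to publish
    metadata: dict = field(default_factory=dict)    # what the app attached on its own


def publish(photo: bytes, device_location: str, taken_at: datetime) -> Post:
    """Attach location and time tags automatically, as an app might."""
    return Post(
        content=photo,
        metadata={
            "location": device_location,
            "taken_at": taken_at.isoformat(),
        },
    )


post = publish(b"<beach photo bytes>", "Cabo San Lucas, Mexico",
               datetime(2016, 10, 19, 15, 30))
```

The user supplied only the photo; the location and timestamp travel with it because the software added them.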

Metadata is expressive now, meaning it reveals information about the individual’s behavior and ultimately identity, regardless of whether the individual wants to be known or the behavior identified, based on the content the individual creates, and what the individual searches or buys online. Thus a user who posts a picture of a positive pregnancy test, shops for prenatal vitamins and performs an online search for baby furniture can be inferred to be female and pregnant.

Legal Issues in Ad Tech: Who Owns Marketing Performance Data?

Posted October 11, 2016 by Andrew Mirsky

Does a marketer own data related to performance of its own marketing campaigns? It might surprise marketers to know that data ownership isn’t automatically so. Or more broadly, who does own that data? A data rights clause in contracts with DSPs or agencies might state something like this:

“Client owns and retains all right, title and interest (including without limitation all intellectual property rights) in and to Client Data”,

… where “Client Data” is defined as “Client’s data files”. Or this:

“As between the Parties, Advertiser retains and shall have sole and exclusive ownership and Intellectual Property Rights in the … Performance Data”,

… where “Performance Data” means “campaign data related to the delivery and tracking of Advertiser’s digital advertising”.

Both clauses are vague, although the second is broader and more favorable to the marketer. In neither case are “data files” or “campaign data” defined with any particularity, and neither case includes any delivery obligation much less specifications for formatting, reporting or performance analytics. And even if data were provided by a vendor or agency, these other questions remain: What kind of data would be provided, how would it be provided, and how useful would the data be if it were provided?

Legal Issues in Ad Tech: Anonymized and De-Identified Data

Posted September 30, 2016 by Andrew Mirsky

Recently, in reviewing a contract with a demand-side platform (DSP), I came across this typical language in a “Data Ownership” section:

“All Performance Data shall be considered Confidential Information of Advertiser, provided that [VENDOR] may use such Performance Data … to create anonymized aggregated data, industry reports, and/or statistics (“Aggregated Data”) for its own commercial purposes, provided that Aggregated Data will not contain any information that identifies the Advertiser or any of its customers and does not contain the Confidential Information of the Advertiser or any intellectual property of the Advertiser or its customers.” (emphasis added).

I was curious what makes data “anonymized”, and I was even more curious whether the term was casually and improperly used. I’ve seen the same language alternately used substituting “de-identified” for “anonymized”. Looking into this opened a can of worms ….

What are Anonymized and De-Identified Data – and Are They the Same?

Here’s how Gregory Nelson described it in his casually titled “Practical Implications of Sharing Data: A Primer on Data Privacy, Anonymization, and De-Identification”:

“De-identification of data refers to the process of removing or obscuring any personally identifiable information from individual records in a way that minimizes the risk of unintended disclosure of the identity of individuals and information about them. Anonymization of data refers to the process of data de-identification that produces data where individual records cannot be linked back to an original as they do not include the required translation variables to do so.” (emphasis added)

Or in other words, both methods have the same purpose and both methods technically remove personally identifiable information (PII) from the data set. But while de-identified data can be re-identified, anonymized data cannot be re-identified. To use a simple example, if a column from an Excel spreadsheet containing Social Security numbers is removed from a dataset and discarded, the data would be “anonymized”.
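The distinction can be sketched in a few lines of Python. This is a simplified illustration with invented records, not a recommended procedure: `anonymize` discards the Social Security column outright, while `de_identify` swaps it for a random token and keeps a lookup table (the "translation variables" in Nelson's description) that allows the swap to be reversed. All names here are hypothetical.

```python
import secrets

# Invented sample records for illustration only.
records = [
    {"ssn": "000-00-0001", "zip": "20001", "purchase": "book"},
    {"ssn": "000-00-0002", "zip": "20002", "purchase": "lamp"},
]


def anonymize(rows):
    """Drop the SSN column and keep nothing that maps back to it."""
    return [{k: v for k, v in row.items() if k != "ssn"} for row in rows]


def de_identify(rows):
    """Replace each SSN with a random token and keep a token-to-SSN lookup."""
    lookup = {}
    cleaned_rows = []
    for row in rows:
        token = secrets.token_hex(8)
        lookup[token] = row["ssn"]
        cleaned = {k: v for k, v in row.items() if k != "ssn"}
        cleaned["token"] = token
        cleaned_rows.append(cleaned)
    return cleaned_rows, lookup


def re_identify(rows, lookup):
    """Reverse de_identify; possible only for whoever holds the lookup table."""
    restored = []
    for row in rows:
        original = {k: v for k, v in row.items() if k != "token"}
        original["ssn"] = lookup[row["token"]]
        restored.append(original)
    return restored


anonymous = anonymize(records)
tokenized, key_table = de_identify(records)
```

In this sketch, whether the tokenized data counts as "de-identified" or "anonymized" turns on whether `key_table` is retained or destroyed.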

But first … what aspects or portions of data must be removed in order to either de-identify or anonymize a set?

But What Makes Data “De-Identified” or “Anonymous” in the First Place?

Daniel Solove has written that, under the European Union’s Data Directive 95/46/EC, “Even if the data alone cannot be linked to a specific individual, if it is reasonably possible to use the data in combination with other information to identify a person, then the data is PII.” This makes things complicated in a hurry. After all, in the above example where Social Security numbers are removed, remaining columns might include normally non-PII information such as zip codes, birth dates or gender (male or female). But the Harvard researchers Olivia Angiuli, Joe Blitzstein, and Jim Waldo show how even these 3 data points in an otherwise “de-identified” data set (i.e. “medical data” in the image below) can be used to re-identify individuals when combined with an outside data source that shares these same points (i.e. “voter list” in the image below):

Data Sets Overlap Chart

(Source: How to De-Identify Your Data, by Olivia Angiuli, Joe Blitzstein, and Jim Waldo, http://queue.acm.org/detail.cfm?id=2838930)
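The overlap in the chart amounts to a join between two data sets on their shared fields. Here is a minimal sketch of that kind of linkage, assuming the three shared fields are ZIP code, birth date and sex. All records and names are invented, and `link` is a hypothetical helper, not the researchers' method.

```python
from collections import defaultdict

# "De-identified" medical data: no names, but quasi-identifiers remain.
medical = [
    {"zip": "02138", "birth_date": "1960-01-15", "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_date": "1975-06-02", "sex": "M", "diagnosis": "diabetes"},
    {"zip": "02139", "birth_date": "1982-11-20", "sex": "F", "diagnosis": "flu"},
]

# Outside data source (e.g. a voter list) that carries names plus the same fields.
voters = [
    {"name": "Alice Example", "zip": "02138", "birth_date": "1960-01-15", "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_date": "1975-06-02", "sex": "M"},
    {"name": "Carol Example", "zip": "02139", "birth_date": "1982-11-20", "sex": "F"},
    {"name": "Dana Example", "zip": "02139", "birth_date": "1982-11-20", "sex": "F"},
]

SHARED_FIELDS = ("zip", "birth_date", "sex")


def link(medical_rows, voter_rows, keys=SHARED_FIELDS):
    """Match medical rows to named voters where the shared fields are unique."""
    names_by_key = defaultdict(list)
    for voter in voter_rows:
        names_by_key[tuple(voter[k] for k in keys)].append(voter["name"])

    matches = {}
    for row in medical_rows:
        names = names_by_key.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:          # exactly one candidate: re-identified
            matches[names[0]] = row["diagnosis"]
    return matches


reidentified = link(medical, voters)
```

Two of the three medical rows are re-identified here. The third is not, because two voters share the same combination of fields, which is the intuition behind requiring that each combination be shared by several people before data is released.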

That helps explain the Advocate General opinion recently issued in the European Union Court of Justice (ECJ), finding that dynamic IP addresses can, under certain circumstances, be “personal data” under the European Union’s Data Directive 95/46/EC. The case involves interpretation of the same point made by Daniel Solove cited above, namely discerning the “personal data” definition, including this formulation in Recital 26 of the Directive:

“(26) … whereas, to determine whether a person is identifiable, account should be taken of all the means likely reasonably to be used either by the controller or by any other person to identify the said person …”

EU countries were inconsistent on the level of pro-activity required of a data controller in order to render an IP address “personal data”. Consider, for example, the United Kingdom’s definition of “personal data”: “data which relate to a living individual who can be identified – (a) from those data, or (b) from those data and other information which is in the possession of, or is likely to come into the possession of, the data controller” (emphasis added). Not so in Germany and, according to a White & Case report on the ECJ case, not so according to the Advocate General, whose position was that “the mere possibility that such a request [for further identifying information] could be made is sufficient.”

Which then circles things back to the question at the top, namely: Are Anonymized and De-Identified Data the Same? They are not the same. That part is easy to say. The harder part is determining which is which, especially with the ease of re-identifying presumably scrubbed data sets. More on this topic shortly.

License Plate Numbers: a valuable data-point in big-data retention

Posted August 6, 2015 by Rob Ellis

What can you get from a license plate number?

At first glance, a person’s license plate number may not be considered that valuable a piece of information. When tied to a formal Motor Vehicle Administration (MVA) request, it can yield the owner’s name, address, type of vehicle, vehicle identification number, and any lienholders associated with the vehicle. While this does reveal some sensitive information, such as a likely home address, there are generally easier ways to go about gathering that information. Furthermore, states have made efforts to protect such data, revealing owner information only to law enforcement officials or certified private investigators. The increasing use of Automated License Plate Readers (ALPRs), however, is revealing a treasure trove of historical location information that is being used by law enforcement and private companies alike. Also, unlike historical MVA data, policies and regulations surrounding ALPRs are in their infancy and provide far weaker safeguards for protecting personal information.

ALPR – what is it?

Consisting of either a stationary or mobile-mounted camera, ALPRs use pattern recognition software to scan up to 1,800 license plates per minute, recording the time, date and location a particular car was encountered.
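A short sketch shows why these individually unremarkable scans add up to location history. The `PlateRead` record and `location_history` function are hypothetical, and the sample reads are invented; the only figure taken from the text is the 1,800 plates per minute, which works out to 30 plates per second.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PlateRead:
    plate: str          # plate number recognized by the camera software
    seen_at: datetime   # date and time of the scan
    location: str       # where the camera was at the time


def location_history(reads):
    """Group scans by plate, in time order, to reconstruct each car's movements."""
    history = defaultdict(list)
    for read in sorted(reads, key=lambda r: r.seen_at):
        history[read.plate].append((read.seen_at, read.location))
    return dict(history)


READS_PER_MINUTE = 1800                      # upper figure quoted in the text
reads_per_second = READS_PER_MINUTE / 60     # 30 plates every second

reads = [
    PlateRead("ABC1234", datetime(2015, 8, 3, 17, 45), "Office garage"),
    PlateRead("XYZ9876", datetime(2015, 8, 3, 9, 0), "Main St"),
    PlateRead("ABC1234", datetime(2015, 8, 3, 8, 5), "Elm St"),
]
history = location_history(reads)
```

Each scan records only a plate, a time and a place, but once reads are grouped by plate the result is a timeline for a specific vehicle.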

Website Policies and Terms: What You Lose if You Don’t Read Them

Posted July 15, 2015 by Roman Vayner

When was the last time you actually read the privacy policy or terms of use of your go-to social media website or your favorite app? If you’re a diligent internet user (like me), it might take you an average of 10 minutes to skim a privacy policy before clicking “ok” or “I agree.” But after you click “ok,” have you properly consented to all the ways in which your information may be used?

As consumers become more aware of how companies profit from the use of their personal information, the way a company discloses its data collection methods and obtains consent from its users becomes more important, both to the company and to users. Some critics even advocate voluntarily paying social media sites like Facebook in exchange for more control over how their personal information is used. In other examples, courts have scrutinized whether websites can protect themselves against claims that they misused users’ information, simply because they presented a privacy policy or terms of service to a consumer, and the user clicked “ok.”

The concept of “clickable consent” has gained more attention because of the cross-promotional nature of many leading websites and mobile apps.

Cookies For Sale? How Websites Obtain Permission to Track and Sell Online User Data

Posted February 19, 2013 by Andrew Mirsky

Have you ever wondered how websites get your permission to “install” a cookie on your computer, and then sell the data associated with it? The simple answer… when you accept their terms and conditions, you give them the keys to your data.

There is a marketplace in this country for technology companies, advertisers, media firms and other enterprises to purchase consumers’ cookie “identifiers” and their associated information, allowing those organizations to know where you are, and what you are doing, online. Almost always, this information is used solely for tracking website analytics, sign-in permissions and for other advertising purposes. A cookie is “placed” onto a website user’s computer through the user’s browser, typically by publishers or their third party partners. The cookie then collects information – pages that you visit, sign-in information, profile information, what you click, what purchases you make, what you read, etc. When this data is sold (if it is sold), most of this information is not personally identifiable, but some of it can be.

In this blog, the first of a few on the topic of cookies, I will briefly explain the process of how and when websites get your permission to install cookies on users’ computers, and how they use the resulting data collected.

First of all, what is a cookie? Google has a nice working definition that we can use:

(https://support.google.com/chrome/bin/answer.py?hl=en&answer=95647&topic=14666&ctx=topic)

SaaS: Software License or Service Agreement? Start with Copyright

Posted July 16, 2012 by Andrew Mirsky

SaaS, short for “Software as a Service”, is a software delivery model that grants users access to a program while the software itself and its accompanying data are stored off-site, on a vendor’s (or another third party’s) servers. A user accesses the program via the internet, and the access is provided as a service. Hence … “Software as a Service”.

In terms of user interface functionality, a SaaS service – typically accessed via a subscription model – is identical to a traditional software model in which a user purchases (or more typically, licenses) a physical copy of the software for installation on and access via the user’s own computer. And in enterprise structures, the software is installed on an organization’s servers and accessed via dedicated “client” end machines, under one of many client-server setups. In that sense, SaaS is much like the traditional client-server enterprise model where servers in both cases will likely be offsite, the difference being that SaaS servers are owned and managed by the software owner. The “cloud” really just refers to the invisibility of the legal and operational relationship of the servers to the end user, since even in traditional client-server structures servers might very likely be offsite and accessed only via internet.

MegaUpload – Where is my Data?

Posted June 26, 2012 by Andrew Mirsky

A not-insignificant consequence of the federal government’s move in January to shut down the popular file-sharing site MegaUpload is that customers are blocked from being able to access their files.

First, some background. In January, the government charged that MegaUpload and its founder Kim Dotcom operated an organization dedicated to copyright infringement, or in other words operated for the purpose of a criminal enterprise. The site provided a number of online services related to file storage and viewing, which (among other things) allowed users to download copyrighted material. The government also claimed in its indictment that the site was also used for other criminal purposes including money laundering.

Not surprisingly, the file-sharing activities caught the unpleased eye of prominent content ownership groups

MediaTech Law

By MIRSKY & COMPANY, PLLC
