Self-Contradictory Policies Used in Android App User Data Collection

A large number of Android app privacy policies contradict themselves about their user data collection practices, an academic study has revealed.

Published toward the end of last year, ‘PolicyLint: Investigating Internal Privacy Policy Contradictions on Google Play’ was a joint effort between a team of researchers from North Carolina State University, the University of Illinois at Urbana-Champaign, and the IBM TJ Watson Research Center. Its aim was to determine how faithfully privacy policies reflect the user data collection that apps actually carry out.

The study relied on natural language processing algorithms to investigate how words, their meanings, and the context in which they’re used relate to one another with respect to user data collection and the contents of Android app privacy policies. To do this, the researchers built a tool of their own.

The tool, which the researchers named ‘PolicyLint’, parsed the privacy policies of 11,430 Play Store apps. It discovered that 1,618 of the apps sampled (14.2% of the total number) used policies that “contain contradictions” which “may be indicative of misleading statements.”

When user data collection becomes misleading

The researchers took their findings a step further by doing much of the fieldwork themselves. By manually verifying PolicyLint’s findings in relation to a sample of 510 of the contradictions found, the team confirmed “concerning trends” which included “misleading presentation, attempted redefinition of common understandings of terms, conflicts in regulatory definitions, and ‘laundering’ of tracking information.”

One example of such contradictions, according to the researchers, is that many Android app privacy policies claim not to engage in any user data collection at all. This claim is then upended in later sections, where the policy admits that the app does in fact collect user data. Such policies, according to the study, often go on to state that they collect emails or customer names.

The researchers provide the following example of one such contradiction, taken from the privacy policy of an emoji keyboard Android app with more than 50 million downloads. The policy stated in one section:

Since we do not collect Personal Information, we may not use your personal information in any way.

This stands in direct contrast to the following statement made later in the policy:

For users that opt in to Emoji Keyboard Cloud, we will collect your email address, basic demographic information and information concerning the words and phrases that you use (“Language Modeling Data”) to enable services such as personalization, prediction synchronization and backup.

The researchers point out that such statements are “clearly contradictory” and “arguably misleading” in nature. They add that many Android app designers tend to use “blanket statements” such as these about how they collect user data, only to allude to their true methods of user data collection in subsequent sections.
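To make the idea of a “blanket statement” contradiction concrete, here is a minimal sketch in Python. It is not PolicyLint's method, which relies on far more sophisticated natural language processing. It assumes a tiny hand-picked list of personal data types (PERSONAL_INFO_TERMS) and a simple keyword match, and the function names are hypothetical. It flags any policy that both denies collecting personal information and, elsewhere, says it collects a specific type of personal data.

```python
import re

# Assumption: a tiny, hand-picked list of data types that count as
# "personal information". A real tool would need a much richer vocabulary.
PERSONAL_INFO_TERMS = ("email address", "demographic information", "customer name")

# Phrases treated as a denial of collection (illustrative, not exhaustive).
NEGATION = re.compile(r"\b(?:do|does|will)\s+not\s+collect\b|\bnever\s+collect\b",
                      re.IGNORECASE)
COLLECTION = re.compile(r"\bcollect\b", re.IGNORECASE)


def split_sentences(text):
    """Naively split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_contradictions(policy_text):
    """Pair each blanket denial with each later or earlier admission.

    Returns a list of (denial_sentence, admission_sentence, matched_terms).
    """
    sentences = split_sentences(policy_text)

    # Blanket denials: "we do not collect personal information".
    denials = [s for s in sentences
               if NEGATION.search(s) and "personal information" in s.lower()]

    # Admissions: non-negated sentences that say a specific data type is collected.
    admissions = []
    for s in sentences:
        if NEGATION.search(s) or not COLLECTION.search(s):
            continue
        lowered = s.lower()
        found = sorted(t for t in PERSONAL_INFO_TERMS if t in lowered)
        if found:
            admissions.append((s, found))

    return [(d, a, terms) for d in denials for a, terms in admissions]


# Usage with the two statements quoted above (second one shortened).
policy = (
    "Since we do not collect Personal Information, we may not use your "
    "personal information in any way. "
    "For users that opt in to Emoji Keyboard Cloud, we will collect your "
    "email address, basic demographic information and information "
    "concerning the words and phrases that you use."
)
for denial, admission, terms in find_contradictions(policy):
    print("Denial:   ", denial)
    print("Admission:", admission)
    print("Data types:", ", ".join(terms))
```

A keyword approach like this would miss paraphrases and misread many sentences, which is presumably why the researchers turned to natural language processing and then verified a sample of the results by hand.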

Android app use of policy templates

Although the PolicyLint tool was created, in part, to “allow a human analyst to manually inspect the statements and analyze intent,” the researchers themselves made no attempt within the scope of their study to guess at the intent behind misleading policies.

Even so, the research team did uncover one reason why contradictions show up in Android app privacy policies about user data collection. The study indicated that 59 of the apps analyzed had used online services to auto-generate their privacy policies. The researchers concluded that, in these cases, it had been the online services themselves that were the source of the contradictory statements, rather than the developers.

However, when it comes to the remainder of the self-contradictory privacy policies discovered, the fault lies squarely with the Android app designers themselves. In effect, this means that those app makers are susceptible to fines from the Federal Trade Commission (FTC) or from data protection authorities in the European Union (EU).

The research team even went so far as to try to notify the Android app developers behind the 510 self-contradictory privacy policies which had been verified. The team found contact emails for 260 of those 510 developers and emailed them about the results of the study. Of those 260 email addresses, the researchers reported that the vast majority (244) received the email, whereas 16 turned out to be dead ends.

Of those 244 developers who received the email, 11 responded to the research team, and only three corrected their policies. The others disagreed with the researchers, claimed that the policies examined were in fact outdated, or merely responded with “no comment or clarification.”

Although the researchers evidently did not bring about any sweeping transformation of Android app privacy policies for user data collection, the research team nevertheless sees the development of PolicyLint as a step in the right direction. They conclude optimistically, believing that their findings “lay the foundation to help ensure the soundness of automated policy analysis and identify potentially deceptive policies.”