The missing clauses in Google’s “Customer Match”

In September, Google announced “Customer Match”, a new tool that lets advertisers target their existing customers using their email addresses. “Customer Match” is almost like Facebook’s “Custom Audiences”, but Google and Facebook seem engaged in “a privacy race to the bottom”, and Google may have taken the lead.

Targeting email addresses

Advertisers aim to target both prospects and existing customers. While remarketing lets them target potential buyers, advertisers have so far been unable to differentiate between their existing customers and new prospects. They also could not target their “loyal” clients (i.e. those who have subscribed to a loyalty card), because there is no link between the cookie IDs that ad networks assign to a browser and a loyalty card number or even an online customer account. “Custom Audience” and “Customer Match” (hereafter “Customer targeting”) create a bridge between the email address used to create a “Best Buy” account or a CVS loyalty card and Google and Facebook accounts.

Via “Customer targeting”, advertisers will be able to take the information they have gathered about your shopping habits and leverage it to target you on Facebook and on Google. The advertiser does not directly send the ads it wants to show you attached to your address. Instead, it builds “audiences” by grouping the email addresses of its customers. It then sends those email addresses, hashed, to Facebook (or Google), which checks whether they match the hashed email addresses of its registered users.
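The matching step described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the post does not say which hash function or normalization is used, so SHA-256 over the trimmed, lowercased address is assumed here, and the function names and sample addresses are hypothetical.

```python
import hashlib


def hash_email(email: str) -> str:
    """Normalize an address, then hash it (SHA-256 is an assumption)."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def match_audience(uploaded_hashes, registered_emails):
    """Platform side: return registered users whose hash was uploaded."""
    index = {hash_email(e): e for e in registered_emails}
    return sorted(index[h] for h in uploaded_hashes if h in index)


# Advertiser side: only hashes leave the advertiser's systems.
customers = ["Alice@example.com ", "bob@example.com", "carol@example.com"]
uploaded = {hash_email(e) for e in customers}

# Platform side: the platform hashes its own user base and compares.
registered = ["alice@example.com", "dave@example.com"]
matched = match_audience(uploaded, registered)
print(matched)
```

In this sketch the platform learns that alice@example.com is a customer, while the hashes for bob and carol stay unmatched. Whether they also stay unreadable is a separate question.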

Technically, Facebook does not see the email address, only its hash. So if you are not in its user database, it should not be able to learn that you are a “Best Buy” customer. That being said, this technical guarantee may not be sufficient: giants like Google and Facebook have the computational resources to generate many candidate addresses, hash them, and brute-force the hashed email addresses to recover the customer lists. In fact, in another context Google seems to admit this, since it requires that Google Analytics users not send hashed identifiers such as email addresses or phone numbers.
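To make the brute-force concern concrete, here is a minimal sketch of a dictionary attack on unsalted hashes. It assumes SHA-256 and a tiny, hypothetical list of names and domains; a real attacker would use far larger lists, which is exactly why computational resources matter.

```python
import hashlib
from itertools import product


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def candidate_emails(first_names, last_names, domains):
    """Generate plausible addresses from common naming patterns."""
    for first, last, domain in product(first_names, last_names, domains):
        yield f"{first}.{last}@{domain}"
        yield f"{first}{last}@{domain}"


def recover(unmatched_hashes, candidates):
    """Map each hash that a guessed address reproduces back to that address."""
    targets = set(unmatched_hashes)
    recovered = {}
    for email in candidates:
        digest = sha256_hex(email)
        if digest in targets:
            recovered[digest] = email
    return recovered


# Hashes of people who are NOT registered on the platform (hypothetical data).
unmatched = {sha256_hex("jane.doe@example.com"), sha256_hex("zq9x7@example.org")}

guesses = candidate_emails(["jane", "john"], ["doe", "smith"], ["example.com"])
recovered = recover(unmatched, guesses)
print(sorted(recovered.values()))
```

The predictable address is recovered and the random-looking one is not, which matches the intuition that most, but not all, hashed addresses could be broken.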

Therefore the only guarantees are contractual: they are the commitments that Google and Facebook make when they receive email addresses (or phone numbers). Facebook and Google commit to not recovering the email addresses of people who are not registered with their services. Similarly, their contractual clauses prevent them from keeping those lists of hashed identifiers for more than a week (which would still be more than enough time to break most of them).

Facebook ToS

Facebook’s Terms of Service are quite constraining for Facebook itself, as they more or less prohibit Facebook from doing anything with the hashed email addresses other than using them to help an advertiser reach its audience. Therefore, Facebook cannot add information to the profiles of its users. In fact, Facebook specifically forbids appending “Custom Audience” data to users’ profiles. Furthermore, Facebook won’t let an advertiser target the audience of another advertiser. For instance, “Target” should not be able to target “Best Buy” customers. Facebook adopts a data processor position with respect to Custom Audience, the advertiser being the data controller.

Excerpt of “Custom Audience ToS” https://www.facebook.com/ads/manage/customaudiences/tos.php

Google Customer Match

Google took another approach with its service. It did not include clauses preventing it from appending “Customer Match” data to users’ profiles. The restrictions only apply to the list of email addresses; there is no restriction on the use of the list of matched profiles, which Google can therefore use.

Customer Match conditions from https://support.google.com/adwords/answer/6276125

In fact, Google implicitly admitted that these data will be appended to user profiles when it modified its Privacy Policy in August to include data obtained from partners in Google Accounts data. While the change went unnoticed at the time, it clearly became more critical after “Customer Match” was announced.

Change made to Google’s privacy policy on August 19th

Consequences of Google’s posture

Google’s decision to include “Customer Match” data in its user accounts will affect users’ privacy and also competition among advertisers.

  • Since the data will be included in the account, Google will have a more comprehensive view of its users, which is a big step towards merging offline and online data (also known as data onboarding). This may have a significant negative impact, as it puts Google at the center of all these data flows… until Facebook announces its riposte.
  • On the upside, this could be beneficial for transparency, because users could be made aware of the advertisers targeting them if Google shows these data on privacy dashboards (that’s a big if).
  • However, because Google is a data controller with respect to “Customer Match”, advertisers may be reluctant to share information about their customers, knowing that it could potentially be reused by competitors or by Google itself. Not only could Google share these data with other advertisers, allowing competitors to target each other’s audiences and thus stirring up demand and prices, but Google could also be tempted to use the data for its own direct benefit.

Acknowledgement

Thanks to Armand Heslot for providing feedback on a draft.

Facebook may violate the FTC settlement in a few days

Update: Facebook started to show the announced prompt and ask for user consent.

Almost a year after it removed the option for 90% of its members, Facebook informed the remaining 10% on Wednesday that it will remove the “Who can search my timeline by name” setting in a few days. Removing this setting is likely a violation of the 2011 FTC settlement.

Timeline concealed to the public

A month ago, Facebook announced that it would prompt users to get their consent before removing the setting [1], but it finally decided to just inform users with an email and a very short notice displayed above the News Feed.

In the email sent to its members, Facebook argues that when it created this setting “the only way to find [them] on Facebook was to search for [their] specific name. Now, people can come across [their] Timeline in other ways: for example if a friend tags [them] in a photo, which links to [their] Timeline, or if people search for phrases like “People who like The Beatles,” or “People who live in Seattle,” in Graph Search”. However, I’m confident that some users, including me, are not tagged in any public photo, do not like any public content and have no friend whose “friends list” is public.

The Timelines of these users will not appear in public Graph Search results on Facebook, and there is no public link that could be used to find them. As a matter of fact, people who are not my friends (or friends of friends) cannot even know whether I have a Facebook account. As of today, the only way to find my Facebook Timeline is to test the 1.2 billion userID numbers. In addition to being time-consuming, this exhaustive search would violate Facebook’s Terms of Service.

Private vs Nonpublic

A Timeline page is public because any user can load its content, but Timeline URLs (i.e. usernames) are not public, since not everyone can find them: without the search functionality, it is not possible to retrieve the Timeline associated with a specific user. Timeline URLs are like unlisted phone numbers or Google Docs shared with “anyone with the link”. These documents may not be seen as private, but I would not define them as public (i.e. I’d be unpleasantly surprised to see them used in an endorsed advertisement). I do not claim that Timelines are private, only that they are “nonpublic user information”.

Why Facebook could violate the FTC settlement

The FTC settlement does not focus on users’ private information but covers all nonpublic user information (e.g. a user ID to which access is restricted by a privacy setting). Indeed, Section II-A of the 2011 settlement requires that Facebook, “prior to any sharing of a user’s nonpublic user information by [Facebook] with any third party, which materially exceeds the restrictions imposed by a user’s privacy setting(s), shall […] obtain the user’s affirmative express consent”.

Facebook will not only remove the ability to choose who can look up a Timeline; it will also reset the setting to its default value, “Everyone”. Hence, Facebook will modify the settings of users who had restricted it to a narrower audience. Obviously, the two-line message Facebook displayed and the email it sent to the affected members do not constitute a valid way to obtain affirmative express consent. So Facebook will certainly violate the FTC settlement in a few days.

[1] Coincidentally, Facebook made this announcement about 5 hours after I tweeted that they should get an informed consent.

 

Your hidden friends, betrayed by their like

Graph Search as a privacy tool

According to Facebook, Graph Search not only helps people find information about their friends, it also helps them learn what information they reveal about themselves. I find this objective questionable, especially in France, where many people are still not aware that Graph Search even exists [1] and yet have their profiles searchable by anyone in the US. Still, Graph Search is certainly very useful and educational about what could go wrong with tagging and shared content.

The issue of the Friend List

When Facebook announced Graph Search in January, I was surprised by its decision not to show friends lists that could be recomposed by browsing timelines. Recomposing part of someone’s friends list was time-consuming but possible if you spent time scrolling down the timeline.

The July update of Graph Search makes it even simpler to retrieve the friends list of people who hide it. Indeed, Graph Search now allows you to search for who liked or commented on photos. Since some content is only visible to my friends, only they can comment on or like my pictures. Having a list of people who liked or commented on my photos is like having a list of the friends with whom I share things on Facebook. Some people I do not know commented on my photos, but that’s a negligible fraction.

Unwanted side effects

Surprisingly, it seems that you can even know whether someone liked a photo you don’t have access to. Indeed, in some circumstances, you cannot see which picture has been liked; you only know that someone liked a picture (see below). This goes against Facebook’s claim that Graph Search only gives you access to information you already had.

Update: In fact, the person who liked the picture is not searchable but she appears in the search results because she liked a public photo.

The picture liked by the first person is accessible, not the second one

Another annoying effect is that queries like “People who liked photos by me” return a list that includes people with whom I’m no longer friends. And it’s pretty easy to spot these people because they are systematically at the end of the result list.

How bad is it?

To measure the fraction of the friends list that could be retrieved through Graph Search, I counted the number of results returned when I searched for:

  • Q1: “People who liked photos by X”
  • Q2: “People who commented on photos by X”
  • Q3: “People who uploaded photos liked by X”
  • Q4: “People who uploaded photos of X”

Unfortunately, Graph Search does not (yet?) support “OR” queries, so there is no easy way to quantify the overlap between these four queries. I report the number of confirmed friends retrieved (using the “mutual friend” filter) and, in parentheses, the total number of people retrieved, which also includes former friends. I compare that to the number of friends I have (and I thank my friends who did not hide their friends list).

X  | Q1      | Q2      | Q3      | Q4      | N Friends | Ratio
me | 59 (73) | 43 (45) | 42 (54) | 19 (20) | 207       | 28.50%

I ran some tests on a few friends and obtained similar results [2]; queries Q1 and Q3 are generally the most effective. On average, Graph Search returns about 30% of friends, plus some former friends. I guess I could retrieve up to 40-50% by combining the four queries. This is problematic because many people assume that their friends lists are safe, but this safety goes away when they share likable photos or when they like photos.
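As a rough sanity check on these figures, the following sketch recomputes the ratio from the confirmed counts in the table and brackets what combining the four queries could yield. The bounds are my own illustration, not a measurement: without “OR” queries the real overlap is unknown, so the union can only be placed between the best single query (full overlap) and the sum of all four (no overlap).

```python
# Confirmed friends retrieved per query, taken from the table above.
confirmed = {"Q1": 59, "Q2": 43, "Q3": 42, "Q4": 19}
n_friends = 207


def union_bounds(counts, total):
    """Bounds on the size of the union of result sets of unknown overlap."""
    lower = max(counts.values())               # every set inside the largest
    upper = min(sum(counts.values()), total)   # all sets disjoint
    return lower, upper


lower, upper = union_bounds(confirmed, n_friends)
lower_ratio = lower / n_friends
upper_ratio = upper / n_friends
print(f"best single query: {lower_ratio:.2%}")
print(f"upper bound when combining queries: {upper_ratio:.2%}")
```

The lower bound reproduces the 28.50% reported in the table, and the 40-50% guess sits comfortably inside the interval.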

Since “Like” visibility is public, you can even retrieve some friends of people with whom you have no connection. I can imagine many circumstances where having your list of friends publicly available is very problematic.

What can you do?

Unfortunately, you cannot prevent your friends from liking content you share with them. Likes are not like tags or comments: you cannot remove them. The only current solution is to not share “likeable” content or to ask people not to like it, but that is very counterintuitive on Facebook. In the end, you can only hide friends who don’t “like” you.

Another solution is to obfuscate the list of people who liked your pictures. I probably rely too much on obfuscation, but asking people you don’t know to like your photos is currently the only technical solution to prevent stalkers from quickly retrieving your friends.

Acknowledgements
Thanks to my stalked friends who do not share their friends lists: they motivated this post. Thanks to those who do share their lists: they helped me make this post relevant.

[1] If you have not yet enabled “Graph Search”, I recommend that you do so. See http://www.fredzone.org/comment-activer-le-graph-search-de-facebook-929

[2] I’ll post more results when I get their consent.

Facebook Graph Search: Showing what is not Shared

When Facebook announced Graph Search, they emphasized that they had designed it with privacy in mind, and yet they made two different statements. First, M. Zuckerberg said that it will give access only to “things that people have shared with you”, while T. Stocky said that “[you] can only search for what [you] can already see on Facebook”.
I define the “content I share with you” as the content you can see on my timeline, which is in fact a subset of the content you could see about me on Facebook. But Facebook has a different definition and considers “content shared” to be everything about me that is visible, even if it normally requires considerable effort to find.

Finding pictures with Graph Search

Facebook made it clear that hiding photos on your timeline is no longer enough to prevent people from seeing them. With Graph Search, it’s now very simple to find all the photos of someone that are visible to you.
For instance, if one of your friends is tagged in pictures that he decided to remove from his timeline, these “hidden” photos will appear in Graph Search if you have access to them.
It was already possible to find “hidden” pictures a friend was tagged in, but it required a considerable amount of time and effort: you had to go through the list of all his friends and check their pictures in case your friend appeared in some of them. Unless you were really creepy, your friends were safe to assume that most of their “hidden” pictures would not be viewed by you. That’s no longer the case, and to control who can see your “hidden” pictures you’ll have to delete tags or ask your friends to limit the pictures’ visibility.

Removing tags is not the solution

Tagging is not only a feature used to annotate content; it is also how you find out when someone comments on a picture you appear in. If you delete the tag, you lose the ability to quickly see how people react to a photo. Not removing a tag is different from sharing a picture. Assuming that people want to share every picture they’re tagged in is wrong, especially when there is a “share” button that allows them to do precisely that.
Unlike posts on your timeline, tags don’t have to be reviewed before they appear in Graph Search. To control which photos of you appear in Graph Search, you have to visit Facebook frequently and remove unwanted tags. You have no option to proactively control your image on Facebook other than relying on your friends not to tag you without your consent.

The case of friends list

Strangely, Facebook did not adopt the same definition of “sharing” for the friends list. Assume we are friends: if you have decided to hide your friends list from your timeline, I can try to recompose it by visiting the timeline of each of your friends and checking that you appear as a mutual friend. This requires knowing some of your friends first, but that is fairly easy if they posted something on your timeline. By iterating this process, I could retrieve a subset of your friends. Like the photos you are tagged in, this subset is presumably shared by you, yet it won’t appear if I search for the list of your friends.
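The iterative process can be sketched as a breadth-first walk over a toy, in-memory graph. Nothing here talks to Facebook: the names, the friends_of dictionary (standing in for the friends lists that other people leave visible) and the function are all hypothetical, and the mutual-friend check is simplified to “the target appears in that person’s visible list”.

```python
from collections import deque


def recompose_friends(target, seeds, friends_of):
    """Recover part of a hidden friends list, starting from known friends."""
    confirmed = set()
    visited = {target}
    queue = deque(seeds)
    while queue:
        person = queue.popleft()
        if person in visited:
            continue
        visited.add(person)
        visible = friends_of.get(person, set())
        if target not in visible:
            continue  # not a friend of the target: do not explore further
        confirmed.add(person)
        # Friends of a confirmed friend become the next candidates to check.
        queue.extend(sorted(visible - visited))
    return sorted(confirmed)


# Friends lists that are visible on other people's timelines (toy data).
friends_of = {
    "alice": {"target", "bob", "carol"},
    "bob": {"target", "alice", "dave"},
    "carol": {"alice", "erin"},
    "dave": {"target", "bob"},
    "erin": {"carol"},
    "frank": {"target"},  # a friend nobody else links to
}

# "alice" posted on the target's timeline, so she is a known starting point.
result = recompose_friends("target", ["alice"], friends_of)
print(result)
```

In this toy graph the walk finds alice, bob and dave but never reaches frank, which illustrates why the process only yields a subset of the hidden list.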