A View-Based Protection Model to Prevent SNS API Inference Attacks

The extensibility of social networks has contributed significantly to their popularity. However, it comes at the price of exposing user information to third-party extensions. Permission-based access control mechanisms can control access to user information, but they cannot control the inference of private information from public information.


Modern social computing platforms (e.g., Facebook) are extensible. Third-party developers deploy extensions (e.g., Facebook applications) that augment the functionality of the underlying platforms. Social networks provide a platform API so that third-party extensions can connect to the social graph and access user information. Although this has resulted in a drastic growth in the popularity of social networks, it has also raised serious concerns about potential misuse of the user information made accessible through the API. Without doubt, there are privacy concerns associated with every information system. However, when an information system provides tools for third parties to systematically access and harvest its content, the privacy concerns are significantly heightened. This motivated me to focus my PhD thesis on addressing privacy threats in social computing platforms that arise from third-party extensions.

Social network providers put a lot of effort into protecting the privacy of their users. Permission-based authorization mechanisms are employed to allow users to determine different levels of access to their information for other users in the social graph as well as for third-party extensions. However, these protection mechanisms fail to prevent the inference of users’ private information from their public information. This type of privacy breach is generally called an inference attack. We coined the name SNS API inference attacks for the inferences that are made based on the information accessible through the platform API of Social Network Systems (SNS). I conducted an empirical study to demonstrate the inadequacy of the existing mechanisms in protecting user privacy. In this study, I developed a third-party application for the Facebook platform API and asked 424 Facebook users to subscribe to my application. The application then executed several sample inference algorithms against the participants’ user profiles. The measured success rates of the sample algorithms were alarmingly high. For instance, one of the algorithms could successfully infer the youngest sibling for 69% of the participants. The complete results of this experiment were reported in [1].

Significance of the Problem

A naïve interlocutor may argue that the above issue has already been addressed by the permission-based access control mechanism, in that third-party extensions cannot access user information without seeking the required permissions. If a user does not trust a third-party application, then she should not authorize it or use it. This argument presumes that ordinary users have the necessary information and expertise to judge whether the applications they subscribe to are benign. In reality, most third-party applications are developed by developers who are not widely known to the user community. Moreover, the application runs on an untrusted server, meaning that there is no mechanism to monitor whether the application is malicious. It is therefore not always possible for a user to assess whether she can trust an application. It is our position that security-by-disclaimer is not a meaningful protection strategy.

An interlocutor may also claim that SNS API inference attacks are but another minor privacy violation that does not warrant our attention. I disagree for two reasons. First, analyzing the threat of any security or privacy concern must be accompanied by assessing the number of potential victims. If one develops a website with around 100 registered users, revealing their registration information means violating the privacy of only 100 users. However, when the number of potential victims reaches 50 million, we are facing a problem with costly consequences. Popular Facebook applications may command a monthly active user count of 50 million. This implies that an inference attack with a meagre success rate of 10% leads to privacy violations for 5 million victims. Second, SNS API inference attacks can be employed as a building block for conducting more dangerous security attacks. For instance, a well-known alternative authentication mechanism is to ask users a security question such as “what is the name of your youngest sibling?” or “who is your favorite author?”. Due to the nature of the information that people upload to their SNS user profiles, answers to these security questions can usually be harvested systematically by launching inference attacks. The ability to answer a victim’s security questions is the first step of identity theft. Inference attacks can therefore be the initial step in launching more dangerous attacks. Now, who is best positioned to launch covert inference attacks? The answer is third-party extension developers.

View-based Protection

The discussion above shows that controlling access is insufficient for preventing SNS API inference attacks. The reason is that there exists a statistical correlation between sensitive information (which the user attempts to hide) and accessible information (to which the user allows access). A malicious third-party application can exploit this correlation to infer sensitive information from the information that is legitimately accessible under the access control model. The key to protection is thus breaking the correlation rather than simply denying access to sensitive information. In my PhD thesis, I advocate a view-based protection model. Under this model, when a third-party application A queries the profile P of user u through the API, the query Q is not evaluated against P itself. Instead, P first undergoes a sanitizing transformation T, and Q is then evaluated against the sanitized profile T(P). The transformation T is called a view, which is specified by the user and/or the platform. T is thus an enforcement-layer privacy policy. A view may eliminate certain attributes (access control), or probabilistically transform the profile with the aim of perturbing the statistical correlation between sensitive and accessible information. In other words, view-based protection subsumes access control. The mathematical formulation of privacy and utility goals, and the proof method for establishing that a given view satisfies the two goals, are the topics of a recent paper [2] that I published with the help of my supervisors, Philip Fong and Reyhaneh Safavi-Naini.
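The following Python fragment is a minimal sketch of the idea that Q is evaluated against T(P) rather than P. It is an illustration only, not the mechanism from the thesis: the dictionary representation of a profile and the names `drop_attributes`, `randomize_attribute`, `compose`, and `answer_query` are assumptions made for this example, and the uniform random replacement is just one simple stand-in for a probabilistic transformation.

```python
import random
from typing import Any, Callable, Dict, List

# Hypothetical representations, chosen only for this sketch.
Profile = Dict[str, Any]
View = Callable[[Profile], Profile]   # the sanitizing transformation T
Query = Callable[[Profile], Any]      # the query Q issued by application A


def drop_attributes(*hidden: str) -> View:
    """Access-control style view: remove the named attributes."""
    def view(profile: Profile) -> Profile:
        return {k: v for k, v in profile.items() if k not in hidden}
    return view


def randomize_attribute(name: str, domain: List[Any], keep_prob: float,
                        rng: random.Random) -> View:
    """Perturbing view: keep the true value with probability keep_prob,
    otherwise replace it with a value drawn uniformly from domain."""
    def view(profile: Profile) -> Profile:
        out = dict(profile)  # never mutate the stored profile P
        if name in out and rng.random() >= keep_prob:
            out[name] = rng.choice(domain)
        return out
    return view


def compose(*views: View) -> View:
    """Build a larger view by applying simpler views in order."""
    def view(profile: Profile) -> Profile:
        for v in views:
            profile = v(profile)
        return profile
    return view


def answer_query(query: Query, profile: Profile, view: View) -> Any:
    """Evaluate Q against T(P), never against P itself."""
    return query(view(profile))
```

In this sketch, a view that only drops attributes behaves like ordinary access control, while adding a randomizing step weakens the link between what the application sees and what the user wants to hide. That mirrors the claim that view-based protection subsumes access control.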

Challenge of View Materialization

How shall one implement view-based protection in an efficient manner? A naïve approach is to compute T(P) every time P is queried. The problem is that P can be large (imagine everything in one’s timeline, photo albums, etc.), thereby causing even the most innocent query Q to be penalized in performance. Another approach is to have the SNS store both P and T(P). The problem is that T(P) will have to be recomputed every time P is updated (which happens frequently). Moreover, T is specific to the user u and the application A, meaning that the SNS needs to store a T(P) for every application A that user u subscribes to, which is a space-inefficient option. In another paper, which is still under peer review for publication, I propose a middle way. The computation of T(P) is called the materialization of view T. I argue that materialization should be performed in a lazy manner, at the time of query. To see why, note that the query Q may not access all components of profile P. Instead of eagerly applying T to the entire profile P, we apply T only to the parts of P that are visible to query Q. A simple query that involves only a small fragment of profile P will therefore incur only a meager amount of materialization, thereby avoiding the performance penalty of the first approach. As T is computed at the time of query, there is no need to maintain multiple materialized views, thereby avoiding the view maintenance problem of the second approach. In [3], I propose a language-independent enforcement mechanism that materializes a view in a lazy manner. Moreover, I present a new type of state machine to model sanitizing transformations so that they fit into the proposed enforcement framework. The state machines are composable, which means complex transformations can be built by composing simpler transformations. Via another experiment, the performance and effectiveness of the view-based protection model are also demonstrated. Please refer to my thesis for further details [3].
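The contrast between eager and lazy materialization can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the enforcement mechanism or the state machines of [3]: it assumes the view T can be expressed as independent per-attribute sanitizers, and the class name `LazyView` and its `materialized` log are hypothetical names introduced here.

```python
from collections.abc import Mapping
from typing import Any, Callable, Dict, FrozenSet, Iterator, List

Sanitizer = Callable[[Any], Any]


class LazyView(Mapping):
    """Read-only profile whose components are sanitized only when read.

    Assumption for this sketch: T decomposes into one sanitizer per
    attribute, so each component can be transformed independently.
    """

    def __init__(self, profile: Dict[str, Any],
                 sanitizers: Dict[str, Sanitizer],
                 hidden: FrozenSet[str] = frozenset()) -> None:
        self._profile = profile
        self._sanitizers = sanitizers
        self._hidden = hidden
        self.materialized: List[str] = []  # components sanitized so far

    def __getitem__(self, key: str) -> Any:
        # Hidden attributes look exactly like missing ones to the query.
        if key in self._hidden or key not in self._profile:
            raise KeyError(key)
        self.materialized.append(key)
        sanitize = self._sanitizers.get(key, lambda value: value)
        return sanitize(self._profile[key])

    def __iter__(self) -> Iterator[str]:
        return (k for k in self._profile if k not in self._hidden)

    def __len__(self) -> int:
        return sum(1 for _ in self)


# Usage: the query reads one attribute, so only that one is sanitized.
profile = {"name": "u", "timeline": ["post"] * 1000, "hometown": "Calgary"}
view = LazyView(
    profile,
    sanitizers={"hometown": lambda v: "Alberta",       # coarsen the value
                "timeline": lambda posts: posts[:10]},  # truncate history
    hidden=frozenset({"name"}),
)
result = view["hometown"]  # the large timeline is never touched
```

Because nothing is stored between queries, an update to the underlying profile is picked up automatically on the next read, and no per-application copy of the sanitized profile has to be kept.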

I am very thankful to my supervisors, Dr. Philip Fong and Dr. Reyhaneh Safavi-Naini, for their wonderful support and contributions.

  1. Seyed Hossein Ahmadinejad and Philip W. L. Fong. Unintended Disclosure of Information: Inference Attacks by Third-Party Extensions to Social Network Systems. Computers and Security, 44:75-91, July 2014. Elsevier.
  2. Seyed Hossein Ahmadinejad, Philip W. L. Fong, and Rei Safavi-Naini. Privacy and Utility of Inference Control Mechanisms for Social Computing Applications. To appear in Proceedings of the 11th ACM Asia Conference on Computer and Communications Security (ASIACCS’2016), Xi’an, China, May 30 – June 3, 2016.
  3. Seyed Hossein Ahmadinejad. A View-Based Protection Model to Prevent Inference Attacks by Third-Party Extensions to Social Computing Platforms. PhD Thesis, University of Calgary, Calgary AB, Canada, 2015.