Abstract

Deep learning-based recommendation models take sparse and dense features as input, and these features carry users' private information. Service providers typically protect this data with methods such as memory encryption or hashing. In this paper, it is shown that, irrespective of the protection used, an attacker can still learn which entries of a sparse feature are non-zero by observing the embedding table access pattern, posing a serious threat to the security of users' sensitive data.

Problem Definition

Deep learning-based recommendation models exploit many kinds of information about the user, including user attributes, preferences, behavior, social interactions, and other contextual signals, to provide customers with better recommendations and companies with increased revenue. Two types of features serve as inputs to the deep neural network that predicts which items a user may like: sparse features and dense features.

Sparse features contain only a few non-zero entries, whereas dense features contain a large number of non-zero attributes.

These features store information about the user as well as the items in different forms. Sparse features are the discrete, categorical attributes associated with users and items, whereas dense features are the continuous, numerical ones. Because this information is personal to the user, the service provider protects it, for example with hardware-based memory encryption such as Intel's SGX. However, we will look at attacks aimed at stealing a user's personal information against which encryption or hash functions may not help: in particular, the positions of the non-zero entries of the sparse features can leak. This is because sparse features must be projected into a lower-dimensional space through an embedding table, where the index of each non-zero entry is used as the index for an embedding table lookup. This paper shows how this leakage alone is enough to threaten users' sensitive information.

An embedding table is a data structure that stores the embedding vectors of these sparse features, and its access pattern is the sequence of table rows (indices) that are read during lookups. The embedding table access pattern therefore reveals which entries of the sparse features are non-zero, even when the table contents themselves are protected.

The paper demonstrates how sensitive user information can be identified or extracted from embedding table access patterns under four different types of attacks:

  • Identification Attack: identifying a user through combinations of unidentifiable features
  • Sensitive Attribute Attack: inferring a user's sensitive attributes by analyzing their user-item interaction behavior
  • Re-Identification Attack: tracking the same user over time by analyzing their interaction history
  • Hash Inversion with Frequency-Based Attack: showing that hashing sensitive information may not protect it, by demonstrating a hash inversion attack based on access frequency

Working of the DLRM Model and Threat Model

[Figure 1: The DLRM architecture, with dense features flowing through a bottom MLP, sparse features through embedding tables, and both combined in a feature interaction layer followed by a top MLP]

The figure above shows the operation of a representative recommendation model, DLRM. In DLRM, the dense features go through a bottom MLP (multi-layer perceptron) layer, whereas the sparse features go through an embedding table layer and are converted into lower-dimensional dense features. These two outputs then pass through a feature interaction layer and a top MLP layer to predict the likelihood of an interaction. Embedding tables play a pivotal role in transforming sparse categorical features into dense numerical representations.

Consider an example where users and movies are represented by categorical features such as user IDs and movie genres, respectively. To use these features within a deep learning model, embedding tables are employed. These tables act as large lookup tables, with each row corresponding to a unique category or ID. To convert sparse features into dense representations, a lookup is performed using the non-zero entries of the sparse feature as indices. For instance, to convert a specific user's ID into a dense representation, the corresponding row of the user embedding table is accessed; similarly, for a movie genre, the relevant row of the genre embedding table is retrieved. The outcome is a dense vector, a numerical representation of the user or genre in a multi-dimensional space. These dense representations are then consumed by the recommendation model, enabling accurate, personalized movie recommendations based on user preferences and movie genres.
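A minimal sketch of this lookup (table sizes, names, and random contents are illustrative assumptions, not the paper's setup) makes the leakage concrete: the row index that is read is exactly the non-zero entry of the sparse feature.

```python
import numpy as np

# Hypothetical sizes for illustration only.
NUM_USERS, NUM_GENRES, DIM = 10_000, 50, 16

# Each embedding table is a matrix with one row per category value.
user_table = np.random.randn(NUM_USERS, DIM).astype(np.float32)
genre_table = np.random.randn(NUM_GENRES, DIM).astype(np.float32)

def embed(table: np.ndarray, index: int) -> np.ndarray:
    """Convert a sparse (one-hot) feature into a dense vector.

    The non-zero entry of the sparse feature *is* the row index, so the
    memory location touched here reveals that entry even if the row
    contents are encrypted.
    """
    return table[index]

user_vec = embed(user_table, 4242)   # row 4242 is read -> index leaks
genre_vec = embed(genre_table, 7)    # row 7 is read    -> index leaks
dense_input = np.concatenate([user_vec, genre_vec])  # fed to interaction/MLP layers
```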

Threat Model

Even when the dense and sparse features are fully encrypted and processed inside a secure environment, it is still possible to learn which index holds a non-zero entry by observing the table access pattern, compromising sensitive user data. Consider the following scenario: a user shares their sensitive information with the service provider to receive accurate recommendations. This information is fully protected, for example inside an Intel SGX enclave, but the access pattern of the embedding table is revealed, and more specifically, which entries of the table are non-zero. The figure below illustrates this threat model.

[Figure 2: Threat model, in which an honest-but-curious service provider observes embedding table access patterns even though the features themselves are encrypted]

Thus, even when the information itself is kept safe from an honest-but-curious service provider, the access pattern of the embedding table can reveal that sensitive information.

Overview

As mentioned earlier, the claim that sensitive information can be extracted from embedding table access patterns is tested with four different types of attacks.

| Attack | Goal | Assumptions | Evaluation Metric |
|---|---|---|---|
| Identification | Finding the identity of the users | Attacker observes accesses; has prior knowledge about the distribution of accesses | K-anonymity |
| Sensitive Attribute | Extracting sensitive user attributes | Attacker observes accesses; has prior knowledge about the distribution of accesses | Ambiguity |
| Re-Identification | Tracking users over time based on interaction history | Attacker observes accesses | Precision and Recall |
| Hash Inversion with Frequency-Based Attack | Finding users' raw feature values | Attacker observes accesses; has prior knowledge about the distribution of accesses | Inversion Accuracy |

This table summarizes the attacks, their goals, and the assumptions used to demonstrate that sensitive information is not safe. Each attack is discussed in detail below.

Identification Attack with Static User Features

User profile attributes, such as gender, city, etc., are usually static in nature, i.e., they do not change over time, or change only very infrequently. Such features can be divided into two categories: identifiable and unidentifiable features.

  • Identifiable features are the features that are capable of directly or indirectly revealing the identity of the user. For example, name, city of residence, userID, etc.
  • Unidentifiable features are the features that can’t directly expose the identity of the user but can still provide valuable information. For example, gender, education level, search keywords, etc.

Due to strict regulations, most recommendation systems do not collect or use identifiable features. So the question arises: are the unidentifiable features enough to make an accurate guess of who the user might be?

Evaluation Setup

To determine whether the unidentifiable features are enough to identify a user, an open-source dataset by Alibaba has been used, containing static user features such as userID, groupID, gender, age group, shopping depth, occupation, city level, etc. for around 11.4M users.

Attack Method

After removing all identifiable features from the dataset, 2.1M combinations of the unidentifiable features remain possible in principle, which would lead any user to believe that their identity is anonymous and that revealing the remaining unidentifiable features won't expose it. In contrast to this belief, however, only 1120 combinations of these static feature values actually occur in the real open-source data. We refer to these 1120 combinations as user buckets.

User buckets are unique combinations of feature values. The attacker's goal can now be stated as recognizing users based on their unique combination of features, as sketched below.
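As a rough illustration (the column names below are made-up assumptions, not the Alibaba dataset's actual schema), bucketing amounts to grouping users by their tuple of unidentifiable feature values:

```python
import pandas as pd

# Toy stand-in for the static user features; the column names are
# illustrative assumptions, not the dataset's real schema.
users = pd.DataFrame({
    "gender":     ["F", "M", "F", "M", "F"],
    "age_group":  [3, 3, 3, 5, 3],
    "city_level": [1, 2, 1, 4, 1],
    "occupation": [0, 1, 0, 1, 0],
})

unidentifiable = ["gender", "age_group", "city_level", "occupation"]

# A "user bucket" is one unique combination of unidentifiable features.
bucket_sizes = users.groupby(unidentifiable).size().sort_values(ascending=False)
print(len(bucket_sizes))  # number of buckets that actually occur
print(bucket_sizes)       # users per bucket (the long-tail distribution)
```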

Plotting the user bucket number on the x-axis and the percentage of users per bucket on the y-axis, the histogram shows that the user distribution follows a long-tail pattern.

[Figure 3: Histogram of the percentage of users per bucket, showing a long-tail distribution]

The histogram shows that buckets 600-1120 contain only a few users; in fact, there are only 989 users on average across these buckets, and the last 56 buckets contain just 1 user each. So these seemingly unidentifiable features may give away a user's identity, allowing the attacker to launch an identification attack, recover the unique userID, and identify the user with high certainty.

Evaluation Metric

The evaluation metric used for this analysis is K-anonymity.

K-anonymity is a privacy property that measures how well user privacy is preserved.

If a user's bucket number is revealed and there are K users in the same bucket, then the probability of identifying the user is 1/K. The table below summarizes the number of users with an anonymity level below K under the identification attack.

[Table: number of users with anonymity level below K in the identification attack]

A 1-anonymity user is the only user having that particular set of feature values.
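Continuing the toy bucket sketch above, each user's anonymity level is simply the size of their bucket, so counts like those in the table can be reproduced by thresholding bucket sizes (a sketch under the same assumed schema):

```python
# Each user's anonymity level K is the size of their bucket; an
# attacker who learns the bucket identifies the user with probability 1/K.
for K in (1, 2, 5, 10):
    n_users = bucket_sizes[bucket_sizes <= K].sum()
    print(f"users with anonymity level <= {K}: {n_users}")
```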

Evaluation Results

As shown in the table above, 56 buckets contain only a single user with that specific combination of static features, which means an attacker who observes such a combination of feature values can identify those users with 1-anonymity.

Sensitive Attribute Attack with Dynamic User Features

This attack shows that even when users hide sensitive attributes such as age, gender, and interests, those attributes can be inferred by analyzing user-item interaction behavior: the sensitive features leak through other, non-sensitive features via cross-correlations.

Evaluation Setup

For the evaluation, the Alibaba Ads Display dataset has been used, which contains user-item interactions. The dataset contains around 723,268,134 tuples, and each tuple records a userID, categoryID, brand, and btag (browse, cart, favour, buy).

Attack Method

Let's understand the attack method with an example: say the data contains 7 age groups and 5 different brands. The user-item interactions between age groups and brands are given in the connection graph below.

[Figure: connection graph of user-item interactions between age groups and brands]

From the graph above, a basic idea of which people belong to a particular age group can be formed. A user may not want to reveal their age, yet the adversary may deduce it with high probability from the brands the user has interacted with. In general, the attacker uses prior knowledge of the popularity of items among different demographic groups, and links a query to the demographic that accounts for most of the accesses to that item. Acquiring this prior information is not difficult, since item popularity statistics across demographics can be estimated from public or historical data.

Evaluation Metric

The evaluation metric used for this analysis is called ambiguity. It captures the likelihood that an adversary fails to predict a user's static sparse features by viewing only their interactions with items. The ambiguity of each item is defined as

$\text{ambiguity}(i) = 100\% - \max(\text{frequency}(i))$

where $\text{frequency}(i)$ is the distribution vector of all accesses to brand $i$ across the different user groups.

An $\text{ambiguity}(i) = 0$ indicates that if a user has interacted with item $i$, the attacker can determine the user's sparse features with certainty, as the sketch below illustrates.
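A minimal sketch of the metric on a toy interaction log (the column names and group labels are illustrative assumptions):

```python
import pandas as pd

# Toy interaction log: which user bucket accessed which brand.
log = pd.DataFrame({
    "brand":  ["b1", "b1", "b1", "b2", "b2", "b3"],
    "bucket": ["age18", "age18", "age25", "age18", "age25", "age35"],
})

def ambiguity(brand: str) -> float:
    """ambiguity(i) = 100% - max over user groups of the share of
    accesses to brand i made by that group."""
    freq = log.loc[log["brand"] == brand, "bucket"].value_counts(normalize=True)
    return 100.0 * (1.0 - freq.max())

print(ambiguity("b1"))  # ~33.3: one group makes 2/3 of the accesses
print(ambiguity("b3"))  # 0.0: a single group makes every access -> no ambiguity
```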

Evaluation Results

In the graphs below, the x-axis shows the percentage of ambiguity, where a value of 0 indicates no ambiguity at all (the brand is accessed by only one user bucket), and higher ambiguity values indicate brands that are popular across multiple user buckets. In figure 5(A), it can be seen that more than 17% of brands are accessed by only 1 user bucket, represented by the leftmost tall bar of the PDF, meaning the attacker can determine the user bucket from those brand interactions. Furthermore, the CDF curve shows that for 38% of the brands, the attacker can predict the user bucket with a success rate greater than 50%. The ambiguity for age groups and gender groups is shown in figures 5(B) and 5(C), respectively.

Re-Identification Attack

In the re-identification attack, the attacker focuses on tracking the same user over time by just analyzing their interaction history.

The re-identification attack is different from an identity resolution attack, in which the aim is to link users across different systems, potentially by cross-referencing user information.

Under this attack, we study two important things:

  • if the history of the purchases of a user can be used as a tracking identifier for the user.
  • if an attacker can re-identify the same user who sent queries over time by only tracking the history of their purchases, with no access to the static sparse features.

Evaluation Setup

For the evaluation, the Taobao dataset has been used, from which about 9M purchase interactions were separated out of more than 723M user-item interactions. The data is then formatted into a time-series structure, as shown below:

user1 = (time1, item1), (time4, item10), (time500, item20)
user2 = (time3, item100), (time20, item100)
...
user_X = (time5, item75), (time20, item50), (time100, item75), (time400, item1), (time420, item10)

For each set of consecutive items purchased by a user, a list is created of all users who share the same set of consecutive purchases in exactly that order. These sets of consecutive recent purchases are referred to as keys.

Multiple users may have the same key.

The goal of this attack is to use the m most recent purchases made by a user to track them across different interaction sessions. The evaluation of this attack is carried out as follows:

  • randomly select a timestamp and a user
  • for that selected user, we check m most recent purchases at that selected timestamp and form a key
  • we then look up this key in the recent item purchase history dataset
    • if the same sequence of m most recent items appears for another user at the same timestamp, those recent purchases are not unique to that specific user at that time and hence do not identify a single user.
    • if the m-item purchase history belongs only to that specific user, the duration for which this key forms the user's most recent purchases is extracted.
  • the same is repeated for many random timestamps and users to obtain 200,000 samples; a sketch of this sampling procedure follows the list.
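A minimal sketch of this sampling loop on toy data (the dictionary layout, variable names, and tiny purchase log are illustrative assumptions):

```python
import random

# purchases: user -> time-ordered list of (timestamp, item). Toy data;
# the real evaluation samples 200,000 (user, time) points from ~9M
# Taobao purchase interactions.
purchases = {
    "u1": [(1, "i1"), (4, "i10"), (500, "i20")],
    "u2": [(3, "i100"), (20, "i100"), (30, "i7")],
}
M = 2  # number of most recent purchases forming the key

def key_at(user: str, t: int) -> tuple:
    """The user's M most recent items at time t, in purchase order."""
    items = [item for ts, item in purchases[user] if ts <= t]
    return tuple(items[-M:])

def sample_uniqueness(n_samples: int) -> float:
    """Fraction of sampled (user, time) points whose key is unique."""
    users = list(purchases)
    unique = 0
    for _ in range(n_samples):
        u = random.choice(users)
        t = random.choice([ts for ts, _ in purchases[u]])
        owners = {v for v in users if key_at(v, t) == key_at(u, t)}
        unique += (owners == {u})
    return unique / n_samples

print(sample_uniqueness(10_000))
```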

[Figure: uniqueness of the m most recent purchases, and the precision-recall tradeoff for time thresholds from 1 second to 277 hours]

Plotting this data shows that the 3, 4, or 5 most recent purchases uniquely identify users with 99% probability.

Attack Method

The most recent items purchased by a user usually do not change very frequently, so for the period during which the recent purchases remain the same, every query sent by the user carries the same list of recent purchases. The attacker exploits this knowledge to launch the attack: first, the attacker selects a time threshold, which is used to decide whether two queries come from the same user.

If two distinct queries received by the cloud have the same most recent purchases and the time difference between receiving them is less than the threshold, the attacker predicts that they come from the same user; otherwise, the queries are assumed to come from two different users.
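A sketch of this linking rule (the field names are illustrative; the 11-day threshold comes from the evaluation discussed below):

```python
# Two queries are attributed to the same user iff they carry the same
# recent-purchase key and arrive within the chosen time threshold.
THRESHOLD = 11 * 24 * 3600  # 11 days, in seconds

def same_user(q1: dict, q2: dict, threshold: int = THRESHOLD) -> bool:
    return (q1["recent_key"] == q2["recent_key"]
            and abs(q1["time"] - q2["time"]) <= threshold)

q_a = {"time": 1_000, "recent_key": ("i75", "i50")}
q_b = {"time": 5_000, "recent_key": ("i75", "i50")}
print(same_user(q_a, q_b))  # True: same key, within the threshold
```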

Evaluation Metrics

We use the concept of precision and recall to measure the accuracy of this attack:

Precision = $\frac{TP}{(TP+FP)}$

Recall = $\frac{TP}{(TP+FN)}$

where TP = True Positives, FP = False Positives, and FN = False Negatives.

Precision is asking the question “Of all the items model has predicted positive, how many are truly positive?” whereas recall is asking the question “Of all the truly positive items in the dataset, how many did the model correctly identify?”

Evaluation Results

In evaluating the precision-recall tradeoff, we start with a low time threshold, which yields high precision but low recall, and increase it gradually. As the threshold increases to obtain greater recall, false positives also rise, reducing precision. This is due to instances where different users generate the same key over time: when the two most recent purchases were used as the key, around 4.5 million keys occurred approximately 8 million times, leading to false positives in the experiments. The choice of the optimal threshold depends on whether the attacker prioritizes recall or precision. The figure a few steps above illustrates this tradeoff for thresholds from 1 second to 277 hours, showing that a threshold of 11 days achieves a recall of 1.0 with a 0.02 drop in precision. This allows the attacker to correctly link queries from the same users at the cost of misclassifying 2% of queries that share identical keys at some point in their purchase history. The high precision and recall values show how an attacker can track users who send queries to the recommendation model over time.

Hash Inversion with Frequency-Based Attack

Utilizing a hash function on indices before performing a table lookup for embeddings is a crucial performance optimization. In this context, we examine the influence of hashing on information leakage. Specifically, we investigate how an attacker could potentially reconstruct the original values of sparse features, even when hashing is applied to embedding indices. The process involves remapping users’ raw data to post-hash values for indexing the embedding tables, as illustrated in the figure below.

[Figure: remapping of users' raw feature values to post-hash indices used for the embedding table lookup]

Evaluation Setup

For the evaluation, the Taobao, Kaggle, and Criteo datasets have been used. Training and testing sets were selected for all 3 datasets; the training samples form the prior distribution and the test samples are used for the evaluation.

Attack Method

An adversary can mount the attack by gathering observed index frequencies, utilizing prior knowledge about the feature value distribution, and discerning the mapping between the input and output of the hash. Here it is illustrated how a system with hashed input values can be compromised, where the hash function is expressed as $\text{output} = (\text{input} + mask_{add}) \bmod P$, with $P$ the hash size. We denote the frequencies of the potential inputs to the hash function as $x_1, x_2, \ldots, x_N$ for $N$ possible values, and the output frequencies as $y_1, y_2, \ldots, y_P$ for a hash size $P$. We construct the matrix $M \in \mathbb{R}^{P \times P}$, where each column corresponds to a distinct mask value in $[0, P-1]$: for each mask value, we compute the resulting output frequencies and store them as a column. As demonstrated, increasing the mask value by 1 cyclically shifts the column values, making $M$ a Toeplitz (in fact circulant) matrix. Since a single column is shifted and repeated, forming this matrix costs only $O(P)$. The matrix $M$ can be written as:

$$M = \begin{pmatrix} y_1 & y_P & \cdots & y_2 \\ y_2 & y_1 & \cdots & y_3 \\ \vdots & \vdots & \ddots & \vdots \\ y_P & y_{P-1} & \cdots & y_1 \end{pmatrix}$$

The attacker's objective is to invert the hash using the input distribution and observations of the output distribution. It is crucial to note that the input and output datasets should be independent. We define $a_t$ as the distribution of embedding table accesses (post-hash) at time $t$. To recover the mask used by the hash function, the attacker solves the optimization problem

$\min_i \|m_i - a_t\|^2 = \min_i \left( \|m_i\|^2 + \|a_t\|^2 - 2 m_i^T a_t \right)$

where $m_i$ is the vector of output value frequencies when mask $i$ is used. Since each $m_i$ is a permutation of the same frequencies, $\|m_i\|$ is constant across $i$, and $\|a_t\|$ does not depend on $i$ either, so the optimization simplifies to:

$\bar{P} = \arg\max_{i \in [0, P-1]} \, m_i^T a_t$, or equivalently $\bar{P} = \arg\max_i \left( M^T a_t \right)$

The matrix-vector product, naively $O(P^2)$, can be computed in $O(P \log P)$ thanks to the Toeplitz structure of $M$ (Strang, 1986). In executing the attack, two sets are established: one for extracting the known distribution and another for frequency matching. The attacker thereby recovers the hash key, inverts the post-hash indices, and reveals the raw sparse feature values by identifying the most frequent pre-hash values according to the input distribution. The entire process, key discovery, post-hash index reversal, and deciphering of raw feature values, shows the system's vulnerability to this frequency-based attack.
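A minimal sketch of the mask-recovery step (the sizes, the Zipf-like prior, and the noiseless observation are illustrative assumptions; it also assumes $N = P$, so the hash is a pure cyclic shift):

```python
import numpy as np

P = 1024  # hash size; N = P assumed for this sketch

# Attacker's prior: frequency of each possible pre-hash value,
# e.g. estimated from a public or older dataset (Zipf-like here).
prior = 1.0 / np.arange(1, P + 1)
prior /= prior.sum()

true_mask = 313  # secret additive mask of h(x) = (x + mask) % P

# Observed post-hash access frequencies a_t: the prior shifted by the
# mask (plus sampling noise in practice).
observed = np.roll(prior, true_mask)

# Column i of the circulant matrix M is the prior shifted by i, so
# argmax_i m_i^T a_t is a circular cross-correlation, computable for
# all i at once in O(P log P) with the FFT.
scores = np.fft.ifft(np.fft.fft(observed) * np.conj(np.fft.fft(prior))).real
recovered_mask = int(np.argmax(scores))
assert recovered_mask == true_mask

# With the mask known, every post-hash index can be mapped back:
def invert(y: int) -> int:
    return (y - recovered_mask) % P
```

The FFT evaluates all $P$ inner products $m_i^T a_t$ simultaneously, which is where the $O(P \log P)$ cost comes from.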

Evaluation Metric

Accuracy here is the probability that the attacker correctly identifies the input raw value from the post-hash value. Let $g(y)$ be the attacker's estimate of the input given the output query $y$:

$g(y) = \arg\max_x \text{Prob}(x) \quad \text{subject to} \quad \hat{h}(x) = y$

where $\hat{h}$ is the attacker's estimate of the hash function. Then

$\text{Accuracy} = \text{Prob}_{x \sim P_x}\left( x = g(h(x)) \right)$

where $h(x)$ is the true hash function and the probability is taken over the distribution of the input query. We also introduce top-K accuracy, the likelihood that the input query falls within the attacker's top K predictions. Using the notation $\hat{S}(y) = \{ x \mid \hat{h}(x) = y \}$ for the set of possible inputs given an output query $y$ under the attacker's estimated hash function, we define $g_K(y) = \{ x \in \hat{S}(y) \mid \text{Prob}(x) \text{ is among the top } K \text{ probabilities} \}$, i.e., the $K$ members of $\hat{S}(y)$ with the largest probability. In essence, $g_K(y)$ is the set of the attacker's top K predictions for the input query, and

$\text{Accuracy}_{\text{top }K} = \text{Prob}_{x \sim P_x}\left( x \in g_K(h(x)) \right)$
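A sketch of these definitions on a toy hash (the Dirichlet prior is an illustrative assumption, as is the attacker having already recovered the true mask, e.g. via the frequency matching sketched earlier):

```python
import numpy as np

# With N > P, several inputs collide on each post-hash value; the
# attacker guesses the K most probable pre-images under the prior.
N, P, mask = 20, 8, 3
prob = np.random.default_rng(1).dirichlet(np.ones(N))  # prior over inputs

def h(x: int) -> int:
    """True hash; assumed equal to the attacker's estimate here."""
    return (x + mask) % P

def g_K(y: int, K: int) -> list:
    """Top-K most probable inputs consistent with output y."""
    candidates = [x for x in range(N) if h(x) == y]
    return sorted(candidates, key=lambda x: prob[x], reverse=True)[:K]

def top_k_accuracy(K: int) -> float:
    """Prob_{x ~ P_x}(x in g_K(h(x)))."""
    return float(sum(prob[x] for x in range(N) if x in g_K(h(x), K)))

print(top_k_accuracy(1), top_k_accuracy(3))
```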

Evaluation Results

The key observation for this attack is that if an attacker observes the frequency of queries, they can reconstruct the raw feature values with high accuracy, given knowledge of the distribution of pre-hash values and the type of hash function.

Potential Solutions

One approach to obfuscating the embedding table access pattern, mentioned by the authors, is to use Oblivious RAM (ORAM). ORAM does succeed in hiding which real blocks are accessed from the attacker. However, its overhead is unlikely to be acceptable for a real-time application such as recommendation inference, which must meet Service Level Agreements (SLAs). Moreover, it relies on a pre-determined sequence of accesses during training and is not applicable to inference.

Conclusion

In this paper, the authors shed light on information leakage through sparse features in deep learning-based recommendation systems and show how access patterns alone can pose a serious threat to privacy. They demonstrated this with four different types of attacks.

Author Information Section

Anurag Yadav