Cognitive Biases and AI

Choosing the Right Weights in Algorithms

In our previous discussions (1, 2, 3, 4), we’ve explored the intricate relationship between cognitive biases and algorithms, and discussed the importance of weights in algorithms.

Recommender systems often predict and shape our online experience by assigning importance, or ‘weights,’ to our actions like clicks, likes, and shares.

Despite their critical role, there’s a surprising lack of research on how to choose these weights effectively.

Cornell Tech researchers Smitha Milli, Emma Pierson, and Nikhil Garg’s pivotal study, ‘Choosing the Right Weights: Balancing Value, Strategy, and Noise in Recommender Systems,’ provides a much-needed theoretical framework to guide this complex decision-making process.

How Facebook Adjusted Emoji Reaction Weights to Combat Misinformation and Toxicity

The selection of weights in recommender systems can have unforeseen consequences. As reported by The Washington Post, Facebook initially weighted emoji reactions at five times the value of a thumbs-up.

This was especially significant for the ‘angry’ reaction, which inadvertently promoted a surge in misinformation, toxicity, and poor-quality content on the platform.

As a result, Facebook progressively reduced the ‘angry’ reaction’s weight from five times a thumbs-up to four, then to one and a half, and ultimately to parity, to mitigate these adverse outcomes.
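As a rough illustration of why this weight mattered, consider a simplified scoring rule where a post’s rank score is a weighted sum of its reaction counts. The counts and the scoring function below are invented for illustration; this is not Facebook’s actual formula.

```python
# Hypothetical engagement scoring: a post's rank score is a weighted sum
# of its reaction counts. All counts here are invented for illustration.
def score(reactions: dict[str, int], angry_weight: float) -> float:
    weights = {"thumbs_up": 1.0, "love": 1.0, "angry": angry_weight}
    return sum(weights.get(r, 1.0) * n for r, n in reactions.items())

rage_bait = {"thumbs_up": 10, "angry": 100}  # divisive, anger-driven post
wholesome = {"thumbs_up": 90, "love": 25}    # broadly liked post

# Facebook's successive 'angry' weights, per The Washington Post's reporting.
for w in (5.0, 4.0, 1.5, 1.0):
    print(f"angry_weight={w}: rage_bait={score(rage_bait, w)}, "
          f"wholesome={score(wholesome, w)}")
```

At a fivefold weight, the anger-driven post dominates the ranking; only once the weight reaches parity with a thumbs-up does the broadly liked post overtake it.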

This example highlights that weight selection in recommender systems is often guided by the ad hoc judgment of platform staff rather than by a principled framework.

Optimal Weight Selection in Recommender Systems

Milli et al. (2023) delve into the complex task of determining optimal weights within recommender systems, considering the strategic responses of users and content creators to these weightings. They introduce a game-theoretic model where content producers compete for users’ attention, and the recommender system ranks producers based on a linear combination of predictions of various user behaviors.
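The ranking rule at the heart of this model can be sketched in a few lines. The producer names, behaviors, probabilities, and weights below are illustrative assumptions, not values from the paper; the point is only the mechanic of scoring by a linear combination of predicted behaviors.

```python
# Hypothetical example: a platform ranks content producers by a linear
# combination of predicted user behaviors. All numbers are illustrative.
predicted = {
    "producer_A": {"click": 0.30, "like": 0.02, "share": 0.01},  # clickbait-y
    "producer_B": {"click": 0.10, "like": 0.08, "share": 0.04},  # well-liked
}
weights = {"click": 0.2, "like": 1.0, "share": 1.5}  # platform-chosen weights

def score(behaviors: dict[str, float]) -> float:
    """Linear combination of predicted behavior probabilities."""
    return sum(weights[b] * p for b, p in behaviors.items())

# Rank producers by descending score.
ranking = sorted(predicted, key=lambda name: score(predicted[name]), reverse=True)
```

With these weights, producer B outranks producer A despite attracting fewer clicks, because likes and shares count for more; shifting weight toward clicks would reverse the order, which is exactly the lever the paper studies.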

They evaluate user behaviors through three pivotal lenses: value-faithfulness, strategy-robustness, and noisiness.

‘Value-faithfulness’ is a measure of how well a behavior, like a ‘like’ over a mere ‘click’, signals a user’s true preference for content.

‘Strategy-robustness’ examines the challenges content producers face when attempting to manipulate user behaviors for their benefit, such as crafting clickbait titles to boost engagement.

‘Noisiness’ indicates the variability in predictions that arises from limited sample sizes.

One might assume the best user experience comes from prioritizing the most value-faithful behaviors. However, the reality is more complex. Highly indicative behaviors may also carry higher unpredictability, influencing the algorithm’s performance. Optimal weight selection, therefore, requires a strategic balance between reinforcing value-faithfulness and reducing noise in predictions.
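This trade-off can be shown with a toy simulation. All distributions and parameters below are assumptions for illustration: behavior A is perfectly value-faithful but rarely observed (noisy), while behavior B is a biased but abundantly observed proxy. Leaning entirely on the faithful behavior can yield a worse estimate of true value than blending in the stable proxy.

```python
import random

random.seed(0)
TRUE_VALUE = 0.7  # assumed fraction of users who truly value an item

def observe(p: float, n: int) -> float:
    """Empirical rate from n Bernoulli(p) samples."""
    return sum(random.random() < p for _ in range(n)) / n

def mse(weight_a: float, trials: int = 2000) -> float:
    """Mean squared error of a weighted estimate of TRUE_VALUE."""
    err = 0.0
    for _ in range(trials):
        est_a = observe(TRUE_VALUE, 20)                # faithful, few samples
        est_b = observe(0.8 * TRUE_VALUE + 0.1, 500)   # biased proxy, many samples
        est = weight_a * est_a + (1 - weight_a) * est_b
        err += (est - TRUE_VALUE) ** 2
    return err / trials

only_faithful = mse(1.0)  # weight only the faithful, noisy behavior
blended = mse(0.7)        # mostly faithful, partly the stable proxy
```

Here the blended estimator has lower error than the purely value-faithful one: the bias it accepts from the proxy is smaller than the variance it sheds, which is the balance the paper formalizes.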

Case Studies in E-commerce and Social Media

The paper demonstrates the practical application of its theoretical framework through case studies of e-commerce platforms and social networks such as TikTok and Twitter.

It recommends beginning with an inventory of user behaviors and ranking each along the model’s three dimensions. The table below presents a hypothetical evaluation of three key behaviors on each platform for illustrative purposes.

For example, the study analyzes three behaviors on Twitter: ‘like’, ‘retweet’, and ‘reply’, all of which are actively used in the platform’s algorithms.

The researchers establish rankings for value-faithfulness and strategy-robustness: ‘like’ at the top, followed by ‘retweet’, then ‘reply’ (like > RT > reply).

‘Likes’ are deemed most reflective of user value and the most challenging for content creators to manipulate.

‘Retweets’ rank lower as they are often used to express disagreement, especially with added comments.

‘Replies’ are positioned at the bottom because offensive tweets might attract few likes but numerous replies.

Conversely, in terms of noisiness — the likelihood of prediction errors — the ranking is inverted: ‘reply’ is the noisiest, followed by ‘retweet’, and ‘like’ is the least noisy.

Hypothesized behavior ranking

                      e-commerce            TikTok                 Twitter
Value-faithfulness    order > cart > click  like > comment > play  like > RT > reply
Strategy-robustness   order > cart > click  like > comment > play  like > RT > reply
Noisiness             order > cart > click  comment > like > play  reply > RT > like

Anticipating and Mitigating Negative Outcomes

By thoroughly assessing these factors, platforms can proactively foresee and prevent potential negative consequences.

Recognizing the lesser strategy-robustness of certain behaviors is crucial. For instance, overemphasizing replies or retweets may inadvertently prompt content creators to game the system.

They might resort to posting contentious messages that, while generally disliked, still garner substantial engagement in the form of replies or retweets.

Such insights are vital for maintaining the integrity and quality of content on these platforms.

Game Theory and Platform Safety: Bridging Theory and Practice

Milli et al.’s paper strikes a personal chord with me, blending my academic background in game theory with my current role as a platform safety expert.

It’s a remarkable demonstration of how game theory principles can be effectively applied to enhance platform safety research. This intersection of academic theory and practical application underscores the broader relevance and impact of game theory in the realm of digital platform safety.