FAT* 2018 Conference Session 5 Summary on FAT Recommenders, Etc.

This session of FAT* 2018 covers recommenders and related topics. Recommender systems are algorithmic tools for identifying items of interest to users. They are usually deployed to help mitigate information overload: Internet-scale item spaces offer many more choices than humans can process, which degrades the quality of their decisions. Recommender systems alleviate this problem by letting users focus more quickly on items likely to match their particular tastes. They are deployed across the modern Internet, suggesting products on e-commerce sites, movies and music on streaming media platforms, new connections on social networks, and many other types of items.

This session explains what Fairness, Accountability, and Transparency mean in the context of recommendation. The session also includes a paper on predictive policing, which is defined as ‘Given historical crime incident data for a collection of regions, decide how to allocate patrol officers to areas to detect crime.’

The Conference on Fairness, Accountability, and Transparency (FAT*), which will be held on the 23rd and 24th of February, 2018, is a multi-disciplinary conference that brings together researchers and practitioners interested in fairness, accountability, and transparency in socio-technical systems.

The FAT* 2018 conference will feature 17 research papers, 6 tutorials, and 2 keynote presentations from leading experts in the field. This article covers the research papers from the 5th session, which is dedicated to FAT Recommenders, etc.

Paper 1: Runaway Feedback Loops in Predictive Policing

Predictive policing systems are increasingly being used to determine how to allocate police across a city in order to best prevent crime. These systems update their models with discovered crime data (e.g., arrest counts). They have been empirically shown to be susceptible to runaway feedback loops, where police are repeatedly sent back to the same neighborhoods regardless of the true crime rate.

In response to this problem, the authors have developed a mathematical model of predictive policing that proves why this feedback loop occurs. The paper also empirically shows how this model exhibits such problems, and demonstrates ways to change the inputs to a predictive policing system (in a black-box manner) so that the runaway feedback loop does not occur, allowing the true crime rate to be learned.

Key takeaways:

The results stated in the paper establish a link between the degree to which runaway feedback causes problems and the disparity in crime rates between areas.
The paper also demonstrates ways in which reported incidents of crime (reported by residents) and discovered incidents of crime (directly observed by police officers dispatched as a result of the predictive policing algorithm) interact.
In this paper, the authors have used the theory of urns (a common framework in reinforcement learning) to analyze existing methods for predictive policing.
Formal as well as empirical results show why these methods will not work. The authors also provide remedies that can be applied directly to these methods in a black-box fashion to improve their behavior, along with theoretical justification for these remedies.
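The feedback loop described above can be illustrated with a small urn simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `simulate_urn` and `share_of_a`, the two-region setup, the starting point of one ball per region, and the specific correction rule (keeping a discovered incident only with probability equal to the other region's share of the urn) are all illustrative choices.

```python
import random


def simulate_urn(rate_a, rate_b, days, corrected=False, seed=0):
    """Two-region urn: each ball colour stands for one region.

    Hypothetical helper. rate_a and rate_b are the chances that a patrol
    sent to region A or B discovers a crime on a given day.
    """
    rng = random.Random(seed)
    balls = {"A": 1, "B": 1}  # assumed starting belief: one ball per region
    rates = {"A": rate_a, "B": rate_b}
    for _ in range(days):
        total = balls["A"] + balls["B"]
        # Send the patrol to a region with probability equal to its share of balls.
        region = "A" if rng.random() < balls["A"] / total else "B"
        other = "B" if region == "A" else "A"
        discovered = rng.random() < rates[region]
        if discovered and corrected:
            # Assumed correction: keep the incident only with probability equal
            # to the OTHER region's share, offsetting how often this one is visited.
            discovered = rng.random() < balls[other] / total
        if discovered:
            balls[region] += 1
    return balls


def share_of_a(balls):
    """Fraction of patrol attention currently assigned to region A."""
    return balls["A"] / (balls["A"] + balls["B"])


if __name__ == "__main__":
    print("uncorrected:", share_of_a(simulate_urn(0.3, 0.2, 20000)))
    print("corrected:  ", share_of_a(simulate_urn(0.3, 0.2, 20000, corrected=True)))
```

In the uncorrected run, every discovery reinforces the region that was already being visited, so region A's share tends to drift above what its crime rate alone would justify. Under the assumed correction, each added ball belongs to region A with probability 0.3 / (0.3 + 0.2), so the share settles near 0.6, which reflects the ratio of the true rates.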

Paper 2: All The Cool Kids, How Do They Fit In? Popularity and Demographic Biases in Recommender Evaluation and Effectiveness

Advances in information retrieval evaluation have demonstrated the importance of considering how effectiveness is distributed across diverse groups of varying sizes. This paper asks: ‘do users of different ages or genders obtain similar utility from the system, particularly if their group is a relatively small subset of the user base?’

The authors have applied this consideration to recommender systems, using offline evaluation and a utility-based metric of recommendation effectiveness to explore whether different user demographic groups experience similar recommendation accuracy. The paper shows that there are demographic differences in measured recommender effectiveness across two data sets containing different types of feedback in different domains; these differences sometimes, but not always, correlate with the size of the user group in question. Demographic effects also have a complex, and likely detrimental, interaction with popularity bias, a known deficiency of recommender evaluation.

Key takeaways:

The paper presents an empirical analysis of the effectiveness of collaborative filtering recommendation strategies, stratified by the gender and age of the users in the data set. The authors applied widely used recommendation techniques across two domains, musical artists and movies, using publicly available data.

The paper examines whether recommender systems produce equal utility for users of different demographic groups.
Using publicly available datasets, the authors compared utility, as measured with nDCG, for users grouped by age and gender. Regardless of the recommender strategy considered, they found significant differences in nDCG among demographic groups.
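The per-group comparison described above can be sketched as follows. This is an illustrative example, not the authors' evaluation code: it assumes binary relevance (an item either is or is not in the user's held-out set), and the names `ndcg_at_k`, `mean_ndcg_by_group`, and the sample group labels are hypothetical.

```python
import math
from collections import defaultdict


def ndcg_at_k(ranked_items, relevant_items, k=10):
    """Binary-relevance nDCG@k for one user's ranked recommendation list."""
    # DCG: each hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant_items
    )
    # Ideal DCG: all relevant items (up to k) placed at the top of the list.
    ideal_hits = min(len(relevant_items), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def mean_ndcg_by_group(recommendations, held_out, user_group, k=10):
    """Average nDCG@k over the users in each demographic group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for user, ranked in recommendations.items():
        group = user_group[user]
        totals[group] += ndcg_at_k(ranked, held_out.get(user, set()), k)
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


if __name__ == "__main__":
    recs = {"u1": ["a", "b"], "u2": ["b", "a"], "u3": ["c"]}
    held = {"u1": {"a"}, "u2": {"a"}, "u3": {"z"}}
    groups = {"u1": "18-24", "u2": "18-24", "u3": "50+"}  # made-up labels
    print(mean_ndcg_by_group(recs, held, groups, k=2))
```

Reporting the mean per group, rather than a single mean over all users, is what exposes the kind of gap the paper describes; a significance test across groups would be the natural next step.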

Paper 3: Recommendation Independence

In this paper, the authors present new methods that can deal with the variance of recommendation outcomes without increasing computational complexity. These methods can more strictly remove sensitive information, and experimental results demonstrate that the new algorithms can more effectively eliminate the factors that undermine fairness. Additionally, the paper explores potential applications for independence-enhanced recommendation and discusses its relation to other concepts, such as recommendation diversity.

Key takeaways from the paper:

The authors have developed new independence-enhanced recommendation models that can deal with the second moment of distributions without sacrificing computational efficiency.
The paper also explores applications in which recommendation independence would be useful, and reveals the relation of independence to other concepts in recommendation research.
It also presents the concept of recommendation independence, and discusses how the concept would be useful for solving real-world problems.
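One way to see why the second moment matters is to compare both the mean and the variance of predicted ratings between the two values of a sensitive feature. The sketch below is only a diagnostic under stated assumptions, not the authors' independence-enhanced model: the function name `moment_gaps`, the binary 0/1 encoding of the sensitive feature, and the sample numbers are all hypothetical.

```python
from statistics import fmean, pvariance


def moment_gaps(predicted, sensitive):
    """Compare predicted ratings between the two values of a binary sensitive feature.

    predicted: predicted ratings, one per (user, item) pair.
    sensitive: 0 or 1 for each prediction (assumed encoding).
    """
    group0 = [p for p, s in zip(predicted, sensitive) if s == 0]
    group1 = [p for p, s in zip(predicted, sensitive) if s == 1]
    if not group0 or not group1:
        raise ValueError("both sensitive groups need at least one prediction")
    return {
        # First moment: do the groups receive the same rating on average?
        "mean_gap": abs(fmean(group0) - fmean(group1)),
        # Second moment: do the ratings spread out equally in both groups?
        "variance_gap": abs(pvariance(group0) - pvariance(group1)),
    }


if __name__ == "__main__":
    predicted = [1.0, 2.0, 3.0, 2.0, 2.0, 2.0]
    sensitive = [0, 0, 0, 1, 1, 1]
    print(moment_gaps(predicted, sensitive))
```

In the sample data both groups have the same mean rating, so a check on means alone would report no dependence, yet the ratings for one group vary while the other's are constant. A variance gap like this is the kind of residual dependence that matching only the first moment would miss.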

Paper 4: Balanced Neighborhoods for Multi-sided Fairness in Recommendation

In this paper, the authors examine two different cases of fairness-aware recommender systems: consumer-centered and provider-centered. The paper explores the concept of a balanced neighborhood as a mechanism to preserve personalization in recommendation while enhancing the fairness of recommendation outcomes. It shows that a modified version of the Sparse Linear Method (SLIM) can be used to improve the balance of user and item neighborhoods, with the result of achieving greater outcome fairness in real-world datasets with minimal loss in ranking performance.

Key takeaways:

In this paper, the authors examine applications in which fairness with respect to consumers and to item providers is important.
They have shown that variants of the well-known sparse linear method (SLIM) can be used to negotiate the tradeoff between fairness and accuracy.
This paper also introduces the concept of multisided fairness, relevant in multisided platforms that serve a matchmaking function.
It demonstrates that the concept of balanced neighborhoods in conjunction with the well-known sparse linear method can be used to balance personalization with fairness considerations.
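The idea of a balanced neighborhood can be made concrete with a small penalty function. This is a hedged sketch, not the paper's formulation: it assumes a SLIM-style matrix of user-to-user coefficients and an illustrative penalty that squares the difference between the weight a user places on protected and on unprotected neighbors. The function name `neighborhood_balance_penalty` and the sample values are hypothetical, and the exact regularizer used by the authors may differ.

```python
def neighborhood_balance_penalty(weights, protected):
    """Sum over users of (protected weight - unprotected weight) squared.

    weights[u][v] is the assumed SLIM-style coefficient that user u places
    on neighbour v; protected[v] is True when v is in the protected group.
    """
    penalty = 0.0
    for row in weights:
        # Protected neighbours count positively, unprotected ones negatively,
        # so a perfectly balanced neighbourhood sums to zero.
        signed = sum(w if protected[v] else -w for v, w in enumerate(row))
        penalty += signed ** 2
    return penalty


if __name__ == "__main__":
    weights = [
        [0.0, 0.5, 0.5],  # user 0: equal weight on both groups, balanced
        [0.2, 0.0, 0.0],  # user 1: weight only on a protected neighbour
    ]
    protected = [True, False, True]
    print(neighborhood_balance_penalty(weights, protected))
```

Adding a term like this to the training objective, scaled by a tunable coefficient, would push each user's neighborhood toward drawing on both groups, which is one way to trade a small amount of ranking accuracy for more balanced outcomes.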

If you’ve missed our summaries of the previous sessions, visit the article links below to catch up.

Session 1: Online Discrimination and Privacy

Session 2: Interpretability and Explainability

Session 3: Fairness in Computer Vision and NLP

Session 4: Fair Classification