A company has implemented its first use case in the data science area. Now nothing seems to stand in the way of a data-driven business. However, the fairness of the models is often overlooked.
Imagine that your company has reached the end of its journey towards its first successful data science use case. The infrastructure has been built, the models trained, and the processes transformed. The results are excellent, and the step towards a data-driven business is proudly announced. What could go wrong now? Most data science use cases have a crucial blind spot: the fairness of the models and the consequences that follow from their decisions.
Design AI Algorithms Based On Fairness
Almost all companies face the task of creating value from their data. Many still fail at selecting, implementing, and operationalizing the right data science and artificial intelligence use cases. Often the results fall short because of poor data quality, or because a lack of data science expertise means the processes have not been adapted accordingly. It is therefore not surprising that in most cases no one asks whether the algorithms' decisions take fairness into account.
On the one hand, there is a lack of awareness of the topic; on the other, a lack of know-how about how algorithms can be designed with fairness in mind. The question arises as to how companies can ensure that their data science and AI use cases are free from discrimination.
How Do Algorithms Come To Discriminate?
The basic assumption is often that algorithms make objective decisions, i.e., decisions based on numbers, data, and facts. This assumption is not wrong, but it ignores the fact that the data on which algorithms are trained often contains real, existing discrimination, which is then transferred to the model during training. If, for example, an algorithm is supposed to filter applications for the most promising candidates, it will learn from the company's previous hiring decisions, which form the data basis for the training.
As early as 2014, Amazon had the experience that the algorithmic system for the company's recruitment process did not evaluate applicants in software development in a gender-neutral manner, but instead reproduced the previous recruitment pattern, in which male applicants and hires were overrepresented. The discovery of this fact caused an outcry in the media and led to an enormous loss of reputation for Amazon.
When Is A Decision Fair, And When Is It Unfair?
Whether made by a human or a machine, the fairness of a decision is not always clear-cut. For example, one could consider a decision fair if all groups receive positive or negative decisions in equal proportions. In a credit decision, this would mean classifying the process as fair when men and women are granted loans at similar rates. Alternatively, one could judge a decision fair if the approval rates of both groups are at the same level among those people who actually qualify for the loan.
In the loan example, this second criterion would mean that loan approvals are not evenly distributed across both groups across the board, but only among those who qualify for a loan. For this case, that is a more realistic and more economical definition of fairness, although it is not optimal for every application either. How the right definition is selected for a given use case is explained in the whitepaper "Relevance of fair algorithms for Company."
The credit example shows that different fairness criteria can be applied to the same question. The choice of the standard by which the fairness of a model is to be measured depends heavily on the business context. The sensitive attributes that need to be protected can also differ depending on the application.
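The two criteria from the credit example can be made concrete with a small sketch. The applicant data and group labels below are purely illustrative assumptions: the first criterion (often called demographic parity) compares approval rates across all applicants, while the second compares approval rates only among qualified applicants.

```python
# Toy credit data (hypothetical): each applicant is a tuple of
# (group, qualified, approved).
applicants = [
    ("A", True,  True), ("A", True,  True), ("A", False, False), ("A", False, True),
    ("B", True,  True), ("B", True,  False), ("B", False, False), ("B", False, False),
]

def approval_rate(rows):
    """Share of applicants in `rows` whose loan was approved."""
    return sum(approved for _, _, approved in rows) / len(rows)

for group in ("A", "B"):
    rows = [r for r in applicants if r[0] == group]
    qualified = [r for r in rows if r[1]]
    # Criterion 1: approval rate over all applicants of the group.
    print(group, "overall approval rate:", approval_rate(rows))
    # Criterion 2: approval rate among qualified applicants only.
    print(group, "approval rate among qualified:", approval_rate(qualified))
```

On this toy data the two criteria disagree: group B is approved far less often overall, but the gap among qualified applicants looks different, which is exactly why the choice of criterion matters.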
This Is How Fairness Gets Into Algorithms
To ensure the fairness of algorithms, there are technical options in development on the one hand and organizational levers in the company on the other. The technical options attach to the three steps of a classic machine learning pipeline: data pre-processing, modeling, and post-processing of the results:
- Pre-Processing: Transforming the data before model training, for example via resampling. Data points from underrepresented groups are artificially added to the training data set so that these groups are weighted more heavily.
- In-Processing: Adjusting the model during training, for example via regularization. The model receives a penalty term for decisions that violate the chosen definition of fairness and is thus incentivized to optimize the fairness of its decisions during training.
- Post-Processing: Adapting the model results after training, for example via equalized-odds post-processing. The model's outputs are corrected, based on the predicted probabilities, to balance their effects across groups.
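The pre-processing step, for example, can be sketched as a simple resampling routine. The data, group sizes, and labels below are assumptions for illustration, not from the whitepaper:

```python
import random

random.seed(0)  # make the resampling reproducible

# Toy training data (hypothetical): group B is underrepresented.
data = [("A", 1)] * 80 + [("B", 1)] * 20

groups = {"A": [d for d in data if d[0] == "A"],
          "B": [d for d in data if d[0] == "B"]}
target = max(len(rows) for rows in groups.values())

balanced = []
for rows in groups.values():
    balanced.extend(rows)
    # Draw additional samples with replacement until this group
    # reaches the size of the largest group.
    balanced.extend(random.choices(rows, k=target - len(rows)))

print({g: sum(1 for d in balanced if d[0] == g) for g in groups})
# → {'A': 80, 'B': 80}
```

After resampling, both groups contribute equally many training points, so a model trained on `balanced` no longer underweights group B simply because it is rare in the raw data.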
Steps On The Road To Awareness
In the company organization, too, there are important levers and guard rails for fairness awareness, which pave the way for non-discriminatory algorithms even before the actual solution development:
- Identify Stakeholders: Who will be affected by the decision? Who is involved in development and application?
- Raise Awareness In The Development Process: Does the development team understand the issue of fairness?
- Define Fairness For Each Business Context: Which definition of fairness is suitable for the modeled decision?
- Introduce An Audit Process For Algorithmic Solutions: How is it checked whether the algorithm makes fair decisions?
- Pay Close Attention To Research And Regulation: How is legislation changing, and which new methods are being developed?
The aim should be to develop a company- and business-specific bias impact statement that becomes an integral part of algorithmic solution development in the company. It can be adjusted to the organization's requirements but must be followed consistently and stringently across all development processes. It ensures the responsible handling of decision templates created by algorithms, as well as of fully automated decisions. In this way, companies can avoid algorithms that act unethically or violate applicable law.
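An audit process of this kind can include automated checks. The sketch below applies the "four-fifths rule" known from US employment law as one possible check: a group is flagged if its approval rate falls below 80% of the best group's rate. The approval rates and the threshold are illustrative assumptions, not prescriptions from the text:

```python
# Hypothetical audit check: flag any group whose approval rate is
# below 80% of the highest group's rate (four-fifths rule).
approval_rates = {"group_a": 0.60, "group_b": 0.42}  # assumed model outputs

best = max(approval_rates.values())
flags = {g: rate / best < 0.8 for g, rate in approval_rates.items()}
print(flags)
# → {'group_a': False, 'group_b': True}
```

In a real audit process, a flag like this would not be a verdict but a trigger for review against the fairness definition chosen for the business context.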
Automated Fairness Put To The Test
Automated fairness takes work and should be checked regularly. It is true that human-led decision-making processes also contain errors and sometimes systematically disadvantage groups or individuals. However, these are often accepted or dismissed as individual cases, especially because, in contrast to AI, they are usually not systematically recorded or analyzed. Algorithms can make decisions more fairly and more transparently than humans, but this requires intensive observation and constant adjustment of the models. If a company neglects this, existing inequalities in the data can be exacerbated, a risk that companies can no longer afford. Regulation in the field of artificial intelligence can also be expected to intervene more strongly at the national and transnational levels in the future, as the recent action of the EU shows.