Human-centered AI Archives

A successful data culture is the key for companies to derive the greatest possible benefit from the constantly growing amount of data. The trend towards making decisions based on data is unstoppable. But how do managers manage to empower their teams to use data effectively?

A vibrant data culture: the fuel for corporate success

Data culture is more than just a buzzword – it is the basis for data-driven decision-making. When all departments in a company use data to improve work processes and decision-making, an atmosphere is created in which competent handling of data is standard practice.

Why is this so important? Data is fuel for business success: 76 percent of participants in the BARC Data Culture Survey 22 stated that their company strives for a data culture. And 75 percent of managers see data culture as the most important competence.

The decisive role of managers

An established data culture is not only a success factor for the company, but also a way to promote innovation and motivate employees. Managers play a central role here by acting as trailblazers and actively supporting change. They must clearly communicate the benefits of a data culture, establish clear guidelines for data protection and data quality and offer targeted training and regular communication on progress. Clear responsibility for data culture is crucial, as 31% of companies with a weak data culture do not have a dedicated department or person with this responsibility.

Challenges and solutions

The path to a successful data culture is littered with hurdles. Managers have to face various challenges:

Resistance to change: A transition to a data culture can be met with resistance. Managers must clearly communicate the benefits and offer training to involve their employees in the change process.
Lack of data governance: Guidelines and standards for handling data are crucial. If these are missing, data quality will be reduced in the worst case. This leads to wrong decisions. This is where data cleansing and validation methods and regular audits are needed.
Concerns regarding data protection: Data protection and data access are often in conflict. Clear guidelines and security measures must be introduced in order to gain the trust of employees.
Lack of resources and support: Without the necessary resources, building a data culture can fail. Companies must provide targeted training and demonstrate the business benefits in economic metrics to gain the support of their executives.

Best practices for a strong data culture

To effectively establish a data culture, companies can rely on the following best practices:

Critical thinking: Promoting critical thinking and ethical standards is crucial. Data and AI solutions will become tools of everyday life everywhere. Therefore, human intelligence remains the most important skill in dealing with technology.

Measuring and planning: Data culture can only be built up step by step. Companies should measure and evaluate data-driven behavior in order to assess progress. The stronger the data culture, the more omnipresent data-driven decision-making becomes.

Establishment of key roles: Companies should create special functions or roles for employees who link the data strategy with the corporate strategy and act as central multipliers to promote the data culture among employees.

The development of a strong data culture requires clear leadership, clear guidelines and the commitment of the entire organization. Managers play a crucial role in successfully shaping the change to a data-driven culture.

Building a strong data culture: our strategic approach

At statworx, we specialize in establishing robust data cultures in companies. Our strategy is based on proven frameworks, best practices and our extensive experience to lay the foundations for a successful data culture in your organization.

Data Culture Strategy: Working hand in hand with our clients’ teams, we develop the strategic roadmap required to foster a thriving data culture. This includes building the foundational structures that are essential to maximizing the potential of your company’s data.
Data culture training: We focus on empowering your workforce with the skills and knowledge to operate in the data and AI space. Our training programs aim to equip employees with the skills that are essential for building a strong data culture. This enables companies to realize the full potential of data and artificial intelligence.
Change management and support: Embedding a data culture requires sustained change management efforts. We work with client teams to establish long-term change programs aimed at initiating and consolidating a robust data culture within the organization. Our goal is to ensure that the transformation remains embedded in the DNA of the organization to ensure continued success.

With our comprehensive range of services, we strive to lead companies into a future where data becomes a strategic asset that unlocks new opportunities and enables informed decision-making at all levels. We have written about this in detail in our white paper “Data culture as a management task in companies” (currently only available in German), which also contains a data culture checklist. Our services relating to data culture can be found on our Data Culture topic page.

Tarik Ashry

At the beginning of December, the central EU institutions reached a provisional agreement in the so-called trilogue on a legislative proposal to regulate artificial intelligence. The final legislative text with all the details is now being drafted. As soon as this has been drawn up and reviewed, the law can be officially adopted. We have compiled the current state of knowledge on the AI Act.

As part of the ordinary legislative procedure of the European Union, a trilogue is an informal interinstitutional negotiation between representatives of the European Parliament, the Council of the European Union and the European Commission. The aim of a trilogue is to reach a provisional agreement on a legislative proposal that is acceptable to both the Parliament and the Council, the co-legislators. The provisional agreement must then be adopted by each of these bodies in formal procedures.

Legislation with a global impact

A special feature of the upcoming law is the so-called market location principle: according to this, companies worldwide that offer or operate artificial intelligence on the European market or whose AI-generated output is used within the EU will be affected by the AI Act.

Artificial intelligence is defined as machine-based systems that can autonomously make predictions, recommendations or decisions and thus influence the physical and virtual environment. This applies, for example, to AI solutions that support the recruitment process, predictive maintenance solutions and chatbots such as ChatGPT. The legal requirements that different AI systems must fulfill vary greatly depending on their classification into risk classes.

The risk class determines the legal requirements

The EU’s risk-based approach comprises a total of four risk classes:

low,
limited,
high,
and unacceptable risk.

These classes reflect the extent to which artificial intelligence jeopardizes European values and fundamental rights. As the term “unacceptable” for a risk class already indicates, not all AI systems are permissible. AI systems that belong to the “unacceptable risk” category are prohibited by the AI Act. The following applies to the other three risk classes: the higher the risk, the more extensive and stricter the legal requirements for the AI system.

We explain below which AI systems fall into which risk class and which requirements are associated with them. Our assessments are based on the information contained in the “AI Mandates” document dated June 2023. At the time of publication, this document was the most recently published, comprehensive document on the AI Act.

Ban on social scoring and biometric remote identification

Some AI systems have a significant potential to violate human rights and fundamental principles, which is why they are categorized as “unacceptable risk”. These include:

Real-time based remote biometric identification systems in publicly accessible spaces (exception: law enforcement agencies may use them to prosecute serious crimes but only with judicial authorization);
Biometric remote identification systems in retrospect (exception: law enforcement authorities may use them to prosecute serious crimes but only with judicial authorization);
Biometric categorization systems that use sensitive characteristics such as gender, ethnicity or religion;
Predictive policing based on so-called profiling – i.e. profiling based on skin color, suspected religious affiliation and similarly sensitive characteristics – geographical location or previous criminal behavior;
Emotion recognition systems for law enforcement, border control, the workplace and educational institutions;
Arbitrary extraction of biometric data from social media or video surveillance footage to create facial recognition databases;
Social scoring leading to disadvantage in social contexts;
AI that exploits the vulnerabilities of a particular group of people or uses unconscious techniques that can lead to behaviors that cause physical or psychological harm.

These AI systems are to be banned from the European market under the AI Act. Companies whose AI systems could fall into this risk class should urgently address the upcoming requirements and explore options for action. This is because a key result of the trilogue is that these systems will be banned just six months after official adoption.

Numerous requirements for AI with risks to health, safety and fundamental rights

The “high risk” category includes all AI systems that are not explicitly prohibited but nevertheless pose a high risk to health, safety or fundamental rights. The following areas of application and use are explicitly mentioned:

Biometric and biometric-based systems that do not fall into the “unacceptable risk” risk class;
Management and operation of critical infrastructure;
Education and training;
Access and entitlement to basic private and public services and benefits;
Employment, human resource management and access to self-employment;
Law enforcement;
Migration, asylum and border control;
Administration of justice and democratic processes

These AI systems are subject to comprehensive legal requirements that must be implemented prior to commissioning and observed throughout the entire AI life cycle:

Assessment to evaluate the effects on fundamental and human rights
Quality and risk management
Data governance structures
Quality requirements for training, test and validation data
Technical documentation and record-keeping obligations
Fulfillment of transparency and provision obligations
Human supervision, robustness, security and accuracy
Declaration of conformity incl. CE marking obligation
Registration in an EU-wide database

AI systems that are used in one of the above-mentioned areas but do not pose a risk to health, safety, the environment or fundamental rights are not subject to the legal requirements. However, this must be proven by informing the competent national authority about the AI system. The authority then has three months to assess the risks of the AI system. The AI can be put into operation within these three months. However, if the examining authority classifies it as high-risk AI, high fines may be imposed.

A special regulation also applies to AI products and AI safety components of products whose conformity is already being tested by third parties on the basis of EU legislation. This is the case for AI in toys, for example. In order to avoid overregulation and additional burdens, these will not be directly affected by the AI Act.

AI with limited risk must comply with transparency obligations

AI systems that interact directly with humans fall into the “limited risk” category. This includes emotion recognition systems, biometric categorization systems and AI-generated or modified content that resembles real people, objects, places or events and could be mistaken for real (“deepfakes”). For these systems, the draft law provides for the obligation to inform consumers about the use of artificial intelligence. This should make it easier for consumers to actively decide for or against their use. A code of conduct is also recommended.

No legal requirements for AI with low risk

Many AI systems, such as predictive maintenance or spam filters, fall into the “low risk” category. Companies that only offer or use such AI solutions will hardly be affected by the AI Act. This is because there are currently no legal requirements for such applications. Only a code of conduct is recommended.

Generative AI such as ChatGPT is regulated separately

Generative AI models and foundation models with a wide range of possible applications were not included in the original draft of the AI Act. The regulatory possibilities of such AI models have therefore been the subject of particularly intense debate since the launch of ChatGPT by OpenAI. According to the European Council’s press statement of December 9, these models are now to be regulated on the basis of their risk. In principle, all models must implement transparency requirements. Foundation models with a particular risk, so-called “high-impact foundation models”, will also have to meet additional requirements. How exactly the risk of AI models will be assessed is currently still open. Based on the latest document, the following possible requirements for high-impact foundation models can be estimated:

Quality and risk management
Data governance structures
Technical documentation
Fulfillment of transparency and information obligations
Ensuring performance, interpretability, correctability, security, cybersecurity
Compliance with environmental standards
Cooperation with downstream providers
Registration in an EU-wide database

Companies should prepare for the AI Act now

Even though the AI Act has not yet been officially adopted and we do not yet know the details of the legal text, companies should prepare for the transition phase now. In this phase, AI systems and associated processes must be designed to comply with the law. The first step is to assess the risk class of each individual AI system. If you are not yet sure which risk classes your AI systems fall into, we recommend our free AI Act Quick Check. It will help you to assess the risk class.

More information:

Lunch & Learn “Done Deal” (only available in German)
Lunch & Learn “Alles, was du über den AI Act Wissen musst” (only available in German)
Factsheet AI Act

Sources:

Press statement of the European Council: “Artificial intelligence act: Council and Parliament strike a deal on the first rules for AI in the world”
AI Mandates (June 2023)
“General approach” of the Council of the European Union: https://www.consilium.europa.eu/en/press/press-releases/2022/12/06/artificial-intelligence-act-council-calls-for-promoting-safe-ai-that-respects-fundamental-rights/
Legislative proposal (“AI Act”) of the European Commission: https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX%3A52021PC0206
Ethical guidelines for trustworthy AI: https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai

the byte: Our concept

We envisioned the byte to be an immersive experience, including AI in as many elements of the experience as possible. Everything, from the menu to the cocktails, music, branding, and art on the wall, was AI-generated. Bringing AI into all of these components also pushed me far beyond what I typically do, namely helping large companies with their data & AI challenges.

Branding

Before creating the menu, we developed the visual identity of our project. We decided on a “lo-fi” appeal, using a pixelated font in combination with AI-generated visuals of plates and dishes. Our key visual, a neon-lit white plate, was created using DALL-E 2 and was found across all of our marketing materials:

Location

We hosted the byte at Stanley, one of Frankfurt’s coolest restaurant event locations, which features approx. 60 seats and a fully-fledged bar inside the restaurant (ideal for our AI-generated cocktails). The atmosphere is rather dark and cozy, with dark marble walls, highlighted with white carpets on the table, and a big red window that lets you see the kitchen from outside.

The menu

The heart of our concept was a 5-course menu that we designed to elevate the classical Frankfurter cuisine with the multicultural and diverse influences of Frankfurt (anyone who knows the Frankfurter kitchen will know that this was not an easy task).

Using GPT-4 and some prompt engineering magic, we generated several menu candidates that were test-cooked by the experienced Stanley kitchen crew (thank you, guys, for this great work!) and then assembled into a final menu. Below, you can find our prompt to create the menu candidates:

“Create a 5-course menu that elevates the classical Frankfurter kitchen. The menu must be a fusion of classical Frankfurter cuisine combined with the multicultural influences of Frankfurt. Describe each course, its ingredients as well as a detailed description of each dish’s presentation.”

Surprisingly, only minor adjustments were necessary to the recipes, even though some AI creations were extremely adventurous! This was our final menu:

Handkäs’ Mousse with Pickled Beetroot on Roasted Sourdough Bread
Next Level Green Sauce (with Cilantro and Mint) topped with a Fried Panko Egg
Cream Soup from White Asparagus with Coconut Milk and Fried Curry Fish
Currywurst (Beef & Vegan) by Best Worscht in Town with Carrot-Ginger-Mash and Pine Nuts
Frankfurt Cheesecake with Äppler Jelly, Apple Foam and Oat-Pecanut-Crumble

My favorite was the “Next Level” Green Sauce, an oriental twist of the classical 7-herb Frankfurter Green Sauce topped with a fried panko egg. Yummy! Below you can see the menu out in the wild 🍲

AI Cocktails

Alongside the menu, we also prompted GPT to create recipes that twisted famous cocktail classics to match our Frankfurt fusion theme. The results:

Frankfurt Spritz (Frankfurter Äbbelwoi, Mint, Sparkling Water)
Frankfurt Mule (Variation of a Moscow Mule with Calvados)
The Main (Variation of a Swimming Pool Cocktail)

My favorite was the Frankfurt Spritz, as it was fresh, herbal, and delicate (see pic below):

AI Host: Ambrosia the Culinary AI

An important part of our concept was “Ambrosia”, an AI-generated host that guided the guests through the evening, explaining the concept and how the menu was created. We thought it was important to manifest the AI as something the guests can experience. We hired a professional screenwriter for the script and used murf.ai to create several text-to-speech assets that were played at the beginning of the dinner and in between courses.

Note: Ambrosia starts talking at 0:15.

AI Music

Music plays an important role for the vibe of an event. We decided to use mubert, a generative AI start-up that allowed us to create and stream AI music in different genres, such as “Minimal House” for a progressive vibe throughout the evening. After the main course, a DJ took over and accompanied our guests into the night 💃🍸

AI Art

Throughout the restaurant, we placed AI-generated art pieces by the local AI artist Vladimir Alexeev (a.k.a. “Merzmensch”), here are some examples:

AI Playground

As an interactive element for the guests, we created a small web app that takes the first name of a person and transforms it into a dish, including a reasoning why that name perfectly matches the dish 🙂 You can try it out here: Playground

Launch

The byte was officially announced at the S-O-U-P festival press conference in early May 2023. We also launched additional marketing activities through social media and our friends and family networks. As a result, the byte was fully booked for three days straight, and we got broad media coverage in various gastronomy magazines and the daily press. The guests were (mostly) amazed by our AI creations, and we received inquiries from other European restaurants and companies interested in exclusively booking the byte as an experience for their employees 🤩 Nailed it!

Closing and Next Steps

Creating the byte together with Jonathan and James was an outstanding experience. It further encouraged me that AI will transform not only our economy but all aspects of our daily lives. There is massive potential at the intersection of creativity, culture, and AI that is currently only beginning to be tapped.

We definitely want to continue the byte in Frankfurt and other cities in Germany and Europe. Moreover, James, Jonathan, and I are already thinking of new ways to bring AI into culture and society. Stay tuned! 😏

The byte was not just a restaurant; it was an immersive experience. We wanted to create something that had never been done before and did it – in just eight weeks. And that’s the inspiration I want to leave you with today:

Trying new things that move you out of your comfort zone is the ultimate source of growth. You never know what you’re capable of until you try. So, go out there and try something new, like building an AI-powered pop-up restaurant. Who knows, you might surprise yourself. Bon appétit!

Impressions

Media

FAZ: https://www.faz.net/aktuell/rhein-main/pop-up-resturant-the-byte-wenn-chatgpt-das-menue-schreibt-18906154.html

Genuss Magazin: https://www.genussmagazin-frankfurt.de/gastro_news/Kuechengefluester-26/Interview-James-Ardinast-KI-ist-die-Zukunft-40784.html

Frankfurt Tipp: https://www.frankfurt-tipp.de/ffm-aktuell/s/ugc/deutschlands-erstes-ai-restaurant-the-byte-in-frankfurt.html

Foodservice: https://www.food-service.de/maerkte/news/the-byte-erstes-ki-restaurant-vor-dem-start-55899?crefresh=1

Sebastian Heinz

The Hidden Risks of Black-Box Algorithms

Reading and evaluating countless resumes in the shortest possible time and making recommendations for suitable candidates – this is now possible with artificial intelligence in applicant management. This is because advanced AI technologies can efficiently analyze even large volumes of complex data. In HR management, this not only saves valuable time in the pre-selection process but also enables applicants to be contacted more quickly. Artificial intelligence also has the potential to make application processes fairer and more equitable.

However, real-world experience has shown that artificial intelligence is not always “fair”. A few years ago, for example, an Amazon recruiting algorithm stirred up controversy for discriminating against women when selecting candidates. Additionally, facial recognition algorithms have repeatedly led to incidents of discrimination against People of Color.

One reason for this is that complex AI algorithms independently calculate predictions and results based on the data fed into them. How exactly they arrive at a particular result is not initially comprehensible. This is why they are also known as black-box algorithms. In Amazon’s case, the AI determined suitable applicant profiles based on the current workforce, which was predominantly male, and thus made biased decisions. In a similar way, algorithms can reproduce stereotypes and reinforce discrimination.

Principles for Trustworthy AI

The Amazon incident shows that transparency is highly relevant in the development of AI solutions to ensure that they function ethically. This is why transparency is also one of the seven statworx Principles for trustworthy AI. The employees at statworx have collectively defined the following AI principles: Human-centered, transparent, ecological, respectful, fair, collaborative, and inclusive. These serve as orientations for everyday work with artificial intelligence. Universally applicable standards, rules, and laws do not yet exist. However, this could change in the near future.

The European Union (EU) has been discussing a draft law on the regulation of artificial intelligence for some time. Known as the AI Act, this draft has the potential to be a gamechanger for the global AI industry. This is because it is not only European companies that are targeted by this draft law. All companies that offer AI systems on the European market, whose AI-generated output is used within the EU, or that operate AI systems for internal use within the EU would be affected. The requirements that an AI system must meet depend on its application.

Recruiting algorithms are likely to be classified as high-risk AI. Accordingly, companies would have to fulfill comprehensive requirements during the development, publication, and operation of the AI solution. Among other things, companies are required to comply with data quality standards, prepare technical documentation, and establish risk management. Violations may result in heavy fines of up to 6% of global annual sales. Therefore, companies should already start dealing with the upcoming requirements and their AI algorithms. Explainable AI methods (XAI) can be a useful first step. With their help, black-box algorithms can be better understood, and the transparency of the AI solution can be increased.

Unlocking the Black Box with Explainable AI Methods

XAI methods enable developers to better interpret the concrete decision-making processes of algorithms. This means that it becomes more transparent how an algorithm has formed patterns and rules and makes decisions. As a result, potential problems such as discrimination in the application process can be discovered and corrected. Thus, XAI not only contributes to greater transparency of AI but also favors its ethical use and thus increases the conformity of an AI with the upcoming AI Act.

Some XAI methods are even model-agnostic, i.e. applicable to any AI algorithm from decision trees to neural networks. The field of research around XAI has grown strongly in recent years, which is why there is now a wide variety of methods. However, our experience shows that there are large differences between different methods in terms of the reliability and meaningfulness of their results. Furthermore, not all methods are equally suitable for robust application in practice and for gaining the trust of external stakeholders. Therefore, we have identified our top 3 methods based on the following criteria for this blog post:

Is the method model-agnostic, i.e. does it work for all types of AI models?
Does the method provide global results, i.e. does it say anything about the model as a whole?
How meaningful are the resulting explanations?
How good is the theoretical foundation of the method?
Can malicious actors manipulate the results or are they trustworthy?

Our Top 3 XAI Methods at a Glance

Using the above criteria, we selected three widely used and proven methods that are worth diving a bit deeper into: Permutation Feature Importance (PFI), SHAP Feature Importance, and Accumulated Local Effects (ALE). In the following, we explain how each of these methods works and what it is used for. We also discuss their advantages and disadvantages and illustrate their application using the example of a recruiting AI.

Efficiently Identify Influential Variables with Permutation Feature Importance

The goal of Permutation Feature Importance (PFI) is to find out which variables in the data set are particularly crucial for the model to make accurate predictions. In the case of the recruiting example, PFI analysis can shed light on what information the model relies on to make its decision. For example, if gender emerges as an influential factor here, it can alert the developers to potential bias in the model. In the same way, a PFI analysis creates transparency for external users and regulators. Two things are needed to compute PFI:

An accuracy metric such as the error rate (proportion of incorrect predictions out of all predictions).
A test data set that can be used to determine accuracy.

In the test data set, one variable after the other is concealed from the model by randomly shuffling (permuting) its values across observations, which turns that variable into random noise. Then, the accuracy of the model is determined over the transformed test data set. From there, we conclude that those variables whose concealment affects model accuracy the most are particularly important. Once all variables are analyzed and sorted, we obtain a visualization like Figure 1. Using our artificially generated sample data set, we can derive the following: Work experience did not play a major role in the model, but ratings from the interview were influential.

Figure 1 – Permutation Feature Importance using the example of a recruiting AI (data artificially generated).
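The procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation behind Figure 1: the toy model, the two column names and the helper names `error_rate` and `permutation_feature_importance` are assumptions made for this example. Each variable is concealed by shuffling its column, and the importance is the resulting increase in error rate.

```python
import random

def error_rate(model, X, y):
    """Proportion of incorrect predictions out of all predictions."""
    wrong = sum(1 for row, target in zip(X, y) if model(row) != target)
    return wrong / len(y)

def permutation_feature_importance(model, X, y, n_repeats=10, seed=0):
    """Per column: mean increase in error rate after shuffling that column."""
    rng = random.Random(seed)
    baseline = error_rate(model, X, y)
    importances = []
    for col in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Conceal one variable by shuffling its values across observations.
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:col] + [value] + row[col + 1:]
                      for row, value in zip(X, shuffled)]
            increases.append(error_rate(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical toy data: columns are [interview_rating, work_experience_years].
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 10), data_rng.uniform(0, 20)] for _ in range(200)]

def toy_model(row):
    # Stand-in for a black-box model: it only looks at the interview rating.
    return 1 if row[0] > 5 else 0

y = [toy_model(row) for row in X]
pfi = permutation_feature_importance(toy_model, X, y)
print(dict(zip(["interview_rating", "work_experience_years"], pfi)))
```

In this toy setup the model ignores work experience, so shuffling that column cannot change any prediction and its importance stays at zero, while shuffling the interview rating raises the error rate noticeably.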

A great strength of PFI is that it follows a clear mathematical logic. The correctness of its explanation can be proven by statistical considerations. Furthermore, there are hardly any manipulable parameters in the algorithm with which the results could be deliberately distorted. This makes PFI particularly suitable for gaining the trust of external observers. Finally, the computation of PFI is very resource efficient compared to other XAI methods.

One weakness of PFI is that it can provide misleading explanations under some circumstances. If a variable is assigned a low PFI value, it does not always mean that the variable is unimportant to the issue. For example, if the bachelor’s degree grade has a low PFI value, this may simply be because the model can look at the master’s degree grade instead, since the two are usually similar. Such correlated variables can complicate the interpretation of the results. Nonetheless, PFI is an efficient and useful method for creating transparency in black-box models.

Strengths	Weaknesses
Little room for malicious manipulation of results	Does not consider interactions between variables
Efficient computation

Uncover Complex Relationships with SHAP Feature Importance

SHAP Feature Importance is a method for explaining black box models based on game theory. The goal is to quantify the contribution of each variable to the prediction of the model. As such, it closely resembles Permutation Feature Importance at first glance. However, unlike PFI, SHAP Feature Importance provides results that can account for complex relationships between multiple variables.

SHAP is based on a concept from game theory: Shapley values. Shapley values are a fairness criterion that assigns a weight to each variable that corresponds to its contribution to the outcome. This is analogous to a team sport, where the winning prize is divided fairly among all players, according to their contribution to the victory. With SHAP, we can look at every individual observation in the data set and analyze what contribution each variable has made to the prediction of the model.

If we now determine the average absolute contribution of a variable across all observations in the data set, we obtain the SHAP Feature Importance. Figure 2 illustrates the results of this analysis. The similarity to the PFI is evident, even though the SHAP Feature Importance only places the rating of the job interview in second place.

Figure 2 – SHAP Feature Importance using the example of a recruiting AI (data artificially generated).
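To make the idea concrete, the following sketch computes exact Shapley values by enumerating all coalitions of variables and then averages their absolute values across observations. It is an illustration under stated assumptions, not the method used for Figure 2: the toy scoring model, the three column names and all function names are hypothetical, and the brute-force enumeration is only feasible for a handful of variables (practical SHAP implementations rely on approximations or model-specific shortcuts).

```python
from itertools import combinations
from math import factorial

def coalition_value(model, x, background, coalition):
    """Mean prediction when variables in `coalition` are taken from x
    and all other variables are taken from the background rows."""
    total = 0.0
    for b in background:
        mixed = [x[j] if j in coalition else b[j] for j in range(len(x))]
        total += model(mixed)
    return total / len(background)

def shapley_values(model, x, background):
    """Exact Shapley value of every variable for one observation x."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = (coalition_value(model, x, background, s | {i})
                        - coalition_value(model, x, background, s))
                phi[i] += weight * gain
    return phi

def shap_feature_importance(model, X, background):
    """Average absolute Shapley value per variable across all observations."""
    totals = [0.0] * len(X[0])
    for x in X:
        for i, value in enumerate(shapley_values(model, x, background)):
            totals[i] += abs(value)
    return [t / len(X) for t in totals]

# Hypothetical toy data: [interview_rating, degree_grade, unused_variable].
X = [[7.0, 1.3, 0.0], [4.0, 2.7, 1.0], [9.0, 2.0, 1.0],
     [5.0, 1.0, 0.0], [2.0, 3.3, 1.0]]

def toy_model(row):
    # Stand-in score with an interaction between the first two variables;
    # the third variable is ignored on purpose.
    return 2.0 * row[0] + row[0] * row[1]

phi = shapley_values(toy_model, X[0], X)
importance = shap_feature_importance(toy_model, X, X)
print(phi, importance)
```

Two properties are worth checking on such a toy example: the Shapley values of one observation add up to the difference between its prediction and the average prediction, and a variable the model never uses receives a contribution of zero.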

A major advantage of this approach is the ability to account for interactions between variables. By simulating different combinations of variables, it is possible to show how the prediction changes when two or more variables vary together. For example, the final grade of a university degree should always be considered in the context of the field of study and the university. In contrast to the PFI, the SHAP Feature Importance takes this into account. Also, Shapley Values, once calculated, are the basis of a wide range of other useful XAI methods.

However, one weakness of the method is that it is more computationally expensive than PFI. Efficient implementations are available only for certain types of AI algorithms like decision trees or random forests. Therefore, it is important to carefully consider whether a given problem requires a SHAP Feature Importance analysis or whether PFI is sufficient.

Strengths	Weaknesses
Little room for malicious manipulation of results	Calculation is computationally expensive
Considers complex interactions between variables

Focus in on Specific Variables with Accumulated Local Effects

Accumulated Local Effects (ALE) is a further development of the commonly used Partial Dependence Plots (PDP). Both methods aim at simulating the influence of a certain variable on the prediction of the model. This can be used to answer questions such as “Does the chance of getting a management position increase with work experience?” or “Does it make a difference if I have a 1.9 or a 2.0 on my degree certificate?”. Therefore, unlike the previous two methods, ALE makes a statement about the model’s decision-making, not about the relevance of certain variables.

In the simplest case, the PDP, a sample of observations is selected and used to simulate what effect, for example, an isolated increase in work experience would have on the model prediction. Isolated means that none of the other variables are changed in the process. The average of these individual effects over the entire sample can then be visualized (Figure 3, above). Unfortunately, PDP’s results are not particularly meaningful when variables are correlated. For example, let us look at university degree grades. PDP simulates all possible combinations of grades in bachelor’s and master’s programs. This results in cases that rarely occur in the real world, e.g., an excellent bachelor’s degree and a terrible master’s degree. The PDP has no sense of unrealistic cases, and the results may suffer accordingly.
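
The PDP procedure described above can be sketched in a few lines. This is a simplified illustration, not a production implementation. The toy model and the three applicants (columns: bachelor’s grade, master’s grade, with 1.0 as the best grade) are assumptions.

```python
def partial_dependence(predict, data, feature, grid):
    """Average prediction when `feature` is forced to each grid value for every row."""
    curve = []
    for value in grid:
        total = 0.0
        for row in data:
            modified = list(row)
            modified[feature] = value  # isolated change: all other variables stay as they are
            total += predict(modified)
        curve.append(total / len(data))
    return curve


# Hypothetical scoring model: lower (better) grades lead to a higher score.
def toy_model(row):
    bachelor, master = row
    return 10.0 - 2.0 * bachelor - 1.0 * master


data = [[1.0, 1.3], [2.0, 2.3], [3.0, 2.7]]
curve = partial_dependence(toy_model, data, feature=0, grid=[1.0, 2.0, 3.0])
```

Note that the loop also evaluates combinations such as a bachelor’s grade of 1.0 with a master’s grade of 2.7, which do not occur in the sample. This is exactly the weakness described above.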

ALE analysis, on the other hand, attempts to solve this problem by using a more realistic simulation that adequately represents the relationships between variables. Here, the variable under consideration, e.g., bachelor’s grade, is divided into several sections (e.g., 6.0-5.1, 5.0-4.1, 4.0-3.1, 3.0-2.1, and 2.0-1.0). Now, the simulation of the bachelor’s grade increase is performed only for individuals in the respective grade group. This prevents unrealistic combinations from being included in the analysis. An example of an ALE plot can be found in Figure 3 (below). Here, we can see that ALE identifies a negative impact of work experience on the chance of employment, which PDP was unable to find. Is this behavior of the AI desirable? For example, does the company want to hire young talent in particular? Or is there perhaps an unwanted age bias behind it? In both cases, the ALE plot helps to create transparency and to identify undesirable behavior.
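
A minimal sketch of this binning logic could look as follows. It computes uncentered first-order ALE values only; the centering step that ALE implementations usually apply is omitted for brevity. The toy model, the data, and the bin edges are assumptions for illustration.

```python
def accumulated_local_effects(predict, data, feature, edges):
    """Uncentered first-order ALE for one variable, evaluated at each upper bin edge."""
    ale, running = [], 0.0
    for lower, upper in zip(edges[:-1], edges[1:]):
        # Only rows whose actual value lies in this bin are used.
        # The first bin additionally includes its lower edge.
        in_bin = [
            row for row in data
            if lower < row[feature] <= upper
            or (lower == edges[0] and row[feature] == lower)
        ]
        if in_bin:
            diffs = []
            for row in in_bin:
                at_upper, at_lower = list(row), list(row)
                at_upper[feature], at_lower[feature] = upper, lower
                diffs.append(predict(at_upper) - predict(at_lower))
            running += sum(diffs) / len(diffs)  # local effect within this bin
        ale.append(running)  # accumulate the local effects across bins
    return ale


# Hypothetical model in which bachelor's and master's grades interact.
def toy_model(row):
    bachelor, master = row
    return 10.0 - bachelor * master


data = [[1.0, 1.3], [2.0, 2.3], [3.0, 2.7]]
ale = accumulated_local_effects(toy_model, data, feature=0, edges=[1.0, 2.0, 3.0])
```

Because each bin only uses the applicants who actually fall into it, the master’s grades entering the calculation always belong to realistic grade combinations.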

Figure 3 – Partial Dependence Plot and Accumulated Local Effects using the example of a recruiting AI (data artificially generated).

In summary, ALE is a suitable method to gain insight into the influence of a certain variable on the model prediction. This creates transparency for users and even helps to identify and fix unwanted effects and biases. A disadvantage of the method is that a single ALE plot can only meaningfully show one or two variables at a time. Thus, to understand the influence of all variables, multiple ALE plots must be generated, which makes the analysis less compact than PFI or SHAP Feature Importance.

Strengths	Weaknesses
Considers complex interactions between variables	Only one or two variables can be analyzed in one ALE plot
Little room for malicious manipulation of results

Build Trust with Explainable AI Methods

In this post, we presented three Explainable AI methods that can help make algorithms more transparent and interpretable. This also helps to meet the requirements of the upcoming AI Act. Even though it has not yet been passed, we recommend starting to create transparency and traceability for AI models based on the draft law as soon as possible. Many Data Scientists have little experience in this field and need further training and time to familiarize themselves with XAI concepts before they can identify relevant algorithms and implement effective solutions. Therefore, it makes sense to get to know our recommended methods early on.

With Permutation Feature Importance (PFI) and SHAP Feature Importance, we demonstrated two techniques to determine the relevance of certain variables to the prediction of the model. In summary, SHAP Feature Importance is a powerful method for explaining black-box models that considers the interactions between variables. PFI, on the other hand, is easier to implement but less powerful for correlated data. Which method is most appropriate in a particular case depends on the specific requirements.

We also introduced Accumulated Local Effects (ALE), a technique that can analyze and visualize exactly how an AI responds to changes in a specific variable. The combination of one of the two feature importance methods with ALE plots for selected variables is particularly promising. This can provide a theoretically sound and easily interpretable overview of the model – whether it is a decision tree or a deep neural network.

The application of Explainable AI is a worthwhile investment, and not only for building internal and external trust in one’s own AI solutions. We also expect that the skillful use of interpretation-enhancing methods can help avoid impending fines due to the requirements of the AI Act, prevent legal consequences, and protect those affected from harm – as in the case of incomprehensible recruiting software.
Our free AI Act Quick Check helps you assess whether any of your AI systems could be affected by the AI Act: https://www.statworx.com/en/ai-act-tool/

Sources & Further Information:

https://www.faz.net/aktuell/karriere-hochschule/buero-co/ki-im-bewerbungsprozess-und-raus-bist-du-17471117.html (last opened 03.05.2023)
https://t3n.de/news/diskriminierung-deshalb-platzte-amazons-traum-vom-ki-gestuetzten-recruiting-1117076/ (last opened 03.05.2023)
For more information on the AI Act: https://www.statworx.com/en/content-hub/blog/how-the-ai-act-will-change-the-ai-industry-everything-you-need-to-know-about-it-now/
Statworx principles: https://www.statworx.com/en/content-hub/blog/statworx-ai-principles-why-we-started-developing-our-own-ai-guidelines/
Christoph Molnar: Interpretable Machine Learning: https://christophm.github.io/interpretable-ml-book/

Max Hilsdorf, Julia Rettig

Image Sources
AdobeStock 566672394 – by TheYaksha

Last December, the European Council published a dossier outlining the Council’s preliminary position on the draft law known as the AI Act. This new law is intended to regulate artificial intelligence (AI) and thus becomes a game-changer for the entire tech industry. In the following, we have compiled the most important information from the dossier, which is the current official source on the planned AI Act at the time of publication.

A legal framework for AI

Artificial intelligence has enormous potential to improve and ease all our lives. For example, AI algorithms already support early cancer detection or translate sign language in real time, thereby eliminating language barriers. But in addition to the positive effects, there are risks, as the latest deep fakes of Pope Francis or the Cambridge Analytica scandal illustrate.

The European Union (EU) is currently drafting legislation to regulate artificial intelligence and mitigate its risks. With this, the EU wants to protect consumers and ensure the ethically acceptable use of artificial intelligence. The so-called “AI Act” is still in the legislative process but is expected to be passed in 2023 – before the end of the current legislative period. Companies will then have two years to implement the legally binding requirements. Violations will be punished with fines of up to 6% of global annual turnover or €30,000,000 – whichever is higher. Therefore, companies should start addressing the upcoming legal requirements now.
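
As a quick worked example of the “whichever is higher” rule, the upper limit of a fine can be computed as follows. The figures are those of the draft described above and may change in the final law; the function name is our own.

```python
def max_fine_eur(global_annual_turnover_eur: int) -> float:
    """Upper limit of a fine under the draft: 6% of turnover or EUR 30 million, whichever is higher."""
    six_percent = global_annual_turnover_eur * 6 / 100
    return max(six_percent, 30_000_000.0)


small_company = max_fine_eur(100_000_000)    # 6% would be EUR 6 million, so the fixed amount applies
large_company = max_fine_eur(1_000_000_000)  # 6% is EUR 60 million, which exceeds the fixed amount
```

The two limits are equal at a turnover of €500 million. Below that, the fixed amount of €30 million is the relevant upper limit.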

Legislation with global impact

The planned AI Act is based on the “location principle,” meaning that not only European companies will be affected by the new law. All companies that offer AI systems on the European market or operate them for internal use within the EU are affected by the AI Act – with a few exceptions. Private use of AI remains untouched by the regulation so far.

Which AI systems are affected?

The definition of AI determines which systems will be affected by the AI Act. For this reason, the AI definition of the AI Act has been the subject of controversial debate in politics, business, and society for a considerable time. The initial definition was so broad that many “normal” software systems would also have been affected. The current proposal defines AI as any system developed through machine learning or logic- and knowledge-based approaches. It remains to be seen whether this definition will ultimately be adopted.

7 Principles for trustworthy AI

The “seven principles for trustworthy AI” are the most important basis of the AI Act. A group of experts from research, the digital economy, and associations developed them on behalf of the European Commission. They include not only technical aspects but also social and ethical factors that can be used to classify the trustworthiness of an AI system:

Human action & oversight: decision-making should be supported without undermining human autonomy.
Technical robustness & security: accuracy, reliability, and security must be preemptively ensured.
Data privacy & data governance: handling of data must be legally secure and protected.
Transparency: interaction with AI must be clearly communicated, as must its limitations and boundaries.
Diversity, non-discrimination & fairness: avoidance of unfair bias must be ensured throughout the entire AI lifecycle.
Environmental & societal well-being: AI solutions should have as positive an impact on the environment and society as possible.
Accountability: responsibilities for the development, use, and maintenance of AI systems must be defined.

Based on these principles, the AI Act’s risk-based approach was developed, allowing AI systems to be classified into one of four risk classes: low, limited, high, and unacceptable risk.

Four risk classes for trustworthy AI

The risk class of an AI system indicates the extent to which an AI system threatens the principles of trustworthy AI and which legal requirements the system must fulfill – provided the system is fundamentally permissible. This is because, in the future, not all AI systems will be allowed on the European market. For example, most “social scoring” techniques are assessed as “unacceptable” and will not be allowed by the new law.

For the other three risk classes, the rule of thumb is that the higher the risk of an AI system, the higher the legal requirements for it. Companies that offer or operate high-risk systems will have to meet the most requirements. For example, AI used to operate critical (digital) infrastructure or used in medical devices is considered high-risk. To bring such systems to market, companies will have to observe high quality standards for the data used, set up a risk management system, affix a CE mark, and more.

AI systems in the “limited risk” class are subject to information and transparency obligations. Accordingly, companies must inform users of chatbots, emotion recognition systems, or deep fakes about the use of artificial intelligence. Predictive maintenance or spam filters are two examples of AI systems that fall into the lowest-risk category “low risk”. Companies that exclusively offer or use such AI solutions will hardly be affected by the upcoming AI Act. There are no legal requirements for these applications yet.
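
The four risk classes and the examples mentioned above can be summarized in a small lookup table. This is only an illustrative sketch: the labels are our own shorthand, the consequences are paraphrased from this article, and an actual classification requires a legal assessment of the concrete system.

```python
from enum import Enum


class RiskClass(Enum):
    LOW = "low"
    LIMITED = "limited"
    HIGH = "high"
    UNACCEPTABLE = "unacceptable"


# Example use cases taken from the text; the keys are illustrative labels only.
EXAMPLE_USE_CASES = {
    "social scoring": RiskClass.UNACCEPTABLE,  # the text says "most" such techniques
    "critical infrastructure": RiskClass.HIGH,
    "medical device": RiskClass.HIGH,
    "chatbot": RiskClass.LIMITED,
    "emotion recognition": RiskClass.LIMITED,
    "deep fake": RiskClass.LIMITED,
    "predictive maintenance": RiskClass.LOW,
    "spam filter": RiskClass.LOW,
}

# Consequences per risk class, paraphrased from the text.
CONSEQUENCES = {
    RiskClass.LOW: "no legal requirements so far",
    RiskClass.LIMITED: "information and transparency obligations",
    RiskClass.HIGH: "data quality standards, risk management, CE mark, and more",
    RiskClass.UNACCEPTABLE: "not allowed on the European market",
}


def describe(use_case: str) -> str:
    """Return a one-line summary for one of the example use cases."""
    risk = EXAMPLE_USE_CASES[use_case]
    return f"{use_case}: {risk.value} risk ({CONSEQUENCES[risk]})"
```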

What companies can do for now

Even though the AI Act is still in the legislative process, companies should act now. The first step is to clarify how they will be affected by the AI Act. To help you do this, we have developed the AI Act Quick Check. With this free tool, AI systems can be quickly assigned to a risk class, and requirements for the system can be derived. Finally, it can be used as a basis to estimate how extensive the implementation of the AI Act will be in your own company and to take initial measures. Of course, we are also happy to support you in evaluating and solving company-specific challenges related to the AI Act. Please do not hesitate to contact us!

Links & Sources:

“General approach” of the Council of the European Union: https://www.consilium.europa.eu/en/press/press-releases/2022/12/06/artificial-intelligence-act-council-calls-for-promoting-safe-ai-that-respects-fundamental-rights/
Proposal (AI Act) of the European Commission: https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX%3A52021PC0206
Ethics guidelines for trustworthy AI: https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai
Current status of the legislative process: https://eur-lex.europa.eu/procedure/EN/2021_106?qid=1657016300941&sortOrder=des
More information: https://digital-strategy.ec.europa.eu/en/policies/european-approach-artificial-intelligence

Julia Rettig

Experimenting with image classification through the gender lens

In the first part of our series we discussed a simple question: How would our looks change if we were to move images of us across the gender spectrum? Those experiments led us to the idea of creating gender-neutral face images from existing photos. Is there a “mid-point” where we perceive ourselves as gender-neutral? And – importantly – at what point would an AI perceive a face as such?

Becoming more aware of technology we use daily

Image classification is an important topic. Technology advances daily and is employed in a myriad of applications – often without the user being aware of how the technology works. A current example is the Bold Glamour filter on TikTok. When applied to female-looking faces, facial features and the amount of makeup change drastically. In contrast, male-looking faces change much less. This difference suggests that the data used to develop the AI behind the filter was unbalanced. The technology behind it is most likely based on GANs, like the one we explore in this article.

As a society of conscious citizens, all of us should have a grasp of the technology that makes this possible. To help establish more awareness we explore face image generation and classification through a gender lens. Rather than explore several steps along the spectrum, this time our aim is to generate gender-neutral versions of faces.

How to generate gender-neutral faces using StyleGAN

Utilizing a deep learning-based classifier for gender identification

Determining the point at which a face’s gender is considered neutral is anything but trivial. After relying on our own (of course not bias-free) interpretation of gender in faces, we quickly realized that we needed a more consistent and less subjective solution. As AI specialists, we immediately thought of data-driven approaches. One such approach can be implemented using a deep learning-based image classifier.

These classifiers are usually trained on large datasets of labelled images to distinguish between given categories. In the case of face classification, categories like gender (usually only female and male) and ethnicity are commonly found in classifier implementations. In practice, such classifiers are often criticized for their potential for misuse and biases. Before discussing examples of those problems, we will first focus on our less critical application scenario. For our use-case, face classifiers allow us to fully automate the creation of gender-neutral face images. To achieve this, we can implement a solution in the following way:
We use a GAN-based approach to generate face images that look like a given input image and then use the latent directions of the GAN to move the image towards a more female or male appearance. You can find a detailed exploration of this process in the first part of our series. Building on this approach, we want to focus on using a binary gender classifier to fully automate the search for a gender-neutral appearance.

For that we use the classifier developed by Furkan Gulsen to guess the gender of the GAN-generated version of our input image. The classifier outputs a value between zero and one to represent the likelihood of the image depicting a female or male face respectively. This value tells us in which direction (more male or more female) to move to approach a more gender-neutral version of the image. After taking a small step in the identified direction we repeat the process until we get to a point at which the classifier can no longer confidently identify the face’s gender but deems both male and female genders equally likely.
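
The iterative search described above can be sketched as a simple loop. The sketch assumes that a classifier output of 1 means “male” and 0 means “female”, and it replaces the actual GAN and classifier with a hypothetical stand-in function `classify(offset)` that maps a position along the gender direction to a score. Step size, tolerance, and all names are illustrative assumptions, not our actual code.

```python
import math


def find_neutral_offset(classify, step=0.05, tolerance=0.02, max_steps=200):
    """Walk along one latent direction until the classifier score is close to 0.5."""
    offset = 0.0
    for _ in range(max_steps):
        p_male = classify(offset)
        if abs(p_male - 0.5) <= tolerance:
            break  # classifier deems both genders (almost) equally likely
        # Move towards "female" if the face is rated male, and vice versa.
        offset += -step if p_male > 0.5 else step
    return offset


# Hypothetical stand-in for "generate the image at this offset, then classify it".
def toy_classifier(offset):
    return 1.0 / (1.0 + math.exp(-3.0 * (offset + 1.0)))


neutral_offset = find_neutral_offset(toy_classifier)
```

With a step size that is too large, such a loop can jump back and forth across the neutral point without ever meeting the tolerance, which is why the number of steps is capped.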
Below you will find a set of image pairs that represent our results. On the left, the original image is shown. On the right, we see the gender-neutral version of our input image, which the classifier interpreted as equally likely to be male or female. We tried to repeat the experiment for members of different ethnicities and age groups.

Results: original input and AI-generated gender-neutral output

Are you curious how the code works or what you would look like? You can try out the code we used to generate these image pairs by going to this link. Just press on each play button one by one and wait until you see the green checkmark.

Image processing note: We used an existing GAN, image encoder, and face classifier to generate gender-neutral output. A detailed exploration of this process can be found here.

Perceived gender-neutrality seems to be a result of mixed facial features

Above, we see the original portraits of people on the left and their gender-neutral counterparts – created by us – on the right. Subjectively, some feel more “neutral” than others. In several of the pictures, particularly stereotypical gender markers remain, such as makeup for the women and a square jawline for the men. The outputs we find most convincing are images 2 and 4. Not only do these images feel more difficult to “trace back” to the original person, but it is also much harder to decide whether they look more male or female. One could argue that the gender-neutral faces are a balanced, toned-down mix of male and female facial features. For example, when focusing on the gender-neutral version of image 2, the eye and mouth area seems more female, while the jawline and face shape seem more male. In the gender-neutral version of image 3, the face alone may look quite neutral, but the short hair distracts from this, shifting the overall impression towards male.

Training sets for image generation have been heavily criticized for not being representative of the existing population, especially regarding the underrepresentation of different ethnicities and genders. Despite “cherry-picking” and a limited range of examples, we feel that our approach did not produce worse results for women or non-white people in the examples above.

Societal implications of such models

When talking about the topic of gender perception, we should not forget that people may feel they belong to a gender different from their biological sex. In this article, we use gender classification models and interpret the results. However, our judgements will likely differ from other people’s perceptions. This is an essential consideration in the implementation of such image classification models and one we must discuss as a society.

How can technology treat everybody equally?

A study by the Guardian found that images of females portrayed in the same situations as males are more likely to be considered racy by AI classification services offered by Microsoft, Google, and AWS. While the results of the investigation are shocking, they come as no surprise. For a classification algorithm to learn what constitutes sexually explicit content, a training set of image-label pairs must be created. Human labellers perform this task. They are influenced by their own societal bias, for example more quickly associating depictions of women with sexuality. Moreover, criteria such as “raciness” are hard to quantify, let alone define.

While these models may not explicitly be trained to discriminate between genders, there is little doubt that they propagate undesirable biases against women originating from their training data. Similarly, societal biases that affect men can be passed on to AI models, too, resulting in discrimination against males. When applied to millions of online images of people, the issue of gender disparity is amplified.

Use in criminal law enforcement poses issues

Another scenario of misuse of image classification technology exists in the realm of law enforcement. Misclassification is problematic and, as an article by The Independent shows, prevalent. When Amazon’s Rekognition software was used at the default 80% confidence level in a 2018 study, the software falsely matched 105 out of 1959 participants with mugshots of criminals. Seeing the issues with the treatment of images depicting males and females above, one could imagine a disheartening scenario when judging the actions of females in the public space. If men and women are judged differently for performing the same actions or being in the same positions, it would impact everybody’s right to equal treatment before the law. Bayerischer Rundfunk, a German media outlet, published an interactive page (only in German) where AI classification services’ differing classifications can be compared to one’s own assessment.

Using gender-neutral images to circumvent human bias

Besides the positive societal potential of image classification, we also want to address some possible practical applications arising from being able to cover more than just two genders. One application that came to mind is the use of “genderless” images to prevent human bias. Such a filter would mean losing individuality, so it would only be applicable in contexts where the benefit of reducing bias outweighs the cost of that loss.

Imagining a browser extension for the hiring process

HR screening could be an area where gender-neutral images may lead to less gender-based discrimination. Gone are the times of faceless job applications: if your LinkedIn profile has a profile picture it is 14 times more likely to get viewed. When examining candidate profiles, recruiters should ideally be free of subconscious, unintentional gender bias. Human nature prevents this. One could thus imagine a browser extension that generates a gender-neutral version of profile photos on professional social networking sites like LinkedIn or Xing. This could lead to more parity and neutrality in the hiring process, where only skills and character should count, and not one’s gender – or one’s looks for that matter (pretty privilege).

Conclusion

We set out to automatically generate gender-neutral versions from any input face image.

Our implementation indeed automates the creation of gender-neutral faces. We used an existing GAN, image encoder and face image classifier. Our experiments with real people’s portraits show that the approach works well in many cases and produces realistic-looking face images that clearly resemble the input image while remaining gender-neutral.

In some cases, we still found that the supposedly neutral images contain artifacts from technical glitches or still have a recognizable gender. Those limitations likely arise from the nature of the GAN’s latent space or the lack of artificially generated images in the classifier’s training data. We are confident that further work can resolve most of those issues for real-world applications.

Society’s ability to have an informed discussion on advances in AI is crucial

Image classification has far-reaching consequences that should be evaluated and discussed by society, not just a few experts. Any image classification service that is used to sort people into categories should be examined closely. What must be avoided is that members of society come to harm. Establishing responsible use of such systems, governance and constant evaluation are essential. An additional solution could be to use Explainable AI best practices to lay out why certain decisions were made. As a company in the field of AI, we at statworx look to our AI principles as a guide.

Image Sources:

AdobeStock 210526825 – Wayhome Studio
AdobeStock 243124072 – Damir Khabirov
AdobeStock 387860637 – insta_photos
AdobeStock 395297652 – Nattakorn
AdobeStock 480057743 – Chris
AdobeStock 573362719 – Xavier Lorenzo

AdobeStock 546222209 – Rrose Selavy

Isabel Hermes, Alexander Müller

In a fast-paced and data-driven world, the management of information and knowledge is essential. Businesses in particular rely on making knowledge accessible internally as quickly, clearly, and concisely as possible. Knowledge management is the process of creating, extracting, and utilizing knowledge to improve business performance. It includes methods that help organizations identify and extract knowledge, distribute it, and use it to better achieve their goals. However, this can be a complex and challenging task, especially in large companies.

Natural Language Processing (NLP) promises to provide a solution. This technology has the potential to revolutionize the knowledge strategy of companies. NLP is a branch of artificial intelligence that deals with the interaction between computers and human language. By using NLP, companies can gain insights from large amounts of unstructured text data and convert them into actionable knowledge.

In this blog post, we examine how NLP can improve knowledge management and how companies can use NLP to perform complex processes quickly, safely, and automatically. We explore the benefits of using NLP in knowledge management, the various NLP techniques used, and how companies can use NLP to achieve their goals better with artificial intelligence.

Case Study for effective knowledge management

Using the example of email correspondence in a construction project, we illustrate the application and added value of natural language processing. We use two emails as specific examples that were exchanged during the construction project: an order confirmation for ordered items and a complaint about their quality.

For a new building, the builder requested quotes for products, including thermal insulation, from a variety of suppliers. Eventually, the insulation was ordered from one of them. In an email, the supplier clarifies the ordered items, their properties and costs, and confirms the delivery on a specified date. Later, the builder discovers that the quality of the delivered products does not meet the expected standards. The builder informs the supplier of this in a written complaint, also via email. The text of these emails contains a wealth of information that can be extracted and processed using NLP methods to improve understanding. Due to the large number of different offers and interactions, manual processing is very time-consuming, which is where programmatic evaluation of the communication can help.

Next, we introduce a knowledge management pipeline that processes these two emails step by step and provides users with the maximum benefit through text processing.

Summary (Task: Summarization)

In the first step, the content of each text can be summarized and brought to the point in a few sentences. This reduces the text to important information and knowledge, removes irrelevant information such as platitudes and repetitions, and greatly reduces the amount of text to be read.

Especially with long emails, the added value of summary alone is enormous: listing the important content as bullet points saves time, prevents misunderstandings, and avoids overlooking important details.

General summaries are already helpful, but with the latest language models, NLP can do much more. In a general summary, the text length is reduced as much as possible while maintaining the same information density. Large language models can not only produce a general summary but also customize this process to specific needs of employees. For example, facts can be highlighted, or technical jargon can be simplified. In particular, summaries can be performed for a specific audience, such as a specific department within the company.

Different departments and roles require different types of information. This is why summaries are particularly useful when tailored to the interests of a specific department or role. For example, the two emails in our case study contain information that is relevant to the legal, operations, or finance department in different ways. Therefore, the next step is to create a separate summary for each department.
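
One way to obtain department-specific summaries is to vary the instruction given to a large language model. The following sketch only builds the prompt text and deliberately leaves out the call to a concrete model. The focus topics per department and the wording of the prompt are assumptions for illustration.

```python
# Hypothetical focus topics per department (not taken from a real project).
DEPARTMENT_FOCUS = {
    "legal": "contractual obligations, warranty claims, and deadlines",
    "operations": "delivery dates, product specifications, and quality issues",
    "finance": "prices, payment terms, and potential refunds",
}


def build_summary_prompt(email_text: str, department: str) -> str:
    """Build an instruction for a language model to summarize an email for one department."""
    focus = DEPARTMENT_FOCUS[department]
    return (
        f"Summarize the following email for the {department} department "
        f"in at most three bullet points. Focus on {focus}.\n\n"
        f"Email:\n{email_text}"
    )


complaint = "The delivered insulation does not meet the agreed quality standards."
prompts = {dept: build_summary_prompt(complaint, dept) for dept in DEPARTMENT_FOCUS}
```

The same email thus results in three different prompts, and the model’s answers can be routed to the respective teams.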

This makes it even easier for users to identify and understand the information that is relevant to them, while also drawing the right conclusions for their work.

Generative NLP models not only allow texts to be condensed to the essential, but also provide explanations for ambiguities and details. An example of this is the explanation of a regulation mentioned only by an acronym in the confirmation of an order, whose details the user may not be familiar with. This eliminates the need for a tedious online search for a suitable explanation.

Knowledge Extraction (Task: NER, Sentiment Analysis, Classification)

The next step is to systematically categorize the emails and their contents. This allows incoming emails to be automatically assigned to the correct mailboxes, annotated with metadata, and collected in a structured way.

For example, emails received on a customer service account can be automatically classified into defined categories (complaints, inquiries, suggestions, etc.). This eliminates the manual categorization of emails, which reduces the likelihood of incorrect categorizations and ensures more robust processes.
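
To illustrate the principle, here is a deliberately simple keyword-based classifier. It is a stand-in for a trained text classification model, which would be used in practice. The categories and keyword lists are assumptions.

```python
import re

# Hypothetical keyword lists; a trained model would learn such signals from labelled emails.
CATEGORY_KEYWORDS = {
    "complaint": {"complaint", "defective", "damaged", "unacceptable", "quality"},
    "order_confirmation": {"confirm", "order", "delivery", "invoice"},
    "inquiry": {"quote", "request", "question", "available"},
}


def classify_email(text: str) -> str:
    """Assign the category whose keywords occur most often; 'other' if none match."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    scores = {cat: len(tokens & words) for cat, words in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"


label_1 = classify_email("The delivered insulation panels are damaged and the quality is unacceptable.")
label_2 = classify_email("We confirm your order; delivery is scheduled for 12 May.")
```

A keyword approach like this breaks down quickly with synonyms, typos, or other languages, which is one reason why learned models are preferred in real systems.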

Within these categories, the contents of emails can be further divided using semantic content analysis, for example, to determine the urgency of a request. More on that later.

Once the emails are correctly classified, metadata can be extracted and created from each text using “Named Entity Recognition (NER).”

NER allows entities in texts to be identified and named. Entities can be people, places, organizations, dates, or other named objects. Regarding email inboxes and their contents, NER can be useful in extracting important information and connections within the texts. By identifying and categorizing entities, relevant information can be quickly found and classified.

In the case of complaints, NER can be used to identify the names of the product, the customer, and the seller. This information can then be used to solve the problem or make changes to the product to avoid future complaints.

NER can also help automatically highlight relevant facts and connections in emails after they are classified. For example, if an order is received as an email from a customer, NER can extract the relevant information, enrich the email with metadata, and automatically forward it to the appropriate salesperson.
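
The following sketch shows what extracted metadata could look like. It uses regular expressions as a simplified stand-in for a trained NER model, so it only recognizes rigid patterns such as dates, amounts, and order numbers, but no people or organizations. The patterns, the order number format, and the sample email are assumptions.

```python
import re

# Hypothetical patterns; a real NER model does not rely on fixed formats like these.
PATTERNS = {
    "DATE": r"\b\d{1,2}\.\d{1,2}\.\d{4}\b",
    "MONEY": r"\bEUR\s?\d+(?:[.,]\d+)*",
    "ORDER_ID": r"\bPO-\d+\b",
}


def extract_entities(text: str) -> list:
    """Return (label, matched text) pairs for all patterns found in the text."""
    entities = []
    for label, pattern in PATTERNS.items():
        for match in re.finditer(pattern, text):
            entities.append((label, match.group()))
    return entities


email = "We confirm order PO-10423 for EUR 4,800.00, delivery on 15.05.2023."
entities = extract_entities(email)
```

The resulting pairs can be stored as metadata alongside the email, for example to route it to the responsible salesperson.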

Similarity (Task: Semantic Similarity)

Successful knowledge management first requires identifying and gathering relevant data, facts, and documents in a targeted manner. This has been a particularly challenging task with unstructured text data such as emails, which are also stored in information silos (i.e. in mailboxes). To better capture the content of incoming emails and their overlaps, methods for semantic analysis of text can be employed. “Semantic Similarity Analysis” is a technology used to understand the meaning of texts and measure the similarities between different texts.

In the context of knowledge management, semantic analysis can help group emails and identify those that relate to the same topic or contain similar requests. This can increase the productivity of customer support teams by allowing them to focus on important tasks, rather than spending a lot of time manually sorting or searching through emails.

In addition, semantic analysis can help identify trends and patterns in incoming emails that may indicate problems or opportunities for improvement in the company. These insights can then be used to proactively address customer needs or improve processes and products.
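The core of such a similarity analysis is a comparison of vectors that represent the texts. The sketch below uses simple word counts and cosine similarity. This is an assumption made to keep the example self-contained: word counts only capture shared vocabulary, whereas a real semantic similarity model would compare learned embeddings. The example emails are made up.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Represent a text as word counts (a crude stand-in for a sentence embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def most_similar(query, emails):
    """Return the email from the list that is most similar to the query."""
    query_vec = bag_of_words(query)
    return max(emails, key=lambda e: cosine_similarity(query_vec, bag_of_words(e)))

emails = [
    "The invoice for the scaffolding delivery is still missing",
    "Can we move the site inspection to next week",
    "Please send the missing invoice for the delivery",
]
print(most_similar("Where is the invoice for the delivery", emails))
```

Grouping emails by topic then amounts to clustering these vectors, and the same similarity score can be used to find earlier emails that dealt with a comparable request.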

Answer Generation (Task: Text Generation)

Finally, emails need to be answered. Those who have already experimented with text suggestions in email programs know that this task is not yet ready for automation. However, generative models can help answer emails faster and more accurately. A generative language model can quickly and reliably generate response templates based on incoming emails, which then only need to be supplemented, completed and checked by the person processing them. It is important to carefully check each response before sending it, as generative models are known to hallucinate results, i.e. generate convincing answers that contain errors upon closer examination. Here too, AI systems can at least partially remedy the situation by using a “control model” to verify the facts and statements of these “response models” for accuracy.

Conclusion

Natural Language Processing (NLP) offers companies numerous opportunities to improve their knowledge management strategies. NLP enables us to extract precise information from unstructured text and optimize the processing and provision of knowledge for employees.

By applying NLP methods to emails, documents, and other text sources, companies can automatically categorize, summarize, and reduce content to the most important information. This allows employees to quickly and easily access important information without having to wade through long pages of text. This saves time, reduces error-proneness, and contributes to making better business decisions.

Using the example of a construction project, we demonstrated how NLP can be used in practice to process emails more efficiently and improve knowledge management. Applying NLP techniques, such as summarizing information and tailoring it to specific departments, can help companies better achieve their goals and improve their performance.

The application of NLP in knowledge management offers great advantages for companies. It can help automate processes, improve collaboration, increase efficiency, and optimize decision-making quality. Companies that integrate NLP into their knowledge management strategy can gain valuable insights that enable them to better navigate an increasingly complex business environment.

Image source: AdobeStock 459537717

Oliver Guggenbühl, Jonas Braun

Artificially enhancing face images is all the rage

What can AI contribute?

In recent years, image filters have become wildly popular on social media. These filters let anyone adjust their face and the surroundings in different ways, leading to entertaining results. Often, filters enhance facial features that seem to match a certain beauty standard. As AI experts, we asked ourselves what our tools can achieve when it comes to face representations. One issue that sparked our interest is gender representation. We were curious: how does the AI represent gender differences when creating these images? And on top of that: can we generate gender-neutral versions of existing face images?

Using StyleGAN on existing images

When thinking about what existing images to explore, we were curious to see how our own faces would be edited. Additionally, we decided to use several celebrities as inputs – after all, wouldn’t it be intriguing to observe world-famous faces morphed into different genders?

Currently, we often see text-prompt-based image generation models like DALL-E at the center of public discourse. Yet, the AI-driven creation of photo-realistic face images has long been a focus of researchers due to the apparent challenge of creating natural-looking face images. Searching for suitable AI models to approach our idea, we chose the StyleGAN architectures, which are well known for generating realistic face images.

Adjusting facial features using StyleGAN

One crucial aspect of this AI’s architecture is the use of a so-called latent space from which we sample the inputs of the neural network. You can picture this latent space as a map on which every possible artificial face has a defined coordinate. Usually, we would just throw a dart at this map and be happy about the AI producing a realistic image. But as it turns out, this latent space allows us to explore various aspects of artificial face generation. When you move from one face’s location on that map to another face’s location, you can generate mixtures of the two faces. And as you move in any arbitrary direction, you will see random changes in the generated face image.

This makes the StyleGAN architecture a promising approach for exploring gender representation in AI.
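The map analogy can be sketched in a few lines. The example below samples two random latent coordinates and computes evenly spaced points on the straight line between them. The toy dimension of 8 and all function names are assumptions for illustration; the generator network that would turn each coordinate into an image is left out.

```python
import random

def random_latent(dim, rng):
    """Sample a latent coordinate ("throw a dart at the map") from a standard normal."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def interpolate(start, end, t):
    """Linear interpolation: t=0 gives start, t=1 gives end, values in between mix both."""
    return [(1.0 - t) * s + t * e for s, e in zip(start, end)]

def interpolation_path(start, end, steps):
    """Return `steps` (at least 2) evenly spaced coordinates from start to end, inclusive."""
    return [interpolate(start, end, i / (steps - 1)) for i in range(steps)]

rng = random.Random(0)
face_a = random_latent(8, rng)  # toy dimension; actual latents are much larger
face_b = random_latent(8, rng)
path = interpolation_path(face_a, face_b, steps=5)
# Each element of `path` would be passed to the generator to render one mixed face.
print(len(path))
```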

Can we isolate a gender direction?

So, are there directions that allow us to change certain aspects of the generated image? Could a gender-neutral representation of a face be approached this way? Pre-existing works have found semantically interesting directions, yielding fascinating results. One of those directions can alter a generated face image to have a more feminine or masculine appearance. This lets us explore gender representation in images.

The approach we took for this article was to generate multiple images by making small steps in each gender’s direction. That way, we can compare various versions of the faces, and the reader can, for example, decide which image comes closest to a gender-neutral face. It also allows us to examine the changes more clearly and look for unwanted characteristics in the edited versions.
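The stepping procedure described above boils down to adding multiples of a direction vector to a latent coordinate. The following minimal sketch shows this with toy numbers. The vectors, the step size, and the function names are hypothetical, and which sign of the step corresponds to which appearance depends on how the direction was derived.

```python
def move_along_direction(latent, direction, step):
    """Shift a latent coordinate by `step` times the direction vector."""
    return [w + step * d for w, d in zip(latent, direction)]

def edit_series(latent, direction, max_step, num_each_side):
    """Build a series centred on the original: negative steps first, then positive."""
    steps = [
        max_step * i / num_each_side
        for i in range(-num_each_side, num_each_side + 1)
    ]
    return [move_along_direction(latent, direction, s) for s in steps]

# Toy values; a real direction would be a learned vector in the model's latent space.
original = [0.2, -1.0, 0.5]
gender_direction = [1.0, 0.0, -0.5]
series = edit_series(original, gender_direction, max_step=2.0, num_each_side=3)
print(len(series))  # 7 coordinates; the middle one is the unedited original
```

Feeding each coordinate of the series to the generator would produce one image row like the ones shown in the results.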

Introducing our own faces to the AI

The described method can be used to alter any face generated by the AI towards a more feminine or masculine version. However, a crucial challenge remains: since we want to use our own images as a starting point, we must be able to obtain the latent coordinate (in our analogy, the correct place on the map) for a given face image. This sounds easy at first, but the StyleGAN architecture we used only allows us to go one way, from latent coordinate to generated image, not the other way around. Thankfully, other researchers have explored this very problem. Our approach thus builds heavily on the Python notebook found here. The researchers built another “encoder” AI that takes a face image as input and finds its corresponding coordinate in the latent space.

And with that, we finally have all parts necessary to realize our goal: exploring different gender representations using an AI. In the photo sequences below, the center image is the original input image. Towards the left, the generated faces appear more female; towards the right, they seem more male. Without further ado, we present the AI-generated images of our experiment:

Results: photo series from female to male

Unintended biases

After finding the corresponding images in the latent space, we generated artificial versions of the faces. We then moved them along the chosen gender direction, creating “feminized” and “masculinized” faces. Looking at the results, we see some unexpected behavior in the AI: it seems to recreate classic gender stereotypes.

Big smiles vs. thick eyebrows

Whenever we edited an image to look more feminine, we gradually saw the mouth open into a stronger smile, and vice versa. Likewise, eyes grow larger and open wider in the female direction. The Drake and Kim Kardashian examples illustrate a visible change in skin tone from darker to lighter when moving along the series from feminine to masculine. The chosen gender direction appears to edit out curls in the female direction (as opposed to the male direction), as shown by the examples of Marilyn Monroe and blog co-author Isabel Hermes. We also asked ourselves whether the lack of hair extension in Drake’s female direction would be remedied if we extended his photo series. Examining the overall extremes, eyebrows are thinned out and arched on the female side and straighter and thicker on the male side. Eye and lip makeup increase heavily on faces that move in the female direction, making the area surrounding the eyes darker and thinning out eyebrows. This may be why we perceived the male versions we generated to look more natural than the female versions.

Finally, we would like to challenge you, as the reader, to examine the photo series above closely. Try to decide which image you perceive as gender-neutral, i.e., as much male as female. What made you choose that image? Did any of the stereotypical features described above impact your perception?

A natural question that arises from image series like the ones generated for this article is whether there is a risk that the AI reinforces current gender stereotypes.

Is the AI to blame for recreating stereotypes?

Given that the adjusted images recreate certain gender stereotypes like a more pronounced smile in female images, a possible conclusion could be that the AI was trained on a biased dataset. And indeed, to train the underlying StyleGAN, image data from Flickr was used that inherits the biases from the website. However, the main goal of this training was to create realistic images of faces. And while the results might not always look as we expect or want, we would argue that the AI did precisely that in all our tests.

To alter the images, however, we used the aforementioned latent direction. In general, such latent directions rarely change only a single aspect of the created image. Instead, like walking in a random direction on our latent map, many elements of the generated face usually get changed simultaneously. Identifying a direction that alters only a single aspect of a generated image is anything but trivial. For our experiment, the chosen direction was created primarily for research purposes without accounting for said biases. It can therefore introduce unwanted artifacts in the images alongside the intended alterations. Yet it is reasonable to assume that a latent direction exists that allows us to alter the gender of a face created by the StyleGAN without affecting other facial features.

Overall, the implementations we build upon use different AIs and datasets, and therefore the complex interplay of those systems doesn’t allow us to identify the AI as a single source for these issues. Nevertheless, our observations suggest that doing due diligence to ensure the representation of different ethnic backgrounds and avoid biases in creating datasets is paramount.

Fig. 7: Picture from “A Sex Difference in Facial Contrast and its Exaggeration by Cosmetics” by Richard Russell

Subconscious bias: looking at ourselves

A study by Richard Russell deals with human perception of gender in faces. Ask yourself: which gender would you intuitively assign to the two images above? It turns out that most people perceive the left person as male and the right person as female. Look again. What separates the faces? There is no difference in facial structure. The only difference is darker eye and mouth regions. It becomes apparent that increased contrast is enough to influence our perception. If our opinion on gender can be swayed by applying “cosmetics” to a face, we must question our human understanding of gender representations and whether they are simply products of our life-long exposure to stereotypical imagery. The author refers to this as the “Illusion of Sex”.

This bias relates to the selection of the latent “gender” dimension: to find the latent dimension that changes the perceived gender of a face, StyleGAN-generated images were divided into groups according to their appearance. While this was implemented based on yet another AI, human bias in gender perception might well have impacted this process and leaked through to the image rows illustrated above.

Conclusion

Moving beyond the gender binary with StyleGANs

While a StyleGAN might not reinforce gender-related bias in and of itself, people still subconsciously harbor gender stereotypes. Gender bias is not limited to images – researchers have found the ubiquity of female voice assistants reason enough to create a new voice assistant that is neither male nor female: GenderLess Voice.

One example of a recent societal shift is the debate over gender; rather than binary, gender may be better represented as a spectrum. The idea is that there is biological gender and social gender. Being included in society as who they are is essential for anyone who identifies with a gender that differs from the one they were born with.

A question we, as a society, must stay wary of is whether the field of AI is at risk of discriminating against those beyond the assigned gender binary. The fact is that in AI research, gender is often represented as binary. Pictures fed into algorithms to train them are either labeled as male or female. Gender recognition systems based on deterministic gender-matching may also cause direct harm by mislabelling members of the LGBTQIA+ community. Currently, additional gender labels have yet to be included in ML research. Rather than representing gender as a binary variable, it could be coded as a spectrum.

Exploring female to male gender representations

We used StyleGANs to explore how AI represents gender differences. Specifically, we used a gender direction in the latent space. Researchers determined this direction to display male and female gender. We saw that the generated images replicated common gender stereotypes – women smile more, have bigger eyes, longer hair, and wear heavy makeup – but importantly, we could not conclude that the StyleGAN model alone propagates this bias. Firstly, StyleGANs were created primarily to generate photo-realistic face images, not to alter the facial features of existing photos at will. Secondly, since the latent direction we used was created without correcting for biases in the StyleGAN’s training data, we see a correlation between stereotypical features and gender.

Next steps and gender neutrality

We asked ourselves which faces we perceived as gender neutral among the image sequences we generated. For original images of men, we had to look towards the artificially generated female direction and vice versa. This was a subjective choice. We see it as a logical next step to try to automate the generation of gender-neutral versions of face images to further explore the possibilities of AI in the area of gender and society. For this, we would first have to classify the gender of the face to be edited and then move towards the opposite gender until the classifier can no longer assign an unambiguous label. Interested readers will be able to follow the continuation of our journey in a second blog article.
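The automated procedure outlined here could look roughly like the following sketch. It is only a thought experiment under strong assumptions: the classifier is a hypothetical logistic function of the latent coordinate rather than a real image classifier, and the step size, tolerance, and all names are made up.

```python
import math

def toy_classifier(latent, direction):
    """Hypothetical classifier: probability of the label 'male' via a logistic function."""
    score = sum(w * d for w, d in zip(latent, direction))
    return 1.0 / (1.0 + math.exp(-score))

def find_neutral(latent, direction, step=0.05, tolerance=0.02, max_iter=1000):
    """Step against the predicted label until the classifier is close to undecided."""
    current = list(latent)
    for _ in range(max_iter):
        p_male = toy_classifier(current, direction)
        if abs(p_male - 0.5) <= tolerance:
            return current, p_male
        sign = -1.0 if p_male > 0.5 else 1.0  # move towards the opposite label
        current = [w + sign * step * d for w, d in zip(current, direction)]
    return current, toy_classifier(current, direction)

direction = [1.0, 0.0, -0.5]
start = [1.5, 0.3, -1.0]
neutral, p = find_neutral(start, direction)
print(round(p, 2))
```

With a real classifier, the stopping point would inherit that classifier's own biases, so the result would still need a critical look by human reviewers.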

If you are interested in our technical implementation for this article, you can find the code here and try it out with your own images.

Resources

Photo Credits
AdobeStock 210526825 – Wayhome Studio
AdobeStock 243124072 – Damir Khabirov
AdobeStock 387860637 – insta_photos
AdobeStock 395297652 – Nattakorn
AdobeStock 480057743 – Chris
AdobeStock 573362719 – Xavier Lorenzo
AdobeStock 575284046 – Jose Calsina

Isabel Hermes, Alexander Müller

Why we need AI Principles

Artificial intelligence has already begun and will continue to fundamentally transform our world. Algorithms increasingly influence how we behave, think, and feel. Companies around the globe will continue to adopt AI technology and rethink their current processes and business models. Our social structures, how we work, and how we interact with each other will change with the advancements of digitalization, especially in AI.

Beyond its social and economic influence, AI also plays a significant role in one of the biggest challenges of our time: climate change. On the one hand, AI can provide instruments to tackle parts of this urgent challenge. On the other hand, the development and the implementation of AI applications will consume a lot of energy and emit massive amounts of greenhouse gases.

Risks of AI

With the advancement of a technology that has such a high impact on all areas of our lives come huge opportunities but also big risks. To give you an impression of the risks, we just picked a few examples:

AI can be used to monitor people, for example, through facial recognition systems. Some countries have already been using this technology extensively for several years.
AI is used in very sensitive areas where minor malfunctions could have dramatic implications. Examples are autonomous driving, robot-assisted surgery, credit scoring, recruiting candidate selection, or law enforcement.
The Facebook and Cambridge Analytica scandal showed that data and AI technologies can be used to build psychographic profiles. These profiles allow microtargeting of individuals with customized content to influence elections. This example shows the massive power of AI technologies and their potential for abuse and manipulation.
With recent advancements in computer vision technology, deep learning algorithms can now be used to create deepfakes. Deepfakes are realistic videos or images of people doing or saying something they never did or said. Obviously, this technology comes with enormous risks.
Artificial intelligence solutions are often developed to improve or optimize manual processes. There will be use cases where this will lead to a replacement of human work. This is a challenge that cannot be ignored and needs to be addressed early.
In the past, AI models reproduced discriminatory patterns in the data they were trained on. For example, Amazon used an AI system in their recruiting process that clearly disadvantaged women.

These examples make clear that every company and every person developing AI systems should reflect very carefully on the impact the system will or might have on society, specific groups, or even individuals.

Therefore, the big challenge for us is to ensure that the AI technologies we develop help and enable people while minimizing any forms of associated risks.

Why are there no official regulations in place in 2022?

You might be asking yourself why there is no regulation in place to address this issue. The problem with new technology, especially artificial intelligence, is that it advances fast, sometimes even too fast.

Recent releases of new language models like GPT-3 or computer vision models such as DALL-E 2 exceeded the expectations of many AI experts. The abilities and applications of AI technologies will continually advance faster than regulation can. And we are not talking about months, but years.

It is fair to say that the EU made its first attempt in this direction by proposing a regulatory framework for artificial intelligence. However, they indicate that the regulation could apply to operators in the second half of 2024 at the earliest. That is years after the above-described examples became a reality.

Our approach: statworx AI Principles

The logical consequence of this issue is that we, as a company, must address this challenge ourselves. And therefore, we are currently working on the statworx AI Principles, a set of principles that guide us when developing AI solutions.

What we have done so far and how we got here

In our task force “AI & Society”, we started to tackle this topic. First, we scanned the market and found many interesting papers but concluded that none of them could be transferred 1:1 to our business model. Often these principles or guidelines were very fuzzy or too detailed and unsuitable for a consulting company that operates in a B2B setting as a service provider. So, we decided we needed to devise a solution ourselves.

The first discussions showed four big challenges:

On the one hand, the AI Principles must be formulated clearly and at a high level so that non-experts also understand their meaning. On the other hand, they must be specific enough to be integrated into our delivery processes.
As a service provider, we may have limited control and decision power about some aspects of an AI solution. Therefore, we must understand what we can decide and what is beyond our control.
Our AI Principles will only add sustainable value if we can act according to them. Therefore, we need to promote them in our projects to the customers. We recognize that budget constraints, financial targets, and other factors might work against the proper application of these principles as it will need additional time and money.
Furthermore, what is wrong and right is not always obvious. Our discussions showed that there are many different perceptions of the right and necessary things to do. This means we will have to find common ground on which we can all agree.

Our two key take-aways

A key insight from these thoughts was that we would need two things.

As a first step, we need high-level principles that are understandable, clear, and where everyone is on board. These principles act as a guiding idea and give orientation when decisions are made. In a second step, we will use them to derive best practices or a framework that translates these principles into concrete actions during all phases of our project delivery.

The second major thing we learned is that undergoing this process and asking these questions is tough, but also unavoidable for every company that develops or uses AI technology.

What comes next

So far, we are nearly at the end of the first step. We will soon communicate the statworx AI Principles through our channels. If you are currently in this process, too, we would be happy to get in touch to understand what you did and learned.

References

https://www.nytimes.com/2019/04/14/technology/china-surveillance-artificial-intelligence-racial-profiling.html

https://www.nytimes.com/2018/04/04/us/politics/cambridge-analytica-scandal-fallout.html

https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G

https://digital-strategy.ec.europa.eu/en/policies/european-approach-artificial-intelligence

https://www.bundesregierung.de/breg-de/themen/umgang-mit-desinformation/deep-fakes-1876736

https://www.welt.de/wirtschaft/article173642209/Jobverlust-Diese-Jobs-werden-als-erstes-durch-Roboter-ersetzt.html

Jan Fischer

Whether deliberate or unconscious, bias in our society makes it difficult to create a gender-equal world free of stereotypes and discrimination. Unfortunately, this gender bias creeps into AI technologies, which are rapidly advancing in all aspects of our daily lives and will transform our society as we have never seen before. Therefore, creating fair and unbiased AI systems is imperative for a diverse, equitable, and inclusive future. It is crucial not only to be aware of this issue but also to act now, before these technologies reinforce our gender bias even more, including in areas of our lives where we have already eliminated it.

Solving starts with understanding: To work on solutions to eliminate gender bias and all other forms of bias in AI, we first need to understand what it is and where it comes from. Therefore, in the following, I will first introduce some examples of gender-biased AI technologies and then give you a structured overview of the different reasons for bias in AI. I will present the actions needed towards fairer and more unbiased AI systems in a second step.

Sexist AI

Gender bias in AI has many faces and has severe implications for women’s equality. While YouTube shows my single friend (male, 28) advertisements for the latest technical inventions or the newest car models, I, also single and 28, have to endure advertisements for fertility or pregnancy tests. But AI is not only used to make decisions about which products we buy or which series we want to watch next. AI systems are also being used to decide whether or not you get a job interview, how much you pay for your car insurance, how good your credit score is, or even what medical treatment you will get. And this is where bias in such systems really starts to become dangerous.

In 2015, for example, Amazon’s recruiting tool falsely learned that men are better programmers than women and therefore did not rate candidates for software developer jobs and other technical posts in a gender-neutral way.

In 2019, a couple applied for the same credit card. Although the wife had a slightly better credit score and the same income, expenses, and debts as her husband, the credit card company set her credit card limit much lower, which the customer service of the credit card company could not explain.

If these sexist decisions were made by humans, we would be outraged. Fortunately, there are laws and regulations against sexist behavior for us humans. Still, AI has somehow ended up above the law because a supposedly rational machine made the decision. So, how can a supposedly rational machine become biased, prejudiced, and racist? There are three interlinked reasons for bias in AI: data, models, and community.

Data is Destiny

First, data is a mirror of our society, with all our values, assumptions, and, unfortunately, also biases. There is no such thing as neutral or raw data. Data is always generated, measured, and collected by humans. Data has always been produced through cultural operations and shaped into cultural categories. For example, most demographic data is labeled based on simplified, binary female-male categories. When gender classification conflates gender in this way, data is unable to show gender fluidity and one’s gender identity. Also, race is a social construct, a classification system invented by us humans a long time ago to define physical differences between people, which is still present in data.

The underlying mathematical algorithm in AI systems is not sexist itself. AI learns from data with all its potential gender biases. For example, suppose a face recognition model has never seen a transgender or non-binary person because there was no such picture in the data set. In that case, it will not correctly classify a transgender or non-binary person (selection bias).

Or, as in the case of Google Translate, the phrase “eine Ärztin” (a female doctor) is consistently translated into the masculine form in gender-inflected languages because the AI system has been trained on thousands of online texts where the male form of “doctor” was more prevalent due to historical and social circumstances (historical bias). According to the book Invisible Women, there is a big gender gap in Big Data in general, to the detriment of women. So if we do not pay attention to what data we feed these algorithms, they will take over the gender gap in the data, leading to serious discrimination against women.

Models need Education

Second, our AI models are unfortunately not smart enough to overcome the biases in the data. Because current AI models only analyze correlations and not causal structures, they blindly learn what is in the data. These algorithms have an inherent, systematic structural conservatism, as they are designed to reproduce given patterns in the data.

To illustrate this, I will use a fictional and very simplified example: Imagine a very stereotypical data set with many pictures of women in kitchens and men in cars. Based on these pictures, an image classification algorithm has to learn to predict the gender of a person in a picture. Due to the data selection, there is a high correlation between kitchens and women and between cars and men in the data set – a higher correlation than between some characteristic gender features and the respective gender. As the model cannot identify causal structures (what are gender-specific features), it thus falsely learns that having a kitchen in the picture also implies having women in the picture and the same for cars and men. As a result, if there’s a woman in a car in some image, the AI would identify the person as a man and vice versa.
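The kitchen and car example can be reproduced with a few lines of code. In this minimal sketch, each "image" is reduced to two made-up features, and a very simple model (a one-feature decision rule, chosen here purely for illustration) picks whichever feature correlates best with the label. The data, the feature names, and the model are my own assumptions.

```python
from collections import Counter

# Made-up, deliberately stereotypical data: (background, facial_cue, label).
# The background matches the label in every training image,
# the facial cue in only 75% of them.
TRAINING_IMAGES = (
    [("kitchen", "cue_f", "woman")] * 3 + [("kitchen", "cue_m", "woman")] * 1 +
    [("car", "cue_m", "man")] * 3 + [("car", "cue_f", "man")] * 1
)

def fit_stump(rows, feature_index):
    """Map each feature value to its majority label and report training accuracy."""
    votes = {}
    for row in rows:
        votes.setdefault(row[feature_index], Counter())[row[-1]] += 1
    rule = {value: counts.most_common(1)[0][0] for value, counts in votes.items()}
    accuracy = sum(rule[row[feature_index]] == row[-1] for row in rows) / len(rows)
    return rule, accuracy

def fit_best_stump(rows):
    """Pick the single feature with the highest training accuracy (pure correlation)."""
    candidates = [(fit_stump(rows, i), i) for i in range(len(rows[0]) - 1)]
    (rule, accuracy), index = max(candidates, key=lambda c: c[0][1])
    return index, rule, accuracy

index, rule, accuracy = fit_best_stump(TRAINING_IMAGES)
# A woman in a car: the facial cue says "woman",
# but the model only looks at the background.
woman_in_car = ("car", "cue_f")
print(index, rule[woman_in_car[index]])
```

The model reaches perfect accuracy on its training data by relying on the background alone, which is exactly why it fails as soon as the stereotype does not hold.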

However, this is not the only reason AI systems cannot overcome bias in data. It is also because we do not “tell” the systems that they should watch out for this. AI algorithms learn by optimizing a certain objective or goal defined by the developers. Usually, this performance measure is an overall accuracy metric, not including any ethical or fairness constraints. It is as if a child was to learn to get as much money as possible without any additional constraints such as suffering consequences from stealing, exploiting, or deceiving. If we want AI systems to learn that gender bias is wrong, we have to incorporate this into their training and performance evaluation.
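What it means to incorporate fairness into the performance evaluation can be sketched as follows. The example compares two hypothetical models that have the same accuracy but treat two groups differently, using the demographic parity gap (the difference in positive-prediction rates between groups) as the fairness measure. Demographic parity is only one of several possible fairness criteria, and the data, the penalty weight, and the function names are assumptions for illustration.

```python
def accuracy(predictions, labels):
    """Share of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def positive_rate(predictions, groups, group):
    """Share of members of `group` that receive a positive (1) prediction."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    return sum(selected) / len(selected)

def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rates between any two groups."""
    rates = [positive_rate(predictions, groups, g) for g in set(groups)]
    return max(rates) - min(rates)

def fairness_aware_score(predictions, labels, groups, penalty=1.0):
    """Accuracy minus a penalty for unequal treatment of the groups."""
    gap = demographic_parity_gap(predictions, groups)
    return accuracy(predictions, labels) - penalty * gap

# Made-up hiring data: 1 = invite to interview, 0 = reject.
groups = ["f", "f", "f", "f", "m", "m", "m", "m"]
labels = [1, 0, 1, 0, 1, 1, 0, 0]
model_a = [0, 0, 1, 0, 1, 1, 1, 0]  # favours group "m"
model_b = [1, 0, 1, 0, 1, 0, 1, 0]  # equal invitation rates

for name, preds in [("A", model_a), ("B", model_b)]:
    print(name, accuracy(preds, labels), fairness_aware_score(preds, labels, groups))
```

Judged by accuracy alone, the two models look identical. Only the additional fairness term reveals that one of them invites men three times as often as women.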

Community lacks Diversity

Last, it is the developing community who directly or indirectly, consciously or subconsciously introduces their own gender and other biases into AI technologies. They choose the data, define the optimization goal, and shape the usage of AI.

While there may be malicious intent in some cases, I would argue that developers often bring their own biases into AI systems at an unconscious level. We all suffer from unconscious biases, that is, unconscious errors in thinking that arise from problems related to memory, attention, and other mental mistakes. In other words, these biases result from the effort to simplify the incredibly complex world in which we live.

For example, it is easier for our brain to apply stereotypical thinking, that is, forming ideas about a person based on what people from a similar group might “typically” be like (e.g., a man is more suited to a CEO position), than to gather all the information needed to fully understand a person and their characteristics. Or, according to the affinity bias, we like people most who look and think like us, which is also a simplified way of understanding and categorizing the people around us.

We all have such unconscious biases, and since we are all different people, these biases vary from person to person. However, since the current community of AI developers comprises over 80% white cis-men, the values, ideas, and biases creeping into AI systems are very homogeneous and thus literally narrow-minded. Starting with the definition of AI, the founding fathers of AI back in 1956 were all white male engineers, a very homogeneous group of people, which led to a narrow idea of what intelligence is, namely the ability to win games such as chess. However, from psychology, we know that there are a lot of different kinds of intelligence, such as emotional or social intelligence. Still, today, if a model is developed and reviewed by a very homogeneous group of people, without special attention and processes, they will not be able to identify discrimination against people who are different from themselves due to unconscious biases. Indeed, this homogeneous community tends to be the group of people who barely suffer from bias in AI.

Just imagine if all the children in the world were raised and educated by 30-year-old white cis-men. That is what our AI looks like today. It is designed, developed, and evaluated by a very homogenous group, thus, passing on a one-sided perspective on values, norms, and ideas. Developers are at the core of this. They are teaching AI what is right or wrong, what is good or bad.

Break the Bias in Society

So, a crucial step towards fair and unbiased AI is a diverse and inclusive AI development community. There are by now some technical solutions to the data and model bias problems mentioned above (e.g., data diversification or causal modeling). Still, all these solutions are useless if the developers fail to think about bias problems in the first place. Diverse people can better check each other’s blind spots and biases. Many studies show that diversity in data science teams is critical in reducing bias in AI.

Furthermore, we must educate our society on AI, its risks, and its opportunities. We need to rethink and restructure the education of AI developers, as they need as much ethical knowledge as technical knowledge to develop fair and unbiased AI systems. We also need to show the broader population that all of us can become part of this massive transformation through AI and contribute our ideas and values to the design and development of these systems.

In the end, if we want to break the bias in AI, we need to break the bias in our society. Diversity is the solution to fair and unbiased AI, not only in AI development teams but across our whole society. AI is made by humans, by us, by our society. Our society with its structures brings bias into AI: through the data we produce, the goals we expect the machines to achieve, and the community developing these systems. At its core, bias in AI is not a technical problem – it is a social one.

Positive Reinforcement of AI

Finally, we need to ask ourselves: do we want AI to reflect society as it is today or the more equal society of tomorrow? If we use machine learning models to replicate the world as it is today, we are not going to make any social progress. If we fail to take action, we might even lose some of the social progress already made, such as greater gender equality, as AI amplifies and reinforces bias back into our lives. AI is supposed to be forward-looking. But at the same time, it is based on data, and data reflects our history and present. So, as much as we need to break the bias in society to break the bias in AI systems, we need unbiased AI systems for social progress in our world.

Having said all that, I am hopeful and optimistic. Through this amplification effect, AI has raised awareness of old fairness and discrimination issues in our society on a much broader scale. Bias in AI shows us some of the most pressing societal challenges. Ethical and philosophical questions become ever more important. And because AI has this reinforcement effect on society, we can also use it for the positive. We can use this technology for good. If we all work together, it is our chance to remake the world into a much more diverse, inclusive, and equal place.

Livia Eichenberger
