
A successful data culture is the key for companies to derive the greatest possible benefit from the constantly growing amount of data. The trend towards making decisions based on data is unstoppable. But how do managers manage to empower their teams to use data effectively?

A vibrant data culture: the fuel for corporate success

Data culture is more than just a buzzword – it is the basis for data-driven decision-making. When all departments in a company use data to improve work processes and decision-making, an atmosphere is created in which competent handling of data is standard practice.

Why is this so important? Data is fuel for business success: 76 percent of participants in the BARC Data Culture Survey 22 stated that their company strives for a data culture. And 75 percent of managers see data culture as the most important competence.

The decisive role of managers

An established data culture is not only a success factor for the company, but also a way to promote innovation and motivate employees. Managers play a central role here by acting as trailblazers and actively supporting change. They must clearly communicate the benefits of a data culture, establish clear guidelines for data protection and data quality, offer targeted training, and communicate regularly on progress. Clear responsibility for data culture is crucial: 31 percent of companies with a weak data culture do not have a dedicated department or person with this responsibility.

Challenges and solutions

The path to a successful data culture is littered with hurdles. Managers have to face various challenges:

  • Resistance to change: A transition to a data culture can be met with resistance. Managers must clearly communicate the benefits and offer training to involve their employees in the change process.
  • Lack of data governance: Guidelines and standards for handling data are crucial. If these are missing, data quality will be reduced in the worst case. This leads to wrong decisions. This is where data cleansing and validation methods and regular audits are needed.
  • Concerns regarding data protection: Data protection and data access are often in conflict. Clear guidelines and security measures must be introduced in order to gain the trust of employees.
  • Lack of resources and support: Without the necessary resources, building a data culture can fail. Companies must provide targeted training and demonstrate the business benefits in economic metrics to gain the support of their executives.

Best practices for a strong data culture

To effectively establish a data culture, companies can rely on the following best practices:

Critical thinking: Promoting critical thinking and ethical standards is crucial. Data and AI solutions will become tools of everyday life everywhere. Therefore, human intelligence remains the most important skill in dealing with technology.

Measuring and planning: Data culture can only be built up step by step. Companies should measure and evaluate data-driven behavior in order to assess progress. The stronger the data culture, the more omnipresent data-driven decision-making becomes.

Establishment of key roles: Companies should create special functions or roles for employees who link the data strategy with the corporate strategy and act as central multipliers to promote the data culture among employees.

The development of a strong data culture requires clear leadership, clear guidelines and the commitment of the entire organization. Managers play a crucial role in successfully shaping the change to a data-driven culture.

Building a strong data culture: our strategic approach

At statworx, we specialize in establishing robust data cultures in companies. Our strategy is based on proven frameworks, best practices and our extensive experience to lay the foundations for a successful data culture in your organization.

  1. Data Culture Strategy: Working hand in hand with our clients’ teams, we develop the strategic roadmap required to foster a thriving data culture. This includes building the foundational structures that are essential to maximizing the potential of your company’s data.
  2. Data culture training: We focus on empowering your workforce with the skills and knowledge to operate in the data and AI space. Our training programs aim to equip employees with the skills that are essential for building a strong data culture. This enables companies to realize the full potential of data and artificial intelligence.
  3. Change management and support: Embedding a data culture requires sustained change management efforts. We work with client teams to establish long-term change programs aimed at initiating and consolidating a robust data culture within the organization. Our goal is to ensure that the transformation remains embedded in the DNA of the organization to ensure continued success.

With our comprehensive range of services, we strive to lead companies into a future where data becomes a strategic asset that unlocks new opportunities and enables informed decision-making at all levels. We have written about this in detail in our white paper Data culture as a management task in companies (currently only available in German), which also contains a data culture checklist. Our services relating to data culture can be found on our Data Culture topic page.

Tarik Ashry

We are at the beginning of 2024, a time of fundamental change and exciting progress in the world of artificial intelligence. The next few months are seen as a critical milestone in the evolution of AI as it transforms from a promising future technology to a permanent reality in the business and everyday lives of millions. Together with the AI Hub Frankfurt, the central AI network in the Rhine-Main region, we are therefore presenting our trend forecast for 2024, the AI Trends Report 2024.

The report identifies twelve dynamic AI trends that are unfolding in three key areas: Culture and Development, Data and Technology, and Transparency and Control. These trends paint a picture of the rapid changes in the AI landscape and highlight the impact on companies and society.

Our analysis is based on extensive research, industry-specific expertise and input from experts. We highlight each trend to provide a forward-looking insight into AI and help companies prepare for future challenges and opportunities. However, we emphasize that trend forecasts are always speculative in nature and some of our predictions are deliberately bold.

Directly to the AI Trends Report 2024!

What is a trend?

A trend is different from both a short-lived fashion phenomenon and media hype. It is a phenomenon of change with a “tipping point” at which a small change in a niche can cause a major upheaval in the mainstream. Trends initiate new business models, consumer behavior and forms of work and thus represent a fundamental change to the status quo. It is crucial for companies to mobilize the right knowledge and resources before the tipping point in order to benefit from a trend.

12 AI trends that will shape 2024

In the AI Trends Report 2024, we identify groundbreaking developments in the field of artificial intelligence. Here are the short versions of the twelve trends, each with a selected quote from our experts.

Part 1: Culture and development

From the 4-day week to omnimodality and AGI: 2024 promises great progress for the world of work, for media production and for the possibilities of AI as a whole.

Thesis I: AI expertise within the company
Companies that deeply embed AI expertise in their corporate culture and build interdisciplinary teams with tech and industry knowledge will secure a competitive advantage. Centralized AI teams and a strong data culture are key to success.

“Data culture can’t be bought or dictated. You need to win the head, the heart and the herd. We want our employees to consciously create, use and share data and give them access to data, analytics and AI together with the knowledge and the mindset to run the business on data.”

Stefanie Babka, Global Head of Data Culture, Merck

 

Thesis II: 4-day working week thanks to AI
Thanks to AI automation in standard software and company processes, the 4-day working week has become a reality for some German companies. AI tools such as Microsoft’s Copilot increase productivity and make it possible to reduce working hours without compromising growth.

“GenAI will continue to drive automation in many areas. This will be the new benchmark for standard processes in all sectors. While this may have a positive impact on reducing working hours, we need to ensure that GenAI is used responsibly, especially in sensitive and customer-facing areas.”

Dr. Jean Enno Charton, Director Digital Ethics & Bioethics, Merck

 

Thesis III: AGI through omnimodal models
The development of omnimodal AI models that mimic human senses brings the vision of general artificial intelligence (AGI) closer. These models process a variety of inputs and extend human capabilities.

“Multimodal models trained on more than just text have shown that they are better able to draw conclusions and understand the world. We are excited to see what omnimodal models will achieve.”

Dr. Ingo Marquart, NLP Subject Matter Lead, statworx

 

Thesis IV: AI revolution in media production
Generative AI (GenAI) is transforming the media landscape and enabling new forms of creativity, but still falls short of transformational creativity. AI tools are becoming increasingly important for creatives, but it is important to maintain uniqueness against a global average taste.

“Those who integrate AI smartly will have a competitive advantage. There will be leaps in productivity in the areas of ideation, publishing and visuals. However, there will also be a lot of “low” and fake content (postings, messaging), so building trust will become even more important for brands. Social media tasks are shifting towards strategy, management and controlling.”

Nemo Tronnier, Founder & CEO, Social DNA

 

Part 2: Data and technology

In 2024, everything will revolve around data quality, open source models and access to processors. The operators of standard software such as Microsoft and SAP will benefit greatly because they occupy the interface to end users.

Thesis V: Challengers for NVIDIA
New players and technologies are preparing to shake up the GPU market and challenge NVIDIA’s position. Startups and established competitors such as AMD and Intel are looking to capitalize on the resource scarcity and long wait times that smaller players are currently experiencing and are focusing on innovation to break NVIDIA’s dominance.

“Contrary to popular belief, there isn’t really a shortage of AI accelerators if you count NVIDIA, Intel and AMD. The real problem is customer funding, as cloud providers are forced to offer available capacity with long-term contracts. This could change in 18 to 24 months when current deployments are sufficiently amortized. Until then, customers will have to plan for longer commitments.”

Norman Behrend, Chief Customer Officer, Genesis Cloud

 

Thesis VI: Data quality before data quantity
In AI development, the focus is shifting to the quality of the data. Instead of relying solely on quantity, the careful selection and preparation of training data and innovation in model architecture are becoming crucial. Smaller models with high-quality data can be superior to larger models in terms of performance.

“Data is not just one component of the AI landscape; having the right quality data is essential. Solving the ‘first-mile problem’ to ensure data quality and understanding the ‘last-mile problem’, i.e. involving employees in data and AI projects, are crucial for success.”

Walid Mehanna, Chief Data & AI Officer, Merck

 

Thesis VII: The year of the AI integrators
Integrators such as Microsoft, Databricks and Salesforce will be the winners as they bring AI tools to end users. The ability to seamlessly integrate into existing systems will be crucial for AI startups and providers. Companies that offer specialized services or groundbreaking innovations will secure lucrative niches.

“In 2024, AI integrators will show how they make AI accessible to end users. Their role is critical to the democratization of AI in the business world, enabling companies of all sizes to benefit from advanced AI. This development emphasizes the need for user-friendly and ethically responsible AI solutions.”

Marco Di Sazio, Head of Innovation, Bankhaus Metzler

 

Thesis VIII: The open source revolution
Open source AI models are competing with proprietary models such as OpenAI’s GPT and Google’s Gemini. With a community that fosters innovation and knowledge sharing, open source models offer more flexibility and transparency, making them particularly valuable for applications that require clear accountability and customization.

“Especially for SMEs, AI solutions are indispensable. Since a sufficient amount of data for a proprietary model is typically lacking, collaboration becomes crucial. However, the ability to adapt is essential in order to digitally advance your own business model.”

Prof. Dr. Christian Klein, Founder, UMYNO Solutions, Professor of Marketing & Digital Media, FOM University of Applied Sciences

 

Part 3: Transparency and control

The increased use of AI decision-making systems will spark an intensified debate on algorithm transparency and data protection in 2024 – in the search for accountability. The AI Act will become a locational advantage for Europe.

Thesis IX: AI transparency as a competitive advantage
European AI start-ups with a focus on transparency and explainability could become the big winners, as industries such as pharmaceuticals and finance already place high demands on the traceability of AI decisions. The AI Act promotes this development by demanding transparency and adaptability from AI systems, giving European AI solutions an edge in terms of trust.

“Transparency is becoming a key issue in the field of AI. This applies to the construction of AI models, the flow of data and the use of AI itself. It will have a significant impact on discussions about compliance, security and trust. The AI Act could even turn transparency and security into competitive advantages for European companies.”

Jakob Plesner, Attorney at Law, Gorrissen Federspiel

 

Thesis X: AI Act as a seal of quality
The AI Act positions Europe as a safe haven for investments in AI by setting ethical standards that strengthen trust in AI technologies. In view of the increase in deepfakes and the associated risks to society, the AI Act acts as a bulwark against abuse and promotes responsible growth in the AI industry.

“Companies facing technological change need a clear set of rules. By introducing a seal of approval for human-centered AI, the AI Act turns challenges into opportunities. The AI Act will become a blueprint internationally, giving EU companies a head start in responsible AI and making Europe a place for sustainable AI partnerships.”

Catharina Glugla, Head of Data, Cyber & Tech Germany, Allen & Overy LLP

 

Thesis XI: AI agents are revolutionizing consumption
Personal assistance bots that make purchases and select services will become an essential part of everyday life. Influencing their decisions will become a key element for companies to survive in the market. This will profoundly change search engine optimization and online marketing as bots become the new target groups.

“There will be several types of AI agents that act according to human intentions. For example, personal agents that represent an individual and service agents that represent an organization or institution. The interplay between them, such as personal-personal, personal-institutional and institutional-institutional, represents a new paradigm for economic activities and the distribution of value.”

Chi Wang, Principal Researcher, Microsoft Research

 

Thesis XII: Alignment of AI models
Aligning AI models with universal values and human intentions will be critical to avoid unethical outcomes and fully realize the potential of foundation models. Superalignment, where AI models work together to overcome complex challenges, is becoming increasingly important to drive the development of AI responsibly.

“Alignment is, at its core, an analytical problem that is about establishing transparency and control to gain user trust. These are the keys to effective deployment of AI solutions in companies, continuous evaluation and secure iteration based on the right metrics.”

Daniel Lüttgau, Head of AI Development, statworx

 

Concluding remarks

The AI Trends Report 2024 is more than an entertaining stocktake; it can be a useful tool for decision-makers and innovators. Our goal is to provide our readers with strategic advantages by discussing the impact of trends on different sectors and helping them set the course for the future.

This blog post offers only a brief insight into the comprehensive AI Trends Report 2024. We invite you to read the full report to dive deeper into the subject matter and benefit from the detailed analysis and forecasts.

To the AI Trends Report 2024!

Tarik Ashry

At the beginning of December, the central EU institutions reached a provisional agreement on a legislative proposal to regulate artificial intelligence in the so-called trilogue. The final legislative text with all the details is now being drafted. As soon as this has been drawn up and reviewed, the law can be officially adopted. We have compiled the current state of knowledge on the AI Act.

As part of the ordinary legislative procedure of the European Union, a trilogue is an informal interinstitutional negotiation between representatives of the European Parliament, the Council of the European Union and the European Commission. The aim of a trilogue is to reach a provisional agreement on a legislative proposal that is acceptable to both the Parliament and the Council, the co-legislators. The provisional agreement must then be adopted by each of these bodies in formal procedures.

Legislation with a global impact

A special feature of the upcoming law is the so-called market location principle: according to this, companies worldwide that offer or operate artificial intelligence on the European market or whose AI-generated output is used within the EU will be affected by the AI Act.

Artificial intelligence is defined as machine-based systems that can autonomously make predictions, recommendations or decisions and thus influence the physical and virtual environment. This applies, for example, to AI solutions that support the recruitment process, predictive maintenance solutions and chatbots such as ChatGPT. The legal requirements that different AI systems must fulfill vary greatly depending on their classification into risk classes.

The risk class determines the legal requirements

The EU’s risk-based approach comprises a total of four risk classes:

  • low,
  • limited,
  • high,
  • and unacceptable risk.

These classes reflect the extent to which artificial intelligence jeopardizes European values and fundamental rights. As the term “unacceptable” for a risk class already indicates, not all AI systems are permissible. AI systems that belong to the “unacceptable risk” category are prohibited by the AI Act. The following applies to the other three risk classes: the higher the risk, the more extensive and stricter the legal requirements for the AI system.

We explain below which AI systems fall into which risk class and which requirements are associated with them. Our assessments are based on the information contained in the “AI Mandates” document dated June 2023. At the time of publication, this document was the most recently published, comprehensive document on the AI Act.

Ban on social scoring and biometric remote identification

Some AI systems have a significant potential to violate human rights and fundamental principles, which is why they are categorized as “unacceptable risk”. These include:

  • Real-time based remote biometric identification systems in publicly accessible spaces (exception: law enforcement agencies may use them to prosecute serious crimes but only with judicial authorization);
  • Retrospective remote biometric identification systems (exception: law enforcement authorities may use them to prosecute serious crimes but only with judicial authorization);
  • Biometric categorization systems that use sensitive characteristics such as gender, ethnicity or religion;
  • Predictive policing based on so-called profiling – i.e. profiling based on skin color, suspected religious affiliation and similarly sensitive characteristics – geographical location or previous criminal behavior;
  • Emotion recognition systems for law enforcement, border control, the workplace and educational institutions;
  • Arbitrary extraction of biometric data from social media or video surveillance footage to create facial recognition databases;
  • Social scoring leading to disadvantage in social contexts;
  • AI that exploits the vulnerabilities of a particular group of people or uses unconscious techniques that can lead to behaviors that cause physical or psychological harm.

These AI systems are to be banned from the European market under the AI Act. Companies whose AI systems could fall into this risk class should urgently address the upcoming requirements and explore options for action. This is because a key result of the trilogue is that these systems will be banned just six months after official adoption.

Numerous requirements for AI with risks to health, safety and fundamental rights

The “high risk” category includes all AI systems that are not explicitly prohibited but nevertheless pose a high risk to health, safety or fundamental rights. The following areas of application and use are explicitly mentioned:

  • Biometric and biometric-based systems that do not fall into the “unacceptable risk” risk class;
  • Management and operation of critical infrastructure;
  • Education and training;
  • Access and entitlement to basic private and public services and benefits;
  • Employment, human resource management and access to self-employment;
  • Law enforcement;
  • Migration, asylum and border control;
  • Administration of justice and democratic processes

These AI systems are subject to comprehensive legal requirements that must be implemented prior to commissioning and observed throughout the entire AI life cycle:

  • Assessment to evaluate the effects on fundamental and human rights
  • Quality and risk management
  • Data governance structures
  • Quality requirements for training, test and validation data
  • Technical documentation and record-keeping obligations
  • Fulfillment of transparency and provision obligations
  • Human supervision, robustness, security and accuracy
  • Declaration of conformity incl. CE marking obligation
  • Registration in an EU-wide database

AI systems that are used in one of the above-mentioned areas but do not pose a risk to health, safety, the environment or fundamental rights are not subject to the legal requirements. However, this must be proven by informing the competent national authority about the AI system. The authority then has three months to assess the risks of the AI system. The AI can be put into operation within these three months. However, if the examining authority classifies it as high-risk AI, high fines may be imposed.

A special regulation also applies to AI products and AI safety components of products whose conformity is already being tested by third parties on the basis of EU legislation. This is the case for AI in toys, for example. In order to avoid overregulation and additional burdens, these will not be directly affected by the AI Act.

AI with limited risk must comply with transparency obligations

AI systems that interact directly with humans fall into the “limited risk” category. This includes emotion recognition systems, biometric categorization systems and AI-generated or modified content that resembles real people, objects, places or events and could be mistaken for real (“deepfakes”). For these systems, the draft law provides for the obligation to inform consumers about the use of artificial intelligence. This should make it easier for consumers to actively decide for or against their use. A code of conduct is also recommended.

No legal requirements for AI with low risk

Many AI systems, such as predictive maintenance or spam filters, fall into the “low risk” category. Companies that only offer or use such AI solutions will hardly be affected by the AI Act. This is because there are currently no legal requirements for such applications. Only a code of conduct is recommended.

Generative AI such as ChatGPT is regulated separately

Generative AI models and basic models with a wide range of possible applications were not included in the original draft of the AI Act. The regulatory possibilities of such AI models have therefore been the subject of particularly intense debate since the launch of ChatGPT by OpenAI. According to the European Council’s press statement of December 9, these models are now to be regulated on the basis of their risk. In principle, all models must implement transparency requirements. Foundation models with a particular risk – so-called “high-impact foundation models” – will also have to meet additional requirements. How exactly the risk of AI models will be assessed is currently still open. Based on the latest document, the following possible requirements for high-impact foundation models can be estimated:

  • Quality and risk management
  • Data governance structures
  • Technical documentation
  • Fulfillment of transparency and information obligations
  • Ensuring performance, interpretability, correctability, security, cybersecurity
  • Compliance with environmental standards
  • Cooperation with downstream providers
  • Registration in an EU-wide database

Companies should prepare for the AI Act now

Even though the AI Act has not yet been officially adopted and we do not yet know the details of the legal text, companies should prepare for the transition phase now. In this phase, AI systems and associated processes must be designed to comply with the law. The first step is to assess the risk class of each individual AI system. If you are not yet sure which risk classes your AI systems fall into, we recommend our free AI Act Quick Check. It will help you to assess the risk class.


Julia Rettig

Have you ever imagined a restaurant where AI powers everything? From the menu to the cocktails, hosting, music, and art? No? Ok, then, please click here.

If yes, well, it’s not a dream anymore. We made it happen: Welcome to “the byte” – Germany’s (maybe the world’s first) AI-powered Pop-up Restaurant!

As someone who has worked in data and AI consulting for over ten years, building statworx and the AI Hub Frankfurt, I have always thought of exploring the possibilities of AI outside of typical business applications. Why? Because AI will impact every aspect of our society, not just the economy. AI will be everywhere – in school, arts & music, design, and culture. Everywhere. Exploring these directions of AI’s impact led me to meet Jonathan Speier and James Ardinast from S-O-U-P, two like-minded founders from Frankfurt, who are rethinking how technology will shape cities and societies.

S-O-U-P is their initiative that operates at the intersection of culture, urbanity, and lifestyle. With their yearly “S-O-U-P Urban Festival” they connect creatives, businesses, gastronomy, and lifestyle people from Frankfurt and beyond.

When Jonathan and I started discussing AI and its impact on society and culture, we quickly came up with the idea of an AI-generated menu for a restaurant. Luckily, James, Jonathan’s S-O-U-P co-founder, is a successful gastro entrepreneur from Frankfurt. Now the pieces came together. After another meeting with James in one of his restaurants (and some drinks), we committed to launching Germany’s first AI-powered Pop-up Restaurant: the byte!

the byte: Our concept

We envisioned the byte to be an immersive experience, including AI in as many elements of the experience as possible. Everything, from the menu to the cocktails, music, branding, and the art on the walls, was AI-generated. Bringing AI into all of these components also pushed me far beyond what I typically do, namely helping large companies with their data & AI challenges.

Branding

Before creating the menu, we developed the visual identity of our project. We decided on a “lo-fi” appeal, using a pixelated font in combination with AI-generated visuals of plates and dishes. Our key visual, a neon-lit white plate, was created using DALL-E 2 and appeared across all of our marketing materials:

Location

We hosted the byte in one of Frankfurt’s coolest restaurant event locations: Stanley, which features approx. 60 seats and a fully fledged bar inside the restaurant (ideal for our AI-generated cocktails). The atmosphere is rather dark and cozy, with dark marble walls, white table linens as highlights, and a big red window that lets you see the kitchen from outside.

The menu

The heart of our concept was a 5-course menu that we designed to elevate the classical Frankfurter cuisine with the multicultural and diverse influences of Frankfurt (anyone who knows the Frankfurter kitchen will agree that this was not an easy task).

Using GPT-4 and some prompt engineering magic, we generated several menu candidates that were test-cooked by the experienced Stanley kitchen crew (thank you, guys for this great work!) and then assembled into a final menu. Below, you can find our prompt to create the menu candidates:

“Create a 5-course menu that elevates the classical Frankfurter kitchen. The menu must be a fusion of classical Frankfurter cuisine combined with the multicultural influences of Frankfurt. Describe each course, its ingredients as well as a detailed description of each dish’s presentation.”
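For illustration, here is a minimal sketch of how such a menu candidate could be generated programmatically, assuming the OpenAI Python client (openai >= 1.0). The model name, temperature, and call structure are illustrative assumptions, not the exact setup we used for the byte:

```python
# Hedged sketch: generate one menu candidate from the prompt above.
# Assumes the OpenAI Python client (openai >= 1.0) and an OPENAI_API_KEY
# environment variable; model and parameters are illustrative.
from openai import OpenAI

client = OpenAI()

prompt = (
    "Create a 5-course menu that elevates the classical Frankfurter kitchen. "
    "The menu must be a fusion of classical Frankfurter cuisine combined with "
    "the multicultural influences of Frankfurt. Describe each course, its "
    "ingredients as well as a detailed description of each dish's presentation."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
    temperature=1.0,  # a higher temperature yields more adventurous candidates
)

print(response.choices[0].message.content)
```

Re-running a call like this a few times with varied wording is enough to produce several candidates for test-cooking.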

Surprisingly, only minor adjustments were necessary to the recipes, even though some AI creations were extremely adventurous! This was our final menu:

  • Handkäs’ Mousse with Pickled Beetroot on Roasted Sourdough Bread
  • Next Level Green Sauce (with Cilantro and Mint) topped with a Fried Panko Egg
  • Cream Soup from White Asparagus with Coconut Milk and Fried Curry Fish
  • Currywurst (Beef & Vegan) by Best Worscht in Town with Carrot-Ginger-Mash and Pine Nuts
  • Frankfurt Cheesecake with Äppler Jelly, Apple Foam and Oat-Pecan-Crumble

My favorite was the “Next Level” Green Sauce, an oriental twist of the classical 7-herb Frankfurter Green Sauce topped with a fried panko egg. Yummy! Below you can see the menu out in the wild 🍲

AI Cocktails

Alongside the menu, we also prompted GPT to create recipes that twisted famous cocktail classics to match our Frankfurt fusion theme. The results:

  • Frankfurt Spritz (Frankfurter Äbbelwoi, Mint, Sparkling Water)
  • Frankfurt Mule (Variation of a Moscow Mule with Calvados)
  • The Main (Variation of a Swimming Pool Cocktail)

My favorite was the Frankfurt Spritz, as it was fresh, herbal, and delicate (see pic below):

AI Host: Ambrosia the Culinary AI

An important part of our concept was “Ambrosia”, an AI-generated host that guided the guests through the evening, explaining the concept and how the menu was created. We thought it was important to manifest the AI as something the guests could experience. We hired a professional screenwriter for the script and used murf.ai to create several text-to-speech assets that were played at the beginning of the dinner and in between courses.

Note: Ambrosia starts talking at 0:15.

AI Music

Music plays an important role in the vibe of an event. We decided to use mubert, a generative AI start-up that allowed us to create and stream AI music in different genres, such as “Minimal House” for a progressive vibe throughout the evening. After the main course, a DJ took over and accompanied our guests into the night 💃🍸


AI Art

Throughout the restaurant, we placed AI-generated art pieces by the local AI artist Vladimir Alexeev (a.k.a. “Merzmensch”), here are some examples:

AI Playground

As an interactive element for the guests, we created a small web app that takes the first name of a person and transforms it into a dish, including a reasoning why that name perfectly matches the dish 🙂 You can try it out here: Playground
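To give a sense of how such a playground can be wired up, here is a rough sketch of a name-to-dish app using Streamlit and the same kind of GPT prompt as for the menu. The prompt wording, model, and app structure are illustrative assumptions, not our original implementation:

```python
# Hedged sketch of a name-to-dish playground (not the original app).
# Assumes Streamlit and the OpenAI Python client (openai >= 1.0).
import streamlit as st
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

st.title("the byte – AI Playground")
name = st.text_input("Your first name")

if st.button("Create my dish") and name:
    prompt = (
        f"Invent a dish inspired by Frankfurt fusion cuisine for a guest named {name}. "
        "Explain in two sentences why this dish perfectly matches the name."
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    st.write(response.choices[0].message.content)
```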

Launch

The byte was officially announced at the S-O-U-P festival press conference in early May 2023. We also launched additional marketing activities through social media and our friends and family networks. As a result, the byte was fully booked for three days straight, and we got broad media coverage in various gastronomy magazines and the daily press. The guests were (mostly) amazed by our AI creations, and we received inquiries from other European restaurants and companies interested in exclusively booking the byte as an experience for their employees 🤩 Nailed it!

Closing and Next Steps

Creating the byte together with Jonathan and James was an outstanding experience. It further encouraged me that AI will transform not only our economy but all aspects of our daily lives. There is massive potential at the intersection of creativity, culture, and AI that is currently only being tapped.

We definitely want to continue the byte in Frankfurt and other cities in Germany and Europe. Moreover, James, Jonathan, and I are already thinking of new ways to bring AI into culture and society. Stay tuned! 😏

The byte was not just a restaurant; it was an immersive experience. We wanted to create something that had never been done before and did it – in just eight weeks. And that’s the inspiration I want to leave you with today:

Trying new things that move you out of your comfort zone is the ultimate source of growth. You never know what you’re capable of until you try. So, go out there and try something new, like building an AI-powered pop-up restaurant. Who knows, you might surprise yourself. Bon appétit!

Impressions

Media

FAZ: https://www.faz.net/aktuell/rhein-main/pop-up-resturant-the-byte-wenn-chatgpt-das-menue-schreibt-18906154.html

Genuss Magazin: https://www.genussmagazin-frankfurt.de/gastro_news/Kuechengefluester-26/Interview-James-Ardinast-KI-ist-die-Zukunft-40784.html

Frankfurt Tipp: https://www.frankfurt-tipp.de/ffm-aktuell/s/ugc/deutschlands-erstes-ai-restaurant-the-byte-in-frankfurt.html

Foodservice: https://www.food-service.de/maerkte/news/the-byte-erstes-ki-restaurant-vor-dem-start-55899?crefresh=1

Sebastian Heinz

The Hidden Risks of Black-Box Algorithms

Reading and evaluating countless resumes in the shortest possible time and making recommendations for suitable candidates – this is now possible with artificial intelligence in applicant management. This is because advanced AI technologies can efficiently analyze even large volumes of complex data. In HR management, this not only saves valuable time in the pre-selection process but also enables applicants to be contacted more quickly. Artificial intelligence also has the potential to make application processes fairer and more equitable.

However, real-world experience has shown that artificial intelligence is not always “fair”. A few years ago, for example, an Amazon recruiting algorithm stirred up controversy for discriminating against women when selecting candidates. Additionally, facial recognition algorithms have repeatedly led to incidents of discrimination against People of Color.

One reason for this is that complex AI algorithms independently calculate predictions and results based on the data fed into them. How exactly they arrive at a particular result is not initially comprehensible. This is why they are also known as black-box algorithms. In Amazon’s case, the AI determined suitable applicant profiles based on the current workforce, which was predominantly male, and thus made biased decisions. In a similar way, algorithms can reproduce stereotypes and reinforce discrimination.

Principles for Trustworthy AI

The Amazon incident shows that transparency is highly relevant in the development of AI solutions to ensure that they function ethically. This is why transparency is also one of the seven statworx Principles for trustworthy AI. The employees at statworx have collectively defined the following AI principles: Human-centered, transparent, ecological, respectful, fair, collaborative, and inclusive. These serve as orientations for everyday work with artificial intelligence. Universally applicable standards, rules, and laws do not yet exist. However, this could change in the near future.

The European Union (EU) has been discussing a draft law on the regulation of artificial intelligence for some time. Known as the AI Act, this draft has the potential to be a game-changer for the global AI industry. This is because it is not only European companies that are targeted by this draft law. All companies that offer AI systems on the European market, whose AI-generated output is used within the EU, or that operate AI systems for internal use within the EU would be affected. The requirements that an AI system must meet depend on its application.

Recruiting algorithms are likely to be classified as high-risk AI. Accordingly, companies would have to fulfill comprehensive requirements during the development, publication, and operation of the AI solution. Among other things, companies are required to comply with data quality standards, prepare technical documentation, and establish risk management. Violations may result in heavy fines of up to 6% of global annual sales. Therefore, companies should already start dealing with the upcoming requirements and their AI algorithms. Explainable AI methods (XAI) can be a useful first step. With their help, black-box algorithms can be better understood, and the transparency of the AI solution can be increased.

Unlocking the Black Box with Explainable AI Methods

XAI methods enable developers to better interpret the concrete decision-making processes of algorithms. This means that it becomes more transparent how an algorithm has formed patterns and rules and makes decisions. As a result, potential problems such as discrimination in the application process can be discovered and corrected. Thus, XAI not only contributes to greater transparency of AI but also favors its ethical use and thus increases the conformity of an AI with the upcoming AI Act.

Some XAI methods are even model-agnostic, i.e. applicable to any AI algorithm from decision trees to neural networks. The field of research around XAI has grown strongly in recent years, which is why there is now a wide variety of methods. However, our experience shows that there are large differences between different methods in terms of the reliability and meaningfulness of their results. Furthermore, not all methods are equally suitable for robust application in practice and for gaining the trust of external stakeholders. Therefore, we have identified our top 3 methods based on the following criteria for this blog post:

  1. Is the method model-agnostic, i.e. does it work for all types of AI models?
  2. Does the method provide global results, i.e. does it say anything about the model as a whole?
  3. How meaningful are the resulting explanations?
  4. How good is the theoretical foundation of the method?
  5. Can malicious actors manipulate the results or are they trustworthy?

Our Top 3 XAI Methods at a Glance

Using the above criteria, we selected three widely used and proven methods that are worth diving a bit deeper into: Permutation Feature Importance (PFI), SHAP Feature Importance, and Accumulated Local Effects (ALE). In the following, we explain how each of these methods work and what they are used for. We also discuss their advantages and disadvantages and illustrate their application using the example of a recruiting AI.

Efficiently Identify Influential Variables with Permutation Feature Importance

The goal of Permutation Feature Importance (PFI) is to find out which variables in the data set are particularly crucial for the model to make accurate predictions. In the case of the recruiting example, PFI analysis can shed light on what information the model relies on to make its decision. For example, if gender emerges as an influential factor here, it can alert the developers to potential bias in the model. In the same way, a PFI analysis creates transparency for external users and regulators. Two things are needed to compute PFI:

  1. An accuracy metric such as the error rate (proportion of incorrect predictions out of all predictions).
  2. A test data set that can be used to determine accuracy.

In the test data set, the values of one variable after another are randomly shuffled (permuted) so that the variable no longer provides the model with usable information. Then, the accuracy of the model is determined on the transformed test data set. From this, we conclude that the variables whose permutation reduces model accuracy the most are particularly important. Once all variables are analyzed and sorted, we obtain a visualization like Figure 1. Using our artificially generated sample data set, we can derive the following: work experience did not play a major role in the model, but ratings from the interview were influential.
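As an illustration of this procedure, the following sketch computes PFI with scikit-learn's permutation_importance on an artificially generated recruiting data set. The column names and the model are illustrative assumptions, not the data behind Figure 1:

```python
# Hedged sketch: Permutation Feature Importance on synthetic recruiting data.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 1000
X = pd.DataFrame({
    "interview_rating": rng.normal(size=n),   # illustrative column names
    "work_experience": rng.normal(size=n),
    "bachelor_grade": rng.normal(size=n),
    "master_grade": rng.normal(size=n),
})
# Synthetic hiring decision, driven mainly by the interview rating.
y = (X["interview_rating"] + 0.2 * X["master_grade"] + rng.normal(0, 0.5, n) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
model = RandomForestClassifier(random_state=42).fit(X_train, y_train)

# Shuffle each variable several times on the held-out test set and record
# how much the accuracy drops on average.
result = permutation_importance(
    model, X_test, y_test, scoring="accuracy", n_repeats=10, random_state=42
)
for name, imp in sorted(zip(X.columns, result.importances_mean), key=lambda t: -t[1]):
    print(f"{name:20s} mean accuracy drop: {imp:.3f}")
```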


Figure 1 – Permutation Feature Importance using the example of a recruiting AI (data artificially generated).

A great strength of PFI is that it follows a clear mathematical logic. The correctness of its explanation can be proven by statistical considerations. Furthermore, there are hardly any manipulable parameters in the algorithm with which the results could be deliberately distorted. This makes PFI particularly suitable for gaining the trust of external observers. Finally, the computation of PFI is very resource efficient compared to other XAI methods.

One weakness of PFI is that it can provide misleading explanations under some circumstances. If a variable is assigned a low PFI value, it does not always mean that the variable is unimportant to the issue. For example, if the bachelor’s degree grade has a low PFI value, this may simply be because the model can look at the master’s degree grade instead, since the two are usually similar. Such correlated variables can complicate the interpretation of the results. Nonetheless, PFI is an efficient and useful method for creating transparency in black-box models.

Strengths:
  • Little room for malicious manipulation of results
  • Efficient computation

Weaknesses:
  • Does not consider interactions between variables

Uncover Complex Relationships with SHAP Feature Importance

SHAP Feature Importance is a method for explaining black box models based on game theory. The goal is to quantify the contribution of each variable to the prediction of the model. As such, it closely resembles Permutation Feature Importance at first glance. However, unlike PFI, SHAP Feature Importance provides results that can account for complex relationships between multiple variables.

SHAP is based on a concept from game theory: Shapley values. Shapley values are a fairness criterion that assigns a weight to each variable that corresponds to its contribution to the outcome. This is analogous to a team sport, where the winning prize is divided fairly among all players, according to their contribution to the victory. With SHAP, we can look at every individual observation in the data set and analyze what contribution each variable has made to the prediction of the model.

If we now determine the average absolute contribution of a variable across all observations in the data set, we obtain the SHAP Feature Importance. Figure 2 illustrates the results of this analysis. The similarity to the PFI is evident, even though the SHAP Feature Importance only places the rating of the job interview in second place.
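The following sketch shows how this average absolute contribution can be computed with the shap package on the same kind of synthetic recruiting data. The model, the column names, and the choice of TreeExplainer (which explains the model's raw margin output) are illustrative assumptions:

```python
# Hedged sketch: SHAP Feature Importance as the mean absolute Shapley value
# per variable, on synthetic recruiting data (illustrative, not Figure 2).
import numpy as np
import pandas as pd
import shap  # assumes the shap package is installed
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(42)
n = 1000
X = pd.DataFrame({
    "interview_rating": rng.normal(size=n),
    "work_experience": rng.normal(size=n),
    "bachelor_grade": rng.normal(size=n),
    "master_grade": rng.normal(size=n),
})
y = (X["interview_rating"] + 0.2 * X["master_grade"] + rng.normal(0, 0.5, n) > 0).astype(int)

model = GradientBoostingClassifier(random_state=42).fit(X, y)

# One Shapley value per observation and variable.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# SHAP Feature Importance = average absolute contribution per variable.
importance = np.abs(shap_values).mean(axis=0)
for name, imp in sorted(zip(X.columns, importance), key=lambda t: -t[1]):
    print(f"{name:20s} mean |SHAP|: {imp:.3f}")
# shap.summary_plot(shap_values, X, plot_type="bar") would draw a chart like Figure 2.
```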


Figure 2 – SHAP Feature Importance using the example of a recruiting AI (data artificially generated).

A major advantage of this approach is the ability to account for interactions between variables. By simulating different combinations of variables, it is possible to show how the prediction changes when two or more variables vary together. For example, the final grade of a university degree should always be considered in the context of the field of study and the university. In contrast to the PFI, the SHAP Feature Importance takes this into account. Also, Shapley Values, once calculated, are the basis of a wide range of other useful XAI methods.

However, one weakness of the method is that it is more computationally expensive than PFI. Efficient implementations are available only for certain types of AI algorithms like decision trees or random forests. Therefore, it is important to carefully consider whether a given problem requires a SHAP Feature Importance analysis or whether PFI is sufficient.

Strengths:
  • Little room for malicious manipulation of results
  • Considers complex interactions between variables

Weaknesses:
  • Calculation is computationally expensive

Focus in on Specific Variables with Accumulated Local Effects

Accumulated Local Effects (ALE) is a further development of the commonly used Partial Dependence Plots (PDP). Both methods aim at simulating the influence of a certain variable on the prediction of the model. This can be used to answer questions such as “Does the chance of getting a management position increase with work experience?” or “Does it make a difference if I have a 1.9 or a 2.0 on my degree certificate?”. Therefore, unlike the previous two methods, ALE makes a statement about the model’s decision-making, not about the relevance of certain variables.

In the simplest case, the PDP, a sample of observations is selected and used to simulate what effect, for example, an isolated increase in work experience would have on the model prediction. Isolated means that none of the other variables are changed in the process. The average of these individual effects over the entire sample can then be visualized (Figure 3, above). Unfortunately, PDP’s results are not particularly meaningful when variables are correlated. For example, let us look at university degree grades. PDP simulates all possible combinations of grades in bachelor’s and master’s programs. Unfortunately, this results in cases that rarely occur in the real world, e.g., an excellent bachelor’s degree and a terrible master’s degree. The PDP is blind to such unrealistic cases, and its results may suffer accordingly.

ALE analysis, on the other hand, attempts to solve this problem by using a more realistic simulation that adequately represents the relationships between variables. Here, the variable under consideration, e.g., bachelor’s grade, is divided into several sections (e.g., 6.0-5.1, 5.0-4.1, 4.0-3.1, 3.0-2.1, and 2.0-1.0). Now, the simulation of the bachelor’s grade increase is performed only for individuals in the respective grade group. This prevents unrealistic combinations from being included in the analysis. An example of an ALE plot can be found in Figure 3 (below). Here, we can see that ALE identifies a negative impact of work experience on the chance of employment, which PDP was unable to find. Is this behavior of the AI desirable? For example, does the company want to hire young talent in particular? Or is there perhaps an unwanted age bias behind it? In both cases, the ALE plot helps to create transparency and to identify undesirable behavior.
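To make the binning logic tangible, here is a hand-rolled, simplified ALE sketch for a single variable (quantile bins, uncentered accumulation). It is meant only to illustrate the idea described above, not to replace a dedicated ALE implementation; all data and column names are illustrative:

```python
# Hedged sketch: simplified one-dimensional ALE on synthetic recruiting data.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

def ale_1d(model, X, feature, n_bins=10):
    """Simplified accumulated local effects of `feature` on the prediction."""
    # Bin edges from quantiles, so each bin only contains realistic observations.
    edges = np.unique(np.quantile(X[feature], np.linspace(0, 1, n_bins + 1)))
    local_effects = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = X[(X[feature] >= lo) & (X[feature] <= hi)]
        if in_bin.empty:
            local_effects.append(0.0)
            continue
        # Move only the observations of this bin to the lower and upper bin edge
        # and record the average prediction difference (the local effect).
        X_lo, X_hi = in_bin.copy(), in_bin.copy()
        X_lo[feature], X_hi[feature] = lo, hi
        local_effects.append(float(np.mean(model.predict(X_hi) - model.predict(X_lo))))
    # Accumulate the local effects across the bins.
    return edges, np.cumsum(local_effects)

rng = np.random.default_rng(42)
n = 1000
X = pd.DataFrame({
    "interview_rating": rng.normal(size=n),
    "work_experience": rng.uniform(0, 20, size=n),
})
# Synthetic target with a slightly negative effect of work experience.
y = X["interview_rating"] - 0.02 * X["work_experience"] + rng.normal(0, 0.3, n)
model = RandomForestRegressor(random_state=42).fit(X, y)

edges, ale = ale_1d(model, X, "work_experience")
print(ale)  # a decreasing curve hints at a negative effect, as in Figure 3
```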


Figure 3 – Partial Dependence Plot and Accumulated Local Effects using a Recruiting AI as an example (data artificially generated).

In summary, ALE is a suitable method to gain insight into the influence of a certain variable on the model prediction. This creates transparency for users and even helps to identify and fix unwanted effects and biases. A disadvantage of the method is that ALE can only meaningfully analyze one or two variables together in the same plot. Thus, to understand the influence of all variables, multiple ALE plots must be generated, which makes the analysis less compact than PFI or SHAP Feature Importance.

Strengths:
  • Considers complex interactions between variables
  • Little room for malicious manipulation of results

Weaknesses:
  • Only one or two variables can be analyzed in one ALE plot

Build Trust with Explainable AI Methods

In this post, we presented three Explainable AI methods that can help make algorithms more transparent and interpretable. This also favors meeting the requirements of the upcoming AI Act. Even though it has not yet been passed, we recommend starting to create transparency and traceability for AI models based on the draft law as soon as possible. Many Data Scientists have little experience in this field and need further training and time to familiarize themselves with XAI concepts before they can identify relevant algorithms and implement effective solutions. Therefore, it makes sense to get acquainted with our recommended methods preemptively.

With Permutation Feature Importance (PFI) and SHAP Feature Importance, we demonstrated two techniques to determine the relevance of certain variables to the prediction of the model. In summary, SHAP Feature Importance is a powerful method for explaining black-box models that considers the interactions between variables. PFI, on the other hand, is easier to implement but less powerful for correlated data. Which method is most appropriate in a particular case depends on the specific requirements.

We also introduced Accumulated Local Effects (ALE), a technique that can analyze and visualize exactly how an AI responds to changes in a specific variable. The combination of one of the two feature importance methods with ALE plots for selected variables is particularly promising. This can provide a theoretically sound and easily interpretable overview of the model – whether it is a decision tree or a deep neural network.

The application of Explainable AI is a worthwhile investment – not only to build internal and external trust in one’s own AI solutions. Rather, we expect that the skillful use of interpretation-enhancing methods can help avoid impending fines due to the requirements of the AI Act, prevent legal consequences, and protect those affected from harm – as in the case of incomprehensible recruiting software.
Our free AI Act Quick Check helps you assess whether any of your AI systems could be affected by the AI Act: https://www.statworx.com/en/ai-act-tool/

Sources & Further Information:

https://www.faz.net/aktuell/karriere-hochschule/buero-co/ki-im-bewerbungsprozess-und-raus-bist-du-17471117.html (last opened 03.05.2023)
https://t3n.de/news/diskriminierung-deshalb-platzte-amazons-traum-vom-ki-gestuetzten-recruiting-1117076/ (last opened 03.05.2023)
For more information on the AI Act: https://www.statworx.com/en/content-hub/blog/how-the-ai-act-will-change-the-ai-industry-everything-you-need-to-know-about-it-now/
Statworx principles: https://www.statworx.com/en/content-hub/blog/statworx-ai-principles-why-we-started-developing-our-own-ai-guidelines/
Christoph Molnar: Interpretable Machine Learning: https://christophm.github.io/interpretable-ml-book/

Max Hilsdorf, Julia Rettig

Image Sources
AdobeStock 566672394 – by TheYaksha

Last December, the European Council published a dossier outlining the Council’s preliminary position on the draft law known as the AI Act. This new law is intended to regulate artificial intelligence (AI) and is thus set to become a game-changer for the entire tech industry. In the following, we have compiled the most important information from the dossier, which is the current official source on the planned AI Act at the time of publication.

A legal framework for AI

Artificial intelligence has enormous potential to improve and ease all our lives. For example, AI algorithms already support early cancer detection or translate sign language in real time, thereby eliminating language barriers. But in addition to the positive effects, there are risks, as the latest deepfakes of Pope Francis or the Cambridge Analytica scandal illustrate.

The European Union (EU) is currently drafting legislation to regulate artificial intelligence and mitigate its risks. With this, the EU wants to protect consumers and ensure the ethically acceptable use of artificial intelligence. The so-called “AI Act” is still in the legislative process but is expected to be passed in 2023 – before the end of the current legislative period. Companies will then have two years to implement the legally binding requirements. Violations will be punished with fines of up to 6% of global annual turnover or €30,000,000 – whichever is higher. Therefore, companies should start addressing the upcoming legal requirements now.

Legislation with global impact

The planned AI Act is based on the “market location principle,” meaning that not only European companies will be affected by the new regulation. Thus, all companies that offer AI systems on the European market or operate them for internal use within the EU are affected by the AI Act – with a few exceptions. Private use of AI remains untouched by the regulation so far.

Which AI systems are affected?

The definition of AI determines which systems will be affected by the AI Act. For this reason, the AI definition of the AI Act has been the subject of controversial debate in politics, business, and society for a considerable time. The initial definition was so broad that many “normal” software systems would also have been affected. The current proposal defines AI as any system developed through machine learning or logic- and knowledge-based approaches. It remains to be seen whether this definition will ultimately be adopted.

7 Principles for trustworthy AI

The “seven principles for trustworthy AI” are the most important basis of the AI Act. A group of experts from research, the digital economy, and associations developed them on behalf of the European Commission. They include not only technical aspects but also social and ethical factors that can be used to classify the trustworthiness of an AI system:

  1. Human action & oversight: decision-making should be supported without undermining human autonomy.
  2. Technical Robustness & security: accuracy, reliability, and security must be preemptively ensured.
  3. Data privacy & data governance: handling of data must be legally secure and protected.
  4. Transparency: interaction with AI must be clearly communicated, as must its limitations and boundaries.
  5. Diversity, non-discrimination & fairness: Avoidance of unfair bias must be ensured throughout the entire AI lifecycle.
  6. Environmental & societal well-being: AI solutions should have as positive an impact on the environment and society as possible.
  7. Accountability: responsibilities for the development, use, and maintenance of AI systems must be defined.

Based on these principles, the AI Act’s risk-based approach was developed, allowing AI systems to be classified into one of four risk classes: low, limited, high, and unacceptable risk.

Four risk classes for trustworthy AI

The risk class of an AI system indicates the extent to which an AI system threatens the principles of trustworthy AI and which legal requirements the system must fulfill – provided the system is fundamentally permissible. This is because, in the future, not all AI systems will be allowed on the European market. For example, most “social scoring” techniques are assessed as “unacceptable” and will not be allowed by the new law.

For the other three risk classes, the rule of thumb is: the higher the risk of an AI system, the higher the legal requirements for it. Companies that offer or operate high-risk systems will have to meet the most requirements. For example, AI used to operate critical (digital) infrastructure or embedded in medical devices falls into this class. To bring such systems to market, companies will have to observe high quality standards for the data used, set up a risk management system, affix a CE mark, and more.

AI systems in the “limited risk” class are subject to information and transparency obligations. Accordingly, companies must inform users of chatbots, emotion recognition systems, or deep fakes about the use of artificial intelligence. Predictive maintenance or spam filters are two examples of AI systems that fall into the lowest-risk category “low risk”. Companies that exclusively offer or use such AI solutions will hardly be affected by the upcoming AI Act. There are no legal requirements for these applications yet.
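The risk-based logic above can be pictured as a simple lookup table. The Python sketch below only restates the examples and obligations named in this article; the structure and names are our own illustration and no substitute for a legal assessment of a concrete system:

```python
# Illustrative only: maps the example systems named in this article to the
# draft AI Act's four risk classes and the consequences mentioned above.
RISK_CLASSES = {
    "unacceptable": {
        "examples": ["social scoring"],
        "consequence": "prohibited on the European market",
    },
    "high": {
        "examples": ["critical (digital) infrastructure", "medical devices"],
        "consequence": "data quality standards, risk management system, CE mark, ...",
    },
    "limited": {
        "examples": ["chatbots", "emotion recognition", "deep fakes"],
        "consequence": "information and transparency obligations",
    },
    "low": {
        "examples": ["spam filters", "predictive maintenance"],
        "consequence": "no specific legal requirements planned so far",
    },
}


def lookup_risk_class(use_case: str) -> str:
    """Return the risk class whose example list mentions the given use case."""
    for risk_class, info in RISK_CLASSES.items():
        if any(use_case.lower() in example for example in info["examples"]):
            return risk_class
    return "unknown - requires individual assessment"


print(lookup_risk_class("medical devices"))  # -> "high"
print(lookup_risk_class("spam filter"))      # -> "low"
```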

What companies can do for now

Even though the AI Act is still in the legislative process, companies should act now. The first step is to clarify how they will be affected by the AI Act. To help you do this, we have developed the AI Act Quick Check. With this free tool, AI systems can be quickly assigned to a risk class, and the requirements for the system can be derived. Finally, it can be used as a basis to estimate how extensive implementing the AI Act will be in your own company and to take initial measures.

AI Act Tool     AI Act Fact Sheet

 

Benefit from our expertise!

Of course, we are happy to support you in evaluating and solving company-specific challenges related to the AI Act. Please do not hesitate to contact us!

     


      Julia Rettig

    Introduction

    Forecasts are of central importance in many industries. Whether it’s predicting resource consumption, estimating a company’s liquidity, or forecasting product sales in retail, forecasts are an indispensable tool for making successful decisions. Despite their importance, many forecasts still rely primarily on the prior experience and intuition of experts. This makes it difficult to automate the relevant processes, potentially scale them, and provide efficient support. Furthermore, experts may be biased due to their experiences and perspectives or may not have all the relevant information necessary for accurate predictions.

    These reasons have led to the increasing importance of data-driven forecasts in recent years, and the demand for such predictions is accordingly strong.

    At statworx, we have already successfully implemented a variety of projects in the field of forecasting. As a result, we have faced many challenges and become familiar with numerous industry-specific use cases. One of our internal working groups, the Forecasting Cluster, is particularly passionate about the world of forecasting and continuously develops their expertise in this area.

    We now aim to combine the experience we have gathered into a user-friendly tool that allows anyone to obtain an initial assessment for a specific forecasting use case, depending on the data and requirements. Both customers and employees should be able to use the tool quickly and easily to receive methodological recommendations. Our long-term goal is to make the tool publicly accessible. However, we are first testing it internally to optimize its functionality and usefulness. We place special emphasis on ensuring that the tool is intuitive to use and provides easily understandable outputs.

    Although our Recommender Tool is still in the development phase, we would like to provide an exciting sneak peek.

    Common Challenges

    Model Selection

    In the field of forecasting, there are various modeling approaches. We differentiate between three central approaches:

    1. Time Series Models
    2. Tree-based Models
    3. Deep Learning Models

    There are many criteria that can be used when selecting a model. For univariate time series data with strong seasonality and trends, classical time series models such as (S)ARIMA and ETS are appropriate. On the other hand, for multivariate time series data with potentially complex relationships and large amounts of data, deep learning models are a good choice. Tree-based models like LightGBM offer greater flexibility compared to time series models, are well-suited for interpretability due to their architecture, and tend to have lower computational requirements compared to deep learning models.
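    As a rough illustration of this rule of thumb, the toy function below maps a few data characteristics to a model family. The function name, inputs, and the order of the checks are illustrative assumptions and not part of our Recommender Tool:

```python
def recommend_model_family(univariate: bool,
                           strong_seasonality_or_trend: bool,
                           complex_relationships: bool,
                           interpretability_required: bool) -> str:
    """Toy heuristic reflecting the rule of thumb described in the text."""
    if univariate and strong_seasonality_or_trend:
        return "classical time series model, e.g. (S)ARIMA or ETS"
    if interpretability_required or not complex_relationships:
        return "tree-based model, e.g. LightGBM"
    return "deep learning model"


# Example: multivariate data where interpretability matters
print(recommend_model_family(univariate=False,
                             strong_seasonality_or_trend=False,
                             complex_relationships=True,
                             interpretability_required=True))
# -> "tree-based model, e.g. LightGBM"
```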

    Seasonality

    Seasonality refers to recurring patterns in a time series that occur at regular intervals (e.g. daily, weekly, monthly, or yearly). Including seasonality in the modeling is important to capture these regular patterns and improve the accuracy of forecasts. Time series models such as SARIMA, ETS, or TBATS can explicitly account for seasonality. For tree-based models like LightGBM, seasonality can only be considered by creating corresponding features, such as dummies for relevant seasonalities. One way to explicitly account for seasonality in deep learning models is by using sine and cosine functions. It is also possible to use a deseasonalized time series. This involves removing the seasonality initially, followed by modeling on the deseasonalized time series. The resulting forecasts are then supplemented with seasonality by applying the process used for deseasonalization in reverse. However, this process adds another level of complexity, which is not always desirable.
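    For example, sine and cosine features for a given seasonal period can be generated with a few lines of Python. The helper below is a minimal sketch; the function and column names are our own:

```python
import numpy as np
import pandas as pd


def fourier_features(index: pd.DatetimeIndex, period: float, order: int) -> pd.DataFrame:
    """Sine/cosine terms that let tree-based or neural models pick up seasonality."""
    t = np.arange(len(index))
    feats = {}
    for k in range(1, order + 1):
        feats[f"sin_{period}_{k}"] = np.sin(2 * np.pi * k * t / period)
        feats[f"cos_{period}_{k}"] = np.cos(2 * np.pi * k * t / period)
    return pd.DataFrame(feats, index=index)


idx = pd.date_range("2023-01-01", periods=365, freq="D")
weekly = fourier_features(idx, period=7, order=3)        # weekly seasonality
yearly = fourier_features(idx, period=365.25, order=5)   # yearly seasonality
```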

    Hierarchical Data

    Especially in the retail industry, hierarchical data structures are common as products can often be represented at different levels of granularity. This frequently results in the need to create forecasts for different hierarchies that do not contradict each other. The aggregated forecasts must therefore match the disaggregated forecasts. There are various approaches to this. With top-down and bottom-up methods, forecasts are created at one level and then disaggregated or aggregated downstream. Reconciliation methods such as Optimal Reconciliation involve creating forecasts at all levels and then reconciling them to ensure consistency across all levels.
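    A minimal sketch of the bottom-up idea with pandas: forecasts are created at the product level and then summed upwards, so the hierarchy levels are consistent by construction. The figures below are made up for illustration:

```python
import pandas as pd

# Bottom-up reconciliation: forecast at the most granular level (product)
# and aggregate upwards so that hierarchy levels cannot contradict each other.
forecasts = pd.DataFrame({
    "category": ["A", "A", "B"],
    "product":  ["A1", "A2", "B1"],
    "forecast": [120.0, 80.0, 60.0],
})

category_level = forecasts.groupby("category", as_index=False)["forecast"].sum()
total_level = forecasts["forecast"].sum()

print(category_level)  # A: 200, B: 60
print(total_level)     # 260 -- consistent with the lower levels by construction
```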

    Cold Start

    In a cold start, the challenge is to forecast products that have little or no historical data. In the retail industry, this usually refers to new product introductions. Since it is not possible to train a model for these products due to the lack of history, alternative approaches must be used. A classic approach to handling a cold start is to rely on expert knowledge. Experts can provide initial estimates of demand, which can serve as a starting point for forecasting. However, this approach can be highly subjective and is difficult to scale. Alternatively, similar products or potential predecessor products can be used as a reference; products can be grouped based on product categories or with clustering algorithms such as K-Means, as sketched below. Using cross-learning models trained on many products represents a scalable option.
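    The grouping idea can be sketched with scikit-learn’s K-Means: a new product is assigned to the cluster of the most similar existing products, whose demand history can then serve as a proxy. The product attributes and values below are invented for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical product attributes (e.g. price, size, margin) used to find
# similar existing products for a newly introduced one.
existing_products = np.array([
    [9.99, 0.5, 0.20],
    [10.49, 0.6, 0.25],
    [49.99, 2.0, 0.40],
    [52.00, 2.2, 0.35],
])
new_product = np.array([[11.20, 0.55, 0.22]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(existing_products)
cluster_of_new_product = kmeans.predict(new_product)[0]

# The average demand of the products in this cluster could serve as a first
# forecast for the new product until enough history is available.
print(cluster_of_new_product)
```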

    Recommender Concept

    With our Recommender Tool, we aim to address different problem scenarios to enable the most efficient development process. It is an interactive tool where users provide inputs based on their objectives or requirements and the characteristics of the available data. Users can also prioritize certain requirements, and the output will reflect those priorities accordingly. Based on these inputs, the tool generates methodological recommendations that best cover the solution requirements, given the available data characteristics. Currently, the outputs consist of a purely content-based representation of the recommendations, providing concrete guidelines for central topics such as model selection, pre-processing, and feature engineering. The following example provides an idea of the conceptual approach:

    The output presented here is based on a real project where the implementation in R and the possibility of local interpretability were of central importance. At the same time, new products were frequently introduced, which should also be forecasted by the developed solution. To achieve this goal, several global models were trained using Catboost. Thanks to this approach, over 200 products could be included in the training. Even for newly introduced products where no historical data was available, forecasts could be generated. To ensure the interpretability of the forecasts, SHAP values were used. This made it possible to clearly explain each prediction based on the features used.
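    The project itself was implemented in R; the following Python sketch only illustrates the general pattern of a global Catboost model with SHAP-based explanations, using synthetic stand-in data:

```python
import numpy as np
import pandas as pd
import shap
from catboost import CatBoostRegressor, Pool

# Synthetic feature matrix for many products at once (a "global" model):
# a lag feature, a calendar feature, and a categorical product id.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "product_id": rng.choice(["P1", "P2", "P3"], size=500),
    "lag_1": rng.normal(100, 10, 500),
    "month": rng.integers(1, 13, 500),
})
y = X["lag_1"] * 0.8 + rng.normal(0, 5, 500)

train_pool = Pool(X, y, cat_features=["product_id"])
model = CatBoostRegressor(iterations=200, depth=6, verbose=0)
model.fit(train_pool)

# SHAP values make every single prediction explainable from its features.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(Pool(X, y, cat_features=["product_id"]))
print(shap_values.shape)  # one row of feature attributions per observation
```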

    Summary

    The current development is focused on creating a tool optimized for forecasting. Through its use, we aim to increase efficiency in forecasting projects. By combining our gathered experience and expertise, the tool will offer guidelines for modeling, pre-processing, and feature engineering, among other topics. It will be designed to be used by both customers and employees to quickly and easily obtain estimates and methodological recommendations. An initial test version will be available soon for internal use, but the tool is ultimately intended to be made accessible to external users as well. In addition to the technical output currently in development, a less technical output will also be available. The latter will focus on the most important aspects and their associated efforts; in particular, it will cover the business perspective in the form of expected effort and potential trade-offs between effort and benefit.

     

     

    Benefit from our forecasting expertise!

    If you need support in addressing the challenges in your forecasting projects or have a forecasting project planned, we are happy to provide our expertise and experience to assist you.

       

      Image Source: AdobeStock 83282923 – Mego-studio

      Marlon Schumacher

      Recently, while working at statworx, I experienced a sense of déjà vu regarding the topic of data culture. As the Head of the AI Academy, my main responsibility is to convey my enthusiasm for artificial intelligence, programming, data, and cloud computing to my clients. This often means conveying my passion for these subjects to individuals who may have limited technical experience and whose interests do not typically revolve around transformer models and functional programming.

      This tension reminded me of something that happened before my professional career.

      All beginnings are difficult

      Prior to my passion for data and artificial intelligence, I was already a very enthusiastic (hobby) musician – with a special passion for the genre of Death Metal (Note: I don’t want to bother interested readers with more detailed genre descriptions here 😉). During my studies, I was a singer and guitarist in a Death Metal band. For those of you who are not familiar with Death Metal, it may seem like all those “off-key notes” and “growling” don’t require real skills – but let me assure you, it takes a lot of talent, and many people in this genre have years of hard work behind them.

      https://youtu.be/WGnXD0DME30?t=25

      When you listen to or, even better, watch this music, you are quickly impressed by how fast the musicians today race across their guitar fretboards. However, it’s essential to recognize that every musician faces a challenging beginning. Those who have learned an instrument can attest to this reality. Initially, it can be demanding to navigate through prescribed teaching materials and maintain the necessary drive to acquire techniques, with the ultimate goal of performing a decent piece of music. At first, it was very difficult for me to get excited about notes, rhythms, and finger exercises or to stay on task with the appropriate perseverance.

      Generated with DALL-E. Prompt: death metal concert with view from stage to crowd, guitar in the foreground with bokeh, photorealistic style

      Let’s get creative

      At the beginning, the songs were not particularly good or technically demanding, as I had not yet learned any significant guitar or singing skills. But then something happened: my motivation kicked in! I realized how these techniques and skills allowed me to express my own feelings and thoughts. It was as if I could create my own products.

      I wrote more and more songs and almost unnoticed learned important skills on the fretboard. It became my personal mission to stoically master all the necessary finger exercises in order to be able to play ever more complex structures. At the same time, I became part of bands and a local music scene where we inspired each other at concerts and kept motivating each other to write more complex and better material. Here, we also inspired more, mostly younger, music fans to try their hand at this music. They joined in, listened, and thought, “I want to be able to do that too!” So they started writing their own songs, learning their own techniques, and becoming part of a creative cultural scene.

      Skills alone are not everything

      One may wonder what this little excursion has to do with data culture. The above theme has also been reflected in my work with data culture. In our AI Academy, we mainly focus on topics related to data literacy and related skills. Initially, I made the same mistake in my thinking that hindered me when learning my instrument: skills are everything – or with skills, everything else will somehow come.

      I assumed that the skills taught are so important, so relevant, so productive, and especially so attractive to learners that after learning these skills, everything else will automatically follow. But that’s not the case. Over time, through our training, we have reached an ever-increasing circle of people, including those with different core competencies. These are people who cannot or do not want to be evangelists or enthusiasts for matrix algebra in their main activity.

      The following questions are always at the forefront here:

      “What does this have to do with me?”
      “What does this have to do with my work?”
      “How could this be valuable for me?”

      And just like in my story about songwriting, playing concerts, or exchanging ideas within a music scene, I had the same experience with data and upskilling. Some of our most successful training formats, the AI Basics Workshop and the Data Literacy Workshop, make the most important topics and learnings around data and AI usable for one’s own company – with the opportunity to generate one’s own ideas for using these technologies together with experienced AI experts. This is not only about learning how AI works, but also about interactive and guided exploration:

      “What does this have to do with me?”
      “How can I create value for my environment with this?”
      “What problems does AI need to solve for me?”

      Motivating ideas

      On the one hand, we noticed how enthusiastically training participants engaged with the content, and how the mood in our courses shifted much more towards a growth mindset:

      Not focusing on what I can already do, but rather asking what I still want to learn and what I want to achieve.

      On the other hand, our courses quickly became popular with our customers’ employees. We were, of course, pleased with the word-of-mouth that contributed to the recognition of the high course quality and exciting topics. However, we did not anticipate that the ideas generated in the course would develop their own dynamic and, in many cases, generate even greater impact in the company than the course itself.

      Similar to concerts in the death metal scene, new enthusiasts could be won over here as well. They realized that the person who now successfully drives a use case forward was also just starting out with data and AI not too long ago.

      “If others have achieved that, I want to try it too, and I’ll figure out how to learn the finger skills on the way.”

      Can – Do – Want – A constant cycle in the organization

      And so three important dimensions came together for us.

      1. Can – Mastery of skills such as good guitar playing, project management in data and AI, programming, or basic knowledge in data analysis.
      2. Do – Regular and ritualized work on the topic, conducting initial use cases, and exchanging ideas with others to learn the language interactively.
      3. Want – Creating sustainable motivation to achieve goals through initial successes, inspiring exchange, and a clear vision for the potential impact and value generation in the company.

      The three dimensions form a cycle in which each dimension depends on the others and has a positive effect on the other dimensions. If I improve my guitar playing skills, it will be easier for me to develop new ideas and share them successfully with others. This creates further motivation to tackle more skills and challenges.

      That is why data culture and death metal have a lot in common for me.

      Let’s connect if you’re interested in diving deeper into the topic of data culture, including its three dimensions “Can”, “Do”, and “Want”.

       

      More about AI Academy

      Image Source:

      AdobeStock 480687393 zamuruev

        David Schlepps

      A data culture is a key factor for effective data utilization

      With the increasing digitization, the ability to use data effectively has become a crucial success factor for businesses. This way of thinking and acting is often referred to as data culture and plays an important role in transforming a company into a data-driven organization. By promoting a data culture, businesses can benefit from the flexibility of fact-based decision-making and fully leverage the potential of their data. Such a culture enables faster and demonstrably better decisions and embeds data-driven innovation within the company.

      Although the necessity and benefits of a data culture appear obvious, many companies still struggle to establish one. According to a study by New Vantage Partners, only 20% of companies have successfully developed a data culture so far. Furthermore, over 90% of the surveyed companies describe cultural change as the biggest hurdle on the way to becoming a data-driven company.

      A data culture fundamentally changes the way of working

      The causes of this challenge are diverse, and the necessary changes permeate almost all aspects of everyday work. In an effective data culture, every employee preferably uses data and data analysis for decision-making and gives priority to data and facts over individual “gut feeling.” This way of thinking promotes the continuous search for ways to use data to identify competitive advantages, open up new revenue streams, optimize processes, and make better predictions. By adopting a data culture, companies can fully leverage the potential of their data and drive innovation throughout the organization. This requires recognizing data as an important driver of decision-making and innovation. This ideal places new demands on individual employee behavior and, in turn, requires targeted support of this behavior through suitable conditions such as technical infrastructure and organizational processes.

      Three factors significantly shape the data culture

      To anchor a data culture sustainably within a company, three factors are crucial:

      1. Can | Skills
      2. Want | Attitude
      3. Do | Actions

      statworx uses these three factors to make the abstract concept of data culture tangible and to initiate the necessary changes in a targeted way. It is crucial to give equal attention to all factors and to consider them as holistically as possible. Initiatives for cultural development often limit themselves to the aspect of attitude and attempt to anchor specific values in isolation from the other influencing factors. These initiatives usually fail because the reality of the company – its processes, lived rituals, practices, and values – works against them and actively prevents the culture from taking hold.

      To provide an overview, we have summarized the three factors of data culture in a framework.

      1. Can: Skills form the basis for effective data utilization

      Skills and competencies are the foundation for effective data management. These include both the methodological and technical skills of employees, as well as the organization’s ability to make data usable.

      Ensuring data availability is particularly important for data usability. The “FAIR” principles – Findable, Accessible, Interoperable, Reusable – point to the essential properties data should have; they can be supported by suitable technologies, knowledge management, and appropriate governance.

      At the level of employee skills, the focus is on data literacy – the ability to understand and effectively use data to make informed decisions. This includes a basic understanding of data types and structures, as well as collection and analysis methods. Data literacy also involves the ability to ask the right questions, interpret data correctly, and identify patterns and trends. The relevant competencies can be developed through upskilling, targeted workforce planning, and hiring data experts.

      2. Want: A data culture can only flourish in a suitable value context.

      The second factor – Want – deals with the attitudes and intentions of employees and the organization as a whole towards the use of data. For this, both the beliefs and values of individuals and those of the community within the company must be addressed. The following aspects are of central importance for a data culture:

      • Collaboration & community instead of competition & selective partnerships
      • Transparency & sharing instead of information concealment & data hoarding
      • Pilot projects & experiments instead of theoretical assessments
      • Openness & willingness to learn instead of pettiness & rigid thinking
      • Data as a central decision-making basis instead of individual opinion & gut feeling

      Example: Company without a data culture

      On an individual level, an employee is convinced that exclusive knowledge and data can provide an advantage. The person has also learned within the organization that this behavior leads to strategic advantages or opportunities for personal positioning, and has been rewarded for such behavior by superiors in the past. The person is therefore convinced that it is absolutely sensible and advantageous to keep data for oneself or within one’s own team and not share it with other departments. The competitive thinking and tendency towards secrecy are firmly anchored as a value.

      In general, behavior like that described in the example restricts transparency throughout the entire organization and thereby slows it down. If not everyone has the same information, it is difficult to make the best possible decision for the entire company. Only through openness and collaboration can the true value of data in the company be realized. A data-driven company is based on a culture of collaboration, sharing, and learning. When people are encouraged to exchange their ideas and insights, better decisions can be made.

      Declarations of intent alone, such as mission statements and manifestos without tangible measures, will change little in the attitude of employees. The big challenge is to anchor the values sustainably and to make them the guiding principle for all employees – one that is actively lived in everyday business. If this succeeds, the organization is well on its way to creating the data mindset required to bring an effective and successful data culture to life. Our transformation framework can help to establish these values and make them visible.

      We recommend building a data culture step by step, because even small experimental projects create added value, serve as positive examples, and build trust. The practical testing of an innovation, even if only on a limited scale, usually brings faster and better results than a theoretical assessment. Ultimately, it is about placing the value of data at the forefront.

      3. Do: Behavior creates the framework and is simultaneously the visible result of a data culture.

      The two factors mentioned above ultimately aim to ensure that employees and the organization as a whole adapt their behavior. Only an actively lived data culture can be successful. Therefore, everyday behavior – Do – plays a central role in establishing a data culture.

      The behavior of an organization can be examined and influenced primarily in two dimensions.

      These factors are:

      1. Activities and rituals
      2. Structural elements of the organization

      Activities and rituals

      Activities and rituals refer to the daily collaboration between employees of an organization. They manifest themselves in all forms of collaboration, from meeting procedures to handling feedback and risks to the annual Christmas party. It is crucial which patterns the collaboration follows and which behavior is rewarded or punished.

      Experience shows that teams that are already familiar with agile methods such as Scrum find the transition to data-driven decisions easier. Teams that follow strict hierarchies and act in a risk-averse manner, on the other hand, have more difficulty overcoming this challenge. One reason for this is that agile ways of working reinforce collaboration between different roles and thus create the foundation for a productive work environment. In this context, the role of leadership, especially senior leadership, is crucial. The individuals at C-level must lead by example from the beginning, introduce rituals and activities, and act together as the central driver of the transformation.

      Structural elements of the organization

      While activities and rituals emerge from teams and are not always predetermined, the second dimension reflects a stronger degree of formalization. It refers to the structural elements of an organization. These provide the formal framework for decisions and thus shape behavior, as well as the emergence and anchoring of values and attitudes.

      Internal and external structural elements can be distinguished. Internal structural elements are mainly visible within the organization, such as roles, processes, hierarchy levels, or committees. By adapting and restructuring roles, the necessary skills can be reflected within the company. Furthermore, rewards and promotions can create an incentive for employees to adopt the desired behavior and pass it on to colleagues. The layout of the work environment is also part of the internal structure. Since work in data-driven companies is based on close collaboration and requires people with different skills, it makes sense to create a space for open exchange that enables communication and collaboration.

      External structural elements reflect internal behavior outward. In this way, internal structural elements influence how the company is perceived from the outside. This is reflected, for example, in clear communication, the structure of the website, job advertisements, and marketing messages.

      Companies should design their external behavior to be in line with the values of the organization and thus support their own structures. In this way, a harmonious alignment between the internal and external positioning of the company can be achieved.

      First small steps can already create significant changes

      Our experience has shown that the coordinated design of skills, willingness, and action results in a sustainable data culture. It is now clear that a data culture cannot be created overnight, but it is also no longer possible to do without one. It has proven useful to divide this challenge into small steps. Initial pilot projects – such as establishing a data culture in just one team – and initiatives for particularly committed employees who want to drive change create trust in the cultural shift. Positive individual experiences serve as a helpful catalyst for the transformation of the entire organization.

      The philosopher and visionary R. Buckminster Fuller once said, “You never change things by fighting the existing reality. To change something, build a new model that makes the existing model obsolete.” As technology advances, companies must be able to adapt in order to fully tap its potential. This allows decisions to be made faster and more accurately than ever before, drives innovation, and continuously optimizes processes. The sustainable establishment of a data culture will give companies a competitive advantage in the market. In the future, data culture will be an essential part of any successful business strategy. Companies that do not embrace this will be left behind.

      However, the use of data remains a major problem for many companies. Often, poor data quality and the effort of consolidating data stand in the way. Even though many companies already have data solutions, these are not used to their full potential. This means that much information remains unused and cannot be incorporated into decision-making.

       

      Sources:

      [1] https://hbr.org/2020/03/how-ceos-can-lead-a-data-driven-culture

      Image: AdobeStock 569760113

      Annsophie Huber

      In a fast-paced and data-driven world, the management of information and knowledge is essential. Businesses in particular rely on making knowledge accessible internally as quickly, clearly, and concisely as possible. Knowledge management is the process of creating, extracting, and utilizing knowledge to improve business performance. It includes methods that help organizations identify and extract knowledge, distribute it, and use it to better achieve their goals. However, this can be a complex and challenging task, especially in large companies.

      Natural Language Processing (NLP) promises to provide a solution. This technology has the potential to revolutionize the knowledge strategy of companies. NLP is a branch of artificial intelligence that deals with the interaction between computers and human language. By using NLP, companies can gain insights from large amounts of unstructured text data and convert them into actionable knowledge.

      In this blog post, we examine how NLP can improve knowledge management and how companies can use NLP to perform complex processes quickly, safely, and automatically. We explore the benefits of using NLP in knowledge management, the various NLP techniques used, and how companies can use NLP to achieve their goals better with artificial intelligence.

      Case Study for effective knowledge management

      Using the example of email correspondence in a construction project, we illustrate the application and added value of natural language processing. We use two emails as specific examples that were exchanged during the construction project: an order confirmation for ordered items and a complaint about their quality.

      For a new building, the builder requested quotes for products, including thermal insulation, from a variety of suppliers. Eventually, the insulation was ordered from one supplier. In an email, the supplier clarifies the ordered items, their properties and costs, and confirms delivery on a specified date. Later, the builder discovers that the quality of the delivered products does not meet the expected standards. The builder informs the supplier of this in a written complaint, also via email. The text of these emails contains a wealth of information that can be extracted and processed using NLP methods to improve understanding. Given the large number of different offers and interactions, manual processing is very time-consuming, and a programmatic evaluation of the communication provides a remedy.

      Next, we introduce a knowledge management pipeline that analyzes the content of these two emails step by step and provides users with the maximum benefit through text processing.


      Summary (Task: Summarization)

      In the first step, the content of each text can be summarized and condensed to its essence in a few sentences. This reduces the text to important information and knowledge, removes irrelevant content such as platitudes and repetitions, and greatly reduces the amount of text to be read.

      Especially with long emails, the added value of a summary alone is enormous: listing the important content as bullet points saves time, prevents misunderstandings, and avoids overlooking important details.

      General summaries are already helpful, but with the latest language models, NLP can do much more. In a general summary, the text length is reduced as much as possible while preserving the essential information. Large language models can not only produce a general summary but also tailor this process to the specific needs of employees. For example, facts can be highlighted, or technical jargon can be simplified. In particular, summaries can be produced for a specific audience, such as a specific department within the company.

      Different departments and roles require different types of information. This is why summaries are particularly useful when tailored to the interests of a specific department or role. For example, the two emails in our case study contain information that is relevant to the legal, operations, or finance department in different ways. Therefore, the next step is to create a separate summary for each department:
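      The sketch below shows how such a summary could be produced in Python. The model choice and the example email text are illustrative assumptions; a department-specific summary would typically use an instruction-following large language model instead:

```python
from transformers import pipeline

# Off-the-shelf summarization model (illustrative choice).
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

# Invented stand-in for the supplier's order confirmation from the case study.
email_text = (
    "Dear customer, we hereby confirm your order of 500 square meters of "
    "thermal insulation, type WLG 035, at a total price of EUR 12,400. "
    "The goods will be delivered to the construction site on 15 May. "
    "Payment is due within 30 days of the invoice date."
)

summary = summarizer(email_text, max_length=60, min_length=15, do_sample=False)
print(summary[0]["summary_text"])

# A department-specific summary would add an instruction to the model, e.g.:
# "Summarize this email for the finance department, focusing on prices,
#  payment terms, and delivery dates."
```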


      This makes it even easier for users to identify and understand the information that is relevant to them, while also drawing the right conclusions for their work.

      Generative NLP models not only allow texts to be condensed to the essentials, but also provide explanations for ambiguities and details. One example is the explanation of a regulation that is mentioned only by an acronym in an order confirmation and whose details the user may not be familiar with. This eliminates the need for a tedious online search for a suitable explanation.


      Knowledge Extraction (Task: NER, Sentiment Analysis, Classification)

      The next step is to systematically categorize the emails and their contents. This allows incoming emails to be automatically assigned to the correct mailboxes, annotated with metadata, and collected in a structured way.

      For example, emails received on a customer service account can be automatically classified into defined categories (complaints, inquiries, suggestions, etc.). This eliminates the manual categorization of emails, which reduces the likelihood of incorrect categorizations and ensures more robust processes.
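      As a minimal sketch, a zero-shot classifier can assign an incoming email to such predefined categories without any task-specific training data. The model choice and the example email are illustrative assumptions:

```python
from transformers import pipeline

# Zero-shot classification assigns a text to candidate labels without
# training data for those labels (model choice is illustrative).
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

email_text = (
    "The thermal insulation delivered last week does not meet the agreed "
    "standard. We expect a replacement delivery as soon as possible."
)
result = classifier(
    email_text,
    candidate_labels=["complaint", "inquiry", "order confirmation", "suggestion"],
)
print(result["labels"][0])  # most likely label, here: "complaint"
```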

      Within these categories, the contents of emails can be further divided using semantic content analysis, for example, to determine the urgency of a request. More on that later.


      Once the emails are correctly classified, metadata can be extracted from each text and attached to it using Named Entity Recognition (NER).

      NER allows entities in texts to be identified and named. Entities can be people, places, organizations, dates, or other named objects. Regarding email inboxes and their contents, NER can be useful in extracting important information and connections within the texts. By identifying and categorizing entities, relevant information can be quickly found and classified.

      In the case of complaints, NER can be used to identify the names of the product, the customer, and the seller. This information can then be used to solve the problem or make changes to the product to avoid future complaints.

      NER can also help automatically highlight relevant facts and connections in emails after they are classified. For example, if an order is received as an email from a customer, NER can extract the relevant information, enrich the email with metadata, and automatically forward it to the appropriate salesperson.
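      The following sketch shows NER with spaCy on an invented order email; the names, company, and amounts are made up for illustration:

```python
import spacy

# Small English pipeline; install via: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

email_text = (
    "Dear Mr. Smith, we confirm your order of 500 sqm of thermal insulation "
    "from BuildCo GmbH, to be delivered on 15 May 2023 for EUR 12,400."
)
doc = nlp(email_text)
for ent in doc.ents:
    print(ent.text, ent.label_)  # e.g. PERSON, ORG, DATE, MONEY
```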

      Similarity (Task: Semantic Similarity)

      Successful knowledge management first requires identifying and gathering relevant data, facts, and documents in a targeted manner. This has been a particularly challenging task with unstructured text data such as emails, which are also stored in information silos (i.e. in mailboxes). To better capture the content of incoming emails and their overlaps, methods for semantic analysis of text can be employed. “Semantic Similarity Analysis” is a technology used to understand the meaning of texts and measure the similarities between different texts.

      In the context of knowledge management, semantic analysis can help group emails and identify those that relate to the same topic or contain similar requests. This can increase the productivity of customer support teams by allowing them to focus on important tasks, rather than spending a lot of time manually sorting or searching through emails.

      In addition, semantic analysis can help identify trends and patterns in incoming emails that may indicate problems or opportunities for improvement in the company. These insights can then be used to proactively address customer needs or improve processes and products.
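      A minimal sketch with sentence embeddings illustrates the idea: semantically related emails receive a high cosine similarity and can be grouped accordingly. The model choice and example emails are assumptions for illustration:

```python
from sentence_transformers import SentenceTransformer, util

# Model choice is illustrative; any sentence-embedding model works similarly.
model = SentenceTransformer("all-MiniLM-L6-v2")

emails = [
    "We confirm your order of thermal insulation, delivery on 15 May.",
    "The delivered insulation does not meet the agreed quality standard.",
    "Please send us your current price list for roofing tiles.",
]
embeddings = model.encode(emails, convert_to_tensor=True)
similarity_matrix = util.cos_sim(embeddings, embeddings)
print(similarity_matrix)  # higher scores = emails about the same topic
```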

      Answer Generation (Task: Text Generation)

      Finally, emails need to be answered. Those who have already experimented with text suggestions in email programs know that this task is not yet ready for automation. However, generative models can help answer emails faster and more accurately. A generative language model can quickly and reliably generate response templates based on incoming emails, which then only need to be supplemented, completed and checked by the person processing them. It is important to carefully check each response before sending it, as generative models are known to hallucinate results, i.e. generate convincing answers that contain errors upon closer examination. Here too, AI systems can at least partially remedy the situation by using a “control model” to verify the facts and statements of these “response models” for accuracy.
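      As an illustration, an instruction-tuned model can draft a reply that an employee then reviews. The model choice and the example complaint are assumptions; in practice, a larger hosted LLM plus a mandatory human review step would be used:

```python
from transformers import pipeline

# Draft-response generation with a small instruction-tuned model (illustrative).
generator = pipeline("text2text-generation", model="google/flan-t5-base")

incoming_email = (
    "The delivered thermal insulation does not meet the agreed standard. "
    "We expect a replacement delivery as soon as possible."
)
prompt = (
    "Write a short, polite reply to the following customer complaint, "
    "apologizing and promising to investigate:\n\n" + incoming_email
)
draft = generator(prompt, max_new_tokens=120)[0]["generated_text"]
print(draft)  # only a draft -- the responsible employee still checks and edits it
```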


      Conclusion

      Natural Language Processing (NLP) offers companies numerous opportunities to improve their knowledge management strategies. NLP enables us to extract precise information from unstructured text and optimize the processing and provision of knowledge for employees.

      By applying NLP methods to emails, documents, and other text sources, companies can automatically categorize, summarize, and reduce content to the most important information. This allows employees to quickly and easily access important information without having to wade through long pages of text. It saves time, reduces the risk of errors, and contributes to better business decisions.

      Using the example of a construction project, we demonstrated how NLP can be used in practice to process emails more efficiently and improve knowledge management. Applying NLP techniques, such as summarizing information and tailoring it to specific departments, can help companies better achieve their goals and improve their performance.

      The application of NLP in knowledge management offers great advantages for companies. It can help automate processes, improve collaboration, increase efficiency, and optimize decision-making quality. Companies that integrate NLP into their knowledge management strategy can gain valuable insights that enable them to better navigate an increasingly complex business environment.

      Image source: AdobeStock 459537717

      Oliver Guggenbühl, Jonas Braun
