SWISDATA's Results from the Future Data Assets Project

Resource: https://future-data-assets.de

We are proud to have been part of the bilateral Future Data Assets project as an associated partner in a strong German consortium whose main partners were atlan-tec, Deloitte, FIR an der RWTH Aachen, Universität des Saarlandes, DMG MORI and VDI. The project was a groundbreaking initiative that created a comprehensive framework for managing and reporting on corporate data capital. SWISDATA, for its part, showed how to increase the value of data in companies. In this blog post, we give you an overview of the project and its most important outcomes.

As part of the project, SWISDATA implemented many new artefacts in the field of machine learning and AI for its SWIS graph. The SWIS graph can be used both to scrape data from the web and to analyse data. One of the main features of SWIS is its intelligent search and text normalisation, which allows users to find relevant information across various sources and formats. SWIS also incorporates process mining and has a built-in knowledge graph that automatically extracts and organises knowledge from texts. Users can query this knowledge in natural language, similar to a chatbot. The knowledge is also linked to all other data in the SWIS graph and can serve as a data source for a question-answering system.

What is corporate data capital?

Corporate data capital is the value that a company generates from its data in terms of business performance and social impact. Data is becoming a key asset for many industries, especially in the context of digitalisation, artificial intelligence and Industry 4.0. However, there is no clear and consistent way to measure, evaluate and report on data, even though data is essential for decision-making, governance and communication.

Some highlights of our research results in this project related to process mining

We created a machine learning model based on an Averaged Perceptron. It learns from event logs of websites and apps together with contextual data, such as weather or sensor data. Because the weights of the Averaged Perceptron are interpretable, the AI is explainable, which helps analysts understand the process models that process mining discovers from the event logs.
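To illustrate why such weights are easy to read, here is a minimal sketch of a binary averaged perceptron over named features. It is not the SWISDATA implementation: the feature names (`prev=view_cart`, `weather=rain` and so on), the labels and the function names `train` and `predict` are assumptions made up for this example.

```python
from collections import defaultdict


def train(examples, epochs=10):
    """Train a binary averaged perceptron.

    examples: list of (set_of_feature_names, label) with label +1 or -1.
    Returns a dict mapping each feature name to its averaged weight.
    """
    weights = defaultdict(float)  # current weights
    totals = defaultdict(float)   # sum of the weights after every step
    steps = 0
    for _ in range(epochs):
        for features, label in examples:
            score = sum(weights[f] for f in features)
            if label * score <= 0:  # wrong (or undecided): move towards the label
                for f in features:
                    weights[f] += label
            steps += 1
            for f, w in weights.items():
                totals[f] += w
    # Averaging over all steps makes the result less sensitive to late updates.
    return {f: total / steps for f, total in totals.items()}


def predict(model, features):
    """Return +1 or -1 for a set of feature names."""
    return 1 if sum(model.get(f, 0.0) for f in features) > 0 else -1


# Hypothetical event-log rows: previous event plus weather context,
# labelled +1 if the session ended in a purchase and -1 otherwise.
examples = [
    ({"prev=view_cart", "weather=rain"}, +1),
    ({"prev=view_cart", "weather=sun"}, +1),
    ({"prev=search", "weather=rain"}, -1),
    ({"prev=search", "weather=sun"}, -1),
]

model = train(examples)
for name, weight in sorted(model.items(), key=lambda item: -abs(item[1])):
    print(f"{name:16s} {weight:+.2f}")
```

Each weight belongs to exactly one named feature, so sorting by absolute weight shows directly which events or context factors push the prediction in which direction. In this toy data the previous event decides the outcome, and the weather features end up with weights close to zero.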

Interpretable weights of this kind are a significant step for process mining, because they help to understand discovered process models. We also improved the Heuristic Miner process discovery method with node filtering based on a quality metric we developed, and added it to the SWIS graph. An iOS application for the Averaged Perceptron is accessible here.

The images show a process model when applying the Heuristic Miner in the SWIS graph and related insights from the Averaged Perceptron (iOS app).
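As a rough sketch of the idea behind the Heuristic Miner with node filtering, the example below counts directly-follows relations in an event log, computes the standard Heuristic Miner dependency measure, and drops rare activities before discovery. The quality metric developed in the project is not described in this post, so the filter uses plain activity frequency as a stand-in. The activity names, thresholds and function names are assumptions for illustration only.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def dependency(counts, a, b):
    """Standard Heuristic Miner dependency measure, in the range (-1, 1)."""
    ab, ba = counts[(a, b)], counts[(b, a)]
    if a == b:  # self-loop variant of the measure
        return ab / (ab + 1)
    return (ab - ba) / (ab + ba + 1)


def discover(traces, min_dependency=0.5, min_node_share=0.1):
    """Return (kept_activities, edges) after node and edge filtering.

    Node filter (stand-in metric): keep activities whose share of all
    events is at least min_node_share.
    """
    activity_counts = Counter(act for trace in traces for act in trace)
    total = sum(activity_counts.values())
    kept = {a for a, n in activity_counts.items() if n / total >= min_node_share}
    filtered = [[a for a in trace if a in kept] for trace in traces]

    counts = directly_follows(filtered)
    edges = {}
    for a, b in list(counts):
        value = dependency(counts, a, b)
        if value >= min_dependency:
            edges[(a, b)] = value
    return kept, edges


# Hypothetical web-shop event log: one list of activities per session.
traces = [
    ["open", "search", "view", "buy"],
    ["open", "search", "view", "buy"],
    ["open", "search", "view", "search", "view", "buy"],
    ["open", "help", "search", "view", "buy"],
]

kept, edges = discover(traces)
for (a, b), value in sorted(edges.items()):
    print(f"{a} -> {b}: {value:.2f}")
```

Filtering nodes before counting directly-follows relations means that the neighbours of a removed activity become directly connected, which keeps the resulting model small without breaking the main path.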

A new way to visualise process models is DDCAL, a novel clustering algorithm that was developed and added to the SWIS graph. DDCAL is an unsupervised machine learning algorithm that is used to colour nodes or to draw edges with different thicknesses in process models, reducing the cognitive load on analysts. This makes process models easier to understand, e.g. when looking for interesting areas such as the most common path. DDCAL was published in the peer-reviewed Journal of Classification by Springer, which is accessible here. An open-source Python implementation of DDCAL is also accessible here.
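The general idea of turning one-dimensional values such as edge frequencies into a handful of visual classes can be sketched as follows. Note that this is not DDCAL: the sketch uses simple equal-frequency (quantile) binning as a stand-in, and the edge names, frequencies and pixel widths are invented for the example.

```python
from bisect import bisect_left


def quantile_classes(values, n_classes=3):
    """Assign each value a class from 0 to n_classes - 1 by its rank.

    Equal values always receive the same class, because the rank is taken
    from the first position of the value in the sorted list.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [bisect_left(ordered, v) * n_classes // n for v in values]


# Hypothetical edge frequencies of a discovered process model.
edge_frequencies = {
    ("open", "search"): 120,
    ("search", "view"): 95,
    ("view", "buy"): 40,
    ("view", "search"): 12,
    ("search", "help"): 3,
    ("help", "search"): 2,
}

classes = quantile_classes(list(edge_frequencies.values()), n_classes=3)
for (a, b), cls in zip(edge_frequencies, classes):
    width = 1 + 2 * cls  # assumed mapping from class to line width in pixels
    print(f"{a} -> {b}: class {cls}, width {width}px")
```

With only three line widths instead of six different frequencies, the most common path stands out at a glance. A dedicated algorithm such as DDCAL chooses the class boundaries in its own way, so its results can differ from this simple binning.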

"ICT of the Future" program - an initiative of the Federal Ministry for Climate Protection, Environment, Energy, Mobility, Innovation and Technology (BMK)


FAQ for this article

Question: What is corporate data capital?
Answer: Corporate data capital is the value that a company generates from its data in terms of business performance and social impact. Data is becoming a key asset for many industries, especially in the context of digitalisation, artificial intelligence and Industry 4.0.

Question: What did SWISDATA implement in machine learning and AI during the Future Data Assets (FDA) project?
Answer: SWISDATA implemented new artefacts in the field of machine learning and AI for its SWIS graph. The SWIS graph can be used to scrape data from the web and to analyse data, and it has features such as intelligent search and text normalisation.

Question: What is the significance of process mining with the Averaged Perceptron?
Answer: The Averaged Perceptron learns from event logs of websites and apps and from contextual data. This helps to understand the process models that process mining discovers from event logs, and it makes the AI explainable.

Question: How did SWISDATA improve the Heuristic Miner?
Answer: SWISDATA improved the Heuristic Miner with node filtering based on a quality metric it developed, and added it to the SWIS graph.

Question: What is the significance of having an interpretable AI model?
Answer: The weights of the Averaged Perceptron are interpretable, which makes the AI model explainable. This is a significant step for process mining, because it helps to understand discovered process models.

Question: What was published in the Journal of Classification by Springer?
Answer: DDCAL, a novel clustering algorithm, was published in the peer-reviewed Journal of Classification by Springer.

Question: Is there an open-source implementation of DDCAL available?
Answer: Yes, an open-source Python implementation of DDCAL is accessible here: DDCAL GitHub.

Question: What is DDCAL used for in process models?
Answer: DDCAL is a novel clustering algorithm that was developed and added to the SWIS graph. It is used to colour nodes or to draw edges with different thicknesses in process models, which reduces the cognitive load on analysts.

Question: What is the purpose of the "ICT of the Future" program?
Answer: The "ICT of the Future" program is an initiative of the Federal Ministry for Climate Protection, Environment, Energy, Mobility, Innovation and Technology (BMK). Its main goal is to support research and development in the field of information and communication technology.

Question: What are the main results from the Future Data Assets Project?
Answer: The main results relate to the development of novel algorithms such as DDCAL, which help to understand process models better. The project aims to make data-driven decision-making more efficient.