08.07.2025 15:00

Test- and Measurement Data as First-Class ML Citizens

Explore-to-Innovate | 2nd and 3rd July 2025 | Benningen | Germany

Test and measurement data is acquired during the development of every product, from simple toothbrushes to complex machines and vehicles. This data is often captured in various formats such as XLSX, CSV, and MDF, which makes it challenging to extract value from it, especially for further analysis and machine learning (ML).

👉 Discover how ASAM ODS unifies diverse measurement data formats into a single, coherent data view, empowering test and measurement data to become first-class ML citizens.

Test- and Measurement Data Problem

The test and measurement data problem is not new. It was already recognized during the Big Data era as the Five Vs (Variety, Velocity, Volume, Veracity, and Value) and can be grouped into three data problem classes:

Data Format Diversity
Data Quality
Data Accessibility

These data problems are the reason for a low data analysis rate: typically only 5% to 20% of the collected test and measurement data is analyzed (1), and “less than 0.5% of all data is ever analyzed and used” (2). The graphic below shows where those data problems occur in the data analysis process.

Solving the Test- and Measurement Data Challenge

To solve the test and measurement data problem, let’s explore the individual problem categories and look at example solutions.

Address Data Format Diversity

Data captured by test and measurement systems is mostly stored in files of different formats because tools from different vendors are used. In some cases, proprietary data formats can be accessed through programming libraries, which may not fit the programming languages used by R&D teams.

Some of the biggest problems are introduced by CSV files because of internationalization issues (such as different decimal separators) and localization issues (such as German umlauts).
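
To make the CSV issue concrete, here is a minimal sketch using only the Python standard library. The sample content, the column names `Messgröße` and `Wert`, and the helper `read_localized_csv` are assumptions made up for illustration. The file is assumed to come from a German-locale tool: semicolon-separated, decimal comma, Windows-1252 encoding.

```python
import csv
import io

# Hypothetical export from a German-locale tool: semicolon-separated,
# decimal comma, and encoded as Windows-1252 rather than UTF-8.
RAW_BYTES = (
    "Messgröße;Wert\n"
    "Öltemperatur;92,5\n"
    "Kühlmitteldruck;1,8\n"
).encode("cp1252")


def read_localized_csv(raw, encoding="cp1252", delimiter=";", decimal=","):
    """Decode the bytes and convert the 'Wert' column to float."""
    text = raw.decode(encoding)
    rows = []
    for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        # Replace the locale-specific decimal separator before parsing.
        row["Wert"] = float(row["Wert"].replace(decimal, "."))
        rows.append(row)
    return rows


rows = read_localized_csv(RAW_BYTES)
for row in rows:
    print(row["Messgröße"], row["Wert"])
```

Every reader of such a file has to know the encoding, delimiter, and decimal separator in advance. None of these are stored in the file itself, which is why CSV exchange across locales tends to be fragile.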

Furthermore, existing toolchains limit flexibility and prevent data movement or conversion, because disrupting them would introduce additional costs.

Depending on the size of your data, the sheer volume may prohibit data copies, as data duplication introduces additional storage costs.

One of the most effective ways to tackle data format diversity is through ASAM ODS (External) Data Plugins. These lightweight microservices are built on the gRPC protocol and offer a streamlined and efficient API to access both metadata and bulk measurement data directly from the original files.

Thanks to the Protobuf toolchain, these plugins are compatible with virtually any programming language, making them highly accessible to R&D teams regardless of their tech stack, resulting in low implementation costs.

With this standardized API, all file formats look the same through the lens of the API. No data conversion, data movement, or data duplication is necessary, so existing toolchains stay intact.
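
The idea of one API in front of many formats can be illustrated with a small in-process analogy. The sketch below is not the ASAM ODS gRPC API. `MeasurementSource`, `CsvSource`, and `JsonSource` are hypothetical names, and JSON merely stands in for a second file format. It only shows how client code can stay unchanged when each format gets its own thin adapter.

```python
import csv
import io
import json
from typing import List, Protocol


class MeasurementSource(Protocol):
    """Hypothetical uniform view on one measurement file."""

    def channel_names(self) -> List[str]: ...

    def values(self, channel: str) -> List[float]: ...


class CsvSource:
    """Adapter for column-oriented CSV content."""

    def __init__(self, text: str) -> None:
        self._rows = list(csv.DictReader(io.StringIO(text)))

    def channel_names(self) -> List[str]:
        return list(self._rows[0].keys()) if self._rows else []

    def values(self, channel: str) -> List[float]:
        return [float(row[channel]) for row in self._rows]


class JsonSource:
    """Adapter for JSON content of the form {"channel": [v1, v2, ...]}."""

    def __init__(self, text: str) -> None:
        self._data = json.loads(text)

    def channel_names(self) -> List[str]:
        return list(self._data.keys())

    def values(self, channel: str) -> List[float]:
        return [float(v) for v in self._data[channel]]


def mean(source: MeasurementSource, channel: str) -> float:
    """Client code only sees the interface, never the file format."""
    vals = source.values(channel)
    return sum(vals) / len(vals)


csv_file = CsvSource("time,speed\n0,10.0\n1,20.0\n")
json_file = JsonSource('{"time": [0, 1], "speed": [30.0, 50.0]}')
means = [mean(src, "speed") for src in (csv_file, json_file)]
print(means)
```

In the plugin approach described above, the adapters run as separate microservices behind a gRPC interface, so the client can be written in any language supported by the Protobuf toolchain.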

Overcome Data Quality Issues

Even though ASAM ODS (External) Data Plugins already help with inconsistently formatted data, data quality problems still exist, such as:

Uncleansed or inaccurate data
Missing or misspelled data
Wrong data values

Missing metadata and missing data context in particular limit organization and search functionality as well as analytics.

This is where the ASAM ODS Base Data Model becomes essential. Data catalogs help to identify and correct missing or misspelled data values. With the help of data limits, wrong data values can be marked with NaN or NULL. Additionally, uncleansed or inaccurate data can be detected.
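
The two mechanisms, catalog lookup and limit checking, can be sketched in a few lines of plain Python. This is only an illustration of the principle, not how an ASAM ODS server implements it. The catalog `UNIT_CATALOG`, the limits in `LIMITS`, and both helper functions are hypothetical.

```python
import difflib
import math
from typing import List, Optional

# Hypothetical catalog of allowed unit names and per-channel plausibility limits.
UNIT_CATALOG = ["degC", "bar", "rpm", "km/h"]
LIMITS = {"oil_temperature": (-40.0, 180.0)}


def correct_against_catalog(value: str, catalog: List[str]) -> Optional[str]:
    """Return the catalog entry closest to value, or None if nothing is close."""
    matches = difflib.get_close_matches(value, catalog, n=1, cutoff=0.6)
    return matches[0] if matches else None


def mask_out_of_limits(values: List[float], low: float, high: float) -> List[float]:
    """Replace values outside [low, high] with NaN and keep the rest."""
    return [v if low <= v <= high else math.nan for v in values]


# A misspelled unit is mapped back to its catalog entry.
unit = correct_against_catalog("degc", UNIT_CATALOG)

# Implausible samples are marked as NaN instead of being silently analyzed.
low, high = LIMITS["oil_temperature"]
cleaned = mask_out_of_limits([90.0, 9999.0, -50.0], low, high)
print(unit, cleaned)
```

Marking implausible values as NaN rather than dropping them keeps the sample positions aligned with the other channels of the same measurement.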

The additional metadata improves the data context and enables better data navigation and data organization, which leads to improved analysis and search capabilities.

The base data model also adds further data semantics. In combination with the base entities and their relations, it defines a measurement data ontology, which brings further advantages for machine learning.

ℹ️ Note: Users and User Groups can be used to define Access Control Lists (ACLs), so ASAM ODS helps to support your data governance policy.

Data Accessibility for Data Scientists

When looking into data access, it is important to identify the persona who actually performs the data analysis: the data scientist.

The job of the data scientist includes, among other things, understanding and implementing machine learning algorithms and techniques. To do so, a data scientist works with data visualization tools such as Tableau and Microsoft Power BI, has experience with big data tools like Apache Spark and Apache Hadoop, has a good understanding of Python or R, and is an expert in SQL [3].

⚠️👉 This means the data source must support connectivity to the tools mentioned above.

Even though there is no one-size-fits-all answer to these requirements, supporting Python and a well-known query language already covers a wide range of them, and both are provided by ASAM ODSBox.

ASAM ODSBox is a lightweight Python wrapper on top of the ASAM ODS HTTP API. Because it provides ASAM ODS data as pandas.DataFrames, Python analysis and machine learning tools such as TensorFlow or scikit-learn can be used directly, and Power BI can access the data this way as well.

The provided JAQuel query language is a simple and intuitive way to explore your data using the concepts of the MongoDB Query Language (MQL).
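
A JAQuel query is a nested dictionary, much like an MQL filter document. The sketch below builds such a query and shows how it might be passed to ODSBox. It assumes the `ConI` class and its `query_data` method as described in the ODSBox documentation, so please verify both against the version you use. The server URL, the credentials, the name pattern, and the two helper functions are placeholders invented for this example.

```python
import json


def build_measurement_query(name_pattern, row_limit=10):
    """Build a JAQuel query for measurements whose name matches a pattern."""
    return {
        # Filter on the base entity AoMeasurement, similar to an MQL filter.
        "AoMeasurement": {"name": {"$like": name_pattern}},
        # Restrict the result to the columns that are needed.
        "$attributes": {"id": 1, "name": 1},
        "$options": {"$rowlimit": row_limit},
    }


def fetch_measurements(url, user, password, query):
    """Run the query against an ASAM ODS server (requires odsbox and a server)."""
    # Imported lazily so the query-building part runs without odsbox installed.
    from odsbox import ConI  # assumed entry point, see the ODSBox documentation

    with ConI(url=url, auth=(user, password)) as con_i:
        # Assumed to return the result as a pandas.DataFrame.
        return con_i.query_data(query)


query = build_measurement_query("Profile_*")
print(json.dumps(query, indent=2))

# Example call with placeholder values, not executed here:
# df = fetch_measurements("https://MY_SERVER/api", "USER", "PASSWORD", query)
```

Because the query is plain data, it can be stored, versioned, and generated programmatically, which also makes it a convenient target for AI assistants.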

More good news: the Python ASAM ODSBox is open source and available as a free download. Additionally, you can find multiple Jupyter notebook examples in the Data Management Learning Path curriculum.

ASAM ODS: The Solution Stack

The combination of ASAM ODS Data Plugins, the ASAM ODS (Base) Data Model, and ASAM ODSBox provides the functionality and capabilities needed to make test and measurement data a first-class ML citizen.

Typical machine learning tools can now be used by data scientists for “any data”.

Furthermore, the introduced toolchain bridges the gap to Microsoft Copilot, Google Gemini, or other AI assistants to create solutions faster and more efficiently, even for non-data scientists.

The ASAM ODS standard helps to integrate different measurement data files into a holistic data view.