Data cleansing/cleanup and migration with semantic technologies: Proof of Concept Package

We currently offer a Proof of Concept (PoC) package for our “Data Cleanup with Semantic Technologies” service.

What is it about?

In a wide variety of scenarios, the controlled destruction, transformation, or migration of data sets becomes necessary. Use cases can be found in all industries; here is a small selection:

  • Data protection: identification and segregation of PID (personal data); completeness and inventory analysis.
  • Archive cleansing: traceable deletion (disposal) of data holdings where metadata (lifecycle attributes) is missing or insufficient.
  • Data migration: moving data from unstructured to structured environments (e.g., from folders to ECM/DMS systems).
  • Controlled data transfers: data transfers that must be fully traceable due to regulatory requirements (e.g., in drug development); transfer of patient data from different applications into one PID.
  • Improving data quality: data cleansing and removal of ROT data.
  • Control of documentation: complete traceability of versioning.

Today, large amounts of data can only be exploited with AI support.


The benefits are obvious. So the key question is: how can the valuable payload data be separated from the ROT data? (ROT stands for Redundant, Outdated, Trivial, i.e., data that is duplicated, outdated, or of no significance to the organization.)

  • Benefit of the PoC: the feasibility of automated data cleanup is demonstrated on a selected dataset.
  • Benefit of automated data cleanup: the savings potential compared to manual data cleanup is quantified.
  • The required processes are now also available to customers who do not wish to pay for the licenses of a “big player”.

ROT data today accounts for an estimated 70% of total IT costs.

In addition to the financial benefit, a significant improvement in data quality can be achieved.
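To make the three ROT categories more tangible, here is a minimal Python sketch. It is not the MATRIO® method or a PoolParty feature; the `Doc` type, the `classify_rot` function, and the thresholds (`max_age_days`, `min_bytes`) are hypothetical and chosen purely for illustration. Redundancy is approximated by identical content hashes, "outdated" by age, and "trivial" by size.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Doc:
    """Hypothetical stand-in for a file or record under review."""
    name: str
    content: bytes
    modified: datetime


def classify_rot(docs, now, max_age_days=3650, min_bytes=16):
    """Label each document 'redundant', 'outdated', 'trivial', or 'keep'.

    The thresholds are illustrative assumptions, not recommended values.
    """
    seen = set()
    labels = {}
    # Process the oldest documents first so the earliest copy counts as the original.
    for doc in sorted(docs, key=lambda d: d.modified):
        digest = hashlib.sha256(doc.content).hexdigest()
        if digest in seen:
            labels[doc.name] = "redundant"   # identical content already seen
        elif now - doc.modified > timedelta(days=max_age_days):
            labels[doc.name] = "outdated"    # older than the retention threshold
        elif len(doc.content) < min_bytes:
            labels[doc.name] = "trivial"     # too small to carry meaning
        else:
            labels[doc.name] = "keep"
        seen.add(digest)
    return labels


now = datetime(2024, 1, 1)
agreement = b"Master service agreement, version 3"
docs = [
    Doc("contract.txt", agreement, datetime(2023, 5, 1)),
    Doc("contract_copy.txt", agreement, datetime(2023, 6, 1)),
    Doc("memo_2009.txt", b"Office party planning notes for 2009", datetime(2009, 3, 1)),
    Doc("tmp.txt", b"x", datetime(2023, 12, 1)),
]
labels = classify_rot(docs, now)
print(labels)
```

In this toy run, the copy of the contract is flagged as redundant, the 2009 memo as outdated, and the one-byte file as trivial. A real cleanup would replace these crude rules with content analysis and lifecycle metadata, which is where semantic technologies come in.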


The PoC is based on the MATRIO® Data Cleanup method (white paper).

Our offer

We guide you through this process in a structured manner and work with you to develop a comprehensive cleanup concept. To do so, we rely on the PoolParty Semantic Suite and on a modular toolkit of methods for analyzing your data. We cooperate with an experienced software development company that specializes in NLP and intelligent search methods.

Technology toolbox:

  • Identity check, detection and deletion of personal information from documents
  • Topic detection: e-mail triage, forwarding of requests and ...
  • Text classification: automatic document assignment and filing
  • Information extraction: summarization and extraction of key information from documents and dossiers
  • Annotation: marking up text elements with customer-specific labels
  • Recommender systems: recommendations based on text content and ...
  • Similarity determination: identification and search of similar documents / ...
  • Intelligent search: semantic full-text search, support for navigating document repositories, combination of ...
  • Content-driven decision support
  • Data classification: taxonomies, ontologies
  • Knowledge graphs: use of KG methods based on the PoolParty Semantic Suite
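As a rough illustration of what detecting and deleting personal information from documents can mean at the simplest level, the following Python sketch masks e-mail addresses and phone numbers with regular expressions. The patterns and the `redact` function are hypothetical and deliberately naive; they are not part of the toolbox described here, and production-grade detection relies on NLP models and locale-specific rules rather than two patterns.

```python
import re

# Hypothetical, deliberately simple patterns (assumption: illustration only).
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s/-]{7,}\d"),
}


def redact(text):
    """Replace matches with a [LABEL] placeholder and return the findings."""
    findings = []
    for label, pattern in PATTERNS.items():
        # Record what was found before masking it, to keep the step traceable.
        findings.extend((label, m.group()) for m in pattern.finditer(text))
        text = pattern.sub(f"[{label}]", text)
    return text, findings


sample = "Contact Jane at jane.doe@example.com or +41 44 123 45 67."
clean, found = redact(sample)
print(clean)
print(found)
```

Note that the name "Jane" is left untouched: recognizing names, addresses, or patient identifiers requires entity recognition, which simple patterns like these cannot provide.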

Interested? Then please contact us directly! Your contact person: Bruno Wildhaber





