Data cleansing/cleanup and migration with semantic technologies: Proof of Concept Package

We are currently offering a PoC Package (Proof of Concept) for our special offering “Data Cleanup with Semantic Technologies”.

What is it about?

In a wide variety of scenarios, the controlled destruction, transformation or migration of data sets becomes necessary. Use cases can be found in all industries. Here is a small selection:

  • Data protection: identification and segregation of PID data (personal data); completeness and inventory analysis.
  • Archive cleansing: Traceable deletion of data holdings (archival disposal) where metadata (lifecycle attributes) is missing or insufficient.
  • Data migration: moving data from unstructured to structured environments (e.g. from folders to ECM/DMS systems)
  • Controlled data transfers: Data transfers that must be made fully traceable due to regulatory requirements (e.g., in drug development). Transfer of patient data from different applications into one PID.
  • Increase data quality: data cleansing and removal of ROT data.
  • Control of documentation: Traceable versioning and a complete audit trail.

Today, large amounts of data can only be exploited with AI support.

Benefit

The benefits are obvious. So the key question is: how to separate the valuable payload data from the ROT data? (ROT here stands for Redundant, Outdated, Trivial, i.e. data that is duplicated, outdated or of no significance to the organization.)

  • Benefit of the PoC: The feasibility of automated data cleanup can be demonstrated on the basis of a selected dataset.
  • Benefits of automated data cleanup: The savings potential compared to manual data cleanup is quantified.
  • The required processes are now also available to customers who do not wish to pay for the licenses of a “big player”.

ROT data today accounts for an estimated 70% of total IT costs.

In addition to the financial benefit, a significant improvement in data quality can be achieved.
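To make the idea of separating ROT data from payload data more concrete, here is a minimal sketch in Python. It is an illustration only and not the MATRIO® method or a PoolParty API: the `Document` type, the `classify_rot` function and both thresholds (`max_age`, `min_size`) are assumptions made for this example. Exact duplicates are found by content hash, outdated files by modification date, and "trivial" is approximated very crudely by file size, where a real project would rely on semantic classification instead.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Document:
    # Hypothetical record type: path, raw content and last-modified timestamp.
    path: str
    content: bytes
    modified: datetime


def classify_rot(docs, now, max_age=timedelta(days=3650), min_size=16):
    """Return {path: set of ROT labels}; an empty set means 'no ROT flag'."""
    labels = {doc.path: set() for doc in docs}
    first_seen = {}  # content hash -> path of the first document with that content
    for doc in docs:
        digest = hashlib.sha256(doc.content).hexdigest()
        if digest in first_seen:
            labels[doc.path].add("redundant")  # exact copy of an earlier document
        else:
            first_seen[digest] = doc.path
        if now - doc.modified > max_age:
            labels[doc.path].add("outdated")  # older than the assumed retention window
        if len(doc.content) < min_size:
            labels[doc.path].add("trivial")  # crude size-based proxy (assumption)
    return labels


now = datetime(2024, 1, 1)
docs = [
    Document("a/contract.txt", b"Contract text, version 1, signed by both parties", datetime(2023, 5, 1)),
    Document("b/contract_copy.txt", b"Contract text, version 1, signed by both parties", datetime(2023, 6, 1)),
    Document("c/old_memo.txt", b"Memo about the office move in the year 2005", datetime(2005, 3, 1)),
    Document("d/tmp.txt", b"x", datetime(2023, 12, 1)),
]
result = classify_rot(docs, now)
for path, flags in result.items():
    print(path, sorted(flags))
```

The output of such a pass is only a list of candidates. Whether a flagged document is actually deleted, migrated or retained remains a decision for the cleanup concept, and every decision should be logged so that the deletion stays traceable.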

Method

The PoC is based on the MATRIO® Data Cleanup method (white paper).

Our offer

We guide you through this process in a structured manner and work with you to develop a comprehensive cleanup concept. In doing so, we rely on the PoolParty Semantic Suite as well as on a method construction kit for the analysis of your data. We cooperate with an experienced software development company that specializes in NLP and intelligent search methods.

Technology toolbox:

  • Name recognition: Identity checks; detection and deletion of personal information from documents.
  • Topic detection: Email triage; routing of inquiries and information.
  • Text classification: Automatic document assignment and filing.
  • Information extraction: Summarization and extraction of key information from documents and dossiers.
  • Annotation: Tagging of text elements with customer-specific labels.
  • Recommender systems: Recommendations based on text content and metadata.
  • Similarity detection: Identification of and search for similar documents and entities.
  • Intelligent search: Semantic full-text search, support for navigating document repositories, information filtering.
  • Combination of components: Content-driven decision support.
  • Data classification: Taxonomies, ontologies.
  • Knowledge graphs: Use of knowledge graph methods based on the PoolParty Semantic Suite.

Interested? Then please contact us directly! Your contact person: Bruno Wildhaber