Using Natural Language Processing to Extract the TNM Classification: A Case Study

Session

Location / Stream:
Saal Essen

Poster

Topics

  • Digitalization / Artificial Intelligence / eHealth / Telemedicine / Applications
    • Other

Contributors

Simone Melnik (Münster), Dennis Brosch (Münster), Tobias Brix (Münster), Armands Riders (Münster), Achim Georg Beule (Münster), Julian Varghese (Münster), Claudia Rudack (Münster)

Abstract

Introduction

The TNM classification of malignant tumors is fundamental to the oncology department of an ENT clinic. Currently, this information is documented as free text in the pathology report rather than in a structured form. To enable digital processing, we established an automatic method that extracts the TNM data contained in pathology reports at the University Hospital Münster (UKM) using a generic Natural Language Processing query.

Material and methods

The database query covered 143 patients of the ENT clinic who were diagnosed with a malignant neoplasm of the oropharynx between 2020 and 2021 and had at least one pathology report. Regular expressions covering spelling variations were used to extract the TNM classification from the reports. A Python script was used to merge the output from each patient's recent pathology reports and to classify it as the primary or recurrent TNM stage, with a further report treated as a separate case only if it was generated after 12 weeks.
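The extraction and merging steps described above can be illustrated with a minimal Python sketch. It is not the script used in the study: the regular expressions, the function names `extract_tnm` and `group_into_cases`, the example report texts, and the choice to measure the 12-week gap from the previous report are all assumptions made for illustration.

```python
import re
from datetime import date, timedelta

# Hypothetical patterns tolerating common spelling variations such as
# "pT2 pN1", "ypT4a N2b cM0" or the compact "pT1N0". Only the value after
# the letter is captured; prefixes like p, c, y, r are ignored.
PATTERNS = {
    "T": re.compile(r"(?<![A-Z])T\s?(is|[0-4][a-d]?|[xX])(?![0-9a-z])"),
    "N": re.compile(r"(?<![A-Z])N\s?([0-3][a-c]?|[xX])(?![0-9a-z])"),
    "M": re.compile(r"(?<![A-Z])M\s?([01][a-c]?|[xX])(?![0-9a-z])"),
}


def extract_tnm(text):
    """Return the first T, N and M value found in a report, or None each."""
    result = {}
    for key, pattern in PATTERNS.items():
        match = pattern.search(text)
        result[key] = match.group(1) if match else None
    return result


def group_into_cases(reports, gap=timedelta(weeks=12)):
    """Merge one patient's (date, text) reports into cases.

    Assumption: a report opens a new (recurrent) case only if it is dated
    more than `gap` after the previous report; otherwise its values are
    merged into the current case, with later reports overriding earlier ones.
    """
    cases = []
    last_date = None
    for report_date, text in sorted(reports):
        if last_date is None or report_date - last_date > gap:
            kind = "primary" if not cases else "recurrent"
            cases.append({"kind": kind, "T": None, "N": None, "M": None})
        values = extract_tnm(text)
        for key in "TNM":
            if values[key] is not None:
                cases[-1][key] = values[key]
        last_date = report_date
    return cases


# Invented example reports for a single patient (not study data).
example_reports = [
    (date(2020, 3, 2), "pT2, Lymphknoten folgen"),
    (date(2020, 3, 20), "pN1 (2/31)"),
    (date(2021, 1, 15), "Rezidiv: rpT1 N0"),
]
print(group_into_cases(example_reports))
```

Searching for T, N and M separately, rather than with one combined pattern, lets the sketch return partial classifications, which matters when a report states only some of the three values.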

Results

The Python script identified 240 cases, 128 of which concerned the primary TNM classification. In 115 cases, the query found at least one value. In the primary setting, 94 T values, 80 N values, and 8 M values were detected. A complete TNM classification could be determined for only 12 cases. Both a T and an N value were present in 91 cases.
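Counts of this kind (cases with at least one value, with both T and N, or with a complete classification) could be tallied from the extracted cases roughly as follows. This is a sketch under assumptions: the function name `summarize`, the dictionary layout, and the toy cases are hypothetical and do not reproduce the study data.

```python
def summarize(cases):
    """Count how many cases carry which TNM components."""

    def has(case, *keys):
        # True if every requested component was extracted for this case.
        return all(case.get(key) is not None for key in keys)

    return {
        "cases": len(cases),
        "any_value": sum(any(has(c, k) for k in "TNM") for c in cases),
        "T": sum(has(c, "T") for c in cases),
        "N": sum(has(c, "N") for c in cases),
        "M": sum(has(c, "M") for c in cases),
        "T_and_N": sum(has(c, "T", "N") for c in cases),
        "complete": sum(has(c, "T", "N", "M") for c in cases),
    }


# Invented toy cases (not study data).
toy_cases = [
    {"T": "2", "N": "1", "M": "0"},
    {"T": "1", "N": "0", "M": None},
    {"T": "3", "N": None, "M": None},
    {"T": None, "N": None, "M": None},
]
summary = summarize(toy_cases)
print(summary)
```

Dividing `any_value` by `cases` gives the share of cases with at least one extracted value; for the counts reported here, 115 of 240 cases is about 48%.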

Discussion

Applying a database query with regular expressions to generate structured TNM data was successful in almost half of the cases (48%). Since the missing values could be explained by the heterogeneity of the TNM classification, structured documentation will be introduced at the UKM. In future work, the query will be evaluated against data from the cancer registry, which is well formatted owing to manual review but is not yet part of the medical records.

The authors declare that there is no conflict of interest.

  • © Conventus Congressmanagement & Marketing GmbH