KEYPHRASE EXTRACTION BASED ON LARGE LANGUAGE MODELS

Abstract

This article addresses the pressing problem of extracting keyphrases from natural language texts, a critical task in natural language processing and text mining. It examines the main approaches to keyphrase (keyword) extraction, covering both traditional methods and modern approaches based on artificial intelligence. The paper discusses widely used methods such as TF-IDF, RAKE, YAKE, and methods based on linguistic parsers. These methods rely on statistical principles and/or graph structures, and they often fail to take sufficient account of the context of the text. The GPT-3 large language model demonstrates better contextual understanding than these traditional methods, which allows it to identify and extract relevant keyphrases more accurately. A comparative analysis on the Inspec benchmark dataset shows that GPT-3 achieves significantly higher Mean Average Precision (MAP@K). Despite this accuracy and extraction quality, the use of large language models may be limited in real-time applications because their response time is longer than that of classical statistical methods. The article therefore emphasizes the need for further research to optimize keyphrase extraction algorithms with both real-time requirements and text context in mind.
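For readers unfamiliar with the MAP@K metric named in the abstract, the following is a minimal Python sketch of one common formulation. It is an illustration only and not the article's evaluation code: the function names are hypothetical, and the normalisation by min(K, number of gold keyphrases) is an assumption, since the abstract does not state which variant of the metric was used.

```python
from typing import Sequence, Set


def average_precision_at_k(predicted: Sequence[str], relevant: Set[str], k: int) -> float:
    """AP@K for one document.

    Assumption: the sum of precision values at each hit is divided by
    min(k, len(relevant)); other normalisations exist.
    """
    if not relevant or k <= 0:
        return 0.0
    hits = 0
    score = 0.0
    seen = set()  # a repeated phrase is counted only once
    for rank, phrase in enumerate(predicted[:k], start=1):
        if phrase in relevant and phrase not in seen:
            seen.add(phrase)
            hits += 1
            score += hits / rank  # precision at this rank
    return score / min(k, len(relevant))


def mean_average_precision_at_k(
    predictions: Sequence[Sequence[str]],
    gold: Sequence[Set[str]],
    k: int,
) -> float:
    """MAP@K: the mean of AP@K over all documents."""
    if not predictions:
        return 0.0
    total = sum(average_precision_at_k(p, g, k) for p, g in zip(predictions, gold))
    return total / len(predictions)
```

Each document contributes one ranked list of extracted keyphrases and one set of gold keyphrases. Correct phrases placed early in the ranking raise the score more than correct phrases placed late, which is why the metric suits comparing extractors that return ordered candidates.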

Published:

2024-11-10

Issue:

Section:

SECTION I. INFORMATION PROCESSING ALGORITHMS