Time Period Categorization in Fiction: A Comparative Analysis of Machine Learning Techniques
In: Cataloging & Classification Quarterly, 2024, pp. 1-30
Online
This study investigates the automatic categorization of time period metadata in fiction, a critical but often overlooked aspect of cataloging. In a comparative analysis, the performance of three machine learning techniques, namely Latent Dirichlet Allocation (LDA), Sentence-BERT (SBERT), and Term Frequency-Inverse Document Frequency (TF-IDF), was assessed by examining their precision, recall, F1 scores, and confusion matrices. LDA identifies underlying topics within the text, TF-IDF measures word importance, and SBERT measures semantic similarity between sentences. Based on the F1 scores and confusion matrices, TF-IDF and LDA categorized text data by time period effectively, while SBERT performed poorly across all time period categories.
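The evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the study's code: the time period labels, the example predictions, and the helper names `confusion_matrix` and `per_class_scores` are hypothetical and serve only to show how per-category precision, recall, and F1 follow from a confusion matrix.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix whose rows are true labels and columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_scores(matrix, labels):
    """Return {label: (precision, recall, f1)} computed from a confusion matrix."""
    scores = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]                          # correctly predicted items
        predicted = sum(row[i] for row in matrix)  # column total
        actual = sum(matrix[i])                    # row total
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical categories and predictions, for illustration only.
LABELS = ["medieval", "19th century", "20th century"]
y_true = ["medieval", "medieval", "19th century",
          "19th century", "20th century", "20th century"]
y_pred = ["medieval", "19th century", "19th century",
          "19th century", "20th century", "medieval"]

matrix = confusion_matrix(y_true, y_pred, LABELS)
scores = per_class_scores(matrix, LABELS)
for label in LABELS:
    p, r, f = scores[label]
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Reporting the scores per category rather than as a single average makes it visible when a technique does well on one time period and poorly on another, which is the kind of comparison the abstract describes.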
Title: | Time Period Categorization in Fiction: A Comparative Analysis of Machine Learning Techniques |
---|---|
Author / Contributor: | Westin, Fereshta |
Journal: | Cataloging & Classification Quarterly, 2024, pp. 1-30 |
Published: | 2024 |
Media type: | unknown |
ISSN: | 0163-9374 (print) ; 1544-4554 (online) |
DOI: | 10.1080/01639374.2024.2315548 |