Seeded-BTM: Enabling Biterm Topic Model with Seeds for Product Aspect Mining
In: 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 2019-08-01
One of the most challenging problems in aspect-level analysis of product reviews is aspect mining (or aspect extraction), which aims to extract and categorize the terms that describe aspects of a product. Because topic models such as LDA can perform extraction and clustering in one step, most recent unsupervised and semi-supervised statistical models are LDA-based. These models often treat each short sentence from a review as the training unit. However, LDA was designed for documents of normal length and does not account for the sparse word frequencies of short texts, so it performs poorly on them. Instead of using LDA, this paper proposes a Seeded Biterm Topic Model (Seeded-BTM) that models co-occurring word pairs (i.e., biterms) rather than sentences or whole reviews. By using seed sets, the model enables the otherwise unsupervised BTM to discover aspect topics under user guidance. Experimental results on real-world product reviews from a number of domains show that Seeded-BTM finds product aspects that conform more closely to human judgment and outperforms state-of-the-art models.
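To make the notion of a biterm concrete, the following minimal sketch enumerates the unordered word pairs that co-occur in each short sentence. It is an illustration only, not the authors' implementation: the function name `extract_biterms`, the toy sentences, and the choice of the whole sentence as the co-occurrence window are assumptions, and the seeding step of Seeded-BTM is not shown.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered pair of tokens that co-occur in one short text.

    Assumption: the whole tokenized sentence is the co-occurrence window.
    Each pair is sorted so that (a, b) and (b, a) count as the same biterm.
    """
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


# Hypothetical, already tokenized review sentences.
sentences = [
    ["battery", "life", "great"],
    ["screen", "bright"],
]

# Corpus-level biterm counts: the model is fit on these pairs
# rather than on per-sentence word counts.
biterm_counts = Counter(b for s in sentences for b in extract_biterms(s))
```

A sentence with n tokens yields n(n-1)/2 biterms, so pooling biterms across the corpus gives a denser co-occurrence signal than the word counts of any single short sentence.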
Title: | Seeded-BTM: Enabling Biterm Topic Model with Seeds for Product Aspect Mining |
---|---|
Author / Contributor: | Li, Ning ; Chow, Chi-Yin ; Zhang, Jia-Dong |
Published in: | 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 2019-08-01 |
Publication: | IEEE, 2019 |
Media type: | unknown |
DOI: | 10.1109/hpcc/smartcity/dss.2019.00386 |