Extraction of Professional Details from Web-URLs using DeepDive.

Vyas, Aditya ; Kadakia, Urmil ; et al.

In: Procedia Computer Science, vol. 132 (2018-04-01), pp. 1602-1610

Online academic journal

Manually extracting data from unstructured sources such as websites is labour-intensive and becomes almost infeasible at large scale. Recent state-of-the-art information extraction techniques show encouraging results. In this work, we attempt to extract professional details such as name, email, address, contact number, and specialization from the home pages of doctors. The work covers two scenarios in which websites hold these details. In the first, a website contains the details of a single doctor. In the second, a website contains the details of multiple doctors or professionals at the same time. We treat the problem as a relation extraction task within information extraction. The proposed solution is built on top of DeepDive, a tool developed at Stanford. In both scenarios, DeepDive takes pre-processed sentences as input and constructs entity relations. For each entity relation, DeepDive computes the probability that the relationship is a correct match, using distant supervision and user-defined heuristic rules. In experiment 1, our system achieves 69.14% accuracy for name, 88.67% for location, and 100% for email, number, and specialization. In experiment 2, the observed probabilities are not very significant, mostly around 0.5-0.7, but we present some possible solutions as future work. The techniques presented here can readily be extended to other types of professionals, not just doctors. [ABSTRACT FROM AUTHOR]
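
The abstract mentions distant supervision combined with user-defined heuristic rules. The following is a minimal Python sketch of that general idea only; it does not use DeepDive's own API and is not the authors' implementation. The knowledge base `KNOWN_PAIRS`, the regular expressions, the helper names `candidates` and `supervise`, and the specific rules are all assumptions made for illustration.

```python
import re

# Hypothetical knowledge base of known (name, email) pairs (an assumption,
# standing in for whatever external data supplies the distant labels).
KNOWN_PAIRS = {("Jane Roe", "jane.roe@example.org")}

# Deliberately simple illustrative patterns, not production-grade extractors.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
NAME_RE = re.compile(r"Dr\.\s+([A-Z][a-z]+(?:\s+[A-Z][a-z]+)+)")

GENERIC_MAILBOXES = {"info", "admin", "contact"}


def candidates(sentence):
    """Pair every name mention with every email mention in one sentence."""
    names = NAME_RE.findall(sentence)
    emails = EMAIL_RE.findall(sentence)
    return [(name, email) for name in names for email in emails]


def supervise(name, email):
    """Return +1 (likely true), -1 (likely false), or 0 (left unlabeled)."""
    # Distant supervision: the pair is already present in the knowledge base.
    if (name, email) in KNOWN_PAIRS:
        return 1
    surname = name.split()[-1].lower()
    local_part = email.split("@")[0].lower()
    # Heuristic rule: the surname appears in the mailbox name.
    if surname in local_part:
        return 1
    # Heuristic rule: generic mailboxes rarely belong to one person.
    if local_part in GENERIC_MAILBOXES:
        return -1
    return 0


sentence = "Dr. Jane Roe can be reached at jane.roe@example.org or info@clinic.example."
labels = {pair: supervise(*pair) for pair in candidates(sentence)}
```

In a system of the kind the abstract describes, labels like these would serve only as noisy training evidence; the probability that each candidate relation is correct comes from a separate statistical inference step, which this sketch leaves out.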

Copyright of Procedia Computer Science is the property of Elsevier B.V. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)

Title:	Extraction of Professional Details from Web-URLs using DeepDive.
Author / Contributor:	Vyas, Aditya ; Kadakia, Urmil ; Jat, Pokhar Mal
Journal:	Procedia Computer Science, vol. 132 (2018-04-01), pp. 1602-1610
Published:	2018
Media type:	academic journal
ISSN:	1877-0509 (print)
DOI:	10.1016/j.procs.2018.05.125
Subject terms:	Uniform Resource Locators; data extraction; websites; data mining; email
Other:	Indexed in: Supplemental Index; Language: English
