Fuzzy Cross Language Plagiarism Detection (Arabic-English) using WordNet in a Big Data environment

Oukessou, Mohamed ; Ezzikouri, Hanane ; et al.

In: Proceedings of the 2018 2nd International Conference on Cloud and Big Data Computing, 2018-08-03

Cross-language plagiarism is the unacknowledged reuse of a text through translation from one natural language to another without proper reference to the original source. A common problem in data processing is efficient large-scale text comparison, especially semantic similarity, given the growing number of publications and of suspicious documents that may contain plagiarism. Cross-Language Plagiarism Detection (CLPD) can be more complicated than simple copy, translate and paste, so the detection process calls for vague concepts and fuzzy set techniques in a big data environment to reveal dishonest practices in Arabic documents. In this paper, we propose a new Cross-Language Plagiarism Detection method based on fuzzy-semantic similarity using WordNet and two semantic approaches, Wu & Palmer (WuP) and Lin. The work is done in parallel using Apache Hadoop with its distributed file system HDFS and the MapReduce programming model. The experimental results show that Fuzzy Wu & Palmer performs better than Fuzzy Lin.
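
As a rough illustration of the kind of measure the abstract names, the following Python sketch computes Wu & Palmer similarity, 2 * depth(LCS) / (depth(a) + depth(b)), and combines word scores into a sentence-level score. It is not the authors' implementation: the tiny `PARENT` taxonomy is a hypothetical stand-in for WordNet, the words are assumed to be already translated into English, and the aggregation in `fuzzy_similarity` (average of each word's best match) is an assumed simplification, since the abstract does not give the fuzzy formula.

```python
# Toy is-a taxonomy (child -> parent); a hypothetical stand-in for WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "artifact": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "artifact",
}


def path_to_root(node):
    """Return the chain from node up to the root (node first, root last)."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def wup(a, b):
    """Wu & Palmer similarity: 2 * depth(LCS) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # Walking up from b, the first node shared with a is the deepest
    # common subsumer (LCS) in a tree-shaped taxonomy.
    lcs = next(n for n in path_to_root(b) if n in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def fuzzy_similarity(suspicious, source):
    """Assumed aggregation: mean of each suspicious word's best match."""
    if not suspicious or not source:
        return 0.0
    best = [max(wup(w, s) for s in source) for w in suspicious]
    return sum(best) / len(best)


if __name__ == "__main__":
    print(wup("dog", "cat"))
    print(fuzzy_similarity(["dog", "car"], ["cat", "car"]))
```

In a Hadoop setting, a function like `fuzzy_similarity` would plausibly run inside a map task over sentence pairs, with a reducer collecting scores per document; that division of work is likewise an assumption here.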

Title:	Fuzzy Cross Language Plagiarism Detection (Arabic-English) using WordNet in a Big Data environment
Author / Contributor:	Oukessou, Mohamed ; Ezzikouri, Hanane ; Youness, Madani ; Erritali, Mohamed
Link:	View record in OpenAIRE (full text) https://doi.org/10.1145/3264560.3264562
Journal:	Proceedings of the 2018 2nd International Conference on Cloud and Big Data Computing, 2018-08-03
Publication:	ACM, 2018
Media type:	unknown
DOI:	10.1145/3264560.3264562
Keywords:	Computer science ; Big data ; Fuzzy set ; WordNet ; Engineering and technology ; Fuzzy logic ; Semantic similarity ; Information systems ; Electrical engineering, electronic engineering, information engineering ; Programming paradigm ; Artificial intelligence & image processing ; Plagiarism detection ; Artificial intelligence ; Natural language ; Natural language processing
Other:	Indexed in: OpenAIRE ; Rights: CLOSED
