C4Corpus (CC BY-NC-SA part)
Technische Universität Darmstadt, 2017
Online
academicJournal
A large web corpus (over 10 billion tokens) in 50+ languages, licensed under the Creative Commons license family. It was extracted from CommonCrawl, the largest publicly available general Web crawl to date, with about 2 billion crawled URLs.
Title: | C4Corpus (CC BY-NC-SA part) |
---|---|
Author / Contributor: | Gurevych, Iryna ; Habernal, Ivan ; Zayed, Omnia |
Link: | |
Publication: | Technische Universität Darmstadt, 2017 |
Media type: | academicJournal |
Keywords: | |
Other: | |