This repository is under development.
A text file containing a large Arabic corpus. It is inspired by Peter Norvig's big.txt: his file covers English, and arabic_big.txt is the counterpart for the Arabic language.
- islambeacon - The full Quranic script.
- LABR - Large Scale Arabic Book Reviews Dataset.
- SaudiNewsNet - A set of Arabic newspaper articles, along with metadata, extracted from various Saudi newspapers.
- Arabic-Wikipedia-Corpus
- akec - Arabic Keyphrase Extraction Corpus.
- El-Haj list - A list of several academic papers compiled by Dr. El-Haj.
More sources to be added.
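Since the file is meant to play the same role as Norvig's big.txt, one possible use is building a word-frequency table. The following is a minimal sketch, not part of this repository: the function names `arabic_words`, `word_counts`, and `load_counts` are hypothetical, and it assumes the file is UTF-8 encoded and located in the working directory. Diacritics and tatweel are stripped first so that vocalized and unvocalized spellings of a word are counted together.

```python
import re
from collections import Counter

# Short-vowel marks and related diacritics (U+064B to U+0652) plus tatweel (U+0640).
DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")
# Runs of basic Arabic letters (U+0621 to U+064A); digits and punctuation are skipped.
ARABIC_WORD = re.compile(r"[\u0621-\u064A]+")


def arabic_words(text):
    """Return the Arabic words in text, with diacritics and tatweel removed."""
    return ARABIC_WORD.findall(DIACRITICS.sub("", text))


def word_counts(text):
    """Count how often each Arabic word occurs in text."""
    return Counter(arabic_words(text))


def load_counts(path="arabic_big.txt"):
    """Read the corpus file (assumed UTF-8) and return its word counts."""
    with open(path, encoding="utf-8") as f:
        return word_counts(f.read())
```

For example, `load_counts().most_common(10)` would list the ten most frequent words in the corpus.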
Contributions that enlarge or improve the content are welcome. If you're interested, feel free to send a pull request or get in touch.