diff --git a/README.md b/README.md
index 28d1b1c9..81d18d8d 100644
--- a/README.md
+++ b/README.md
@@ -105,7 +105,7 @@ Options:
 - `--force_ocr`: Force OCR processing on the entire document, even for pages that might contain extractable text.
 - `--processors TEXT`: Override the default processors by providing their full module paths, separated by commas. Example: `--processors "module1.processor1,module2.processor2"`
 - `--config_json PATH`: Path to a JSON configuration file containing additional settings.
-- `--languages TEXT`: Optionally specify which languages to use for OCR processing. Accepts a comma-separated list. Example: `--languages "eng,fra,deu"` for English, French, and German.
+- `--languages TEXT`: Optionally specify which languages to use for OCR processing. Accepts a comma-separated list. Example: `--languages "en,fr,de"` for English, French, and German.
 - `config --help`: List all available builders, processors, and converters, and their associated configuration. These values can be used to build a JSON configuration file for additional tweaking of marker defaults.
 
 The list of supported languages for surya OCR is [here](https://github.com/VikParuchuri/surya/blob/master/surya/languages.py). If you don't need OCR, marker can work with any language.
@@ -368,4 +368,4 @@ This work would not have been possible without amazing open source models and da
 - Pypdfium2/pdfium
 - DocLayNet from IBM
 
-Thank you to the authors of these models and datasets for making them available to the community!
\ No newline at end of file
+Thank you to the authors of these models and datasets for making them available to the community!