Releases: huggingface/dataset-viewer

0.8.1

24 Sep 13:14
c2a78e7

Features:

0.8.0

24 Sep 11:49
13e5332

Breaking:

  • the endpoint /info is removed

Features:

  • the endpoint /infos returns the info for a given config, or for all the configs of a dataset if config is missing
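
As a rough illustration of the two ways to call /infos, the sketch below builds the request URL with and without a config. The base URL and the query parameter names dataset and config are assumptions borrowed from the /splits example in the 0.4.0 notes; infos_url is a hypothetical helper, not part of the project.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumption: same host as the /splits example in the 0.4.0 notes.
BASE_URL = "https://datasets-preview.huggingface.tech"


def infos_url(dataset: str, config: Optional[str] = None, base: str = BASE_URL) -> str:
    """Build a /infos URL; leaving out config asks for all configs of the dataset."""
    params = {"dataset": dataset}
    if config is not None:
        # Assumption: /infos uses the same "config" parameter name as /splits.
        params["config"] = config
    return f"{base}/infos?{urlencode(params)}"


print(infos_url("glue"))          # info for all the configs
print(infos_url("glue", "cola"))  # info for one config
```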

Fixes:

  • fix some function signatures
  • fix the type of exceptions raised in /splits and /rows

0.6.1

24 Sep 08:48

Fixes:

  • the dataset allenai/c4 is blocklisted so that it cannot block the app, which runs only one worker (see #17 (comment))

0.6.0

24 Sep 08:48
e83e21a

Breaking:

  • the format of the response of the endpoints /datasets, /configs, /splits and /rows has changed.
  • the behavior of the endpoints /splits and /rows has changed when config or split is missing.

Features:

  • in /splits: if config is missing, all the splits of all the configs of the dataset are returned
  • in /rows: if config is missing, all the rows of all the splits of all the configs of the dataset are returned
  • in /rows: if split is missing, all the rows of all the splits of the config are returned
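
The fallback rules above can be sketched as a small selection function. This is only an illustration of the described behavior, not the project's code: select_splits and the available mapping are hypothetical, and ignoring split when config is missing is an assumption read from the second bullet.

```python
from typing import Dict, List, Optional, Tuple


def select_splits(
    available: Dict[str, List[str]],
    config: Optional[str] = None,
    split: Optional[str] = None,
) -> List[Tuple[str, str]]:
    """Return the (config, split) pairs a request would cover.

    available maps each config name to its split names (hypothetical input).
    """
    if config is None:
        # Missing config: every split of every config.
        # Assumption: split is ignored in this case.
        return [(c, s) for c, splits in available.items() for s in splits]
    if config not in available:
        raise KeyError(f"unknown config: {config}")
    if split is None:
        # Missing split: every split of the given config.
        return [(config, s) for s in available[config]]
    if split not in available[config]:
        raise KeyError(f"unknown split: {split}")
    return [(config, split)]


available = {"cola": ["train", "test"], "sst2": ["train"]}
print(select_splits(available))                  # all configs, all splits
print(select_splits(available, "cola"))          # all splits of cola
print(select_splits(available, "cola", "test"))  # a single split
```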

Details: 0.5.0...0.6.0

0.5.0

24 Sep 08:47
df04ffb

Breaking:

  • minimum version of Python is now 3.9.6
  • the number of workers is now fixed at 1

Features:

  • add endpoints: /cache, /datasets
  • cache all the responses for /datasets, /info, /configs, /splits, /rows
  • environment variables can be set up in a .env file
  • rename environment variables: HOSTNAME to APP_HOSTNAME and PORT to APP_PORT
  • add environment variables: CACHE_SIZE_LIMIT, CACHE_TTL_SECONDS, DATASETS_ENABLE_PRIVATE, HF_TOKEN, LOG_LEVEL
  • add two targets for development: make coverage and make watch
  • prepare support for private datasets, but it's currently disabled (hardcoded in memorize)
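
To make the configuration changes concrete, here is a minimal sketch of reading the variables listed above from a .env style file or from the process environment. The variable names come from these notes; the default values, the value types, and the helpers parse_dotenv and load_settings are assumptions for illustration, not the project's implementation.

```python
import os
from typing import Any, Dict, Mapping


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Read the settings; every default below is a placeholder assumption."""
    private = env.get("DATASETS_ENABLE_PRIVATE", "false").lower()
    return {
        "hostname": env.get("APP_HOSTNAME", "localhost"),
        "port": int(env.get("APP_PORT", "8000")),
        "cache_size_limit": int(env.get("CACHE_SIZE_LIMIT", "1073741824")),
        "cache_ttl_seconds": int(env.get("CACHE_TTL_SECONDS", "604800")),
        "datasets_enable_private": private in {"1", "true", "yes"},
        "hf_token": env.get("HF_TOKEN"),  # None when unset
        "log_level": env.get("LOG_LEVEL", "INFO"),
    }


settings = load_settings(parse_dotenv("APP_PORT=9000\nLOG_LEVEL=DEBUG\n"))
print(settings["port"], settings["log_level"])
```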

CI:

  • check the types with mypy
  • ignore safety alert about tensorboard 2.6.0
  • set up code coverage with codecov and pytest-cov

Refactor:

  • refactor the benchmark to use the API instead of accessing the functions directly
  • use logging to manage the logs

Details: 0.4.6...0.5.0

0.4.6

23 Sep 10:04

Features:

  • upgrade datasets to get the pathlib fix

Fixes:

  • return the adequate status code in case of error

CI:

  • add unit tests to the CI

Details: 0.4.5...0.4.6

0.4.5

31 Aug 15:00

Upgrades datasets to fix an error in file format detection.

0.4.2

30 Aug 14:04

Downgrade black to fix an error when launching make quality

0.4.1

30 Aug 13:31
  • upgrade datasets to support streaming on more datasets
  • add quality checks (flake8, bandit, safety)

0.4.0

26 Aug 14:43

The body of 4xx errors is now JSON. For example, https://datasets-preview.huggingface.tech/splits?dataset=glue&config=NOSUCHCONFIG returns:

{
    "status_code": 404,
    "exception": "Status404Error",
    "message": "The dataset config could not be found.",
    "cause": "ValueError",
    "cause_message": "BuilderConfig NOSUCHCONFIG not found. Available: ['cola', 'sst2', 'mrpc', 'qqp', 'stsb', 'mnli', 'mnli_mismatched', 'mnli_matched', 'qnli', 'rte', 'wnli', 'ax']"
}
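
A client could read this error body along the following lines. This is a minimal sketch that assumes the five fields shown above; treating cause and cause_message as optional is an assumption, and ApiError and parse_error_body are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    status_code: int
    exception: str
    message: str
    # Assumption: the cause fields may be absent for some errors.
    cause: Optional[str] = None
    cause_message: Optional[str] = None


def parse_error_body(body: str) -> ApiError:
    """Turn the JSON body of a 4xx response into an ApiError."""
    data = json.loads(body)
    return ApiError(
        status_code=int(data["status_code"]),
        exception=data["exception"],
        message=data["message"],
        cause=data.get("cause"),
        cause_message=data.get("cause_message"),
    )


body = """{
    "status_code": 404,
    "exception": "Status404Error",
    "message": "The dataset config could not be found.",
    "cause": "ValueError",
    "cause_message": "BuilderConfig NOSUCHCONFIG not found."
}"""
error = parse_error_body(body)
print(error.status_code, error.message)
```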