feat: translations integration tests #210

Open: wants to merge 8 commits into base: main

remove `REV` from integration test task definitions

da526ed

stage-taskcluster / translations-dataset-opus-ELRC-3075-wikipedia_health_v1-ru-en succeeded Jan 10, 2025 in 6m 57s

Stage-TC (issue_comment)

Fetch opus dataset

Task Status

Started: 2025-01-10T21:26:57.423Z
Resolved: 2025-01-10T21:29:47.611Z
Task Execution Time: 2 minutes, 50 seconds, 188 milliseconds
Task Status: completed
Reason Resolved: completed
RunId: 0
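
The reported execution time can be cross-checked against the Started and Resolved timestamps above. This is a minimal sketch: the `elapsed` helper and the `FMT` constant are hypothetical names introduced for illustration, not part of Taskcluster.

```python
from datetime import datetime

# Timestamp layout used in the Task Status block (UTC, millisecond precision).
FMT = "%Y-%m-%dT%H:%M:%S.%fZ"


def elapsed(started: str, resolved: str):
    """Return the timedelta between two Taskcluster-style timestamps."""
    return datetime.strptime(resolved, FMT) - datetime.strptime(started, FMT)


delta = elapsed("2025-01-10T21:26:57.423Z", "2025-01-10T21:29:47.611Z")
print(delta)  # 0:02:50.188000
```

The result, 2 minutes 50.188 seconds, agrees with the "Task Execution Time" line.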

Artifacts

- public/build/ELRC-3075-wikipedia_health_v1.en.zst
- public/build/ELRC-3075-wikipedia_health_v1.ru.zst
- public/logs/live_backing.log
- public/logs/live.log


[taskcluster 2025-01-10T21:26:57.516Z] Worker Type (translations-1/b-linux-large-gcp-d2g) settings:
[taskcluster 2025-01-10T21:26:57.516Z]   {
[taskcluster 2025-01-10T21:26:57.516Z]     "config": {
[taskcluster 2025-01-10T21:26:57.516Z]       "deploymentId": ""
[taskcluster 2025-01-10T21:26:57.516Z]     },
[taskcluster 2025-01-10T21:26:57.516Z]     "generic-worker": {
[taskcluster 2025-01-10T21:26:57.516Z]       "config": {
[taskcluster 2025-01-10T21:26:57.516Z]         "headlessTasks": true,
[taskcluster 2025-01-10T21:26:57.516Z]         "runTasksAsCurrentUser": false
[taskcluster 2025-01-10T21:26:57.516Z]       },
[taskcluster 2025-01-10T21:26:57.516Z]       "engine": "multiuser",
[taskcluster 2025-01-10T21:26:57.516Z]       "go-arch": "amd64",
[taskcluster 2025-01-10T21:26:57.516Z]       "go-os": "linux",
[taskcluster 2025-01-10T21:26:57.516Z]       "go-version": "go1.23.4",
[taskcluster 2025-01-10T21:26:57.516Z]       "release": "https://github.com/taskcluster/taskcluster/releases/tag/v77.3.1",
[taskcluster 2025-01-10T21:26:57.516Z]       "revision": "959a204190add062fe1217d14f2a0115ecd43fe8",
[taskcluster 2025-01-10T21:26:57.516Z]       "source": "https://github.com/taskcluster/taskcluster/commits/959a204190add062fe1217d14f2a0115ecd43fe8",
[taskcluster 2025-01-10T21:26:57.516Z]       "version": "77.3.1"
[taskcluster 2025-01-10T21:26:57.516Z]     },
[taskcluster 2025-01-10T21:26:57.516Z]     "image": "projects/taskcluster-imaging/global/images/gw-fxci-gcp-l1-2404-amd64-headless-googlecompute-2025-01-09",

...(333 lines hidden)...

[task 2025-01-10T21:27:24.283Z] Downloading hanzidentifier-1.2.0-py3-none-any.whl (4.8 kB)
[task 2025-01-10T21:27:24.307Z] Downloading huggingface_hub-0.23.4-py3-none-any.whl (402 kB)
[task 2025-01-10T21:27:24.334Z] Downloading idna-3.7-py3-none-any.whl (66 kB)
[task 2025-01-10T21:27:24.354Z] Downloading jinja2-3.1.4-py3-none-any.whl (133 kB)
[task 2025-01-10T21:27:24.375Z] Downloading joblib-1.4.2-py3-none-any.whl (301 kB)
[task 2025-01-10T21:27:24.401Z] Downloading latexcodec-3.0.0-py3-none-any.whl (18 kB)
[task 2025-01-10T21:27:24.423Z] Downloading lxml-5.3.0-cp310-cp310-manylinux_2_28_x86_64.whl (5.0 MB)
[task 2025-01-10T21:27:24.477Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.0/5.0 MB 93.1 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.493Z] Downloading MarkupSafe-2.1.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)
[task 2025-01-10T21:27:24.513Z] Downloading mpmath-1.3.0-py3-none-any.whl (536 kB)
[task 2025-01-10T21:27:24.526Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 kB 27.3 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.542Z] Downloading mtdata-0.4.1-py3-none-any.whl (819 kB)
[task 2025-01-10T21:27:24.556Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 819.3/819.3 kB 48.9 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.570Z] Downloading networkx-3.3-py3-none-any.whl (1.7 MB)
[task 2025-01-10T21:27:24.591Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 81.8 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.605Z] Downloading numpy-1.26.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB)
[task 2025-01-10T21:27:24.699Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.2/18.2 MB 196.9 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.721Z] Downloading OpenCC-1.1.9-cp310-cp310-manylinux2014_x86_64.whl (1.7 MB)
[task 2025-01-10T21:27:24.737Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 99.1 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.753Z] Downloading packaging-24.1-py3-none-any.whl (53 kB)
[task 2025-01-10T21:27:24.774Z] Downloading portalocker-2.3.0-py2.py3-none-any.whl (15 kB)
[task 2025-01-10T21:27:24.795Z] Downloading prefixed-0.7.1-py2.py3-none-any.whl (13 kB)
[task 2025-01-10T21:27:24.815Z] Downloading psutil-6.0.0-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (290 kB)
[task 2025-01-10T21:27:24.838Z] Downloading pybtex-0.24.0-py2.py3-none-any.whl (561 kB)
[task 2025-01-10T21:27:24.850Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 561.4/561.4 kB 31.9 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.866Z] Downloading PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (705 kB)
[task 2025-01-10T21:27:24.881Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 705.5/705.5 kB 36.7 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.898Z] Downloading regex-2024.5.15-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (775 kB)
[task 2025-01-10T21:27:24.919Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 775.1/775.1 kB 28.2 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.933Z] Downloading requests-2.31.0-py3-none-any.whl (62 kB)
[task 2025-01-10T21:27:24.956Z] Downloading ruamel.yaml-0.18.6-py3-none-any.whl (117 kB)
[task 2025-01-10T21:27:24.978Z] Downloading ruamel.yaml.clib-0.2.8-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (526 kB)
[task 2025-01-10T21:27:24.990Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 526.7/526.7 kB 26.8 MB/s eta 0:00:00
[task 2025-01-10T21:27:25.005Z] Downloading sacrebleu-2.4.2-py3-none-any.whl (106 kB)
[task 2025-01-10T21:27:25.027Z] Downloading sacremoses-0.1.1-py3-none-any.whl (897 kB)
[task 2025-01-10T21:27:25.042Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 897.5/897.5 kB 54.4 MB/s eta 0:00:00
[task 2025-01-10T21:27:25.061Z] Downloading safetensors-0.4.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
[task 2025-01-10T21:27:25.077Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 69.4 MB/s eta 0:00:00
[task 2025-01-10T21:27:25.096Z] Downloading scikit_learn-1.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.3 MB)
[task 2025-01-10T21:27:25.182Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.3/13.3 MB 157.5 MB/s eta 0:00:00
[task 2025-01-10T21:27:25.198Z] Downloading scipy-1.14.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (41.1 MB)
[task 2025-01-10T21:27:25.579Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 41.1/41.1 MB 108.0 MB/s eta 0:00:00
[task 2025-01-10T21:27:25.596Z] Downloading sentencepiece-0.1.99-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
[task 2025-01-10T21:27:25.611Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 86.5 MB/s eta 0:00:00
[task 2025-01-10T21:27:25.627Z] Downloading simalign-0.4-py3-none-any.whl (8.1 kB)
[task 2025-01-10T21:27:25.648Z] Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
[task 2025-01-10T21:27:25.671Z] Downloading sympy-1.12.1-py3-none-any.whl (5.7 MB)
[task 2025-01-10T21:27:25.712Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.7/5.7 MB 140.2 MB/s eta 0:00:00
[task 2025-01-10T21:27:25.726Z] Downloading tabulate-0.9.0-py3-none-any.whl (35 kB)
[task 2025-01-10T21:27:25.747Z] Downloading threadpoolctl-3.5.0-py3-none-any.whl (18 kB)
[task 2025-01-10T21:27:25.770Z] Downloading tokenizers-0.19.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.6 MB)
[task 2025-01-10T21:27:25.801Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.6/3.6 MB 119.2 MB/s eta 0:00:00
[task 2025-01-10T21:27:25.819Z] Downloading torch-2.2.2-cp310-cp310-manylinux1_x86_64.whl (755.5 MB)
[task 2025-01-10T21:27:36.285Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 755.5/755.5 MB 32.4 MB/s eta 0:00:00
[task 2025-01-10T21:27:36.303Z] Downloading tqdm-4.66.4-py3-none-any.whl (78 kB)
[task 2025-01-10T21:27:36.326Z] Downloading transformers-4.42.3-py3-none-any.whl (9.3 MB)
[task 2025-01-10T21:27:36.390Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.3/9.3 MB 149.0 MB/s eta 0:00:00
[task 2025-01-10T21:27:36.404Z] Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
[task 2025-01-10T21:27:36.427Z] Downloading urllib3-2.2.2-py3-none-any.whl (121 kB)
[task 2025-01-10T21:27:36.448Z] Downloading wcwidth-0.2.13-py2.py3-none-any.whl (34 kB)
[task 2025-01-10T21:27:36.469Z] Downloading zhon-2.0.2-py3-none-any.whl (83 kB)
[task 2025-01-10T21:27:36.491Z] Downloading nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)
[task 2025-01-10T21:27:41.781Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.6/410.6 MB 56.9 MB/s eta 0:00:00
[task 2025-01-10T21:27:41.797Z] Downloading nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
[task 2025-01-10T21:27:41.879Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 173.7 MB/s eta 0:00:00
[task 2025-01-10T21:27:41.900Z] Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
[task 2025-01-10T21:27:42.061Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 148.2 MB/s eta 0:00:00
[task 2025-01-10T21:27:42.077Z] Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
[task 2025-01-10T21:27:42.092Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 823.6/823.6 kB 46.9 MB/s eta 0:00:00
[task 2025-01-10T21:27:42.115Z] Downloading nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl (731.7 MB)
[task 2025-01-10T21:27:52.056Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 731.7/731.7 MB 33.6 MB/s eta 0:00:00
[task 2025-01-10T21:27:52.074Z] Downloading nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)
[task 2025-01-10T21:27:53.550Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.6/121.6 MB 82.4 MB/s eta 0:00:00
[task 2025-01-10T21:27:53.566Z] Downloading nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)
[task 2025-01-10T21:27:54.201Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.5/56.5 MB 88.9 MB/s eta 0:00:00
[task 2025-01-10T21:27:54.219Z] Downloading nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)
[task 2025-01-10T21:27:55.876Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 MB 74.9 MB/s eta 0:00:00
[task 2025-01-10T21:27:55.894Z] Downloading nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)
[task 2025-01-10T21:28:00.197Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 196.0/196.0 MB 45.5 MB/s eta 0:00:00
[task 2025-01-10T21:28:00.214Z] Downloading nvidia_nccl_cu12-2.19.3-py3-none-manylinux1_x86_64.whl (166.0 MB)
[task 2025-01-10T21:28:04.312Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 166.0/166.0 MB 40.5 MB/s eta 0:00:00
[task 2025-01-10T21:28:04.327Z] Downloading nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)
[task 2025-01-10T21:28:05.164Z] Downloading triton-2.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (167.9 MB)
[task 2025-01-10T21:28:07.292Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 167.9/167.9 MB 78.9 MB/s eta 0:00:00
[task 2025-01-10T21:28:07.309Z] Downloading nvidia_nvjitlink_cu12-12.6.85-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (19.7 MB)
[task 2025-01-10T21:28:07.441Z]    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.7/19.7 MB 151.1 MB/s eta 0:00:00
[task 2025-01-10T21:28:15.190Z] Building wheels for collected packages: typo, opustrainer
[task 2025-01-10T21:28:15.191Z]   Building wheel for typo (setup.py): started
[task 2025-01-10T21:28:15.712Z]   Building wheel for typo (setup.py): finished with status 'done'
[task 2025-01-10T21:28:15.713Z]   Created wheel for typo: filename=typo-0.1.5-py3-none-any.whl size=6878 sha256=fe474edd625970d9406cf461df6203d1a5074f98cecc14df8ae6f5ea72eec07c
[task 2025-01-10T21:28:15.713Z]   Stored in directory: /builds/worker/.cache/pip/wheels/2e/2f/73/60e0ce42d1375a386b9171a37cd5536e173ad950a98e7dc6b1
[task 2025-01-10T21:28:15.719Z]   Building wheel for opustrainer (pyproject.toml): started
[task 2025-01-10T21:28:16.133Z]   Building wheel for opustrainer (pyproject.toml): finished with status 'done'
[task 2025-01-10T21:28:16.133Z]   Created wheel for opustrainer: filename=opustrainer-0.3-py3-none-any.whl size=45114 sha256=62c9e6c481bcc504bce8f27b6b628d3f525a46fa0b8351492c85930f5963a6a0
[task 2025-01-10T21:28:16.134Z]   Stored in directory: /builds/worker/.cache/pip/wheels/09/01/1c/d82c1698dc71385b0bf6eae4f433e85a02f89f99eec2e21072
[task 2025-01-10T21:28:16.136Z] Successfully built typo opustrainer
[task 2025-01-10T21:28:16.658Z] Installing collected packages: wcwidth, typo, sentencepiece, prefixed, portalocker, opencc, mpmath, zhon, urllib3, typing-extensions, tqdm, threadpoolctl, tabulate, sympy, six, safetensors, ruamel-yaml-clib, regex, pyyaml, psutil, packaging, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, numpy, networkx, markupsafe, lxml, latexcodec, joblib, idna, fsspec, filelock, colorama, click, charset-normalizer, certifi, triton, scipy, sacremoses, sacrebleu, ruamel-yaml, requests, pybtex, nvidia-cusparse-cu12, nvidia-cudnn-cu12, jinja2, hanzidentifier, blessed, scikit-learn, opustrainer, nvidia-cusolver-cu12, huggingface-hub, enlighten, torch, tokenizers, mtdata, transformers, simalign
[task 2025-01-10T21:29:19.974Z] Successfully installed blessed-1.20.0 certifi-2024.6.2 charset-normalizer-3.3.2 click-8.1.7 colorama-0.4.6 enlighten-1.10.1 filelock-3.15.4 fsspec-2024.6.1 hanzidentifier-1.2.0 huggingface-hub-0.23.4 idna-3.7 jinja2-3.1.4 joblib-1.4.2 latexcodec-3.0.0 lxml-5.3.0 markupsafe-2.1.5 mpmath-1.3.0 mtdata-0.4.1 networkx-3.3 numpy-1.26.4 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.19.3 nvidia-nvjitlink-cu12-12.6.85 nvidia-nvtx-cu12-12.1.105 opencc-1.1.9 opustrainer-0.3 packaging-24.1 portalocker-2.3.0 prefixed-0.7.1 psutil-6.0.0 pybtex-0.24.0 pyyaml-6.0.1 regex-2024.5.15 requests-2.31.0 ruamel-yaml-0.18.6 ruamel-yaml-clib-0.2.8 sacrebleu-2.4.2 sacremoses-0.1.1 safetensors-0.4.3 scikit-learn-1.5.0 scipy-1.14.0 sentencepiece-0.1.99 simalign-0.4 six-1.16.0 sympy-1.12.1 tabulate-0.9.0 threadpoolctl-3.5.0 tokenizers-0.19.1 torch-2.2.2 tqdm-4.66.4 transformers-4.42.3 triton-2.2.0 typing-extensions-4.12.2 typo-0.1.5 urllib3-2.2.2 wcwidth-0.2.13 zhon-2.0.2
[task 2025-01-10T21:29:25.995Z] Running with arguments: ['/builds/worker/checkouts/vcs/pipeline/data/dataset_importer.py', '--type', 'corpus', '--dataset', 'opus_ELRC-3075-wikipedia_health/v1', '--output_prefix', '/builds/worker/artifacts/ELRC-3075-wikipedia_health_v1', '--src', 'ru', '--trg', 'en']
[task 2025-01-10T21:29:25.995Z] Starting dataset import and augmentation.
[task 2025-01-10T21:29:25.995Z] Downloading parallel dataset
[task 2025-01-10T21:29:25.995Z] + set -euo pipefail
[task 2025-01-10T21:29:25.995Z] + [[ -z ru ]]
[task 2025-01-10T21:29:25.995Z] + [[ -z en ]]
[task 2025-01-10T21:29:25.995Z] + dataset=opus_ELRC-3075-wikipedia_health/v1
[task 2025-01-10T21:29:25.995Z] + output_prefix=/builds/worker/artifacts/ELRC-3075-wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] + echo '###### Downloading dataset opus_ELRC-3075-wikipedia_health/v1'
[task 2025-01-10T21:29:25.995Z] ###### Downloading dataset opus_ELRC-3075-wikipedia_health/v1
[task 2025-01-10T21:29:25.995Z] ++ dirname /builds/worker/checkouts/vcs/pipeline/data/download-corpus.sh
[task 2025-01-10T21:29:25.995Z] + cd /builds/worker/checkouts/vcs/pipeline/data
[task 2025-01-10T21:29:25.995Z] ++ dirname /builds/worker/artifacts/ELRC-3075-wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] + dir=/builds/worker/artifacts
[task 2025-01-10T21:29:25.995Z] + mkdir -p /builds/worker/artifacts
[task 2025-01-10T21:29:25.995Z] + name=ELRC-3075-wikipedia_health/v1
[task 2025-01-10T21:29:25.995Z] + type=opus
[task 2025-01-10T21:29:25.995Z] + [[ -f importers/corpus/opus.py ]]
[task 2025-01-10T21:29:25.995Z] + script='bash importers/corpus/opus.sh'
[task 2025-01-10T21:29:25.995Z] + bash importers/corpus/opus.sh ru en /builds/worker/artifacts/ELRC-3075-wikipedia_health_v1 ELRC-3075-wikipedia_health/v1
[task 2025-01-10T21:29:25.995Z] + set -euo pipefail
[task 2025-01-10T21:29:25.995Z] + echo '###### Downloading opus corpus'
[task 2025-01-10T21:29:25.995Z] ###### Downloading opus corpus
[task 2025-01-10T21:29:25.995Z] + src=ru
[task 2025-01-10T21:29:25.995Z] + trg=en
[task 2025-01-10T21:29:25.995Z] + output_prefix=/builds/worker/artifacts/ELRC-3075-wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] + dataset=ELRC-3075-wikipedia_health/v1
[task 2025-01-10T21:29:25.995Z] + WGET=wget
[task 2025-01-10T21:29:25.995Z] + name=ELRC-3075-wikipedia_health
[task 2025-01-10T21:29:25.995Z] + name_and_version=ELRC_3075_wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] ++ dirname /builds/worker/artifacts/ELRC-3075-wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] + tmp=/builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] + mkdir -p /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] + archive_path=/builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.txt.zip
[task 2025-01-10T21:29:25.995Z] + wget -O /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.txt.zip https://object.pouta.csc.fi/OPUS-ELRC-3075-wikipedia_health/v1/moses/ru-en.txt.zip
[task 2025-01-10T21:29:25.995Z] --2025-01-10 21:29:22--  https://object.pouta.csc.fi/OPUS-ELRC-3075-wikipedia_health/v1/moses/ru-en.txt.zip
[task 2025-01-10T21:29:25.995Z] Resolving object.pouta.csc.fi (object.pouta.csc.fi)... 86.50.254.18, 86.50.254.19
[task 2025-01-10T21:29:25.995Z] Connecting to object.pouta.csc.fi (object.pouta.csc.fi)|86.50.254.18|:443... connected.
[task 2025-01-10T21:29:25.995Z] HTTP request sent, awaiting response... 404 Not Found
[task 2025-01-10T21:29:25.995Z] 2025-01-10 21:29:24 ERROR 404: Not Found.
[task 2025-01-10T21:29:25.995Z] 
[task 2025-01-10T21:29:25.995Z] + wget -O /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.txt.zip https://object.pouta.csc.fi/OPUS-ELRC-3075-wikipedia_health/v1/moses/en-ru.txt.zip
[task 2025-01-10T21:29:25.995Z] --2025-01-10 21:29:24--  https://object.pouta.csc.fi/OPUS-ELRC-3075-wikipedia_health/v1/moses/en-ru.txt.zip
[task 2025-01-10T21:29:25.995Z] Resolving object.pouta.csc.fi (object.pouta.csc.fi)... 86.50.254.18, 86.50.254.19
[task 2025-01-10T21:29:25.995Z] Connecting to object.pouta.csc.fi (object.pouta.csc.fi)|86.50.254.18|:443... connected.
[task 2025-01-10T21:29:25.995Z] HTTP request sent, awaiting response... 200 OK
[task 2025-01-10T21:29:25.995Z] Length: 517126 (505K) [application/zip]
[task 2025-01-10T21:29:25.995Z] Saving to: ‘/builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.txt.zip’
[task 2025-01-10T21:29:25.995Z] 
[task 2025-01-10T21:29:25.995Z]      0K .......... .......... .......... .......... ..........  9%  211K 2s
[task 2025-01-10T21:29:25.995Z]     50K .......... .......... .......... .......... .......... 19% 1.43M 1s
[task 2025-01-10T21:29:25.995Z]    100K .......... .......... .......... .......... .......... 29%  492K 1s
[task 2025-01-10T21:29:25.995Z]    150K .......... .......... .......... .......... .......... 39% 1.44M 1s
[task 2025-01-10T21:29:25.995Z]    200K .......... .......... .......... .......... .......... 49%  492K 1s
[task 2025-01-10T21:29:25.995Z]    250K .......... .......... .......... .......... .......... 59%  251M 0s
[task 2025-01-10T21:29:25.995Z]    300K .......... .......... .......... .......... .......... 69%  324M 0s
[task 2025-01-10T21:29:25.995Z]    350K .......... .......... .......... .......... .......... 79%  287M 0s
[task 2025-01-10T21:29:25.995Z]    400K .......... .......... .......... .......... .......... 89% 1.46M 0s
[task 2025-01-10T21:29:25.995Z]    450K .......... .......... .......... .......... .......... 99%  492K 0s
[task 2025-01-10T21:29:25.995Z]    500K .....                                                 100% 9.32T=0.6s
[task 2025-01-10T21:29:25.995Z] 
[task 2025-01-10T21:29:25.995Z] 2025-01-10 21:29:25 (784 KB/s) - ‘/builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.txt.zip’ saved [517126/517126]
[task 2025-01-10T21:29:25.995Z] 
[task 2025-01-10T21:29:25.995Z] + unzip -o /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.txt.zip -d /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] Archive:  /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.txt.zip
[task 2025-01-10T21:29:25.995Z]   inflating: /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/README  
[task 2025-01-10T21:29:25.995Z]   inflating: /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/LICENSE  
[task 2025-01-10T21:29:25.995Z]   inflating: /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.en-ru.en  
[task 2025-01-10T21:29:25.995Z]   inflating: /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.en-ru.ru  
[task 2025-01-10T21:29:25.995Z]   inflating: /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.en-ru.xml  
[task 2025-01-10T21:29:25.995Z] + for lang in ${src} ${trg}
[task 2025-01-10T21:29:25.995Z] + zstdmt -c /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.ru-en.ru
[task 2025-01-10T21:29:25.995Z] zstd: can't stat /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.ru-en.ru : No such file or directory -- ignored 
[task 2025-01-10T21:29:25.995Z] + zstdmt -c /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.en-ru.ru
[task 2025-01-10T21:29:25.995Z] + for lang in ${src} ${trg}
[task 2025-01-10T21:29:25.995Z] + zstdmt -c /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.ru-en.en
[task 2025-01-10T21:29:25.995Z] zstd: can't stat /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.ru-en.en : No such file or directory -- ignored 
[task 2025-01-10T21:29:25.995Z] + zstdmt -c /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.en-ru.en
[task 2025-01-10T21:29:25.995Z] + rm -rf /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] + echo '###### Done: Downloading opus corpus'
[task 2025-01-10T21:29:25.995Z] ###### Done: Downloading opus corpus
[task 2025-01-10T21:29:25.995Z] + echo '###### Done: Downloading dataset opus_ELRC-3075-wikipedia_health/v1'
[task 2025-01-10T21:29:25.995Z] ###### Done: Downloading dataset opus_ELRC-3075-wikipedia_health/v1
[task 2025-01-10T21:29:25.995Z] 
[task 2025-01-10T21:29:25.995Z] Finished dataset import and augmentation.
+ exit_code=0
+ docker cp taskcontainer_PTf1mnu1QjeP1E4BDmbwNQ:/builds/worker/artifacts artifact0
+ docker rm taskcontainer_PTf1mnu1QjeP1E4BDmbwNQ
taskcontainer_PTf1mnu1QjeP1E4BDmbwNQ
+ exit 0
[taskcluster 2025-01-10T21:29:46.691Z]    Exit Code: 0
[taskcluster 2025-01-10T21:29:46.691Z]    User Time: 148.79ms
[taskcluster 2025-01-10T21:29:46.691Z]  Kernel Time: 699.737ms
[taskcluster 2025-01-10T21:29:46.691Z]    Wall Time: 2m39.899655443s
[taskcluster 2025-01-10T21:29:46.691Z]       Result: SUCCEEDED
[taskcluster 2025-01-10T21:29:46.691Z] === Task Finished ===
[taskcluster 2025-01-10T21:29:46.691Z] Task Duration: 2m39.900308033s
[taskcluster 2025-01-10T21:29:46.789Z] Uploading artifact public/build/ELRC-3075-wikipedia_health_v1.en.zst from file /home/task_173654422425677/artifact0/ELRC-3075-wikipedia_health_v1.en.zst with content encoding "identity", mime type "application/zstd" and expiry 2025-02-09T21:22:45.495Z
[taskcluster 2025-01-10T21:29:46.791Z] Uploading artifact public/build/ELRC-3075-wikipedia_health_v1.ru.zst from file /home/task_173654422425677/artifact0/ELRC-3075-wikipedia_health_v1.ru.zst with content encoding "identity", mime type "application/zstd" and expiry 2025-02-09T21:22:45.495Z
[taskcluster 2025-01-10T21:29:47.264Z] [mounts] Preserving cache: Moving "/home/task_173654422425677/cache0" to "caches/YZ-ynTR8TZGI6tX8I9Ze7g"
[taskcluster 2025-01-10T21:29:47.409Z] Uploading link artifact public/logs/live.log to artifact public/logs/live_backing.log with expiry 2025-02-09T21:22:45.495Z
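
The trace above suggests the importer tries the requested language order first (`ru-en`, which returned 404) and then the reverse (`en-ru`), and that it applies the same fallback when picking the extracted text files, which is where the ignored `zstd: can't stat` messages come from. The following is a rough sketch of that behaviour inferred from the log only. It is not the pipeline's `opus.sh`; the function names `archive_urls` and `find_extracted` are hypothetical, and the host and path layout are copied from the `wget` lines.

```python
from pathlib import Path

# Host taken from the wget lines in the log; treated here as an assumption.
OPUS_BASE = "https://object.pouta.csc.fi"


def archive_urls(dataset: str, src: str, trg: str) -> list[str]:
    """Return both candidate archive URLs, requested direction first."""
    return [
        f"{OPUS_BASE}/OPUS-{dataset}/moses/{a}-{b}.txt.zip"
        for a, b in ((src, trg), (trg, src))
    ]


def find_extracted(tmp_dir: Path, name: str, src: str, trg: str, lang: str) -> Path:
    """Return the extracted text file for `lang`, whichever direction exists."""
    for a, b in ((src, trg), (trg, src)):
        candidate = tmp_dir / f"{name}.{a}-{b}.{lang}"
        if candidate.is_file():
            return candidate
    # Unlike the traced script, fail loudly when neither direction is present.
    raise FileNotFoundError(f"no {lang} file for {name} in {tmp_dir}")
```

For this task, `archive_urls("ELRC-3075-wikipedia_health/v1", "ru", "en")` would yield the two URLs requested in the log, in the same order. Checking for the file before compressing it would also avoid the `can't stat` noise, though whether that is desirable for the real script is a judgment call for its maintainers.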