feat: translations integration tests #210
Open
stage-taskcluster / translations-dataset-opus-ELRC-3075-wikipedia_health_v1-ru-en
succeeded
Jan 10, 2025 in 6m 57s
Stage-TC (issue_comment)
Fetch opus dataset
Task Status
Started: 2025-01-10T21:26:57.423Z
Resolved: 2025-01-10T21:29:47.611Z
Task Execution Time: 2 minutes, 50 seconds, 188 milliseconds
Task Status: completed
Reason Resolved: completed
RunId: 0
Artifacts
- public/build/ELRC-3075-wikipedia_health_v1.en.zst
- public/build/ELRC-3075-wikipedia_health_v1.ru.zst
- public/logs/live_backing.log
- public/logs/live.log
[taskcluster 2025-01-10T21:26:57.516Z] Worker Type (translations-1/b-linux-large-gcp-d2g) settings:
[taskcluster 2025-01-10T21:26:57.516Z] {
[taskcluster 2025-01-10T21:26:57.516Z] "config": {
[taskcluster 2025-01-10T21:26:57.516Z] "deploymentId": ""
[taskcluster 2025-01-10T21:26:57.516Z] },
[taskcluster 2025-01-10T21:26:57.516Z] "generic-worker": {
[taskcluster 2025-01-10T21:26:57.516Z] "config": {
[taskcluster 2025-01-10T21:26:57.516Z] "headlessTasks": true,
[taskcluster 2025-01-10T21:26:57.516Z] "runTasksAsCurrentUser": false
[taskcluster 2025-01-10T21:26:57.516Z] },
[taskcluster 2025-01-10T21:26:57.516Z] "engine": "multiuser",
[taskcluster 2025-01-10T21:26:57.516Z] "go-arch": "amd64",
[taskcluster 2025-01-10T21:26:57.516Z] "go-os": "linux",
[taskcluster 2025-01-10T21:26:57.516Z] "go-version": "go1.23.4",
[taskcluster 2025-01-10T21:26:57.516Z] "release": "https://github.com/taskcluster/taskcluster/releases/tag/v77.3.1",
[taskcluster 2025-01-10T21:26:57.516Z] "revision": "959a204190add062fe1217d14f2a0115ecd43fe8",
[taskcluster 2025-01-10T21:26:57.516Z] "source": "https://github.com/taskcluster/taskcluster/commits/959a204190add062fe1217d14f2a0115ecd43fe8",
[taskcluster 2025-01-10T21:26:57.516Z] "version": "77.3.1"
[taskcluster 2025-01-10T21:26:57.516Z] },
[taskcluster 2025-01-10T21:26:57.516Z] "image": "projects/taskcluster-imaging/global/images/gw-fxci-gcp-l1-2404-amd64-headless-googlecompute-2025-01-09",
...(333 lines hidden)...
[task 2025-01-10T21:27:24.283Z] Downloading hanzidentifier-1.2.0-py3-none-any.whl (4.8 kB)
[task 2025-01-10T21:27:24.307Z] Downloading huggingface_hub-0.23.4-py3-none-any.whl (402 kB)
[task 2025-01-10T21:27:24.334Z] Downloading idna-3.7-py3-none-any.whl (66 kB)
[task 2025-01-10T21:27:24.354Z] Downloading jinja2-3.1.4-py3-none-any.whl (133 kB)
[task 2025-01-10T21:27:24.375Z] Downloading joblib-1.4.2-py3-none-any.whl (301 kB)
[task 2025-01-10T21:27:24.401Z] Downloading latexcodec-3.0.0-py3-none-any.whl (18 kB)
[task 2025-01-10T21:27:24.423Z] Downloading lxml-5.3.0-cp310-cp310-manylinux_2_28_x86_64.whl (5.0 MB)
[task 2025-01-10T21:27:24.477Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.0/5.0 MB 93.1 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.493Z] Downloading MarkupSafe-2.1.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)
[task 2025-01-10T21:27:24.513Z] Downloading mpmath-1.3.0-py3-none-any.whl (536 kB)
[task 2025-01-10T21:27:24.526Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 kB 27.3 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.542Z] Downloading mtdata-0.4.1-py3-none-any.whl (819 kB)
[task 2025-01-10T21:27:24.556Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 819.3/819.3 kB 48.9 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.570Z] Downloading networkx-3.3-py3-none-any.whl (1.7 MB)
[task 2025-01-10T21:27:24.591Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 81.8 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.605Z] Downloading numpy-1.26.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB)
[task 2025-01-10T21:27:24.699Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.2/18.2 MB 196.9 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.721Z] Downloading OpenCC-1.1.9-cp310-cp310-manylinux2014_x86_64.whl (1.7 MB)
[task 2025-01-10T21:27:24.737Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 99.1 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.753Z] Downloading packaging-24.1-py3-none-any.whl (53 kB)
[task 2025-01-10T21:27:24.774Z] Downloading portalocker-2.3.0-py2.py3-none-any.whl (15 kB)
[task 2025-01-10T21:27:24.795Z] Downloading prefixed-0.7.1-py2.py3-none-any.whl (13 kB)
[task 2025-01-10T21:27:24.815Z] Downloading psutil-6.0.0-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (290 kB)
[task 2025-01-10T21:27:24.838Z] Downloading pybtex-0.24.0-py2.py3-none-any.whl (561 kB)
[task 2025-01-10T21:27:24.850Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 561.4/561.4 kB 31.9 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.866Z] Downloading PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (705 kB)
[task 2025-01-10T21:27:24.881Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 705.5/705.5 kB 36.7 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.898Z] Downloading regex-2024.5.15-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (775 kB)
[task 2025-01-10T21:27:24.919Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 775.1/775.1 kB 28.2 MB/s eta 0:00:00
[task 2025-01-10T21:27:24.933Z] Downloading requests-2.31.0-py3-none-any.whl (62 kB)
[task 2025-01-10T21:27:24.956Z] Downloading ruamel.yaml-0.18.6-py3-none-any.whl (117 kB)
[task 2025-01-10T21:27:24.978Z] Downloading ruamel.yaml.clib-0.2.8-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (526 kB)
[task 2025-01-10T21:27:24.990Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 526.7/526.7 kB 26.8 MB/s eta 0:00:00
[task 2025-01-10T21:27:25.005Z] Downloading sacrebleu-2.4.2-py3-none-any.whl (106 kB)
[task 2025-01-10T21:27:25.027Z] Downloading sacremoses-0.1.1-py3-none-any.whl (897 kB)
[task 2025-01-10T21:27:25.042Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 897.5/897.5 kB 54.4 MB/s eta 0:00:00
[task 2025-01-10T21:27:25.061Z] Downloading safetensors-0.4.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
[task 2025-01-10T21:27:25.077Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 69.4 MB/s eta 0:00:00
[task 2025-01-10T21:27:25.096Z] Downloading scikit_learn-1.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.3 MB)
[task 2025-01-10T21:27:25.182Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.3/13.3 MB 157.5 MB/s eta 0:00:00
[task 2025-01-10T21:27:25.198Z] Downloading scipy-1.14.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (41.1 MB)
[task 2025-01-10T21:27:25.579Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 41.1/41.1 MB 108.0 MB/s eta 0:00:00
[task 2025-01-10T21:27:25.596Z] Downloading sentencepiece-0.1.99-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
[task 2025-01-10T21:27:25.611Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 86.5 MB/s eta 0:00:00
[task 2025-01-10T21:27:25.627Z] Downloading simalign-0.4-py3-none-any.whl (8.1 kB)
[task 2025-01-10T21:27:25.648Z] Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
[task 2025-01-10T21:27:25.671Z] Downloading sympy-1.12.1-py3-none-any.whl (5.7 MB)
[task 2025-01-10T21:27:25.712Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.7/5.7 MB 140.2 MB/s eta 0:00:00
[task 2025-01-10T21:27:25.726Z] Downloading tabulate-0.9.0-py3-none-any.whl (35 kB)
[task 2025-01-10T21:27:25.747Z] Downloading threadpoolctl-3.5.0-py3-none-any.whl (18 kB)
[task 2025-01-10T21:27:25.770Z] Downloading tokenizers-0.19.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.6 MB)
[task 2025-01-10T21:27:25.801Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.6/3.6 MB 119.2 MB/s eta 0:00:00
[task 2025-01-10T21:27:25.819Z] Downloading torch-2.2.2-cp310-cp310-manylinux1_x86_64.whl (755.5 MB)
[task 2025-01-10T21:27:36.285Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 755.5/755.5 MB 32.4 MB/s eta 0:00:00
[task 2025-01-10T21:27:36.303Z] Downloading tqdm-4.66.4-py3-none-any.whl (78 kB)
[task 2025-01-10T21:27:36.326Z] Downloading transformers-4.42.3-py3-none-any.whl (9.3 MB)
[task 2025-01-10T21:27:36.390Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.3/9.3 MB 149.0 MB/s eta 0:00:00
[task 2025-01-10T21:27:36.404Z] Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
[task 2025-01-10T21:27:36.427Z] Downloading urllib3-2.2.2-py3-none-any.whl (121 kB)
[task 2025-01-10T21:27:36.448Z] Downloading wcwidth-0.2.13-py2.py3-none-any.whl (34 kB)
[task 2025-01-10T21:27:36.469Z] Downloading zhon-2.0.2-py3-none-any.whl (83 kB)
[task 2025-01-10T21:27:36.491Z] Downloading nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)
[task 2025-01-10T21:27:41.781Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.6/410.6 MB 56.9 MB/s eta 0:00:00
[task 2025-01-10T21:27:41.797Z] Downloading nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
[task 2025-01-10T21:27:41.879Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 173.7 MB/s eta 0:00:00
[task 2025-01-10T21:27:41.900Z] Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
[task 2025-01-10T21:27:42.061Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 148.2 MB/s eta 0:00:00
[task 2025-01-10T21:27:42.077Z] Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
[task 2025-01-10T21:27:42.092Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 823.6/823.6 kB 46.9 MB/s eta 0:00:00
[task 2025-01-10T21:27:42.115Z] Downloading nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl (731.7 MB)
[task 2025-01-10T21:27:52.056Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 731.7/731.7 MB 33.6 MB/s eta 0:00:00
[task 2025-01-10T21:27:52.074Z] Downloading nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)
[task 2025-01-10T21:27:53.550Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.6/121.6 MB 82.4 MB/s eta 0:00:00
[task 2025-01-10T21:27:53.566Z] Downloading nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)
[task 2025-01-10T21:27:54.201Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.5/56.5 MB 88.9 MB/s eta 0:00:00
[task 2025-01-10T21:27:54.219Z] Downloading nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)
[task 2025-01-10T21:27:55.876Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 MB 74.9 MB/s eta 0:00:00
[task 2025-01-10T21:27:55.894Z] Downloading nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)
[task 2025-01-10T21:28:00.197Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 196.0/196.0 MB 45.5 MB/s eta 0:00:00
[task 2025-01-10T21:28:00.214Z] Downloading nvidia_nccl_cu12-2.19.3-py3-none-manylinux1_x86_64.whl (166.0 MB)
[task 2025-01-10T21:28:04.312Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 166.0/166.0 MB 40.5 MB/s eta 0:00:00
[task 2025-01-10T21:28:04.327Z] Downloading nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)
[task 2025-01-10T21:28:05.164Z] Downloading triton-2.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (167.9 MB)
[task 2025-01-10T21:28:07.292Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 167.9/167.9 MB 78.9 MB/s eta 0:00:00
[task 2025-01-10T21:28:07.309Z] Downloading nvidia_nvjitlink_cu12-12.6.85-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (19.7 MB)
[task 2025-01-10T21:28:07.441Z] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.7/19.7 MB 151.1 MB/s eta 0:00:00
[task 2025-01-10T21:28:15.190Z] Building wheels for collected packages: typo, opustrainer
[task 2025-01-10T21:28:15.191Z] Building wheel for typo (setup.py): started
[task 2025-01-10T21:28:15.712Z] Building wheel for typo (setup.py): finished with status 'done'
[task 2025-01-10T21:28:15.713Z] Created wheel for typo: filename=typo-0.1.5-py3-none-any.whl size=6878 sha256=fe474edd625970d9406cf461df6203d1a5074f98cecc14df8ae6f5ea72eec07c
[task 2025-01-10T21:28:15.713Z] Stored in directory: /builds/worker/.cache/pip/wheels/2e/2f/73/60e0ce42d1375a386b9171a37cd5536e173ad950a98e7dc6b1
[task 2025-01-10T21:28:15.719Z] Building wheel for opustrainer (pyproject.toml): started
[task 2025-01-10T21:28:16.133Z] Building wheel for opustrainer (pyproject.toml): finished with status 'done'
[task 2025-01-10T21:28:16.133Z] Created wheel for opustrainer: filename=opustrainer-0.3-py3-none-any.whl size=45114 sha256=62c9e6c481bcc504bce8f27b6b628d3f525a46fa0b8351492c85930f5963a6a0
[task 2025-01-10T21:28:16.134Z] Stored in directory: /builds/worker/.cache/pip/wheels/09/01/1c/d82c1698dc71385b0bf6eae4f433e85a02f89f99eec2e21072
[task 2025-01-10T21:28:16.136Z] Successfully built typo opustrainer
[task 2025-01-10T21:28:16.658Z] Installing collected packages: wcwidth, typo, sentencepiece, prefixed, portalocker, opencc, mpmath, zhon, urllib3, typing-extensions, tqdm, threadpoolctl, tabulate, sympy, six, safetensors, ruamel-yaml-clib, regex, pyyaml, psutil, packaging, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, numpy, networkx, markupsafe, lxml, latexcodec, joblib, idna, fsspec, filelock, colorama, click, charset-normalizer, certifi, triton, scipy, sacremoses, sacrebleu, ruamel-yaml, requests, pybtex, nvidia-cusparse-cu12, nvidia-cudnn-cu12, jinja2, hanzidentifier, blessed, scikit-learn, opustrainer, nvidia-cusolver-cu12, huggingface-hub, enlighten, torch, tokenizers, mtdata, transformers, simalign
[task 2025-01-10T21:29:19.974Z] Successfully installed blessed-1.20.0 certifi-2024.6.2 charset-normalizer-3.3.2 click-8.1.7 colorama-0.4.6 enlighten-1.10.1 filelock-3.15.4 fsspec-2024.6.1 hanzidentifier-1.2.0 huggingface-hub-0.23.4 idna-3.7 jinja2-3.1.4 joblib-1.4.2 latexcodec-3.0.0 lxml-5.3.0 markupsafe-2.1.5 mpmath-1.3.0 mtdata-0.4.1 networkx-3.3 numpy-1.26.4 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.19.3 nvidia-nvjitlink-cu12-12.6.85 nvidia-nvtx-cu12-12.1.105 opencc-1.1.9 opustrainer-0.3 packaging-24.1 portalocker-2.3.0 prefixed-0.7.1 psutil-6.0.0 pybtex-0.24.0 pyyaml-6.0.1 regex-2024.5.15 requests-2.31.0 ruamel-yaml-0.18.6 ruamel-yaml-clib-0.2.8 sacrebleu-2.4.2 sacremoses-0.1.1 safetensors-0.4.3 scikit-learn-1.5.0 scipy-1.14.0 sentencepiece-0.1.99 simalign-0.4 six-1.16.0 sympy-1.12.1 tabulate-0.9.0 threadpoolctl-3.5.0 tokenizers-0.19.1 torch-2.2.2 tqdm-4.66.4 transformers-4.42.3 triton-2.2.0 typing-extensions-4.12.2 typo-0.1.5 urllib3-2.2.2 wcwidth-0.2.13 zhon-2.0.2
[task 2025-01-10T21:29:25.995Z] Running with arguments: ['/builds/worker/checkouts/vcs/pipeline/data/dataset_importer.py', '--type', 'corpus', '--dataset', 'opus_ELRC-3075-wikipedia_health/v1', '--output_prefix', '/builds/worker/artifacts/ELRC-3075-wikipedia_health_v1', '--src', 'ru', '--trg', 'en']
[task 2025-01-10T21:29:25.995Z] Starting dataset import and augmentation.
[task 2025-01-10T21:29:25.995Z] Downloading parallel dataset
[task 2025-01-10T21:29:25.995Z] + set -euo pipefail
[task 2025-01-10T21:29:25.995Z] + [[ -z ru ]]
[task 2025-01-10T21:29:25.995Z] + [[ -z en ]]
[task 2025-01-10T21:29:25.995Z] + dataset=opus_ELRC-3075-wikipedia_health/v1
[task 2025-01-10T21:29:25.995Z] + output_prefix=/builds/worker/artifacts/ELRC-3075-wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] + echo '###### Downloading dataset opus_ELRC-3075-wikipedia_health/v1'
[task 2025-01-10T21:29:25.995Z] ###### Downloading dataset opus_ELRC-3075-wikipedia_health/v1
[task 2025-01-10T21:29:25.995Z] ++ dirname /builds/worker/checkouts/vcs/pipeline/data/download-corpus.sh
[task 2025-01-10T21:29:25.995Z] + cd /builds/worker/checkouts/vcs/pipeline/data
[task 2025-01-10T21:29:25.995Z] ++ dirname /builds/worker/artifacts/ELRC-3075-wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] + dir=/builds/worker/artifacts
[task 2025-01-10T21:29:25.995Z] + mkdir -p /builds/worker/artifacts
[task 2025-01-10T21:29:25.995Z] + name=ELRC-3075-wikipedia_health/v1
[task 2025-01-10T21:29:25.995Z] + type=opus
[task 2025-01-10T21:29:25.995Z] + [[ -f importers/corpus/opus.py ]]
[task 2025-01-10T21:29:25.995Z] + script='bash importers/corpus/opus.sh'
[task 2025-01-10T21:29:25.995Z] + bash importers/corpus/opus.sh ru en /builds/worker/artifacts/ELRC-3075-wikipedia_health_v1 ELRC-3075-wikipedia_health/v1
[task 2025-01-10T21:29:25.995Z] + set -euo pipefail
[task 2025-01-10T21:29:25.995Z] + echo '###### Downloading opus corpus'
[task 2025-01-10T21:29:25.995Z] ###### Downloading opus corpus
[task 2025-01-10T21:29:25.995Z] + src=ru
[task 2025-01-10T21:29:25.995Z] + trg=en
[task 2025-01-10T21:29:25.995Z] + output_prefix=/builds/worker/artifacts/ELRC-3075-wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] + dataset=ELRC-3075-wikipedia_health/v1
[task 2025-01-10T21:29:25.995Z] + WGET=wget
[task 2025-01-10T21:29:25.995Z] + name=ELRC-3075-wikipedia_health
[task 2025-01-10T21:29:25.995Z] + name_and_version=ELRC_3075_wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] ++ dirname /builds/worker/artifacts/ELRC-3075-wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] + tmp=/builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] + mkdir -p /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] + archive_path=/builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.txt.zip
[task 2025-01-10T21:29:25.995Z] + wget -O /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.txt.zip https://object.pouta.csc.fi/OPUS-ELRC-3075-wikipedia_health/v1/moses/ru-en.txt.zip
[task 2025-01-10T21:29:25.995Z] --2025-01-10 21:29:22-- https://object.pouta.csc.fi/OPUS-ELRC-3075-wikipedia_health/v1/moses/ru-en.txt.zip
[task 2025-01-10T21:29:25.995Z] Resolving object.pouta.csc.fi (object.pouta.csc.fi)... 86.50.254.18, 86.50.254.19
[task 2025-01-10T21:29:25.995Z] Connecting to object.pouta.csc.fi (object.pouta.csc.fi)|86.50.254.18|:443... connected.
[task 2025-01-10T21:29:25.995Z] HTTP request sent, awaiting response... 404 Not Found
[task 2025-01-10T21:29:25.995Z] 2025-01-10 21:29:24 ERROR 404: Not Found.
[task 2025-01-10T21:29:25.995Z]
[task 2025-01-10T21:29:25.995Z] + wget -O /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.txt.zip https://object.pouta.csc.fi/OPUS-ELRC-3075-wikipedia_health/v1/moses/en-ru.txt.zip
[task 2025-01-10T21:29:25.995Z] --2025-01-10 21:29:24-- https://object.pouta.csc.fi/OPUS-ELRC-3075-wikipedia_health/v1/moses/en-ru.txt.zip
[task 2025-01-10T21:29:25.995Z] Resolving object.pouta.csc.fi (object.pouta.csc.fi)... 86.50.254.18, 86.50.254.19
[task 2025-01-10T21:29:25.995Z] Connecting to object.pouta.csc.fi (object.pouta.csc.fi)|86.50.254.18|:443... connected.
[task 2025-01-10T21:29:25.995Z] HTTP request sent, awaiting response... 200 OK
[task 2025-01-10T21:29:25.995Z] Length: 517126 (505K) [application/zip]
[task 2025-01-10T21:29:25.995Z] Saving to: ‘/builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.txt.zip’
[task 2025-01-10T21:29:25.995Z]
[task 2025-01-10T21:29:25.995Z] 0K .......... .......... .......... .......... .......... 9% 211K 2s
[task 2025-01-10T21:29:25.995Z] 50K .......... .......... .......... .......... .......... 19% 1.43M 1s
[task 2025-01-10T21:29:25.995Z] 100K .......... .......... .......... .......... .......... 29% 492K 1s
[task 2025-01-10T21:29:25.995Z] 150K .......... .......... .......... .......... .......... 39% 1.44M 1s
[task 2025-01-10T21:29:25.995Z] 200K .......... .......... .......... .......... .......... 49% 492K 1s
[task 2025-01-10T21:29:25.995Z] 250K .......... .......... .......... .......... .......... 59% 251M 0s
[task 2025-01-10T21:29:25.995Z] 300K .......... .......... .......... .......... .......... 69% 324M 0s
[task 2025-01-10T21:29:25.995Z] 350K .......... .......... .......... .......... .......... 79% 287M 0s
[task 2025-01-10T21:29:25.995Z] 400K .......... .......... .......... .......... .......... 89% 1.46M 0s
[task 2025-01-10T21:29:25.995Z] 450K .......... .......... .......... .......... .......... 99% 492K 0s
[task 2025-01-10T21:29:25.995Z] 500K ..... 100% 9.32T=0.6s
[task 2025-01-10T21:29:25.995Z]
[task 2025-01-10T21:29:25.995Z] 2025-01-10 21:29:25 (784 KB/s) - ‘/builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.txt.zip’ saved [517126/517126]
[task 2025-01-10T21:29:25.995Z]
[task 2025-01-10T21:29:25.995Z] + unzip -o /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.txt.zip -d /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] Archive: /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.txt.zip
[task 2025-01-10T21:29:25.995Z] inflating: /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/README
[task 2025-01-10T21:29:25.995Z] inflating: /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/LICENSE
[task 2025-01-10T21:29:25.995Z] inflating: /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.en-ru.en
[task 2025-01-10T21:29:25.995Z] inflating: /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.en-ru.ru
[task 2025-01-10T21:29:25.995Z] inflating: /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.en-ru.xml
[task 2025-01-10T21:29:25.995Z] + for lang in ${src} ${trg}
[task 2025-01-10T21:29:25.995Z] + zstdmt -c /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.ru-en.ru
[task 2025-01-10T21:29:25.995Z] zstd: can't stat /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.ru-en.ru : No such file or directory -- ignored
[task 2025-01-10T21:29:25.995Z] + zstdmt -c /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.en-ru.ru
[task 2025-01-10T21:29:25.995Z] + for lang in ${src} ${trg}
[task 2025-01-10T21:29:25.995Z] + zstdmt -c /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.ru-en.en
[task 2025-01-10T21:29:25.995Z] zstd: can't stat /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.ru-en.en : No such file or directory -- ignored
[task 2025-01-10T21:29:25.995Z] + zstdmt -c /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1/ELRC-3075-wikipedia_health.en-ru.en
[task 2025-01-10T21:29:25.995Z] + rm -rf /builds/worker/artifacts/opus/ELRC_3075_wikipedia_health_v1
[task 2025-01-10T21:29:25.995Z] + echo '###### Done: Downloading opus corpus'
[task 2025-01-10T21:29:25.995Z] ###### Done: Downloading opus corpus
[task 2025-01-10T21:29:25.995Z] + echo '###### Done: Downloading dataset opus_ELRC-3075-wikipedia_health/v1'
[task 2025-01-10T21:29:25.995Z] ###### Done: Downloading dataset opus_ELRC-3075-wikipedia_health/v1
[task 2025-01-10T21:29:25.995Z]
[task 2025-01-10T21:29:25.995Z] Finished dataset import and augmentation.
+ exit_code=0
+ docker cp taskcontainer_PTf1mnu1QjeP1E4BDmbwNQ:/builds/worker/artifacts artifact0
+ docker rm taskcontainer_PTf1mnu1QjeP1E4BDmbwNQ
taskcontainer_PTf1mnu1QjeP1E4BDmbwNQ
+ exit 0
[taskcluster 2025-01-10T21:29:46.691Z] Exit Code: 0
[taskcluster 2025-01-10T21:29:46.691Z] User Time: 148.79ms
[taskcluster 2025-01-10T21:29:46.691Z] Kernel Time: 699.737ms
[taskcluster 2025-01-10T21:29:46.691Z] Wall Time: 2m39.899655443s
[taskcluster 2025-01-10T21:29:46.691Z] Result: SUCCEEDED
[taskcluster 2025-01-10T21:29:46.691Z] === Task Finished ===
[taskcluster 2025-01-10T21:29:46.691Z] Task Duration: 2m39.900308033s
[taskcluster 2025-01-10T21:29:46.789Z] Uploading artifact public/build/ELRC-3075-wikipedia_health_v1.en.zst from file /home/task_173654422425677/artifact0/ELRC-3075-wikipedia_health_v1.en.zst with content encoding "identity", mime type "application/zstd" and expiry 2025-02-09T21:22:45.495Z
[taskcluster 2025-01-10T21:29:46.791Z] Uploading artifact public/build/ELRC-3075-wikipedia_health_v1.ru.zst from file /home/task_173654422425677/artifact0/ELRC-3075-wikipedia_health_v1.ru.zst with content encoding "identity", mime type "application/zstd" and expiry 2025-02-09T21:22:45.495Z
[taskcluster 2025-01-10T21:29:47.264Z] [mounts] Preserving cache: Moving "/home/task_173654422425677/cache0" to "caches/YZ-ynTR8TZGI6tX8I9Ze7g"
[taskcluster 2025-01-10T21:29:47.409Z] Uploading link artifact public/logs/live.log to artifact public/logs/live_backing.log with expiry 2025-02-09T21:22:45.495Z
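
The trace above shows the downloader requesting the ru-en archive first, getting a 404, and then succeeding with en-ru; the same two pair orders are then tried when compressing the extracted files. A minimal sketch of that ordering, assuming the URL and file-name patterns visible in the log, is given below. The helper names are hypothetical and are not taken from opus.sh or dataset_importer.py.

```python
from typing import List

# Host as it appears in the wget lines of the log.
OPUS_BASE = "https://object.pouta.csc.fi"


def opus_candidate_urls(dataset: str, src: str, trg: str) -> List[str]:
    """Hypothetical helper: Moses archive URLs to try, requested pair order first."""
    base = f"{OPUS_BASE}/OPUS-{dataset}/moses"
    return [f"{base}/{src}-{trg}.txt.zip", f"{base}/{trg}-{src}.txt.zip"]


def extracted_candidates(name: str, src: str, trg: str, lang: str) -> List[str]:
    """Hypothetical helper: extracted file names to look for, in both pair orders."""
    return [f"{name}.{src}-{trg}.{lang}", f"{name}.{trg}-{src}.{lang}"]


if __name__ == "__main__":
    # Values taken from the task arguments in the log.
    for url in opus_candidate_urls("ELRC-3075-wikipedia_health/v1", "ru", "en"):
        print(url)
    for path in extracted_candidates("ELRC-3075-wikipedia_health", "ru", "en", "ru"):
        print(path)
```

In this run only the second entry of each list existed, which matches the 404 on the first wget and the "can't stat ... ignored" messages from zstd.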