Unidata · julienchastang · Nov 14, 2024 · Nov 20, 2024 · Nov 21, 2024 · Nov 21, 2024
diff --git a/jupyter-images/fall-2024/tm/.condarc b/jupyter-images/fall-2024/tm/.condarc
@@ -0,0 +1,2 @@
+envs_dirs:
+  - /home/jovyan/additional-envs
diff --git a/jupyter-images/fall-2024/tm/Acknowledgements.ipynb b/jupyter-images/fall-2024/tm/Acknowledgements.ipynb
@@ -0,0 +1,43 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "c86cd54f-b73c-4781-b6eb-89c79d3d3b22",
+   "metadata": {},
+   "source": [
+    "## Acknowledgements\n",
+    "\n",
+    "Launching this JupyterHub server is the result of a collaboration between several research and academic institutions and their staff. For Jetstream2 and JupyterHub expertise, we thank Andrea Zonca (San Diego Supercomputing Center), Jeremy Fischer, Mike Lowe (Indiana University), the NSF Jetstream2 (`doi:10.1145/3437359.3465565`) team.\n",
+    "\n",
+    "This work employs the NSF Jetstream2 Cloud at Indiana University through allocation EES220002 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program, which is supported by National Science Foundation grants #2138259, #2138286, #2138307, #2137603, and #2138296.\n",
+    "\n",
+    "Unidata is one of the University Corporation for Atmospheric Research (UCAR)'s Community Programs (UCP), and is funded primarily by the National Science Foundation (AGS-2403649).\n",
+    "\n",
+    "## To Acknowledge This JupyterHub and the Unidata Science Gateway\n",
+    "\n",
+    "If you have benefited from the Unidata Science Gateway, please cite `doi:10.5065/688s-2w73`. Additional citation information can be found in this [Citation File Format file](https://raw.githubusercontent.com/Unidata/science-gateway/master/CITATION.cff).\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.6"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/jupyter-images/fall-2024/tm/Dockerfile b/jupyter-images/fall-2024/tm/Dockerfile
@@ -0,0 +1,42 @@
+# Heavily borrowed from docker-stacks/minimal-notebook/
+# https://github.com/jupyter/docker-stacks/blob/main/minimal-notebook/Dockerfile
+
+ARG BASE_CONTAINER=jupyter/tensorflow-notebook
+FROM $BASE_CONTAINER
+
+ENV DEFAULT_ENV_NAME=tm-fall-2024 EDITOR=nano VISUAL=nano
+
+LABEL maintainer="Unidata <[email protected]>"
+
+USER root
+
+RUN apt-get update && \
+    apt-get install -y --no-install-recommends vim nano curl zip unzip && \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*
+
+USER $NB_UID
+
+ADD environment.yml /tmp
+
+RUN mamba install --quiet --yes \
+      'conda-forge::nb_conda_kernels' \
+      'conda-forge::jupyterlab-git' \
+      'conda-forge::ipywidgets' && \
+    mamba env update --name $DEFAULT_ENV_NAME -f /tmp/environment.yml && \
+    pip install --no-cache-dir nbgitpuller && \
+    mamba clean --all -f -y && \
+    jupyter lab clean -y && \
+    npm cache clean --force && \
+    rm -rf /home/$NB_USER/.cache/yarn && \
+    rm -rf /home/$NB_USER/.node-gyp && \
+    fix-permissions $CONDA_DIR && \
+    fix-permissions /home/$NB_USER
+
+COPY GPU_sanity_check.ipynb Acknowledgements.ipynb \
+    default_kernel.py .condarc /
+
+ARG JUPYTER_SETTINGS_DIR=/opt/conda/share/jupyter/lab/settings/
+COPY overrides.json $JUPYTER_SETTINGS_DIR
+
+USER $NB_UID
diff --git a/jupyter-images/fall-2024/tm/GPU_sanity_check.ipynb b/jupyter-images/fall-2024/tm/GPU_sanity_check.ipynb
@@ -0,0 +1,287 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "f1b605ba-8d44-4654-be99-df2d39289c36",
+   "metadata": {},
+   "source": [
+    "## GPU JHub Testing Notebook"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "25cacd08-18b8-4991-966e-7e49aa44192a",
+   "metadata": {},
+   "source": [
+    "Notebook used for first pass testing of the environment and GPU access. Here are the various JS2 GPU instance [flavors](https://docs.jetstream-cloud.org/general/instance-flavors/#jetstream2-gpu)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8f0798c9-ee1d-4157-a4be-deedd90bd9a5",
+   "metadata": {},
+   "source": [
+    "Note: this also tests PyTorch install, as of Novembeerr 2024, I hope to not use tensorflow for work at UCAR / Unidata. This entire notebook should run without any errors.  "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "831b1a5d-488d-476c-8050-4f18cd635c0c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import psutil\n",
+    "import platform\n",
+    "import sys\n",
+    "\n",
+    "import torch\n",
+    "import platform"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "50c795b6-34a9-4c2b-ab71-67c5ea087fa2",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def get_simple_system_info():\n",
+    "    # Memory info\n",
+    "    memory = psutil.virtual_memory()\n",
+    "    ram_gb = memory.total / (1024 ** 3)  # Convert to GB\n",
+    "    ram_used_gb = memory.used / (1024 ** 3)\n",
+    "    \n",
+    "    # CPU info\n",
+    "    cpu_cores = psutil.cpu_count()\n",
+    "    cpu_usage = psutil.cpu_percent(interval=1)\n",
+    "    \n",
+    "    print(f\"Python Version: {platform.python_version()}\")\n",
+    "    print(f\"\\nCPU:\")\n",
+    "    print(f\"- Cores: {cpu_cores}\")\n",
+    "    print(f\"- Current Usage: {cpu_usage}%\")\n",
+    "    print(f\"\\nRAM:\")\n",
+    "    print(f\"- Total: {ram_gb:.1f} GB\")\n",
+    "    print(f\"- Used: {ram_used_gb:.1f} GB\")\n",
+    "    print(f\"- Usage: {memory.percent}%\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "7ce932bc-9406-4841-91ca-371e3c768980",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Python Version: 3.10.15\n",
+      "\n",
+      "CPU:\n",
+      "- Cores: 8\n",
+      "- Current Usage: 1.0%\n",
+      "\n",
+      "RAM:\n",
+      "- Total: 29.4 GB\n",
+      "- Used: 1.4 GB\n",
+      "- Usage: 6.1%\n"
+     ]
+    }
+   ],
+   "source": [
+    "get_simple_system_info()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "9aa829ef-1093-4d6d-9052-a86a1a8647fc",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Thu Nov 21 16:36:04 2024       \n",
+      "+---------------------------------------------------------------------------------------+\n",
+      "| NVIDIA-SMI 535.183.06             Driver Version: 535.183.06   CUDA Version: 12.2     |\n",
+      "|-----------------------------------------+----------------------+----------------------+\n",
+      "| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |\n",
+      "| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |\n",
+      "|                                         |                      |               MIG M. |\n",
+      "|=========================================+======================+======================|\n",
+      "|   0  GRID A100X-10C                 On  | 00000000:04:00.0 Off |                    0 |\n",
+      "| N/A   N/A    P0              N/A /  N/A |      0MiB / 10240MiB |      0%      Default |\n",
+      "|                                         |                      |             Disabled |\n",
+      "+-----------------------------------------+----------------------+----------------------+\n",
+      "                                                                                         \n",
+      "+---------------------------------------------------------------------------------------+\n",
+      "| Processes:                                                                            |\n",
+      "|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |\n",
+      "|        ID   ID                                                             Usage      |\n",
+      "|=======================================================================================|\n",
+      "|  No running processes found                                                           |\n",
+      "+---------------------------------------------------------------------------------------+\n"
+     ]
+    }
+   ],
+   "source": [
+    "!nvidia-smi"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "0bb2ce0f-afba-4693-8072-bccb92dca0bf",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def get_pytorch_info():\n",
+    "    print(\"PyTorch System Information\")\n",
+    "    print(\"-\" * 30)\n",
+    "    \n",
+    "    # PyTorch version\n",
+    "    print(f\"PyTorch Version: {torch.__version__}\")\n",
+    "    \n",
+    "    # CUDA availability\n",
+    "    print(f\"\\nCUDA Available: {torch.cuda.is_available()}\")\n",
+    "    \n",
+    "    if torch.cuda.is_available():\n",
+    "        # Current device information\n",
+    "        current_device = torch.cuda.current_device()\n",
+    "        print(f\"Current CUDA Device: {current_device}\")\n",
+    "        \n",
+    "        # Device name\n",
+    "        print(f\"Device Name: {torch.cuda.get_device_name(current_device)}\")\n",
+    "        \n",
+    "        # CUDA version\n",
+    "        print(f\"CUDA Version: {torch.version.cuda}\")\n",
+    "        \n",
+    "        # Number of CUDA devices\n",
+    "        print(f\"Device Count: {torch.cuda.device_count()}\")\n",
+    "        \n",
+    "        # Memory information\n",
+    "        print(\"\\nGPU Memory Information:\")\n",
+    "        print(f\"- Total: {torch.cuda.get_device_properties(current_device).total_memory / 1024**3:.2f} GB\")\n",
+    "        print(f\"- Allocated: {torch.cuda.memory_allocated(current_device) / 1024**3:.2f} GB\")\n",
+    "        print(f\"- Cached: {torch.cuda.memory_reserved(current_device) / 1024**3:.2f} GB\")\n",
+    "        \n",
+    "        # Architecture information\n",
+    "        device_props = torch.cuda.get_device_properties(current_device)\n",
+    "        print(f\"\\nGPU Architecture:\")\n",
+    "        print(f\"- GPU Compute Capability: {device_props.major}.{device_props.minor}\")\n",
+    "        print(f\"- Multi Processors: {device_props.multi_processor_count}\")\n",
+    "    else:\n",
+    "        print(\"\\nNo CUDA GPU available. PyTorch will run on CPU only.\")\n",
+    "        print(f\"CPU Architecture: {platform.machine()}\")\n",
+    "        print(f\"CPU Type: {platform.processor()}\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "ea2e96f2-37fe-49a0-8cc8-d32a3b666a0a",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "PyTorch System Information\n",
+      "------------------------------\n",
+      "PyTorch Version: 2.5.1+cu124\n",
+      "\n",
+      "CUDA Available: True\n",
+      "Current CUDA Device: 0\n",
+      "Device Name: GRID A100X-10C\n",
+      "CUDA Version: 12.4\n",
+      "Device Count: 1\n",
+      "\n",
+      "GPU Memory Information:\n",
+      "- Total: 10.00 GB\n",
+      "- Allocated: 0.00 GB\n",
+      "- Cached: 0.00 GB\n",
+      "\n",
+      "GPU Architecture:\n",
+      "- GPU Compute Capability: 8.0\n",
+      "- Multi Processors: 108\n"
+     ]
+    }
+   ],
+   "source": [
+    "get_pytorch_info()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "b3f8dad5-6c3c-4a64-bc31-b9ef5d5894fb",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def get_instance_type():\n",
+    "    cpu_count = psutil.cpu_count()\n",
+    "    ram_gb = psutil.virtual_memory().total / (1024**3)\n",
+    "    gpu_ram = 0\n",
+    "    \n",
+    "    if torch.cuda.is_available():\n",
+    "        current_device = torch.cuda.current_device()\n",
+    "        gpu_ram = torch.cuda.get_device_properties(current_device).total_memory / (1024**3)\n",
+    "    \n",
+    "    if cpu_count == 4 and 13 <= ram_gb <= 17 and 7 <= gpu_ram <= 9:\n",
+    "        return \"g3.small\"\n",
+    "    elif cpu_count == 8 and 28 <= ram_gb <= 32 and 9 <= gpu_ram <= 11:\n",
+    "        return \"g3.medium\"\n",
+    "    elif cpu_count == 16 and 58 <= ram_gb <= 62 and 19 <= gpu_ram <= 21:\n",
+    "        return \"g3.large\"\n",
+    "    elif cpu_count == 32 and 123 <= ram_gb <= 127 and 39 <= gpu_ram <= 41:\n",
+    "        return \"g3.xl\"\n",
+    "    else:\n",
+    "        return \"custom\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "ef30c0c2-bf93-4c7a-97e2-c80f7c503f20",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'g3.medium'"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "get_instance_type()"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python [conda env:tm-fall-2024]",
+   "language": "python",
+   "name": "conda-env-tm-fall-2024-py"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.15"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}