From 88265be51a60159fb4f64c3c7afc4921d941c7b5 Mon Sep 17 00:00:00 2001
From: Ben Denham <ben@denham.nz>
Date: Sat, 25 May 2024 11:06:15 +1200
Subject: [PATCH] Add explanation for error when defining task types from an
 interactive shell on Windows and macOS.

---
 docs/cookbook.md        |  28 ++++++++-
 examples/cookbook.ipynb | 133 +++++++++++++++++++++++-----------------
 2 files changed, 103 insertions(+), 58 deletions(-)
diff --git a/docs/cookbook.md b/docs/cookbook.md
index 154683b..4fec9d2 100644
--- a/docs/cookbook.md
+++ b/docs/cookbook.md
@@ -948,9 +948,10 @@ results = lab.run_tasks(runs)
 ### Why do I see the following error: `An attempt has been made to start a new process before the current process has finished`?
 
 When running labtech in a Python script on Windows, macOS, or any
-Python environment using the `spawn` multiprocessing start method, you
-will see the following error if you do not guard your experiment and
-lab creation and other non-definition code with `__name__ ==
+Python environment using the
+[`spawn` multiprocessing start method](https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods),
+you will see the following error if you do not guard your experiment
+and lab creation, and any other non-definition code, with `__name__ ==
 '__main__'`:
 
 ```
@@ -1000,3 +1001,24 @@ if __name__ == '__main__':
 ```
 
 For details, see [Safe importing of main module](https://docs.python.org/3/library/multiprocessing.html#multiprocessing-safe-main-import).
+
+
+### Why do I see the following error: `AttributeError: Can't get attribute 'YOUR_TASK_CLASS' on <module '__main__' (built-in)>`?
+
+You will see this error (as part of a very long stack trace) when
+you define and run labtech tasks from an interactive Python shell on
+Windows or macOS (or, more specifically, whenever
+[Python's multiprocessing start method](https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods)
+is `spawn` or `forkserver`).
+
+To fix this error, define all of your labtech `Task` types in a
+separate Python module (a `.py` file) that you can import into your
+interactive shell session (e.g. `from my_module import MyTask`).
+
+This error occurs because the `spawn` and `forkserver` start methods
+do not copy the current state of your `__main__` module (which
+contains the variables you declare interactively in the Python shell,
+including task definitions) into labtech's task subprocesses. The
+error does not occur with the `fork` start method (the current
+default on Linux) because forked subprocesses *do* receive the current
+state of all modules (including `__main__`) from the parent process.
diff --git a/examples/cookbook.ipynb b/examples/cookbook.ipynb
index cd5467d..9dfd3c0 100644
--- a/examples/cookbook.ipynb
+++ b/examples/cookbook.ipynb
@@ -11,7 +11,7 @@
     "You can also run this cookbook as an [interactive\n",
     "notebook](https://mybinder.org/v2/gh/ben-denham/labtech/main?filepath=examples/cookbook.ipynb)."
    ],
-   "id": "54e41ef7-5701-45c3-9521-5603c90f590d"
+   "id": "4d0cf9dd-38cd-45ba-9a60-a876b50642c5"
   },
   {
    "cell_type": "code",
@@ -21,7 +21,7 @@
    "source": [
     "%pip install labtech fsspec mlflow pandas scikit-learn setuptools"
    ],
-   "id": "f4809934-9d04-4f3f-bbbe-4c1fde421aeb"
+   "id": "dce8897e-659a-402f-bbdb-7cd8787bbc26"
   },
   {
    "cell_type": "code",
@@ -31,7 +31,7 @@
    "source": [
     "!mkdir storage"
    ],
-   "id": "bfeef0cf-2506-4893-b19e-2d79a8f7395c"
+   "id": "2e6c0e2d-a343-457c-a0a1-b1b925832537"
   },
   {
    "cell_type": "code",
@@ -50,7 +50,7 @@
     "digits_X, digits_y = datasets.load_digits(return_X_y=True)\n",
     "digits_X = StandardScaler().fit_transform(digits_X)"
    ],
-   "id": "45d37676-a06b-459e-9877-6857a124d06b"
+   "id": "ea6d22f5-29d1-4018-a7d5-d60f4f301b92"
   },
   {
    "cell_type": "markdown",
@@ -64,7 +64,7 @@
     "is sent to `STDOUT` (e.g. calls to `print()`) or `STDERR` (e.g. uncaught\n",
     "exceptions) will also be captured and logged:"
    ],
-   "id": "c19eeed4-76a5-43a9-bb75-a716a7e8ebba"
+   "id": "8355baa1-ba57-412e-91f0-bcdb7e410773"
   },
   {
    "cell_type": "code",
@@ -91,7 +91,7 @@
     "lab = labtech.Lab(storage=None)\n",
     "results = lab.run_tasks(experiments)"
    ],
-   "id": "6a7b4078-e473-4711-8883-950e793581d5"
+   "id": "8d2b57a1-10f5-48db-982a-2052d8b17484"
   },
   {
    "cell_type": "markdown",
@@ -130,7 +130,7 @@
     "learning model (like `LRClassifierTask` below), and then make a task of\n",
     "that type a parameter for your primary experiment task:"
    ],
-   "id": "c3da647c-fc3f-47e1-a2d8-504f60ec7b1a"
+   "id": "e2274012-1bf7-420d-9ab6-577eac524b55"
   },
   {
    "cell_type": "code",
@@ -171,7 +171,7 @@
     "lab = labtech.Lab(storage=None)\n",
     "results = lab.run_tasks([experiment])"
    ],
-   "id": "aeb5e720-96f4-4892-9b9d-d0fd10226c34"
+   "id": "f4f79347-328d-4b38-82f9-2e7a227eebeb"
   },
   {
    "cell_type": "markdown",
@@ -182,7 +182,7 @@
     "[Protocol](https://docs.python.org/3/library/typing.html#typing.Protocol)\n",
     "that defines their common result type:"
    ],
-   "id": "b402ef03-7d87-46fe-acfe-a4ddb57e2634"
+   "id": "786fbf62-96d7-4468-873b-eede7e927de5"
   },
   {
    "cell_type": "code",
@@ -242,7 +242,7 @@
     "lab = labtech.Lab(storage=None)\n",
     "results = lab.run_tasks(experiments)"
    ],
-   "id": "ad6daf81-91dd-43df-9ac6-0ab9fe84d568"
+   "id": "23b7489f-c62c-4709-8b42-bc7056bea80f"
   },
   {
    "cell_type": "markdown",
@@ -262,7 +262,7 @@
     "> `Enum` must support equality between identical (but distinct) object\n",
     "> instances."
    ],
-   "id": "cd43bd8d-d87d-46e2-ac88-c3b9991d40b0"
+   "id": "06b4ff10-27fe-4cf8-bf23-65ef9c75faea"
   },
   {
    "cell_type": "code",
@@ -311,7 +311,7 @@
     "lab = labtech.Lab(storage=None)\n",
     "results = lab.run_tasks(experiments)"
    ],
-   "id": "58ec6e49-995a-4709-aef2-77df1a1afb96"
+   "id": "ba79db5c-f89b-4ebf-8aa6-6cbafe4382f6"
   },
   {
    "cell_type": "markdown",
@@ -334,7 +334,7 @@
     "The following example demonstrates specifying a `dataset_key` parameter\n",
     "to a task that is used to look up a dataset from the lab context:"
    ],
-   "id": "089bd996-b048-47a3-a80d-f629f462bc49"
+   "id": "acee6f4a-c2ce-499e-aa4d-b38355d99648"
   },
   {
    "cell_type": "code",
@@ -371,7 +371,7 @@
     ")\n",
     "results = lab.run_tasks(experiments)"
    ],
-   "id": "5523830a-a083-4d78-be16-5c0c07b94aa1"
+   "id": "9cc80f48-9e29-4ea4-a4f3-c73b6616f4f7"
   },
   {
    "cell_type": "markdown",
@@ -389,7 +389,7 @@
     "cross-validation within the task using a number of workers specified in\n",
     "the lab context as `within_task_workers`:"
    ],
-   "id": "f140fb18-c300-4c3c-8a88-df771e88e1c6"
+   "id": "9ebd78a6-cec4-47ef-8379-77743ef2b6bf"
   },
   {
    "cell_type": "code",
@@ -430,7 +430,7 @@
     ")\n",
     "results = lab.run_tasks(experiments)"
    ],
-   "id": "74d0c9d3-e585-431a-b905-bab19eea4015"
+   "id": "555fc02c-ec69-4816-ad36-dcf9af70d28d"
   },
   {
    "cell_type": "markdown",
@@ -457,7 +457,7 @@
     "raised during the execution of a task will be logged, but the execution\n",
     "of other tasks will continue:"
    ],
-   "id": "fa27fba2-4af7-48c4-93b3-53b3f92f0a2b"
+   "id": "9468e082-bb96-41a0-b0cf-6233ba60d43a"
   },
   {
    "cell_type": "code",
@@ -470,7 +470,7 @@
     "    continue_on_failure=True,\n",
     ")"
    ],
-   "id": "d2ec351d-4f71-482d-b0b5-3253e429345d"
+   "id": "bad0d80d-69f6-4ca0-bd9a-283eb9d0fe74"
   },
   {
    "cell_type": "markdown",
@@ -489,7 +489,7 @@
     "sub-class for that extension so that you can continue using caches for\n",
     "the base class:"
    ],
-   "id": "bbd39245-e64e-4ddd-b8f6-82f375a896a8"
+   "id": "0fe8f333-829c-48a1-a1e2-c8d04472a5c2"
   },
   {
    "cell_type": "code",
@@ -513,7 +513,7 @@
     "        base_result = super().run()\n",
     "        return base_result * self.multiplier"
    ],
-   "id": "f1444edb-3e91-483d-8224-138d7e53eb25"
+   "id": "e68dd65f-888b-4953-8628-c6dfe39ed5c0"
   },
   {
    "cell_type": "markdown",
@@ -525,7 +525,7 @@
     "all cached task instances for a list of task types. You can then “run”\n",
     "the tasks to load their cached results:"
    ],
-   "id": "1a6582d2-62a0-4b91-813e-d3bcb34918fa"
+   "id": "97e5dd18-3653-4722-887e-3ec263236896"
   },
   {
    "cell_type": "code",
@@ -536,7 +536,7 @@
     "cached_cvexperiment_tasks = lab.cached_tasks([CVExperiment])\n",
     "results = lab.run_tasks(cached_cvexperiment_tasks)"
    ],
-   "id": "1addb848-196e-48fe-8e42-c883340391fd"
+   "id": "dc2f853f-83a6-42c2-b043-ce0564aa0fbc"
   },
   {
    "cell_type": "markdown",
@@ -547,7 +547,7 @@
     "You can clear the cache for a list of tasks using the `uncache_tasks()`\n",
     "method of a `Lab` instance:"
    ],
-   "id": "986f605d-0ec4-4fc8-b41f-999b6342ee4d"
+   "id": "09f930a4-81c7-4300-93e3-e6be4ea9a949"
   },
   {
    "cell_type": "code",
@@ -557,7 +557,7 @@
    "source": [
     "lab.uncache_tasks(cached_cvexperiment_tasks)"
    ],
-   "id": "336138c8-b6fe-4baa-b1ea-ee3a144528c3"
+   "id": "041f9dc3-50b9-42c4-968a-2ef0cf1bfc40"
   },
   {
    "cell_type": "markdown",
@@ -566,7 +566,7 @@
     "You can also ignore all previously cached results when running a list of\n",
     "tasks by passing the `bust_cache` option to `run_tasks()`:"
    ],
-   "id": "0f284670-b192-4e7c-8166-8da19d3a0038"
+   "id": "48de1753-05ec-44ee-9085-d8a64aaadfea"
   },
   {
    "cell_type": "code",
@@ -576,7 +576,7 @@
    "source": [
     "lab.run_tasks(cached_cvexperiment_tasks, bust_cache=True)"
    ],
-   "id": "753a2a94-acc2-4a74-9ed0-c8ee3a402946"
+   "id": "ae7e79b5-1028-46ee-918b-92285942a00f"
   },
   {
    "cell_type": "markdown",
@@ -600,7 +600,7 @@
     "consider using a\n",
     "[`TypeDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict):"
    ],
-   "id": "dbc816bd-35dd-4b31-be3b-e4f4a4d7f2c9"
+   "id": "832c24e9-1382-491d-84b8-6b1cd49b9a36"
   },
   {
    "cell_type": "code",
@@ -627,7 +627,7 @@
     "            model_weights=np.array([self.seed, self.seed ** 2]),\n",
     "        )"
    ],
-   "id": "92ed576c-57a5-4bca-b4d7-804f5e98001b"
+   "id": "adcc9686-2226-4b1d-8223-1b8f7e82cc95"
   },
   {
    "cell_type": "markdown",
@@ -645,7 +645,7 @@
     "The following example demonstrates defining and using a custom cache\n",
     "type to store Pandas DataFrames as parquet files:"
    ],
-   "id": "7f435195-df49-48c0-bbb0-e2c857b17286"
+   "id": "9b2a911c-0a92-47b7-8499-fcba6835dfbc"
   },
   {
    "cell_type": "code",
@@ -688,7 +688,7 @@
     "lab = labtech.Lab(storage='storage/parquet_example')\n",
     "lab.run_tasks([TabularTask()])"
    ],
-   "id": "de56d80a-7546-49e9-a49a-b95e97d80721"
+   "id": "d09fe3ac-0bff-4762-9cf3-db7b6b7dd444"
   },
   {
    "cell_type": "markdown",
@@ -710,7 +710,7 @@
     "S3](https://s3fs.readthedocs.io/en/latest/) and [Azure Blob\n",
     "Storage](https://github.com/fsspec/adlfs):"
    ],
-   "id": "c360ccf9-8f93-4b50-b521-2a6954f9c0f9"
+   "id": "df89161b-59d0-4959-8ade-4f5ba71b272b"
   },
   {
    "cell_type": "code",
@@ -779,7 +779,7 @@
     "lab = labtech.Lab(storage=LocalFsspecStorage('storage/fsspec_example'))\n",
     "results = lab.run_tasks(experiments)"
    ],
-   "id": "a08f04ad-f9b2-4198-86eb-ab903bff2eeb"
+   "id": "6cfddcd8-1c1c-47e3-a262-3509ec5742d1"
   },
   {
    "cell_type": "markdown",
@@ -821,7 +821,7 @@
     "`AggregationTask` to aggregate the results from many individual tasks to\n",
     "create an aggregated cache that can be loaded more efficiently:"
    ],
-   "id": "e5441fdb-b76a-49bb-b9d8-4f1d7f7a20fd"
+   "id": "8016dada-229e-4389-b08f-f1c10b0a9b2d"
   },
   {
    "cell_type": "code",
@@ -862,7 +862,7 @@
     "lab = labtech.Lab(storage='storage/aggregation_lab')\n",
     "result = lab.run_task(aggregation_task)"
    ],
-   "id": "347aacf7-21ca-4f40-9e39-d6cdd8527235"
+   "id": "44a381ad-462c-4a38-ba64-d2f5f9531f2c"
   },
   {
    "cell_type": "markdown",
@@ -892,7 +892,7 @@
     "\n",
     "The following code demonstrates this pattern:"
    ],
-   "id": "eb22810a-53ff-4e69-adb8-2459bb360d06"
+   "id": "80810ff4-8e9d-4ee4-8d6e-0cd80457f1cc"
   },
   {
    "cell_type": "code",
@@ -934,7 +934,7 @@
     ")\n",
     "results = lab.run_tasks(experiments)"
    ],
-   "id": "a519fe63-8074-49d8-869f-c0358a7bc223"
+   "id": "d49976be-1d19-4559-84cc-565255a3ea84"
   },
   {
    "cell_type": "markdown",
@@ -946,7 +946,7 @@
     "it was originally executed and how long it took to execute from the\n",
     "task’s `.result_meta` attribute:"
    ],
-   "id": "f8d82f3a-1524-48de-8f67-f27954f6b7f1"
+   "id": "5bb3a49c-6a88-4808-a5c4-03605881def2"
   },
   {
    "cell_type": "code",
@@ -957,7 +957,7 @@
     "print(f'The task was executed at: {aggregation_task.result_meta.start}')\n",
     "print(f'The task execution took: {aggregation_task.result_meta.duration}')"
    ],
-   "id": "8d3a48c1-8806-42ed-8839-e00ec2450a92"
+   "id": "c2a0b526-3069-4bc8-a47d-d38e64a4630e"
   },
   {
    "cell_type": "markdown",
@@ -977,7 +977,7 @@
     "Another approach is to include all of the intermediate tasks for which\n",
     "you wish to access the results for in the call to `run_tasks()`:"
    ],
-   "id": "91d61ecb-9767-4b96-8572-4ddb460c405f"
+   "id": "76e4a54a-5918-4c40-bcf4-1634bbaacb9f"
   },
   {
    "cell_type": "code",
@@ -1005,7 +1005,7 @@
     "    for experiment in experiments\n",
     "])"
    ],
-   "id": "d4c2e193-ae5e-4726-bef6-872fcfa93895"
+   "id": "19274674-f714-43c9-b2b5-9d8ca095e188"
   },
   {
    "cell_type": "markdown",
@@ -1019,7 +1019,7 @@
     "available, so you may need to set `bust_cache=True` to ensure all\n",
     "intermediate tasks are executed:"
    ],
-   "id": "2cc817a5-3479-4af4-844b-4f70162a9669"
+   "id": "5394728d-5d90-45f6-9e65-cdb5b857ad16"
   },
   {
    "cell_type": "code",
@@ -1047,7 +1047,7 @@
     "    for experiment in experiments\n",
     "])"
    ],
-   "id": "8960c4de-cc1a-4bec-b8b5-99c86f9581b3"
+   "id": "e4f88b82-d1f4-4bd8-88bc-8b53c61ac3e5"
   },
   {
    "cell_type": "markdown",
@@ -1063,7 +1063,7 @@
     "This is modeled in labtech by defining a task type for each step, and\n",
     "having each step depend on the result from the previous step:"
    ],
-   "id": "f8c010ac-58e1-4721-a49f-6681d4e0b39e"
+   "id": "eb08bb55-7002-4295-86a4-2b2032f769d6"
   },
   {
    "cell_type": "code",
@@ -1113,7 +1113,7 @@
     "result = lab.run_task(task_c)\n",
     "print(result)"
    ],
-   "id": "e5090e9c-29a9-48ce-a7c5-2118bb952341"
+   "id": "9ac0ab45-78f4-42b6-af27-60efc6de00b2"
   },
   {
    "cell_type": "markdown",
@@ -1125,7 +1125,7 @@
     "[Mermaid diagram](https://mermaid.js.org/syntax/classDiagram.html) of\n",
     "task types for a given list of tasks:"
    ],
-   "id": "164c7328-f603-4e09-adcd-71e37f3e2db8"
+   "id": "805a30b2-1222-4b96-ab5b-c011af344d43"
   },
   {
    "cell_type": "code",
@@ -1140,7 +1140,7 @@
     "    direction='RL',\n",
     ")"
    ],
-   "id": "5e92b15f-4f33-4ed5-aba3-d6da5624eed9"
+   "id": "61c3a039-53fe-4b99-9c11-f8985e871955"
   },
   {
    "cell_type": "markdown",
@@ -1164,7 +1164,7 @@
     "additional tracking calls (such as `mlflow.log_metric()` or\n",
     "`mlflow.log_model()`) in the body of your task’s `run()` method:"
    ],
-   "id": "b074774d-26aa-4865-a4ca-b42c1f7791e4"
+   "id": "e3d2d996-ef7d-4210-b909-9d2b6fad6399"
   },
   {
    "cell_type": "code",
@@ -1211,7 +1211,7 @@
     "lab = labtech.Lab(storage=None)\n",
     "results = lab.run_tasks(runs)"
    ],
-   "id": "0b57a3dd-8a66-4f19-a6ae-59ce7dc58b74"
+   "id": "0fa4430f-45a5-4dd9-b3fe-bf2f4a8bcc5f"
   },
   {
    "cell_type": "markdown",
@@ -1237,9 +1237,11 @@
     "### Why do I see the following error: `An attempt has been made to start a new process before the current process has finished`?\n",
     "\n",
     "When running labtech in a Python script on Windows, macOS, or any Python\n",
-    "environment using the `spawn` multiprocessing start method, you will see\n",
-    "the following error if you do not guard your experiment and lab creation\n",
-    "and other non-definition code with `__name__ == '__main__'`:\n",
+    "environment using the [`spawn` multiprocessing start\n",
+    "method](https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods),\n",
+    "you will see the following error if you do not guard your experiment and\n",
+    "lab creation, and any other non-definition code, with\n",
+    "`__name__ == '__main__'`:\n",
     "\n",
     "    RuntimeError:\n",
     "            An attempt has been made to start a new process before the\n",
@@ -1260,7 +1262,7 @@
     "non-definition code for a Python script in a `main()` function, and then\n",
     "guard the call to `main()` with `__name__ == '__main__'`:"
    ],
-   "id": "1aacb4fa-26cc-42b3-b4b4-ff1525bf9758"
+   "id": "97b7f8bc-eab1-48c3-a057-a5b64b0675f3"
   },
   {
    "cell_type": "code",
@@ -1291,16 +1293,37 @@
     "if __name__ == '__main__':\n",
     "    main()"
    ],
-   "id": "663f575b-b70a-450d-b4ba-10538c7f13e8"
+   "id": "a1929a02-4810-4c6d-b08f-2e001c6282e1"
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "For details, see [Safe importing of main\n",
-    "module](https://docs.python.org/3/library/multiprocessing.html#multiprocessing-safe-main-import)."
+    "module](https://docs.python.org/3/library/multiprocessing.html#multiprocessing-safe-main-import).\n",
+    "\n",
+    "### Why do I see the following error: `AttributeError: Can't get attribute 'YOUR_TASK_CLASS' on <module '__main__' (built-in)>`?\n",
+    "\n",
+    "You will see this error (as part of a very long stack trace) when\n",
+    "you define and run labtech tasks from an interactive Python shell on\n",
+    "Windows or macOS (or, more specifically, whenever [Python’s\n",
+    "multiprocessing start\n",
+    "method](https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods)\n",
+    "is `spawn` or `forkserver`).\n",
+    "\n",
+    "To fix this error, define all of your labtech `Task` types in a\n",
+    "separate Python module (a `.py` file) that you can import into your\n",
+    "interactive shell session (e.g. `from my_module import MyTask`).\n",
+    "\n",
+    "This error occurs because the `spawn` and `forkserver` start methods\n",
+    "do not copy the current state of your `__main__` module (which\n",
+    "contains the variables you declare interactively in the Python shell,\n",
+    "including task definitions) into labtech’s task subprocesses. The error\n",
+    "does not occur with the `fork` start method (the current default on\n",
+    "Linux) because forked subprocesses *do* receive the current state of all\n",
+    "modules (including `__main__`) from the parent process."
    ],
-   "id": "00eebba9-bb2b-4d3d-b831-df020b15d4b0"
+   "id": "36e28894-934d-4ac4-80ea-91adaff4734a"
   }
  ],
  "nbformat": 4,