From 88265be51a60159fb4f64c3c7afc4921d941c7b5 Mon Sep 17 00:00:00 2001
From: Ben Denham
Date: Sat, 25 May 2024 11:06:15 +1200
Subject: [PATCH] Add explanation for error when defining task types from an interactive shell on Windows and macOS.

---
 docs/cookbook.md | 28 ++++++++-
 examples/cookbook.ipynb | 133 +++++++++++++++++++++++-----------------
 2 files changed, 103 insertions(+), 58 deletions(-)

diff --git a/docs/cookbook.md b/docs/cookbook.md
index 154683b..4fec9d2 100644
--- a/docs/cookbook.md
+++ b/docs/cookbook.md
@@ -948,9 +948,10 @@ results = lab.run_tasks(runs)
 
 ### Why do I see the following error: `An attempt has been made to start a new process before the current process has finished`?
 
 When running labtech in a Python script on Windows, macOS, or any
-Python environment using the `spawn` multiprocessing start method, you
-will see the following error if you do not guard your experiment and
-lab creation and other non-definition code with `__name__ ==
+Python environment using the
+[`spawn` multiprocessing start method](https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods),
+you will see the following error if you do not guard your experiment
+and lab creation and other non-definition code with `__name__ ==
 '__main__'`:
 
 ```
@@ -1000,3 +1001,24 @@ if __name__ == '__main__':
 ```
 
 For details, see [Safe importing of main module](https://docs.python.org/3/library/multiprocessing.html#multiprocessing-safe-main-import).
+
+
+### Why do I see the following error: `AttributeError: Can't get attribute 'YOUR_TASK_CLASS' on `?
+
+You will see this error (as part of a very long stack trace) when
+defining and running labtech tasks from an interactive Python shell on
+Windows or macOS (or more specifically, when
+[Python's multiprocessing start method](https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods)
+has been set to `spawn` or `forkserver`).
+
+The solution to this error is to define all of your labtech `Task`
+types in a separate `.py` Python module file which you can import into
+your interactive shell session (e.g. `from my_module import MyTask`).
+
+The reason for this error is that `spawn` and `forkserver` start
+methods will not copy the current state of your `__main__` module
+(which contains the variables you declare interactively in the Python
+shell, including task definitions) into labtech's task subprocesses.
+This error does not occur for the `fork` start method (the current
+default on Linux) because forked subprocesses *do* receive the current
+state of all modules (including `__main__`) from the parent process.
diff --git a/examples/cookbook.ipynb b/examples/cookbook.ipynb
index cd5467d..9dfd3c0 100644
--- a/examples/cookbook.ipynb
+++ b/examples/cookbook.ipynb
@@ -11,7 +11,7 @@
 "You can also run this cookbook as an [interactive\n",
 "notebook](https://mybinder.org/v2/gh/ben-denham/labtech/main?filepath=examples/cookbook.ipynb)."
 ],
- "id": "54e41ef7-5701-45c3-9521-5603c90f590d"
+ "id": "4d0cf9dd-38cd-45ba-9a60-a876b50642c5"
 },
 {
 "cell_type": "code",
@@ -21,7 +21,7 @@
 "source": [
 "%pip install labtech fsspec mlflow pandas scikit-learn setuptools"
 ],
- "id": "f4809934-9d04-4f3f-bbbe-4c1fde421aeb"
+ "id": "dce8897e-659a-402f-bbdb-7cd8787bbc26"
 },
 {
 "cell_type": "code",
@@ -31,7 +31,7 @@
 "source": [
 "!mkdir storage"
 ],
- "id": "bfeef0cf-2506-4893-b19e-2d79a8f7395c"
+ "id": "2e6c0e2d-a343-457c-a0a1-b1b925832537"
 },
 {
 "cell_type": "code",
@@ -50,7 +50,7 @@
 "digits_X, digits_y = datasets.load_digits(return_X_y=True)\n",
 "digits_X = StandardScaler().fit_transform(digits_X)"
 ],
- "id": "45d37676-a06b-459e-9877-6857a124d06b"
+ "id": "ea6d22f5-29d1-4018-a7d5-d60f4f301b92"
 },
 {
 "cell_type": "markdown",
@@ -64,7 +64,7 @@
 "is sent to `STDOUT` (e.g. calls to `print()`) or `STDERR` (e.g. uncaught\n",
 "exceptions) will also be captured and logged:"
 ],
- "id": "c19eeed4-76a5-43a9-bb75-a716a7e8ebba"
+ "id": "8355baa1-ba57-412e-91f0-bcdb7e410773"
 },
 {
 "cell_type": "code",
@@ -91,7 +91,7 @@
 "lab = labtech.Lab(storage=None)\n",
 "results = lab.run_tasks(experiments)"
 ],
- "id": "6a7b4078-e473-4711-8883-950e793581d5"
+ "id": "8d2b57a1-10f5-48db-982a-2052d8b17484"
 },
 {
 "cell_type": "markdown",
@@ -130,7 +130,7 @@
 "learning model (like `LRClassifierTask` below), and then make a task of\n",
 "that type a parameter for your primary experiment task:"
 ],
- "id": "c3da647c-fc3f-47e1-a2d8-504f60ec7b1a"
+ "id": "e2274012-1bf7-420d-9ab6-577eac524b55"
 },
 {
 "cell_type": "code",
@@ -171,7 +171,7 @@
 "lab = labtech.Lab(storage=None)\n",
 "results = lab.run_tasks([experiment])"
 ],
- "id": "aeb5e720-96f4-4892-9b9d-d0fd10226c34"
+ "id": "f4f79347-328d-4b38-82f9-2e7a227eebeb"
 },
 {
 "cell_type": "markdown",
@@ -182,7 +182,7 @@
 "[Protocol](https://docs.python.org/3/library/typing.html#typing.Protocol)\n",
 "that defines their common result type:"
 ],
- "id": "b402ef03-7d87-46fe-acfe-a4ddb57e2634"
+ "id": "786fbf62-96d7-4468-873b-eede7e927de5"
 },
 {
 "cell_type": "code",
@@ -242,7 +242,7 @@
 "lab = labtech.Lab(storage=None)\n",
 "results = lab.run_tasks(experiments)"
 ],
- "id": "ad6daf81-91dd-43df-9ac6-0ab9fe84d568"
+ "id": "23b7489f-c62c-4709-8b42-bc7056bea80f"
 },
 {
 "cell_type": "markdown",
@@ -262,7 +262,7 @@
 "> `Enum` must support equality between identical (but distinct) object\n",
 "> instances."
 ],
- "id": "cd43bd8d-d87d-46e2-ac88-c3b9991d40b0"
+ "id": "06b4ff10-27fe-4cf8-bf23-65ef9c75faea"
 },
 {
 "cell_type": "code",
@@ -311,7 +311,7 @@
 "lab = labtech.Lab(storage=None)\n",
 "results = lab.run_tasks(experiments)"
 ],
- "id": "58ec6e49-995a-4709-aef2-77df1a1afb96"
+ "id": "ba79db5c-f89b-4ebf-8aa6-6cbafe4382f6"
 },
 {
 "cell_type": "markdown",
@@ -334,7 +334,7 @@
 "The following example demonstrates specifying a `dataset_key` parameter\n",
 "to a task that is used to look up a dataset from the lab context:"
 ],
- "id": "089bd996-b048-47a3-a80d-f629f462bc49"
+ "id": "acee6f4a-c2ce-499e-aa4d-b38355d99648"
 },
 {
 "cell_type": "code",
@@ -371,7 +371,7 @@
 ")\n",
 "results = lab.run_tasks(experiments)"
 ],
- "id": "5523830a-a083-4d78-be16-5c0c07b94aa1"
+ "id": "9cc80f48-9e29-4ea4-a4f3-c73b6616f4f7"
 },
 {
 "cell_type": "markdown",
@@ -389,7 +389,7 @@
 "cross-validation within the task using a number of workers specified in\n",
 "the lab context as `within_task_workers`:"
 ],
- "id": "f140fb18-c300-4c3c-8a88-df771e88e1c6"
+ "id": "9ebd78a6-cec4-47ef-8379-77743ef2b6bf"
 },
 {
 "cell_type": "code",
@@ -430,7 +430,7 @@
 ")\n",
 "results = lab.run_tasks(experiments)"
 ],
- "id": "74d0c9d3-e585-431a-b905-bab19eea4015"
+ "id": "555fc02c-ec69-4816-ad36-dcf9af70d28d"
 },
 {
 "cell_type": "markdown",
@@ -457,7 +457,7 @@
 "raised during the execution of a task will be logged, but the execution\n",
 "of other tasks will continue:"
 ],
- "id": "fa27fba2-4af7-48c4-93b3-53b3f92f0a2b"
+ "id": "9468e082-bb96-41a0-b0cf-6233ba60d43a"
 },
 {
 "cell_type": "code",
@@ -470,7 +470,7 @@
 " continue_on_failure=True,\n",
 ")"
 ],
- "id": "d2ec351d-4f71-482d-b0b5-3253e429345d"
+ "id": "bad0d80d-69f6-4ca0-bd9a-283eb9d0fe74"
 },
 {
 "cell_type": "markdown",
@@ -489,7 +489,7 @@
 "sub-class for that extension so that you can continue using caches for\n",
 "the base class:"
 ],
- "id": "bbd39245-e64e-4ddd-b8f6-82f375a896a8"
+ "id": "0fe8f333-829c-48a1-a1e2-c8d04472a5c2"
 },
 {
 "cell_type": "code",
@@ -513,7 +513,7 @@
 " base_result = super().run()\n",
 " return base_result * self.multiplier"
 ],
- "id": "f1444edb-3e91-483d-8224-138d7e53eb25"
+ "id": "e68dd65f-888b-4953-8628-c6dfe39ed5c0"
 },
 {
 "cell_type": "markdown",
@@ -525,7 +525,7 @@
 "all cached task instances for a list of task types. You can then “run”\n",
 "the tasks to load their cached results:"
 ],
- "id": "1a6582d2-62a0-4b91-813e-d3bcb34918fa"
+ "id": "97e5dd18-3653-4722-887e-3ec263236896"
 },
 {
 "cell_type": "code",
@@ -536,7 +536,7 @@
 "cached_cvexperiment_tasks = lab.cached_tasks([CVExperiment])\n",
 "results = lab.run_tasks(cached_cvexperiment_tasks)"
 ],
- "id": "1addb848-196e-48fe-8e42-c883340391fd"
+ "id": "dc2f853f-83a6-42c2-b043-ce0564aa0fbc"
 },
 {
 "cell_type": "markdown",
@@ -547,7 +547,7 @@
 "You can clear the cache for a list of tasks using the `uncache_tasks()`\n",
 "method of a `Lab` instance:"
 ],
- "id": "986f605d-0ec4-4fc8-b41f-999b6342ee4d"
+ "id": "09f930a4-81c7-4300-93e3-e6be4ea9a949"
 },
 {
 "cell_type": "code",
@@ -557,7 +557,7 @@
 "source": [
 "lab.uncache_tasks(cached_cvexperiment_tasks)"
 ],
- "id": "336138c8-b6fe-4baa-b1ea-ee3a144528c3"
+ "id": "041f9dc3-50b9-42c4-968a-2ef0cf1bfc40"
 },
 {
 "cell_type": "markdown",
@@ -566,7 +566,7 @@
 "You can also ignore all previously cached results when running a list of\n",
 "tasks by passing the `bust_cache` option to `run_tasks()`:"
 ],
- "id": "0f284670-b192-4e7c-8166-8da19d3a0038"
+ "id": "48de1753-05ec-44ee-9085-d8a64aaadfea"
 },
 {
 "cell_type": "code",
@@ -576,7 +576,7 @@
 "source": [
 "lab.run_tasks(cached_cvexperiment_tasks, bust_cache=True)"
 ],
- "id": "753a2a94-acc2-4a74-9ed0-c8ee3a402946"
+ "id": "ae7e79b5-1028-46ee-918b-92285942a00f"
 },
 {
 "cell_type": "markdown",
@@ -600,7 +600,7 @@
 "consider using a\n",
 "[`TypeDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict):"
 ],
- "id": "dbc816bd-35dd-4b31-be3b-e4f4a4d7f2c9"
+ "id": "832c24e9-1382-491d-84b8-6b1cd49b9a36"
 },
 {
 "cell_type": "code",
@@ -627,7 +627,7 @@
 " model_weights=np.array([self.seed, self.seed ** 2]),\n",
 " )"
 ],
- "id": "92ed576c-57a5-4bca-b4d7-804f5e98001b"
+ "id": "adcc9686-2226-4b1d-8223-1b8f7e82cc95"
 },
 {
 "cell_type": "markdown",
@@ -645,7 +645,7 @@
 "The following example demonstrates defining and using a custom cache\n",
 "type to store Pandas DataFrames as parquet files:"
 ],
- "id": "7f435195-df49-48c0-bbb0-e2c857b17286"
+ "id": "9b2a911c-0a92-47b7-8499-fcba6835dfbc"
 },
 {
 "cell_type": "code",
@@ -688,7 +688,7 @@
 "lab = labtech.Lab(storage='storage/parquet_example')\n",
 "lab.run_tasks([TabularTask()])"
 ],
- "id": "de56d80a-7546-49e9-a49a-b95e97d80721"
+ "id": "d09fe3ac-0bff-4762-9cf3-db7b6b7dd444"
 },
 {
 "cell_type": "markdown",
@@ -710,7 +710,7 @@
 "S3](https://s3fs.readthedocs.io/en/latest/) and [Azure Blob\n",
 "Storage](https://github.com/fsspec/adlfs):"
 ],
- "id": "c360ccf9-8f93-4b50-b521-2a6954f9c0f9"
+ "id": "df89161b-59d0-4959-8ade-4f5ba71b272b"
 },
 {
 "cell_type": "code",
@@ -779,7 +779,7 @@
 "lab = labtech.Lab(storage=LocalFsspecStorage('storage/fsspec_example'))\n",
 "results = lab.run_tasks(experiments)"
 ],
- "id": "a08f04ad-f9b2-4198-86eb-ab903bff2eeb"
+ "id": "6cfddcd8-1c1c-47e3-a262-3509ec5742d1"
 },
 {
 "cell_type": "markdown",
@@ -821,7 +821,7 @@
 "`AggregationTask` to aggregate the results from many individual tasks to\n",
 "create an aggregated cache that can be loaded more efficiently:"
 ],
- "id": "e5441fdb-b76a-49bb-b9d8-4f1d7f7a20fd"
+ "id": "8016dada-229e-4389-b08f-f1c10b0a9b2d"
 },
 {
 "cell_type": "code",
@@ -862,7 +862,7 @@
 "lab = labtech.Lab(storage='storage/aggregation_lab')\n",
 "result = lab.run_task(aggregation_task)"
 ],
- "id": "347aacf7-21ca-4f40-9e39-d6cdd8527235"
+ "id": "44a381ad-462c-4a38-ba64-d2f5f9531f2c"
 },
 {
 "cell_type": "markdown",
@@ -892,7 +892,7 @@
 "\n",
 "The following code demonstrates this pattern:"
 ],
- "id": "eb22810a-53ff-4e69-adb8-2459bb360d06"
+ "id": "80810ff4-8e9d-4ee4-8d6e-0cd80457f1cc"
 },
 {
 "cell_type": "code",
@@ -934,7 +934,7 @@
 ")\n",
 "results = lab.run_tasks(experiments)"
 ],
- "id": "a519fe63-8074-49d8-869f-c0358a7bc223"
+ "id": "d49976be-1d19-4559-84cc-565255a3ea84"
 },
 {
 "cell_type": "markdown",
@@ -946,7 +946,7 @@
 "it was originally executed and how long it took to execute from the\n",
 "task’s `.result_meta` attribute:"
 ],
- "id": "f8d82f3a-1524-48de-8f67-f27954f6b7f1"
+ "id": "5bb3a49c-6a88-4808-a5c4-03605881def2"
 },
 {
 "cell_type": "code",
@@ -957,7 +957,7 @@
 "print(f'The task was executed at: {aggregation_task.result_meta.start}')\n",
 "print(f'The task execution took: {aggregation_task.result_meta.duration}')"
 ],
- "id": "8d3a48c1-8806-42ed-8839-e00ec2450a92"
+ "id": "c2a0b526-3069-4bc8-a47d-d38e64a4630e"
 },
 {
 "cell_type": "markdown",
@@ -977,7 +977,7 @@
 "Another approach is to include all of the intermediate tasks for which\n",
 "you wish to access the results for in the call to `run_tasks()`:"
 ],
- "id": "91d61ecb-9767-4b96-8572-4ddb460c405f"
+ "id": "76e4a54a-5918-4c40-bcf4-1634bbaacb9f"
 },
 {
 "cell_type": "code",
@@ -1005,7 +1005,7 @@
 " for experiment in experiments\n",
 "])"
 ],
- "id": "d4c2e193-ae5e-4726-bef6-872fcfa93895"
+ "id": "19274674-f714-43c9-b2b5-9d8ca095e188"
 },
 {
 "cell_type": "markdown",
@@ -1019,7 +1019,7 @@
 "available, so you may need to set `bust_cache=True` to ensure all\n",
 "intermediate tasks are executed:"
 ],
- "id": "2cc817a5-3479-4af4-844b-4f70162a9669"
+ "id": "5394728d-5d90-45f6-9e65-cdb5b857ad16"
 },
 {
 "cell_type": "code",
@@ -1047,7 +1047,7 @@
 " for experiment in experiments\n",
 "])"
 ],
- "id": "8960c4de-cc1a-4bec-b8b5-99c86f9581b3"
+ "id": "e4f88b82-d1f4-4bd8-88bc-8b53c61ac3e5"
 },
 {
 "cell_type": "markdown",
@@ -1063,7 +1063,7 @@
 "This is modeled in labtech by defining a task type for each step, and\n",
 "having each step depend on the result from the previous step:"
 ],
- "id": "f8c010ac-58e1-4721-a49f-6681d4e0b39e"
+ "id": "eb08bb55-7002-4295-86a4-2b2032f769d6"
 },
 {
 "cell_type": "code",
@@ -1113,7 +1113,7 @@
 "result = lab.run_task(task_c)\n",
 "print(result)"
 ],
- "id": "e5090e9c-29a9-48ce-a7c5-2118bb952341"
+ "id": "9ac0ab45-78f4-42b6-af27-60efc6de00b2"
 },
 {
 "cell_type": "markdown",
@@ -1125,7 +1125,7 @@
 "[Mermaid diagram](https://mermaid.js.org/syntax/classDiagram.html) of\n",
 "task types for a given list of tasks:"
 ],
- "id": "164c7328-f603-4e09-adcd-71e37f3e2db8"
+ "id": "805a30b2-1222-4b96-ab5b-c011af344d43"
 },
 {
 "cell_type": "code",
@@ -1140,7 +1140,7 @@
 " direction='RL',\n",
 ")"
 ],
- "id": "5e92b15f-4f33-4ed5-aba3-d6da5624eed9"
+ "id": "61c3a039-53fe-4b99-9c11-f8985e871955"
 },
 {
 "cell_type": "markdown",
@@ -1164,7 +1164,7 @@
 "additional tracking calls (such as `mlflow.log_metric()` or\n",
 "`mlflow.log_model()`) in the body of your task’s `run()` method:"
 ],
- "id": "b074774d-26aa-4865-a4ca-b42c1f7791e4"
+ "id": "e3d2d996-ef7d-4210-b909-9d2b6fad6399"
 },
 {
 "cell_type": "code",
@@ -1211,7 +1211,7 @@
 "lab = labtech.Lab(storage=None)\n",
 "results = lab.run_tasks(runs)"
 ],
- "id": "0b57a3dd-8a66-4f19-a6ae-59ce7dc58b74"
+ "id": "0fa4430f-45a5-4dd9-b3fe-bf2f4a8bcc5f"
 },
 {
 "cell_type": "markdown",
@@ -1237,9 +1237,11 @@
 "### Why do I see the following error: `An attempt has been made to start a new process before the current process has finished`?\n",
 "\n",
 "When running labtech in a Python script on Windows, macOS, or any Python\n",
- "environment using the `spawn` multiprocessing start method, you will see\n",
- "the following error if you do not guard your experiment and lab creation\n",
- "and other non-definition code with `__name__ == '__main__'`:\n",
+ "environment using the [`spawn` multiprocessing start\n",
+ "method](https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods),\n",
+ "you will see the following error if you do not guard your experiment and\n",
+ "lab creation and other non-definition code with\n",
+ "`__name__ == '__main__'`:\n",
 "\n",
 " RuntimeError:\n",
 " An attempt has been made to start a new process before the\n",
@@ -1260,7 +1262,7 @@
 "non-definition code for a Python script in a `main()` function, and then\n",
 "guard the call to `main()` with `__name__ == '__main__'`:"
 ],
- "id": "1aacb4fa-26cc-42b3-b4b4-ff1525bf9758"
+ "id": "97b7f8bc-eab1-48c3-a057-a5b64b0675f3"
 },
 {
 "cell_type": "code",
@@ -1291,16 +1293,37 @@
 "if __name__ == '__main__':\n",
 " main()"
 ],
- "id": "663f575b-b70a-450d-b4ba-10538c7f13e8"
+ "id": "a1929a02-4810-4c6d-b08f-2e001c6282e1"
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
 "For details, see [Safe importing of main\n",
- "module](https://docs.python.org/3/library/multiprocessing.html#multiprocessing-safe-main-import)."
+ "module](https://docs.python.org/3/library/multiprocessing.html#multiprocessing-safe-main-import).\n",
+ "\n",
+ "### Why do I see the following error: `AttributeError: Can't get attribute 'YOUR_TASK_CLASS' on `?\n",
+ "\n",
+ "You will see this error (as part of a very long stack trace) when\n",
+ "defining and running labtech tasks from an interactive Python shell on\n",
+ "Windows or macOS (or more specifically, when [Python’s multiprocessing\n",
+ "start\n",
+ "method](https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods)\n",
+ "has been set to `spawn` or `forkserver`).\n",
+ "\n",
+ "The solution to this error is to define all of your labtech `Task` types\n",
+ "in a separate `.py` Python module file which you can import into your\n",
+ "interactive shell session (e.g. `from my_module import MyTask`).\n",
+ "\n",
+ "The reason for this error is that `spawn` and `forkserver` start methods\n",
+ "will not copy the current state of your `__main__` module (which\n",
+ "contains the variables you declare interactively in the Python shell,\n",
+ "including task definitions) into labtech’s task subprocesses. This error\n",
+ "does not occur for the `fork` start method (the current default on\n",
+ "Linux) because forked subprocesses *do* receive the current state of all\n",
+ "modules (including `__main__`) from the parent process."
 ],
- "id": "00eebba9-bb2b-4d3d-b831-df020b15d4b0"
+ "id": "36e28894-934d-4ac4-80ea-91adaff4734a"
 }
 ],
 "nbformat": 4,