Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/mvds00/test axon subprocess error checking #2226

Open
wants to merge 2 commits into
base: staging
Choose a base branch
from

Conversation

mvds00
Copy link

@mvds00 mvds00 commented Aug 13, 2024

Bug

While working with the e2e tests, it appeared that the miner code did not function well, but the test never gave any indication of this. The two issues that this PR fixes are:

  • the miner code is not run with NEST_ASYNCIO disabled, leading to errors generated by uvicorn, that is used by class axon of bittensor.
  • test_axon.py doesn't check any output of the miner code.
    Although the miner code gets far enough to register an axon, making the test succeed, going forward it is not practical to not have everything working flawlessly under the hood.

Description of the Change

The subprocess to run a miner is now started with asyncio.create_subprocess_exec instead of asyncio.create_subprocess_shell so that it can be easily terminated by the test, to examine the output.
The status of the subprocess is monitored. The NEST_ASYNCIO environment variable is set to 0 to disable nest_asyncio and fix the miner.

Alternate Designs

Rework of the test is actively worked on, this will be a separate PR on top of these commits.

Possible Drawbacks

Not known.

Verification Process

The test was run and with additional debugging it was observed that the miner now runs better.

Release Notes

N/A

@roman-opentensor roman-opentensor added the In Review Cortex Team Members Reviewing label Aug 13, 2024
µ added 2 commits August 14, 2024 09:53
Issues in miner.py largely went unnoticed. This patch addresses this by
checking whether the miner indeed runs, and at the end of the test
terminating it and examining its output. It is asserted that the string
"Exception" is not present in stdout/stderr.

The subprocess is started with asyncio.create_subprocess_exec() so that
it can be terminated easily. Added benefit is that escaping and quoting
is not necessary anymore, as the arguments are passed as a list.
Bittensor uses nest_asyncio by default. This doesn't work well with
uvicorn, that is used in the implementation of axon.

By disabling nest_asyncio the miner now runs without errors.
@heeres heeres force-pushed the feature/mvds00/test_axon-subprocess-error-checking branch from 40305f1 to 5070627 Compare August 14, 2024 10:18
@mvds00
Copy link
Author

mvds00 commented Aug 14, 2024

fixed ruff formatting issue

Copy link
Contributor

@thewhaleking thewhaleking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few changes. I like the idea of this test, but I really do want nest_asyncio to completely go away in the future.

)
logging.info("Neuron Alice is now mining")
await asyncio.sleep(1) # wait for the miner to start up or fail
if axon_process.returncode != None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is not instead of !=

await asyncio.sleep(
5
) # wait for 5 seconds for the metagraph to refresh with latest data
if axon_process.returncode != None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is not instead of !=

stdout = textwrap.indent(stdout.decode("UTF8"), INDENT)
stderr = textwrap.indent(stderr.decode("UTF8"), INDENT)
assert (
axon_process.returncode == None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is instead of ==

stdout = textwrap.indent(stdout.decode("UTF8"), INDENT)
stderr = textwrap.indent(stderr.decode("UTF8"), INDENT)
assert (
axon_process.returncode == None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is instead of ==

@@ -109,4 +129,29 @@ async def test_axon(local_chain):
assert updated_axon.port == 8091
assert updated_axon.hotkey == alice_keypair.ss58_address
assert updated_axon.coldkey == alice_keypair.ss58_address

# Terminate miner and make sure nothing bad happened.
if axon_process.returncode == None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is

try:
# try/except to prevent exception in case the process just ends now
axon_process.terminate()
except:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not do bare excepts. Ideally, we should know what we're trying to catch, but if not, at least except Exception as e and log it.

pass
await asyncio.sleep(1)

if axon_process.returncode == None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is

try:
# try/except to prevent exception in case the process just ends now
axon_process.kill()
except:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bare except

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
In Review Cortex Team Members Reviewing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants