Is asyncssh the solution to my problem? #475
-
Hello everyone, I created a script where I send 15,000 commands to 70 switches, around 200 commands per switch, to fetch data from them.
My script takes 30 minutes to run and fetch all the data I want from the switches. Do you think asyncssh is a good tool for what I'm doing, and will its performance be better than what I'm using today? Thank you,
-
The exact performance will depend on how quickly the switches respond to your commands, but I did a quick test here of opening 75 connections to "localhost", with each connection sequentially starting 200 sessions running "ls" and collecting up all the results. To avoid running into simultaneous authentication limits on localhost, I added a semaphore that limited the code to starting at most 10 connections at once, but you probably wouldn't need that if you were connecting to different target systems. Here's some example code:

```python
import asyncio, asyncssh

# Limit the number of simultaneous connection attempts, to avoid
# hitting authentication limits when every target is "localhost".
sem = asyncio.Semaphore(10)

targets = 75*[{'host': 'localhost', 'commands': 200*['ls']}]

async def run_client(host, commands):
    async with sem:
        conn = await asyncssh.connect(host)

    async with conn:
        return [await conn.run(command) for command in commands]

async def run_multiple_clients():
    tasks = (run_client(target['host'], target['commands']) for target in targets)
    return await asyncio.gather(*tasks, return_exceptions=True)

asyncio.run(run_multiple_clients())
```

Even when running all of this in a single event loop (and thus only being able to take advantage of a single core), my test finished in just a little over 1 minute.

Here's a variation that uses multiple threads, with an event loop per thread. Interestingly, it's actually a bit slower than the single event loop version, even though it did take advantage of more than one CPU core.

```python
import asyncio, asyncssh
from concurrent.futures import ThreadPoolExecutor

targets = 75*[{'host': 'localhost', 'commands': 200*['echo foo']}]

async def run_client(host, commands):
    async with asyncssh.connect(host) as conn:
        return [await conn.run(command) for command in commands]

def run_async(target):
    # Each worker thread gets its own event loop via asyncio.run().
    return asyncio.run(run_client(target['host'], target['commands']))

if __name__ == '__main__':
    with ThreadPoolExecutor(10) as executor:
        results = executor.map(run_async, targets)
```

To get real benefit here, it might be necessary to use multiple processes instead, so the Python global interpreter lock doesn't get in the way of concurrency. However, attempting to switch to ProcessPoolExecutor and processing results locally in each process to avoid having to return them to the main process only shaved off 10 seconds or so, and it used a lot more total CPU. By far, the first version above was the most efficient.

It's hard to know with your actual target hosts whether AsyncSSH will be faster than your current solution or not, but it's certainly capable of handling this kind of workload without a problem. The code is also pretty compact.
-
Hello Ron, thank you so much for your prompt reply. I had a bit of trouble implementing a jump server connection at first, but I finally figured out how to do it. I will let you know as soon as I finish it. Thank you
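As a hedged illustration of what such a jump-server connection can look like, `asyncssh.connect()` accepts a `tunnel` argument that routes a new connection through an existing one; the host names below are placeholders, not the ones from this thread:

```python
import asyncio, asyncssh

async def main():
    # Connect to the jump server first (placeholder host name)...
    async with asyncssh.connect('jump.example.com') as tunnel:
        # ...then route the switch connection through it via the
        # tunnel argument to asyncssh.connect().
        async with asyncssh.connect('switch1.example.com', tunnel=tunnel) as conn:
            result = await conn.run('ls')
            print(result.stdout)

asyncio.run(main())
```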
-
That's getting closer, but you'll want to move the connect_ssh() call and the "async with sem" around it to be the first lines in get_port_one(). That way, that code runs as part of the tasks you are creating and running in parallel, instead of the connects running serially in the for loop in get_port_all() as the tasks are being created.

Once you've done that, I think the main remaining issue will be that you may hit the maximum number of allowed sessions on the jump host. To fix that, you'd need to call asyncssh.connect() multiple times, limiting the number of calls to connect_ssh() made on any given tunnel. You can still reuse a tunnel object for multiple connect_ssh() calls, but just not ALL of the calls if you have a lot of switches.

Moving the asyncssh.connect() inside the for loop and keeping a counter for the number of sessions you've opened on it would probably be the easiest thing. When the count hits 10, do a new call to asyncssh.connect(), replacing the previous tunnel object. The previous sessions you created should be enough to keep the previous tunnel connection from being garbage-collected even though you don't have an explicit reference to it. You'd be relying on garbage collection to clean it up later on when you are done, but since your program exits at that point, that's not a big concern. You could also keep an explicit list of all the tunnel objects you open in this way, and call close() and await wait_closed() on them at the very end, if you wanted to cleanly free them.
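Putting those pieces together, a minimal sketch might look like the following; get_port_one() and get_port_all() stand in for the code discussed in this thread (which isn't shown here), and the jump host name and switch command are placeholder assumptions:

```python
import asyncio, asyncssh

JUMP_HOST = 'jump.example.com'    # assumption: your jump server
MAX_SESSIONS_PER_TUNNEL = 10      # open a fresh tunnel every 10 switches

sem = asyncio.Semaphore(10)

async def get_port_one(tunnel, switch):
    # Connect inside the task so the connects run in parallel, limited
    # by the semaphore instead of serialized in the creation loop.
    async with sem:
        conn = await tunnel.connect_ssh(switch)

    async with conn:
        return await conn.run('show interfaces')   # placeholder command

async def get_port_all(switches):
    tunnels = []
    tasks = []

    for count, switch in enumerate(switches):
        # Rotate to a new tunnel every 10 switches so no single tunnel
        # exceeds the jump host's allowed session count.
        if count % MAX_SESSIONS_PER_TUNNEL == 0:
            tunnels.append(await asyncssh.connect(JUMP_HOST))

        tasks.append(get_port_one(tunnels[-1], switch))

    results = await asyncio.gather(*tasks, return_exceptions=True)

    # Close the tunnels explicitly rather than relying on garbage collection.
    for tunnel in tunnels:
        tunnel.close()

    await asyncio.gather(*(tunnel.wait_closed() for tunnel in tunnels))

    return results

# Example invocation with placeholder switch names:
# asyncio.run(get_port_all([f'switch{n}.example.com' for n in range(70)]))
```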
-
That's terrific, David -- glad to hear you got it working, and got such an improvement! I'm happy to help - thanks for letting me know how you made out...
-
Hello @ronf, I hope you are doing well. I'm trying to use asyncssh again, this time to send several commands to the same piece of equipment, but I'm having some trouble doing it. I want to send the command
When I run this script, it seems to get stuck in an infinite loop and nothing happens. Thank you,
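The script and command in question don't appear in this excerpt, but as a general hedged sketch: one common way to send several commands over a single asyncssh connection is an interactive session via create_process(), and a frequent cause of an apparent hang is reading to EOF on a shell that never exits, so this reads up to a prompt instead. The host, prompt character, and commands below are all placeholders:

```python
import asyncio, asyncssh

async def main():
    async with asyncssh.connect('switch1.example.com') as conn:
        # Request a pty so the device presents its interactive CLI.
        async with conn.create_process(term_type='vt100') as process:
            for command in ['show version', 'show clock']:
                process.stdin.write(command + '\n')
                # Read up to the next prompt rather than to EOF --
                # reading to EOF on an interactive shell blocks forever
                # and looks like an infinite loop.
                output = await process.stdout.readuntil('#')
                print(output)
            process.stdin.write('exit\n')

asyncio.run(main())
```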