Reinforcement learning for load distribution in a decentralized Edge environment. This is the implementation of my Master's thesis project for the Data science course (October 2023).

GiacomoPracucci/RL-edge-computing

Reinforcement Learning for load distribution in decentralized Edge environment

Paper: https://dl.acm.org/doi/10.1145/3660319.3660331

Description

This project implements two deep reinforcement learning algorithms, SAC (Soft Actor-Critic) and PPO (Proximal Policy Optimization), and the evolutionary algorithm NEAT (NeuroEvolution of Augmenting Topologies) to optimize workload management in an Edge Computing system (DFaaS). The goal is to learn a policy that decides, based on system conditions, how many requests to process locally, how many to forward to other edge nodes, and how many to reject. The current implementation still makes simplifying assumptions compared to the real scenario.

In the simulated environment, the agent receives a stream of incoming requests whose volume varies over time. At each step, it must decide how many of the current requests to process locally, how many to forward to another edge node, and how many to reject.

The action space is a three-dimensional continuous box. Each dimension is the proportion of requests that is processed locally, forwarded, or rejected, respectively.
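To make the action format concrete, here is a minimal sketch of how a three-component action could be turned into integer request counts. The function name `split_requests`, the normalization step, and the choice to assign the rounding remainder to the rejected share are assumptions made for illustration; the repository may handle these details differently.

```python
def split_requests(action, n_requests):
    """Map a 3-component action to (local, forwarded, rejected) counts.

    Assumption: raw components are clipped at zero and normalized so the
    three shares sum to 1 before being applied to the request count.
    """
    shares = [max(0.0, a) for a in action]
    total = sum(shares)
    if total == 0:
        # Degenerate action: fall back to an even split (illustrative choice).
        shares = [1 / 3, 1 / 3, 1 / 3]
    else:
        shares = [s / total for s in shares]

    local = int(shares[0] * n_requests)
    forwarded = int(shares[1] * n_requests)
    # Whatever is left after truncation is rejected, so the counts always
    # add up to n_requests (illustrative choice).
    rejected = n_requests - local - forwarded
    return local, forwarded, rejected
```

For example, `split_requests((2.0, 1.0, 1.0), 8)` normalizes the action to shares of 0.5, 0.25, and 0.25 and returns `(4, 2, 2)`.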

The observation space consists of four components:

  • The number of incoming requests
  • The remaining queue capacity
  • The remaining forward capacity
  • A congestion flag, indicating whether the queue is congested
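The four components listed above can be sketched as a small record type. This is only an illustration: the field names, the `make_observation` helper, and the rule that the queue counts as congested when no capacity remains are assumptions, not the definitions used in the repository.

```python
from typing import NamedTuple


class Observation(NamedTuple):
    incoming_requests: int   # requests arriving at this step
    queue_capacity: int      # remaining local queue capacity
    forward_capacity: int    # remaining capacity for forwarding
    congested: int           # 1 if the queue is congested, else 0


def make_observation(incoming, queue_used, queue_max, forward_capacity):
    """Build an observation from hypothetical raw counters."""
    remaining = max(0, queue_max - queue_used)
    # Assumed congestion rule: the queue is congested when it is full.
    congested = 1 if remaining == 0 else 0
    return Observation(incoming, remaining, forward_capacity, congested)
```

A reinforcement learning library would typically receive these values as a flat numeric vector, which `list(make_observation(...))` provides.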

The reward depends on the agent's actions and on the system state. It awards more points for processing requests locally and fewer for forwarding them, and it heavily penalizes rejecting requests and congesting the queue.
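One way to read this description is as a weighted sum with a fixed congestion penalty. The sketch below follows that reading; the function name `compute_reward` and all weights are hypothetical values chosen only to reproduce the stated ordering (local above forwarded, rejections and congestion penalized), not the values used in the thesis.

```python
def compute_reward(local, forwarded, rejected, congested,
                   w_local=3.0, w_forward=1.0,
                   w_reject=10.0, congestion_penalty=100.0):
    """Illustrative reward: all weights are assumed, not taken from the project."""
    # Local processing earns the most, forwarding earns less.
    reward = w_local * local + w_forward * forwarded
    # Each rejected request is penalized heavily.
    reward -= w_reject * rejected
    # A congested queue adds a large fixed penalty.
    if congested:
        reward -= congestion_penalty
    return reward
```

With these assumed weights, handling 10 requests locally scores higher than forwarding all 10, which in turn scores higher than rejecting all 10.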

Training and test settings

Three training scenarios were defined. They differ in how incoming requests are generated and in how the forwarding capacity available on other nodes is updated.

  • Scenario 1 (figure: scenario_1)
  • Scenario 2 (figure: scenario_2)
  • Scenario 3 (figure: scenario_3)

The idea is to evaluate the results across different work contexts. Testing each algorithm in scenarios other than the one it was trained in lets us assess its generalization capability, that is, check whether it has overfitted to the training scenario.
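The train-in-one, test-in-all protocol described above can be sketched as a simple grid. The helper `cross_scenario_scores` and its `train` and `evaluate` callables are hypothetical placeholders for whatever training and testing routines are actually used; only the evaluation structure is illustrated here.

```python
def cross_scenario_scores(scenarios, train, evaluate):
    """Train once per scenario, then test each policy in every scenario.

    `train(scenario)` returns a policy and `evaluate(policy, scenario)`
    returns a score; both are caller-supplied (hypothetical) callables.
    """
    results = {}
    for train_scenario in scenarios:
        policy = train(train_scenario)
        for test_scenario in scenarios:
            results[(train_scenario, test_scenario)] = evaluate(policy, test_scenario)
    return results
```

Entries where the training and test scenarios match give in-distribution performance, while the off-diagonal entries show how well a policy transfers to a scenario it has not seen.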

Best experiment results

The highest reward scores and the best generalization were achieved by PPO with standard hyperparameters, trained in scenario 2.

  • Results of testing PPO (trained in scenario 2) in scenario 3 (figures: s2s3_reward, s2s3_rejected)

  • Results of testing PPO (trained in scenario 2) in scenario 1 (figures: s2s1_reward, s2s1_rejected)
