Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add multi-arena episodes #35

Merged
merged 3 commits into from
Apr 16, 2024
Merged

Add multi-arena episodes #35

merged 3 commits into from
Apr 16, 2024

Conversation

benaslater
Copy link
Contributor

Adds the feature of multi-arena episodes, by looking for the mergeNextArena parameter in an arena's yaml. When a good (not timeout or death) episode end is reached TrainingAgent checks TrainingArena to see if it should end the episode or merge the next arena. If the latter it calls the new TrainingArena function loadNextArena.

  • Discarded Alternative: Just call ResetArena() since it does basically the same thing. This would be more complicated to implement in the situation where arenas are being randomly cycled. Also I think the semantics of ResetArena in AAI are already a bit complicated and I didn't want to further the problem.

  • Discarded alternative: Add a completeArena function to TrainingArena with a boolean for whether the episode was successful or not. Benefit: The TrainingAgent doesn't have to care about whether to call EndEpisode or rotate the arenas Drawback: Would have to pass control of showing notifications to TrainingArena, as TrainingAgent doesn't know whether the completion of an arena means a notification should be shown (episode end) or not (arena has been merged)

Proposed change(s)

Adds multi arena episode support

Additional changes:

  • Merge the functions TryLoadArenaConfiguration, ApplyNewArenaConfiguration, CleanupRewards, NotifyArenaChange into one function (having three separate functions when they're always called in sequence increases the chance one is missed out)
  • If the arena is randomised, use the ChooseRandomArenaID function on the first call (I think we can probably get rid of that code branch somehow but I wasn't brave enough to do it this time)
  • Rename getMaxArenaID to getArenaCount (This function returns the count. We expect arenas to be indexed from 0, so would expect the max arena ID to be the count-1)
  • Rename DetermineNextArenaID to SetNextArenaID to emphasise the function's side effect

Useful links (Github issues, ML-Agents forum threads etc.)

Types of change(s)

  • Bug fix
  • New feature
  • Code refactor
  • Breaking change
  • Documentation update
  • Other (please describe)

Checklist

  • [To be added in new package] Added tests that prove my fix is effective or that my feature works
  • [Will do in update to https://github.com/Add documentation for multi-arena episodes animal-ai#51] Updated the changelog (if applicable)
  • Updated the documentation (if applicable)
  • Updated the migration guide (if applicable)

Testing

  • Tests of a merged arena + failure cases in E2E tests (I'll push these to a new repo)
  • Manually test error case of putting mergeNextArena in the final arena (The error does get swallowed though - https://www.notion.so/Don-t-swallow-Unity-errors-9c1c84c1c51c4165b0d5d46e3d764c7c?pvs=4)
  • Manually test exclusion of merged arenas when randomiseArenas is set to true (Add a debug log in TrainingArena.cs that prints out validArenaIDs and confirm one is excluded)
  • Train stablebaselines PPO on the arena from the E2E test and observe it learns without hitting an error

Adds the feature of multi-arena episodes, by looking for the mergeNextArena
parameter in an arena's yaml. When a good (not timeout or death) episode end
is reached TrainingAgent checks TrainingArena to see if it should end the
episode or merge the next arena. If the latter it calls the new TrainingArena
function loadNextArena.
@alhasacademy96 alhasacademy96 self-assigned this Apr 10, 2024
Copy link
Member

@alhasacademy96 alhasacademy96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

however, you will upon inspection see that we have a duplicate code on lines 232, GetTotalArenas().

I think the original getMaxArenaID() had or meant to have something like this:

public int getMaxArenaID{
return _arenasConfigurations.configurations.sum;
}

which would return an integer of total arenas defined in the config. An argument would be presented here that both .sum and .count would technically return the same number as both are returning the number of arenas defined. However, having two seperate methods ensures encapsulation and allows for flexibility.

i think we should revert this back to the original name and change .count to .sum, which would then serve its purpose. I can do this on my end but in the meantime lets change the method name back.

Assets/Scripts/AAI3EnvironmentManager.cs Outdated Show resolved Hide resolved
Copy link
Member

@alhasacademy96 alhasacademy96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all in all a solid approach. But i think a few changes are to be made and some code to be reverted for effectiveness.

@@ -141,6 +141,7 @@ public class ArenaConfiguration
public bool toUpdate = false;
public string protoString = "";
public int randomSeed = 0;
public bool mergeNextArena = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having this boolean applied locally (per arena definition in the config file) would be inferior to having it applied globally, like the canChangePerspective boolean. It would make sense to set this boolean once (TRUE = ALL ARENAS ARE MERGED / FALSE = NO MERGEDARENAS) and only once at the top of the config file which would then apply to all arenas, rather than specifiying it for each arena.

However, if the point of this parameter is to have the flexibility of choosing exactly which and what arenas are to be merged, then this makes sense having it set locally.

Can you elaborate on the actual requirements in this regard?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the second of your points re flexibility is what's important for this change.

An example use case for this feature is to have a maze that the agent navigates several times during one episode. For example the first time might be to learn the maze structure and then the second time we've added a block, so that the most efficient route is no longer possible and the agent has to use its learning from the previous run to reroute. For this example if we had 3 mazes we wanted to test on, we would have 6 arenas we want to merge in pairs (0-1, 2-3, 4-5).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, perhaps an elaboration could be made on the actual and most probable use case for this feature, in the docs. Maybe it was just me but i did not understand why and how it would be used :)

Assets/Scripts/ArenasParameters.cs Show resolved Hide resolved
Assets/Scripts/YAMLclasses.cs Show resolved Hide resolved
Assets/Scripts/TrainingAgent.cs Show resolved Hide resolved
Assets/Scripts/TrainingAgent.cs Show resolved Hide resolved
Assets/Scripts/TrainingArena.cs Show resolved Hide resolved
Assets/Scripts/TrainingArena.cs Show resolved Hide resolved
Assets/Scripts/TrainingArena.cs Outdated Show resolved Hide resolved
Assets/Scripts/TrainingArena.cs Outdated Show resolved Hide resolved
Assets/Scripts/TrainingArena.cs Outdated Show resolved Hide resolved
@alhasacademy96
Copy link
Member

all in all a solid approach. But i think a few changes are to be made and some code to be reverted for effectiveness.

getArenaCount duplicated GetTotalArenas.

GetConfiguration previously returned a bool and there was some special
handling for the case it couldn't find a configuration.
We've removed the special handling so removing the bool return to make
using it a bit less clunky.
@benaslater
Copy link
Contributor Author

Sorry I'm a bit of a novice on github PRs - I've pushed a new commit to the feature branch with the changes discussed in the comments. Do I close this PR and open a new one on that?

ref: 5050eb3

@alhasacademy96
Copy link
Member

alhasacademy96 commented Apr 11, 2024

Sorry I'm a bit of a novice on github PRs - I've pushed a new commit to the feature branch with the changes discussed in the comments. Do I close this PR and open a new one on that?

ref: 5050eb3

No need. simply see the commits you made after the comments are reflected on this PR. You can also see the "OUTDATED" marker for the scripts you made changes to which are now overwritten by the newer commits you made recently.

Edit: WHEN you think you're ready for a re-PR, simply click the recycling button next to my name at the top right so I know you're ready :). This button simply asks for a review again.

@alhasacademy96
Copy link
Member

alhasacademy96 commented Apr 11, 2024

however, I do think that the important change is still not made, specifically keeping the modularity of TrainingArena.cs, where because its such an important script, I think the best way to go at it is to keep the modular structure I set for my previous PR. Your methods and modifications will still be valid.

It's important to keep the mainResetArena()method as readable and organized as possible, whilst incorporating your additions.

@benaslater

Copy link

Multi-Arena Episodes

1 similar comment
Copy link

Multi-Arena Episodes

@benaslater
Copy link
Contributor Author

I think the best way to go at it is to keep the modular structure I set for my previous PR.

I don't think I agree. Moving a big block of code into a function with a clear interface can be useful for code clarity, but I view breaking that code into several functions that must be called in sequence as a bit of an antipattern where we need to do it multiple times.

i.e. if I wanted to achieve A, and that involved doing A1, then A2, then A3. I could do it by defining A (where A1, A2 and A3 are clearly commented in the definition of A), or I could do it by defining A1, A2 and A3 separately. The issue is if I do it in the latter way now every time I want to do A it takes me 3 function calls instead of 1, and I don't seem to get much benefit over clearly marking the distinct stages of A in the body of my A function with comments. The case where it might be attractive to do it anyway is if A1, A2, A3 are useful elsewhere, but even then I would be tempted to define A() = A1(); A2(); A3();

In this case A is UpdateActiveArenaToCurrentArenaID: given that the current arena ID has been updated, we want to do all the steps needed to achieve that. We need to do that in ResetArena() and LoadNextArena().

@alhasacademy96
Copy link
Member

Hey Ben, here are some points I'd like you to consider in general and not in necessarily specific for your PR only.

-Maintainability + readability: Breaking down a large function like UpdateActiveArenaToCurrentArenaID (renamed) into smaller, logically separated functions (A1, A2, A3) can significantly enhance readability and maintainability. Each function can be focused on a single responsibility, making the code easier to understand at a glance. When functions are smaller and well-named, new developers or others unfamiliar with the code can more easily understand the workflow and purpose of the code. We need to think about future developers and collaborators in this regard for AAI.

-Reusability: Although in your scenario Ben A1, A2, and A3 might primarily support the process A, encapsulating these steps into separate functions can unexpectedly benefit future developments. For instance, if a new feature or requirement emerges that only needs the functionality of A2, it can be reused without duplicating code or introducing unnecessary dependencies.

-Tests: Smaller, well-defined functions are easier to test (recall what i mentioned earlier). When functions are broken down into smaller parts, it's easier to write tests for each part, ensuring that they work correctly in isolation. This can lead to a more robust and reliable codebase, as each part can be individually verifiied..

-Error Handling and debugging: It can be easier to handle errors and debug smaller functions. When an error occurs, if the code is split into smaller parts, it's often quicker to isolate the problem. Also, you can handle errors specific to the part of the process each function represents, which might be more difficult in a larger, monolithic functio

-Flexibility in refactoring/extending: Future changes to the process can be more manageable when the code is modular. For instance, if you later need to add a new step A4 between A2 and A3, doing so is straightforward without disturbing the other parts of the code. i think this can be more complex and error-prone in a single large function where changes affect the entire function's flow (which was the case before my PR).

-Design principles (important for any SD): breaking down functions follows well-established software design principles like SRP SoC. These principles advocate for structuring software so that each element of the code has a single responsibility and is independent of the other parts, which enhances modularity and scalability, which you already know I assume :) but a reminder.

So the point im trtying to make is that it's imperative that we understand these points and implement them where we can. Again, these points do not only apply to your PR, but in general for the codebase. I would have to kindly insist on this matter Ben.

@benaslater
Copy link
Contributor Author

Hey Ibrahim, thank you for these general points. I don't agree that factoring into functions in this way is an improvement from a software engineering perspective, but it's not likely to have a big impact either way so I've pushed a change returning the functions.

@alhasacademy96
Copy link
Member

Thanks you can merge when you're ready

Copy link
Member

@alhasacademy96 alhasacademy96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all good to go!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants