Add multi-arena episodes #35

benaslater · 2024-04-10T09:44:40Z

Adds the feature of multi-arena episodes, by looking for the mergeNextArena parameter in an arena's yaml. When a good (not timeout or death) episode end is reached TrainingAgent checks TrainingArena to see if it should end the episode or merge the next arena. If the latter it calls the new TrainingArena function loadNextArena.

Discarded Alternative: Just call ResetArena() since it does basically the same thing. This would be more complicated to implement in the situation where arenas are being randomly cycled. Also I think the semantics of ResetArena in AAI are already a bit complicated and I didn't want to further the problem.
Discarded alternative: Add a completeArena function to TrainingArena with a boolean for whether the episode was successful or not. Benefit: The TrainingAgent doesn't have to care about whether to call EndEpisode or rotate the arenas Drawback: Would have to pass control of showing notifications to TrainingArena, as TrainingAgent doesn't know whether the completion of an arena means a notification should be shown (episode end) or not (arena has been merged)

Proposed change(s)

Adds multi arena episode support

Additional changes:

Merge the functions TryLoadArenaConfiguration, ApplyNewArenaConfiguration, CleanupRewards, NotifyArenaChange into one function (having three separate functions when they're always called in sequence increases the chance one is missed out)
If the arena is randomised, use the ChooseRandomArenaID function on the first call (I think we can probably get rid of that code branch somehow but I wasn't brave enough to do it this time)
Rename getMaxArenaID to getArenaCount (This function returns the count. We expect arenas to be indexed from 0, so would expect the max arena ID to be the count-1)
Rename DetermineNextArenaID to SetNextArenaID to emphasise the function's side effect

Useful links (Github issues, ML-Agents forum threads etc.)

Ticket: https://www.notion.so/Multi-Arena-Episodes-7edc1ee064034193967f46bc12c75913?pm=c
Documentation change: Add documentation for multi-arena episodes animal-ai#51

Types of change(s)

Checklist

[To be added in new package] Added tests that prove my fix is effective or that my feature works
[Will do in update to https://github.com/Add documentation for multi-arena episodes animal-ai#51] Updated the changelog (if applicable)
Updated the documentation (if applicable)
Updated the migration guide (if applicable)

Testing

Tests of a merged arena + failure cases in E2E tests (I'll push these to a new repo)
Manually test error case of putting mergeNextArena in the final arena (The error does get swallowed though - https://www.notion.so/Don-t-swallow-Unity-errors-9c1c84c1c51c4165b0d5d46e3d764c7c?pvs=4)
Manually test exclusion of merged arenas when randomiseArenas is set to true (Add a debug log in TrainingArena.cs that prints out validArenaIDs and confirm one is excluded)
Train stablebaselines PPO on the arena from the E2E test and observe it learns without hitting an error

Adds the feature of multi-arena episodes, by looking for the mergeNextArena parameter in an arena's yaml. When a good (not timeout or death) episode end is reached TrainingAgent checks TrainingArena to see if it should end the episode or merge the next arena. If the latter it calls the new TrainingArena function loadNextArena.

alhasacademy96

however, you will upon inspection see that we have a duplicate code on lines 232, GetTotalArenas().

I think the original getMaxArenaID() had or meant to have something like this:

public int getMaxArenaID{
return _arenasConfigurations.configurations.sum;
}

which would return an integer of total arenas defined in the config. An argument would be presented here that both .sum and .count would technically return the same number as both are returning the number of arenas defined. However, having two seperate methods ensures encapsulation and allows for flexibility.

i think we should revert this back to the original name and change .count to .sum, which would then serve its purpose. I can do this on my end but in the meantime lets change the method name back.

Assets/Scripts/AAI3EnvironmentManager.cs

alhasacademy96

all in all a solid approach. But i think a few changes are to be made and some code to be reverted for effectiveness.

alhasacademy96 · 2024-04-10T17:10:39Z

Assets/Scripts/ArenasParameters.cs

@@ -141,6 +141,7 @@ public class ArenaConfiguration
 		public bool toUpdate = false;
 		public string protoString = "";
 		public int randomSeed = 0;
+		public bool mergeNextArena = false;


I think having this boolean applied locally (per arena definition in the config file) would be inferior to having it applied globally, like the canChangePerspective boolean. It would make sense to set this boolean once (TRUE = ALL ARENAS ARE MERGED / FALSE = NO MERGEDARENAS) and only once at the top of the config file which would then apply to all arenas, rather than specifiying it for each arena.

However, if the point of this parameter is to have the flexibility of choosing exactly which and what arenas are to be merged, then this makes sense having it set locally.

Can you elaborate on the actual requirements in this regard?

I think the second of your points re flexibility is what's important for this change.

An example use case for this feature is to have a maze that the agent navigates several times during one episode. For example the first time might be to learn the maze structure and then the second time we've added a block, so that the most efficient route is no longer possible and the agent has to use its learning from the previous run to reroute. For this example if we had 3 mazes we wanted to test on, we would have 6 arenas we want to merge in pairs (0-1, 2-3, 4-5).

In this case, perhaps an elaboration could be made on the actual and most probable use case for this feature, in the docs. Maybe it was just me but i did not understand why and how it would be used :)

Assets/Scripts/ArenasParameters.cs

Assets/Scripts/YAMLclasses.cs

Assets/Scripts/TrainingAgent.cs

Assets/Scripts/TrainingArena.cs

alhasacademy96 · 2024-04-10T18:07:46Z

all in all a solid approach. But i think a few changes are to be made and some code to be reverted for effectiveness.

getArenaCount duplicated GetTotalArenas. GetConfiguration previously returned a bool and there was some special handling for the case it couldn't find a configuration. We've removed the special handling so removing the bool return to make using it a bit less clunky.

benaslater · 2024-04-11T09:15:43Z

Sorry I'm a bit of a novice on github PRs - I've pushed a new commit to the feature branch with the changes discussed in the comments. Do I close this PR and open a new one on that?

ref: 5050eb3

alhasacademy96 · 2024-04-11T11:48:53Z

Sorry I'm a bit of a novice on github PRs - I've pushed a new commit to the feature branch with the changes discussed in the comments. Do I close this PR and open a new one on that?

ref: 5050eb3

No need. simply see the commits you made after the comments are reflected on this PR. You can also see the "OUTDATED" marker for the scripts you made changes to which are now overwritten by the newer commits you made recently.

Edit: WHEN you think you're ready for a re-PR, simply click the recycling button next to my name at the top right so I know you're ready :). This button simply asks for a review again.

alhasacademy96 · 2024-04-11T11:51:18Z

however, I do think that the important change is still not made, specifically keeping the modularity of TrainingArena.cs, where because its such an important script, I think the best way to go at it is to keep the modular structure I set for my previous PR. Your methods and modifications will still be valid.

It's important to keep the mainResetArena()method as readable and organized as possible, whilst incorporating your additions.

@benaslater

notion-workspace · 2024-04-11T16:16:26Z

Multi-Arena Episodes

notion-workspace · 2024-04-11T16:16:27Z

Multi-Arena Episodes

benaslater · 2024-04-12T08:20:35Z

I think the best way to go at it is to keep the modular structure I set for my previous PR.

I don't think I agree. Moving a big block of code into a function with a clear interface can be useful for code clarity, but I view breaking that code into several functions that must be called in sequence as a bit of an antipattern where we need to do it multiple times.

i.e. if I wanted to achieve A, and that involved doing A1, then A2, then A3. I could do it by defining A (where A1, A2 and A3 are clearly commented in the definition of A), or I could do it by defining A1, A2 and A3 separately. The issue is if I do it in the latter way now every time I want to do A it takes me 3 function calls instead of 1, and I don't seem to get much benefit over clearly marking the distinct stages of A in the body of my A function with comments. The case where it might be attractive to do it anyway is if A1, A2, A3 are useful elsewhere, but even then I would be tempted to define A() = A1(); A2(); A3();

In this case A is UpdateActiveArenaToCurrentArenaID: given that the current arena ID has been updated, we want to do all the steps needed to achieve that. We need to do that in ResetArena() and LoadNextArena().

alhasacademy96 · 2024-04-12T10:47:26Z

Hey Ben, here are some points I'd like you to consider in general and not in necessarily specific for your PR only.

-Maintainability + readability: Breaking down a large function like UpdateActiveArenaToCurrentArenaID (renamed) into smaller, logically separated functions (A1, A2, A3) can significantly enhance readability and maintainability. Each function can be focused on a single responsibility, making the code easier to understand at a glance. When functions are smaller and well-named, new developers or others unfamiliar with the code can more easily understand the workflow and purpose of the code. We need to think about future developers and collaborators in this regard for AAI.

-Reusability: Although in your scenario Ben A1, A2, and A3 might primarily support the process A, encapsulating these steps into separate functions can unexpectedly benefit future developments. For instance, if a new feature or requirement emerges that only needs the functionality of A2, it can be reused without duplicating code or introducing unnecessary dependencies.

-Tests: Smaller, well-defined functions are easier to test (recall what i mentioned earlier). When functions are broken down into smaller parts, it's easier to write tests for each part, ensuring that they work correctly in isolation. This can lead to a more robust and reliable codebase, as each part can be individually verifiied..

-Error Handling and debugging: It can be easier to handle errors and debug smaller functions. When an error occurs, if the code is split into smaller parts, it's often quicker to isolate the problem. Also, you can handle errors specific to the part of the process each function represents, which might be more difficult in a larger, monolithic functio

-Flexibility in refactoring/extending: Future changes to the process can be more manageable when the code is modular. For instance, if you later need to add a new step A4 between A2 and A3, doing so is straightforward without disturbing the other parts of the code. i think this can be more complex and error-prone in a single large function where changes affect the entire function's flow (which was the case before my PR).

-Design principles (important for any SD): breaking down functions follows well-established software design principles like SRP SoC. These principles advocate for structuring software so that each element of the code has a single responsibility and is independent of the other parts, which enhances modularity and scalability, which you already know I assume :) but a reminder.

So the point im trtying to make is that it's imperative that we understand these points and implement them where we can. Again, these points do not only apply to your PR, but in general for the codebase. I would have to kindly insist on this matter Ben.

benaslater · 2024-04-15T09:14:37Z

Hey Ibrahim, thank you for these general points. I don't agree that factoring into functions in this way is an improvement from a software engineering perspective, but it's not likely to have a big impact either way so I've pushed a change returning the functions.

alhasacademy96 · 2024-04-15T10:00:22Z

Thanks you can merge when you're ready

alhasacademy96

all good to go!

benaslater requested a review from alhasacademy96 as a code owner April 10, 2024 09:44

alhasacademy96 self-assigned this Apr 10, 2024

alhasacademy96 requested changes Apr 10, 2024

View reviewed changes

Assets/Scripts/AAI3EnvironmentManager.cs Outdated Show resolved Hide resolved

alhasacademy96 requested changes Apr 10, 2024

View reviewed changes

Factor UpdateActiveArenaToCurrentArenaID into smaller functions

6e170ca

alhasacademy96 approved these changes Apr 15, 2024

View reviewed changes

benaslater merged commit bca5134 into main Apr 16, 2024

benaslater deleted the multiArenaEpisodes branch April 16, 2024 07:16

alhasacademy96 mentioned this pull request May 6, 2024

AnimalAI Multi Arena Episodes Kinds-of-Intelligence-CFI/animal-ai#49

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add multi-arena episodes #35

Add multi-arena episodes #35

benaslater commented Apr 10, 2024

alhasacademy96 left a comment

alhasacademy96 left a comment

alhasacademy96 Apr 10, 2024

benaslater Apr 11, 2024

alhasacademy96 Apr 11, 2024

alhasacademy96 commented Apr 10, 2024

benaslater commented Apr 11, 2024

alhasacademy96 commented Apr 11, 2024 •

edited

Loading

alhasacademy96 commented Apr 11, 2024 •

edited

Loading

notion-workspace bot commented Apr 11, 2024

notion-workspace bot commented Apr 11, 2024

benaslater commented Apr 12, 2024

alhasacademy96 commented Apr 12, 2024

benaslater commented Apr 15, 2024

alhasacademy96 commented Apr 15, 2024

alhasacademy96 left a comment

Add multi-arena episodes #35

Add multi-arena episodes #35

Conversation

benaslater commented Apr 10, 2024

Proposed change(s)

Additional changes:

Useful links (Github issues, ML-Agents forum threads etc.)

Types of change(s)

Checklist

Testing

alhasacademy96 left a comment

Choose a reason for hiding this comment

alhasacademy96 left a comment

Choose a reason for hiding this comment

alhasacademy96 Apr 10, 2024

Choose a reason for hiding this comment

benaslater Apr 11, 2024

Choose a reason for hiding this comment

alhasacademy96 Apr 11, 2024

Choose a reason for hiding this comment

alhasacademy96 commented Apr 10, 2024

benaslater commented Apr 11, 2024

alhasacademy96 commented Apr 11, 2024 • edited Loading

alhasacademy96 commented Apr 11, 2024 • edited Loading

notion-workspace bot commented Apr 11, 2024

notion-workspace bot commented Apr 11, 2024

benaslater commented Apr 12, 2024

alhasacademy96 commented Apr 12, 2024

benaslater commented Apr 15, 2024

alhasacademy96 commented Apr 15, 2024

alhasacademy96 left a comment

Choose a reason for hiding this comment

alhasacademy96 commented Apr 11, 2024 •

edited

Loading

alhasacademy96 commented Apr 11, 2024 •

edited

Loading