-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add multi-arena episodes #35
Merged
Merged
Changes from all commits
Commits
Show all changes
3 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -47,7 +47,7 @@ public class TrainingAgent : Agent, IPrefab | |
private float _freezeDelay = 0f; | ||
private bool _isFrozen = false; | ||
|
||
private bool _nextUpdateEpisodeEnd = false; | ||
private bool _nextUpdateCompleteArena = false; | ||
alhasacademy96 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
[Header("Agent Notification")] | ||
public bool showNotification = false; | ||
|
@@ -214,20 +214,20 @@ public override void Heuristic(in ActionBuffers actionsOut) | |
|
||
#region Agent Health Methods | ||
|
||
public void UpdateHealthNextStep(float updateAmount, bool andEndEpisode = false) | ||
public void UpdateHealthNextStep(float updateAmount, bool andCompleteArena = false) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See main comment above for the rest of the changes in this method. |
||
{ | ||
/// <summary> | ||
/// ML-Agents doesn't guarantee behaviour if an episode ends outside of OnActionReceived | ||
/// Therefore we queue any health updates to happen on the next action step. | ||
/// </summary> | ||
_nextUpdateHealth += updateAmount; | ||
if (andEndEpisode) | ||
if (andCompleteArena) | ||
{ | ||
_nextUpdateEpisodeEnd = true; | ||
_nextUpdateCompleteArena = true; | ||
} | ||
} | ||
|
||
public void UpdateHealth(float updateAmount, bool andEndEpisode = false) | ||
public void UpdateHealth(float updateAmount, bool andCompleteArena = false) | ||
{ | ||
if (NotificationManager.Instance == null && showNotification == true) | ||
{ | ||
|
@@ -260,12 +260,19 @@ public void UpdateHealth(float updateAmount, bool andEndEpisode = false) | |
StartCoroutine(EndEpisodeAfterDelay()); | ||
return; | ||
} | ||
if (andEndEpisode || _nextUpdateEpisodeEnd) | ||
if (andCompleteArena || _nextUpdateCompleteArena) | ||
{ | ||
_nextUpdateCompleteArena = false; | ||
float cumulativeReward = this.GetCumulativeReward(); | ||
|
||
if (cumulativeReward >= Arena.CurrentPassMark) | ||
{ | ||
// If passed and the next arena is merged load that without ending the episode | ||
if (_arena.mergeNextArena) | ||
{ | ||
_arena.LoadNextArena(); | ||
return; | ||
} | ||
if (showNotification) | ||
{ | ||
NotificationManager.Instance.ShowSuccessNotification(); | ||
|
@@ -278,7 +285,6 @@ public void UpdateHealth(float updateAmount, bool andEndEpisode = false) | |
NotificationManager.Instance.ShowFailureNotification(); | ||
} | ||
} | ||
_nextUpdateEpisodeEnd = false; | ||
alhasacademy96 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
StartCoroutine(EndEpisodeAfterDelay()); | ||
} | ||
} | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having this boolean applied locally (per arena definition in the config file) would be inferior to having it applied globally, like the canChangePerspective boolean. It would make sense to set this boolean once (TRUE = ALL ARENAS ARE MERGED / FALSE = NO MERGEDARENAS) and only once at the top of the config file which would then apply to all arenas, rather than specifiying it for each arena.
However, if the point of this parameter is to have the flexibility of choosing exactly which and what arenas are to be merged, then this makes sense having it set locally.
Can you elaborate on the actual requirements in this regard?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the second of your points re flexibility is what's important for this change.
An example use case for this feature is to have a maze that the agent navigates several times during one episode. For example the first time might be to learn the maze structure and then the second time we've added a block, so that the most efficient route is no longer possible and the agent has to use its learning from the previous run to reroute. For this example if we had 3 mazes we wanted to test on, we would have 6 arenas we want to merge in pairs (0-1, 2-3, 4-5).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, perhaps an elaboration could be made on the actual and most probable use case for this feature, in the docs. Maybe it was just me but i did not understand why and how it would be used :)