forked from NathanKlineInstitute/SMARTAgent
dad_notes.txt
########################################
7-31-20
########################################
To-Do:
-Test prediction for longer, see when ball goes above bricks
-Test with v0, add frameskip # both will => compute faster * Change aigame.py to frameskip
# env = gym.make('Breakout-v0', frameskip=3)
* Updated time to 200,000. Ball still does not come back after first miss.
-Upload notes to github
-Update staymoves (momentum) loop
-Combine aigame.py & aigame2.py
-sim.py motor populations w/ help
-Haroon says to add to aigame.py which moves to use?
-Update computer vision for racket predictions
-Currently limited coordinates to exclude bricks, for simplicity in object detection
-Check notes on 7/27
Tutorial for pull request
1. git add <filename>
2. git commit -m "message"
3. git push
4. Go to github site to submit pull request
########################################
7-30-20
########################################
Worked with Lakshay, walked him through the code
np.median(a=, axis=) # a is array, axis is along which you calculate
np.amax # total max over the whole array
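A quick worked check of the two calls above (array values made up):

```python
import numpy as np

a = np.array([[1, 5, 9],
              [2, 6, 7]])
col_med = np.median(a, axis=0)   # median down each column -> [1.5, 5.5, 8.0]
row_med = np.median(a, axis=1)   # median across each row  -> [5.0, 6.0]
total_max = np.amax(a)           # single max over the whole array -> 9
print(col_med, row_med, total_max)
```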
See yesterday for notes on continued testRacketPredictions
########################################
7-29-20
########################################
To-Do:
-Finish testRacketPredictions
*DONE
-Test prediction for longer, see when ball goes above bricks
-Add frameskip
-Upload notes to github
-Update staymoves (momentum) loop
-Combine aigame.py & aigame2.py
-sim.py motor populations w/ help
-Haroon says to add to aigame.py which moves to use?
-Update computer vision for racket predictions
-Currently limited coordinates to exclude bricks, for simplicity in object detection
-Check notes on 7/27
testRacketPrediction:
## Updated the court coordinates
File "testRacketPrediction.py", line 83, in predictBallRacketXIntercept
NB_intercept_steps = np.ceil((CourtHeight - ypos2)/deltay)
NameError: name 'CourtHeight' is not defined
## Not sure why, changed variable to hard-coded number (against my preference)
* Fixed that problem
* Not getting type 2 error, that's good
Traceback (most recent call last):
File "testRacketPrediction.py", line 124, in <module>
xpos_Ball2, ypos_Ball2 = findobj (observation, courtXRng, courtYRng)
File "testRacketPrediction.py", line 34, in findobj
ypos = np.median(Obj_inds,0)[0]
IndexError: invalid index to scalar variable.
## New error, unsure what it's problem is
# Happens when you try to index a numpy scalar variable (google)
Haroon says to change this:
if sIC.shape[0]*sIC.shape[1]==np.shape(Obj_inds)[0] or len(Obj_inds)==0: # if number of elements is equal, no sig obj is found
* NOW WORKS
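For the record, the "invalid index to scalar variable" comes from np.median collapsing to a bare NaN scalar when Obj_inds is empty, which can't be indexed with [0]. A hypothetical guard in the spirit of Haroon's fix (function name made up):

```python
import numpy as np

def obj_center(Obj_inds):
  """Median (y, x) of object pixel indices; (None, None) if nothing was found.
  Guarding len(Obj_inds)==0 avoids np.median returning a bare scalar,
  which raises 'IndexError: invalid index to scalar variable' when indexed."""
  Obj_inds = np.asarray(Obj_inds)
  if len(Obj_inds) == 0:
    return None, None
  med = np.median(Obj_inds, axis=0)
  return med[0], med[1]

print(obj_center([[10, 4], [12, 6]]))  # (11.0, 5.0)
print(obj_center([]))                  # (None, None)
```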
########################################
7-27-20
########################################
To-Do:
-Add frameskip
-Upload notes to github
-Update staymoves (momentum) loop
-Combine aigame.py & aigame2.py
-sim.py motor populations w/ help
-Haroon says to add to aigame.py which moves to use?
### testRacketPredictions ###
Adapted code from aigame.py
Paddle only goes to one side, will need to explore.
Only goes right
Not based on targetX
Switched 2 with another input, paddle still went right
Sam & Haroon to rescue
Adding print text to track the code
Gets stuck at predX = -1, so defaults to np.random.randint(2,3)
Is 3 included as the top limit?
No, should be randint(2,4)
Why is it stuck?
Error 2: "if deltay<=0: predX = -1"
ypos2 <= ypos1
ypos is kept 78.0, never updated
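The randint bound question above, checked: numpy's high bound is exclusive, so randint(2,3) can only ever return 2.

```python
import numpy as np

low_only = {int(np.random.randint(2, 3)) for _ in range(100)}
print(low_only)       # {2} -- the high bound is exclusive, 3 is never drawn
both = {int(np.random.randint(2, 4)) for _ in range(200)}
print(sorted(both))   # randint(2, 4) draws from {2, 3}
```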
New task:
Store objects found and iterate through to find ball based on size (xdim, ydim) and possibly color
find all color other than black
Check opencv (open source library for computer vision)
check getObjectsBoundingBoxes in imgutils.py (for object detection)
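Without OpenCV, the "store objects, pick the ball by (xdim, ydim)" idea can be sketched with a small flood fill over non-black pixels (toy frame, names made up; the real detection lives in imgutils.py):

```python
import numpy as np
from collections import deque

def find_blobs(frame):
  """Group non-black pixels into 4-connected blobs.
  Returns a list of (xdim, ydim, (ymed, xmed)) per blob."""
  seen = np.zeros(frame.shape, dtype=bool)
  blobs = []
  for y, x in np.argwhere(frame > 0):
    if seen[y, x]:
      continue
    q, pix = deque([(y, x)]), []
    seen[y, x] = True
    while q:
      cy, cx = q.popleft()
      pix.append((cy, cx))
      for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
        if 0 <= ny < frame.shape[0] and 0 <= nx < frame.shape[1] \
            and frame[ny, nx] > 0 and not seen[ny, nx]:
          seen[ny, nx] = True
          q.append((ny, nx))
    ys = [p[0] for p in pix]; xs = [p[1] for p in pix]
    blobs.append((max(xs) - min(xs) + 1, max(ys) - min(ys) + 1,
                  (float(np.median(ys)), float(np.median(xs)))))
  return blobs

frame = np.zeros((10, 10))
frame[2:4, 3:5] = 200   # 2x2 "ball"
frame[8, 1:8] = 150     # 1x7 "paddle"
print(find_blobs(frame))
```

Picking the blob whose size matches the ball (and filtering by color before the fill) is then a plain list comprehension.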
########################################
7-24-20
########################################
To-Do:
-Add frameskip
-Upload notes to github
-Test testRacketPredictions.py
*DONE
-Combine aigame.py & aigame2.py
-sim.py motor populations w/ help
-Haroon says to add to aigame.py which moves to use?
### testRacketPredictions ###
Adapted code from aigame.py
Paddle only goes to one side, will need to explore.
########################################
7-23-20
########################################
To-Do:
-Add frameskip
-Upload notes to github
-Fix momentum
-Will try 4 based on prior testing
* DONE
-Test testRacketPredictions.py
*STARTED
-Combine aigame.py & aigame2.py
-sim.py motor populations w/ help
-Haroon says to add to aigame.py which moves to use?
testRacketPredictions is Pong-specific,
making changes
########################################
7-22-20
########################################
Watched workshops
Organization of Computational Neuroscience (CNS)
########################################
07-21-20
########################################
To-Do:
-Haroon says add to aigame.py which moves to use?
-Sam wants to combine aigame.py & aigame2.py
-Test testRacketPrediction.py to see things work properly
-Fix momentum
-Upload to github
-Why won't picture update?
*FIXED, removed frameskip from sim.json since env is v4
Talk with sam and haroon:
frameskip in sim.json unnecessary for v4
frameskip reduces processing for the model; useful if the game runs slowly
Adapt sim.py for motor populations
Sam and Haroon will help
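Roughly what frameskip buys: the classic '-v0' Atari envs sample a random frameskip of 2-4 internally, 'NoFrameskip-v4' applies none. With a NoFrameskip env the same effect can be emulated by repeating the action and summing the reward; a sketch with a stand-in for env.step (names assumed):

```python
def step_with_frameskip(env_step, action, k=3):
  """Repeat `action` k times, summing reward (env_step stands in for env.step)."""
  total = 0.0
  for _ in range(k):
    obs, reward, done, info = env_step(action)
    total += reward
    if done:
      break
  return obs, total, done, info

# toy stand-in: each step yields reward 1, never done
fake_step = lambda a: (None, 1.0, False, {})
obs, r, done, info = step_with_frameskip(fake_step, action=1, k=3)
print(r)  # 3.0
```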
########################################
07-20-20
########################################
To-Do:
-Haroon says add to aigame.py which moves to use?
-Sam wants to combine aigame.py & aigame2.py
-Update racketXRng/racketYRng variables
*DONE, no more mention of racketXRng
-Test testRacketPrediction.py to see things work properly
Possible Error:
#What does this do?
Additional calculations allow for an accurate X-axis coordinate (since only the area between paddles is usually found)
#self.FullImages.append(np.sum(self.last_obs[courtYRng[0]:courtYRng[1],:,:],2))
self.dObjPos['ball'].append([courtXRng[0]-1+xpos_Ball,ypos_Ball])
self.dObjPos['racket'].append([racketXRng[0]-1+xpos_Racket,ypos_Racket])
* Switched additional calculations to Y coordinates, since game switches to vertical instead of horizontal. Hope that works.
self.dObjPos['ball'].append([xpos_Ball,courtYRng[0]-1+ypos_Ball])
self.dObjPos['racket'].append([xpos_Racket,racketYRng[0]-1+ypos_Racket])
# Momentum
# I dont have the updated loop, and unsure on # of stay steps needed
Changed sim.json, ran test simulation
Updated: moves, movecodes, env.
$ from aigame2 import AIGame; AIGame = AIGame()
bash: syntax error near unexpected token `(' # Forgot to be in python
$ from aigame2 import AIGame; AIGame = AIGame()
$ rewards, epCount, proposed_actions, total_hits = AIGame.playGame(actions=[1], epCount = 0)
File "/home/davidd/workspace/SMARTAgent/aigame2.py", line 268, in playGame
self.dObjPos['racket'].append([racketXRng[0]-1+xpos_Racket,ypos_Racket])
NameError: name 'racketXRng' is not defined
$ from aigame2 import AIGame; AIGame = AIGame()
$ rewards, epCount, proposed_actions, total_hits = AIGame.playGame(actions=[1], epCount = 0)
$ rewards, epCount, proposed_actions, total_hits = AIGame.playGame(actions=[3], epCount = 0)
$ rewards, epCount, proposed_actions, total_hits = AIGame.playGame(actions=[2], epCount = 0)
$ rewards, epCount, proposed_actions, total_hits = AIGame.playGame(actions=[3], epCount = 0)
# No updated image of board? Paddle not moving and ball not spawning
########################################
07-17-20
########################################
To-Do:
-Haroon says add to aigame.py which moves to use
-Update racketXRng/racketYRng variables
*Started
-Find ball after action
*Now uses y pos
*DONE
-Test testRacketPrediction.py to see things work properly
Possible Error:
#What does this do?
Additional calculations allow for an accurate X-axis coordinate (since only the area between paddles is usually found)
#self.FullImages.append(np.sum(self.last_obs[courtYRng[0]:courtYRng[1],:,:],2))
self.dObjPos['ball'].append([courtXRng[0]-1+xpos_Ball,ypos_Ball])
self.dObjPos['racket'].append([racketXRng[0]-1+xpos_Racket,ypos_Racket])
# Momentum
# I dont have the updated loop, and unsure on # of stay steps needed
For ball_hits_racket, what is the last condition?
if current_ball_dir-self.last_ball_dir<0 and reward==0 and xpos_Ball2>courtXRng[1]-courtXRng[0]-40:
# Ensures the ball is on the right side of court (pong) aka side of model. Used 20 for breakout. Can change later.
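For Breakout the same check would read on the y axis (racket at the bottom): a hit is a direction reversal with no brick reward near the racket. A hypothetical restatement of the condition (the margin idea is from the note; everything else is assumed, not the repo's code):

```python
def ball_hits_racket(ball_dir, last_ball_dir, reward, ypos_ball,
                     court_height, margin=20):
  """Direction flipped, no brick reward, and the ball is within `margin`
  pixels of the racket end of the court (bottom, for Breakout)."""
  return (ball_dir - last_ball_dir < 0
          and reward == 0
          and ypos_ball > court_height - margin)

print(ball_hits_racket(-1, 1, 0, 150, 160))  # True: bounce near the paddle
print(ball_hits_racket(-1, 1, 1, 150, 160))  # False: reward means a brick, not the racket
```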
########################################
07-16-20
########################################
To-Do:
-Haroon says add to aigame.py which moves to use
-Update racketXRng/racketYRng variables
*Started
-Look into proposed actions, action input
*Done
-Update findObj inputs and function
*Done
*Left function alone, just updated racket position function:
# xpos_Racket, ypos_Racket = self.findobj(self.last_obs, racketXRng, courtYRng) # get x,y positions of racket
# TO
# xpos_Racket, ypos_Racket = self.findobj(self.last_obs, courtXRng, racketYRng) # get x,y positions of racket
going to assume dconf['moves'] are 'LEFT'(3) 'RIGHT'(2) and 'NOMOVE'(1)
Possible Error:
#What does this do?
#self.FullImages.append(np.sum(self.last_obs[courtYRng[0]:courtYRng[1],:,:],2))
self.dObjPos['ball'].append([courtXRng[0]-1+xpos_Ball,ypos_Ball])
self.dObjPos['racket'].append([racketXRng[0]-1+xpos_Racket,ypos_Racket])
# Momentum
# I dont have the updated loop, and unsure on # of stay steps needed
########################################
07-15-20
########################################
To-Do:
-Update racketXRng/racketYRng variables
*Started
-Update predictBallRacketYIntercept
*DONE
-Look into proposed actions, action input
-Update findObj inputs and function
All variable ranges are in a tuple, can be called by index (starting at 0)
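e.g. with one of the court ranges from these notes:

```python
courtYRng = (31.5, 192.5)   # a range stored as a tuple
print(courtYRng[0])         # 31.5 -- index 0 is the start of the range
print(courtYRng[1])         # 192.5 -- index 1 is the end
```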
########################################
07-14-20
########################################
Looking through playGame for where to switch X,Y
########################################
07-13-20
########################################
pull request from master to dad_develop; merged
git pull
#need to stash sim.json before merge
#did not git pull yet.
git stash #Will probably clear, no cares about sim.json
git pull
git add dad_notes.txt; git commit -m "notes on breakout"; git push
To-Do:
-Update court size
*DONE
-look into turning racketXRng to racketYRng
*need to update all racket stuff
-ask about predictBallRacketYIntercept
*Seems like changes necessary, unsure what the hardcoded numbers represent (ie 160)
*Will need to be switched to XIntercept
*Asking Haroon
*DONE: 160 (height) and 120 (width) are court sizes.
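Given that answer (court 160 high, 120 wide), the Pong Y-intercept predictor maps to an X-intercept for Breakout roughly like this. A sketch, not the repo's code: the step formula mirrors the NB_intercept_steps line from testRacketPrediction, the wall-reflection logic is assumed.

```python
import numpy as np

def predict_ball_racket_x_intercept(xpos1, ypos1, xpos2, ypos2,
                                    court_width=120, court_height=160):
  """Project the ball from two observed positions down to the racket row.
  Returns -1 when the ball is not moving toward the racket (deltay <= 0)."""
  deltax, deltay = xpos2 - xpos1, ypos2 - ypos1
  if deltay <= 0:
    return -1                    # moving up or stalled: no intercept yet
  steps = np.ceil((court_height - ypos2) / deltay)
  predX = xpos2 + steps * deltax
  # reflect off the side walls until the prediction lands inside the court
  while predX < 0 or predX > court_width:
    predX = -predX if predX < 0 else 2 * court_width - predX
  return predX
```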
########################################
07-12-20
########################################
Ranges for pong:
self.courtYRng = (34, 194) # court y range
self.courtXRng = (20, 140) # court x range
self.racketXRng = (141, 144) # racket x range
# Will look into turning racketXRng to racketYRng
Ranges for breakout:
#Rounded to middle
courtYRng = (31.5, 192.5) #31.5 is guess, didn't see how high ball goes
# courtXRng = ( 7.5 , 143.5 + (136.5 - 120.5)) #Paddle goes off the view
courtXRng = (7.5, 159.5)
racketYRng = (188.5, 192.5)
# racketXRng = 136.5 - 120.5 = 16.0
########################################
07-10-20
########################################
How to map coordinates of breakout game?
Guess/Check or is there way to plot pixels?
import gym; env = gym.make('BreakoutNoFrameskip-v4'); env.reset()
## CHANGE caction to number 1, 3, 4 to move paddle, and repeat this command
# observation, reward, done, info = env.step(caction); env.render()
obs, reward, done, info = env.step(1); env.render()
import matplotlib.pyplot as plt
plt.imshow(obs); plt.show()
Unchanged/Left Alone:
-updateInputRates
-computeMotionFields
-computeAllObjectsMotionDirections
-updateDirSensitiveRates
-findobj
-predictBallRacketYIntercept
*Seems like changes necessary, unsure what the hardcoded numbers represent (ie 160)
########################################
07-08-20
########################################
Continuing to go through aigame2.py
Going to keep useSimulatedEnv for simplicity and to allow the option in case it's desired.
########################################
07-07-20
########################################
To-Do:
-Make aigame.py for Breakout
Script: aigame2.py
########################################
07-06-20
########################################
To-Do:
-Work on reward
*DONE
-Test new code
*DONE, both manual and sim ran script fine
-Update team and commit changes
*DONE
-Look into Breakout
*DONE
Current system to update reward:
observation, reward, done, info = env.step(1);
Idea for new:
#interreward is intermediate reward
observation, interreward, done, info = env.step(1);
reward = reward + interreward
#With new logic, must start reward at 0 at start of every new loop
#First caction line (observation, reward, done, info = self.env.step(caction)) already initialized reward before if statement every time
Turning code into while loop, with done as the dependent variable
stay_step = 0
while not done and stay_step < 6:
#used or statement, caused issues. fix with and statement.
#testing *works
from aigame import AIGame; AIGame = AIGame()
rewards, epCount, proposed_actions, total_hits = AIGame.playGame(actions=[1], epCount = 0)
########################################
FINAL PRODUCT FOR MOMENTUM
IN aigame.py
# To eliminate momentum
# print('Here is caction: ' , caction)
if caction == 3 or caction == 4: # Follow up(4)/down(3) with stay(1)
  stay_step = 0 # initialize
  while not done and stay_step < 6:
    # Takes 6 stays instead of 3 because it seems every other input is ignored (check dad notes for details)
    observation, interreward, done, info = env.step(1) # Stay motion
    reward = reward + interreward # Uses summation so no reinforcement/punishment is missed
    stay_step += 1
    #print(stay_step)
  env.render() # Renders the game after the stay steps
########################################
working to commit these changes to github now.
# from samn
git add, then git commit. then push to branch and pull request.
Tutorial for pull request
1. git add <filename>
2. git commit -m "message"
3. git push
4. Go to github site to submit pull request
## List of breakout versions ##
EnvSpec(Breakout-v0)
EnvSpec(Breakout-v4)
EnvSpec(BreakoutDeterministic-v0)
EnvSpec(BreakoutDeterministic-v4)
EnvSpec(BreakoutNoFrameskip-v0)
EnvSpec(BreakoutNoFrameskip-v4)
EnvSpec(Breakout-ram-v0)
EnvSpec(Breakout-ram-v4)
EnvSpec(Breakout-ramDeterministic-v0)
EnvSpec(Breakout-ramDeterministic-v4)
EnvSpec(Breakout-ramNoFrameskip-v0)
EnvSpec(Breakout-ramNoFrameskip-v4)
import gym; env = gym.make('BreakoutNoFrameskip-v4'); env.reset()
##CHANGE caction to number 1, 3, 4 to move paddle, and repeat this command
observation, reward, done, info = env.step(caction); env.render()
# Same to pong, motion has momentum
ACTION_MEANING = {
0: "NOOP", # No
1: "FIRE", # No
2: "UP", # Right
3: "RIGHT", # Left
4: "LEFT", # IndexError
5: "DOWN", # IndexError
}
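The comments above say 2 moves right, 3 moves left, and 4/5 raise IndexError -- i.e. Breakout's minimal action space has only 4 actions (gym exposes the labels via env.unwrapped.get_action_meanings() for Atari envs). A toy guard reflecting that observation:

```python
# Breakout's minimal action set, as observed above
BREAKOUT_ACTIONS = {0: "NOOP", 1: "FIRE", 2: "RIGHT", 3: "LEFT"}

def check_action(a):
  """Label a Breakout action, mimicking the IndexError for out-of-range inputs."""
  if a not in BREAKOUT_ACTIONS:
    raise IndexError(f"action {a} outside Breakout's {len(BREAKOUT_ACTIONS)}-action space")
  return BREAKOUT_ACTIONS[a]

print(check_action(2))  # RIGHT
```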
# Motion checks
Input: 2 1 1 1 1 1 # Motion really small, lot of time to respond
Motion: N Y Y Y N N # Maybe ball speeds up??
Input: 2 1 1 1 1 1
Motion: N Y Y Y N N
Input: 3 1 1 1 1 1
Motion: N Y Y Y Y N
Input: 3 1 1 1 1 1
Motion: N Y Y Y N N
Input: 3 1 1 1 1 1
Motion: N Y Y Y N N
Input: 3 3 1 1 1 1 1 1 # Takes 4 null inputs instead of typical 3
Motion: N Y Y Y Y Y N N
Input: 3 3 1 1 1 1 1 1
Motion: N Y Y Y Y Y N N
Input: 2 2 1 1 1 1 1 1
Motion: N Y Y Y Y Y N N
Input: 2 2 1 1 1 1 1 1
Motion: N Y Y Y Y Y N N
Input: 2 2 2 1 1 1 1 1 1 # 4 null inputs
Motion: N Y Y Y Y Y Y N N
Input: 3 2 1 1 1 1
Motion: N L R R N N
# Board motion
# Starting from left side
How many 2 inputs to reach other side? 26
# Starting from right side
How many 3 inputs to reach other side? 26
# Right side of env goes a bit off the board
########################################
07-03-20
########################################
To-Do:
-Test code to see if we can completely eliminate momentum
*DONE
if caction == 3 or caction == 4:
  env.step(1);
  env.step(1);
  env.step(1);
  env.step(1); env.render()
from aigame import AIGame; AIGame = AIGame()
Input: 3 1 1 1 1 1
Motion: Y Y N N N N
if caction == 3 or caction == 4:
  env.step(1);
  env.step(1);
  env.step(1);
  env.step(1);
  env.step(1); env.render()
from aigame import AIGame; AIGame = AIGame()
Input: 3 1 1 1 1 1
Motion:Y N N N N N
Input: 3 1 1 1 1 1
Motion:Y Y N N N N
Input: 4 1 1 1 1 1
Motion:Y N N N N N
Input: 4 1 3 1 1 1
Motion:Y N Y Y N N
Input: 4 1 1 1 1 1
Motion:Y Y N N N N
Seems to randomly have both 1 step or 2 steps
##THIS WORKS BELOW vVvVv
if caction == 3 or caction == 4:
  env.step(1);
  env.step(1);
  env.step(1);
  env.step(1);
  env.step(1);
  env.step(1); env.render()
#If every even-index input is ignored, then 6 nulls are needed
Input: 4 1 1 1 1 1
Motion:Y N N N N N
Input: 3 1 1 1 1 1
Motion:Y N N N N N
Input: 4 3 1 1 1 1
Motion:Y Y N N N N
Input: 3 3 1 1 1 1
Motion:Y Y N N N N
Input: 3 4 1 1 1 1
Motion:Y Y N N N N
Input: 4 4 1 1 1 1
Motion:Y N N N N N
If goes up/down it goes 6 frames per input, versus 1 frame for stay
Time goes slower for null, look into middle strategy again
########################################
07-02-20
########################################
Trying new if statement
# Code in aigame.py:
if caction == 3 or caction == 4:
  env.step(1);
  env.step(1);
  env.step(1); env.render()
Test Inputs: 3 1 1 1
There is a small jump after the second null input
Without my if statement, there seems to be a binary loop of motion
input: 4 1 1 1 1 1 1
Only motion on odd index of input (motion on inputs 1 3 5 7)
Still needs 3 null inputs to stop motion
Talk with Haroon
Possible error: not sure if we can get reward(+/- 1) from env.step
Should we read at every step?
observation, reward, done, info = env.step(caction)
For observation, could work with only last step, maybe not for reward though
don't want to miss any reward/punishment!
########################################
07-01-20
########################################
To control environment with integrated aigame.py
# look at code in testCentroidTracking.py
from aigame import AIGame
AIGame = AIGame()
# to initialize the class and then
rewards, epCount, proposed_actions, total_hits, Racket_pos, Ball_pos = AIGame.playGame(actions=[3], epCount = 0)
What is epCount?
In some of Haroon's notes, he doesn't set epCount = 0, but it is set in testCentroidTracking.py
Need =0, if not:
SyntaxError: positional argument follows keyword argument
*epCount, or episode count, means how many times the game resets the environment (aka points scored i think)
Regardless, getting ValueError from input:
ValueError: not enough values to unpack (expected 6, got 4)
Which 2 values are not going through?
#from playGame in aigame.py
return rewards, epCount, proposed_actions, total_hits
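The ValueError is just an arity mismatch: this playGame branch returns 4 values while the call unpacks 6 (the Racket_pos/Ball_pos returns must come from a different branch). A minimal reproduction with a stub:

```python
def play_game_stub():
  # stand-in for this branch of aigame.playGame, which returns 4 values
  return 'rewards', 'epCount', 'proposed_actions', 'total_hits'

try:
  rewards, epCount, proposed_actions, total_hits, Racket_pos, Ball_pos = play_game_stub()
except ValueError as err:
  print(err)   # not enough values to unpack (expected 6, got 4)

rewards, epCount, proposed_actions, total_hits = play_game_stub()  # arity matches
```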
"moves": {"UP": 4,"DOWN":3, "NOMOVE":1}
# Code in aigame.py:
if caction == 3 or caction == 4:
  dconf['moves']['NOMOVE']
  dconf['moves']['NOMOVE']
  dconf['moves']['NOMOVE']
########################################
06-30-20
########################################
Cloned repo onto new server
Need to add key to github so can clone onto server
1. ssh-keygen
2. cat ~/.ssh/id_rsa.pub
3. https://github.com/settings/keys
if statement for no move after move
if caction == 3 or 4: # Bug: parses as (caction == 3) or 4, always truthy; needs caction == 3 or caction == 4
  dconf['moves']['NOMOVE']
elif caction == 1:
  nothing
Can't get controlled environment right
########################################
06-29-20
########################################
To-Do:
-Work on aigame.py (see plan below)
####Plan####
1. Understand variables and functions in aigame.py
# Where is the playGame func called?
# variable actionsPerPlay(1) & tstepPerAction (20)
*DONE*
1a. Comment on every line of code to explain it
*Not done*
2. Make a list of potential things affected by using multiple steps for one calling of playGame
Reward system
ball_hits_racket
3. Implement
*TRYING*
Call with Haroon
Don't use actionsPerPlay, would change the algorithm too much
Just include implicit stay action
Paramount that motion is nullified before next motion
Find where done is being used
Find where reward is being used, might need to sum it up
Reward system should be integrated
Computation for ball_hits_racket possible error
########################################
06-26-20
########################################
New account on server ([email protected])
Updated password ($ passwd)
########################################
06-25-20
########################################
To-Do:
-Work on aigame.py (see plan below)
*ALMOST DONE WITH STEP 1*
-Understand Question 3/Motion of paddle
- How is value of decisions changing with dynamics?
*DONE*
####Plan####
1. Understand variables and functions in aigame.py
# Where is the playGame func called?
# variable actionsPerPlay(1) & tstepPerAction (20)
1a. Comment on every line of code to explain it
2. Make a list of potential things affected by using multiple steps for one calling of playGame
3. Implement
Haven't finished reading entire code, but looking into actionsPerPlay, aka intaction
########################################
06-24-20
########################################
To-Do:
-Commit on github
*DONE (took me a bit :p )
-Work on aigame.py (see plan below)
-Understand Question 3/Motion of paddle
- How is value of decisions changing with dynamics?
####Plan####
1. Understand variables and functions in aigame.py
1a. Comment on every line of code to explain it
*STARTED
2. Make a list of potential things affected by using multiple steps for one calling of playGame
3. Implement
########################################
06-23-20
########################################
To-Do:
-Talk to Haroon about my next project
*DONE
-If momentum problem solved, what should I do next?
*Work on momentum
-Commit plotallrecurrentmaps today!
-Work on aigame.py (see plan below)
-Understand Question 3/Motion of paddle
- How is value of decisions changing with dynamics?
Questions:
1. What's output encoding?
*Yes, specifically how output commands are encoded in firing rates of different pops of neurons
2. Source of bias? - Is this referring to the paddle bias of up over down?
*Refers to all bias: momentum, paddle bias, etc.
3. Does "hold" pop equate to EMDown and EMUp, where its action is 0 (no motion)?
3a. Can't the network already do 0 as an action?
*See below
##Notes from Haroon##
Still work on momentum, his sim env (which solves momentum) is just for testing
we can set ‘useSimulatedEnv’ to 0 and use the actual game (aigame.py / func playGame)
System reads the next state/position of object/ associated reward/ associated reward after every action input (up/down/null), we should try to read after every 2 inputs (up/down/null + null) (sim.py / func playGame)
####Plan####
1. Understand variables and functions in aigame.py
1a. Comment on every line of code to explain it
2. Make a list of potential things affected by using multiple steps for one calling of playGame
3. Implement
##Haroon ideas on Question 3##
right now we are comparing firing rates of EMDown population and EMUp population
#Haroon
when EMUP firing rate is equal to EMDown firing rate we say ‘DONT MOVE’
when EMUP firing rate is higher than EMDown firing rate, we say ‘MOVE UP’
and when EMDown firing rate is higher than EMUP firing rate, we say ‘MOVE DOWN’
now we have had debate many times where i believe this setup is not sustainable and will fail by design
because the firing rates are changing dynamically
and value of decisions is evolving with the dynamics of networks
this is a scary concept
problem is i have not yet come up with an alternative idea to test
and Sam still believes that this will work
ideally we want to try different actions for all possible pairs of positions and directions
and only make the connection strong when an action for ball and racket at a particular location moving in a particular direction produce a reward
theoretically simple to describe….. very hard to imagine how to implement it in biological network
right now we are trying one action for a pair of configuration and then change the network
so next time we take another action that is associable to previous network state
so although things are evolving and learning….. next time the model faces exactly same situation and it already forgot what it was supposed to do
#Me
So why has the system already forgotten what it was supposed to do?
Just not enough reward/reinforcement?
#Haroon
no….. next time the same synapse strengthened for another condition
so multiple conditions are encoded in single synapses or overlapping synapses
#Me
And 'conditions' is object position and direction?
#Haroon
and reward
#Me
So if reward is always changing based on the prior gameplay, is the network still recognizing situations its already been in?
#Haroon
thats what it is supposed to do
not sure if it is doing so
because thats the only way the network will choose correct action
its like if the car is coming in front of you, and if you dont move aside it will hit you
so to make that decision you need to recognize your location, the front car's location
its direction of motion
and someone should have told you that its going to hit you if its coming from front
only then you will move on side
all these 4 things have to be associated
position+direction+action+reward/punishment
########################################
06-22-20
########################################
To-Do:
-Start creating own function
*DONE
-Test functions
*DONE
-Notes on momentum
*Check excel sheet for prior experiments
####Notes from sam & haroon on momentum####
#difficult to get any kind of stability with output encoding used
#What's output encoding?
#Haroon made controlled environment to eliminate the momentum
#Source of bias? - Is this the paddle bias of up over down?
#They are looking into adding a hold pop. Is this a pop that would cause 0 motion? I thought the system was already capable of this.
#Sam had idea to inhibit opposite motion populations if there is a lot of activity from one particular pop.
########################################
FINAL PRODUCT FOR plotallrecurrentmaps
def plotallrecurrentmaps (pdf, t, dnumc, dstartidx, dendidx, lnety = ['EV1DNW', 'EV1DN', 'EV1DNE', 'EV1DW', 'EV1','EV1DE','EV1DSW', 'EV1DS', 'EV1DSE'], asweight=False, cmap='jet',dmap=None):
  if dmap is None:
    drfmap = getallrecurrentmaps(pdf, t, dnumc, dstartidx, dendidx, lnety, asweight=asweight)
  else:
    drfmap = dmap
  vmin,vmax = 1e9,-1e9
  for nety in lnety:
    vmin = min(vmin, np.amin(drfmap[nety]))
    vmax = max(vmax, np.amax(drfmap[nety]))
  for tdx,nety in enumerate(lnety):
    postid = dstartidx[nety] + 0
    subplot(3,3,tdx+1)
    imshow(drfmap[nety],cmap=cmap,origin='upper',vmin=vmin,vmax=vmax);
    title(nety+'->'+nety+str(postid));
    colorbar()
  return drfmap
def getallrecurrentmaps (pdf, t, dnumc, dstartidx, dendidx, lnety = ['EV1DNW', 'EV1DN', 'EV1DNE', 'EV1DW', 'EV1','EV1DE','EV1DSW', 'EV1DS', 'EV1DSE'], asweight=False):
  # gets all recurrent maps in lnety
  return {nety:getrecurrentmap(pdf, t, nety, dnumc, dstartidx, dendidx, asweight=asweight) for nety in lnety}
def getrecurrentmap(pdf, t, nety, dnumc, dstartidx, dendidx, asweight=False):
  postid = dstartidx[nety] + 0
  nrow = ncol = int(np.sqrt(dnumc[nety]))
  rfmap = np.zeros((nrow,ncol))
  pdfs = pdf[(pdf.postid==postid) & (pdf.preid>dstartidx[nety]) & (pdf.preid<=dendidx[nety]) & (pdf.time==t)]
  if len(pdfs) < 1: return rfmap
  if not asweight:
    for idx in pdfs.index:
      preid = int(pdfs.at[idx,'preid'])
      x,y = gid2pos(dnumc[nety], dstartidx[nety], preid)
      rfmap[y,x] += 1
  else:
    rfcnt = np.zeros((nrow,ncol))
    for idx in pdfs.index:
      preid = int(pdfs.at[idx,'preid'])
      x,y = gid2pos(dnumc[nety], dstartidx[nety], preid)
      rfcnt[y,x] += 1
      rfmap[y,x] += pdfs.at[idx,'weight']
    for y in range(nrow):
      for x in range(ncol):
        if rfcnt[y,x]>0: rfmap[y,x]/=rfcnt[y,x] #rfmap integrates weight, take the average
  return rfmap
########################################
To update git repository, do "git pull"
######Testing Function######
drfmap = plotallrecurrentmaps(pdf, pdf.time[0], #Just realized postid needs to be automated
DONE, postid changes made
drfmap = plotallrecurrentmaps(pdf, pdf.time[0], dnumc, dstartidx, dendidx, asweight = False)
def plotallrecurrentmaps (pdf, t, dnumc, dstartidx, dendidx, lnety = ['EV1DNW', 'EV1DN', 'EV1DNE', 'EV1DW', 'EV1','EV1DE','EV1DSW', 'EV1DS', 'EV1DSE'], asweight=False, cmap='jet',dmap=None):
##Input to old function
drfmap = plotallinputmaps(pdf, pdf.time[0], dstartidx['EMUP'] + 0, 'EMUP', dnumc, dstartidx, dendidx, asweight=True)
####Inputs####
Function Inputs - Calling Input
pdf - pdf
t - pdf.time[0]
postid - dstartidx['EMUP'] + 0
poty - 'EMUP'
dnumc - dnumc
dstartidx - dstartidx
dendidx - dendidx
lprety - lprety
def plotallinputmaps (pdf, t, postid, poty, dnumc, dstartidx, dendidx, lprety=['EV1DNW', 'EV1DN', 'EV1DNE', 'EV1DW', 'EV1','EV1DE','EV1DSW', 'EV1DS', 'EV1DSE'], asweight=False, cmap='jet',dmap=None):
######Creating Function######
def plotallrecurrentmaps (pdf, t, dnumc, dstartidx, dendidx, lnety = ['EV1DNW', 'EV1DN', 'EV1DNE', 'EV1DW', 'EV1','EV1DE','EV1DSW', 'EV1DS', 'EV1DSE'], asweight=False, cmap='jet',dmap=None):
if dmap is None:
drfmap = getallrecurrentmaps(pdf, t, dnumc, dstartidx, dendidx, lnety, asweight=asweight)
else:
drfmap = dmap
vmin,vmax = 1e9,-1e9
for nety in lnety:
vmin = min(vmin, np.amin(drfmap[nety]))
vmax = max(vmax, np.amax(drfmap[nety]))
for tdx,nety in enumerate(lnety):
postid = dstartidx[nety] + 0
subplot(3,3,tdx+1)
imshow(drfmap[nety],cmap=cmap,origin='upper',vmin=vmin,vmax=vmax);
title(nety+'->'+nety+str(postid));
colorbar()
return drfmap
####Changes####
#-Removed poty from input variables, prety is nety (and lprety is lnety)
#-Automated postid for each nety
def getallrecurrentmaps (pdf, t, dnumc, dstartidx, dendidx, lnety = ['EV1DNW', 'EV1DN', 'EV1DNE', 'EV1DW', 'EV1','EV1DE','EV1DSW', 'EV1DS', 'EV1DSE'], asweight=False):
# gets all recurrent maps in lnety
return {nety:getrecurrentmap(pdf, t, nety, dnumc, dstartidx, dendidx, asweight=asweight) for nety in lnety}
####Changes####
#-Removed poty from getrecurrentmap input variables, prety is nety (and lprety is lnety)
def getrecurrentmap(pdf, t, nety, dnumc, dstartidx, dendidx, asweight=False):
postid = dstartidx[nety] + 0
nrow = ncol = int(np.sqrt(dnumc[nety]))
rfmap = np.zeros((nrow,ncol))
pdfs = pdf[(pdf.postid==postid) & (pdf.preid>dstartidx[nety]) & (pdf.preid<=dendidx[nety]) & (pdf.time==t)]
if len(pdfs) < 1: return rfmap
if not asweight:
for idx in pdfs.index:
preid = int(pdfs.at[idx,'preid'])
x,y = gid2pos(dnumc[nety], dstartidx[nety], preid)
rfmap[y,x] += 1
else:
rfcnt = np.zeros((nrow,ncol))
for idx in pdfs.index:
preid = int(pdfs.at[idx,'preid'])
x,y = gid2pos(dnumc[nety], dstartidx[nety], preid)
rfcnt[y,x] += 1
rfmap[y,x] += pdfs.at[idx,'weight']
for y in range(nrow):
for x in range(ncol):
if rfcnt[y,x]>0: rfmap[y,x]/=rfcnt[y,x] #rfmap integrates weight, take the average
return rfmap
####Questions####
#-preid takes values within a range; should I worry about the case postid == preid? (For the recurrent map, postid = dstartidx[nety] + 0 and the filter requires preid > dstartidx[nety], so that self-connection is already excluded.)
#-There's no check that postid falls within the range dstartidx[nety] to dendidx[nety]
#-Instead of postid being passed in (e.g. dstartidx['EMUP'] + 0), could automate this to just read in poty, but that would remove user accessibility and options
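A minimal sketch of the missing range check from the second question (check_postid_in_range is a hypothetical helper name, not in the repo; assumes dstartidx/dendidx hold inclusive per-type id bounds):

```python
# Hypothetical helper: guard that postid belongs to the given cell type
# before building a map (assumes dstartidx/dendidx are inclusive bounds).
def check_postid_in_range(postid, nety, dstartidx, dendidx):
    if not (dstartidx[nety] <= postid <= dendidx[nety]):
        raise ValueError(f"postid {postid} is outside {nety}'s id range")

check_postid_in_range(5, 'EV1', {'EV1': 0}, {'EV1': 99})  # passes silently
```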
####Changes made####
#-All reference to prety, poty are now merged to nety (neuron type)
#-Defines postid in function for each nety
########################################
06-19-20
########################################
To-Do:
-Understand all functions called by plotallinputmaps
-Can I scrap any of them?
####Inputs####
Function Inputs - Calling Input
pdf - pdf
t - pdf.time[0]
postid - dstartidx['EMUP'] + 0
poty - 'EMUP'
dnumc - dnumc
dstartidx - dstartidx
dendidx - dendidx
lprety - lprety
cmap - colormap for the imshow function
#
def plotallinputmaps (pdf, t, postid, poty, dnumc, dstartidx, dendidx, lprety=['EV1DNW', 'EV1DN', 'EV1DNE', 'EV1DW', 'EV1','EV1DE','EV1DSW', 'EV1DS', 'EV1DSE'], asweight=False, cmap='jet',dmap=None):
if dmap is None:
drfmap = getallinputmaps(pdf, t, postid, poty, dnumc, dstartidx, dendidx, lprety, asweight=asweight)
else:
drfmap = dmap
vmin,vmax = 1e9,-1e9
for prety in lprety:
vmin = min(vmin, np.amin(drfmap[prety]))
vmax = max(vmax, np.amax(drfmap[prety]))
for tdx,prety in enumerate(lprety):
subplot(3,3,tdx+1)
imshow(drfmap[prety],cmap=cmap,origin='upper',vmin=vmin,vmax=vmax);
title(prety+'->'+poty+str(postid));
colorbar()
return drfmap
##Image data is stored in drfmap from getallinputmaps; this is the meat of it.
##lprety lists these types since they carry the sensory info; want to see how this info is conveyed to the motor neurons
##dmap is an optional precomputed output of getallinputmaps (skips recomputation when provided)
##asweight selects whether the color picture draws the count or the weight
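The asweight branch can be illustrated in isolation: the map accumulates weights per pixel alongside a count, then divides so each pixel holds the mean weight (a minimal sketch with made-up weights, not repo data):

```python
import numpy as np

# Toy illustration of the asweight branch: three synapses land on
# pixel (0,0) with weights 1.0, 2.0, 3.0; the final map stores their mean.
rfmap = np.zeros((2, 2))   # accumulated weights
rfcnt = np.zeros((2, 2))   # synapse counts per pixel
for w in (1.0, 2.0, 3.0):
    rfcnt[0, 0] += 1
    rfmap[0, 0] += w
# vectorized equivalent of the double loop at the end of getinputmap
rfmap[rfcnt > 0] /= rfcnt[rfcnt > 0]
print(rfmap[0, 0])  # 2.0 -- the mean weight
```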
#
def getallinputmaps (pdf, t, postid, poty, dnumc, dstartidx, dendidx, lprety = ['EV1DNW', 'EV1DN', 'EV1DNE', 'EV1DW', 'EV1','EV1DE','EV1DSW', 'EV1DS', 'EV1DSE'], asweight=False):
# gets all input maps onto postid
return {prety:getinputmap(pdf, t, prety, postid, poty, dnumc, dstartidx, dendidx, asweight=asweight) for prety in lprety}
##Returns a dictionary: each key is a prety in lprety, each value is the output of getinputmap
##This is basically a wrapper that collects the getinputmap results
####Inputs####
Function Inputs - Calling Input
pdf - pdf
t - pdf.time[0]
prety - from lprety
postid - dstartidx['EMUP'] + 0
poty - 'EMUP'
dnumc - dnumc
dstartidx - dstartidx
dendidx - dendidx
#
def getinputmap (pdf, t, prety, postid, poty, dnumc, dstartidx, dendidx, asweight=False):
nrow = ncol = int(np.sqrt(dnumc[poty]))
rfmap = np.zeros((nrow,ncol))
pdfs = pdf[(pdf.postid==postid) & (pdf.preid>dstartidx[prety]) & (pdf.preid<=dendidx[prety]) & (pdf.time==t)]
if len(pdfs) < 1: return rfmap
if not asweight:
for idx in pdfs.index:
preid = int(pdfs.at[idx,'preid'])
x,y = gid2pos(dnumc[prety], dstartidx[prety], preid)
rfmap[y,x] += 1
else:
rfcnt = np.zeros((nrow,ncol))
for idx in pdfs.index:
preid = int(pdfs.at[idx,'preid'])
x,y = gid2pos(dnumc[prety], dstartidx[prety], preid)
rfcnt[y,x] += 1
rfmap[y,x] += pdfs.at[idx,'weight']
for y in range(nrow):
for x in range(ncol):
if rfcnt[y,x]>0: rfmap[y,x]/=rfcnt[y,x] #rfmap integrates weight, take the average
return rfmap
#####Notes#####
-dnumc is a dictionary
#####Variables#####
-nrow, ncol
Equal value?
-rfmap
Image data, 2D matrix
-pdfs
Confused by this?
Seems to be raw data
-asweight
If true, looks at weights of connections, not just count
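Re the pdfs confusion: pdfs is just the connection DataFrame filtered by a boolean mask down to one postsynaptic cell, one presynaptic id range, and one time point. A toy example with a made-up table (column names match the notes):

```python
import pandas as pd

# Toy connection table in the shape the notes assume:
# one row per synapse, with preid/postid/time/weight columns.
pdf = pd.DataFrame({
    'preid':  [1, 2, 3, 4],
    'postid': [10, 10, 10, 11],
    'time':   [0, 0, 1, 0],
    'weight': [0.5, 0.7, 0.9, 0.2],
})
# Same mask pattern as in getinputmap: fixed postid, preid in a
# given id range, fixed time.
pdfs = pdf[(pdf.postid == 10) & (pdf.preid > 0) & (pdf.preid <= 3) & (pdf.time == 0)]
print(len(pdfs))  # 2 rows survive the mask
```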
#####Sam Notes#####
dnumc is a dictionary that holds the number of cells of a type
nrow,ncol gets the number of rows,columns -- assumes they're arranged in a square
rfmap is the receptive field map output
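gid2pos itself isn't shown in these notes; under Sam's square-arrangement assumption, a plausible sketch of what it does is the following (hypothetical reconstruction -- check the repo for the real implementation):

```python
import numpy as np

# Hypothetical reconstruction of gid2pos under the square-grid
# assumption: global ids for a type run from startidx upward in
# row-major order over a sqrt(numc) x sqrt(numc) grid.
def gid2pos(numc, startidx, gid):
    ncol = int(np.sqrt(numc))
    offset = gid - startidx
    return offset % ncol, offset // ncol  # (x, y)

print(gid2pos(9, 100, 104))  # id 104 in a 3x3 grid -> (1, 1)
```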