First off -- great code; a joy to read. Thank you for sharing it.
Regarding a few of the hyperparameters, you mention: "Manually choose hyperparameters for B and lambda because those are not specified in the paper." (Have you tried asking the authors, rather than guessing?) It looks like you ran a few lambda experiments. For the buffer size B, a buffer of 25600 against a batch size of 512 gives a rather long history (with random replacement), and I wonder how much history is too much.
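For reference, here is a minimal sketch of how I'm picturing that history buffer with random replacement; the class and method names are mine, not the repo's, and B=25600 / batch 512 are just the values discussed above:

```python
import numpy as np

class ImageHistoryBuffer:
    """Sketch of a refined-image history buffer with random replacement.

    With B = 25600 and half of each 512-image discriminator batch drawn
    from here, an image can linger for many refiner updates before being
    overwritten -- which is the "how much history is too much" question.
    """

    def __init__(self, capacity=25600):
        self.capacity = capacity
        self.buffer = []  # list of np.ndarray images

    def add(self, images):
        """Insert newly refined images, randomly overwriting old entries once full."""
        for img in images:
            if len(self.buffer) < self.capacity:
                self.buffer.append(img)
            else:
                idx = np.random.randint(self.capacity)
                self.buffer[idx] = img

    def sample(self, n):
        """Draw n images uniformly at random from the history."""
        idxs = np.random.randint(len(self.buffer), size=n)
        return np.stack([self.buffer[i] for i in idxs])
```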
Regarding cropping gaze data:
synth renders start at 640x480, are cropped to 140x84, then resized to 55x35
real data start at 60x36 and are resized to 55x35 (as specified by the paper)
Are the crop parameters just based on a best-guess visual alignment of synth to real?
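To make sure I'm reading those two paths correctly, here is a rough sketch of the preprocessing as I understand it; the crop window placement (eye_cx, eye_cy) is hypothetical, which is really what I'm asking about:

```python
from PIL import Image

CROP_W, CROP_H = 140, 84   # crop applied to the 640x480 synthetic renders
OUT_W, OUT_H = 55, 35      # final size used by the paper

def preprocess_synth(path, eye_cx=320, eye_cy=240):
    """640x480 synthetic render -> 140x84 crop around the eye -> 55x35."""
    img = Image.open(path).convert("L")
    left = eye_cx - CROP_W // 2
    top = eye_cy - CROP_H // 2
    img = img.crop((left, top, left + CROP_W, top + CROP_H))
    return img.resize((OUT_W, OUT_H), Image.BILINEAR)

def preprocess_real(path):
    """60x36 real eye image -> 55x35 (no crop, per the paper)."""
    img = Image.open(path).convert("L")
    return img.resize((OUT_W, OUT_H), Image.BILINEAR)
```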
More generally, what is your level of confidence in the fidelity of this implementation to the original? Your refined eyeballs don't look quite as compelling as those in the S+U paper, but I don't know whether that's a function of fewer synth images used, fewer training iterations, or something else.
Thanks again!