Responses to Visual Event Timing in Generative Neural Network Models
Summary
Precise estimation of sub-second event timing from visual inputs is a fundamental aspect of human perception, enabling complex coordinative abilities. Early visual cortex areas exhibit monotonically increasing responses to visual event timing, which turn into timing-tuned responses beginning in the medial temporal area (MT/V5). Here, we investigate whether such responses can be found in recurrent generative neural network models trained without supervision to efficiently encode visual event timing. Using biologically plausible learning rules and network structure, we found monotonic and tuned responses to visual event timing in non-hierarchical but not in hierarchical models. These findings support the emergence of monotonic and tuned responses from inherent components of visual inputs. We show that unsupervised recurrent generative neural networks can, in general, be used as models of human visual event timing. Moreover, we propose that more advanced models could help explain how responses develop along the visual hierarchy, as well as the relationship between spatial and temporal abstraction.