A curated dataset of urban scenes for audio-visual scene analysis | S Wang, A Mesaros, T Heittola, T Virtanen | ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 54 | 2021 |
Audio-visual scene classification: Analysis of DCASE 2021 challenge submissions | S Wang, T Heittola, A Mesaros, T Virtanen | arXiv preprint arXiv:2105.13675, 2021 | 19 | 2021 |
Low-latency deep clustering for speech separation | S Wang, G Naithani, T Virtanen | ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 15 | 2019 |
Self-supervised learning of audio representations from audio-visual data using spatial alignment | S Wang, A Politis, A Mesaros, T Virtanen | IEEE Journal of Selected Topics in Signal Processing 16 (6), 1467-1479, 2022 | 8 | 2022 |
Deep neural network based low-latency speech separation with asymmetric analysis-synthesis window pair | S Wang, G Naithani, A Politis, T Virtanen | 2021 29th European Signal Processing Conference (EUSIPCO), 301-305, 2021 | 7 | 2021 |
Self-supervised learning of audio representations using angular contrastive loss | S Wang, S Tripathy, A Mesaros | ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 2 | 2023 |