Publications

Journal papers

  • Y. Saito, S. Takamichi, and H. Saruwatari, “Voice conversion using input-to-output highway networks,” IEICE Transactions on Information and Systems, 2017. (accepted)
  • Y. Bando, H. Saruwatari, N. Ono, S. Makino, K. Itoyama, D. Kitamura, M. Ishimura, M. Takakusaki, N. Mae, K. Yamaoka, Y. Matsui, Y. Ambe, M. Konyo, S. Tadokoro, K. Yoshii, and H. G. Okuno, “Low-latency and high-quality two-stage human-voice-enhancement system for a hose-shaped rescue robot,” Journal of Robotics and Mechatronics, vol. 29, no. 1, 2017. [DOI]
  • S. Takamichi, T. Toda, G. Neubig, S. Sakti, and S. Nakamura, “A statistical sample-based approach to GMM-based voice conversion using tied-covariance acoustic models,” IEICE Transactions on Information and Systems, vol. E99-D, no. 10, pp. 2490-2498, 2016. [DOI]
  • D. Kitamura, N. Ono, H. Sawada, H. Kameoka, and H. Saruwatari, “Determined blind source separation unifying independent vector analysis and nonnegative matrix factorization,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 9, pp. 1626-1641, 2016. [DOI]
  • S. Koyama, K. Furuya, K. Wakayama, S. Shimauchi, and H. Saruwatari, “Analytical approach to transforming filter design for sound field recording and reproduction using circular arrays with a spherical baffle,” Journal of the Acoustical Society of America, vol. 139, no. 3, pp. 1024-1036, 2016. [DOI]
  • Y. Oshima, S. Takamichi, T. Toda, G. Neubig, S. Sakti, and S. Nakamura, “Non-native text-to-speech preserving speaker individuality based on partial correction of prosodic and phonetic characteristics,” IEICE Transactions on Information and Systems, vol. E99-D, no. 12, 2016. [DOI]
  • S. Koyama, K. Furuya, Y. Haneda, and H. Saruwatari, “Source-location-informed sound field recording and reproduction,” IEEE Journal of Selected Topics in Signal Processing, vol. 9, no. 5, pp. 881-894, 2015. [DOI]
  • D. Kitamura, H. Saruwatari, H. Kameoka, Y. Takahashi, K. Kondo, and S. Nakamura, “Multichannel signal separation combining directional clustering and nonnegative matrix factorization with spectrogram restoration,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 4, pp. 654-669, 2015. [DOI]
  • F. D. Aprilyanti, J. Even, H. Saruwatari, K. Shikano, S. Nakamura, and T. Takatani, “Suppression of noise and late reverberation based on blind signal extraction and Wiener filtering,” Acoustical Science and Technology, vol. 36, no. 6, pp. 302-313, 2015. [DOI]
  • S. Koyama, K. Furuya, Y. Hiwasaki, Y. Haneda, and Y. Suzuki, “Wave field reconstruction filtering in cylindrical harmonic domain for with-height recording and reproduction,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22, no. 10, pp. 1546-1557, 2014. [DOI]
  • S. Koyama, K. Furuya, H. Uematsu, Y. Hiwasaki, and Y. Haneda, “Real-time sound field transmission system by using wave field reconstruction filter and its evaluation,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. E97-A, no. 9, pp. 1840-1848, 2014. [DOI]
  • R. Miyazaki, H. Saruwatari, S. Nakamura, K. Shikano, K. Kondo, J. Blanchette, and M. Bouchard, “Musical-noise-free blind speech extraction integrating microphone array and iterative spectral subtraction,” Signal Processing (Elsevier), vol. 102, pp. 226-239, 2014. [DOI]
  • T. Aketo, H. Saruwatari, and S. Nakamura, “Robust sound field reproduction against listener’s movement utilizing image sensor,” Journal of Signal Processing, vol. 18, no. 4, pp. 213-216, 2014. [DOI]
  • T. Miyauchi, D. Kitamura, H. Saruwatari, and S. Nakamura, “Depth estimation of sound images using directional clustering and activation-shared nonnegative matrix factorization,” Journal of Signal Processing, vol. 18, no. 4, pp. 217-220, 2014. [DOI]
  • D. Kitamura, H. Saruwatari, K. Yagi, K. Shikano, Y. Takahashi, and K. Kondo, “Music signal separation based on supervised nonnegative matrix factorization with orthogonality and maximum-divergence penalties,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. E97-A, no. 5, pp. 1113-1118, 2014. [DOI]

Books

  • H. Saruwatari and R. Miyazaki, “Statistical analysis and evaluation of blind speech extraction algorithms,” in Advances in modern blind source separation techniques: theory and applications, G. Naik and W. Wang, Eds., Springer, 2014, pp. 291-322. [DOI]

Invited talks

  • S. Koyama, N. Murata, and H. Saruwatari, “Effect of multipole dictionary in sparse sound field decomposition for super-resolution in recording and reproduction,” in Proceedings of International Congress on Sound and Vibration (ICSV), London, 2017. (to appear)
  • S. Takamichi, “Speech synthesis that deceives anti-spoofing verification,” in NII Talk, 2016.
  • H. Nakajima, D. Kitamura, N. Takamune, S. Koyama, H. Saruwatari, Y. Takahashi, and K. Kondo, “Audio signal separation using supervised NMF with time-variant all-pole-model-based basis deformation,” in Proceedings of Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Jeju, 2016.
  • H. Saruwatari, K. Takata, N. Ono, and S. Makino, “Flexible microphone array based on multichannel nonnegative matrix factorization and statistical signal estimation,” in Proceedings of the 22nd International Congress on Acoustics (ICA), 2016.
  • S. Koyama, “Source-location-informed sound field recording and reproduction: a generalization to arrays of arbitrary geometry,” in Proceedings of 2016 AES International Conference on Sound Field Control, Guildford, 2016.
  • S. Koyama, A. Matsubayashi, N. Murata, and H. Saruwatari, “Sparse sound field decomposition using group sparse Bayesian learning,” in Proceedings of Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2015, pp. 850-855.
  • D. Kitamura, N. Ono, H. Sawada, H. Kameoka, and H. Saruwatari, “Relaxation of rank-1 spatial constraint in overdetermined blind source separation,” in Proceedings of European Signal Processing Conference (EUSIPCO), Nice, 2015, pp. 1271-1275.
  • H. Saruwatari, “Statistical-model-based speech enhancement with musical-noise-free properties,” in Proceedings of IEEE International Conference on Digital Signal Processing (DSP), Singapore, 2015.
  • D. Kitamura, H. Saruwatari, S. Nakamura, Y. Takahashi, K. Kondo, and H. Kameoka, “Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration,” in Proceedings of Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Siem Reap, 2014.

International conferences

  • S. Takamichi, T. Koriyama, and H. Saruwatari, “Sampling-based speech parameter generation using moment-matching network,” in Proceedings of Interspeech, Stockholm, 2017. (to appear)
  • H. Miyoshi, Y. Saito, S. Takamichi, and H. Saruwatari, “Voice conversion using sequence-to-sequence learning of context posterior probabilities,” in Proceedings of Interspeech, Stockholm, 2017. (to appear)
  • S. Koyama, N. Murata, and H. Saruwatari, “Effect of multipole dictionary in sparse sound field decomposition for super-resolution in recording and reproduction,” in Proceedings of International Congress on Sound and Vibration (ICSV), London, 2017. (to appear)  [Invited]
  • N. Ueno, S. Koyama, and H. Saruwatari, “Listening-area-informed sound field reproduction based on circular harmonic expansion,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), New Orleans, 2017, pp. 111-115.
  • N. Murata, S. Koyama, N. Takamune, and H. Saruwatari, “Spatio-temporal sparse sound field decomposition considering acoustic source signal characteristics,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), New Orleans, 2017, pp. 441-445.
  • Y. Mitsui, D. Kitamura, S. Takamichi, N. Ono, and H. Saruwatari, “Blind source separation based on independent low-rank matrix analysis with sparse regularization for time-series activity,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), New Orleans, 2017, pp. 21-25.  [Student Paper Contest Finalist]
  • R. Sato, H. Kameoka, and K. Kashino, “Fast algorithm for statistical phrase/accent command estimation based on generative model incorporating spectral features,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), New Orleans, 2017, pp. 5595-5599.
  • Y. Saito, S. Takamichi, and H. Saruwatari, “Training algorithm to deceive anti-spoofing verification for DNN-based speech synthesis,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), New Orleans, 2017, pp. 4900-4904.  [Spoken Language Processing Student Grant]
  • N. Ueno, S. Koyama, and H. Saruwatari, “Listening-area-informed sound field reproduction with Gaussian prior based on circular harmonic expansion,” in Proceedings of Hands-free Speech Communication and Microphone Arrays (HSCMA), San Francisco, 2017, pp. 196-200.
  • N. Mae, M. Ishimura, D. Kitamura, N. Ono, T. Yamada, S. Makino, and H. Saruwatari, “Ego noise reduction for hose-shaped rescue robot combining independent low-rank matrix analysis and multichannel noise cancellation,” in Proceedings of International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), Grenoble, 2017, pp. 141-151.
  • H. Nakajima, D. Kitamura, N. Takamune, S. Koyama, H. Saruwatari, Y. Takahashi, and K. Kondo, “Audio signal separation using supervised NMF with time-variant all-pole-model-based basis deformation,” in Proceedings of Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Jeju, 2016.  [Invited]
  • S. Koyama, N. Murata, and H. Saruwatari, “Super-resolution in sound field recording and reproduction based on sparse representation,” in Proceedings of 5th Joint Meeting of the Acoustical Society of America and Acoustical Society of Japan, Honolulu, 2016.  [Invited]
  • D. Kitamura, N. Ono, H. Saruwatari, Y. Takahashi, and K. Kondo, “Discriminative and reconstructive basis training for audio source separation with semi-supervised nonnegative matrix factorization,” in Proceedings of International Workshop on Acoustic Signal Enhancement (IWAENC), Xi'an, 2016.
  • K. Kobayashi, S. Takamichi, S. Nakamura, and T. Toda, “The NU-NAIST voice conversion system for the voice conversion challenge 2016,” in Proceedings of Interspeech, San Francisco, 2016, pp. 1667-1671.
  • L. Li, H. Kameoka, T. Higuchi, and H. Saruwatari, “Semi-supervised joint enhancement of spectral and cepstral sequences of noisy speech,” in Proceedings of Interspeech, San Francisco, 2016, pp. 3753-3757.
  • M. Takakusaki, D. Kitamura, N. Ono, T. Yamada, S. Makino, and H. Saruwatari, “Noise reduction using independent vector analysis and noise cancellation for a hose-shaped rescue robot,” in Proceedings of International Workshop on Acoustic Signal Enhancement (IWAENC), Xi'an, 2016.
  • M. Ishimura, S. Makino, T. Yamada, N. Ono, and H. Saruwatari, “Noise reduction using independent vector analysis and noise cancellation for a hose-shaped rescue robot,” in Proceedings of International Workshop on Acoustic Signal Enhancement (IWAENC), Xi'an, 2016.
  • H. Nakajima, D. Kitamura, N. Takamune, S. Koyama, H. Saruwatari, N. Ono, Y. Takahashi, and K. Kondo, “Music signal separation using supervised NMF with all-pole-model-based discriminative basis deformation,” in Proceedings of The 2016 European Signal Processing Conference (EUSIPCO), Budapest, 2016, pp. 1143-1147.
  • N. Murata, H. Kameoka, K. Kinoshita, S. Araki, T. Nakatani, S. Koyama, and H. Saruwatari, “Reverberation-robust underdetermined source separation with non-negative tensor double deconvolution,” in Proceedings of The 2016 European Signal Processing Conference (EUSIPCO), Budapest, 2016, pp. 1648-1652.
  • S. Koyama, “Source-location-informed sound field recording and reproduction: a generalization to arrays of arbitrary geometry,” in Proceedings of 2016 AES International Conference on Sound Field Control, Guildford, 2016.  [Invited]
  • Y. Mitsufuji, S. Koyama, and H. Saruwatari, “Multichannel blind source separation based on non-negative tensor factorization in wavenumber domain,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Shanghai, 2016, pp. 56-60.
  • S. Koyama and H. Saruwatari, “Sound field decomposition in reverberant environment using sparse and low-rank signal models,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Shanghai, 2016, pp. 395-399.
  • N. Murata, S. Koyama, H. Kameoka, N. Takamune, and H. Saruwatari, “Sparse sound field decomposition with multichannel extension of complex NMF,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Shanghai, 2016, pp. 345-349.
  • S. Koyama, A. Matsubayashi, N. Murata, and H. Saruwatari, “Sparse sound field decomposition using group sparse Bayesian learning,” in Proceedings of Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2015, pp. 850-855.  [Invited]
  • N. Murata, S. Koyama, N. Takamune, and H. Saruwatari, “Sparse sound field decomposition with parametric dictionary learning for super-resolution recording and reproduction,” in Proceedings of IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 2015. [DOI]
  • S. Koyama, K. Ito, and H. Saruwatari, “Source-location-informed sound field recording and reproduction with spherical arrays,” in Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, 2015. [DOI]
  • D. Kitamura, N. Ono, H. Sawada, H. Kameoka, and H. Saruwatari, “Efficient multichannel nonnegative matrix factorization exploiting rank-1 spatial model,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brisbane, 2015, pp. 276-280.
  • Y. Murota, D. Kitamura, S. Koyama, H. Saruwatari, and S. Nakamura, “Statistical modeling of binaural signal and its application to binaural source separation,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brisbane, 2015, pp. 494-498.
  • S. Koyama, N. Murata, and H. Saruwatari, “Structured sparse signal models and decomposition algorithm for super-resolution in sound field recording and reproduction,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brisbane, 2015, pp. 619-623.
  • D. Kitamura, N. Ono, H. Sawada, H. Kameoka, and H. Saruwatari, “Relaxation of rank-1 spatial constraint in overdetermined blind source separation,” in Proceedings of European Signal Processing Conference (EUSIPCO), Nice, 2015, pp. 1261-1265. [DOI]  [Invited]
  • H. Saruwatari, “Statistical-model-based speech enhancement with musical-noise-free properties,” in Proceedings of IEEE International Conference on Digital Signal Processing (DSP), Singapore, 2015.  [Invited]
  • D. Kitamura, H. Saruwatari, S. Nakamura, Y. Takahashi, K. Kondo, and H. Kameoka, “Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration,” in Proceedings of Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Siem Reap, 2014.  [Invited]
  • S. Koyama, P. Srivastava, K. Furuya, S. Shimauchi, and H. Ohmuro, “STSP: space-time stretched pulse for measuring spatio-temporal impulse response,” in Proceedings of International Workshop on Acoustic Signal Enhancement (IWAENC), 2014, pp. 309-313. [DOI]
  • F. Aprilyanti, H. Saruwatari, K. Shikano, S. Nakamura, and T. Takatani, “Optimized joint noise suppression and dereverberation based on blind signal extraction for hands-free speech recognition system,” in Proceedings of Hands-free Speech Communication and Microphone Arrays (HSCMA), Nancy, 2014. [DOI]
  • S. Nakai, H. Saruwatari, R. Miyazaki, S. Nakamura, and K. Kondo, “Theoretical analysis of biased MMSE short-time spectral amplitude estimator and its extension to musical-noise-free speech enhancement,” in Proceedings of Hands-free Speech Communication and Microphone Arrays (HSCMA), Nancy, 2014. [DOI]
  • Y. Murota, D. Kitamura, S. Nakai, H. Saruwatari, S. Nakamura, Y. Takahashi, and K. Kondo, “Music signal separation based on Bayesian spectral amplitude estimator with automatic target prior adaptation,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Florence, 2014, pp. 7540-7544. [DOI]
  • Y. Haneda, K. Furuya, S. Koyama, and K. Niwa, “Close-talking spherical microphone array using sound pressure interpolation based on spherical harmonic expansion,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Florence, 2014, pp. 604-608. [DOI]
  • S. Koyama, S. Shimauchi, and H. Ohmuro, “Sparse sound field representation in recording and reproduction for reducing spatial aliasing artifacts,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Florence, 2014, pp. 4476-4480. [DOI]
  • D. Kitamura, H. Saruwatari, S. Nakamura, Y. Takahashi, K. Kondo, and H. Kameoka, “Divergence optimization in nonnegative matrix factorization with spectrogram restoration for multichannel signal separation,” in Proceedings of Hands-free Speech Communication and Microphone Arrays (HSCMA), Nancy, 2014. [DOI]