Posts by Collection

publications

ARET: Aggregated Residual Extended Time-Delay Neural Networks for Speaker Verification

Published in Proc. Interspeech 2020

Citation: Ruiteng Zhang, Jianguo Wei, Wenhuan Lu, Longbiao Wang, Meng Liu, Lin Zhang, Jiayu Jin, Junhai Xu. "ARET: Aggregated Residual Extended Time-Delay Neural Networks for Speaker Verification," in Proc. Interspeech 2020.

Regional Resonance of the Lower Vocal Tract and its Contribution to Speaker Characteristics

Published in Proc. Interspeech 2020

Citation: Lin Zhang, Kiyoshi Honda, Jianguo Wei, Seiji Adachi. "Regional Resonance of the Lower Vocal Tract and its Contribution to Speaker Characteristics," in Proc. Interspeech 2020, pp. 1391-1395.

Multi-task Learning in Utterance-level and Segmental-level Spoof Detection

Published in Proc. ASVspoof Workshop 2021

Citation: Lin Zhang, Xin Wang, Erica Cooper, Junichi Yamagishi. "Multi-task Learning in Utterance-level and Segmental-level Spoof Detection," in Proc. ASVspoof Workshop 2021, pp. 9-15.

An Initial Investigation for Detecting Partially Spoofed Audio

Published in Proc. Interspeech 2021

Citation: Lin Zhang, Xin Wang, Erica Cooper, Junichi Yamagishi, et al. "An Initial Investigation for Detecting Partially Spoofed Audio," in Proc. Interspeech 2021, pp. 4264-4268.

CpAug: Refining Copy-Paste Augmentation for Speech Anti-Spoofing

Published in Proc. ICASSP 2022

Citation: Linjuan Zhang, Kong Aik Lee, Lin Zhang, Longbiao Wang, Baoning Niu. "CpAug: Refining Copy-Paste Augmentation for Speech Anti-Spoofing," in Proc. ICASSP 2022, pp. 7082-7086.

CS-REP: Making Speaker Verification Networks Embrace Re-Parameterization

Published in Proc. ICASSP 2022

Citation: Ruiteng Zhang, Jianguo Wei, Wenhuan Lu, Lin Zhang, Yantao Ji, Junhai Xu, Xugang Lu. "CS-REP: Making Speaker Verification Networks Embrace Re-Parameterization," in Proc. ICASSP 2022, pp. 7082-7086.

Deep Spectro-temporal Artifacts for Detecting Synthesized Speech

Published in Proc. DDAM 2022

Citation: Xiaohui Liu, Meng Liu, Lin Zhang, et al. "Deep Spectro-temporal Artifacts for Detecting Synthesized Speech," in Proc. DDAM 2022, pp. 69–75.

Data Augmentation Using McAdams-Coefficient-Based Speaker Anonymization for Fake Audio Detection

Published in Proc. Interspeech 2022

Citation: Kai Li, Li, S., Lu, X., Akagi, M., Liu, M., Lin Zhang, Zeng, C., Wang, L., Dang, J., Unoki, M. "Data Augmentation Using McAdams-Coefficient-Based Speaker Anonymization for Fake Audio Detection," in Proc. Interspeech 2022, pp. 664-668.

Spoofing-Aware Attention-based ASV Back-end with Multiple Enrollment Utterances and a Sampling Strategy for the SASV Challenge 2022

Published in Proc. Interspeech 2022

Citation: Chang Zeng, Lin Zhang, Meng Liu, Junichi Yamagishi. "Spoofing-Aware Attention-based ASV Back-end with Multiple Enrollment Utterances and a Sampling Strategy for the SASV Challenge 2022," in Proc. Interspeech 2022, pp. 2883-2887.

Optimal Transport with a Diversified Memory Bank for Cross-Domain Speaker Verification

Published in Proc. ICASSP 2023

Citation: Ruiteng Zhang, Jianguo Wei, Xugang Lu, Wenhuan Lu, Di Jin, Lin Zhang, Junhai Xu. "Optimal Transport with a Diversified Memory Bank for Cross-Domain Speaker Verification," in Proc. ICASSP 2023, pp. 1-5.

The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance

Published in IEEE/ACM Transactions on Audio, Speech and Language Processing, 2023

Citation: Lin Zhang, Xin Wang, Erica Cooper, Nicholas Evans, Junichi Yamagishi. (2023). The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 813-825.

Range-Based Equal Error Rate for Spoof Localization

Published in Proc. Interspeech 2023

Citation: Lin Zhang, Xin Wang, Erica Cooper, Nicholas Evans, Junichi Yamagishi. "Range-Based Equal Error Rate for Spoof Localization," in Proc. Interspeech 2023, pp. 3212-3216.

Self-supervised learning based domain regularization for mask-wearing speaker verification

Published in Speech Communication, 2023

Citation: Ruiteng Zhang, Jianguo Wei, Xugang Lu, Wenhuan Lu, Di Jin, Lin Zhang, Junhai Xu. (2023). "Self-supervised learning based domain regularization for mask-wearing speaker verification." Speech Communication: 102953.

TMS: Temporal multi-scale in time-delay neural network for speaker verification

Published in Applied Intelligence, 2023

Citation: Ruiteng Zhang, Jianguo Wei, Xugang Lu, Wenhuan Lu, Di Jin, Lin Zhang, Junhai Xu, and Jianwu Dang. "TMS: Temporal multi-scale in time-delay neural network for speaker verification." Applied Intelligence 53, no. 22 (2023): 26497-26517.

BUT Systems and Analyses for the ASVspoof 5 Challenge

Published in Proc. ASVspoof 2024 Workshop

Citation: Johan Rohdin, Lin Zhang, Oldřich Plchot, et al. "BUT Systems and Analyses for the ASVspoof 5 Challenge," in Proc. ASVspoof 2024 Workshop, pp. 24-31.

Do End-to-End Neural Diarization Attractors Need to Encode Speaker Characteristic Information?

Published in Proc. Odyssey 2024

Citation: Lin Zhang, Themos Stafylakis, Lukáš Burget, Federico Landini, Mireia Diez, Anna Silnova. "Do End-to-End Neural Diarization Attractors Need to Encode Speaker Characteristic Information?" in Proc. Odyssey 2024, pp. 123-130.

How Do Neural Spoofing Countermeasures Detect Partially Spoofed Audio?

Published in Proc. Interspeech 2024

Citation: Tianchi Liu, Lin Zhang, Rohan Kumar Das, Yi Ma, Ruijie Tao, Haizhou Li. "How Do Neural Spoofing Countermeasures Detect Partially Spoofed Audio?" in Proc. Interspeech 2024, pp. 1105-1109.

Spoof Diarization: “What Spoofed When” in Partially Spoofed Audio

Published in Proc. Interspeech 2024

Citation: Lin Zhang, Xin Wang, Erica Cooper, Mireia Diez, Federico Landini, Nicholas Evans, Junichi Yamagishi. "Spoof Diarization: “What Spoofed When” in Partially Spoofed Audio," in Proc. Interspeech 2024, pp. 502-506.

Unsupervised Adaptive Speaker Recognition by Coupling-Regularized Optimal Transport

Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024

Citation: Ruiteng Zhang, Jianguo Wei, Xugang Lu, Wenhuan Lu, Di Jin, Lin Zhang, and Junhai Xu. (2024). "Unsupervised Adaptive Speaker Recognition by Coupling-Regularized Optimal Transport." IEEE/ACM Transactions on Audio, Speech, and Language Processing. vol. 32, pp. 3603-3617.

Analysis of ABC Frontend Audio Systems for the NIST-SRE24

Published in Proc. Interspeech 2025

Citation: Sara Barahona, Anna Silnova, Ladislav Mošner, Junyi Peng, Oldřich Plchot, Johan Rohdin, Lin Zhang, et al. "Analysis of ABC Frontend Audio Systems for the NIST-SRE24," in Proc. Interspeech 2025, pp. 5763-5767.

An Adaptation Framework with Unified Embedding Reconstruction for Cross-Corpus Speech Emotion Recognition

Published in Applied Soft Computing Journal, 2025

Citation: Ruiteng Zhang, Jianguo Wei, Xugang Lu, Yongwei Li, Wenhuan Lu, Lin Zhang, and Junhai Xu. (2025) An Adaptation Framework with Unified Embedding Reconstruction for Cross-Corpus Speech Emotion Recognition, Applied Soft Computing Journal, vol. 174, pp. 112948.

Towards Generalized Source Tracing for Codec-Based Deepfake Speech

Accepted to Proc. IEEE ASRU 2025

Citation: I-Ming Lin, Xuanjun Chen, Lin Zhang, Haibin Wu, Hung-yi Lee, Jyh-Shing Roger Jang. "Towards Generalized Source Tracing for Codec-Based Deepfake Speech" (accepted to Proc. IEEE ASRU 2025)

CA-MHFA: A Context-Aware Multi-Head Factorized Attentive Pooling for SSL-Based Speaker Verification

Published in Proc. ICASSP 2025

Citation: Junyi Peng, Ladislav Mošner, Lin Zhang, Oldřich Plchot, Themos Stafylakis, et al. "CA-MHFA: A Context-Aware Multi-Head Factorized Attentive Pooling for SSL-Based Speaker Verification," in Proc. ICASSP 2025, pp. 1-5.

Codec-Based Deepfake Source Tracing via Neural Audio Codec Taxonomy

Published in Proc. Interspeech 2025

Citation: Xuanjun Chen, I-Ming Lin, Lin Zhang, Jiawei Du, Haibin Wu, Hung-yi Lee, Jyh-Shing Roger Jang. "Codec-Based Deepfake Source Tracing via Neural Audio Codec Taxonomy," in Proc. Interspeech 2025, pp. 1538-1542.

CodecFake+: A Large-Scale Neural Audio Codec-Based Deepfake Speech Dataset

Submitted to IEEE Transactions on Audio, Speech and Language Processing

Citation: Jiawei Du, Xuanjun Chen, Haibin Wu, Lin Zhang, I-Ming Lin, I-Hsiang Chiu, Wenze Ren, Yuan Tseng, Yu Tsao, Jyh-Shing Roger Jang, Hung-yi Lee. (2025) "CodecFake+: A Large-Scale Neural Audio Codec-Based Deepfake Speech Dataset" (submitted to IEEE Transactions on Audio, Speech and Language Processing)

Continual Unsupervised Domain Adaptation for Audio Deepfake Detection

Published in Proc. ICASSP 2025

Citation: Xiaohuan Chen, Wenhuan Lu, Ruiteng Zhang, Junhai Xu, Xugang Lu, Lin Zhang, Jianguo Wei. "Continual Unsupervised Domain Adaptation for Audio Deepfake Detection," in Proc. ICASSP 2025, pp. 1-5.

LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation

Published in Proc. ICASSP 2025

Citation: Hieu-Thi Luong, Haoyang Li, Lin Zhang, Kong Aik Lee, Eng Siong Chng. "LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation," in Proc. ICASSP 2025, pp. 1-5.

Multi-Sinkhorn Teacher Knowledge Aggregation Framework for Adaptive Audio Anti-Spoofing

Published in IEEE Transactions on Audio, Speech and Language Processing, 2025

Citation: Ruiteng Zhang, Jianguo Wei, Xugang Lu, Lin Zhang, Di Jin, Wenhuan Lu, Junhai Xu. (2025) "Multi-Sinkhorn Teacher Knowledge Aggregation Framework for Adaptive Audio Anti-Spoofing" in IEEE Transactions on Audio, Speech and Language Processing, vol. 33, pp. 3850-3865.

Optimal Transport with Class Structure Exploration for Cross-Domain Speech Emotion Recognition

Published in IEEE/ACM Transactions on Audio, Speech and Language Processing, 2025

Citation: Ruiteng Zhang, Jianguo Wei, Xugang Lu, Junhai Xu, Yongwei Li, Di Jin, Lin Zhang, Wenhuan Lu. (2025). "Optimal Transport with Class Structure Exploration for Cross-Domain Speech Emotion Recognition." IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 33, pp. 37-53.

PartialEdit: Identifying Partial Deepfakes in the Era of Neural Speech Editing

Published in Proc. Interspeech 2025

Citation: You Zhang, Baotong Tian, Lin Zhang, Zhiyao Duan. "PartialEdit: Identifying Partial Deepfakes in the Era of Neural Speech Editing," in Proc. Interspeech 2025, pp. 5353-5357.

Self-distillation-based domain exploration for source speaker verification under spoofed speech from unknown voice conversion

Published in Speech Communication, 2025

Citation: Xinlei Ma, Ruiteng Zhang, Jianguo Wei, Xugang Lu, Junhai Xu, Lin Zhang, and Wenhuan Lu. (2025) "Self-distillation-based domain exploration for source speaker verification under spoofed speech from unknown voice conversion." Speech Communication, vol. 167: 103153.

SHDA: Sinkhorn Domain Attention for Cross-Domain Audio Anti-Spoofing

Published in IEEE Transactions on Information Forensics and Security, 2025

Citation: Ruiteng Zhang, Jianguo Wei, Xugang Lu, Lin Zhang, Di Jin, Junhai Xu, Wenhuan Lu. (2025) "SHDA: Sinkhorn Domain Attention for Cross-Domain Audio Anti-Spoofing" IEEE Transactions on Information Forensics and Security, vol. 20, pp. 6474-6489, 2025, doi: 10.1109/TIFS.2025.3576576

ShiftySpeech: A Large-Scale Synthetic Speech Dataset with Distribution Shifts

Submitted to NeurIPS 2025 Datasets and Benchmarks Track

Citation: Ashi Garg, Zexin Cai, Lin Zhang, Henry Li Xinyuan, Leibny Paola Garcia Perera, Kevin Duh, Sanjeev Khudanpur, Matthew Wiesner, Nicholas Andrews. "ShiftySpeech: A Large-Scale Synthetic Speech Dataset with Distribution Shifts" (submitted to NeurIPS 2025 Datasets and Benchmarks Track)

WeDefense: A Toolkit to Defend Against Fake Audio

In preparation (Workshop / Toolkit)

Citation: Lin Zhang, Xin Wang, Johan Rohdin, Junyi Peng, Tianchi Liu, You Zhang, Hieu-Thi Luong, Shuai Wang, Anna Silnova, Chengdong Liang, Nicholas Evans. "WeDefense: A Toolkit to Defend Against Fake Audio" (in preparation)

talks

‘Whether and When’: Detection and Localization of Partially Spoofed Audio

Published: May 25, 2023

2023-06-27: Introductory talk at Brno University of Technology (BUT) during a visit.
2023-05-25: Invited talk at Huawei Shield Trustworthy Lab.

‘Whether, When, What’: Detection, Localization, and Diarization of Partially Spoofed Audio

Published: July 17, 2024

2025-02-25: Invited talk at EURECOM, France, in person
2024-07-17: Invited talk at ISCA-SPSC, online
2024-01-30: PhD Defense, NII, Japan
- slides

What’s Happening on Partial Spoof?

Published: March 24, 2025

2025-03-24: Invited talk at UEF, Finland, in person
2025-02-25: Invited talk at EURECOM, France, in person
2025-04-18: Speech Technologies reading group, JHU, USA, in person

An Overview of Partially Fake Speech

Published: November 20, 2025

2025-11-20: IEEE SPS Webinar, online
- Title: “Minor Manipulations, Major Threat: An Overview of Partially Fake Speech”
- Slides
2025-11-10: CLSP Webinar, JHU, USA
- Title: “Minor Manipulations, Major Threat: An Overview of Partially Fake Speech”
2025-08-15: Invited talk at Reality Defender, online
- Title: “Small Changes, Big Threat: A Story of Partial Spoof”