FoCaL: Forum on Cantonese Linguistics

LSHK: The Linguistic Society of Hong Kong

LSA: Linguistic Society of America

Cantonese Corpus

Only corpora accessible online are listed here; for a full list, see Wong Tak-sum’s page.

  • Adult corpus
Year | Corpus | Paper
2020? | CantoMap (The Cantonese MapTask corpus) | Winterstein, Tang & Lai (2020)
2017 | MYCanCor (Malaysia Cantonese Corpus) (not available yet) | Liesenfeld (2017)
2016 | UD Cantonese HK |
2012 | PolyU Corpus of Spoken Chinese: Cantonese |
2010 | 广州话口语有声语料库 (Cantonese Spoken Corpus with Audio) |
1997-1998 | HKCanCor (Hong Kong Cantonese Corpus), see also PyCantonese | Luke & Wong (2015)
unknown | CantoneseWaC |
  • Child corpus
Year | Corpus | Paper
2015-2018 | Leo Corpus (Mandarin-English-Cantonese) |
2010s | Bilingual Child Heritage Chinese Corpus (Chinese-English) |
2000s | The Hong Kong Bilingual Child Language Corpus, or here (Cantonese-English) |
2000s | Paidologos Corpus: Cantonese |
1990s | HKU-70 Corpus |
1990s | CANCORP (Hong Kong Cantonese Child Language Corpus), or here | Lee et al. (1996)
1985 | Guthrie Bilingual Corpus (Chinese-English) |
  • Diachronic corpus
Year | Corpus | Paper
1940s~1970s | The Corpus of Mid-20th Century Hong Kong Cantonese (Phase 1&2) |
1870s~1930s | Early Cantonese Tagged Database |
1860s~1920s | The Early Cantonese Bible Database |
1860s~1920s | Database of Early Chinese Dialects (Cantonese, Hakka, Mandarin, Hokkien, Wu) |
1860s~1890s | Database of the 19th Century (1865-1894) Cantonese Christian Writings |
1820s~1920s | Early Cantonese Colloquial Texts: A Database |
  • Database
Year | Corpus | Paper
2021 | Cantonese Wordnet, or here | Sio & da Costa (2019)
2020 | Cifu | Lai & Winterstein (2020)
2010s | Cantonese 4-word Idiomatic Expressions Excerpts and Audio Recording Database |
2012 | English Loanwords in Hong Kong Cantonese |
2001 | A Comparative Database of Modern Chinese and Cantonese |
1910s~1990s | 粵音資料集叢 (A Database of Literature on Cantonese Phonology) |

Cantonese linguists

a self-serving handy list of online profiles; please remind me if I’ve missed any

Affiliation | Linguist | Specialization (not exhaustive)
HKU | Robert S. BAUER | phonology
HKUST | Ka-wing CHAN |
OSU | Marjorie K. M. CHAN | phonetics & phonology, dialectology
UPenn | May Pik Yu CHAN | phonetics & phonology, music & language
CUHK | Song Hing CHANG | dialectology, phonology
LEI | Lisa L.-S. CHENG | syntax & its interfaces
CUHK | Siu-Pong CHENG | syntax
HKPolyU | Candice Chi-Hang CHEUNG | clinical linguistics, syntax
EdUHK | Hin Tat CHEUNG | acquisition, language disorders
CUHK | Hung-nin Samuel CHEUNG | historical linguistics, dialectology
HKPolyU SPEED | Kwan-hin CHEUNG | Cantonese Opera, tone-melody interface
CUHK | Lawrence Yam-leung CHEUNG | syntax, corpus linguistics
EdUHK | Andy C. CHIN | typology, sociolinguistics
HKBU | Winnie CHOR | cognitive linguistics, conversation analysis
UBC | Una CHOW | phonetics
Hokudai | Yurie HARA | semantics & pragmatics, phonology
UP7 | Jiaying HUANG | syntax, psycholinguistics
TMU | Maki IIDA | grammaticalization, SFPs
EdUHK | Shin KATAOKA | historical linguistics
CUHK | Bit-Chee KWOK | dialectology, historical linguistics
UChicago | Jackie Yan-Ki LAI | syntax, morphology
CUHK | Regine LAI | phonology
UCSB | Ryan Ka Yau LAI | computational linguistics
EdUHK | Yik-Po LAI | typology, grammaticalization
HSUHK | Charles LAM | syntax & semantics
OUHK | Cherry Chit-Yu LAM | comparative syntax
UOM | Chit Fung (Lawrence) LAM | syntax, bilingualism
UBC | Zoe LAM | phonology, heritage languages
EdUHK | Chaak-ming LAU | NLP, lexicography, phonology
UCSC | Jess H. K. LAW | semantics & pragmatics
EdUHK | Albert LEE | phonetics, L2 acquisition
CUHK, TNU | Hun-tak Thomas LEE | acquisition, syntax & semantics
| Jackson L. LEE | computational linguistics, phonology
CityUHK | John LEE | computational linguistics
UOH | Man Ki Theodora LEE | syntax & semantics
UConn | Margaret Chui Yi LEE | semantics, acquisition
CityUHK | Po Lun Peppina LEE | semantics
USC | Tommy Tsz-Ming LEE | syntax & semantics
CityUHK | Wai Sum LEE | phonetics & phonology
CUHK | Margaret LEI | acquisition, syntax & semantics
U of T | Justin LEUNG | heritage languages
UAEU | Tommi LEUNG | syntax, phonology
UBC | Roger Yu-Hsiang LO | phonetics & phonology, heritage languages
NTU | Kang-kwong LUKE | conversation analysis
CityUHK | Suen Caesar LUN | sociolinguistics
CityUHK | Ziyin MAI | L2 acquisition, bilingualism
HKU | Stephen MATTHEWS | typology, bilingualism
CUHK | Peggy Pik Ki MOK | phonetics
GDUFS | Dingxu SHI | syntax & semantics
UP | Joanna Ut-Seong SIO | syntax & semantics
UBC | Rachel SOO | phonetics
LEI | Matthew SUNG | dialectology
LEI | Rint SYBESMA | syntax
CUHK | Kin Man Carmen TANG | syntax processing, music & language
CUHK | Sze-Wing TANG | comparative syntax
CUHK | Kwok Wai TING | historical phonology
VTCHK | Man Shan TONG | syntax-phonology interface
UCSD | Crono Ming San TSE | syntax & semantics
Ronin | Keith TSE | syntax, variation
CityUHK, HKUST | Benjamin K. TSOU | variation, computational linguistics
HKBU | John C. WAKEFIELD | sociolinguistics, syntax
CUHK | Patrick C.M. WONG | neurolinguistics
HKPolyU | Tak-sum WONG | historical phonology, computational linguistics
HSUHK | Yiu Kwan WONG | historical phonology
Shue Yan | Yike YANG | phonetics, acquisition
CUHK(SZ) | Foong Ha YAP | language change, typology
UCL | Moira YIP | phonology
CUHK | Virginia YIP | bilingualism, acquisition
UW | Anne YUE-HASHIMOTO | grammar & phonology, typology
HKPolyU | Caicai ZHANG | neurolinguistics, phonetics & phonology
EdUHK | Ling ZHANG | phonetics & phonology