How to learn Cantonese step by step

Learning Cantonese can feel overwhelming at first: six tones, thousands of characters, and a reputation as one of the hardest languages for English speakers. But with the right approach, Cantonese is absolutely achievable. The key is following a clear, step-by-step path that builds each skill on top of the last.
This guide breaks down the Cantonese learning journey into seven practical steps, from your very first day to confident conversation. Whether you're a complete beginner or someone who grew up hearing Cantonese but never formally studied it, this roadmap will help you make consistent, measurable progress.
Step 1: master the six tones
Tones are the foundation of Cantonese. Every syllable is spoken with one of six tones, and using the wrong tone can change the meaning of a word completely. This is not optional or something you can "figure out later." Start with tones from day one.
The six Cantonese tones are:
- Tone 1 (high level): your pitch stays high and flat. Example: 詩 (si1, poem). Think of the pitch you use when saying "hmmm" while thinking.
- Tone 2 (mid rising): your pitch starts in the middle and rises. Example: 史 (si2, history). Similar to the rising intonation at the end of an English question.
- Tone 3 (mid level): your pitch stays at a comfortable middle level. Example: 試 (si3, try). Like your normal speaking voice held steady.
- Tone 4 (low falling): your pitch starts low and falls slightly. Example: 時 (si4, time). Similar to the low, dropping tone you might use when saying "oh" in disappointment.
- Tone 5 (low rising): your pitch starts low and rises. Example: 市 (si5, city). Like asking a surprised question with a low start.
- Tone 6 (low level): your pitch stays low and flat. Example: 是 (si6, yes). Like muttering something in a low, steady voice.
The most effective way to practice tones is through minimal pairs: words that differ only in tone. Listen to native speakers pronounce each tone, then record yourself and compare. Many learners find it helpful to associate each tone with a physical gesture or a familiar English intonation pattern.
Don't worry about perfection right away. Tone perception and production improve dramatically with exposure. The important thing is to be aware of tones from the start and actively practice them with every new word you learn. Within a few weeks of daily practice, you'll start hearing the distinctions naturally.
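As a rough illustration of the minimal pair idea, the sketch below groups a small word list by base syllable, so that every group contains words that differ only in tone. The word list reuses the examples from this guide; the function name `minimal_pair_sets` and the tuple layout are assumptions made for this example, not part of any particular app or library.

```python
from collections import defaultdict

# Example words from this guide: (character, Jyutping, gloss).
WORDS = [
    ("詩", "si1", "poem"),
    ("史", "si2", "history"),
    ("試", "si3", "try"),
    ("時", "si4", "time"),
    ("市", "si5", "city"),
    ("是", "si6", "yes"),
    ("茶", "caa4", "tea"),  # no tonal partner in this list, so it is filtered out
]

def minimal_pair_sets(words):
    """Group words whose Jyutping differs only in the final tone digit."""
    groups = defaultdict(list)
    for char, jyutping, gloss in words:
        base, tone = jyutping[:-1], int(jyutping[-1])
        groups[base].append((tone, char, gloss))
    # Keep only syllables that have at least two tones to contrast.
    return {base: sorted(items) for base, items in groups.items() if len(items) >= 2}

sets = minimal_pair_sets(WORDS)
for base, items in sets.items():
    print(base, [f"{char} (tone {tone})" for tone, char, _ in items])
```

Each printed group is a ready-made drill: listen to the words in order, then record yourself saying them and compare.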
Step 2: learn Jyutping romanization
Jyutping is the standard romanization system for Cantonese, developed by the Linguistic Society of Hong Kong. It uses the Latin alphabet plus a number (1 through 6) to represent every possible Cantonese syllable and its tone. Learning Jyutping is like learning the phonetic alphabet for Cantonese: once you know it, you can pronounce any word correctly just from its Jyutping spelling.
For example, the word 你好 (hello) is written in Jyutping as "nei5 hou2." This tells you that 你 is pronounced "nei" with tone 5 (low rising) and 好 is pronounced "hou" with tone 2 (mid rising).
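To make the "letters plus tone number" structure concrete, here is a minimal sketch that splits a Jyutping string into syllables and tone numbers. It only checks the overall shape (lowercase letters followed by a digit from 1 to 6) and does not verify that the letters form a real Cantonese syllable. The names `parse_jyutping` and `TONE_NAMES` are made up for this example; the tone names are the ones used in this guide.

```python
import re

# One Jyutping syllable: Latin letters followed by a single tone digit 1-6.
SYLLABLE = re.compile(r"([a-z]+)([1-6])")

# Tone names as used in this guide.
TONE_NAMES = {
    1: "high level",
    2: "mid rising",
    3: "mid level",
    4: "low falling",
    5: "low rising",
    6: "low level",
}

def parse_jyutping(text):
    """Split a Jyutping string into (letters, tone number, tone name) tuples."""
    result = []
    for token in text.lower().split():
        match = SYLLABLE.fullmatch(token)
        if match is None:
            raise ValueError(f"not a Jyutping syllable: {token!r}")
        letters, tone = match.group(1), int(match.group(2))
        result.append((letters, tone, TONE_NAMES[tone]))
    return result

print(parse_jyutping("nei5 hou2"))
# [('nei', 5, 'low rising'), ('hou', 2, 'mid rising')]
```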
Some key Jyutping conventions to know:
- The letter "c" represents a "ts" sound (like the "ts" in "cats"). So 茶 (caa4, tea) sounds like "tsah."
- The letter "j" represents a "y" sound. So 人 (jan4, person) sounds like "yan."
- The combination "ng" can appear at the beginning of a syllable, which doesn't happen in English. Practice saying "ng" by starting from the end of the word "sing" and dropping the "si." The Cantonese word for five, 五 (ng5), is just this sound with tone 5.
- Final consonants like "p," "t," and "k" are unreleased, meaning you close your mouth or throat but don't push out any air. The word 一 (jat1, one) ends with your tongue touching the roof of your mouth but no puff of air.
There's also the Yale romanization system, which was historically popular in academic settings and textbooks. Yale uses diacritics and the letter "h" to indicate tones instead of numbers. For example, nei5 hou2 in Jyutping becomes néih hóu in Yale. Both systems are valid, and many modern resources like YumCha support both, letting you choose whichever feels more intuitive.
Spend a week or two getting comfortable with Jyutping before moving on. Practice reading Jyutping syllables aloud, and you'll have a powerful tool for learning pronunciation accurately throughout your entire Cantonese journey.
Step 3: build core vocabulary
With tones and Jyutping under your belt, it's time to start building your vocabulary. Focus on high-frequency words that you'll use every day. Word frequency is heavily skewed: a commonly cited rule of thumb is that a few hundred of the most common words account for a large share of everyday conversation, so your early vocabulary choices have an outsized impact.
Start with these essential categories:
Pronouns: 我 (ngo5, I/me), 你 (nei5, you), 佢 (keoi5, he/she), 我哋 (ngo5 dei6, we), 你哋 (nei5 dei6, you plural), 佢哋 (keoi5 dei6, they).
Common verbs: 係 (hai6, to be), 有 (jau5, to have), 去 (heoi3, to go), 嚟 (lai4, to come), 食 (sik6, to eat), 飲 (jam2, to drink), 睇 (tai2, to look/watch), 聽 (teng1, to listen), 講 (gong2, to speak), 買 (maai5, to buy), 做 (zou6, to do), 知 (zi1, to know).
Question words: 咩 (me1, what), 邊度 (bin1 dou6, where), 點解 (dim2 gaai2, why), 幾時 (gei2 si4, when), 邊個 (bin1 go3, who), 幾多 (gei2 do1, how many), 點樣 (dim2 joeng2, how).
Daily life words: 水 (seoi2, water), 錢 (cin2, money), 嘢 (je5, thing/stuff), 屋企 (uk1 kei2, home), 朋友 (pang4 jau5, friend), 工作 (gung1 zok3, work), 學校 (hok6 haau6, school).
Use spaced repetition to lock these words into long-term memory. This means reviewing words at gradually increasing intervals: first after a few hours, then a day, then three days, then a week. Apps with a built-in spaced repetition system (SRS) make this effortless because they automatically schedule each review for you.
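The interval ladder described above can be sketched in a few lines. This is a simplified illustration, not how any specific app schedules reviews: "a few hours" is assumed to mean four hours, and the rule that a forgotten word drops back to the first interval is an extra assumption that the text does not state. The names `INTERVALS`, `due_time`, and `next_step` are hypothetical.

```python
from datetime import datetime, timedelta

# Intervals from the text; "a few hours" is assumed to be 4 hours here.
INTERVALS = [
    timedelta(hours=4),
    timedelta(days=1),
    timedelta(days=3),
    timedelta(days=7),
]

def due_time(reviewed_at, step):
    """Return when a word at the given step should be reviewed next."""
    return reviewed_at + INTERVALS[step]

def next_step(step, remembered):
    """Move up one interval after a correct answer, back to the start after a miss.

    The reset-on-miss rule is an assumption of this sketch.
    """
    if remembered:
        return min(step + 1, len(INTERVALS) - 1)
    return 0

learned_at = datetime(2024, 1, 1, 9, 0)
step = 0
first_review = due_time(learned_at, step)     # 4 hours after learning
step = next_step(step, remembered=True)       # word recalled, move to step 1
second_review = due_time(first_review, step)  # 1 day after the first review
print(first_review, second_review)
```

Real SRS apps typically adjust intervals per word based on how well you remember it; a fixed ladder like this one is just the simplest version of the same idea.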
Step 4: learn basic grammar patterns
Cantonese grammar is in many ways more straightforward than that of many European languages. There are no verb conjugations, no grammatical gender, and no plural forms for nouns. Time is indicated by context, time words, and aspect markers rather than by changing the verb. This means you can start forming sentences very early in your learning journey.
The basic sentence structure is Subject + Verb + Object, the same as English. 我食飯 (ngo5 sik6 faan6) means "I eat rice." Simple and direct.
Key grammar concepts to learn early:
Negation: use 唔 (m4) before most verbs to negate them. 我唔食 (ngo5 m4 sik6) means "I don't eat." For "don't have," use 冇 (mou5): 我冇錢 (ngo5 mou5 cin2) means "I don't have money."
Measure words (classifiers): when you put a number before a noun, you need a measure word in between. The most common one is 個 (go3), which works as a general-purpose classifier. 一個人 (jat1 go3 jan4) means "one person." Other common classifiers include 杯 (bui1) for cups and drinks, 碟 (dip2) for plates of food, and 本 (bun2) for books.
Aspect markers: instead of verb tenses, Cantonese uses particles to indicate the state of an action. 咗 (zo2) indicates a completed action: 我食咗 (ngo5 sik6 zo2) means "I have eaten." 緊 (gan2) indicates an ongoing action: 我食緊 (ngo5 sik6 gan2) means "I am eating." 過 (gwo3) indicates a past experience: 我食過 (ngo5 sik6 gwo3) means "I have eaten (this before)."
Sentence-final particles: these are a hallmark of spoken Cantonese and add nuance to your speech. 啦 (laa1) softens a statement or makes a suggestion, 㗎 (gaa3) emphasizes something as obvious, and 呀 (aa3) makes a sentence gentler. Start by listening for these particles in native speech and gradually incorporate them into your own.
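Because these patterns are fixed word orders, they can be written out as simple string templates. The sketch below is only a way to visualize the negation, classifier, and aspect marker patterns using the examples from this section; the helper names (`combine`, `with_aspect`, `negate`, `count`) are invented for illustration, and real usage has exceptions that a template cannot capture.

```python
# (character, Jyutping) pairs taken from the examples in this guide.
ME = ("我", "ngo5")
EAT = ("食", "sik6")

ASPECT_MARKERS = {
    "completed": ("咗", "zo2"),
    "ongoing": ("緊", "gan2"),
    "experienced": ("過", "gwo3"),
}

def combine(*parts):
    """Join (character, Jyutping) pairs into one (text, Jyutping) pair."""
    chars = "".join(char for char, _ in parts)
    jyutping = " ".join(jp for _, jp in parts)
    return chars, jyutping

def with_aspect(subject, verb, aspect):
    """Subject + Verb + aspect marker; the marker follows the verb directly."""
    return combine(subject, verb, ASPECT_MARKERS[aspect])

def negate(subject, verb):
    """Subject + 唔 + Verb. Not for 有 (to have), whose negative is 冇."""
    return combine(subject, ("唔", "m4"), verb)

def count(number, classifier, noun):
    """Number + classifier + noun."""
    return combine(number, classifier, noun)

print(with_aspect(ME, EAT, "completed"))  # ('我食咗', 'ngo5 sik6 zo2')
print(with_aspect(ME, EAT, "ongoing"))    # ('我食緊', 'ngo5 sik6 gan2')
print(negate(ME, EAT))                    # ('我唔食', 'ngo5 m4 sik6')
print(count(("一", "jat1"), ("個", "go3"), ("人", "jan4")))  # ('一個人', 'jat1 go3 jan4')
```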
Step 5: practice listening and speaking
Cantonese is a spoken language first and foremost. Many learners make the mistake of focusing too heavily on reading and writing while neglecting their ears and mouth. Active listening and speaking practice should be part of your daily routine from the very beginning.
For listening practice, start with content designed for learners: slow, clear speech with familiar vocabulary. As your level improves, gradually transition to native content. Hong Kong movies and TVB dramas are excellent because they feature natural conversational Cantonese. Start with Chinese and English subtitles, then switch to Chinese subtitles only, and eventually try watching without subtitles.
For speaking practice, don't wait until you feel "ready." Start speaking from day one, even if it's just repeating words and phrases to yourself. Read your vocabulary lists aloud, paying careful attention to tones. Shadow native speakers by playing audio and trying to match their pronunciation in real time.
If possible, find a conversation partner. This could be a tutor on platforms like italki, a language exchange partner, or a Cantonese-speaking friend or family member. Even 15 minutes of conversation practice per week makes a significant difference in your speaking confidence and fluency.
Speech recognition technology has also become a valuable tool for pronunciation practice. Apps like YumCha use speech recognition to give you instant feedback on your pronunciation and tones, so you can practice speaking even when a conversation partner isn't available.
Step 6: learn to read and write characters
Chinese characters might seem like the most intimidating part of learning Cantonese, but they're also one of the most rewarding. Each character is a small piece of art that often carries clues to its meaning within its visual structure.
Cantonese primarily uses Traditional Chinese characters, which are more visually complex than the Simplified characters used in mainland China. While this means more strokes per character, Traditional characters are often praised for being more logical and easier to distinguish from one another.
Start by learning radicals: the building blocks that make up characters. The traditional Kangxi system lists 214 radicals, but learning the most common 50 will help you understand the structure of hundreds of characters. For example, the radical 水 (water, often written as 氵 when it appears on the left side of a character) appears in characters related to water and liquids: 海 (hoi2, sea), 河 (ho4, river), 湯 (tong1, soup).
Learn characters in context rather than in isolation. When you learn a new word, learn its character along with its Jyutping and meaning. Seeing the character used in sentences and real contexts helps it stick in your memory much better than rote memorization of individual characters.
For writing practice, start with stroke order. Every character has a specific stroke order that, once internalized, makes writing feel natural and fluid. Many learning apps and resources include stroke order animations that guide you through each character.
A realistic goal for your first year is to learn approximately 500 to 800 characters. This covers most of the characters you'll encounter in everyday contexts like restaurant menus, street signs, text messages, and social media.
Step 7: use a structured learning app
While all of the steps above can be done with free resources, using a well-designed learning app ties everything together into a coherent, progressive curriculum. The best Cantonese apps integrate tones, Jyutping, vocabulary, grammar, listening, and speaking into a single structured experience.
YumCha was built specifically for Cantonese learners and addresses many of the challenges that general-purpose language apps fail to handle well. It offers full Jyutping and Yale romanization support with the ability to toggle between them, spaced repetition for vocabulary retention, speech recognition for tone and pronunciation practice, an AI conversation tutor for natural dialogue practice, and support for both Traditional and Simplified characters.
Having a structured app gives you a daily practice routine without the decision fatigue of figuring out what to study next. Just open the app, do your next lesson, review your vocabulary, and practice your speaking. Fifteen to twenty minutes a day is enough to make steady progress.
Setting realistic expectations
Language learning is a marathon, not a sprint. Here's a rough timeline of what you can expect with consistent daily practice of 20 to 30 minutes:
- After 1 month: you can greet people, introduce yourself, count, order food, and ask basic questions. You recognize the six tones and can produce them with reasonable accuracy.
- After 3 months: you can have simple conversations about daily life topics. You know 200 to 300 words and can read basic signs and menus. Your tone accuracy is improving.
- After 6 months: you can handle most everyday situations in Cantonese. You understand the gist of conversations between native speakers when they speak clearly. You know 500+ words.
- After 1 year: you can hold extended conversations on familiar topics, understand most of what you hear in daily life, and read simple texts. You're solidly at an intermediate level.
These timelines assume consistent daily practice. Missing days is normal, but try to maintain a streak as much as possible. Even five minutes on a busy day is better than skipping entirely, because it keeps the language active in your brain.
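To put the timeline in perspective, the short calculation below converts 20 to 30 minutes a day into total practice hours at each milestone. It assumes practice on every single day and rounds months to 30 days, so treat the results as upper bounds rather than predictions. The names `MILESTONE_DAYS` and `practice_hours` are illustrative only.

```python
# Approximate day counts per milestone (months rounded to 30 days).
MILESTONE_DAYS = {"1 month": 30, "3 months": 90, "6 months": 180, "1 year": 365}

def practice_hours(days, minutes_per_day):
    """Total hours of practice, assuming no missed days."""
    return days * minutes_per_day / 60

for label, days in MILESTONE_DAYS.items():
    low = practice_hours(days, 20)
    high = practice_hours(days, 30)
    print(f"{label}: {low:.1f} to {high:.1f} hours")
```

Seen this way, a full year of daily practice comes to well under 200 hours, which is why the pace matters less than not skipping days.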
Your first week action plan
To make this concrete, here's exactly what to do in your first week of learning Cantonese:
- Days 1 and 2: Learn the six tones. Listen to examples of each tone, practice producing them, and start recognizing the differences. Download a Cantonese learning app like YumCha and complete the introductory lessons.
- Days 3 and 4: Study the Jyutping system. Practice reading Jyutping syllables aloud. Learn your first 20 words: basic greetings (你好 nei5 hou2, 多謝 do1 ze6, 唔該 m4 goi1), numbers one through ten, and essential phrases.
- Days 5 and 6: Expand to 40 words including pronouns, common verbs (去 heoi3, 食 sik6, 飲 jam2), and question words (咩 me1, 邊度 bin1 dou6). Practice forming simple sentences.
- Day 7: Review everything from the week. Test yourself on tones and vocabulary. Watch a short Cantonese video and see how many words you can pick out. Celebrate your progress.
The best way to learn Cantonese is to start today and keep going. Every word you learn, every tone you practice, and every sentence you speak brings you closer to fluency. Cantonese is a beautiful, expressive, and culturally rich language that rewards dedicated learners with incredible experiences and connections. 加油 (gaa1 jau4)!


