The notion of syllables
Much like Bengali or Hindi, Thai reading is based on the concept of syllables. A Thai syllable is made of an initial consonant, a vowel, and an optional final consonant.
A tonal language
Thai is a tonal language, meaning the same sound has different meanings depending on the tone. There are 5 tones in Thai: low, mid, high, falling and rising.
The Thai alphabet reflects this complexity. Several consonants exist for the same sound, and vowels come in pairs to distinguish short and long sounds.
There are 44 consonants and 18 "simple" vowels, plus a dozen "complex" vowels.
Spaces
Thai is written with few or no spaces between words, so it is often difficult to know where words start and end.
Transcription
Transcription rules from the Thai alphabet to the Latin alphabet are explained here: Royal Thai General System of Transcription
Unlike the transcription of alphabets such as Cyrillic or Greek, Latin transcription of Thai loses a lot of information, for example vowel length (short/long) and tones.
It is also important to note that there may be differences between what is taught in Thai classes (which focus on pronunciation) and transcription.
Since our goal here is to link the characters you find in Street View with the data shown on Google Maps, we will focus strictly on transcription.
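To make the information loss concrete, here is a minimal sketch in Python. The table is a hand-picked sample of syllables built on พ, not a complete transcription table, and the function name thai_candidates is made up for this illustration. It shows that a short and a long vowel end up with the same Latin spelling, so the Latin form cannot be converted back to a single Thai spelling.

```python
# Sample only: short and long vowels share one Latin letter in transcription.
SYLLABLE_TO_LATIN = {
    "พะ": "pha",  # short a
    "พา": "pha",  # long a
    "พิ": "phi",  # short i
    "พี": "phi",  # long i
    "พุ": "phu",  # short u
    "พู": "phu",  # long u
}


def thai_candidates(latin):
    """Return every Thai spelling in the sample that is transcribed as `latin`."""
    return [thai for thai, rom in SYLLABLE_TO_LATIN.items() if rom == latin]


# Each Latin spelling maps back to two different Thai spellings.
for latin in ("pha", "phi", "phu"):
    print(latin, "<-", thai_candidates(latin))
```

When you search for a place name, this is why a single Latin spelling on the map can correspond to several different spellings on a sign.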
Basics of Syllables
The initial consonant is the central part of the syllable, and the vowel is placed somewhere around it.
As an example, we will take the consonant พ = ph and add a vowel to it:
the vowel โ = o is written on the left: โพ = pho.
the vowel -ี = i is written above the consonant, like an accent: พี = phi.
the vowel -ุ = u is written below the consonant, like a subscript: พุ = phu.
the vowel ะ = a is written after the consonant: พะ = pha.
the vowel เ-ะ = e is written around the consonant: เพะ = phe.
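The five placements above can be sketched as a small lookup table. This is only an illustration: the names VOWEL_PATTERNS and build_syllable are made up here, "-" marks the slot for the initial consonant, and only the five vowels from the examples are covered. Note that in a text string a vowel written on the left is also stored before the consonant.

```python
# "-" is the slot for the initial consonant. Sample of five vowels only.
VOWEL_PATTERNS = {
    "o": "โ-",   # written to the left of the consonant
    "i": "-ี",   # written above the consonant
    "u": "-ุ",   # written below the consonant
    "a": "-ะ",   # written after the consonant
    "e": "เ-ะ",  # written on both sides of the consonant
}


def build_syllable(consonant, latin_consonant, vowel):
    """Return (Thai spelling, transcription) for consonant + vowel."""
    thai = VOWEL_PATTERNS[vowel].replace("-", consonant)
    return thai, latin_consonant + vowel


for vowel in VOWEL_PATTERNS:
    print(build_syllable("พ", "ph", vowel))
```

The same table works for any other initial consonant, as long as you supply its Latin value yourself.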
The "non-consonant" aw-ang: อ
Since Thai vowels are always written around a consonant, the letter อ is used to write a vowel on its own. It means the syllable has no initial consonant.
It is commonly used in words starting with a vowel, or when several vowels follow each other.
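In transcription terms, อ in initial position simply contributes nothing. A minimal sketch, assuming a syllable made of exactly one consonant followed by one vowel sign (the tables are tiny samples and the function name is hypothetical):

```python
# อ occupies the consonant slot but adds no Latin letter.
INITIAL_TO_LATIN = {"พ": "ph", "น": "n", "อ": ""}

# Sample of vowels that are stored after the consonant.
VOWEL_TO_LATIN = {
    "ะ": "a",
    "า": "a",
    "\u0e35": "i",  # the vowel written above the consonant in พี
}


def transcribe_open_syllable(syllable):
    """Transcribe a two-character syllable: consonant, then vowel sign."""
    if len(syllable) != 2:
        raise ValueError("this sketch only handles consonant + one vowel sign")
    consonant, vowel = syllable
    return INITIAL_TO_LATIN[consonant] + VOWEL_TO_LATIN[vowel]


print(transcribe_open_syllable("พา"))  # ph + a
print(transcribe_open_syllable("อา"))  # no consonant, only the vowel
```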
Mute accent Gaaran
The mute accent Gaaran tells you that the letter below it is silent and is not transcribed. It is most often found at the end of a word, for example:
ศุกร์ = suk = Friday. The final r is neither pronounced nor transcribed.
เสาร์ = sao = Saturday. The final r is neither pronounced nor transcribed.
อาทิตย์ = athit = Sunday. The final y is neither pronounced nor transcribed.
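Before transcribing a word, the silenced letter can be dropped mechanically. The sketch below assumes the simplest case, where the mark (Unicode character U+0E4C) sits on a single bare consonant, as in the three examples above. Words where the silenced part is longer are not handled, and drop_silenced is a name chosen for this illustration.

```python
import re

# One Thai consonant (U+0E01 to U+0E2E) directly followed by the mute mark U+0E4C.
SILENCED = re.compile("[\u0e01-\u0e2e]\u0e4c")


def drop_silenced(word):
    """Remove every consonant that carries the mute mark."""
    return SILENCED.sub("", word)


for word in ["ศุกร์", "เสาร์", "อาทิตย์"]:
    print(word, "->", drop_silenced(word))
```

What remains (ศุก, เสา, อาทิต) is the part that is actually transcribed as suk, sao and athit.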
Final consonants
In Thai, the same letter can be pronounced and transcribed differently depending on whether it is an initial or a final consonant.
For example, พ = ph as an initial consonant becomes พ = p as a final consonant.
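One simple way to keep track of this is a table with two values per letter. The sketch below lists only a handful of consonants with their initial and final values in the Royal Thai General System of Transcription. It is a sample, not the full table, and the names are made up for this illustration.

```python
# letter -> (value as initial consonant, value as final consonant). Sample only.
CONSONANTS = {
    "พ": ("ph", "p"),
    "บ": ("b", "p"),
    "ด": ("d", "t"),
    "ส": ("s", "t"),
    "ร": ("r", "n"),
    "ล": ("l", "n"),
    "น": ("n", "n"),
    "ก": ("k", "k"),
}


def romanize_consonant(letter, position):
    """Return the Latin value of `letter` at position 'initial' or 'final'."""
    if position not in ("initial", "final"):
        raise ValueError("position must be 'initial' or 'final'")
    initial, final = CONSONANTS[letter]
    return initial if position == "initial" else final


print(romanize_consonant("พ", "initial"))  # ph
print(romanize_consonant("พ", "final"))    # p
```

Notice that several different letters collapse to the same final value (p, t, n or k), which is another place where transcription loses information.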
Consonants without vowels
When several consonants are next to each other without a written vowel, there are two possible cases:
- It could be a cluster. A cluster is a set of 2 consonants pronounced together at the start of a syllable. In Thai, clusters end with r, l or w: kr, khr, pr, phr, tr, kl, khl, pl, phl, kw, khw...
- It could be invisible vowels. This is a peculiarity of Thai: some vowels are not written, but they are still pronounced and transcribed.
Invisible vowels are "a" or "o". There are 2 rules, but they do not cover all cases, so sometimes you will have to guess.
Rule 1: two-syllable words with no written vowel at all take "a" in the first syllable and "o" in the second, as in the word นคร = nakhon = city.
Rule 2: single-syllable words with no written vowel take "o".
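The two rules can be sketched in a few lines of Python. This is a simplification under a loud assumption: the code treats every 2-consonant word as one syllable (rule 2) and every 3-consonant word as two syllables (rule 1). That is not always true, since a 3-consonant word can also be a cluster plus a final consonant (กรม = krom), which is exactly where you have to guess. The table is a small sample and the function name is hypothetical.

```python
# letter -> (value as initial consonant, value as final consonant). Sample only.
CONSONANTS = {
    "น": ("n", "n"),
    "ค": ("kh", "k"),
    "ร": ("r", "n"),
    "ม": ("m", "m"),
}


def transcribe_vowelless(word):
    """Apply rule 1 or rule 2 to a word written with consonants only."""
    if any(letter not in CONSONANTS for letter in word):
        raise ValueError("letter missing from the sample table")
    if len(word) == 2:
        # Assumed single syllable: rule 2 inserts "o".
        first, last = word
        return CONSONANTS[first][0] + "o" + CONSONANTS[last][1]
    if len(word) == 3:
        # Assumed two syllables: rule 1 inserts "a", then "o".
        first, second, last = word
        return (CONSONANTS[first][0] + "a"
                + CONSONANTS[second][0] + "o"
                + CONSONANTS[last][1])
    raise ValueError("this sketch only handles 2 or 3 consonants")


print(transcribe_vowelless("นคร"))  # rule 1: nakhon
print(transcribe_vowelless("คน"))   # rule 2: khon
```

Note how the final ร of นคร comes out as n, following the initial/final distinction for consonants.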