Distribution of consonants in onset and coda positions

The following tables of segmental frequencies were extracted from the phonetically transcribed part of the Dutch Celex database (Baayen et al. 1995). The syllable boundaries provided in Celex were used. Each syllable was parsed into a positional syllable template differentiating onset, nucleus and coda positions. The numbers in the following tables are based on the number of entities per syllable position.
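The positional parsing described above can be illustrated with a minimal Python sketch. It is not the procedure actually used for these tables. It assumes that each segment is written as a single character, that syllable boundaries are marked with a hyphen, and that the vowel inventory is the small toy set `VOWELS`. All names in it (`VOWELS`, `parse_syllable`, `parse_word`) are hypothetical.

```python
# Toy vowel inventory: an assumption for illustration only,
# not the symbol set used in the Celex transcriptions.
VOWELS = set("aeiouAEIOU@")


def parse_syllable(syllable):
    """Split one syllable into (onset, nucleus, coda) lists of segments.

    Assumes one character per segment. Everything before the first vowel
    is the onset, everything after the last vowel is the coda.
    """
    segments = list(syllable)
    vowel_positions = [i for i, seg in enumerate(segments) if seg in VOWELS]
    if not vowel_positions:
        # No vowel found: this sketch simply puts all segments in the onset.
        return segments, [], []
    first, last = vowel_positions[0], vowel_positions[-1]
    return segments[:first], segments[first:last + 1], segments[last + 1:]


def parse_word(transcription, boundary="-"):
    """Parse a syllabified transcription into one template per syllable."""
    return [parse_syllable(syl) for syl in transcription.split(boundary) if syl]
```

For example, `parse_word("stra-t@n")` yields one `(onset, nucleus, coda)` triple per syllable, with `s t r` in the onset of the first syllable and `n` in the coda of the second.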

Please note that ambisyllabic consonants are not tagged as such in the Celex database. They are consistently classified as onset consonants, which means that B-class vowels in polysyllabic words appear in open syllables in the Celex transcriptions. As a result, the numbers presented for coda consonants may be skewed, because these consonants add to the onset counts and never to the coda counts.

Furthermore, Celex specifies a (word) frequency count of zero for 486 of the 5380 cases, although these words are present in the Celex database. The frequency count of zero was carried over to the syllable counts.

A searchable xls-file with the raw Celex count data can be found here. Examples are provided for each syllable type. Moreover, the data set can be filtered with respect to word type (monosyllabic or polysyllabic word), stress type (stressed or unstressed syllable), each syllable position and all combinations of those elements. Celex token and type frequencies of the filtered data are given in the top left corner of the xls-file.


The following two tables list the absolute type and token frequencies (see grand totals) of each consonantal segment given in the (phonetically transcribed part of the) Celex database. The absolute frequencies are additionally split into onset and coda positions (see total). Furthermore, the relative frequencies of how often a segment occurs in onset vs. coda position are given (see % of grand total) for types and tokens.
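The layout of these tables can be sketched in Python as follows. This is an illustration, not the script that produced the published numbers. It assumes that every occurrence of a segment in a word contributes 1 to the type count and that word's Celex frequency to the token count, so words with a frequency of zero count as types but add no tokens. The function name `position_frequencies` and the input format are hypothetical.

```python
from collections import defaultdict


def position_frequencies(observations):
    """Tabulate type and token frequencies per segment and position.

    observations: iterable of (segment, position, word_frequency) tuples,
    where position is "onset" or "coda" (an assumed input format).
    """
    # counts[segment][position] = [type_count, token_count]
    counts = defaultdict(lambda: {"onset": [0, 0], "coda": [0, 0]})
    for segment, position, word_freq in observations:
        cell = counts[segment][position]
        cell[0] += 1          # one more type
        cell[1] += word_freq  # tokens weighted by word frequency

    def percent(part, whole):
        # Avoid dividing by zero when all word frequencies are zero.
        return 100 * part / whole if whole else 0.0

    table = {}
    for segment, by_pos in counts.items():
        grand_types = by_pos["onset"][0] + by_pos["coda"][0]
        grand_tokens = by_pos["onset"][1] + by_pos["coda"][1]
        row = {"grand_total": (grand_types, grand_tokens)}
        for position in ("onset", "coda"):
            types, tokens = by_pos[position]
            row[position] = {
                "total": (types, tokens),
                "pct_of_grand_total": (
                    percent(types, grand_types),
                    percent(tokens, grand_tokens),
                ),
            }
        table[segment] = row
    return table
```

Each row holds the grand total for a segment, the totals per position, and the percentage of the grand total that falls in onset vs. coda position, each as a `(types, tokens)` pair.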

Figure 1


Figure 2


Segmental frequency data are also available for all Dutch segments combined, as well as separately for vowels and consonants.

