Human language is our way of expressing ourselves verbally, communicating effectively, and breaking down barriers. Language, according to scholars, is a "cultural, social and psychological phenomenon" that can "help us better understand ourselves and why we behave the way we do." Small choices in language can have tremendous consequences, especially when powerful people choose their words hastily.

In education, though, language is treated as a separate domain from mathematics. While some curricula promote science, technology, engineering, and math (STEM) education, others emphasize the learning of languages, arts, and humanities. But this divisive outlook on education ignores the deep connection between mathematics and human language.

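There are two ways in which language is directly linked to math: usage and evolution. Humans' use of words can be predicted using established mathematical laws like Zipf's law, which says that a word's frequency is roughly inversely proportional to its rank in the frequency table: the second most common word appears about half as often as the most common one, the third about a third as often, and so on.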
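To make the inverse relation concrete, here is a minimal sketch of the frequencies an idealized Zipf curve would predict. The function name `zipf_frequencies`, the `exponent` parameter, and the top frequency of 60,000 per million are illustrative assumptions, not values taken from the data behind this article.

```python
def zipf_frequencies(top_frequency, n_ranks, exponent=1.0):
    """Return idealized Zipf frequencies: f(rank) = top_frequency / rank**exponent.

    top_frequency is the frequency of the rank-1 word; exponent=1.0 gives the
    plain inverse relation (an assumption; real languages only approximate it).
    """
    return [top_frequency / rank ** exponent for rank in range(1, n_ranks + 1)]


# Hypothetical top frequency, chosen only to make the arithmetic easy to follow.
ideal = zipf_frequencies(60_000, 5)
for rank, freq in enumerate(ideal, start=1):
    print(rank, round(freq))
```

Under these assumptions the predicted frequencies fall from 60,000 at rank 1 to 30,000 at rank 2 and 12,000 at rank 5. Real word counts scatter around a curve like this rather than landing on it exactly.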

Of the languages we analyzed, Korean adheres most closely to Zipf's law: its rank-frequency distribution has the highest correlation coefficient, meaning that its word frequencies follow a predictable mathematical relation.
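One way such a correlation coefficient could be computed is sketched below. The article does not say how its coefficient was calculated, so this is an assumption: the Pearson correlation between the logarithm of rank and the logarithm of frequency, where a perfect inverse relation gives a value of -1. The function names and the sample frequencies are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def zipf_fit_correlation(frequencies):
    """Correlate log(rank) with log(frequency).

    `frequencies` must be ordered from the most to the least common word.
    Values near -1 indicate a close fit to an inverse power law.
    """
    log_ranks = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    log_freqs = [math.log(freq) for freq in frequencies]
    return pearson(log_ranks, log_freqs)


# Hypothetical, perfectly Zipfian frequencies for fifteen words (not real data).
ideal = [10_000 / rank for rank in range(1, 16)]
print(zipf_fit_correlation(ideal))
```

With this approach, the language whose coefficient is closest to -1 would count as the one that fits Zipf's law best.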

Figure: Word frequency per million words vs. frequency rank, for the fifteen most common words in analyzed Korean popular culture. See our data.

Korean isn't the only language that follows a distinctly mathematical trend, though. Spanish, the second most-spoken language in the world by number of native speakers, also has a high correlation coefficient, with only some variation from the trendline.

Figure: Word frequency per million words vs. frequency rank, for the fifteen most common words in analyzed Spanish popular culture. See our data.

The top-ranked words in Spanish (and the closely related Portuguese) are far more frequent than the top-ranked word in Korean: Spanish's "de" has a frequency greater than 80,000 per million words, while Korean's "있다" has a frequency of only about 10,000 per million words. Even so, the mathematical relationship is preserved across nearly all of the world's most prevalent languages for which frequency charts are available.

Figure: Word frequency per million words vs. frequency rank, for the fifteen most common words in many of the most widespread languages in the world. See our data.

The inverse relationship between word rank and frequency (as predicted by Zipf's law) is even more evident when all of the observed languages are modeled together on the same graph. By taking the sum of frequencies at each rank for all of the languages, the graphs condense into a nearly perfect mathematical relationship representing human language as a whole.
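The summing step described above can be sketched in a few lines. The language labels and frequencies here are placeholders invented for illustration, not the figures behind the graphs, and `sum_by_rank` is a hypothetical helper name.

```python
def sum_by_rank(frequency_lists):
    """Add up the frequencies found at each rank across all languages.

    `frequency_lists` maps a language name to its frequencies per million,
    ordered from rank 1 downward. All lists are assumed to have equal length;
    zip() would silently truncate to the shortest one otherwise.
    """
    return [sum(freqs_at_rank) for freqs_at_rank in zip(*frequency_lists.values())]


# Placeholder data for two unnamed languages (not real measurements).
sample = {
    "language_a": [900, 450, 300],
    "language_b": [600, 310, 190],
}
print(sum_by_rank(sample))  # [1500, 760, 490]
```

Because each language's own deviations from the curve tend to differ at any given rank, adding the lists together can smooth them out, which is consistent with the combined graph looking cleaner than the individual ones.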

Figure: Sum of frequencies per million words at each rank vs. frequency rank, for the fifteen most common words in many of the most widespread languages in the world. See our data.

Humans use established languages in deeply mathematical ways. The evolution of those languages can be explained mathematically, too. Martin Nowak, a leading mathematical biologist, suggests that language, like life, has evolved by natural selection, where the words and grammar best suited to expressing the nature of the world prevail.

In fact, Nowak suggests that a universal grammar exists, representing the fundamental laws of nature, which is directly related to neural networks. This universal grammar is not learned, but understood and intrinsic to the learning of languages. And it can be modeled mathematically.

In the end, math and language are inextricably connected, and should not be considered separate. When we force them apart in our rigid educational systems, we ignore the underlying connections and alienate students who prefer one over the other. Language and mathematics should be taught as connected, because they are, undeniably, linked.

Sources of Data

Lexiteria Word Frequency Lists