
The science of language and linguistics may help advance our understanding of the human mind and behavior. A new study published in the Proceedings of the National Academy of Sciences of the United States of America (PNAS) by University of Chicago researchers uses artificial intelligence (AI) to pinpoint the timing of an important pediatric language milestone: the moment when children can use a language rule to say something novel that they have not heard before.
“A difficult problem in describing language acquisition is knowing when children go beyond their input to produce novel, structured utterances—that is, to achieve linguistic productivity, the hallmark of human language,” wrote the study’s corresponding author and University of Chicago psychology professor Susan Goldin-Meadow, PhD, along with co-authors Raquel Alhama, PhD, Ruthe Foushee, PhD, Allyson Ettinger, PhD, and Afra Alishahi, PhD, and Dan Byrne.
When do children go beyond mimicking what they have heard and start producing their own original structured expressions? In other words, at what moment do children attain linguistic productivity? This question is a challenging one to answer scientifically as it requires knowing every utterance a child has encountered.
Language acquisition is the process by which humans develop the ability to understand and produce language, and linguistic productivity is the ability to generate and comprehend an unlimited number of expressions from a limited set of components and rules. Linguistics includes the sub-fields of phonetics (the study of speech sounds), phonology (the study of language sound systems), morphology (the study of word structure), syntax (the study of how linguistic units larger than a single word are constructed), and semantics (the study of meaning). It can be further subdivided into psycholinguistics (the study of how the mind processes language), neurolinguistics (the study of how the brain encodes language), sociolinguistics (the study of language and society), historical linguistics (the study of how language evolves over time), and computational linguistics (the study of speech and language using applied computer science).
To solve this problem, the researchers combined a massive real-world behavioral dataset captured over an extended period with a sophisticated AI model to make sense of the data.
The behavioral data consist of transcriptions of more than a million spontaneous utterances from 90-minute home interactions between 64 English-learning children and their parents, recorded every four months between the ages of 14 and 58 months. The transcripts were gathered for a prior language development study by Goldin-Meadow et al., published in 2014 in American Psychologist, a journal of the American Psychological Association.
From this massive database of utterances, the team aimed to use computational models to determine when the children first began producing determiner-noun combinations in English, such as “a book” and “the book,” and how that ability developed.
“Our behavioral data gave us a rich picture of when children begin to productively combine determiners a and the with the same noun,” wrote the researchers.
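As a rough illustration of the idea (not the authors’ actual analysis), one simple signal of productivity is whether a child uses both determiners with the same noun. A minimal Python sketch of such an overlap check, using invented toy utterances, might look like this:

```python
from collections import defaultdict

def determiner_noun_overlap(utterances):
    """Return nouns that appear with both 'a' and 'the'.

    A noun combined with both determiners is one simple signal that
    a child is combining determiners and nouns productively rather
    than echoing fixed phrases heard from adults.
    """
    seen = defaultdict(set)  # noun -> set of determiners observed
    for utterance in utterances:
        words = utterance.lower().split()
        for det, noun in zip(words, words[1:]):
            if det in {"a", "the"}:
                seen[noun].add(det)
    return {noun for noun, dets in seen.items() if dets == {"a", "the"}}

# Invented toy transcript for illustration
sample = ["I want a book", "the book fell", "a dog ran", "see the cat"]
print(determiner_noun_overlap(sample))  # {'book'}
```

A real analysis would, of course, also have to rule out that both phrases were simply copied from the input, which is exactly the problem the study’s computational model addresses.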
The computational model was an adaptation of BERT (Bidirectional Encoder Representations from Transformers), developed by Alhama et al. in a paper presented a year earlier at the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics.
Transformer models are deep learning models fundamental to natural language processing (NLP). They were first introduced in 2017 by Google researchers in their landmark paper “Attention Is All You Need,” a playful riff on the Beatles’ 1967 hit “All You Need Is Love.”
Transformer models sparked the meteoric rise of generative AI. Examples of systems powered by transformer models include ChatGPT, Siri, Alexa, Google Translate, AlphaFold, and more.
What makes transformer models innovative is their ability to take word order into account via positional encoding, combined with a self-attention mechanism that enables the AI to learn relationships among the words in a sequence.
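For readers curious about the mechanics, both ideas can be sketched in a few lines of NumPy. This is a toy illustration of sinusoidal positional encoding and scaled dot-product self-attention (single head, no learned weights), not the model used in the study:

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding, as in 'Attention Is All You Need'."""
    pos = np.arange(seq_len)[:, None]          # position of each token
    i = np.arange(d_model)[None, :]            # embedding dimension index
    angles = pos / np.power(10000, (2 * (i // 2)) / d_model)
    # Even dimensions get sine, odd dimensions get cosine
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

def self_attention(x):
    """Scaled dot-product self-attention over token embeddings x."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                       # pairwise similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over rows
    return weights @ x                                  # weighted mix of tokens

# Toy embeddings for a 4-word sequence, 8 dimensions each
x = np.random.rand(4, 8) + positional_encoding(4, 8)
out = self_attention(x)
print(out.shape)  # (4, 8)
```

Each output row is a blend of every token’s embedding, weighted by how strongly the tokens relate to one another; the positional encoding is what lets those weights be sensitive to word order.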
The research team discovered that, on average, children began to make productive determiner-noun combinations at the age of 30 months, approximately nine months after saying their first determiner.
“Marrying behavioral observations and computational modeling provides an approach that can be used to assess productivity in any language, spoken or signed,” the researchers wrote.
The researchers demonstrated that the onset and developmental pathways of linguistic productivity can be modeled computationally. As next steps, they say the same model can be used to investigate the factors that may contribute to differences in the timing and rate of productivity across children.
Language bridges linguistics with psychology, the science of the mind and behavior, and is a foundational component of both communication and cognition. The application of groundbreaking AI is accelerating our understanding of how this uniquely human capacity develops.
Copyright © 2024 Cami Rosso All rights reserved.