When my children were much younger and took some interest in what I do for a living, they would occasionally ask me to explain what computational linguistics is, and to give them an idea of what computational linguists do. After my initial attempts to respond with academic verbiage were roundly dismissed as uninformative, I found myself facing a tough audience of irritatingly bright kids, who quite rightly insisted on clear, comprehensible answers to very good questions.
A typical conversation would go something like this:
Kids: So tell us, what do you really do at work, Dad?
Me: We try to build precise models of how natural languages (like the ones that you speak) work. We test these models by using them to write computer programs that can automatically analyse certain aspects of a language.
Kids: So you teach computers to talk, like they do on Star Trek?
Me: Sort of, but not quite.
Kids: What’s a precise model of a language?
Me: It is a theory that gives a clear enough description of a set of linguistic properties to allow us to translate it into a computer program that recognises those properties in real language data.
Kids: Yeah right. So why don’t you work for Google and make more money?
Me: Good question. Now do you want to see our current grammar parser?
Kids: Thanks, Dad, but we’re actually pretty busy right now. By the way, we can teach you how to set up your smartphone properly, and how to fix the problem that you have been having with your computer graphics. But it will cost you.
My kids were right to demand a straightforward, non-technical account of computational linguistics that accurately captures its main features as a scientific discipline, while remaining accessible to non-specialists. Let me try to correct the inadequacies of my previous efforts to provide one.
Computational linguistics (CL) may be thought of as the study of natural language at the intersection of linguistics and computer science. It is a relatively young scientific field that developed out of the integration of theoretical linguistics, mathematical linguistics, artificial intelligence, and software engineering.
One of the reasons that it is difficult to identify CL as a well-defined domain of research is that it faces, Janus-like, in two distinct but clearly related directions. One of these is an engineering and technology face.
In its engineering aspect, CL focuses on natural language processing (NLP). It seeks to develop systems that facilitate human-computer interaction, and to automate a range of practical linguistic tasks. These tasks include (among others) machine translation, text summarization, speech recognition and generation, information extraction and retrieval, and sentiment analysis of text. In the past few decades, NLP has grown into a major area of industrial research and development, with large information technology companies like Google, IBM, Microsoft, and Facebook investing increasing amounts of money and research effort in the creation of more refined language technology. A host of small startups devoted to these tasks also now populate the industrial research landscape. As a result, CL/NLP has become an important part of the job market for people with degrees in linguistics and related fields.
The second face of CL is scientific. Looking in this direction, CL seeks to model natural languages as formal combinatorial systems. It attempts to understand the procedures through which humans are able to learn and to represent these systems, given the processing resources of the human brain, and the linguistic data available to human learners. In this, CL shares many of the research objectives of theoretical linguistics and cognitive science.
So what is the connection between the two aspects of CL? In order to do good engineering it is necessary to have a solid scientific account of the area of the world that one seeks to manipulate through technology. A complex engineering task like landing a spacecraft on a comet requires a good theory of the physical processes and materials involved in implementing the task. Conversely, engineering work often generates important scientific insights.
The situation is no different in CL and NLP. In order to build a piece of language technology that works reliably over a large range of input, one must be able to explain and to model the properties of language that the application is designed to identify and to modify.
Consider machine translation. In the 1950s, when computers were first making their appearance as research tools, it was naively thought that high-quality, broad-coverage machine translation might be achieved with large electronic dictionaries specifying lexical mappings between language pairs, and simple rules for constructing translated sentences in the target language. It was quickly discovered that this strategy produced poor output, much of it incomprehensible.
Machine translation systems have improved considerably since those days, but they are still highly variable in quality. Some of the progress made in the intervening years has been due to detailed study of the formal syntactic and semantic properties of language.
More significantly, the use of powerful statistical learning and modeling techniques has made it possible to analyse large amounts of data involving correspondences between source and target languages. Cognitive scientists have also fruitfully applied some of these models to account for different aspects of human learning and cognition. This is a case in which engineering methods developed for machine learning have yielded interesting and important insights into the way in which humans may acquire and represent knowledge of their languages. So the two aspects of CL inform each other.
Finally, it is worth emphasizing that CL offers a paradigm of interdisciplinary research between the sciences, both social and computational, and the humanities. CL applies computational and mathematical methods to the insights and the data of traditional linguistic theory. It also integrates the study of natural language in its formal dimension into cognitive science.
Although I have not yet succeeded in offering a satisfactory reply to my children’s questions, I hope that I have made some progress over my earlier efforts. I am also relying on my current audience to be less stringent in their demands. I may eventually get it right.
Shalom Lappin is Professor of Computational Linguistics in the Department of Philosophy, King’s College London. He was elected a Fellow of the British Academy in 2010. Through its Language Programme, the Academy is seeking to provide a range of perspectives on languages, as well as to showcase relevant research that our Fellowship is engaged in.