Opening the Door to Our World: Machine Literacy and Beyond
by Doug Hulette
Suppose you want to take on some stern challenge like, say, summiting Annapurna in the Himalayas of Nepal. You had better know what you’re doing.
The logical place to start would be to peruse books and guides that relate the adventures of other people who have done it, to learn from their experiences (including their missteps). With that knowledge, you might follow someone else’s lead, create your own unique experience, or, for that matter, scrap the whole idea.
Regardless of the final decision, it’s the process that captivates Assistant Professor Karthik Narasimhan, who joined the Computer Science Department in the fall of 2018. “Why can’t we build machines that, just like humans, can read manuals, wikis, and other literature to gain some knowledge of an environment and its conditions, rather than having them learn to perform tasks through trial and error?” he asks.
photo by David Kelly Crow
Narasimhan’s research focuses on natural language processing, or NLP, and reinforcement learning (RL), with the aim of developing “intelligent agents” that not only can read and understand English (or Russian or any other ordinary language, for that matter) but also can use their language skills to perform various tasks and adapt what they know to novel situations. The end goal: to create machines that learn to operate in the real world through both experience and existing human knowledge.
“Traditionally most of NLP has focused on supervised approaches, where you tell the system what’s what,” he says. “Instead of training the system to understand and use natural language by providing labels, reinforcement learning requires the machine to interact with the world, make its own mistakes, and adjust its understanding accordingly.
“RL allows us to use weak, delayed forms of supervision, such as rewards in a game environment or successful completion of a task, to learn. This alleviates the dependence on large amounts of supervision that most current NLP techniques require. My research aims to learn semantic representations for language that are optimized for a downstream task such as a robot following instructions or an agent playing a text adventure game.”
Associate Professor Barbara Engelhardt says she is eager to see the direction of Narasimhan’s research. “If machine learning is to have any impact on society, it is clear that it needs to understand and reason about images and language as well as humans do,” she says. “The field of NLP was born from this critical need. Princeton is fortunate now to have Karthik, a rising superstar in this area who has already shown that he can think creatively about and push the boundaries of problems in NLP.”
Narasimhan earned his doctorate from MIT in 2017 and then spent a year as a visiting research scientist at OpenAI, a San Francisco-based nonprofit research company dedicated to “discovering and enacting the path to safe artificial general intelligence.”
In an email exchange, he discussed his research and shared some thoughts regarding his work:
If an intelligent agent can digest all relevant human knowledge on a particular subject (or at least everything that’s in print) and apply it to random situations, have we erased the difference between man and machine?
We’re still a long way from erasing the difference, but my long-term goal is to enable machines to reach human-like capabilities in language understanding and decision making. As a research community, we’re making good progress, especially with the advent of deep neural networks that has given us the capability to significantly scale up statistical learning.
I’m especially excited about developing more holistic AI systems that can also incorporate other stimuli to learn joint representations for various concepts. For instance, the semantics of a word like ‘cup’ contains several aspects like shape and color attributes (visual), functional attributes like the ability to hold liquids, tactile attributes like the ability to be held and be broken, and so on.
In order to truly gain a sufficient representation of even such a commonly used, simple concept, it is not enough to learn just from textual corpora, which has traditionally been the focus of much research in natural language processing. Being able to simultaneously handle other modalities and interact with the environment can result in better language understanding systems.
Your research appears to parallel that of OpenAI; that is, to expand artificial intelligence into artificial general intelligence. Why is that important?
The (loose) notion of artificial general intelligence (AGI) is a machine that can equal humans at performing any task, physical or intellectual. Of course, this has deep roots in our curiosity to understand more about human intelligence, while also providing several technological advances that enhance human lives. Personally, I think AGI will be more of an evolving goal: as machines get smarter, they will enable humans to do things we’ve not been able to before and allow us to set new targets. The Turing test — achieving conversational capability that is indistinguishable from humans — is a similar example.
If you look at some of the latest systems like Google Assistant, which can make phone calls to book appointments on your behalf, can we say that the Turing test has been passed? In some sense, yes, but there are still a lot of unsolved problems in language understanding, some of which were previously not even on our radar.
What I think is even more interesting are machines that can help us go beyond what is currently possible. Examples from recent history include search engines that can return results from billions of webpages in less than a second, or systems that can discover new chemical reaction pathways that were previously unknown to scientists. This is where I think AI systems will play a critical role in the path of human evolution.
So, do you have any plans to climb Annapurna?
No, but someday I hope to!
photo by JuiIndy Joy