As stated in the glossary, one of the problems in Artificial Intelligence is software’s lack of common-sense knowledge about the world. Of course, AI is a wide field, and a lack of common sense does not hurt Deep Blue’s chess-playing abilities or an OCR program’s ability to recognize characters, both of which belong to their own subcategories of AI. In communicating with humans and making sense of natural language, on the other hand, this lack of common sense is the main reason for computers’ lousy performance.
Several projects are attempting to solve this problem, each using different methods to teach computers common sense. This article discusses many of these projects, their approaches and the problems they face.
Common sense database projects

Here are the most notable common sense database projects and a short description of each. There are more, but these give you a feeling for the field.

Open Mind: Common Sense
Users’ entries are stored in a database, and this database is readily available for download to be used in development. The database currently holds over 600,000 sentences and a lot of different relations between objects, concepts and actions.

Word Expert
Word Expert is about teaching computers to distinguish between the different senses of a word with multiple meanings. Currently working with English and French, Word Expert shows users the different senses of the same word and then asks them to identify the meaning of the word in different sentences, in the hope that this information can help computers work out the meaning of such words from context when they are encountered in a text. This will certainly help improve automated translations like the famous “blind idiot” for “out of sight, out of mind”. The Lost in Translation site allows you to test a few of your own.

Open Mind – 1001 questions
This seems to be working out quite nicely for grouping related words together, and it is actually quite fun to teach because of the sometimes idiotic but often brilliant questions it asks.

Cyc Knowledge Server and OpenCyc
This is by far the most commercial of these projects (Cyc, that is). My fear, however, is that it is too dependent on logic and thereby avoids the blurry boundaries central to human knowledge. The data in the database is nevertheless without doubt valuable, like the data from any of the other projects. I wonder why Cycorp doesn’t have live demos of applications made with their technology, and why they don’t use volunteers browsing the web to help gather entries as the other projects do.

MindPixel
MindPixel is quite limited in its information gathering and, as said in the text, I am not a great believer in binary representations of knowledge. Still, the MindPixels are valuable data for systems with different approaches that may appear later.

ThoughtTreasure
This project, under development by a single man, Erik T. Mueller, since 1994, has nonetheless earned quite a lot of respect from people working on the other projects as “the most sophisticated existing commonsense story understanding system.” The database is built up with a categorization similar to the Cyc projects’ databases. The data can be browsed on the web, but a download of Mueller’s software is needed to add data or use it in one’s own applications.
The individual projects and a short description of each can be found in the box above.
These are all interesting projects and can certainly be useful tools in some cases, but they make me wonder about approaches to solving the common sense problem: how is our common-sense knowledge stored in the brain, and how can that be replicated or imitated for use in software? Marvin Minsky has written an excellent article on the subject in relation to the Open Mind: Common Sense project. It states some of the problems involved and makes a good case for why this is such a big problem. Here’s an example from the article:
- Thus, when you hear a sentence like: “Fred told the waiter he wanted some chips,” you will infer all sorts of things. Here are just a few of these […].
- The word “he” means Fred. That is, it’s Fred who wants the chips, not the waiter.
- This event took place in a restaurant. Fred was a customer dining there at that time. Fred and the waiter were a few feet apart at the time. The waiter was at work there, waiting on Fred at that time. Fred wants potato chips, not wood chips, cow chips, or bone chips. There’s no particular set of chips he wants.
- Fred wants and expects the waiter to bring him a single portion (1-5 ounces, 5-25 chips) in the next few minutes. Fred will start eating the chips very shortly after he gets them.
- Fred accomplishes this by speaking words to the waiter. Fred and the waiter speak the same language. Fred and the waiter are both human beings. Fred is old enough to talk (2+ years of age). The waiter is old enough to work (4+ years, probably 15+). This event took place after the date of invention of potato chips (in 1853).
- Fred assumes the waiter also infers all those things.
This nicely states the problem. How are we going to teach a computer all these things that are needed to understand such a simple utterance? And, perhaps more straightforwardly, how do we learn them ourselves as small children?
I said that I thought the projects listed here can be useful tools in some cases, but when it comes to building a general knowledge foundation, I get the feeling that many of them are off on the wrong foot. Take Cyc and OpenCyc as examples: they use very rigid definitions; everything is defined using logical statements. A cat is a carnivore, carnivores are animals, animals are living things; hence a cat is a living thing. Of course a cat is many other things as well, and that can be, and is, represented in Cyc’s approach too. But even in simple examples such categorizations can become very difficult: a car is a vehicle, and a vehicle is a thing that transports people, but a toy car is also a car and it is not used to transport people?!?
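To make that concrete, here is a minimal sketch of a naive is-a hierarchy with transitive inheritance of properties. This is not how Cyc actually represents knowledge (Cyc uses the far richer CycL language); it is only meant to show how blindly inheriting properties runs straight into the toy-car problem.

```python
# A toy is-a taxonomy with transitive inheritance of properties.
# Not Cyc's actual representation -- just an illustration of the problem.

ISA = {
    "cat": "carnivore",
    "carnivore": "animal",
    "animal": "living thing",
    "car": "vehicle",
    "toy car": "car",
}

PROPERTIES = {
    "living thing": ["is alive"],
    "vehicle": ["transports people"],
}

def inferred_properties(thing):
    """Walk up the is-a chain and collect every inherited property."""
    props = []
    while thing in ISA:
        thing = ISA[thing]
        props.extend(PROPERTIES.get(thing, []))
    return props

print(inferred_properties("cat"))      # ['is alive']          -- fine
print(inferred_properties("toy car"))  # ['transports people'] -- wrong!
```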
One theory of meaning, or sense, says that things are defined by the necessary and sufficient conditions that make them what they are. A car must have wheels (a necessary condition), but having wheels is not sufficient to make something a car. Such definitions soon run short as well.
Another theory uses stereotypes to define sense. A stereotypical car might have four wheels, seat five people, and have an engine and a steering wheel (and many other things as well, of course). Something that fits the stereotypical definition would be 100% car, and anything that does not meet the full criteria is less so. A toy car might be 70% car, but still a car. A dead cat would be a little less than 100% cat, for the simple fact that it is not a living thing.
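As a rough illustration of this graded, stereotype-based notion of membership, here is a small sketch in which something’s degree of “car-ness” is simply the fraction of stereotypical features it exhibits. The feature list is made up for illustration; none of the projects discussed here uses exactly this scheme.

```python
# Graded membership against a stereotype: the more stereotypical features
# something has, the more of a "car" it is. Features are illustrative only.

CAR_STEREOTYPE = {
    "has wheels",
    "has an engine",
    "has a steering wheel",
    "seats about five people",
    "transports people",
}

def car_score(features):
    """Return the fraction of the stereotype's features that are present."""
    return len(CAR_STEREOTYPE & features) / len(CAR_STEREOTYPE)

real_car = {"has wheels", "has an engine", "has a steering wheel",
            "seats about five people", "transports people"}
toy_car = {"has wheels", "has a steering wheel"}  # no engine, carries no one

print(car_score(real_car))  # 1.0 -> 100% car
print(car_score(toy_car))   # 0.4 -> still somewhat car-like, but much less so
```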
Dictionaries tend to describe roughly the stereotypical sense. A car (automobile, actually) is, according to The American Heritage Dictionary: “A self-propelled passenger vehicle that usually has four wheels and an internal-combustion engine, used for land transport.”
In any case, understanding a word requires a whole lot of common-sense knowledge, and a child, even at the age of two, has a far better understanding of these basic things than we have managed to teach any such system.
MindPixel gathers people’s opinions on whether different statements are true or false, and also stores information on how people rate each statement’s quality. This is based on Landauer’s estimate of “the capacity of human long term memory to be 10^9 [one billion] bits, based on known rates of learning and forgetting”. I’m pretty sure this is not the way the human brain stores its knowledge. The human brain has around 100 billion neurons, each of them storing not a binary value but an analog threshold value. Of course the brain is involved in a lot more than storing our common knowledge, but somehow our common-sense knowledge is stored within this complex network, and doubtfully in anything that resembles a binary format.
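To make the point about binary representation concrete, here is roughly how I imagine a single MindPixel entry: a statement, plus true/false votes from the people who validated it, reduced to one agreement figure. The actual MindPixel schema is certainly more involved; this is only an assumption for the sake of illustration.

```python
# A guess at a minimal MindPixel-style record: one statement, binary votes,
# and an agreement score. Purely illustrative -- not the project's real schema.

from dataclasses import dataclass

@dataclass
class MindPixel:
    statement: str
    true_votes: int
    false_votes: int

    @property
    def agreement(self) -> float:
        """Share of respondents who judged the statement to be true."""
        total = self.true_votes + self.false_votes
        return self.true_votes / total if total else 0.5

pixel = MindPixel("Water is wet", true_votes=97, false_votes=3)
print(pixel.agreement)  # 0.97 -- near-unanimous, so effectively 'true'
```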
I will later write more on theories of how meaning is stored in the brain, but in the meantime let’s refer to The Brain as a more probable account of how this information might be represented (see also Web Brain). See also, in this Open Mind: Common Sense article, how the data collected in that project has been used to create a tree-like representation of knowledge.
Regardless of whether they are going in the right direction, what all of these projects do is gather important data on how people interpret things. And when we are closer to finding a decent way to represent common knowledge in software, this data will be an important foundation to load into those systems.
Why can’t we instead use the vast amount of information on the web and mine it for this knowledge? Because this information is not available there. Quoting Minsky’s article again:
- “[…] much of our commonsense knowledge has never been recorded at all because it has always seemed so obvious we never thought of describing it.”
To test Minsky’s point, I Googled a few basic things about cars. At first I thought I might be proving Minsky wrong: I tested “car has wheels”, “car has a driver” and “car has a steering wheel”, which returned 189, 145 and 84 results respectively. But hang on: “car has no wheels” returned 15, “car has no driver” 27 and “car has no steering wheel” 12. And if you think you can just go with the higher number, think again: “car has no roof” returned 30 results whereas “car has a roof” returned 28. And if you try “What is car?” on Googlism, you will see sentences like “car is vibrating”, “car is thinking” and “car is unhappy”, which would probably mislead our knowledge-thirsty web crawler somewhat 😉
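For the curious, the naive heuristic I was testing looks something like the sketch below. The counts are the ones quoted above; in a real system hit_count() would call some web-search API, but here it simply looks up those figures.

```python
# The naive "compare hit counts" heuristic from the experiment above.
# Counts are the ones quoted in the text; a real system would query the web.

QUOTED_COUNTS = {
    '"car has wheels"': 189,
    '"car has no wheels"': 15,
    '"car has a roof"': 28,
    '"car has no roof"': 30,
}

def hit_count(query: str) -> int:
    """Stand-in for a web-search result count, using the figures quoted above."""
    return QUOTED_COUNTS[query]

def probably_true(statement: str, negation: str) -> bool:
    """Naively assume the phrasing with more hits reflects common sense."""
    return hit_count(f'"{statement}"') > hit_count(f'"{negation}"')

print(probably_true("car has wheels", "car has no wheels"))  # True  -- fine
print(probably_true("car has a roof", "car has no roof"))    # False -- oops
```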
The question of teaching computers common sense boils down, in the end, to this: how would you teach “someone” about the world if it had to be told everything, no matter how obvious, and had no chance to experiment with the world on its own, like a child does when it repeatedly throws its toys to the ground? Some people think that in order to learn common sense, the learner has to have a body with which to interact with the world. I’m not convinced that a body is necessarily needed, but let’s put it this way: if you got a phone call (probably a wrong number) from an extraterrestrial living in a universe where completely different rules apply (but fortunately with basic English skills, as aliens commonly have, at least in the movies), how would you describe our world? Where would you start? And what common understanding would be needed for it to be possible at all?
I think it is very interesting to compare Cyc-like projects to Google and the tools based on it. On the one hand you have a rather strict system that must be “hand-fed” its training information; on the other you have the chaos of the combined consensus of whoever wants to put something out there.
I think it would be very interesting to see a common sense system built on Google or some other web-search tool, and to see what it comes up with.