Artificial Intelligence and online security go head to head

My little brother Guy took a computer programming class recently, and complained that programming is like telling an idiot savant how to do simple things: You need to spell out every step precisely, but you need not worry that a step will be missed or performed incorrectly. Why, he asked me, can we not just tell a computer what we want it to do? The answer is not clear, but it is clear that artificial intelligence applications are making it easier to do so.

We all know implicitly what things are hard for computers — we experience those difficulties every day. Try to tell your computer to look through your e-mails and find out if you won a million dollars. For one, there is no way to "tell" the computer that you want this.

Second, even if a computer is smart or has an application that can search your e-mail, it would run the search and most likely answer "yes." In contrast, ask a human friend or an assistant to do the same, and he would laugh at you. What even a smart computer does not know is that any e-mail announcing a million-dollar windfall is probably a hoax.
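The gap is easy to see in code. A literal keyword search simply matches strings, so a spam message counts as a "win"; recognizing the hoax requires knowledge the program does not have. The inbox and helper name below are invented for illustration:

```python
# Invented example inbox: a literal keyword search cannot tell a genuine
# windfall from a hoax; it simply matches strings.
inbox = [
    "Meeting moved to 3pm on Thursday.",
    "CONGRATULATIONS! You have won a MILLION DOLLARS! Click to claim.",
    "Are we still on for lunch tomorrow?",
]

def did_i_win(messages):
    """Return every message that literally mentions winning a million dollars."""
    return [m for m in messages if "million dollars" in m.lower()]

print(did_i_win(inbox))  # the spam message matches; the hoax goes unnoticed
```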

While successful in many domains, present-day AI falls short of the capabilities seen in futuristic motion pictures in subtle and telling ways. Computers cannot figure out how to get the city of Chicago to pay for a pothole-caused flat tire. My assistant did; he searched websites and found that Chicago will pay half of the expenses without the need for a lawsuit. Autonomous cars cannot foresee that a kid will run across the street to his mom. Web search engines cannot distinguish between positive and negative uses of a word, or relationships between words: the phrases "vegetable prices do not influence oil prices," "vegetables and oil," and "vegetable oil" would all yield very similar search results. And computer vision algorithms cannot identify a person reliably unless specific lighting and orientation conditions hold.
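The search-engine example can be made concrete. A sketch of naive keyword matching, with a crude plural-stripping normalization (both invented here for illustration), treats all three phrases as hits for the query "vegetable oil" because set matching is blind to word order and to the negation in "do not":

```python
def terms(text):
    """Crude normalization: lowercase, drop commas, strip a trailing plural 's'."""
    return {w.rstrip("s") for w in text.lower().replace(",", "").split()}

query = terms("vegetable oil")
phrases = [
    "vegetable prices do not influence oil prices",
    "vegetables and oil",
    "vegetable oil",
]
for p in phrases:
    print(p, "->", query <= terms(p))  # the query terms appear in all three
```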

Identifying what is hard for computers and easy for humans is important for a surprising combination of conflicting reasons: We want computers to be better so that they can help us, but we also want computers to be limited so that we are able to differentiate them from humans.

Telling computers apart from humans helps decrease online banking breaches, fraudulent impersonations and spam e-mails. In the stone age of the Internet (10-15 years ago), a typical spammer sent e-mail spam by hand, eventually finding that his account was blocked. Employing a person to open accounts is expensive, so the spam industry developed a solution: a computer can open a million accounts with the same ease as it opens one. Distinguishing computers from humans helps prevent such virtual robots from opening e-mail accounts or breaking into bank accounts, thus increasing the cost of malice.

Nowadays, when a website wishes to verify that you are human, it asks you to read a distorted image of numbers and letters, called a CAPTCHA, which stands for "Completely Automated Public Turing test to tell Computers and Humans Apart," and to type what you see into a text box. Once your answer is verified, you are allowed to continue your transaction. Visual interpretation is hard for computers, so our ability to read those characters correctly verifies that we are human. Unfortunately, CAPTCHAs are not fundamentally hard for computers, and an arms race is on between computer vision researchers and CAPTCHA programmers, who create images that are ever more difficult to read. The result is CAPTCHAs that many humans cannot read either, so the arms race seems headed for a victory for computer vision and a failed test for humanity.
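The protocol itself is simple; the hard part is the image. A minimal sketch of the generate-and-verify loop, with the image-distortion step omitted and all names invented for illustration:

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits  # characters shown in the image

def make_challenge(length=6):
    """Generate the secret string that the distorted image would display."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def verify(secret, typed):
    """Accept the answer if it matches, ignoring case and stray whitespace."""
    return typed.strip().upper() == secret

secret = make_challenge()       # render_distorted_image(secret) would go here
print(verify(secret, secret.lower() + " "))  # a correct human answer passes
print(verify(secret, ""))                    # an empty answer fails
```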

We need a new way to distinguish between computers and humans, based on a more fundamental understanding of the problems computers face. The crux of difficulty for computers is with knowledge of the everyday world. Computers do not know the many small details about the world that we know, and they cannot use textual information available from the web and encyclopedias. That is why my computer cannot replace my assistant. Even though it has access to information about flat tires and potholes in Chicago, it does not know how to present the options to me.

The inability to connect applications such as computer vision or natural-language processing with knowledge about the everyday world is a fundamental open problem for AI researchers. Solving it means building computers that combine reasoning power with access to large bodies of knowledge, so that the computers themselves can perform the desired tasks.

Can we take this understanding and create a new breed of CAPTCHAs? Perhaps, but this new breed will still have to verify answers given by humans (or computers), and it will have to generate its riddles automatically and at random to counter computers' strengths, particularly their ability to try many riddles quickly and in parallel. Generating this new breed of CAPTCHAs would require a non-trivial advance in AI: one that enables generating and verifying riddles that draw on arbitrary world knowledge yet are still solvable by almost all humans.
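One way to picture such a scheme, as a toy sketch only: draw riddles at random from a store of everyday facts. The tiny fact base, question format, and function names below are invented; a real system would need vastly more knowledge than any hand-built table can hold.

```python
import random

# Everyday-world facts a computer rarely has; for each property, the first
# set holds "yes" items and the second set holds "no" items.
FACTS = {
    "something you can drink": ({"water", "milk", "juice"}, {"hammer", "sand"}),
    "something that can fly":  ({"bird", "kite", "bee"}, {"anchor", "brick"}),
}

def make_riddle(rng=random):
    """Pick a random property and candidate; record the expected yes/no answer."""
    prop = rng.choice(sorted(FACTS))
    yes_items, no_items = FACTS[prop]
    item = rng.choice(sorted(yes_items | no_items))
    question = f'Is "{item}" {prop}?'
    return question, ("yes" if item in yes_items else "no")

def verify(expected, typed):
    """Accept the answer regardless of case and surrounding whitespace."""
    return typed.strip().lower() == expected

question, answer = make_riddle()
print(question)  # e.g. 'Is "brick" something that can fly?'
```

A riddle like this is trivial for a person but, absent the fact base, opaque to a program; the weakness of the sketch is exactly the article's point, since the fixed table could itself be scraped by an attacker.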

Finding better tests for distinguishing computers from humans is critical for effective computer security. Such tests are essential protections against attacks on the Internet, such as identity theft. But we have to be careful: If we develop stronger AI applications, especially those that draw on general knowledge the same way humans do, we will endanger Internet security. The challenge faced by AI researchers on the one hand and computer security researchers on the other is to defuse this conflict, or else one of the two groups will face failure.