Microsoft CaptionBot Needs Glasses


From the AI that brought you "How Old Do I Look?" and "What Dog?" comes a new cognitive capability: photo captioning.

Introduced at last month's Build conference, Microsoft's CaptionBot claims to understand the content of any image, and tries to describe it "as well as any human." But the site, which combines object recognition with natural language processing, may need to get its eyes checked.

Users can upload their own photo or paste the URL of an online image, and let Redmond's Cognitive Services work its magic. (The same magic that got Twitter chatbot Tay put in time out over racist tweets.) Like Tay, CaptionBot has some kinks to work out: A picture of the Eiffel Tower, for instance, drew an unsure diagnosis of "a clock tower in the middle of a field." The Statue of Liberty, meanwhile, was determined to be "the tower of clouds."

The AI is sometimes more on point, recognizing that President Obama is "a man wearing a suit and tie," and that LA's famous Hollywood sign is "a sign in front of a rocky hill."

Other efforts speak (hilariously) for themselves:

"These new capabilities are the result of years of research advancements. Specifically, I use Computer Vision and Natural Language to describe contents of images," CaptionBot said on its website. "I am still learning, so sometimes I get things wrong."

More than sometimes, according to the public: Folks around the world have been testing the AI and sharing their results online.

nailed it

— Jeff Atwood (@codinghorror) April 13, 2016

Today I got body-shamed by Microsoft's #captionbot :-(

— Peter Oomsels (@Peteroomsels) April 14, 2016

Microsoft Cognitive Services are also available to iOS, Android, and Windows developers.
