As the internet becomes dominated by images, Facebook is launching a system which can “read” photos and tell visually impaired people what appears in them.
The internet is changing. From a medium based almost entirely on text, it is now becoming increasingly picture-led. An estimated 1.8 billion images are uploaded every day to social networks such as Twitter, Instagram and Facebook.
That is good news for aspiring photographers but bad news for blind or partially sighted users, who often have no way of telling what is in an image, even with modern assistive technologies.
But a new service from Facebook, being launched on Tuesday, is attempting to remedy that.
Blind people use sophisticated navigation software called screenreaders to make computers usable. They turn the contents of the screen into speech output or braille. But they can only read text and can’t “read” pictures.
Using artificial intelligence (AI), Facebook’s servers can now decode and describe images uploaded to the site and provide the descriptions in a form that can be read out by a screenreader.
Facebook says it has now trained its software to recognise about 80 familiar objects and activities. It adds the descriptions as alternative text, or alt text, on each photo. The more images it scans, the more sophisticated the software will become.
Some of the objects the new technology can recognise are:
- Transport – car, boat, aeroplane, bicycle, train, road, motorcycle, bus
- Environment – outdoor, mountain, tree, snow, sky, ocean, water, beach, wave, sun, grass
- Sports – tennis, swimming, stadium, basketball, baseball, golf
- Food – ice cream, sushi, pizza, dessert, coffee
- Appearance – baby, eyeglasses, beard, smiling, jewellery, shoes – and selfie
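To make the idea of alt text concrete, here is a minimal sketch in Python of how a list of recognised objects might be turned into an image description that a screenreader can read out. It is not Facebook's code: the function names, the confidence scores, the 0.8 cut-off and the "Image may contain" wording are all assumptions made for illustration. Only the object labels come from the list above.

```python
from html import escape

# Assumed cut-off: labels the recogniser is less sure about are left out.
CONFIDENCE_THRESHOLD = 0.8


def build_alt_text(detections, threshold=CONFIDENCE_THRESHOLD):
    """Turn hypothetical (label, confidence) pairs into a short description."""
    labels = [label for label, score in detections if score >= threshold]
    if not labels:
        # Nothing recognised with enough confidence: fall back to a generic word.
        return "Image"
    return "Image may contain: " + ", ".join(labels)


def img_tag(src, detections):
    """Build an HTML <img> element whose alt attribute carries the description."""
    alt = build_alt_text(detections)
    # Escape both values so quotes or angle brackets cannot break the markup.
    return f'<img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}">'


if __name__ == "__main__":
    # Invented scores; "beard" falls below the cut-off and is dropped.
    detections = [("pizza", 0.97), ("smiling", 0.91), ("beard", 0.42)]
    print(img_tag("photo.jpg", detections))
```

A screenreader that reaches this image would speak the contents of the alt attribute in place of the picture. Dropping low-confidence labels is one plausible design choice, since a wrong description may be worse for the listener than a short one.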
The man behind the development is Matt King, a Facebook engineer who lost his sight as a result of retinitis pigmentosa – a condition which destroys the light-sensitive cells in the retina.
“On Facebook, a lot of what happens is extremely visual,” King says. “And, as somebody who’s blind, you can really feel like you’re left out of the conversation, like you’re on the outside.”
The technology that King and his team have developed uses Facebook’s in-house object-recognition software to decipher what an image contains. It has been trained to recognise items such as food and vehicles.
“Our artificial intelligence has advanced to the point where it’s practical for us to try to get computers to describe pictures in a meaningful way,” King says.
“This is in its very early stages, but it’s helping us move in the direction of that goal of including every single person who wants to participate in the conversation.”
The system currently describes images in fairly basic terms such as: “There are two people in this image and they are smiling.”