Notes on Apple HEVC and HEIF from WWDC17

Apple are standardising on the next-generation HEVC codec for video and image encoding, decoding and playback. HEVC (H.265) is a much better codec for dealing with video resolutions greater than HD. Here are my notes from Apple's 2017 Worldwide Developers Conference sessions so far this week.

Here’s what Apple said about HEIF in the Platforms State of the Union address (from 1:08:07):

We’ve also selected a new image container called HEIF… HEIF supports the concept of compound assets. In a single file you can have one or more photos or one or more images, you can have videos, you can have auxiliary data such as alpha and depth. It is also highly extensible: It supports rich metadata, animations and sequences, and other media types such as audio. HEIF is an ISO standard, which is critical for ecosystem adoption.

The pictures shown on screen during this section show how flexible a HEIF container can be.

A moment in time can be made up of multiple shots taken by cameras at the same time – such as the two in an iPhone 7 Plus. It can also have computed content, such as the depth map derived from the two images:

HEIF documents can also include multiple timelines of stills, video, metadata and data that structures all these things together:

I watched the first WWDC17 session on HEVC and HEIF. Here are my live tweets:

Here are some frames from the presentation.

The nature of HEVC .mov files: each frame is an HEVC-encoded image, with both 8-bit and 10-bit encoding supported.

These devices will be able to decode HEVC movies. They may not be fast enough to play them back in real time. That might require a transcode to H.264.

Only some iOS and macOS devices will have HEVC hardware encode support, but all Macs that run macOS Sierra today will be able to encode in software.
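As a minimal sketch of software-fallback HEVC encoding, the snippet below transcodes a movie with `AVAssetExportSession` and the HEVC preset added in iOS 11 and macOS High Sierra; the preset uses hardware encoding where available and falls back to software otherwise. The file paths are placeholders.

```swift
import AVFoundation

// Transcode an existing movie to HEVC. "input.mov" and "output.mov"
// are placeholder paths, not from the session.
let asset = AVAsset(url: URL(fileURLWithPath: "input.mov"))
guard let export = AVAssetExportSession(
        asset: asset,
        presetName: AVAssetExportPresetHEVCHighestQuality) else {
    fatalError("HEVC export preset unavailable on this OS version")
}
export.outputURL = URL(fileURLWithPath: "output.mov")
export.outputFileType = .mov
export.exportAsynchronously {
    // On Macs without hardware HEVC support, this still completes
    // via the software encoder – just more slowly.
    if export.status == .completed {
        print("Transcoded to HEVC")
    }
}
```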

More on the advantages of HEIF:


AV Foundation for HEIC capture

The AV Foundation Camera and Media Capture subsystem provides a common high-level architecture for video, photo, and audio capture services in iOS and macOS.


Class AVCaptureDepthDataOutput ‘A capture output that records scene depth information on compatible camera devices.’

Class AVDepthData ‘A container for per-pixel distance or disparity information captured by compatible camera devices.’
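These two classes fit together roughly as in the sketch below: an `AVCaptureDepthDataOutput` attached to a dual-camera device delivers `AVDepthData` objects to a delegate. This is a hedged outline, not a complete capture pipeline (no preview, no error UI), and the queue label is my own.

```swift
import AVFoundation

// A sketch of depth capture on a compatible device (e.g. iPhone 7 Plus).
class DepthReceiver: NSObject, AVCaptureDepthDataOutputDelegate {
    let session = AVCaptureSession()
    let depthOutput = AVCaptureDepthDataOutput()

    func configure() throws {
        guard let device = AVCaptureDevice.default(.builtInDualCamera,
                                                   for: .video,
                                                   position: .back) else { return }
        session.addInput(try AVCaptureDeviceInput(device: device))
        session.addOutput(depthOutput)
        depthOutput.setDelegate(self, callbackQueue: DispatchQueue(label: "depth"))
        session.startRunning()
    }

    // Called once per captured depth map.
    func depthDataOutput(_ output: AVCaptureDepthDataOutput,
                         didOutput depthData: AVDepthData,
                         timestamp: CMTime,
                         connection: AVCaptureConnection) {
        // depthData.depthDataMap is a CVPixelBuffer of per-pixel
        // distance or disparity values.
    }
}
```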

It has been extended to deal with ‘Synchronised Capture’ – for metadata as well as depth maps.

Superclass: AVCaptureSynchronizedData ‘The abstract superclass for media samples collected using synchronized capture.’

Class AVCaptureDataOutputSynchronizer ‘An object that coordinates time-matched delivery of data from multiple capture outputs.’

Class AVCaptureSynchronizedDataCollection ‘A set of data samples from multiple capture outputs collected at the same time.’

Class AVCaptureSynchronizedDepthData ‘A container for scene depth information collected using synchronized capture.’

Class AVCaptureSynchronizedMetadataObjectData ‘A container for metadata objects collected using synchronized capture.’

Class AVCaptureSynchronizedSampleBufferData ‘A container for video or audio samples collected using synchronized capture.’
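The classes above can be wired together as in this sketch: a synchronizer delivers one time-matched collection per callback, from which you look up each output’s sample. I’m assuming the video and depth outputs are already attached to a running `AVCaptureSession`; the class and queue names are mine.

```swift
import AVFoundation

// Time-matched delivery of video frames and depth maps.
class SyncReceiver: NSObject, AVCaptureDataOutputSynchronizerDelegate {
    let videoOutput = AVCaptureVideoDataOutput()
    let depthOutput = AVCaptureDepthDataOutput()
    lazy var synchronizer = AVCaptureDataOutputSynchronizer(
        dataOutputs: [videoOutput, depthOutput])

    func start() {
        synchronizer.setDelegate(self, queue: DispatchQueue(label: "sync"))
    }

    func dataOutputSynchronizer(_ synchronizer: AVCaptureDataOutputSynchronizer,
                                didOutput collection: AVCaptureSynchronizedDataCollection) {
        // Each output's sample for this moment in time, if one was captured.
        if let depth = collection.synchronizedData(for: depthOutput)
                as? AVCaptureSynchronizedDepthData {
            _ = depth.depthData          // AVDepthData
        }
        if let video = collection.synchronizedData(for: videoOutput)
                as? AVCaptureSynchronizedSampleBufferData {
            _ = video.sampleBuffer       // CMSampleBuffer
        }
    }
}
```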

A last thought from me:

iPhone 7 Plus cameras capture depth maps. iOS 11 can store them in HEVC .mov files. Camera manufacturers had better step up!
