New Apple job = Machine learning coming to Final Cut Pro X?

Monday, 24 July 2017

Computer vision and machine learning coming to Apple ProApps? On Friday Apple added a new opening to their jobs website:

Video Applications Senior Software Engineer

Combining the latest Mac OS with Apple quality UI design, the editing team builds the innovative next generation timeline experience in Final Cut Pro X.

The job requirements have some interesting clauses:

Work on the architecture that provides the canvas for telling a story in video and incorporate cutting edge technology to build future versions of the product.

What cutting edge technology could they be thinking of here?

Experience developing computer vision and/or audio processing algorithms

Do you have experience applying machine learning solutions with video and audio data in a product?

Seems like object and pattern recognition will be useful, perhaps for automatic keywording and point, plane and object tracking. This is to be expected as smartphones can do real-time face and object tracking in social media apps today.

At WWDC 2017 in June, Apple announced that they will add object tracking to iOS and macOS later this year (link to video of tracking demo). Here's an excerpt from the video of the session:

Another new technology, brand-new in the Vision framework this year is object tracking. You can use this to track a face if you've detected a face. You can use that face rectangle as an initial condition to the tracking and then the Vision framework will track that square throughout the rest of your video. Will also track rectangles and you can also define the initial condition yourself. So that's what I mean by general templates, if you decide to for example, put a square around this wakeboarder as I have, you can then go ahead and track that.

They also talked about applying machine learning models to content recognition:

Perhaps for example, you want to create a wedding application where you're able to detect this part of the wedding is the reception, this part of the wedding is where the bride is walking down the aisle. If you want to train your own model and you have the data to train your own model you can do that.
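As a rough illustration of the tracking workflow described in the excerpt above, here is a minimal Swift sketch using the Vision framework announced for iOS 11 and macOS High Sierra: detect a face once, then use that rectangle as the initial condition for tracking. It is a sketch of the public Vision API, not a guess at how Final Cut Pro X might use it.

```swift
import Vision
import CoreVideo
import CoreGraphics

// Minimal sketch: detect a face in the first frame, then track its rectangle
// through subsequent frames, as described in the WWDC session excerpt above.
final class FaceRectangleTracker {
    private let sequenceHandler = VNSequenceRequestHandler()
    private var lastObservation: VNDetectedObjectObservation?

    // Call on the first frame: find a face and use it as the tracking seed.
    func seed(with pixelBuffer: CVPixelBuffer) throws {
        let detectFaces = VNDetectFaceRectanglesRequest()
        try sequenceHandler.perform([detectFaces], on: pixelBuffer)
        if let face = detectFaces.results?.first as? VNFaceObservation {
            lastObservation = VNDetectedObjectObservation(boundingBox: face.boundingBox)
        }
    }

    // Call on every following frame: returns the tracked rectangle in
    // normalised image coordinates, or nil if tracking has been lost.
    func track(on pixelBuffer: CVPixelBuffer) throws -> CGRect? {
        guard let seed = lastObservation else { return nil }
        let request = VNTrackObjectRequest(detectedObjectObservation: seed)
        try sequenceHandler.perform([request], on: pixelBuffer)
        guard let updated = request.results?.first as? VNDetectedObjectObservation else { return nil }
        lastObservation = updated
        return updated.boundingBox
    }
}
```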

Machine learning and audio?

Interesting that they are planning to recognise aspects of audio as well - keywording is straightforward. What else could machine learning automatically determine about captured audio? This could be the beginning of automatic audio mixing to produce rudimentary placeholders before audio professionals take over.

Recently there have been academic experiments investigating automatic picture editing (for example the Stanford/Adobe research I wrote about in May). I wonder when similar experiments will investigate sound editing and mixing?

Not just Final Cut Pro X

Although people are expecting machine learning to be applied to video and audio in Final Cut Pro X, remember that iMovie is the same application with the consumer-friendly UI turned on. What works in Final Cut Pro X can also be introduced to a wider market in iMovie for macOS, iMovie for iOS and Clips for iOS.

Logic Pro X 10.3.2 update sees Final Cut Pro X interchange improvements

Wednesday, 19 July 2017

It looks like the Logic Pro team are spending time making it work better with Final Cut Pro X. Logic Pro X was updated to version 10.3.2 yesterday. In the extensive list of new features and bug fixes, here are the points related to Final Cut:

  • When importing Final Cut Pro XML projects containing multichannel audio files, Logic Pro now reliably maintains channel assignments.
  • Large Final Cut Pro XML files now load more quickly.
  • Final Cut Pro X XML files now reliably import with the correct Sub-Role Names.
  • Logic Pro now creates a Summing Stack for each Parent Role when importing FCPX XML files.

I don't use Final Cut to Logic workflows, so I can’t say how reliable Logic is when interpreting Final Cut XML. It seems that the Logic team are more like the Adobe Premiere team when it comes to implementing features: don't wait until a feature is perfect, get it in, then make it better based on user feedback.

If you have bought Logic Pro X, the update is available from the Mac App Store.

VR jobs at Apple: July 2017

Monday, 17 July 2017

There are a number of positions available at Apple in July 2017 whose job descriptions mention VR.

VR hardware

IMG CoreMedia VR Pipeline Engineer

The Interactive Media Group (IMG) provides the media and graphics foundation across all of Apple's innovative products, including iPhone, AppleTV, Apple Watch, iPad, iPod, Macs as well as professional and consumer applications from Final Cut to iTunes and iWork.

  • Strong coding skills in C with ARM on embedded platforms
  • 2+ years experience developing and debugging large software systems
  • Direct experience with implementing and/or designing VR or 360 video playback systems

The role requires the ability to help design, build and troubleshoot media services for playback and export.

Spatial Audio Software Engineer

  • Key Advantage: Experience with audio software subsystems including DAWs, Game Audio Engines including Unreal, Unity and/or audio middleware for game and AR/VR applications.
  • Experience with Spatial audio formats (Atmos, HOA etc) is desirable.
  • Experience with Metal and general working of GPU systems.
  • Experience with SIMD, writing highly optimized audio algorithms

What would be in a VR file format?

IMG - CoreMedia VR File Format Engineer

  • Proven experience with Audio/Video components of a media software system
  • Direct experience with implementing and/or designing media file formats
  • Experience with VR and 360 video

Interesting that Apple feel the need for a VR file format. I wonder what will make Apple’s VR file format stand out? It will probably be able to be recorded/encoded on iOS and macOS. I wonder if it will also work on tvOS and watchOS. If it doesn't work on non-Apple hardware, it could be part of an Apple plan for technological lock-in.

VR marketing

Creative Technologist

As a member of Apple’s Interactive Team, the Creative Technologist is responsible for driving innovation that enhances and enlivens the marketing of Apple’s products and services. This role requires collaboration with the design, UX, motion graphics, film/video, 3D, and development groups across Apple’s Marcom group.

  • Developing interactive prototypes in order to conceptualize and develop innovative approaches for Apple marketing initiatives.
  • Experience with Adobe Creative Suite.
  • Experience with Unity/Unreal and AR/VR development is a plus.
  • Motion graphics and 3D software (AfterEffects, Maya) skills.

It's a pity Apple Marketing doesn't require knowledge of Apple’s motion graphics application. 

Route-to-Market Mgr, WW In-Store Channel Digital Experience

The Route-to-Market Manager, WW In-Store Channel Digital Experience, is responsible for driving and executing all digital marketing communications as related to in-store Apple-led branded product presentation and campaigns.

  • Detailed knowledge of digital experience technologies - including but not limited to, on-device engagement tactics, digital content development, app development, beaconing, AR/VR, etc.

Using VR to make Apple products

A job requirement also shows that Apple are using VR simulations when designing power systems:

Senior Electromagnetics Analyst

  • Engineer will also need to do fundamental analyses and run SPICE simulations for VR conversion.

SPICE is a circuit simulator: it takes circuit designs and predicts how they will respond to specific inputs. Thirty years ago it was a command-line UNIX tool. Now Apple engineers are using VR to look around inside their hardware designs.

If you choose to apply for any of these jobs, good luck. Tell them Alex Gollner sent you!

Apple Pro Apps and macOS High Sierra compatibility

Friday, 14 July 2017

What versions of Final Cut Pro X are compatible with macOS High Sierra?

During Apple’s 2017 Worldwide Developer Conference, macOS High Sierra was announced. Apple has a public beta test programme, where you can sign up to try early versions of Apple operating systems before they are released. 

macOS High Sierra is supposed to be a version of the Mac operating system that consolidates previous features and improves stability. This gives Apple and third-party developers the chance to catch their breath for a year. They can concentrate on reliability and steady improvement.

The question for Final Cut Pro X, Motion 5, Compressor and Logic Pro X users is whether to update their Macs to High Sierra.

Apple says that users of older versions of these applications who want to use macOS High Sierra will need to update to:

  • Final Cut Pro X 10.3.4 or later
  • Motion 5.3.2 or later
  • Compressor 4.3.2 or later
  • Logic Pro X 10.3.1 or later
  • MainStage 3.3 or later

If you still use Final Cut Pro 7 - or any other applications in the Final Cut Studio suite (including DVD Studio Pro and Soundtrack Pro), or need to use them once in a while to open older projects, don't update all your Macs to macOS High Sierra:

Previous versions of these applications, including all apps in Final Cut Studio and Logic Studio, are not supported in macOS High Sierra.

Interesting that the ProApps team are pushing users forward this way. It will be worth watching whether new application features and bug fixes require newer versions of macOS than in previous transitions.

Final Cut Pro 7 was last updated in September 2010. It is impressive that it still runs on Macs being released in 2017.

If you have more than one Mac, perhaps it is worth keeping one on macOS Sierra for the foreseeable future. When the next major version of Final Cut appears, it is likely it will work on Sierra. If you don't have more than one Mac, prepare a clone of your most reliable macOS Sierra startup drive for future use when you need to revisit old projects.

Investigate HEVC/H.265 encoding using free chapter from Jan Ozer FFmpeg book

Wednesday, 28 June 2017

Apple have decided to standardise on HEVC/H.265 video encoding in macOS High Sierra and iOS 11. Jan Ozer has written a book about how to encode video using the free FFmpeg encoding system.

He has made the chapter on HEVC encoding from the book free to download:

Below you can download a sample chapter of my new book, Learn to Produce Video with FFmpeg in 30 Minutes or Less. It’s Chapter 12 Encoding HEVC

If you have already installed FFmpeg (which includes the libx265 encoder), visit Jan's site to download the chapter and do some experiments. Check your results using the free VLC player.

PS: Although he doesn't cover HDR in this free chapter, investigate the x265 documentation on the subject.

Apple’s HEVC choice: Codec battle 2018?

Wednesday, 21 June 2017

What does Apple’s choice of HEVC (H.265) mean for developers, users, viewers and streamers? Jan Ozer writes that it will take a year or so to find out. His predictions include:

No major publishers implement HEVC/HLS support before 3-6 months after iOS 11/MacOS Sierra ship. This leaves the door open for a full codec analysis between AV1 and HEVC, including encode and decode requirements, hardware support, cost, IP risk, HDR support, software support, the whole nine yards. At least in the US and Europe, one of these codecs will be codec next.

Marketing hype is global, codecs are local. Premium content distributors around the world will choose the best codec for their markets. In second and third world markets, iPhones play a very small role, and there will be plenty of low-cost Android phones, and perhaps even tablets and computers, without HEVC hardware support. In these environments, VP9/AV1 or another codec (PERSEUS?) might be best.

Frame.io Enterprise - online team edit reviews for enterprises

Tuesday, 20 June 2017

Today Frame.io announced that their online video production team collaboration system now has features that are useful for larger organisations:

Enterprise offers everything large companies need to manage their creative process at scale. Admins can organize teams by department, brand, production or whatever best suits your company structure.

With this organization teams can work in their own workspaces much like they do with Frame.io today. Admins can control team access and visibility and manage thresholds for team size and resource allocations all from a single platform.

Interesting news for Final Cut Pro X users who need to share edits and notes with other team members online.

Frame.io is an edit review system. Editors can share edits and rushes with others online.

Non-editors review edits in a web browser and can access media used in the edit and selected unused media. They can review edits and make notes at specific times in the edit. They can also make drawings that other team members can see. Useful when planning new shots or briefing changes that need to be made using VFX. Team members can even compare edits with side-by-side version control.

Editors can then import these notes as markers with comments so they can see the exact point in the edit the note is associated with.

Media companies are the beginning

Interesting that Frame.io chose the 'Enterprise' suffix for this new service. The announcement may say that Vice, Turner Broadcasting Systems and BuzzFeed are already using Frame.io Enterprise, but media companies should be the tip of the video collaboration iceberg. The very features described in the press release seem more suited to non-media companies and organisations.

Although desktop video has been around for over 20 years, it hasn't yet properly broken into the world of work as a peer to the report (word processing), the financial document (spreadsheet) and the presentation. Microsoft and Adobe never got video production - or at least editing - into most offices. Now that everyone has a video camera in their pocket, it is time for someone to make this happen. Online or network collaboration will help.

Trojan Horse for Final Cut Pro X

At this point the Final Cut Pro X angle becomes relevant. Although frame.io integrates very well into the Adobe Premiere and Adobe After Effects user interfaces, those applications aren't big-business friendly. Due to their history, their metaphors are aimed at editors and motion graphics designers. The sheer multiplicity of windows, panels and preferences is the kind of thing experienced editors and animators like. It looks pretty threatening to people with other jobs. Final Cut Pro X is the application that can be used by people who need to get an edit done, or make last-minute changes based on some notes entered into frame.io by the CEO on her iPhone.

The question for the Final Cut ecosystem is whether a future version of X will allow the kind of third-party integration that makes the notes review process for frame.io in Adobe Premiere so much better than it is in Final Cut Pro X.

HDR production: Five concepts, 10 principles

Tuesday, 20 June 2017

It is likely that the next major versions of common NLEs will support HDR. As editors we will be asked about the right HDR workflow. For now it is a matter of picking a standard, following some guidelines and maintaining metadata.

Jan Ozer writes:

HDR sounds complex, and at a technical level it is. Abstractly, however, it involves just five simple concepts.

First, to acquire the expanded brightness and color palette needed for HDR display, you have to capture and maintain your video in 10-bit or higher formats. Second, you’ll need to color grade your video to fully use the expanded palette. Third, you’ll have to choose and support one or more HDR technologies to reach the broadest number of viewers. Fourth, for several of these technologies, you’ll need to manage color and other metadata through the production workflow to optimize display on your endpoints. Finally, although you’ll be using the same codecs and adaptive bitrate (ABR) formats as before, you’ll have to change a few encoding settings to ensure compatibility with your selected HDR TVs and other devices.

Jan is a great commentator on streaming technologies, read his HDR production workflow guide at StreamingMedia.com

What happens to cross-platform post applications when OSes are less equal?

Monday, 19 June 2017

Steven Sinofsky runs down his take on Apple's 2017 Worldwide Developer Conference announcements in a Medium post. He writes that Apple’s announcements on Machine Learning…

further the gap between device OS platforms (not just features, but how apps are structured) while significantly advancing the state of the art.

On how the iPad will soon be the best solution for day-to-day productivity:

Developers take note, as iPad-specific apps will become increasingly important in productivity categories.

In case you think Steven Sinofsky is an Apple-only commentator who believes they can do no wrong, he spent years competing with Apple at Microsoft. He started there in 1989, going on to run the team developing cross-platform technologies for Microsoft Office in the 90s and ended up as President of the Windows Division in 2009.

For the past 20 years it has been assumed that the Mac and Windows operating systems will have roughly the same features. Features for users to use every day and features for post production application developers to take advantage of.

Where there are gaps or differences in implementation, developers create solutions that work on both sides. Macromedia (in the 1990s) and Adobe created their own cross-platform media control layer to even out the abilities of the Windows and Mac operating systems. Companies that developed their code on Linux workstations had to implement many features not available in the operating system.

Allow me some Apple fanboy-ism: What if one OS pulls ahead? Can Adobe take advantage of new macOS abilities and add the required code to their applications so those features are also available on Windows? Will Blackmagic Design limit the features of DaVinci Resolve to those it can implement on Linux, Windows and macOS?

As Steven says, it is about application structure as well as operating system features. Can Adobe efficiently make Premiere work with macOS in very different ways than it works with Windows? 

Will there be a point where the fact that an application works on multiple operating systems - each benefitting from different hardware ecosystems - be less important than adding features to support evolving post production needs?

That point will come sooner if the ProApps team are able to update Final Cut Pro X, Motion 5 and Logic Pro X to make the most of the new features in Apple operating systems in coming months.

BBC R&D’s IP Studio: Live production of big TV events in a web browser

Wednesday, 07 June 2017

Interested in cloud-based high end post production? Live events and TV shows need live production. A post on the BBC R&D blog explains the challenges of making a system that can do live TV production in a web browser.

They are basing their research on a system ‘IP Studio’:

…a platform for discovering, connecting and transforming video streams in a generic way, using IP networking – the standard on which pretty much all Internet, office and home networks are based.

No buffering allowed:

It’s unacceptable for everyone watching TV to see a buffering message because the production systems aren’t quick enough.

Production systems (and their IP networks) must be able to handle 4K streams - even if final broadcast is viewed at lower resolution:

We’re not just transmitting a finished, pre-prepared video, but all the components from which to make one: multiple cameras, multiple audio feeds, still images, pre-recorded video. Everything you need to create the finished live product. This means that to deliver a final product you might need ten times as much source material – which is well beyond the capabilities of any existing systems.

The trick is dealing with time. All the varying delays from hardware and software have to be synchronised.

IP Studio is therefore based on “flows” comprising “grains”. Each grain has a quantum of payload (for example a video frame) and timing information. The timing information allows multiple flows to be combined into a final output where everything happens appropriately in synchronisation. This might sound easy but is fiendishly difficult – some flows will arrive later than others, so systems need to hold back some of them until everything is running to time.
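To make the 'flows of grains' idea a little more concrete, here is a purely illustrative Swift sketch - my own guess at the shape of the concept, not BBC R&D's actual API. Each grain carries a payload and the time it belongs to, and a combiner holds grains back until every flow has delivered that moment.

```swift
import Foundation

// Illustrative only: a grain pairs a payload (e.g. a video frame) with the
// time it should appear in the output.
struct Grain<Payload> {
    let timestamp: TimeInterval
    let payload: Payload
}

// A combiner buffers each flow and only releases grains for a given moment
// once every flow has a grain for that moment - i.e. it holds back the
// early arrivals until the latest flow has caught up.
final class FlowCombiner<Payload> {
    private var buffers: [[Grain<Payload>]]

    init(flowCount: Int) {
        buffers = Array(repeating: [], count: flowCount)
    }

    func receive(_ grain: Grain<Payload>, onFlow index: Int) {
        buffers[index].append(grain)
    }

    // Returns one grain per flow for `time`, or nil if any flow is still late.
    func grains(at time: TimeInterval, tolerance: TimeInterval = 0.02) -> [Grain<Payload>]? {
        let matches = buffers.map { flow in
            flow.first { abs($0.timestamp - time) <= tolerance }
        }
        if matches.contains(where: { $0 == nil }) { return nil }
        return matches.compactMap { $0 }
    }
}
```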

The production setup has to be able to deal with all this data, so browser-based switching and mixing software has to be tuned to fit the PC/tablet/phone it is running on and the servers it interacts with:

…we are showing lower resolution 480p streams in the browser, while sending the edit decisions up to the output rendering systems which will process the 4k streams, before finally reducing them to 1080p for broadcast.

Find out more at the BBC R&D blog.

Notes on Apple HEVC and HEIF from WWDC17

Wednesday, 07 June 2017

Apple are standardising on the next generation HEVC codec for video and image encoding, decoding and playback. HEVC (H.265) is a much better codec for dealing with video resolutions greater than HD. Here are my notes from Apple’s 2017 Worldwide Developers Conference sessions so far this week.

Here’s what Apple said about HEIF in the Platforms State of the Union address (from 1:08:07):

We've also selected a new image container called HEIF… HEIF supports the concept of compound assets. In a single file you can have one or more  photos or one or more images, you can have videos, you can have auxiliary data such as alpha and depth. It is also highly extensible: It supports rich metadata, animations and sequences, and other media types such as audio. HEIF is an ISO standard, which is critical for ecosystem adoption.

The pictures shown on screen during this section show how flexible a HEIF container can be.

A moment in time can be made up of multiple shots taken by cameras at the same time - such as the two in an iPhone 7 Plus. It can also have computed content, such as the depth map derived from the two images:

HEIF documents can also include multiple timelines of stills, video, metadata and data that structures all these things together:

I watched the first WWDC17 session on HEVC and HEIF. Here are my live tweets:

Here are some frames from the presentation.

The nature of HEVC .mov files. Each frame is an HEVC-encoded image. Both 8-bit and 10-bit encoding.

These devices will be able to decode HEVC movies. They may not be fast enough to play them back in real time. That might require a transcode to H.264.

Only some iOS and macOS devices will have HEVC hardware encode support, but all Macs that run macOS Sierra today will be able to encode in software.
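As a minimal sketch of what HEVC encoding could look like from an app's point of view (assuming the AVFoundation export presets shown in the session, and a placeholder output URL), re-encoding an existing asset mostly comes down to picking a preset; the system decides between hardware and software encoding:

```swift
import AVFoundation

// Minimal sketch: re-encode an existing asset as HEVC in a QuickTime container
// using the export presets introduced with iOS 11 / macOS High Sierra.
func exportAsHEVC(asset: AVAsset, to outputURL: URL,
                  completion: @escaping (Error?) -> Void) {
    guard let session = AVAssetExportSession(asset: asset,
                                             presetName: AVAssetExportPresetHEVCHighestQuality) else {
        // Preset unavailable on this OS/device.
        completion(nil)
        return
    }
    session.outputURL = outputURL
    session.outputFileType = .mov   // HEVC-encoded frames in a .mov file, as above
    session.exportAsynchronously {
        completion(session.error)
    }
}
```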

More on the advantages of HEIF:


AV Foundation for HEIC capture

The AV Foundation Camera and Media Capture subsystem provides a common high-level architecture for video, photo, and audio capture services in iOS and macOS.

New:

Class AVCaptureDepthDataOutput ‘A capture output that records scene depth information on compatible camera devices.’

Class AVDepthData ‘A container for per-pixel distance or disparity information captured by compatible camera devices.’

It has been extended to deal with ‘Synchronised Capture’ - for metadata as well as depth maps.

Superclass: AVCaptureSynchronizedData ‘The abstract superclass for media samples collected using synchronized capture.’

Class AVCaptureDataOutputSynchronizer ‘An object that coordinates time-matched delivery of data from multiple capture outputs.’

Class AVCaptureSynchronizedDataCollection ‘A set of data samples from multiple capture outputs collected at the same time.’

Class AVCaptureSynchronizedDepthData ‘A container for scene depth information collected using synchronized capture.’

Class AVCaptureSynchronizedMetadataObjectData ‘A container for metadata objects collected using synchronized capture.’

Class AVCaptureSynchronizedSampleBufferData ‘A container for video or audio samples collected using synchronized capture.’
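Putting those classes together, here is a minimal sketch (based only on the class descriptions above; session, input and format configuration is assumed to happen elsewhere) of how an app might receive time-matched video frames and depth maps:

```swift
import AVFoundation

// Minimal sketch of 'Synchronised Capture': receive a video frame and the
// depth map captured at the same moment.
final class DepthCaptureReceiver: NSObject, AVCaptureDataOutputSynchronizerDelegate {
    let videoOutput = AVCaptureVideoDataOutput()
    let depthOutput = AVCaptureDepthDataOutput()
    private var synchronizer: AVCaptureDataOutputSynchronizer?

    func startSynchronising(on queue: DispatchQueue) {
        // The synchroniser time-matches the two outputs for us.
        let sync = AVCaptureDataOutputSynchronizer(dataOutputs: [videoOutput, depthOutput])
        sync.setDelegate(self, queue: queue)
        synchronizer = sync
    }

    func dataOutputSynchronizer(_ synchronizer: AVCaptureDataOutputSynchronizer,
                                didOutput collection: AVCaptureSynchronizedDataCollection) {
        // Pull the time-matched samples out of the collection.
        let video = collection.synchronizedData(for: videoOutput) as? AVCaptureSynchronizedSampleBufferData
        let depth = collection.synchronizedData(for: depthOutput) as? AVCaptureSynchronizedDepthData
        guard let frame = video?.sampleBuffer, let depthData = depth?.depthData else { return }
        // frame is a CMSampleBuffer of video; depthData is the AVDepthData for the same instant.
        _ = (frame, depthData)
    }
}
```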

A last thought from me:

iPhone 7 Plus cameras capture depth maps. iOS 11 can store them in HEVC .mov files. Camera manufacturers had better step up!

Using Adjustment Layers as coloured scene markers in Final Cut Pro X

Tuesday, 06 June 2017

Here’s an interesting use of my Alex4D Adjustment Layer to label scenes using different colours:

Will uses the new coloured Roles features introduced in Final Cut Pro X 10.3.

Download Alex4D Adjustment Layer from my old free Final Cut Pro X plugins site.

More on assigning roles to clips and changing the names and colours of roles

Apple WWDC17 post-production and VR sessions

Monday, 05 June 2017

Here are the sessions worth tuning into this week for those interested in what Apple plans for post-production and VR. You can watch these streams live or review the video, slides and transcripts in the weeks and months to come.

Interesting sessions include ones on

  • Vision API to detect faces, compute facial landmarks, track objects, and more. It can recognise and track faces, elements of faces, rectangles, barcodes, QR codes and other common shapes. The Vision API can be combined with machine learning models to recognise new objects. For example, if I buy a machine learning model that recognises car number plates (license plates) or even whole cars, that can be fed into the Vision API so that those things can be recognised in stills and footage, and also be tracked (see the sketch after this list).
  • Depth: In iOS 11, iPhone 7 Plus camera depth data is now available to iOS apps - both for stills and as a continuous low-resolution stream to go with video. This means iOS video filters will be able to replace the backgrounds of stills and videos, or apply filters to objects in the middle distance without affecting the background or foreground.
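Here is a minimal sketch of that Vision-plus-Core ML combination. The 'CarRecognizer' model is hypothetical - any image classifier compiled into the app by Xcode would slot in the same way:

```swift
import Vision
import CoreML

// Minimal sketch: wrap a custom Core ML model in Vision and classify a frame.
// 'CarRecognizer' is a hypothetical .mlmodel; Xcode generates its class.
func recogniseObjects(in pixelBuffer: CVPixelBuffer,
                      completion: @escaping ([VNClassificationObservation]) -> Void) {
    guard let model = try? VNCoreMLModel(for: CarRecognizer().model) else {
        completion([])
        return
    }
    let request = VNCoreMLRequest(model: model) { request, _ in
        completion((request.results as? [VNClassificationObservation]) ?? [])
    }
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try? handler.perform([request])
}
```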

Monday

Keynote

Apple covered all the announcements of interest to the media and general public, including a high-end focus in post production, VR and AR on iOS.

Platforms State of the Union

Apple went into more detail on all the updates to macOS, iOS, tvOS and watchOS. Video and PDF of the presentation are now available.

Tuesday

What's New in Audio

1:50 PM (PDT)

Apple platforms provide a comprehensive set of audio frameworks that are essential to creating powerful audio solutions and rich app experiences. Come learn about enhancements to AVAudioEngine, support for high-order ambisonics, and new capabilities for background audio recording on watchOS. See how to take advantage of these new audio technologies and APIs in this session.

Introducing Metal 2

1:50 PM (PDT)

Metal 2 provides near-direct access to the graphics processor (GPU), enabling your apps and games to realize their full graphics and compute potential. Dive into the breakthrough features of Metal 2 that empower the GPU to take control over key aspects of the rendering pipeline. Check out how Metal 2 enables essential tasks to be specified on-the-fly by the GPU, opening up new efficiencies for advanced rendering.

Introducing HEIF and HEVC

4:10 PM (PDT)

High Efficiency Image File Format (HEIF) and High Efficiency Video Coding (HEVC) are powerful new standards-based technologies for storing and delivering images and audiovisual media. Get introduced to these next generation space-saving codecs and their associated container formats. Learn how to work with them across Apple platforms and how you can take advantage of them in your own apps.

Advances in HTTP Live Streaming

5:10 PM (PDT)

HTTP Live Streaming allows you to stream live and on-demand content to global audiences. Learn about great new features and enhancements to HTTP Live Streaming. Highlights include support for HEVC, playlist metavariables, IMSC1 subtitles, and synchronized playback of multiple streams. Discover how to simplify your FairPlay key handling with the new AVContentKeySession API, and take advantage of enhancements to offline HLS playback.

Introducing ARKit: Augmented Reality for iOS

5:10 PM (PDT)

ARKit provides a cutting-edge platform for developing augmented reality (AR) apps for iPhone and iPad. Get introduced to the ARKit framework and learn about harnessing its powerful capabilities for positional tracking and scene understanding. Tap into its seamless integration with SceneKit and SpriteKit, and understand how to take direct control over rendering with Metal 2.

Wednesday

VR with Metal 2

10:00 AM (PDT)

Metal 2 provides powerful and specialized support for Virtual Reality (VR) rendering and external GPUs. Get details about adopting these emerging technologies within your Metal 2-based apps and games on macOS High Sierra. Walk through integrating Metal 2 with the SteamVR SDK and learn about efficiently rendering to a VR headset. Understand how external GPUs take macOS graphics to a whole new level and see how to prepare your apps to take advantage of their full potential.

SceneKit: What's New

11:00 AM (PDT)

SceneKit is a fast and fully featured high-level 3D graphics framework that enables your apps and games to create immersive scenes and effects. See the latest advances in camera control and effects for simulating real camera optics including bokeh and motion blur. Learn about surface subdivision and tessellation to create smooth-looking surfaces right on the GPU starting from a coarser mesh. Check out new integration with ARKit and workflow improvements enabled by the Xcode Scene Editor.

What's New in Photos APIs

1:50 PM (PDT)

Learn all about newest APIs in Photos on iOS and macOS, providing better integration and new possibilities for your app. We'll discuss simplifications to accessing the Photos library through UIImagePickerController, explore additions to PhotoKit to support new media types, and share all the details of the new Photos Project Extensions which enable you to bring photo services to Photos for Mac.

Vision Framework: Building on Core ML

3:10 PM (PDT)

Vision is a new, powerful, and easy-to-use framework that provides solutions to computer vision challenges through a consistent interface. Understand how to use the Vision API to detect faces, compute facial landmarks, track objects, and more. Learn how to take things even further by providing custom machine learning models for Vision tasks using CoreML.

Capturing Depth in iPhone Photography

5:10 PM (PDT)

Portrait mode on iPhone 7 Plus showcases the power of depth in photography. In iOS 11, the depth data that drives this feature is now available to your apps. Learn how to use depth to open up new possibilities for creative imaging. Gain a broader understanding of high-level depth concepts and learn how to capture both streaming and still image depth data from the camera.

Thursday

SceneKit in Swift Playgrounds

9:00 AM (PDT)

Discover tips and tricks gleaned by the Swift Playgrounds Content team for working more effectively with SceneKit on a visually rich app. Learn how to integrate animation, optimize rendering performance, design for accessibility, add visual polish, and understand strategies for creating an effective workflow with 3D assets.

Image Editing with Depth

11:00 AM (PDT)

When using Portrait mode, depth data is now embedded in photos captured on iPhone 7 Plus. In this second session on depth, see which key APIs allow you to leverage this data in your app. Learn how to process images that include depth and preserve the data when manipulating the image. Get inspired to add creative new effects to your app and enable your users to do amazing things with their photos.

Advances in Core Image: Filters, Metal, Vision, and More

1:50 PM (PDT)

Get all the details on how to access the latest capabilities of Core Image. Learn about new ways to efficiently render images and create custom CIKernels in the Metal Shading Language. Find out about all of the new CIFilters that include support for applying image processing to depth data and handling barcodes. See how the Vision framework can be leveraged within Core Image to do amazing things.

Friday

Working with HEIF and HEVC

11:00 AM (PDT)

High Efficiency Image File Format (HEIF) and High Efficiency Video Coding (HEVC) are powerful new standards-based technologies for storing and delivering images and video. Gain insights about how to take advantage of these next generation formats and dive deeper into the APIs that allow you to fully harness them in your apps.

Apple courts high-end post production with macOS High Sierra and new Macs

Monday, 05 June 2017

Today’s Apple Mac announcements were prefaced by saying that the next version of macOS will not be about adding new features, but about improving current technologies and adding new underlying technologies for future versions. Despite that, the software and hardware announcements seemed to be aimed at high-end media producers.

The new version of macOS (High Sierra) will include support for H.265 (High Efficiency Video Coding). This delivers the same HD quality at roughly half the data rate, and better quality at much higher resolutions: 4K, 6K and 8K. Although there isn't yet much demand for normal 4K video playback (3840x2160), it is around the minimum resolution (3840x1920) for good quality VR video playback.

Encoding as well as playback support will be available for all Macs that can run macOS High Sierra (that is, all Macs that can currently run macOS Sierra). On more powerful Macs, hardware H.265 encoding will also be supported. Final Cut Pro X and other pro applications were specifically mentioned as being able to support H.265 in future versions.
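A minimal sketch of how an application might check those capabilities on a given Mac, using VideoToolbox as documented for macOS High Sierra (treat this as an assumption-laden sketch rather than shipping code):

```swift
import VideoToolbox

// Minimal sketch: hardware HEVC decode is a simple query; for encode we list
// the encoders VideoToolbox knows about and look for an HEVC entry
// (which may be a software encoder on older Macs).
func reportHEVCSupport() {
    let hardwareDecode = VTIsHardwareDecodeSupported(kCMVideoCodecType_HEVC)
    print("Hardware HEVC decode: \(hardwareDecode)")

    var list: CFArray?
    if VTCopyVideoEncoderList(nil, &list) == noErr,
        let encoders = list as? [[String: Any]] {
        let names = encoders.compactMap { $0[kVTVideoEncoderList_DisplayName as String] as? String }
        let hevcEncoders = names.filter { $0.contains("HEVC") }
        print("HEVC encoders found: \(hevcEncoders)")
    }
}
```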

Apple chooses between Oculus Rift and HTC Vive

OS support for high-end graphics will be improved with Metal 2, which now explicitly supports GPU cards installed in external GPU boxes.

The new Apple External Graphics Development Kit seems to bless the HTC Vive as the VR headset of choice for now: it includes a ‘Promo code for $100 towards the purchase of HTC Vive VR headset’ alongside Sonnet’s new eGFX Breakaway Box and an AMD Radeon RX 580 8GB GPU. It is also a hint as to the kind of VR device Apple might be interested in making soon - room-scale VR.

Apple also announced that Metal 2 is designed to support high-end development for all kinds of VR on the Mac:

  • Unity engine
  • Unreal engine

Unreal is used by many VR experience vendors. The keynote included a demo from ILMxLAB featuring VR experience authoring running on a new iMac.

Final Cut Pro X got a second mention, in the context of being able to edit 360º VR video without plugins.

High-end video features of iOS 11

Apple also announced sessions that cover features in iOS 11 that would be useful on Macs doing post production.

In iOS 11, apps can get access to depth map information from the iPhone 7 Plus camera system. That means applications will be able to build 3D models of what is captured. The API includes giving video capture apps the ability to capture a stream of depth data alongside video information. This is very useful for compositing CG graphics into scenes, so that imaginary objects can be placed further away and drawn behind real-life objects and people closer to the camera.

iOS 11, macOS High Sierra and tvOS will also have a new Vision framework:

a new, powerful, and easy-to-use framework that provides solutions to computer vision challenges through a consistent interface. Understand how to use the Vision API to detect faces, compute facial landmarks, track objects, and more. Learn how to take things even further by providing custom machine learning models for Vision tasks using CoreML.

CoreML is the new machine learning part of Apple’s OSes.

New MacBooks and iMacs available today

High-end pros are also being courted through new hardware today and the promise of new hardware tomorrow. Today sees improvements in speed and configurations for MacBooks, MacBook Pros and iMacs.

iMac

  • Faster Kaby Lake processors 2.3/4.5 GHz
  • SSD storage twice as fast
  • Thunderbolt 3
  • 50% faster Radeon Pro 500-series graphics 
  • 43% brighter 500-nit display

MacBook Pro

  • Faster Kaby Lake processors 3.1/4.1 GHz
  • More RAM in discrete graphics

Press information on Mac updates

iMac Pro in December

Earlier this year Apple said that they are working on a new Mac Pro with more power and more modularity. Today Apple made their task that much harder by previewing a new iMac Pro that is much more powerful than the current Mac Pro.

  • Up to 18-core processors
  • 22 Teraflops of GPU performance using Radeon Vega GPUs with 16GB of RAM
  • Up to 4TB of SSD
  • Up to 128GB of ECC RAM
  • 2 Thunderbolt 3 controllers (so two RAID arrays and 5K displays can be connected)
  • 4 Thunderbolt 3 ports
  • 10Gb Ethernet port
  • Available in Space Grey
  • Prices starting at $4,999

Press information on the iMac Pro.

Hardware for Pros

As the new iMac Pro is much more powerful than the current Mac Pro, and at lower prices, there is no doubt that Apple is still interested in pros.

Who wouldn’t want to do post-production on the new iMac Pro? It looks like Apple are going directly after the companies hoping to get high-end post-production folk to switch away from the Mac. Could this be bad news for Dell and HP? Perhaps. Now that Apple have revealed the specs of a future iMac Pro, competitors might be able to make sure that their hardware matches Apple’s by December. Apple must be confident though, otherwise they would not have revealed the price. That they did means they probably think their competitors don't have the ability to compete even with five months' notice.

VR, Final Cut Pro X and Apple WWDC17

Saturday, 03 June 2017

This week at their Worldwide Developer Conference in San Jose, Apple are making announcements about future products and services. They are also giving presentations for developers about macOS, iOS, tvOS and watchOS.

Over the week I’m hoping to see updates relevant to VR and Final Cut Pro X at WWDC17.

VR

To mix reality (make it seem that graphics interact with the real world), it is useful to be able to record the distance of objects from the camera. This means graphics can be hidden behind objects caught on camera. So I'm hoping for depth map recording by Apple’s device cameras (Apple’s patent on depth mapping based on pattern matching and stereoscopic information) - as well as the iPhone, this would be useful for iPads, Macs and Apple TVs.

If Apple want to help applications deal with mixed reality, it would be very useful if there were at least one new flavour of ProRes, one that records depth maps too. ProRes 44444 anyone? While we are on the subject, context-sensitive video overlays are likely to become more popular in coming years, so it would be very useful if more ProRes flavours included alpha channel information: ProRes 4224 and 4224 Proxy would be a start!

Siri Speaker room sensors: if Apple release a high-quality speaker that customises the sound it produces based on its position relative to a screen and the shape of the space it is in, those sensors could also be used for room-scale VR and MR. The speaker could help detect where the VR/MR headsets being used are in 3D space.

The problem with phone screens in VR headsets is that VR-quality resolution would be overkill for a phone, while current iPhone resolution is a little low for high-quality VR. Once Mixed Reality becomes popular, you also won't be able to see what is in the real world if an iPhone is in the way. However, the GPU in iPhones is very powerful. If the phone could be kept in a pocket (providing relative position in 3D space) while supplying GPU power to an external device via Lightning (or Thunderbolt 3), that lightweight device could be that much lighter and need less battery. So look for iPhone GPU power for external devices via Lightning or Thunderbolt.

Final Cut Pro X

As regards Final Cut Pro X, look out for updates to Core Media and AV Foundation in macOS (and in iOS, if Final Cut Pro X/iMovie is to move to the iPad). Although some say that the ProApps team don’t use developer frameworks, the Core Media and AV Foundation updates show how the OS team and the ProApps team are thinking about media in modern OSes. 2015 AV Foundation presentation, 2016 AV Foundation presentation.

For those interested in cloud-based Apple services for pros, also look out for updates on HLS - HTTP Live Streaming and iCloud. 2016 HLS presentation.

The good news is that Apple will publish videos and transcripts of all the WWDC 2017 presentations. I'll update this post with links to relevant videos as they become available.


The H.265/HEVC state of play

Thursday, 01 June 2017

Apple seem pretty quiet when it comes to blessing the H.265 codec. It is a codec dedicated to delivering better quality for large-raster video at lower bandwidths. These kinds of codecs are needed for 4K broadcast and streaming, and are useful for 360º/VR video distribution.

Like DV, HDV and H.264, new codecs are designed to be efficient using hardware that is expected to be commonly available a few years after launch.

That means that H.265 (aka HEVC or ‘High Efficiency Video Coding’) algorithms expect to access the kind of power that isn't mainstream yet. Although today’s commonly available hardware can decode H.265 quickly, encoding is more of a problem. This is especially true of Apple’s currently anaemic Mac hardware.

Another reason is that patents and algorithms mean that the best way of encoding 4K, 6K and 8K video streams hasn't yet been settled on by the industry.

The state of High Efficiency Video Coding codecs has been summarised by Jan Ozer of Streaming Learning Center. A PDF of the presentation he gave at Streaming Media East in May describes the state of play comparing different HEVC implementations, VP9 and Bitmovin AV1.

His conclusions include: 

  • Particularly at lower bitrates, x265, Main Concept (H.265) and VP9 deliver substantially better performance than H.264
  • Both HEVC codecs and VP9 produce very similar performance
  • Choice between x265 and Main Concept (H.265) should be based on factors other than quality
  • AV1 Encoding times are still very inefficient
  • AV1 is at least as good as HEVC now, and will likely be quite a lot better when its specification has been fully decided on - it is still in development

Find out more on Jan’s blog.

The bottom line? Here is Jan commenting on some feedback to a post of his at Streaming Media:

HEVC will do well in broadcast, no doubt. Still not available in any browser, iOS, and Netflix prefers VP9/AV1 over HEVC for Android. VP9 gets you most browsers and many smart TVs and OTT boxes (like Roku 4), so it's the smart money UHD codec if you don't need HDR.

Automated video editing will very soon be ‘good enough’

Tuesday, 30 May 2017

A team of Stanford researchers have published a paper on automatic editing of dialogue scenes. Their system may not edit as well as a professional, but it can now create edits that the majority of people will see as ‘good enough.’ This means editors have new competition. As well as needing to be technically proficient and able to handle all sorts of political and psychological situations, they will now have baseline edits to improve on.

Computational Video Editing for Dialogue-Driven Scenes describes a system where different combinations of editing priorities (which kind of shots to favour, which kind of performances to prioritise) are defined as editing idioms. These idioms can then be applied to footage of dialogue scenes when accompanied by a script.

Identify clips

Their system takes a script formatted in an industry standard way, analyses multiple takes of multiple camera setups and divides ranges of each take into candidate clips. These shots are assigned automatically generated labels defining

  • The name of the character speaking the line of script
  • The emotional sentiment of the line of script (ranging from negative to positive via neutral)
  • The number of people in the clip
  • The zoom level of the clip, i.e. the framing of the clip (long, wide, medium, closeup, extreme closeup)
  • Who is in the clip
  • The volume of the audio in the clip
  • The length of the clip (as part of a much longer take - the speed at which a given line is said)

Editing idioms

The researchers then analysed multiple editing idioms (pieces of editing advice) and worked out what combination of clip styles would result in edits that match a given style:

Change zoom gradually: Avoid large changes in zoom level

Emphasize character: Avoid cutting away from an important character during short lines from the other characters. Favor two kinds of transitions: (1) transitions in which the length of both clips is long, and (2) transitions in which one of the clips is short, the important character is in the set of visible speakers for the other clip, and both clips are from the same take.

Mirror position: Transition between 1-shots of performers that mirror one another’s horizontal positions on screen.

Peaks and valleys: Encourage close ups when the emotional intensity of lines is high, wide shots when the emotional intensity is low, and medium shots when it is in the middle.

Performance fast: Select the shortest clip for each line

Performance slow: Select the longest clip for each line

Performance loud: Select the loudest clip for each line

Performance quiet: Select the quietest clip for each line

Short lines: Avoid cutting away to a new take on short lines

Zoom consistent: Use a consistent zoom level throughout the scene

Zoom in/out: Either zoom in or zoom out over the scene

Combine idioms to make a custom style

Using an application, the researchers showed how individual idioms (or pieces of editing advice) and specific instructions (‘start on a wide shot’ or ‘keep speaker visible’) can be combined to make an editing style. Each element can be given a weight ranging from ‘always follow this instruction’ to ‘always do the opposite of this instruction.’
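As a purely illustrative sketch of that idea (my own simplification, not the paper's actual optimisation), each idiom can be treated as a cost function over a candidate sequence of clips, mixed together with user-set weights:

```swift
// Illustrative only: idioms as cost functions over a candidate clip sequence,
// combined with user-set weights (a negative weight means "do the opposite").
// The candidate edit with the lowest total cost wins.
struct Clip {
    let zoomLevel: Int     // e.g. 0 = long shot … 4 = extreme closeup
    let duration: Double   // seconds
}

struct EditingIdiom {
    let name: String
    let cost: ([Clip]) -> Double   // lower = candidate follows this advice more closely
}

// "Change zoom gradually": penalise large jumps in zoom level between clips.
let changeZoomGradually = EditingIdiom(name: "Change zoom gradually") { clips in
    zip(clips, clips.dropFirst())
        .map { abs(Double($0.zoomLevel - $1.zoomLevel)) }
        .reduce(0, +)
}

// "Performance fast": penalise total duration, so the shortest takes are preferred.
let performanceFast = EditingIdiom(name: "Performance fast") { clips in
    clips.map { $0.duration }.reduce(0, +)
}

// Mix weighted idioms into a single editing style score for one candidate edit.
func totalCost(of candidate: [Clip],
               style: [(idiom: EditingIdiom, weight: Double)]) -> Double {
    style.reduce(0) { $0 + $1.weight * $1.idiom.cost(candidate) }
}
```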

This UI mockup shows how an editing style can be built where the elements are ‘start with a wide shot, avoid jump cuts, show speaker’:


The paper comes with a demo video that explains the process and gives examples of a scene professionally edited and the same scene automatically edited using different editing styles.

To see more example videos and source footage, visit the paper’s site at Stanford.

Time savings

The impetus behind developing the system was to save time, and to save the cost of hiring a professional editor.

For multiple dialogue scenes the researchers timed how long it took a professional editor to review all the footage and come up with an edited scene. As this method is at the research stage, the kind of analysis the tools need to do on the video takes a long time. In the case of the scene shown in the demo and in the screenshot, a 27-line scene with 15 takes (of varying shot size and angle) amounting to 18 minutes of rushes took 3 hours and 20 minutes to analyse. The professional editor took 3 hours to come up with an edit.

The advantage came when changes needed to be made in editing style. The automated system could re-edit the scene in 3 seconds. It would take many times longer for an editor to re-edit a scene following new instructions. The analysis stage was done on a 3.1 GHz MacBook Pro with 16GB of RAM. With software and hardware improvements the time it takes to turn multiple takes into labelled clips will reduce significantly.

What does ‘good enough’ mean for editors and post production?

To me this method marks a tipping point. For productions with many hours of rushes, these kinds of automated pre-edits are good enough. Good enough to release (with a few minutes of tidying up) in some cases. Good enough to base production decisions on (such as ‘We can now strike this set’). Good enough that a skilled editor can spend a short time tidying some of the automated edits and preparing them to be shared with the world.

Although the researchers haven't encoded the kind of editing idioms many good editors actually follow, the ones they have chosen will do for many situations. There are two possible reasons for this gap: the researchers don’t know those practices, or they don't yet have a way to detect the elements of scripts and source footage that editors currently base their personal editing idioms on.

One of the great things about the job of being an editor is that it is hard for others to compare your editing abilities with other editors. Up until now, a person would have to look at all the footage and all the versions of the script for a given production to judge whether the editor got the best possible result. Even then, that judgement would only be one more person’s opinion.

Now an editor’s take can be compared with automated edits like the ones described in this paper. Their style will soon be able to be detected and encoded as an editing style for automated edits. Could I sell a plugin based on my editing idiom? 0.1% of receipts would be a big enough royalty for me!

The good news for editors who are worried about being replaced is that once your skills get to the level of ‘not obviously bad’ - the ability to make edits that aren't jarring, that flow from moment to moment and scene to scene - other factors take over: being the kind of person who fits into the wider organisation, the person others can share a small space with for hours on end, a person who can judge the politics and psychology of situations with collaborators at all levels.

Who knows when this kind of technology will be available outside academia? For now it is worth bearing in mind that, alongside the three researchers from Stanford University - Mackenzie Leake, Abe Davis and Maneesh Agrawala - authorship of the paper was shared with Anh Truong of Adobe Research.

Today at Apple: Hours of free classes on Final Cut Pro X

Monday, 29 May 2017

‘Today at Apple’ is a new programme of events and training at Apple locations worldwide. ‘Pro Series Sessions’ is a category of free training for Final Cut Pro X and Logic Pro X. Here is a rundown of free education for those looking to learn about Final Cut Pro X. The sessions are designed so that you run your copy of Final Cut Pro X on a MacBook you bring in. If you don’t yet have Final Cut Pro X, a MacBook Pro with it installed can be provided for each session.

Go to the 'Today at Apple - Pro Series Sessions’ page for US · UK · France to find out when these sessions are available at an Apple Store near you and book your free places. There are other training sessions at Apple Stores; visit the ‘Today at Apple’ page and choose your country to find out more.

Intro To Final Cut Pro X

No matter how you plan to use Final Cut Pro, this 90-minute session takes a deep dive into its features. Let us show you how to arrange clips to tell your story, perfect the look of your video and improve audio quality. A MacBook with Final Cut Pro can be provided, or bring your own. Attendees should have a good understanding of movie editing and Mac basics, or be stepping up from iMovie.

Pro Series: Import, Sort and Organise

Film editors know that to be efficient, you need to be organised. Join us for a Final Cut Pro session on workflow. We’ll show you smart settings for video importing, ways to sort your media and how to organise like a pro. Attendees should have a good understanding of movie editing and Mac basics, or be stepping up from iMovie.

Pro Series: Techniques for Storytelling with Final Cut Pro X

Join us as we explore creative storytelling in Final Cut Pro X. You’ll discover how techniques in colour, music and editing can push your narrative forward and captivate your audience. Attendees should have a good understanding of Final Cut Pro X or be stepping up from iMovie.

Pro Series: Refine Your Audio with Final Cut Pro X

Join us as we explore controls and techniques in Final Cut Pro X that allow you to sculpt sound to match your scenes. We’ll explore noise reduction, add effects and music, and mix down to create stunning audio to go with your visuals. Attendees should have a good understanding of Final Cut Pro X or be stepping up from iMovie.

Pro Series: Colour Correction and Grading with Final Cut Pro X

Join us as we explore how colour can make your project visually and emotionally stunning. We’ll use colour correction and grading to balance colours in your movie. Then we’ll explore techniques that emphasise colour in stylistic ways. Attendees should have a good understanding of Final Cut Pro X or be stepping up from iMovie.

Pro Series: Create Studio-quality Titles

Join us and we’ll focus on when and how to use titles and text to set the tone of your movie. You’ll create, alter and add exciting effects to your movie’s text. Then we’ll practise our skills by completing a mini challenge. Attendees should have a good understanding of Final Cut Pro X or be stepping up from iMovie.

If you also want to learn Logic Pro X, there is a 90-minute intro and 60-minute sessions on Looping and Layering, Editing for Emotion plus Mixing and Mastering.

Studio Hours

The Pro Series sessions aren't yet available in many countries, but Apple Stores all over the world offer ‘Studio Hours’ - these are sessions where people who have started or who are about to start a project can work in a store with an Apple Creative nearby. Creatives are there to offer advice and tips on how to design, setup and progress your project. These hours are grouped by topic. As well as video projects, studio hours are also available for music, for photos, for documents, presentations and spreadsheets and for art & design projects.

Apple’s new free year-long course in app development - How about film making next?

Wednesday, 24 May 2017

Today Apple announced a free course that is available for school and university students to learn coding:

Apple today launched a new app development curriculum designed for students who want to pursue careers in the fast-growing app economy. The curriculum is available as a free download today from Apple’s iBooks Store.

App Development with Swift is a full-year course designed by Apple engineers and educators to teach students elements of app design using Swift, one of the world’s most popular programming languages. Students will learn to code and design fully functional apps, gaining critical job skills in software development and information technology.

There is currently an iOS app development gold rush. Stories of individuals making thousands by selling on the iOS app store have captured the mainstream imagination.

In reality - much like the 19th century US gold rushes - only a small proportion of app developers will be able to support themselves on iOS app royalties.

Many who make videos and film believe that storytelling with video - video literacy - is a skill that almost everyone would benefit from having. I think Apple could offer a very similar course based on the tools they make:

Apple today launched a new media development curriculum designed for students who want to accelerate their chances of success through video literacy. The curriculum is available as a free download today from Apple’s iBooks Store.

Telling stories with Apple Applications is a full-year course designed by Apple engineers and educators to teach students the fundamentals of storytelling using Apple’s iOS and macOS applications. iMovie for iOS and macOS is the most widely distributed video editing software. Final Cut Pro X has been bought over 2 million times from the Mac App Store. Students will learn to develop stories using these tools and more - including Clips for iOS, FileMaker Pro and Motion 5 for macOS, gaining critical job skills in all fields.

For now the iBooks store offers a free enhanced book: iMovie for Mac macOS Sierra. It is a full introduction to editing with iMovie - including source video and audio footage. That's a good start. Once this is combined with similar lessons for other Apple apps and applications alongside the theory of how to communicate with video, Apple could change the lives of thousands of students and adults all over the world.


Documentary on Apple’s Final Cut Pro X - an echo chamber inside a bubble?

Tuesday, 23 May 2017

Off the Tracks is a forthcoming documentary about the launch and adoption of Final Cut Pro X. The first trailer dropped yesterday.

One of the bubbles that some of us are in is ‘editing software.’ An echo chamber inside that bubble is ‘Final Cut Pro X fans’ - a subset of Final Cut Pro X users. I wonder how many non-X editors will be interested in this film. Have the makers made it appealing enough for non-Final Cut fans to watch? Or non-editors? Maybe trailer 2 will hint at what their take is.

There is a chance that they have included lessons that apply outside the #fcpx echo chamber, outside the editing software bubble. That might attract a wider audience.

PS: Fellow bubble-folk: Here is the transcript of my interview with Randy Ubillos, creator of Adobe Premiere, Final Cut Pro 1.0 and Final Cut Pro X.