Computer vision and machine learning coming to Apple ProApps?
On Friday, Apple added a new opening to their jobs website:
Video Applications Senior Software Engineer
Combining the latest Mac OS with Apple quality UI design, the editing team builds the innovative next generation timeline experience in Final Cut Pro X.
The job requirements have some interesting clauses:
Work on the architecture that provides the canvas for telling a story in video and incorporate cutting edge technology to build future versions of the product.
What cutting edge technology could they be thinking of here?
Experience developing computer vision and/or audio processing algorithms
Do you have experience applying machine learning solutions with video and audio data in a product?
It seems that object and pattern recognition would be useful here, perhaps for automatic keywording and for point, plane and object tracking. This is to be expected, as smartphones can already do real-time face and object tracking in social media apps today.
At WWDC 2017 in June, Apple announced that they will add object tracking to iOS and macOS later this year (link to video of tracking demo). Here’s an excerpt from the video of the session:
Another new technology, brand-new in the Vision framework this year is object tracking. You can use this to track a face if you’ve detected a face. You can use that face rectangle as an initial condition to the tracking and then the Vision framework will track that square throughout the rest of your video. Will also track rectangles and you can also define the initial condition yourself. So that’s what I mean by general templates, if you decide to for example, put a square around this wakeboarder as I have, you can then go ahead and track that.
They also talked about applying machine learning models to content recognition:
Perhaps for example, you want to create a wedding application where you’re able to detect this part of the wedding is the reception, this part of the wedding is where the bride is walking down the aisle. If you want to train your own model and you have the data to train your own model you can do that.
Machine learning and audio?
It is interesting that they are planning to recognise aspects of audio as well. Keywording is straightforward, but what else could machine learning automatically determine about captured audio? This could be the beginning of automatic audio mixing to produce rudimentary placeholders before audio professionals take over.
Recently there have been academic experiments investigating automatic picture editing (for example the Stanford/Adobe research I wrote about in May). I wonder when similar experiments will investigate sound editing and mixing?
Not just Final Cut Pro X
Although people are expecting machine learning to be applied to video and audio in Final Cut Pro X, remember that iMovie is the same application with the consumer-friendly UI turned on. What works in Final Cut Pro X can also be introduced to a wider market in iMovie for macOS, iMovie for iOS and Clips for iOS.