Integrating the External Video Source API for AVCaptureMultiCamSession in Flutter WebRTC on iOS

by StackCamp Team

Hey guys! Let's dive into an exciting topic: integrating the External Video Source API for AVCaptureMultiCamSession in Flutter WebRTC on iOS. This is a game-changer for those looking to leverage the power of dual cameras on iOS devices. We'll break down the problem, explore the proposed solution, and discuss why this is such a crucial enhancement for the flutter-webrtc package. So, buckle up and let's get started!

Understanding the Need for External Video Source API

In this section, we'll look at why the External Video Source API is needed in the first place.

The Power of AVCaptureMultiCamSession

First off, let's talk about AVCaptureMultiCamSession. This is an awesome feature on iOS 13+ devices that allows you to capture video from both the front and rear cameras simultaneously. Think about the possibilities! Imagine creating apps that can record your reaction while you're filming something, or building immersive video experiences that utilize multiple perspectives. However, there's a catch. The current flutter-webrtc package primarily uses getUserMedia() for camera access, which unfortunately only supports a single camera feed at a time. That's a bummer, right?
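
One caveat before we go further: multi-cam capture needs supported hardware (roughly iPhone XS/XR and later) on top of iOS 13, so any native code should gate on the SDK's built-in check. Here's a minimal Swift snippet:

import AVFoundation

// Multi-cam needs iOS 13+ AND supported hardware
if AVCaptureMultiCamSession.isMultiCamSupported {
    // Safe to configure simultaneous front + rear capture
} else {
    // Fall back to a regular single-camera AVCaptureSession
}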

Current Limitations of Flutter WebRTC

Right now, if you're using flutter-webrtc, you're limited to accessing one camera at a time. Here’s a snippet to illustrate this:

// This only captures from ONE camera at a time
final stream = await navigator.mediaDevices.getUserMedia({
  'video': {'facingMode': 'user'}
});

// Switching cameras stops the current stream
await Helper.switchCamera(videoTrack);

As you can see, this code only allows you to capture from one camera. If you try to switch cameras, the current stream gets stopped. This is a major limitation when you want to harness the full potential of dual cameras on iOS devices. On the native iOS side, capturing from both cameras is totally doable:

// This works on iOS natively
let multiCamSession = AVCaptureMultiCamSession()
// Setup front camera output → CVPixelBuffer stream 1
// Setup rear camera output → CVPixelBuffer stream 2

This native code shows that iOS can handle simultaneous camera feeds, but we need a way to bridge this capability into flutter-webrtc. This is where the External Video Source API comes into play, offering a solution to this gap and paving the way for innovative applications.
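
To make that concrete, here's a minimal Swift sketch of the kind of session setup involved. The function name makeDualCameraSession is just illustrative, and error handling, format selection, and delegate wiring are omitted:

import AVFoundation

// Configure an AVCaptureMultiCamSession with both cameras, each wired
// to its own AVCaptureVideoDataOutput
func makeDualCameraSession() -> AVCaptureMultiCamSession? {
    guard AVCaptureMultiCamSession.isMultiCamSupported else { return nil }
    let session = AVCaptureMultiCamSession()
    session.beginConfiguration()

    for position in [AVCaptureDevice.Position.front, .back] {
        guard let device = AVCaptureDevice.default(.builtInWideAngleCamera, for: .video, position: position),
              let input = try? AVCaptureDeviceInput(device: device),
              session.canAddInput(input) else { continue }
        // Add without implicit connections so each camera can be routed
        // to its own output explicitly
        session.addInputWithNoConnections(input)

        let output = AVCaptureVideoDataOutput()
        guard session.canAddOutput(output) else { continue }
        session.addOutputWithNoConnections(output)
        // output.setSampleBufferDelegate(...) goes here; the delegate
        // receives CMSampleBuffers carrying the CVPixelBuffer frames

        let ports = input.ports(for: .video, sourceDeviceType: nil, sourceDevicePosition: position)
        let connection = AVCaptureConnection(inputPorts: ports, output: output)
        if session.canAddConnection(connection) {
            session.addConnection(connection)
        }
    }

    session.commitConfiguration()
    session.startRunning()
    return session
}

The no-connections pattern mirrors Apple's own multi-cam sample code: it lets you control exactly which camera feeds which output, instead of relying on implicit routing.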

Why This Matters

So, why is this important? Well, for apps that require simultaneous video streams—think video conferencing with picture-in-picture, augmented reality applications, or even advanced surveillance systems—this feature is a game-changer. By enabling dual camera support, flutter-webrtc can unlock a whole new realm of possibilities for developers. It's all about giving you the tools to create richer, more engaging, and more functional video experiences. The demand for such features is growing, and bridging this gap is crucial for keeping flutter-webrtc at the forefront of real-time communication solutions.

The Proposed Solution: External Video Source API

Let's explore the heart of the solution: the External Video Source API. This is where things get really exciting! We're going to break down the Dart API, discuss the native iOS implementation, and see how this API can revolutionize dual-camera support in flutter-webrtc.

Diving into the Dart API

The proposed solution introduces a new class called ExternalVideoSource. This class will provide the necessary methods to create custom video sources and feed frames into WebRTC video tracks. Here’s a closer look at the Dart API:

// Dart API
class ExternalVideoSource {
  static Future<ExternalVideoSource> create() async {
    // Creates the native source (identified internally by a unique sourceId)
    // and returns a handle to it
  }

  Future<void> feedFrame(Uint8List pixelBuffer, int rotation, int timestampNs) async {
    // Feed a frame to this source
  }

  Future<MediaStreamTrack> createTrack(String trackId) async {
    // Create a track from this source
  }

  Future<void> dispose() async {
    // Cleanup
  }
}

Let's break down each method:

  • create(): This static method creates a new external video source and returns a handle to it. Internally, the source is identified by a unique sourceId.
  • feedFrame(Uint8List pixelBuffer, int rotation, int timestampNs): This method is the workhorse of the API. It takes a pixelBuffer (the video frame data), a rotation value, and a timestampNs (timestamp in nanoseconds) as input. This allows you to feed video frames from your native iOS code into the WebRTC video track. It’s like injecting video directly into the stream!
  • createTrack(String trackId): This method creates a MediaStreamTrack from the external video source. The trackId is a unique identifier for the track. This is crucial for integrating the video source into a WebRTC peer connection.
  • dispose(): This method is for cleanup. It releases any resources associated with the external video source, preventing memory leaks and ensuring your app runs smoothly.

Example Usage

Here’s how you might use the ExternalVideoSource API:

// Usage
final frontSource = await ExternalVideoSource.create();
final rearSource = await ExternalVideoSource.create();

// Feed frames from native code
await frontSource.feedFrame(frontPixelBuffer, 0, timestamp);
await rearSource.feedFrame(rearPixelBuffer, 0, timestamp);

// Create tracks
final frontTrack = await frontSource.createTrack('front_camera');
final rearTrack = await rearSource.createTrack('rear_camera');

// Add to peer connection
peerConnection.addTrack(frontTrack, stream1);
peerConnection.addTrack(rearTrack, stream2);

This example demonstrates how to create two external video sources (one for the front camera and one for the rear camera), feed frames from native code, create tracks from these sources, and add them to a peer connection. It's a powerful and flexible way to handle dual-camera input!

Native iOS Implementation

On the native iOS side, we need a class to handle the actual video frame processing. Here’s a glimpse of the proposed Objective-C implementation:

// FlutterRTCExternalVideoSource.h
@interface FlutterRTCExternalVideoSource : NSObject
- (instancetype)initWithFactory:(RTCPeerConnectionFactory *)factory;
- (void)feedFrame:(CVPixelBufferRef)pixelBuffer 
         rotation:(RTCVideoRotation)rotation 
      timeStampNs:(int64_t)timeStampNs;
- (RTCVideoTrack *)createTrackWithId:(NSString *)trackId;
@end

This FlutterRTCExternalVideoSource class will handle the heavy lifting of receiving CVPixelBuffer frames, rotating them if necessary, and creating the RTCVideoTrack. The RTCPeerConnectionFactory is used to create the video track, ensuring it’s compatible with WebRTC.
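
The header is Objective-C, but the internals are easier to show in Swift. Here's a rough sketch of how such a class might work, assuming the standard WebRTC iOS SDK types (RTCVideoSource, RTCCVPixelBuffer, RTCVideoFrame); treat it as an illustration, not the actual proposed implementation:

import CoreVideo
import WebRTC

// Swift sketch of the likely internals (assumed, not the actual proposal)
final class ExternalVideoSourceImpl {
    private let factory: RTCPeerConnectionFactory
    private let source: RTCVideoSource
    private let capturer: RTCVideoCapturer

    init(factory: RTCPeerConnectionFactory) {
        self.factory = factory
        self.source = factory.videoSource()
        // RTCVideoSource conforms to RTCVideoCapturerDelegate, so frames can be
        // pushed into it through a plain RTCVideoCapturer handle
        self.capturer = RTCVideoCapturer(delegate: source)
    }

    func feedFrame(_ pixelBuffer: CVPixelBuffer, rotation: RTCVideoRotation, timeStampNs: Int64) {
        // Wrap the CVPixelBuffer in WebRTC's frame types and hand it to the source
        let buffer = RTCCVPixelBuffer(pixelBuffer: pixelBuffer)
        let frame = RTCVideoFrame(buffer: buffer, rotation: rotation, timeStampNs: timeStampNs)
        source.capturer(capturer, didCapture: frame)
    }

    func createTrack(trackId: String) -> RTCVideoTrack {
        // The factory ties the source to a track that WebRTC peers can consume
        return factory.videoTrack(with: source, trackId: trackId)
    }
}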

Why This Solution Rocks

The External Video Source API is a robust and elegant solution because it provides a clear and concise way to feed external video frames into flutter-webrtc. It decouples the video capture process from the WebRTC streaming process, giving you the flexibility to use any video source you want. This is a huge win for developers who need to integrate custom video sources or take advantage of advanced features like dual-camera support.

Alternative Workarounds (And Why They Fall Short)

Before proposing the External Video Source API, several alternative workarounds were considered. While some of these might seem viable at first glance, they each have significant drawbacks that make them unsuitable for a robust, dual-camera solution. Let's break them down and see why the proposed API is the best way forward.

1. Two getUserMedia() Calls: The Illusion of Simplicity

The first idea that might pop into your head is, “Why not just call getUserMedia() twice, once for each camera?” Here’s how that would look in Dart:

final front = await navigator.mediaDevices.getUserMedia({
  'video': {'facingMode': 'user'}
});
final rear = await navigator.mediaDevices.getUserMedia({
  'video': {'facingMode': 'environment'}
});

Sounds simple, right? Unfortunately, this approach hits a brick wall on iOS. iOS has a limitation where it only allows one active camera session at a time. So, when you make the second getUserMedia() call, instead of opening a second camera stream, it simply switches the active camera. This means you can only capture from one camera, negating the whole point of simultaneous dual-camera support. This method might work in some other environments, but it’s a no-go for iOS.

2. Helper.switchCamera(): The Sequential Shuffle

Another workaround could be using a Helper.switchCamera() function to quickly switch between the front and rear cameras. This approach would allow you to display video from one camera at a time, but it doesn't achieve simultaneous streaming. Here’s the gist of it:

await Helper.switchCamera(videoTrack); // Fast but sequential

While this might provide a fast way to switch between cameras, it’s fundamentally limited to showing video from only one camera at a time. The key word here is sequential. You can't display both camera feeds at the same time, which is a major drawback for applications requiring concurrent video streams. This approach falls short when the goal is to provide a true dual-camera experience, where both streams are active and visible simultaneously.

3. Screen Sharing Workaround: The Hacky Solution

Perhaps the most convoluted workaround involves using one camera with getUserMedia() and then employing screen recording to capture the view from the second camera. This might sound like a clever hack, but it's fraught with issues. The idea is to use one camera normally and then use screen recording to grab the output from the other camera's view. Think of it as a duct-tape solution.

The problem? This workaround introduces a whole host of issues:

  • Additional Permissions: Screen recording requires extra user permissions, which can be a barrier to adoption. Users might be wary of granting screen recording access to an app, especially if it seems unnecessary.
  • Poor Performance: Screen recording is resource-intensive. It adds significant overhead, leading to reduced performance and potentially choppy video. This is far from ideal for real-time communication applications.
  • Not a Real Solution: Ultimately, this workaround is just that—a workaround. It's not a clean, efficient, or reliable way to handle dual-camera input. It adds complexity and potential points of failure without providing a robust solution.

Why the External Video Source API Wins

In contrast to these workarounds, the External Video Source API provides a clean, efficient, and reliable solution for integrating dual-camera support in flutter-webrtc. It’s designed to handle simultaneous video streams without the limitations and drawbacks of the alternatives. By allowing developers to feed CVPixelBuffer frames directly into WebRTC video tracks, it unlocks the true potential of dual-camera functionality on iOS devices. It’s the right tool for the job, providing a robust foundation for building advanced video applications.

Related Issues and Additional Context

To give you a broader picture, let's touch on some related issues and additional context surrounding this feature request. Understanding the existing discussions and the broader ecosystem can help you appreciate the significance of the External Video Source API.

Connecting the Dots: Related Issues

There are a couple of key issues that are closely tied to this proposal. One notable issue is #1020, which likely discusses similar feature requests or challenges related to video input in flutter-webrtc. By examining these related issues, you can see the community's interest and the existing efforts in this area. It's always a good idea to understand the history and context of a feature request to ensure you're building on the collective knowledge and avoiding duplication of effort.

The Broader WebRTC Ecosystem: External Video Source Pattern

This proposal isn't happening in a vacuum. The idea of an external video source is a well-established pattern in the broader WebRTC ecosystem. Other platforms have already implemented similar mechanisms for feeding external video frames into WebRTC pipelines. This pattern allows developers to bring in video from various sources, whether it's custom camera implementations, pre-recorded video files, or even video streams from other applications. By aligning flutter-webrtc with this pattern, we make it easier for developers familiar with WebRTC concepts to jump in and start building.

The Missing Piece: Bridging Native iOS with Flutter WebRTC

As mentioned earlier, the core challenge here is bridging the gap between native iOS capabilities and the flutter-webrtc framework. Native iOS code can already capture video from both cameras simultaneously using AVCaptureMultiCamSession and deliver those frames as CVPixelBuffer objects. The missing piece is a way to feed these CVPixelBuffer frames into flutter-webrtc video tracks. The External Video Source API is designed to be that bridge, allowing developers to leverage the power of native iOS video capture within their Flutter applications.
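
Concretely, the glue could be as small as a sample buffer delegate that pulls the CVPixelBuffer out of each CMSampleBuffer and forwards it. The FrameForwarder class below is a hypothetical sketch that assumes the proposed FlutterRTCExternalVideoSource from earlier:

import AVFoundation
import WebRTC

// Hypothetical glue: one forwarder per camera output
final class FrameForwarder: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    private let source: FlutterRTCExternalVideoSource

    init(source: FlutterRTCExternalVideoSource) {
        self.source = source
        super.init()
    }

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Pull the raw pixel buffer out of the sample buffer
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        // WebRTC expects the timestamp in nanoseconds
        let pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
        let timeStampNs = Int64(CMTimeGetSeconds(pts) * 1_000_000_000)
        source.feedFrame(pixelBuffer, rotation: ._0, timeStampNs: timeStampNs)
    }
}

Each camera's AVCaptureVideoDataOutput would get its own forwarder via setSampleBufferDelegate(_:queue:), so front and rear frames land in separate WebRTC tracks.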

Proof of Concept: Ready to Roll

The person who proposed this feature has even gone the extra mile by creating working native iOS code that captures from both cameras using AVCaptureMultiCamSession and delivers CVPixelBuffer frames. This is a huge step forward because it demonstrates the feasibility of the solution. Having a proof-of-concept implementation makes it much easier to evaluate the proposal, identify potential issues, and move toward a final implementation. It also signals a strong commitment to the feature and a willingness to contribute to the flutter-webrtc project. If this feature interests you and you want to learn more, don't hesitate to ask for more details or even a peek at the proof-of-concept implementation.

In Conclusion: Dual Cameras, Here We Come!

So, there you have it, guys! The proposed External Video Source API for integrating AVCaptureMultiCamSession in flutter-webrtc on iOS is a game-changing enhancement. It addresses a critical limitation, opens up a world of possibilities for dual-camera applications, and aligns with established patterns in the WebRTC ecosystem. By providing a clean, efficient, and reliable way to feed external video frames into WebRTC video tracks, this API empowers developers to create richer, more engaging, and more functional video experiences.

The alternative workarounds simply don't cut it. They either fail to deliver simultaneous streams, introduce performance issues, or require hacky solutions that are far from ideal. The External Video Source API is the right approach, providing a robust foundation for building advanced video applications.

With working native iOS code already in place and a clear vision for the API, this feature has the potential to significantly enhance flutter-webrtc and make it an even more powerful tool for real-time communication. Keep an eye on this space, because the future of dual-camera support in Flutter WebRTC looks bright!