March 14, 2022

Can Robotic Cameras Replace Human Camera Operators?

Automated sports production technology is already used by thousands of clubs worldwide, from grassroots to elite clubs. With the addition of OTT platforms, we could soon see 200 million more sporting events reach our screens: a feat that was previously unthinkable due to limited resources.

But for this to happen, robotic cameras for sports must meet viewer expectations when it comes to production quality.

A robotic camera system can use different operational models. But for the purpose of this blog, we’ll consider just two:

  1. Robotic capture: a system that uses a robot action camera to track the play, panning and zooming as it goes. The main challenge for the robotic camera operator is to establish what to track in real time while keeping the footage smooth. Imagine a goal kick in soccer: in such an instance, the auto-tracking robot camera will struggle to keep the ball in focus unless it moves quickly.
  2. Panoramic capture: a system that relies on a wide-angle camera (or on multiple wide-angle cameras, whose individual images it stitches together) to form a panoramic view. Advanced auto-tracking software then follows the flow of play within this high-resolution panoramic footage.

To simulate a human-operated camera, the robotic sports camera makes its decisions inside a five-second latency buffer. Thanks to the twenty-second latency of most modern internet broadcasts, this extra delay isn’t noticeable.

The buffer effectively lets the robotic production camera ‘peer into the future’ and establish how a play will unfold before choosing how to capture it.
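
To make the buffer idea concrete, here’s a minimal sketch in Python. The class name `LookaheadBuffer`, its `push` method, and the frame counts are assumptions made for illustration, not a description of any particular product. The sketch simply holds frames back so that, by the time a frame is released, the frames that follow it are already known.

```python
from collections import deque


class LookaheadBuffer:
    """Hold frames for a fixed delay so each output decision can see later frames."""

    def __init__(self, delay_frames):
        self.delay_frames = delay_frames
        self.frames = deque()

    def push(self, frame):
        """Add the newest frame.

        Returns None while the buffer is still filling. Once it is full,
        returns (frame_to_output, future_frames), where future_frames are
        the frames captured after the one being output.
        """
        self.frames.append(frame)
        if len(self.frames) <= self.delay_frames:
            return None  # nothing is released until the delay has elapsed
        oldest = self.frames.popleft()
        return oldest, list(self.frames)
```

At an assumed 25 frames per second, a five-second buffer would be `LookaheadBuffer(delay_frames=125)`: every frame that goes out arrives together with the 125 frames that came after it.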

How does a robotic camera system for sport work?

As each sport is distinctive in its own right, the algorithmic implementation must be customized to that sport’s unique characteristics to achieve good results.

The following three principles sit at the core of nearly every robot action camera.

Automatic Ball Detection

In soccer, basketball, or any sport with a ball, the software looks to detect and track the ball. That’s because the ball usually sits in the heart of the action, so this principle ensures the camera always catches key plays.

Player Detection

More advanced systems use player detection to complement ball detection. The setup allows the system to better understand what’s happening, which, in turn, enables game state detection. 

Both principles rely on the software’s ability to distinguish the ball or the players from the background. The challenge with player detection is that if a player stands still for more than a few seconds (common after fouls or during injuries), the algorithm still needs to pick out the player from the background.

Another challenge is that the algorithm needs to recognize which players are playing (including those not directly involved in the current play) versus, say, a substitute standing on the sidelines, or indeed, the referee.
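
To see why a stationary player is hard to keep separate from the background, consider a deliberately simple stand-in: a running-average background model applied to a single grayscale pixel. This is an assumption chosen for illustration (real systems may work quite differently), and the function names, brightness values, `alpha`, and `threshold` are all hypothetical.

```python
def update_background(background, pixel, alpha=0.05):
    """Blend the current pixel value into the running-average background."""
    return (1 - alpha) * background + alpha * pixel


def is_foreground(background, pixel, threshold=30):
    """A pixel counts as foreground if it differs enough from the background."""
    return abs(pixel - background) > threshold


def frames_until_absorbed(grass=60, shirt=200, alpha=0.05, threshold=30):
    """Count frames until a motionless player's pixel stops being foreground."""
    background = float(grass)  # the model starts out having only seen grass
    frames = 0
    while is_foreground(background, shirt, threshold):
        # The player does not move, so the same shirt value is blended in
        # on every frame and the background drifts toward it.
        background = update_background(background, shirt, alpha)
        frames += 1
    return frames
```

With these made-up numbers the player fades into the background after roughly thirty frames, which is little more than a second of video at an assumed 25 frames per second. A smaller `alpha` delays the fade but also makes the model slower to adapt to genuine changes such as shifting light, which is one reason this kind of detection needs more than a naive background model.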

Game State Detection

The game state refers to the current type of play. Player and ball detection lay the foundations for the software to detect the state, whether it’s a free kick, throw-in, penalty, or any event with distinct visual characteristics.

Understanding the game state allows the algorithm to predict what players might do next, making for smoother video capture. Each sport has an extensive list of game states, which makes this task a particular challenge for any robotic camera system.

Deep learning is one way to overcome the hurdle as it enables the software to identify plays like free kicks based on a data set with as few as fifty examples. In such an instance, the system uses the data to create rules to distinguish a free kick from a corner kick from a throw-in. Once the system understands the game state, it can decide how to capture each frame. An AI system is continually “learning” and improving its “perception” of what’s going on.
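
As a rough illustration of learning game states from a handful of labeled examples, here’s a sketch that swaps deep learning for something far simpler: a nearest-centroid classifier. Everything in it is an assumption made for the example, including the two features (the ball’s distance from the nearest touchline and from the nearest goal line, as fractions of the pitch) and the tiny training set.

```python
import math


def centroid(points):
    """Average a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def train(examples):
    """examples maps each game-state label to a list of feature tuples."""
    return {label: centroid(points) for label, points in examples.items()}


def classify(model, features):
    """Return the label whose centroid is closest to the given features."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical features: (distance to touchline, distance to goal line).
examples = {
    "corner_kick": [(0.00, 0.00), (0.01, 0.01)],
    "throw_in": [(0.00, 0.30), (0.01, 0.45)],
    "free_kick": [(0.30, 0.25), (0.40, 0.35)],
}
model = train(examples)
```

Calling `classify(model, (0.0, 0.4))` picks `"throw_in"` because the ball sits on the touchline but well away from the goal line. A deep learning system would learn far richer cues from the video itself; the point here is only that labeled examples, rather than hand-written rules, define the boundaries between states.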

The following image visualizes how the process works. It shows how the system can stitch together a panoramic view from multiple images captured by several cameras. The red rectangle indicates the frame to capture, which the software has recognized as an attack.

You might notice that uninvolved players are marked with an X, while active players are within the frame (with data showing their speed and other metrics).

Soccer Camera
Pixellot S1 Camera
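
The framing step described above (a rectangle drawn around the active players inside the panorama) can be sketched in a few lines. This is a simplified take under stated assumptions: the function name `frame_around`, the 16:9 output shape, and the pixel margin are hypothetical, and player positions are assumed to be already known as pixel coordinates in the panorama.

```python
def frame_around(players, pano_w, pano_h, aspect=16 / 9, margin=50):
    """Return (left, top, width, height) of a crop enclosing the active players.

    players is a list of (x, y) pixel positions inside the panorama.
    """
    xs = [x for x, _ in players]
    ys = [y for _, y in players]
    width = (max(xs) - min(xs)) + 2 * margin
    height = (max(ys) - min(ys)) + 2 * margin

    # Grow one side so the crop matches the output aspect ratio.
    if width / height < aspect:
        width = height * aspect
    else:
        height = width / aspect

    # Never ask for more pixels than the panorama has. If this shrinks
    # the crop, widely spread players may no longer all fit inside it.
    scale = min(1.0, pano_w / width, pano_h / height)
    width, height = width * scale, height * scale

    # Center on the players, then slide the crop back inside the panorama.
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2
    left = min(max(cx - width / 2, 0), pano_w - width)
    top = min(max(cy - height / 2, 0), pano_h - height)
    return left, top, width, height
```

Uninvolved players (the ones marked with an X in the image) would simply be left out of the `players` list, so they don’t pull the frame away from the action.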

Baseline characteristics of a robotic camera for streaming

If you’re going to offer a live stream of sports coverage, the footage must meet a baseline quality. Put another way: the robotic camera operator must produce a stream that’s as smooth and steady as any human camera operator can achieve.

The true test of the quality of a robotic camera system is the perceived quality of the video stream. If the look and feel of the game coverage lets the viewer feel like they are watching a typical TV sports broadcast, then the camera system is doing its job.

How a broadcast robotic camera system deals with 4 unique scenarios

Camera operators have to handle hundreds of potential sporting scenarios. Here are four unique events and how different systems capture each one.

Scenario 1: An extra ball on the sideline

In soccer, it’s common for an extra ball to be present on the sideline as substitutes use it for warm-ups. Still, the camera must know to focus on the in-play ball, not the extra one, even if the in-play ball disappears from view (behind a player, for example).

  • Human camera operator: a person knows which ball to look at, so this isn’t a concern.
  • Robotic camera operator: an auto-tracking robot camera might mistake the second ball for the in-play ball, especially if the in-play ball disappears. The robotic production camera will then jump back to the in-play ball as soon as it reappears in view.
  • Panoramic capture with latency: the five-second latency gives the system time to realize when a ball is not in play. It can then use this buffer to correct any tracking mistakes, and the spectator will be none the wiser.
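
One way to picture the buffer correcting a tracking mistake is gap filling. In the sketch below, the hypothetical function `fill_gaps` assumes the detector reports `None` for frames where the in-play ball is hidden, and that the frames after the ball reappears are already sitting in the buffer. Instead of jumping to the spare ball, it draws a straight line between the last and next known positions.

```python
def fill_gaps(track):
    """Linearly interpolate None entries that lie between two known positions.

    track is a list of (x, y) tuples, with None where the ball was not seen.
    Leading or trailing None entries are left as they are.
    """
    filled = list(track)
    known = [i for i, pos in enumerate(track) if pos is not None]
    for start, end in zip(known, known[1:]):
        (x0, y0), (x1, y1) = track[start], track[end]
        for i in range(start + 1, end):
            t = (i - start) / (end - start)  # fraction of the way across the gap
            filled[i] = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
    return filled
```

A real ball rarely travels in a perfectly straight line while it’s out of sight, so treat this as the simplest possible correction rather than a faithful model of the ball’s flight.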

Scenario 2: Unexpected ball movements

In basketball, soccer, and most ball sports, the ball can veer off in any direction. Suppose a player feints a pass or performs a complex skill; the camera may move in the wrong direction or zoom in on the wrong action, if only for a split second.

  • Human camera operator: even human operators find this scenario tricky, but thanks to their experience, they can quickly correct mistakes without a noticeable impact on the footage.
  • Robotic camera operator: a robotic sports camera can’t predict where a ball will go, no matter how sophisticated its software, and the quick switches of angle that follow don’t make for an optimal viewing experience.
  • Panoramic capture with latency: combining a high-resolution panoramic view with the five-second latency allows the robotic camera system to make decisions based on its ‘knowledge’ of the future. That means no ball movement is ever unpredictable, allowing the system to simulate the smooth footage of a video tripod with a fluid head.
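
Here’s a small sketch of how knowing the future can smooth the camera’s movement. It assumes the ball’s horizontal position is known for every buffered frame; the function name `camera_path` and the simple averaging rule are illustrative choices, not a description of how any real system decides where to point.

```python
def camera_path(ball_x, lookahead):
    """Camera center per frame: the mean ball position over the current
    frame and the next `lookahead` frames (fewer near the end of the clip)."""
    path = []
    for t in range(len(ball_x)):
        window = ball_x[t : t + lookahead + 1]
        path.append(sum(window) / len(window))
    return path
```

Take a ball that sits at position 0 for four frames and then jumps to 100. With `lookahead=0` the camera has to leap 100 units in a single frame. With `lookahead=3` the path becomes 0, 25, 50, 75, 100: the camera starts to move before the ball does and gets there in even steps.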

Scenario 3: Linear simultaneous actions

Suppose you’re watching a player prepare to take a free kick. While you want to see what the free kick taker is doing, you may also want to see how the goalkeeper is preparing. One camera has to produce two simultaneous views in this scenario, an impossible feat.

  • Human camera operator: as you might expect, one camera cannot capture two separate views.
  • Robotic camera operator: even a robotic action camera can’t do this.
  • Panoramic capture with latency: thanks to the five-second latency and panoramic view, the software can cut from one view to another in a linear fashion, working as if multiple cameras are capturing the action.

Scenario 4: Non-linear simultaneous actions

In baseball, we often want to see non-linear simultaneous actions (like when a player tries to steal a base during a pitch). In this instance, the broadcast might show a replay of the steal while showing the pitch in real-time.

  • Human camera operator: as in the above scenario, one camera cannot capture two separate views.
  • Robotic camera operator: again, even a robotic action camera can’t do this.
  • Panoramic capture with latency: while one camera can’t capture two simultaneous actions, the broadcast robotic camera system can show a replay of the second event as if there were multiple cameras at work.

What’s the best robotic camera for sports?

As you can see, the combination of a high-resolution camera that offers a panoramic view with a five-second latency can simulate a human camera operator. Moreover, the system can use a single camera to do the work of a multi-camera rig.

This makes it the best robotic camera for sports. What’s more, computer vision, artificial intelligence, and deep learning can now work together to deliver a robot-powered viewing experience that’s just as good as any human-operated camera.

As a matter of fact, a robotic sports camera can offer a better experience, especially given these systems will only improve and cover more sporting events.

That said, the current systems are already field-tested and proven to work well.
