The four judges and the dozens of people in the audience wait for the next singer to come on stage. The doors open, but instead of a person you see a purple digital avatar walking and waving to the crowd. It talks, laughs, cries, moves and sings exactly like a human being. At the end of the show, two avatars proceed to the next round, and the process repeats until there is a champion.
Does it sound like a scene pulled from a sci-fi movie? It could be, but it is happening in 2021, with actual humans’ movements being captured and transmitted in real time to the stage, as you can see in the video below.
Alter Ego, the latest singing contest in the United States, premiered on Fox in September. It borrows part of the successful formula that made The Masked Singer go viral – not knowing who’s hidden in the costume – but adds a different component: real-time avatars that reproduce a person’s movements.
The magic happens behind the scenes, literally. While judges and the audience watch the avatars dance and move, the actual singer is performing right behind the walls, with a motion capture system attached to their body and an iPhone capturing the person’s face.
At first glance, the avatars themselves may look a bit rough, and some of their animations – especially around the mouth – feel out of sync, flaws that resulted in a few bad reviews across the entertainment media. But Michael Zinman, co-executive producer of Alter Ego, is not too worried about this criticism at the moment.
“The reviews don’t bother me at all. I give a lot more value to the actual subculture of the metaverse and what [these] people want and what they believe the future of programming is going to be. And it’s going to be a lot of avatars,” he said in an exclusive interview with 6GWorld.
Zinman is no stranger to the world of entertainment, particularly on television. His resumé includes producing TV shows like The Masked Singer, Miss Universe, and Spin the Wheel, just to cite a few. But Alter Ego has proved to be challenging in many ways.
“We all became experts in learning, both from the network side and the production side. Because you can’t call anybody and say ‘hey, how did you do your avatar show?’”
The idea of creating Alter Ego came up in December, when a team of producers at Fox got together to come up with a new kind of TV attraction. In February, they started researching, developing and assembling the technology they would use, shaping how the show would look and sound. In July, they were ready to kick off auditions and the head-to-head performances – which ran for a month until the finale.
“I would say the overall success of the show is not just that we did it. It’s that we actually did it on time, on budget,” Zinman said, also noting that the crew worked six days a week for months to deliver the show we watch today. “And the minute you forget that’s an avatar, you believe the avatar, then you’ve checked the box.”
Lack of time is one of the reasons why, according to Zinman, the technology behind the show clearly evolved throughout the episodes to the point that the finale feels much better than the premiere.
“Show technology-wise, every episode got better and better. So by the time you get to the sixth episode, it looks like it’s night and day from the first five. And then by the time you do the finale, it’s like ‘wow, it’s coming.’ We’re basically testing the system in a sense. It was nuts,” Zinman summarised.
How Alter Ego Works
Before auditions, the developers designed avatars for each competitor based, among other considerations, on how the singer wanted to look on stage. Some can fly, others have animated tattoos all over their body, purple skin, spikes, or stylised costumes.
Each singer stays behind an LED wall that separates them from the actual stage – where the audience and the judges are. The contestant is then fitted with several motion capture devices, including an iPhone that captures the person’s face.
All this information is sent to a control room via cable and processed using Pixotope, an open software-based solution for creating virtual studios, augmented reality and on-air graphics. Pixotope works in tandem with Unreal Engine, a game engine used to create 3D characters and visuals.
From there, the data is sent over to the stage – but probably not how you expect. Instead of visualising the avatar moving and singing in 3D, judges and the audience watch the performance on several screens spread across the room, where the avatar and the real world – broadcast by 14 cameras – are blended exactly as we see on television. The stage is actually empty.
“A lot of things have to work together [to have everything functioning correctly],” Zinman said. “There was a lot of time spent on making sure that we challenged ourselves, but we also understood what the limit was going to be as well with 2021 technology.”
On top of this, Alter Ego’s crew also suffered from the semiconductor shortage that has been ongoing since 2020, making it even more challenging to deliver a high-quality show that relies on innovation. “It was not easy to do [the show] in 2021 because there’s a semiconductor shortage, and getting equipment and pieces was very difficult to do. Also, we had massive export costs because you’re paying four times what you should be paying because of the shortage,” Zinman explained.
Even though Zinman and his team at Fox are proud to do such an innovative show, there are intrinsic flaws that need to be addressed – both because of the timetable to finish the competition and the fact that Alter Ego is the first real-time avatar show on television.
“Right now, there is a good half-a-second delay, which is natural because, at the end of the day, you’re asking a computer in 2021 to do a lot,” he explained. “So getting over that delay [is the most urgent issue]. We think we’ve a way of getting down to half that, which would be a great help.”
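To put those figures in perspective, the delay can be expressed in video frames. The short Python sketch below does that arithmetic using the half-second delay and the "half that" target from the quote. The frame rate of 30 frames per second is an assumption made purely for illustration, not a figure from the show, and the function name is hypothetical.

```python
def delay_in_frames(delay_seconds: float, frames_per_second: float) -> float:
    """Convert a delay in seconds into the equivalent number of video frames."""
    return delay_seconds * frames_per_second


# Figures taken from the interview.
CURRENT_DELAY_S = 0.5                  # "a good half-a-second delay"
TARGET_DELAY_S = CURRENT_DELAY_S / 2   # "getting down to half that"

# ASSUMPTION: 30 frames per second, chosen only for illustration.
ASSUMED_FPS = 30.0

current_frames = delay_in_frames(CURRENT_DELAY_S, ASSUMED_FPS)
target_frames = delay_in_frames(TARGET_DELAY_S, ASSUMED_FPS)

print(f"Current delay: {CURRENT_DELAY_S:.2f} s = {current_frames:.1f} frames")
print(f"Target delay:  {TARGET_DELAY_S:.2f} s = {target_frames:.1f} frames")
```

Under that assumed frame rate, the avatar currently trails the singer by about 15 frames, and halving the delay would bring the gap down to 7.5 frames.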
Facial motion capture, which has drawn some criticism from the entertainment media, is another area where Zinman’s staff believe there is room for better results.
“We’re working on the face now to get more emotions and get it more fluid, less choppy. By the time we got to halfway through the season, the choppiness is night and day compared to what it was in the first six episodes. But to me we can improve the facial motion capture,” he shared.
But not everyone understands the amount and type of work required to deliver such a novel experience. Zinman says that a typical audience is not very aware of the limitations of today’s technology and infrastructure. “They are expecting [the show] to be a flawless, perfect-like movie ‘Avatar’. Obviously, you couldn’t do the show like ‘Avatar’ because that’s all been composited after, and then the audience on stage would not be engaged and be completely phoney,” he added.
That’s why the producer pays more attention to what the tech and developer community has to say about the technical aspects of the show. “I think Alter Ego is a home run for a lot of people because they also understand the tech. And they understand the flaws of 2021 technology. So they’re not nitpicking it because they’re looking at it from ‘wow, that’s not supposed to happen. How did they get that far?’”
What About 5G and 6G?
While the show doesn’t rely on wireless connectivity to transmit the data – except for one tracking camera – that doesn’t mean 5G and later technologies are out of contention when it comes to avatars and augmented reality. Far from it, actually.
“I think it’s all about speed and getting the latency down right now. So I believe wireless technology [5G and 6G] can do that. Obviously, there’s going to be a big win for all of us,” Zinman said.
For the entertainment field, he believes mobility will enable a whole new set of experiences, especially regarding live and in-person events. “In terms of broadcasting, I absolutely see that [5G and 6G] being a huge influence coming up, especially when you get to live [events]. I think if you want to do a Google Glass type of thing, where you have a full stadium of people wearing some kind of goggles and you’re seeing something virtually, you must have 6G or beyond.”