Revolutionizing Game Design with AI: Endless AI Super Mario Models
OpenAI's latest unveiling of its remarkable generative model, Sora, has expanded the boundaries of what can be achieved with text-to-video technology. Now, Google DeepMind introduces text-to-video games.
All images created with AI
Dubbed Genie, the new model can turn a brief description, a hand-drawn sketch, or a photograph into a playable video game in the style of classic 2D platformers like Super Mario Bros. Don't expect fast-paced action, though: the games run at one frame per second, compared with the 30 to 60 frames per second typical of most contemporary games.
"It's impressive work," says Matthew Guzdial, an AI researcher at the University of Alberta, who developed a comparable game generator a few years ago.
30,000 Hours of Game Training
Genie was trained on 30,000 hours of video footage of hundreds of 2D platform games sourced from the internet. Others have taken this approach before, says Guzdial. His own game generator learned from video data to create abstract platformers. Nvidia also used video data to train GameGAN, a model capable of generating replicas of games such as Pac-Man.
However, all these instances involved training the model using input actions, such as button presses on a controller, in addition to video footage. For instance, a video frame depicting Mario jumping was linked with the Jump action, and so forth. Tagging video footage with input actions requires considerable effort, which has constrained the availability of training data.
In contrast, Genie was trained on video footage alone. From that footage it learned which of eight possible actions would cause the game character in a video to change its position. This turned countless hours of existing online video into potential training data.
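The idea of recovering actions from video alone can be illustrated with a toy sketch. This is not Genie's method, which learns its action representation from raw video; the sketch assumes the character's position is already known in every frame and uses a hand-picked codebook of eight movement vectors (the names `CODEBOOK`, `nearest_action`, and `infer_actions` are hypothetical). Each frame-to-frame movement is assigned to the nearest codebook entry, which yields a discrete action label without any recorded button presses.

```python
import math

# Hypothetical codebook: eight (dx, dy) displacements, one per discrete action.
# Hand-picked for illustration; a real system would learn these from data.
CODEBOOK = [
    (0, 0),    # 0: stay
    (1, 0),    # 1: right
    (-1, 0),   # 2: left
    (0, 1),    # 3: up
    (0, -1),   # 4: down
    (1, 1),    # 5: up-right
    (-1, 1),   # 6: up-left
    (2, 0),    # 7: dash right
]

def nearest_action(dx, dy):
    """Return the index of the codebook entry closest to (dx, dy)."""
    return min(
        range(len(CODEBOOK)),
        key=lambda i: math.hypot(dx - CODEBOOK[i][0], dy - CODEBOOK[i][1]),
    )

def infer_actions(positions):
    """Label each consecutive pair of positions with a discrete action index."""
    actions = []
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        actions.append(nearest_action(x1 - x0, y1 - y0))
    return actions

# Character positions observed in five consecutive frames (made-up data).
track = [(0, 0), (1, 0), (2, 1), (2, 1), (1, 1)]
print(infer_actions(track))  # [1, 5, 0, 2]
```

The labels here carry no built-in meaning: index 1 only becomes "right" because that is the movement it consistently accompanies in the footage.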
Genie dynamically generates each new frame of the game based on the player's actions. When the player presses Jump, Genie promptly updates the current image to illustrate the game character jumping. Similarly, pressing Left triggers a change in the image to depict the character moving to the left. The game progresses incrementally, with each new frame being generated from scratch as the player engages with the game.
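The interaction described here amounts to a loop in which the current frame and the player's action go in and a complete new frame comes out. The following is a minimal sketch of that loop, with a trivial hand-written function standing in for the learned model. The function `toy_next_frame` and the text-grid frame format are assumptions made for illustration and are not part of Genie.

```python
WIDTH, HEIGHT = 8, 4

def render(x, y):
    """Build a complete frame (a list of strings) with the character at (x, y)."""
    rows = []
    for row in range(HEIGHT):
        line = ""
        for col in range(WIDTH):
            line += "@" if (col, row) == (x, y) else "."
        rows.append(line)
    return rows

def find_character(frame):
    """Locate the character by scanning the frame, which is the only state kept."""
    for row, line in enumerate(frame):
        col = line.find("@")
        if col != -1:
            return col, row
    raise ValueError("no character in frame")

def toy_next_frame(frame, action):
    """Stand-in for a learned model: previous frame + action -> new frame."""
    x, y = find_character(frame)
    if action == "left":
        x = max(0, x - 1)
    elif action == "right":
        x = min(WIDTH - 1, x + 1)
    elif action == "jump":
        y = max(0, y - 1)  # row 0 is the top of the screen
    return render(x, y)  # the whole frame is rebuilt, not patched in place

frame = render(0, HEIGHT - 1)  # start in the bottom-left corner
for action in ["right", "right", "jump", "left"]:
    frame = toy_next_frame(frame, action)
print("\n".join(frame))
```

The toy has no gravity or scenery. It is meant to show only the shape of the loop, in which nothing persists between steps except the most recent frame.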
Tim Rocktäschel, a research scientist at Google DeepMind leading the team behind the project, suggests that future versions of Genie may operate at a faster pace. He asserts, "There is no inherent limitation preventing us from achieving 30 frames per second. Genie leverages many of the same technologies as contemporary large language models, where significant progress has been made in enhancing inference speed."
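The gap between the two frame rates is easier to judge as time available per frame. The arithmetic below uses only the figures quoted in this article; the function name is an illustrative choice.

```python
def frame_budget_ms(fps):
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps

current_fps = 1  # Genie's reported speed
for target_fps in (30, 60):
    speedup = target_fps / current_fps
    print(
        f"{target_fps} fps: {frame_budget_ms(target_fps):.1f} ms per frame, "
        f"{speedup:.0f}x faster than {current_fps} fps"
    )
```

At one frame per second the model has a full 1,000 ms to produce each frame, so reaching 30 frames per second would mean generating frames roughly 30 times faster.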
Genie acquired insights into typical visual characteristics observed in platformers. A prevalent feature in many games of this genre is parallax, wherein the foreground moves sideways at a faster pace than the background. Genie frequently incorporates this effect into the games it produces.
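In hand-built games, parallax is commonly produced by scrolling each layer by a fraction of the camera's movement, with more distant layers given smaller fractions. Here is a minimal sketch of that calculation. The layer names and scroll factors are illustrative assumptions, not values taken from Genie or from any particular game.

```python
# Scroll factor per layer: 1.0 moves with the camera, smaller values lag behind.
# Names and factors are hypothetical, chosen only to illustrate the effect.
LAYERS = {
    "foreground": 1.0,
    "midground": 0.5,
    "background": 0.25,
}

def layer_offsets(camera_x):
    """Return how far each layer has scrolled for a given camera position."""
    return {name: camera_x * factor for name, factor in LAYERS.items()}

offsets = layer_offsets(camera_x=100)
for name, offset in offsets.items():
    print(f"{name}: scrolled {offset} px")
```

Genie has no such rule written into it. According to the article, it picked up the effect from the footage it was trained on.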
Although Genie remains an internal research endeavor and will not be made publicly available, Guzdial mentions that the Google DeepMind team has indicated its potential transformation into a game development tool—an area he's also exploring. "I'm certainly curious to witness their developments," he remarks.
Exploring Virtual Environments
However, the researchers at Google DeepMind are not solely focused on game generation. The team behind Genie is engaged in open-ended learning, a domain where AI-controlled bots are introduced into virtual environments and tasked with solving diverse challenges through trial and error—a methodology known as reinforcement learning.
In 2021, a separate DeepMind team developed XLand, a virtual playground where bots learn to collaborate on simple tasks like navigating obstacles. Sandboxes like XLand will play a pivotal role in training future bots to tackle various challenges before they are deployed in real-world scenarios. The video-game examples suggest that Genie could be used to generate virtual environments of this kind.
Others have also developed similar tools for world-building. For instance, David Ha at Google Brain and Jürgen Schmidhuber at the AI lab IDSIA in Switzerland created a tool in 2018 aimed at training bots in game-based virtual environments, known as world models. However, unlike Genie, these tools required training data to incorporate input actions.
The team also showed that Genie's ability to learn actions from unlabeled video could be useful in robotics. When shown videos of real robot arms manipulating various household objects, the model learned which actions the arm could perform and how to control it. In the future, robots could acquire new skills by watching instructional videos.
"It is challenging to anticipate the potential applications," remarks Rocktäschel. "We aspire for projects like Genie to eventually furnish individuals with innovative tools to unleash their creativity."