Plotting Behind the Scenes: Towards Learnable Game Engines

Animation module comparison to baselines

We compare our animation model with the Playable Environments (PE) baseline on the task of reconstructing a video from the initial state and the actions of each player.

Minecraft

PE

Note the unrealistic player animations and the mismatch between the text prompts and the generated results.

Ours small

The full version of our model, trained with reduced computational resources matching those used for the baselines.

Ours

The full version of our model.

Tennis

PE

Note the unrealistic player animations, which result from the model's inability to capture the multimodal distribution of player poses conditioned on text.

Ours small

The full version of our model, trained with reduced computational resources matching those used for the baselines.

Ours

The full version of our model.