Enhancing Virtual 3D Worlds: A Novel Approach Using Stable Diffusion and ControlNet for Real-Time Visual Complexity
Summary
This thesis explores how an AI-driven visualization pipeline can support level designers during the early phases of 3D game level design. Using Stable Diffusion combined with ControlNet in an accessible Unity-based tool, we aimed to streamline and enhance both the efficiency and creative output of traditional White Boxing and Set Dressing workflows. A qualitative and quantitative user study with twenty students from Utrecht University and the Hogeschool voor de Kunsten Utrecht, all experienced in game and level design, compared the demo tool to both drawing on paper and a bare txt2img Stable Diffusion workflow. Results show that our implementation gave participants a significantly stronger sense of control and greater visual expression than the bare Stable Diffusion workflow, while offering a sense of control and capacity for visual expression comparable to manual drawing, without requiring strong drawing or prompting skills. Our implementation also enabled quick iteration, with lower wall-clock times across two level design iterations. Both AI-based approaches were equally easy to use, suggesting that adding spatial control features did not negatively affect usability. Qualitative responses highlighted several obstacles participants faced when drawing on paper, underscoring the practical advantages of AI-driven tools in creative prototyping contexts. Despite some limitations, such as missing quality-of-life features and AI-rendering quirks like objects being merged or missing from the final image, the pipeline shows strong potential for enhancing creative workflows in the early phases of level design and game development as a whole.