dc.description.abstract | Large Language Models (LLMs) show promise for personalized storytelling, but generating age-appropriate content remains challenging because children develop rapidly in early childhood. This study investigates the use of GPT-4.1 mini to generate Dutch stories for five child age groups (0–1 to 8–9 years), using both text and image inputs. Developed with the organization Monkey Moves for integration into an interactive application, the system uses prompt engineering to adapt story content, vocabulary, and story-integrated physical activities to the child's age and context. The effectiveness of GPT-4.1 mini in adapting stories across age groups was assessed in terms of language, engagement, and activity suitability. LLM-based evaluations of 900 stories showed consistently high scores, while human ratings on a subset were more conservative yet still indicated general appropriateness. Key differences between the two evaluation methods emerged in the assessment of language use, engagement, and activity relevance. The results underscore GPT's potential for age-tailored story generation, while also highlighting the importance of human oversight in the evaluation process. This work contributes to child-centered AI and shows how prompt engineering and multimodal inputs can support language development and imaginative physical play. | |