Role-playing activities offer opportunities for developing individuals’ creativity, communication, and problem-solving skills. Recent advances in large language models (LLMs) facilitate fluent conversations with machines. To investigate the benefits and pitfalls of LLMs in the relatively unexplored context of human-agent role-play as a culturally contextualised activity, a dataset of twelve human-agent interactions produced by two researchers with two state-of-the-art LLMs was annotated based on a frame analysis scheme from the literature. The pilot study shows that human-agent play has a complexity similar to that of human-human play, in which players simultaneously maintain identities of themselves, of external observers, and of play characters, going beyond the pretend-reality dualism. Results suggest that, while the LLMs can maintain and shift between roles, they play some roles better than others and display cultural and gender stereotypes. Additionally, the coding scheme shows potential to help identify LLM outputs that require embodied enactment, and to be used for LLM benchmarking for role-play.