LAK 2026 Workshop

LLM Psychometrics

Applying Assessment Theories to Evaluate LLMs and LLM-Supported Learners

Bergen, Norway
April 27, 2026

Large Language Models (LLMs) are increasingly used for tasks traditionally performed by human learners, such as reading, writing, problem solving, and programming. While current evaluations rely heavily on benchmarks, mature frameworks from educational measurement – such as Item Response Theory (IRT), cognitive diagnostic models (e.g., DINA), and learning taxonomies – offer principled approaches for understanding their capabilities and limitations. This workshop will explore how these theories can inform the evaluation of LLMs and human-AI collaboration, highlight divergences and alignments with human learning processes, and address concerns around responsible AI use in education.
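As a concrete illustration of the kind of analysis the workshop has in mind, the sketch below fits a two-parameter logistic (2PL) IRT model to a synthetic matrix of right/wrong responses, treating LLMs as test takers and benchmark questions as items. The data, dimensions, and joint maximum-likelihood fit are illustrative assumptions for this sketch only, not a method prescribed by the workshop.

    # Minimal sketch (illustrative only): fitting a 2PL IRT model where the
    # "test takers" are LLMs and the items are benchmark questions.
    # The synthetic data and joint maximum-likelihood fit are assumptions made
    # for this example; in practice one would fix the latent scale or use
    # marginal ML estimation (e.g., as in standard IRT packages).
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)
    n_models, n_items = 5, 20                     # e.g., 5 LLMs, 20 benchmark items
    true_theta = rng.normal(size=n_models)        # latent "ability" of each model
    true_a = rng.uniform(0.5, 2.0, size=n_items)  # item discrimination
    true_b = rng.normal(size=n_items)             # item difficulty
    p = 1 / (1 + np.exp(-true_a * (true_theta[:, None] - true_b)))
    responses = rng.binomial(1, p)                # 1 = item answered correctly

    def neg_log_lik(params):
        # Unpack abilities and item parameters from one flat vector.
        theta = params[:n_models]
        a = params[n_models:n_models + n_items]
        b = params[n_models + n_items:]
        logits = a * (theta[:, None] - b)
        # Bernoulli log-likelihood of the observed 0/1 response matrix.
        return -np.sum(responses * logits - np.log1p(np.exp(logits)))

    x0 = np.concatenate([np.zeros(n_models), np.ones(n_items), np.zeros(n_items)])
    fit = minimize(neg_log_lik, x0, method="L-BFGS-B")
    theta_hat = fit.x[:n_models]
    print("Estimated model abilities:", np.round(theta_hat, 2))

The estimated abilities place models on a common latent scale, one way that benchmark results might be compared across item pools instead of reporting raw accuracy alone.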

Our workshop goals are to (1) advance theory-grounded evaluation of LLMs and LLM-supported learners, (2) connect researchers across learning analytics, educational measurement, and AI, and (3) surface actionable patterns in LLM performance and translate them into practices for responsible assessment and human–AI collaboration.

Call for Papers

We accept two types of contributions: (1) short empirical work-in-progress papers, and (2) short discussion papers that provoke debate on key issues and challenges in assessing LLMs and LLM-supported learners.

Topics of interest include:

  • Application of educational measurement theories and methods to understanding AI and human-AI learning and performance
  • Systematic comparisons of human vs AI learner performance
  • Systematic comparisons of human-human vs human-AI team performance
  • Adaptation of conceptual frameworks for designing tasks involving AI and human-AI learners
  • Methods for detecting AI participation in solving complex tasks, including detection of GenAI-enabled cheating, misuse, and implications for academic integrity
  • Methods for studying AI knowledge representation and its impact on learning and performance
  • Error analyses of AI performance on complex tasks that provide insights into AI knowledge, skills, and learning
  • Ethical considerations and limitations of using human-based psychometric models to study LLMs

Important Dates

  1. Submission deadline

    Dec 4, 2025

  2. Notification

    Dec 19, 2025

  3. Camera-ready

    Jan 12, 2026

  4. Workshop day

    Apr 27, 2026

* All dates are in local conference time.

Submission Guidelines

  • Length: 4–6 pages (not including tables, figures, references, acknowledgements, AI declarations, and ethics statements)
  • Format: CEUR Workshop Proceedings template
  • Review: Double-blind
  • Proceedings: Accepted papers will be presented during the workshop and published in CEUR-WS open workshop proceedings (workshop papers are not included in the Companion Proceedings of LAK2026).

Program Schedule

  • 09:00–09:15
    • Welcome and overview of workshop goals
    • Introduction of organizers and agenda
    • Brief participant introductions and ice-breaker
  • 09:15–10:00

    Keynote: Alina von Davier, Chief of Assessment, Duolingo (talk + Q&A)

  • 10:00–11:00

    Presentations (12 min + 2–3 min Q&A)

  • 11:00–11:15

    Coffee break

  • 11:15–12:15
    • Introduction to the activity
    • Group discussions on sub-themes (evaluating LLMs, assessment of hybrid human+AI work, theoretical frameworks, assessment design, etc.)
    • Groups report back (Plenary discussion)
  • 12:15–13:00

    Flash talks and poster session

  • 13:00–13:30

    Summary, next steps (further collaboration, etc.), closing
