Sideproject Friday: LutherLLM

Building a Historical AI Avatar: Martin Luther in Godot

06.01.2025

The Vision

As someone passionate about both history and education, I’ve always been frustrated by how static and one-dimensional historical education can be. Reading about Martin Luther’s theological ideas is one thing, but what if you could actually engage with him in conversation?

The idea behind this project was to bridge the gap between historical texts and modern interactive learning. By combining LLM technology with character animation, we created an AI avatar that doesn’t just recite facts, but engages in meaningful dialogue about theology, reform, and faith – all while maintaining historical accuracy.

We also set out to build it all in one day. We like to challenge ourselves.

Technical Stack

  • Game Engine: Godot 4.3
  • AI Integration: OpenAI API with gpt-4o-mini model
  • Animation: Custom 2D sprite-based system with dynamic mouth shapes and expression blending
  • Primary Sources: Curated collection of Luther’s writings and contemporary documents
  • Configuration: Secure key management with environment and file fallbacks
  • Text Reveal: Optimized character-by-character animation system with natural timing

Development Journey

Phase 1: Foundation and AI Integration

One of our earliest challenges was finding the right “voice” for Luther. Initial attempts either produced responses that were too modern or too stiff and academic. The breakthrough came when we restructured the prompt to emphasize Luther’s role as a teacher and reformer.

A key technical decision was implementing a layered architecture for the LLM integration:
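
In outline, it looks something like the following minimal sketch (class, signal, and file names here are illustrative, not the project's actual ones): a provider-agnostic base class defines the interface, and each provider implements it.

```gdscript
# llm_client.gd – provider-agnostic base class (sketch; names illustrative)
class_name LLMClient
extends Node

# Emitted when a full response has arrived from the provider.
signal response_received(text: String)
# Emitted when a request fails, with a human-readable reason.
signal request_failed(reason: String)

# Subclasses implement the actual network call.
func send_message(_messages: Array) -> void:
    push_error("send_message() must be implemented by a provider subclass")
```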

The OpenAI client implementation includes robust configuration handling:
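
A condensed sketch of what that looks like (helper and section names are illustrative; the model name and the environment/file fallbacks are the ones described in this post):

```gdscript
# openai_client.gd – sketch of the provider implementation
class_name OpenAIClient
extends LLMClient

const API_URL := "https://api.openai.com/v1/chat/completions"
var _cached_key := ""  # cache the key to reduce file system access

func _get_api_key() -> String:
    if _cached_key != "":
        return _cached_key
    # 1. Environment variable takes precedence.
    _cached_key = OS.get_environment("OPENAI_API_KEY")
    if _cached_key == "":
        # 2. Fall back to a local config file kept out of version control.
        var config := ConfigFile.new()
        if config.load("res://openai_config.cfg") == OK:
            _cached_key = config.get_value("openai", "api_key", "")
    if _cached_key == "":
        push_error("No OpenAI API key found in environment or config file")
    return _cached_key

func send_message(messages: Array) -> void:
    var http := HTTPRequest.new()
    add_child(http)
    http.request_completed.connect(_on_completed.bind(http))
    var headers := PackedStringArray([
        "Content-Type: application/json",
        "Authorization: Bearer " + _get_api_key(),
    ])
    var body := JSON.stringify({"model": "gpt-4o-mini", "messages": messages})
    http.request(API_URL, headers, HTTPClient.METHOD_POST, body)

func _on_completed(result: int, _code: int, _headers: PackedStringArray, body: PackedByteArray, http: HTTPRequest) -> void:
    http.queue_free()
    if result != HTTPRequest.RESULT_SUCCESS:
        request_failed.emit("HTTP request failed")
        return
    var data: Variant = JSON.parse_string(body.get_string_from_utf8())
    if data == null:
        request_failed.emit("Could not parse API response")
        return
    response_received.emit(data["choices"][0]["message"]["content"])
```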

We also created a template configuration file to guide setup:
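
The template ships with a placeholder so the real key never enters version control; something along these lines (section and key names are illustrative):

```
; openai_config.cfg.template – copy to openai_config.cfg and insert your key
[openai]
api_key = "sk-..."
```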

This abstraction proved valuable as it allowed us to:

  • Easily switch between different LLM providers
  • Test with different models (starting with gpt-4o-mini)
  • Maintain consistent interface for the rest of the application
  • Securely handle API keys with multiple fallback options

We quickly built a simple system on top of the OpenAI API and added this layer of abstraction so we can easily switch between LLM providers and models. That way, the project gets smarter and cheaper as models improve.

Some prompt engineering got us to a reasonable level of Martin Luther-ness. After some experimentation, we found that prompting for a teacher roleplaying as Martin Luther works best: framed that way, the system doesn't get easily confused by modern topics and steers the conversation back to Luther's theological views.

Here’s an example of the system prompt we use:
```
You are roleplaying as Martin Luther, the 16th-century Protestant Reformer. Stay in character while being aware that this is a historical portrayal. Your responses should reflect Luther's theological views, personality, and historical context.

Core Character Traits:
– Strong convictions about faith, scripture, and salvation through grace
– Direct and passionate communication style
– Scholarly but able to speak to common people
– Known for both serious theological discourse and witty remarks
```

The animation commands are integrated into the prompt, allowing the LLM to control character expressions and movements naturally:
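
As a rough sketch of the idea (the actual tag names may differ), the prompt declares a small set of bracketed commands the model may emit inline; the client strips them from the displayed text and routes them to the animation manager:

```
You may include animation commands in square brackets in your responses.
Available commands:
[STERN], [JOYFUL], [THOUGHTFUL] – set the facial expression
[NOD], [GESTURE] – trigger a one-shot gesture

Example: [STERN] Indulgences cannot buy salvation! [GESTURE]
```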

Phase 2: Animation System

The animation system has evolved significantly through several iterations:

  1. Initial version: Simple mapping of characters to mouth shapes
  2. Enhanced version: Phoneme-based matching with timing controls
  3. Current version: Context-aware system with dynamic transitions

The animation manager now handles all aspects of character animation:
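
A condensed sketch of the mouth-shape handling (sprite paths and phoneme groupings are illustrative). Note the use of load() rather than preload(), which comes up again in the challenges section below:

```gdscript
# animation_manager.gd – condensed sketch of the mouth-shape logic
class_name AnimationManager
extends Node

@export var mouth_sprite: Sprite2D

# load() instead of preload() so shapes resolve at runtime;
# the groupings below are a rough phoneme mapping.
var _mouth_shapes := {
    "open": load("res://sprites/mouth_open.png"),
    "closed": load("res://sprites/mouth_closed.png"),
    "round": load("res://sprites/mouth_round.png"),
}

func set_mouth_for_char(c: String) -> void:
    var shape: Texture2D = _shape_for(c.to_lower())
    # Tween alpha through the swap instead of snapping between frames.
    var tween := create_tween()
    tween.tween_property(mouth_sprite, "modulate:a", 0.6, 0.03)
    tween.tween_callback(func(): mouth_sprite.texture = shape)
    tween.tween_property(mouth_sprite, "modulate:a", 1.0, 0.03)

func _shape_for(c: String) -> Texture2D:
    if c in "aei":
        return _mouth_shapes["open"]
    if c in "mbp ":
        return _mouth_shapes["closed"]
    return _mouth_shapes["round"]
```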

The text reveal system has been optimized for smooth animation and natural pacing:
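
In outline it looks something like this (node paths and method names are illustrative); the await-based loop is what makes the reveal cancellable:

```gdscript
# Sketch of the character-by-character reveal (part of the dialogue UI)
extends Control

@onready var animation_manager: AnimationManager = $AnimationManager
@onready var label: RichTextLabel = $DialogueLabel

var _cancelled := false

func reveal_text(full_text: String) -> void:
    _cancelled = false
    label.text = full_text
    label.visible_characters = 0
    for i in full_text.length():
        if _cancelled:
            label.visible_characters = -1  # -1 shows the full text at once
            return
        label.visible_characters = i + 1
        var c := full_text[i]
        animation_manager.set_mouth_for_char(c)
        # Punctuation gets longer pauses for a natural speech rhythm.
        var delay := 0.25 if c in ".!?" else (0.1 if c == "," else 0.03)
        await get_tree().create_timer(delay).timeout

func cancel_reveal() -> void:
    _cancelled = true
```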

Recent improvements to the animation system include:

  • Phoneme-based mouth shape mapping for more accurate lip sync
  • Smooth transitions between mouth shapes using tweens
  • Context-aware shape selection considering surrounding characters
  • Proper timing for natural speech rhythm
  • Cancellable animations for better user interaction
  • Debug logging for animation state tracking
  • Type safety improvements throughout the system

The system now provides a more natural and fluid animation experience, with mouth shapes that better match the spoken text and smoother transitions between states.

Phase 3: Educational Features

We implemented a dynamic objective tracking system that:

  1. Monitors conversation topics in real-time
  2. Maps discussions to predefined learning objectives
  3. Provides contextual prompts to guide the conversation

The learning objectives are structured around key theological concepts:
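
A sketch of how that can be laid out in data (the topics and keywords below are illustrative; the tracker simply scans each new message for keyword hits):

```gdscript
# Sketch of the objective data and the per-message check
var learning_objectives := {
    "justification_by_faith": {
        "description": "Salvation through grace alone, not works",
        "keywords": ["grace", "faith", "works", "salvation"],
        "completed": false,
    },
    "ninety_five_theses": {
        "description": "The critique of indulgences that sparked the Reformation",
        "keywords": ["indulgence", "theses", "Wittenberg"],
        "completed": false,
    },
}

func update_objectives(message: String) -> void:
    var text := message.to_lower()
    for id in learning_objectives:
        var objective: Dictionary = learning_objectives[id]
        for keyword in objective["keywords"]:
            if keyword.to_lower() in text:
                objective["completed"] = true
                break
```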

The resource manager handles primary source integration:
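
The relevance scoring is straightforward keyword overlap; a minimal sketch (the threshold and the resource dictionary shape are illustrative):

```gdscript
# Sketch of relevance scoring and suggestion for primary sources
func score_resource(keywords: Array, conversation: String) -> float:
    var text := conversation.to_lower()
    var hits := 0
    for keyword in keywords:
        if keyword.to_lower() in text:
            hits += 1
    return float(hits) / max(keywords.size(), 1)

func suggest_sources(resources: Array, conversation: String) -> Array:
    # Keep sources above a relevance threshold, best matches first.
    var scored := resources.filter(func(r): return score_resource(r["keywords"], conversation) >= 0.3)
    scored.sort_custom(func(a, b): return score_resource(a["keywords"], conversation) > score_resource(b["keywords"], conversation))
    return scored
```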

The system automatically suggests relevant primary sources and tracks progress without interrupting the natural flow of conversation. The tabbed interface allows users to easily switch between the conversation and educational resources.

Technical Challenges Overcome

  1. API Key Security

    • Implemented multiple fallback options for API key storage
      • Environment variable (OPENAI_API_KEY)
      • Configuration file (openai_config.cfg)
    • Added key caching to reduce file system access
    • Created template system for safe version control
    • Implemented proper error handling for missing keys
    • Added debug logging for key loading process
  2. Response Streaming

    • Developed custom streaming solution for real-time text reveal
    • Implemented character-by-character animation triggering
    • Added variable timing based on character context
    • Optimized performance for smooth animation
    • Implemented proper pause timing for punctuation
    • Added cancellable reveal for better user experience
    • Created signal system for UI synchronization
  3. Resource Management

    • Created relevance scoring algorithm for primary sources
    • Implemented URL validation and security checks
    • Developed caching system for frequently accessed resources
    • Added type safety for string arrays and dictionaries
    • Improved keyword matching with term variations
  4. Animation System

    • Resolved mouth shape loading issues by switching from preload to load
    • Fixed type mismatches in animation manager
    • Optimized sprite transitions for smoother animation
    • Added debug logging for animation state tracking
    • Improved timing system for more natural speech rhythm
    • Implemented context-aware mouth shape selection

Future Development

  1. Voice Synthesis Integration

    • Investigating integration with ElevenLabs for period-appropriate voice
    • Planning lip-sync improvements for voice output
  2. Enhanced Animation System

    • Implementing gesture blending for smoother transitions
    • Adding emotional state machine for more natural expressions
    • Improving mouth shape transitions with interpolation
    • Adding more varied and dynamic expressions
    • Implementing advanced timing algorithms for speech patterns
    • Adding support for emphasized syllables and stress patterns
  3. Educational Features

    • Developing a curriculum builder for teachers
    • Adding support for custom primary sources
    • Creating a progress tracking dashboard
    • Implementing a more sophisticated relevance scoring system
    • Adding support for multiple languages and translations
  4. Security and Configuration

    • Implementing encrypted storage for sensitive data
    • Adding user-specific configuration profiles
    • Improving error handling and recovery
    • Enhancing logging and debugging tools

Lessons Learned

  1. AI Integration

    • Careful API key management is critical – we implemented multiple fallback options (environment variables, config files) and proper validation
    • Prompt engineering requires extensive testing – we went through many iterations to find the right balance of historical accuracy, educational value, and natural conversation flow
    • Configuration management needs to balance security with ease of use
  2. Animation Systems

    • Real-time animation is deceptively simple to implement at a basic level, but achieving natural, fluid motion requires significant expertise and iteration
    • Performance wasn’t a bottleneck since we kept the system lightweight, focusing on essential animations rather than complex blending
    • Type safety in Godot 4 requires careful attention, especially when working with arrays and string operations
    • Timing is crucial for natural-feeling animation – small adjustments can make a big difference
    • Context-aware animation decisions lead to more realistic results
  3. Educational Design

    • While we built a foundation for learning features, we didn’t have time to thoroughly test them with real users
    • The dynamic resource system proved more valuable than initially expected, helping users dive deeper into topics naturally
    • Balancing historical accuracy with accessibility requires careful consideration
    • Primary sources need clear context and explanation to be useful

Conclusion

This project demonstrates how game development technology can be combined with AI to create engaging educational experiences. The Martin Luther avatar shows that historical education doesn’t have to be static – it can be interactive, personalized, and still maintain historical accuracy.

The combination of LLM technology with real-time animation and educational features creates a unique learning environment that engages users while maintaining historical authenticity. While there’s still room for improvement, particularly in animation sophistication and educational feature testing, the foundation is solid for future development.


Jonas Heinke
