Xinyi He

UX Designer






Trustworthy GenAI

Designing an enterprise explainable GenAI framework to calibrate employee trust in GenAI decisions, focused on internal GenAI chatbots for work, with a reusable library of guidelines and UX patterns.



Role: UX Designer
Employer: BMW Group
Timeline: 7 Months
Collaboration: AI Stakeholders, UX Community, Key & End Users





The explainable GenAI framework








Problem

Within BMW, over 1,800 GenAI solutions have been deployed for everyday work. However, user testing showed that employees generally lacked trust in them and still stuck to traditional tools: around 27% of users abandoned the AI tools after a single use.

Additionally, the absence of unified, reusable standards has led to redundant development across departments, reducing efficiency and compromising experience consistency.





Approach

User Research


To deeply understand user needs and pain points, I applied the following methods:

  • Expert Interviews: I conducted in-depth interviews with five factory operation experts from the U.S. and Germany, as well as internal AI specialists. These sessions helped uncover expectations, concerns, and technical constraints regarding AI usage on the shopfloor.

  • Online Survey: I designed and distributed a quantitative questionnaire focusing on user trust and AI explainability, collecting cross-factory data on user attitudes and experience.

  • Desk Research: I reviewed insights from previous user research and benchmarked against leading AI product designs (e.g., Microsoft, Google, Baidu, Ant Design). I also studied cutting-edge research papers and LLM interaction models to extract four key UX design dimensions for AI: transparency, behavior, communication, and humanizing. These formed the theoretical foundation for the design phase.







Prototyping
To bring these abstract design principles into testable form, I implemented the following strategies:

  • User Journey Mapping: I selected a representative MVP use case to map out user goals, interaction paths, and key AI touchpoints. This helped identify behavior patterns and moments of trust breakdown.

  • Conversation Flow Design: I structured modular dialog patterns around the four core UX principles—e.g., including source references in AI responses (transparency), predictive prompts (behavior), clear adjustment logic (communication), and personalized language (humanizing).

  • Interactive Prototyping: Using Figma, I created low- to high-fidelity prototypes of AI interfaces to simulate key workflows and test usability with real users.



A glimpse of best practices applied to a real use case, following the framework guidelines.



UX Design Concepts
Throughout the design process, I focused on the following main goals:


Calibrating Trust















Avoiding both overtrust and mistrust by calibrating user trust to align with the AI’s actual capabilities.
User-centered Explanations








Focusing on clear, user-helpful explanations over technical details.
Increasing user control








Providing balanced control to users as AI becomes more automated.





Usability Testing & Iteration
We conducted three rounds of rapid usability testing within one week:

  • Round 1: Initial testing with users from multiple plants to observe first impressions of AI behavior, reasoning, and interface clarity.

  • Round 2: After refining the design based on round 1 feedback, we tested again with new users, measuring task completion time, satisfaction, and interface comprehension.

  • Round 3: Final prototype testing showed that most users rated the chatbot’s difficulty as 1–2 on a scale from 1 (very easy) to 10 (very difficult). Users generally expressed willingness to adopt the tool in their daily work.

  • Final Deliverable: Across three rounds, nine users were involved. Based on their feedback, we further refined input logic, clarification strategies, and personalization settings. The final prototype was production-ready and promoted internally.




Usability Testing Mapped Using Affinity Diagramming







Highlights




Design Principle

Humanization

Guideline Applied

Explain AI trustworthiness

Pattern Used

Adaptive Trust Indicator

Why

Avoiding overconfidence – Users tend to trust the AI after several correct answers. Pairing the indicator with the source document provides double verification and encourages critical thinking in an approachable way.
The AI confidence level is represented in a humanized form.






Expressing varying confidence levels

Design Principle

Interaction

Pattern Used

Progressive Disclosure

Why

A color-coded assessment enables quick evaluation, with more details on hover and optional links for deeper insights, ensuring transparency without overload.





❌ Don’t
Misleading Numbers - Assuming users understand probability, a 95% trust rating might make them think the answer is correct. However, it only reflects the AI’s confidence, not its actual accuracy.
Design mistakes from the past





Impact


  • Increased Trust: User trust in the AI assistant significantly improved. More employees showed interest in adopting the tool in their routine workflows.

  • Cross-Team Alignment: The designed GenAI UX framework was integrated into BMW’s internal design system and became part of the company’s GenAI platform “GAIA,” supporting around 2,000 chatbot use cases to date.

















© 2025 Xinyi He. All Rights Reserved.