Centralized LLM Evaluation Dashboard for Psychological Consultation
A centralized dashboard developed in collaboration with BRIN to evaluate LLM outputs in psychological consultation use cases.

About the Client
Impact Summary
The Business Challenge
LLM output evaluation was previously conducted using spreadsheets, resulting in fragmented data and inconsistent assessment processes.
Evaluators lacked a structured system to assess LLM responses at both the message level and the overall conversation level.
Administrators had limited visibility into evaluator progress, evaluation summaries, and overall LLM quality.
The Solution
We developed a centralized LLM evaluation dashboard tailored for psychological consultation scenarios.
Evaluators review LLM chat outputs message by message, provide detailed feedback on each, and assign an overall quality rating to the conversation in a structured workflow.
Admin features include dialog management, evaluator assignment, user management, and aggregated evaluation summaries.
The system provides consolidated insights into LLM quality based on evaluator assessments.
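To illustrate how message-level feedback, overall ratings, and aggregated summaries could fit together, here is a minimal Python sketch. It is not the production implementation: the class names, field names, the 1-5 rating scale, and the `summarize` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional


@dataclass
class MessageEvaluation:
    """One evaluator's feedback on a single LLM message (hypothetical shape)."""
    message_id: str
    score: int            # assumed 1-5 scale
    feedback: str = ""


@dataclass
class ConversationEvaluation:
    """One evaluator's assessment of one assigned dialog (hypothetical shape)."""
    dialog_id: str
    evaluator_id: str
    overall_rating: Optional[int] = None   # None until the evaluator submits it
    messages: List[MessageEvaluation] = field(default_factory=list)

    @property
    def is_complete(self) -> bool:
        # An assignment counts as done once an overall rating exists.
        return self.overall_rating is not None


def summarize(evaluations: List[ConversationEvaluation]) -> dict:
    """Aggregate evaluator progress and average quality scores."""
    completed = [e for e in evaluations if e.is_complete]
    message_scores = [m.score for e in completed for m in e.messages]
    return {
        "assigned": len(evaluations),
        "completed": len(completed),
        "mean_overall_rating": (
            mean(e.overall_rating for e in completed) if completed else None
        ),
        "mean_message_score": mean(message_scores) if message_scores else None,
    }


# Illustrative data only: two finished evaluations and one pending assignment.
evaluations = [
    ConversationEvaluation("d1", "e1", 4, [
        MessageEvaluation("m1", 5, "Empathetic and on topic"),
        MessageEvaluation("m2", 3, "Advice too generic"),
    ]),
    ConversationEvaluation("d1", "e2", 2, [MessageEvaluation("m1", 4)]),
    ConversationEvaluation("d2", "e1"),
]
summary = summarize(evaluations)
print(summary)
```

Keeping pending assignments in the same list as completed ones lets a single pass report both evaluator progress and quality averages, which is the kind of consolidated view the admin summary is meant to provide.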
Key Implemented Features
- Message-Level LLM Output Evaluation
- Overall Conversation Rating
- Evaluator Assignment & Management
- Evaluation Summary & Quality Insights
- Centralized Evaluation Workflow
The Result
LLM evaluation workflows became more structured, consistent, and traceable.
Evaluation time was significantly reduced by replacing spreadsheet-based processes with a centralized system.
Researchers and administrators gained clear visibility into LLM performance quality and evaluation outcomes.