

Optimizing AI workflows to help Distyl’s team validate models effortlessly.
Distyl
ROLE
UI/UX Designer
DURATION
October — December 2024; 8 weeks
TEAM
A cross-functional team consisting of 1 PM, 2 Designers, 5 Developers
Context

DISTYL’S MISSION
Distyl develops enterprise AI solutions for non-tech Fortune 500 companies across healthcare, IT, and finance. But building powerful AI models isn’t just about the code; it’s about ensuring accuracy, reliability, and trust.
For Distyl’s engineers, optimizing an LLM (Large Language Model) requires constant evaluation. Unlike supervised AI, where answers are clear-cut, unsupervised AI presents a unique challenge: defining what "good" looks like. Engineers needed a way to compare the model’s outputs against valid responses efficiently.
STAKEHOLDERS
For this project, I maintained continuous communication with our key stakeholders, Edward Chew and David Liusk, to provide updates on my progress and to seek feedback throughout those 8 weeks.
Distyl’s Challenge For Us
There was one primary problem Distyl introduced: the validation process for busy AI specialists at Distyl was slow, manual, and tedious.

SO WE ASKED
“How might we create a seamless, centralized platform for our AI strategists to manage data?”
GOAL
Our goal was to create an application that displays data for easy comparison between expected and current outputs, streamlining the model refinement process.
Discover
USER INTERVIEWS & PROBLEM FRAMING
We kicked off the project with in-depth interviews with AI strategists to map out workflows, frustrations, and ideal solutions.
TIMELINE
5 weeks
PARTICIPANTS
4 AI Strategists, 1 Data Scientist, 1 Engineer
@ Distyl
USER INTERVIEW
Who are they, what do they do for Distyl?
WALKTHROUGH
What does their current workflow look like, and what obstacles do they run into?
RESEARCH GOALS
01
Understand Existing Workflows
How do engineers currently validate AI outputs, and what tools do they rely on?
02
Identify Bottlenecks and Inefficiencies
Where do users face friction within their current workflow?
03
Determine Opportunities for Improvement
Where can we enhance the existing systems?
WHAT DISTYL EMPLOYEES SAID ABOUT THE CURRENT SYSTEM
We realized most Distyl employees rely on general spreadsheet tools, such as Excel, Google Sheets, or Notion, which were not dynamic enough to keep up with the specific use case of comparing LLM outputs.
AI Engineer, Will Morley
“With Excel, what often happens is long text in some cells ends up blowing up other cells. It’s not super useful seeing the entire cell on the table, only when you’re actually comparing the two.”


“Keeping track of changes made to the data set is critical, including who made the changes and when. We always want to look back at old tests we’ve run.”
AI Strategist, Mariah Alf
“Additional information in a generated output would sometimes be marked wrong incorrectly, so we would need to go through each box in Excel and manually correct and revalidate each output.”

AI Strategist, Harshini Jayaram
WHITEBOARDING AND THINKING-OUT-LOUD EXERCISES
Taking our findings, we began whiteboarding the problem space and grouping key overlapping pain points in order to clearly define solutions.

With some of us being new to AI terminology and technology, aligning our understanding of the problem space with the right solution space was a challenge. As a multidisciplinary team of developers and designers, we supported one another’s knowledge gaps in order to ensure our insights were well-informed and actionable.
Define
THE PROBLEM SPACE
From our exercises, we clustered our findings into three main patterns of pain points.
01
Poor Visual Clarity
When working with large datasets in Google Sheets, it’s difficult to identify key differences during comparison. This led to a time-consuming workflow that slowed down AI model refinement and increased the risk of human error.
02
Rigid Data Management
Spreadsheets lacked flexibility: it was difficult to update multiple datasets of the same type simultaneously. The process relied heavily on individual manual reviews, which were prone to oversight.
03
Fragmented Collaboration
It’s challenging to align on evaluations to maintain a consistent shared understanding of the data among AI strategists and their customers.
How Might We...
After identifying key pain points in the validation workflow, we focused on reframing these challenges into opportunities for design solutions.
differentiate from existing tabular data tools and create a validation experience tailored to golden sets?
design a system flexible enough to accommodate a wide variety of data sets and validation requirements?
improve data editing workflows to help AI specialists efficiently update and validate golden sets?
streamline information across teams to ensure key updates and issues are consistently reviewed?
THE SOLUTION SPACE
To address each of the pain points and HMW questions, we established clear objectives to guide our design ideations.
PAIN POINT
SOLUTION
01
POOR VISUAL CLARITY
INTUITIVE STRUCTURED UI
Reduce complexity but maintain familiarity.
02
RIGID DATA MANAGEMENT
FILTERS AND GROUPING
Provide dynamic grouping and flagging.
03
FRAGMENTED COLLABORATION
HISTORY AND COMMENTS
Introduce version tracking and shared spaces for discussion.
Design: MVP
DEFINING DESIGN PRINCIPLES FOR AN EFFECTIVE AI VALIDATION PLATFORM
Structured yet Flexible Data Management
Design an organized system that supports different data types.
Collaborative by Default
Enable seamless teamwork by incorporating features that keep all stakeholders aligned.
Transparent & Traceable
Provide clear version history and status tracking so that every change is logged, reducing errors and improving accountability.
Beyond a Spreadsheet
Enhance traditional tabular data tools by integrating validation-specific workflows that streamline decision-making.
CREATING MVP FOR ENGINEERS
Given the complexity of AI validation workflows, we focused on designing for immediate impact by delivering a minimum viable product to address the most pressing issues while working within our time constraints and engineering feasibility.
I prioritized delivering a foundational solution our engineers could develop within the given timeline, ensuring core functionality such as a tabular layout, basic tagging, initial version history, and a side-by-side comparison.

Comparing input and output pairs

Global history of all data points
Rethinking the Workflow
FROM FOUNDATION TO SOLUTION
Our MVP designs didn’t solve every pain point we uncovered in our discussions with AI strategists. While they provided a solid foundation, feedback from Distyl highlighted opportunities for deeper collaboration tools and more flexible data management options. Taking this feedback into account, I continued iterating on our designs, moving beyond the MVP toward a more comprehensive solution.
“Data points are typically always going through iteration, especially in the beginning of the project.” — AI Strategist
I had to rethink the workflow AI strategists needed in order to collaborate successfully throughout their iterative validation journey. To do this, I began mapping out the life cycle of a data point, implementing a structured review-and-approval flow before dataset changes are finalized.
Life Cycle of a Data Point

Final Designs
I started off by designing the workflow into the table.

Tabular View of All Data Points

Filtering Data Points


Selecting Multiple Data Points to Assign
I then began building this structure into the Activity Page, which was previously known as the Global History Page.

Opening Activity Page
Opening a Cell to See Current vs. Expected Output



Approving With Inaccuracies
Version History of a Single Cell
Everything I learned From This Project
01
DESIGNING FORWARDS, THEN BACKWARDS—THEN FORWARDS AGAIN
I started with an ambitious vision for what this product could be. But as our timeline shifted, I had to pause and rethink what actually mattered most to the people using it. I scaled things back into an MVP that was focused and realistic. That shift forced me to really differentiate between user needs and nice-to-haves. Once we locked in the essentials, I was able to build forward again, this time with more clarity.
02
UNDERSTANDING THE AI ECOSYSTEM
Before diving into this project, I was still new to AI workflows and terminology. I quickly realized that to design something truly useful, I had to understand our users’ process inside and out. That learning curve was steep and required a lot (like, a lot a lot) of questions and interviews—but in the end, it proved essential.


Team Distyl @ CodeLab Banquet
Team lunch after a work session!