Optimizing AI workflows to help Distyl’s team validate models effortlessly.

Distyl

ROLE

UI/UX Designer

DURATION

October — December 2024; 8 weeks

TEAM

A cross-functional team of 1 PM, 2 Designers, and 5 Developers

Context

DISTYL’S MISSION

Distyl develops enterprise AI solutions for non-tech Fortune 500 companies across healthcare, IT, and finance. But building powerful AI models isn’t just about the code; it’s about ensuring accuracy, reliability, and trust.


For Distyl’s engineers, optimizing an LLM (Large Language Model) requires constant evaluation. Unlike supervised learning, where answers are clear-cut, generative AI presents a unique challenge: defining what "good" looks like. Engineers needed a way to efficiently compare the model’s outputs against valid responses.
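To make the challenge concrete, here is a minimal, hypothetical sketch of the kind of golden-set check strategists were effectively doing by hand. Everything in it (the names, the data, the exact-match rule) is an illustrative assumption, not Distyl’s actual tooling; the point is that a naive equality check can’t express what "good" looks like for open-ended outputs:

```python
from dataclasses import dataclass

@dataclass
class GoldenExample:
    """One entry in a golden set: a prompt and its expected output."""
    prompt: str
    expected_output: str

def naive_validate(example: GoldenExample, current_output: str) -> bool:
    # Strict equality: the only "pass" is an exact string match.
    # Extra (but harmless) detail in the model's answer still fails,
    # which is exactly the kind of false negative described later.
    return current_output.strip() == example.expected_output.strip()

example = GoldenExample(
    prompt="List the patient's active medications.",
    expected_output="Metformin, Lisinopril",
)
# Correct content plus one extra clause is still marked wrong:
print(naive_validate(example, "Metformin, Lisinopril (as of last visit)"))  # False
```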

STAKEHOLDERS

For this project, I maintained continuous communication with our key stakeholders, Edward Chew and David Liusk, providing progress updates and gathering feedback throughout the eight weeks.

Distyl’s Challenge For Us

There was one primary problem Distyl introduced: the validation process for busy AI specialists at Distyl was slow, manual, and tedious.

SO WE ASKED

“How might we create a seamless, centralized platform for our AI strategists to manage data?”

GOAL

Our goal was to create an application that displays data for easy comparison between expected and current outputs, streamlining model refinement.

Discover

USER INTERVIEWS & PROBLEM FRAMING

We kicked off the project with in-depth interviews with AI strategists to map out workflows, frustrations, and ideal solutions.

TIMELINE

5 weeks

PARTICIPANTS

4 AI Strategists, 1 Data Scientist, 1 Engineer

@ Distyl

USER INTERVIEW

Who are they, what do they do for Distyl?

WALKTHROUGH

What does their current workflow look like, and what obstacles do they run into?

RESEARCH GOALS

01

Understand Existing Workflows

How do engineers currently validate AI outputs, and what tools do they rely on?

02

Identify Bottlenecks and Inefficiencies

Where do users face friction within their current workflow?

03

Determine Opportunities for Improvement

Where can we enhance the existing systems?

WHAT DISTYL EMPLOYEES SAID ABOUT THE CURRENT SYSTEM

We realized most Distyl employees relied on general spreadsheet tools, such as Excel, Google Sheets, or Notion, which were not dynamic enough to keep up with the specific use case of comparing LLM outputs.

AI Engineer, Will Morley

“With Excel, what often happens is long text in some cells ends up blowing up other cells. It’s not super useful seeing the entire SQL on the table, only when you’re actually comparing the two.”

“Keeping track of changes made to the data set is critical, including who made the changes and when. We always want to look back at old tests we’ve run.”

AI Strategist, Mariah Alf

“Additional information in a generated output would sometimes be incorrectly marked wrong, so we would need to go through each box in Excel and manually correct and revalidate each output.”

AI Strategist, Harshini Jayaram

WHITEBOARDING AND THINK-ALOUD EXERCISES

Taking our findings, we began whiteboarding the problem space and grouping key overlapping pain points in order to clearly define solutions.

With some of us being new to AI terminology and technology, aligning our understanding of the problem space with the right solution space was a challenge. As a multidisciplinary team of developers and designers, we filled one another’s knowledge gaps to ensure our insights were well-informed and actionable.

Define

THE PROBLEM SPACE

From our exercises, we clustered the pain points into three main patterns.

01

Poor Visual Clarity

When working with large datasets in Google Sheets, it’s difficult to identify key differences during comparison. This led to a time-consuming workflow that slowed AI model refinement and increased the risk of human error.

02

Rigid Data Management

Spreadsheets lacked flexibility, making it difficult to update multiple datasets of the same type simultaneously. The process relied heavily on individual manual reviews, which are prone to oversight.

03

Fragmented Collaboration

It’s challenging for AI strategists and their customers to align on evaluations and maintain a consistent, shared understanding of the data.

How Might We...

After identifying key pain points in the validation workflow, we focused on reframing these challenges into opportunities for design solutions.

differentiate from existing tabular data tools and create a validation experience tailored to golden sets?

design a system flexible enough to accommodate a wide variety of data sets and validation requirements?

improve data editing workflows to help AI specialists efficiently update and validate golden sets?

streamline information across teams to ensure key updates and issues are consistently reviewed?

THE SOLUTION SPACE

To address each pain point and HMW question, we established clear objectives to guide our design ideation.

PAIN POINT

SOLUTION

01

POOR VISUAL CLARITY

INTUITIVE STRUCTURED UI

Reduce complexity but maintain familiarity.

02

RIGID DATA MANAGEMENT

FILTERS AND GROUPING

Provide dynamic grouping and flagging.

03

FRAGMENTED COLLABORATION

HISTORY AND COMMENTS

Introduce version tracking and shared spaces for discussion.
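As a rough illustration of what "dynamic grouping and flagging" could mean in practice (the data shape and field names here are hypothetical, not the shipped feature), grouping rows by tag turns flagging an entire category into a single bulk action:

```python
from collections import defaultdict

# Hypothetical validation rows, each carrying user-defined tags.
rows = [
    {"id": 1, "tags": ["billing"], "flagged": False},
    {"id": 2, "tags": ["billing", "edge-case"], "flagged": True},
    {"id": 3, "tags": ["triage"], "flagged": False},
]

# Dynamic grouping: index every row under each of its tags.
groups: dict[str, list[dict]] = defaultdict(list)
for row in rows:
    for tag in row["tags"]:
        groups[tag].append(row)

# Flagging a group is one bulk update instead of cell-by-cell edits.
for row in groups["billing"]:
    row["flagged"] = True
```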

Design: MVP

DEFINING DESIGN PRINCIPLES FOR AN EFFECTIVE AI VALIDATION PLATFORM

Structured yet Flexible Data Management

Design an organized system that supports different data types.

Collaborative by Default

Enable seamless teamwork with features that keep all stakeholders aligned.

Transparent & Traceable

Provide clear version history and status tracking so that every change is logged, reducing errors and improving accountability.

Beyond a Spreadsheet

Enhance traditional tabular data tools by integrating validation-specific workflows that streamline decision-making.

CREATING AN MVP FOR ENGINEERS

Given the complexity of AI validation workflows, we focused on designing for immediate impact by delivering a minimum viable product to address the most pressing issues while working within our time constraints and engineering feasibility.


I prioritized delivering a foundational solution that our engineers could develop within the given timeline, ensuring core functionality such as a tabular layout, basic tagging, initial version history, and a side-by-side comparison.

Comparing input and output pairs

Global history of all data points
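Under the hood, the MVP’s core objects can be imagined roughly like this. The sketch below is a hypothetical data model (my own type and field names, not the production schema) covering the tabular row, basic tagging, and the append-only history that interviewees asked for:

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class HistoryEntry:
    """One logged change: who changed what, and when."""
    editor: str
    field_name: str
    old_value: str
    new_value: str
    timestamp: datetime = field(default_factory=datetime.now)

@dataclass
class DataPoint:
    """A single row in the validation table."""
    input_text: str
    expected_output: str
    current_output: str
    tags: list[str] = field(default_factory=list)
    history: list[HistoryEntry] = field(default_factory=list)

    def edit_expected(self, editor: str, new_value: str) -> None:
        # Edits are appended to the history, never overwritten, so
        # old tests stay reviewable (a key interview finding).
        self.history.append(HistoryEntry(
            editor, "expected_output", self.expected_output, new_value))
        self.expected_output = new_value
```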

Rethinking the Workflow

FROM FOUNDATION TO SOLUTION

Our MVP designs didn’t solve every pain point we uncovered in our discussions with AI strategists. While the MVP provided a solid foundation, feedback from Distyl highlighted opportunities for deeper collaboration tools and more flexible data management. Taking this feedback into account, I continued iterating on our designs, moving beyond the MVP toward a more comprehensive solution.

“Data points are typically always going through iteration, especially in the beginning of the project.” — AI Strategist

I had to rethink the workflow AI strategists needed to collaborate successfully across their iterative validation journey. To do this, I mapped out the life cycle of a data point, introducing a structured review-and-approval flow before dataset changes are finalized.

Life Cycle of a Data Point
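Read as a small state machine, the life cycle might look like the sketch below. The state names are my approximation of the diagram rather than Distyl’s exact labels: edits send a data point into review, and only an approval finalizes it.

```python
from enum import Enum

class Status(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in_review"
    APPROVED = "approved"
    FLAGGED = "flagged"

# Allowed transitions in the review-and-approval flow (assumed).
TRANSITIONS = {
    Status.DRAFT: {Status.IN_REVIEW},
    Status.IN_REVIEW: {Status.APPROVED, Status.FLAGGED},
    Status.FLAGGED: {Status.IN_REVIEW},    # fix, then re-review
    Status.APPROVED: {Status.IN_REVIEW},   # later edits reopen review
}

def advance(current: Status, target: Status) -> Status:
    if target not in TRANSITIONS[current]:
        raise ValueError(f"Cannot move from {current.value} to {target.value}")
    return target

# A flagged data point must pass review again before approval:
state = advance(Status.FLAGGED, Status.IN_REVIEW)
state = advance(state, Status.APPROVED)
```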

Final Designs

I started by designing the workflow into the table view.

Tabular View of All Data Points

Filtering Data Points

Selecting Multiple Data Points to Assign

I then began building this structure into the Activity Page, which was previously known as the Global History Page.

Opening Activity Page

Opening a Cell to See Current vs. Expected Output

Approving With Inaccuracies

Version History of a Single Cell

Everything I Learned From This Project

01

DESIGNING FORWARDS, THEN BACKWARDS—THEN FORWARDS AGAIN

I started with an ambitious vision for what this product could be. But as our timeline shifted, I had to pause and rethink what actually mattered most to the people using it. I scaled things back into an MVP that was focused and realistic. That shift forced me to really differentiate between user needs and nice-to-haves. Once we locked in the essentials, I was able to build forward again, this time with more clarity.

02

UNDERSTANDING THE AI ECOSYSTEM

Before diving into this project, I was still new to AI workflows and terminology. I quickly realized that to design something truly useful, I had to understand our users’ process inside and out. That learning curve was steep and required a lot (like, a lot a lot) of questions and interviews—but in the end, it proved essential.

Team Distyl @ CodeLab Banquet

Team lunch after a work session!

Let's work together!