Pulze

Research

2 Days

Breaking it down

Because this was a startup, the team wanted to move quickly and redesign the Playground ASAP. As a compromise, I suggested splitting the redesign into two phases:

• Phase 1 (1 week): Quickly touch up some obvious UX/UI issues, allowing users to gain a better understanding of what the product does.
• Phase 2 (2 weeks): Test with actual customers and flesh the design out to be a more scalable solution for the future of the product.

Understanding our users

I met daily with the CEO to communicate my research findings and validate any assumptions I had about the redesign. The CEO mentioned that one of the platform's major characteristics is that it should be easy enough for “anyone” to use. This referred to people with very little knowledge of the detailed workings of LLMs, such as executives, one of the company's main target audiences. With this in mind, I was able to expand my feedback circle to family, friends and former colleagues and gain a better starting point regarding intuitiveness.

Heuristic analysis

Since this was a redesign project, I wanted to begin by conducting a heuristic analysis of the existing Playground. Then, I spoke with the CEO and several members of my team to get their opinions on what they felt could be improved.

I combined all of my findings and found commonalities in the responses: a lack of intuitiveness and a disconnect between the Playground and the rest of the platform (Dashboard).

Key takeaways

• Not very intuitive.
• No way to connect to app creation workflow.
• Inability to compare prompt results.
• Parameters of results are hidden by default.
• Unable to share configurations with others.
• Lots of wasted screen space.

Design

17 Days

Where to start?

A big factor in the lack of intuitiveness was that users did not understand what to do first on the screen. There was no clear instruction or visual hierarchy.

Original

The positioning and space between the adjustable weights and the prompt entry field left users feeling confused about where to begin on the screen and the connection between the two elements. This also caused users’ eyes to travel up and down the page repeatedly to properly utilize the elements together.

Redesign

Here, I wanted to take advantage of an F-pattern visual hierarchy, which reflects that people generally read screens from left to right and top to bottom. By moving the prompt entry field and examples to the top left of the screen, users’ eyes would naturally gravitate to the starting point of the screen.

Conversation UI

Most users assumed the Playground would work like the conversational interface of ChatGPT, but they were taken aback and confused once they tried to enter multiple prompts.

Original

The responses for each respective LLM were separated by tabs. The interface only showed one prompt result at a time, so it was difficult for users to compare the results of a single LLM, much less the results of multiple LLMs.

Redesign

To minimize clicks and allow users with little technical knowledge to easily make comparisons, I chose to show the results of the top 3 recommended LLMs all at once. We also highlighted the “Top Choice”.

Understanding your results

Parameters such as latency, cost per token, temperature (creativity) and response time are important variables for comparing LLMs and determining which LLM is “best”.

Original

Parameters were hidden by default behind an unconventionally styled button. Users had to click the button to view the parameters for each response.

Redesign

Showing the parameters by default made comparisons easier for developers, who primarily used this info to assess the efficiency of the model. I also split the results into 2 tabs: Chat and Compare. The Compare tab showed a UI that made it much easier to compare all necessary attributes of the models side-by-side.

User feedback (from Phase 1 iteration)

Finally, I was able to get in touch with some users in our Slack community to ask them about their experience with the updated Playground design and understand any remaining pain points.

Feedback

• Users likened the Playground to ChatGPT and expected the prompt entry field to be similar in position.
• Users preferred the concept, but showing all 3 results at once made the screen too cluttered.
• “Top Choice” made a few users uncomfortable and hesitant to continue because they felt like they were “being sold something”.

Additional changes

One interesting change requested by the CEO was to quickly rebrand the entire website and the platform to align more with developers. In doing so, we decided to switch everything to a dark theme with a more "dynamic" primary brand color. This meant that we also needed to account for accessibility with the new color palette, according to WCAG guidelines.
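To illustrate the kind of check this involves, here is a minimal Python sketch of the WCAG 2.x contrast-ratio formula. The hex values in the example are hypothetical placeholders, not the actual Pulze palette, and the function names are my own. The 4.5:1 and 3:1 thresholds are the WCAG AA minimums for normal and large text.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to a linear value (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color written as '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0 (black on white)."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """True if the pair meets the WCAG AA minimum (4.5:1, or 3:1 for large text)."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold


# Hypothetical dark-theme pair: light gray text on a near-black surface.
text_color, surface_color = "#E6E6E6", "#121212"
print(round(contrast_ratio(text_color, surface_color), 2), passes_aa(text_color, surface_color))
```

Running a check like this over every text and background pairing in a palette is a quick way to flag combinations that need adjusting before they reach the final designs.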

Something familiar

It seems I underestimated the overwhelming influence of ChatGPT, as users immediately expected the Playground to have a similar layout and functionality. While the functionality isn't exactly the same, Jakob's Law states that users prefer interfaces to work similarly to interfaces that they are already familiar with. As a result, I adjusted the layout again for the final design.

Don't overwhelm, simplify

In speaking with users, I learned that most arrived on the Playground with a specific model in mind for testing. This led me to believe that they don't necessarily need to see the top 3 models we suggest, but maybe just the top option and how it compares to the models they're interested in. As an alternative, I adjusted the final designs so that users can manually select which models they want to test. Pulze also provides the model it thinks is best for them, so they can compare the best of both worlds.

Subtle for the win

The idea was to really emphasize that the platform had selected the "best" model for users, but it seems that the bold style of the "Top Choice" was too reminiscent of pricing charts and made users uncomfortable. To remedy this, I adjusted the design to give a more subtle highlight. This was a bit easier to accomplish with the new dark theme.
