How might we help human technicians work better with AI-enabled buildings?

Every legacy interface has a feature that quietly holds everything together. Without user research, it’s often hard to guess which feature that is. To outside eyes, it might look clunky, ugly, or redundant, and easy to simplify or replace with something shiny and new.

While redesigning Myrspoven’s customer app, we cut “AI vs. Baseline,” a graph-heavy diagnostic tool, from the MVP to focus on other priorities, thinking its core functions had been covered elsewhere. But in early user testing, it became obvious the app couldn’t function without it. By rethinking the tool’s design, and later extending its patterns to other parts of the product, we turned it into the app’s most valued feature.

my contributions

UI design
UX research
Prototyping
Frontend project management

the goal

Myrspoven is a pioneer in AI-driven building optimization. Their core product uses an AI model to optimize a building’s HVAC system every 15 minutes, finding a balance between indoor comfort and energy savings. Users set min/max values to guide the AI’s output.
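Conceptually, those min/max values act as guardrails on whatever the model proposes. A minimal sketch of the idea (hypothetical names, not Myrspoven’s actual implementation):

```typescript
// Illustrative only: user-defined bounds constrain each AI-proposed setpoint.
interface SignalBounds {
  min: number; // user-defined floor, e.g. lowest allowed supply air temperature
  max: number; // user-defined ceiling
}

// Clamp the AI's proposed value into the technician-approved range.
function applyBounds(proposed: number, bounds: SignalBounds): number {
  return Math.min(bounds.max, Math.max(bounds.min, proposed));
}

// Example: the model proposes 23.5 °C, but the user capped the setpoint at 22 °C.
const supplyTempBounds: SignalBounds = { min: 18, max: 22 };
console.log(applyBounds(23.5, supplyTempBounds)); // 22
```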

Of course, no AI is perfect, especially right out of the box. Buildings are also complex: multiple overlapping control systems, unpredictable human behavior, and the occasional “tenant opened the window in January” moment mean human oversight is essential.

My task was to design a new customer-facing portal that would:

  • Encourage proactive check-ins
  • Reduce system malfunctions and tenant complaints
  • Increase trust in the AI
  • Decrease support load by enabling technicians to be more independent

process at a glance

Shipped first MVP features (not including AI vs. Baseline) based on spec, stakeholder feedback, and desk research
Internal usability testing showed users couldn’t complete key workflows without AI vs. Baseline
Designed and tested a new, highly scannable diagnostic interface
Extended the tool and applied its patterns to other areas when the anchor feature was dropped due to technical constraints
Continuous feedback and data collection, before and after product launch

finding a missing piece

While testing our initial MVP with internal users, I gave them two tasks and asked them to complete as much as possible using only the new portal:

  • A daily checkup on a building, scanning for problems
  • A diagnostic workflow, prompted by a customer complaint, in which they tried to find ways to save energy in the building

In both cases, users had to leave the new app and use the old “AI vs. Baseline” tool to finish.

Key insight: The new “happy path” laid out in the spec was a road to nowhere. Without some form of this tool in the MVP, technicians couldn’t do their jobs.

Left: A user demonstrates the old AI vs. Baseline interface during internal user testing.
Right: Post-its with sorted insights and tasks generated from testing are affinity mapped and prioritized.

designing the new ai vs. baseline

pain point: finding and filtering

In diagnostic workflows, users often knew which signals they needed to analyze but had trouble finding them in the endless scroll, often resorting to Ctrl+F.

Solution: Added search, chip filters, and a date picker to the top of the interface, helping users tailor the view to their current needs.

pain point: scannability

Repeated interface elements, slow loading, and the overall layout made it hard for users to scan quickly. This mattered in both workflows, but especially in the daily checkup.

Solution:

  • Tested two layouts internally: a cleaned-up single-column layout, and one that used small-multiple graphs for quick comparison
  • Small multiples performed best, so I refined that layout and handed it off
  • Graphs can be opened into a modal view for deeper analysis when needed, helping to reduce cognitive overload while allowing better access to supplemental data

pain point: supplemental data access

Supplemental data was hidden and tedious to access, and it was all plotted on a single graph despite different units and scales, resulting in a hard-to-read chart with effectively three different y-axes.

Solution:

  • Stacked individual graphs with a shared time axis and synchronized tooltips (see the sketch after this list)
  • Datasets could be added, removed, and toggled through chips and search
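
The synchronization is conceptually simple: whichever graph the pointer is over broadcasts the hovered timestamp, and every other stacked graph renders its tooltip at that same point in time. A rough TypeScript sketch of the pattern (hypothetical names, not the production charting code):

```typescript
// Hypothetical sketch of synchronized tooltips across stacked graphs.
// Each graph subscribes to a shared "hovered timestamp" and renders its own tooltip.

type HoverListener = (timestamp: number | null) => void;

class SharedTimeCursor {
  private listeners = new Set<HoverListener>();

  subscribe(listener: HoverListener): () => void {
    this.listeners.add(listener);
    return () => this.listeners.delete(listener); // unsubscribe handle
  }

  // Called by whichever graph the pointer is currently over (null = pointer left).
  setHover(timestamp: number | null): void {
    this.listeners.forEach((listener) => listener(timestamp));
  }
}

// Usage: every stacked graph shares one cursor, so hovering any of them
// moves the tooltips on all of them to the same moment in time.
const cursor = new SharedTimeCursor();

cursor.subscribe((t) => {
  // e.g. indoor temperature graph: look up its value at time t and show a tooltip
  console.log("temperature graph tooltip at", t);
});
cursor.subscribe((t) => {
  // e.g. energy use graph: same timestamp, its own value and units
  console.log("energy graph tooltip at", t);
});

cursor.setHover(Date.parse("2024-01-15T08:00:00Z")); // both tooltips update together
```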

business goal: help users manage the system correctly

A paradox came up in conversations with stakeholders and management-level users. We wanted users to proactively manage their buildings, but too much “tinkering” with the wrong values (specifically, those displayed in AI vs. Baseline) would make the AI less effective, validating technicians’ distrust in the system.

Solution:

I added a bit of friction back in. Values could be viewed in the graph modal, but not directly edited there. Instead, users could click through to edit the values in a different view and context, giving more time to consider if a change was helpful.

losing an anchor

As we began planning for launch, another wrinkle emerged. After months of development, it turned out that a key AI-driven feature — the one that took center stage in all the decks being shown to investors — simply wasn’t technically feasible as planned. In bleeding-edge AI products, these things happen, but this left a big gap in perceived product value.

I was able to make the case that, thanks to the new AI vs. Baseline, we were still delivering more than enough value to users by improving the presentation of the data they actually relied on. This would help users gradually trust and understand the AI as they measured its impact in the data views and language they were already familiar with.

Based on this insight, I began improving and extending other key but neglected data features, often reusing the design patterns from AI vs. Baseline. By focusing on meeting user needs instead of delivering flashy features, we shipped a product with measurable value to end users at launch.

Design win: Reusing design patterns and applying research insights to other features helped us build a rich, user-centered product even without the missing feature.

testing & metrics

I planned pre-launch usability testing and a way to monitor whether the app was meeting our goals by tracking metrics over time:

  • Pre-launch customer usability testing, targeting both engaged and skeptical technicians
  • Follow-up surveys with test group to monitor 2 metrics:
    • SUS (system usability scale) for overall usability (scoring sketched after this list)
    • TOAST (trust in automated systems test) for AI trust
  • Set up Matomo analytics in the new and old apps to monitor broad trends in usage frequency and workflows
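
The SUS number itself comes from the standard 10-item scoring formula rather than anything bespoke to this project; a quick sketch for reference:

```typescript
// Standard SUS scoring: 10 items answered on a 1-5 scale.
// Odd-numbered items are positively worded, even-numbered items negatively worded.
function susScore(responses: number[]): number {
  if (responses.length !== 10) {
    throw new Error("SUS expects exactly 10 responses (1-5 each)");
  }
  const sum = responses.reduce((acc, response, index) => {
    const isOddItem = index % 2 === 0; // index 0 is item 1
    return acc + (isOddItem ? response - 1 : 5 - response);
  }, 0);
  return sum * 2.5; // scaled to 0-100
}

// Example: a fairly positive respondent.
console.log(susScore([4, 2, 5, 1, 4, 2, 4, 2, 5, 1])); // 85
```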

outcome

  • Despite losing the original anchor feature, the new portal, including AI vs. Baseline and several other improved data features, launched in February 2024 with strong early user reception
  • The enriched AI vs. Baseline was described by users as the “most appreciated” feature in the new customer portal
  • Patterns from its design, informed by user research, improved other core workflows

reflection

  • This project reinforced that designing for experts means balancing density with clarity, and sometimes adding friction where it protects the system’s long-term performance.
  • It also clearly demonstrated that even a detailed brief can overlook critical needs, and that user research really is the fastest way to uncover those blind spots.
  • This experience, along with expanding my toolkit to include some scrappy, lower-demand research techniques, has made me more confident in pushing for user research earlier in the process to de-risk major assumptions before we build, hopefully leading to fewer plot twists in the future. •