How might we help human technicians work better with AI-enabled buildings?

Every legacy interface has a feature that quietly holds everything together. Without user research, it’s often hard to guess which feature that is. To outside eyes, it might look clunky, ugly, or redundant, and easy to simplify or replace with something shiny and new.

While redesigning Myrspoven’s customer app, we cut “AI vs. Baseline,” a graph-heavy diagnostic tool, from the MVP to focus on other priorities, thinking its core functions had been covered elsewhere. But in early user testing, it became obvious the app couldn’t function without it. By rethinking the tool’s design, and later extending its patterns to other parts of the product, we turned it into the app’s most valued feature.

my contributions

UI design
UX research
Prototyping
Frontend project management

the goal

Myrspoven is a pioneer in AI-driven building optimization. Their core product uses an AI model to optimize a building’s HVAC system every 15 minutes, finding a balance between indoor comfort and energy savings. Users set min/max values to guide the AI’s output.
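Conceptually, those min/max values act as guardrails on whatever the model proposes. A minimal sketch of the idea (hypothetical names, not Myrspoven’s actual implementation):

```typescript
// Illustrative only: user-defined bounds constrain each AI-proposed setpoint.
interface SignalBounds {
  min: number; // user-defined floor, e.g. lowest allowed supply air temperature
  max: number; // user-defined ceiling
}

// Clamp the AI's proposed value into the technician-approved range.
function applyBounds(proposed: number, bounds: SignalBounds): number {
  return Math.min(bounds.max, Math.max(bounds.min, proposed));
}

// Example: the model proposes 23.5 °C, but the user capped the setpoint at 22 °C.
const supplyTempBounds: SignalBounds = { min: 18, max: 22 };
console.log(applyBounds(23.5, supplyTempBounds)); // 22
```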

Of course, no AI is perfect, especially right out of the box. Buildings are also complex: multiple overlapping control systems, unpredictable human behavior, and the occasional “tenant opened the window in January” moment mean human oversight is essential.

My task was to design a new customer-facing portal that would:

  • Encourage proactive check-ins
  • Reduce system malfunctions and tenant complaints
  • Increase trust in the AI
  • Decrease support load by enabling technicians to be more independent

process at a glance

Shipped first MVP features (not including AI vs. Baseline) based on spec, stakeholder feedback, and desk research
Internal usability testing showed users couldn’t complete key workflows without AI vs. Baseline
Designed and tested a new, highly scannable diagnostic interface
Extended the tool and applied its patterns to other areas when the anchor feature was dropped due to technical constraints
Continuous feedback and data collection, before and after product launch

finding a missing piece

While testing our initial MVP with internal users, I gave them two tasks and asked them to complete as much as possible using only the new portal:

  • A daily checkup on a building, scanning for problems
  • A diagnostic workflow, prompted by a customer complaint, in which they tried to find ways to save energy in the building

In both cases, users had to leave the new app and use the old “AI vs. Baseline” tool to finish.

Key insight: The new “happy path” laid out in the spec was a road to nowhere. Without some form of this tool in the MVP, technicians couldn’t do their jobs.

Left: A user demonstrates the old AI vs. Baseline interface during internal user testing.
Right: Post-its with sorted insights and tasks generated from testing are affinity mapped and prioritized.

designing the new ai vs. baseline

pain point: finding and filtering

In diagnostic workflows, users often knew which signals they needed to analyze but had trouble finding them in the endless scroll, often resorting to Ctrl+F.

Solution: Added search, chip filters, and a date picker to the top of the interface, helping users tailor the view to their current needs.

pain point: scannability

Repeated interface elements, slow loading, and the overall layout made it hard for users to scan quickly. This mattered in both workflows, but especially in the daily checkup.

Solution:

  • Tested two layouts internally: a cleaned-up single-column layout, and one that used small-multiple graphs for quick comparison
  • Small multiples performed best, so I refined that layout and handed it off
  • Graphs can be opened into a modal view for deeper analysis when needed, helping to reduce cognitive overload while allowing better access to supplemental data

pain point: supplemental data access

Supplemental data was hidden and tedious to access, and it was all plotted on a single graph despite different units and scales, resulting in a hard-to-read chart with effectively three different y-axes.

Solution:

  • Stacked individual graphs with a shared time axis and synchronized tooltips (see the sketch after this list)
  • Datasets could be added, removed, and toggled through chips and search
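
The synchronization is conceptually simple: whichever graph the pointer is over broadcasts the hovered timestamp, and every other stacked graph renders its tooltip at that same point in time. A rough TypeScript sketch of the pattern (hypothetical names, not the production charting code):

```typescript
// Hypothetical sketch of synchronized tooltips across stacked graphs.
// Each graph subscribes to a shared "hovered timestamp" and renders its own tooltip.

type HoverListener = (timestamp: number | null) => void;

class SharedTimeCursor {
  private listeners = new Set<HoverListener>();

  subscribe(listener: HoverListener): () => void {
    this.listeners.add(listener);
    return () => this.listeners.delete(listener); // unsubscribe handle
  }

  // Called by whichever graph the pointer is currently over (null = pointer left).
  setHover(timestamp: number | null): void {
    this.listeners.forEach((listener) => listener(timestamp));
  }
}

// Usage: every stacked graph shares one cursor, so hovering any of them
// moves the tooltips on all of them to the same moment in time.
const cursor = new SharedTimeCursor();

cursor.subscribe((t) => {
  // e.g. indoor temperature graph: look up its value at time t and show a tooltip
  console.log("temperature graph tooltip at", t);
});
cursor.subscribe((t) => {
  // e.g. energy use graph: same timestamp, its own value and units
  console.log("energy graph tooltip at", t);
});

cursor.setHover(Date.parse("2024-01-15T08:00:00Z")); // both tooltips update together
```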

business goal: help users manage the system correctly

A paradox came up in conversations with stakeholders and management-level users. We wanted users to proactively manage their buildings, but too much “tinkering” with the wrong values (specifically, those displayed in AI vs. Baseline) would make the AI less effective, validating technicians’ distrust in the system.

Solution:

I added a bit of friction back in. Values could be viewed in the graph modal, but not directly edited there. Instead, users could click through to edit the values in a different view and context, giving more time to consider if a change was helpful.

losing an anchor

As we began planning for launch, another wrinkle emerged. After months of development, it turned out that a key AI-driven feature — the one that took center stage in all the decks being shown to investors — simply wasn’t technically feasible as planned. In bleeding-edge AI products, these things happen, but this left a big gap in perceived product value.

I was able to make the case that, thanks to the new AI vs. Baseline, we were still delivering more than enough value to users by improving the presentation of the data they actually relied on. This would help users gradually trust and understand the AI as they measured its impact in the data views and language they were already familiar with.

Based on this insight, I began improving and extending other key but neglected data features, often reusing the design patterns from AI vs. Baseline. By focusing on meeting user needs instead of delivering flashy features, we shipped a product with measurable value to end users at launch.

Design win: Reusing design patterns and applying research insights to other features helped us build a rich, user-centered product even without the missing feature.

testing & metrics

I planned pre-launch usability testing and a way to monitor whether the app was meeting our goals by tracking metrics over time:

  • Pre-launch customer usability testing, targeting both engaged and skeptical technicians
  • Follow-up surveys with test group to monitor 2 metrics:
    • SUS (system usability scale) for overall usability (scoring sketched after this list)
    • TOAST (trust in automated systems test) for AI trust
  • Set up Matomo analytics in the new and old apps to monitor broad trends in usage frequency and workflows
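
The SUS number itself comes from the standard 10-item scoring formula rather than anything bespoke to this project; a quick sketch for reference:

```typescript
// Standard SUS scoring: 10 items answered on a 1-5 scale.
// Odd-numbered items are positively worded, even-numbered items negatively worded.
function susScore(responses: number[]): number {
  if (responses.length !== 10) {
    throw new Error("SUS expects exactly 10 responses (1-5 each)");
  }
  const sum = responses.reduce((acc, response, index) => {
    const isOddItem = index % 2 === 0; // index 0 is item 1
    return acc + (isOddItem ? response - 1 : 5 - response);
  }, 0);
  return sum * 2.5; // scaled to 0-100
}

// Example: a fairly positive respondent.
console.log(susScore([4, 2, 5, 1, 4, 2, 4, 2, 5, 1])); // 85
```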

outcome

  • Despite losing the original anchor feature, the new portal, including AI vs. Baseline and several other improved data features, launched in February 2024 with strong early user reception
  • The enriched AI vs. Baseline was described by users as the “most appreciated” feature in the new customer portal
  • Patterns from its design, informed by user research, improved other core workflows

reflection

  • This project reinforced that designing for experts means balancing density with clarity, and sometimes adding friction where it protects the system’s long-term performance.
  • It also clearly demonstrated that even a detailed brief can overlook critical needs, and that user research really is the fastest way to uncover those blind spots.
  • This experience, along with expanding my toolkit to include some scrappy, lower-demand research techniques, has made me more confident in pushing for user research earlier in the process to de-risk major assumptions before we build, hopefully leading to fewer plot twists in the future. •