My role

I led the design of this feature for ADEM (Autonomous Digital Experience Management) early this year. ADEM is a Palo Alto Networks offering that provides segment-wise insights for complete visibility, operational efficiency, and diagnosis of network issues that degrade user experience.

This feature marked a significant step in ADEM’s evolution into an end-to-end tool for custom diagnosis of issues and pinpointing the causes of poor application performance.

Customer Insights and Ideation

I partnered with three product managers at Palo Alto Networks and one product manager at Zoom to uncover insights from our research and translate them into features that address customer pain points and behavior.

Experience Strategy and Vision

I created a framework and prototypes to share the vision, design principles, and content strategy. This helped evangelize ideas and gain alignment across the platform, and laid out a roadmap for unique per-application dashboards.

Planning and Scope Definition

I defined the product with product manager partners. I evangelized customer goals and balanced them with business goals. I prioritized and negotiated features for launch and beyond.

Oversight and Coordination

I collaborated with one other designer to translate product features into detailed designs.

Design Execution and Validation

I executed journeys, wireframes, prototypes, and design specs.

Leadership

I presented work to gain buy-in from executives, senior stakeholders, and many other teams throughout the project lifecycle.

Final prototype


Some context on the problem

Remote work is set to expand substantially beyond pre-pandemic levels, and there’s a good reason why. During the pandemic, organizations largely experienced increased productivity from a workforce working primarily from home, and employees largely found that they like it.

Research reveals that 43 percent of employees expect more organizations to offer remote work after the pandemic comes to an end. Fully remote workers earn more than non-remote workers and report higher rates of job satisfaction and retention.


Challenges Created by Highly Distributed Environments

No doubt, these highly distributed environments are far more complex than they were in the past. According to research, 75% of organizations believe that their IT environments are more complex than they were just two years ago.



Other insights from research

  • Lack of comprehensive control and visibility into all components of the service delivery path, including the device, Wi-Fi, home router, ISP path, applications, and internet traffic.

  • Together, these factors have contributed to extended troubleshooting times, loss of productivity (think of the influx of help desk tickets from users), and poor user experiences.

  • Existing solutions used for network monitoring, endpoint monitoring, and application monitoring provide siloed visibility into their respective domains, but lack the context of the overall environment, making it difficult to troubleshoot effectively. Consequently, organizations began turning to digital experience management (DEM) solutions.



Working “good enough” is no longer good enough. To be operationally efficient and ensure positive user experiences, organizations require end-to-end visibility into the actual data paths for all applications.



Identifying Business Goals

What is in it for Palo Alto Networks?

ADEM already offers segment-wise insights on application performance. This could immediately be extended to other SaaS applications like Slack, Microsoft Teams, and Cisco Webex.

In the long run, this opens a dialogue for customizing insights for complex applications like Figma, Salesforce, Tableau, Photoshop, and the Microsoft suite of products.


What is the end goal for Zoom?

The pandemic brought a surge in Zoom usage; the resulting increase in call volume gave rise to call quality issues like delays, interference, interruptions, and disconnections. Zoom wanted to gain insight into specific call quality data and view trends categorized by audio, video, and screen sharing.

 
 

Vision

Segment-wise Insights - Operations teams must be able to view every segment in the application delivery path for all users - in branch, home, or remote locations.


Expedited Troubleshooting - In a single pane of glass, ADEM must provide comprehensive visibility that helps Zoom substantially reduce the time to identify potential issues and then remediate them.

Comprehensive Monitoring - ADEM must collect and monitor data from endpoint devices (including CPU, memory utilization, and Wi-Fi statistics), leveraging both real user traffic and synthetic tests to provide a holistic view of the entire distributed environment.




Picking up existing pieces

As a network admin, I must:

  • Resolve or communicate likely performance issues before they impact end-user experience

  • Quickly understand the root cause of performance issues, and resolve them when possible


Joining the dots

Scenario

The University of Miami’s (UM) campus in Coral Gables, Florida, suddenly went quiet in the spring of 2020 as students switched to remote learning at the onset of the COVID-19 pandemic. Over the next few months, some students returned to campus while others remained online, requiring the university to provide flexible learning options in response to the pandemic’s evolving situation.

“We had to go from being a mostly residential, on-campus university to different ways of teaching, including completely remote.”

- Ernie Fernandez, Chief Information Officer at UM.

Faculty members began exploring unique applications of Zoom for interactive instruction. Ali Habashi, assistant professor of cinematic arts at UM’s School of Communication, encouraged remote students to use Zoom as a collaborative platform for filmmaking.



Design principles

Zoom Metrics to help measure call quality

Zoom admins can enable meeting quality scores or network alerts on the Dashboard for meetings and webinars. 

The quality score of the meeting is based on the Mean Opinion Score (MOS), which ranges from 1 (bad) to 5 (good). Network alerts and quality scores for audio, video, and screen sharing will be displayed on the Meetings and Webinars dashboard. Zoom will use default values for network alerts.

Alternatively, admins can set custom thresholds that will trigger network alerts related to audio, video, screen sharing, and CPU usage. These alerts will be shown on the Dashboard.

Feedback

Managing feedback was even more challenging and felt like a swinging pendulum of viewpoints.

ADEM was able to quickly identify the issue and alert users within seconds of the problem occurring, mitigating user frustration and stemming the potential loss of productivity.


Challenges within the Team

Working backwards from a fixed launch date meant that design was subsumed into an engineering-driven process. Sign-off milestones were driven by engineering estimates, and the time to create the right design was whatever was left over. The combination of a fixed launch date and aggressive scope created an intense environment with many coordination and time challenges.



Customer Insights

Numerous calls with the Zoom team drove our planning phase.

Key insights from the research:

  • Primary issues are related to audio.

  • Network admins want to know the exact point of an issue: device, Wi-Fi, LAN, or internet.

  • Customers only need to know about the poorest-performing calls, which informed decisions about KPIs.

  • Trend analysis from historical data (future scope).

  • Monitoring across a wide time range.


How we got there

Managing feedback was even more challenging because it felt like a swinging pendulum of viewpoints. The team spent a disproportionate amount of time debating design decisions when there wasn’t data that could easily be gathered to help drive a decision. For example, the disable functionality was a highly destructive action because the user would lose access to all historical data, so we added show/hide functionality instead.

The impact was agony, paralysis, and a growing skepticism toward instinct in the design process.


To avoid this, I started creating documentation to help alleviate the data crutch and better articulate and distribute design rationale. Doing this was time-consuming but saved a lot of back-and-forth as the project progressed.

Design principles and feedback from the Zoom team helped create visibility into my design process and galvanized the team to share in the vision of unique dashboards for each monitored app.


  • Do all the heavy lifting for the customer

  • Make sensible decisions and provide intuitive fixes/resolutions for improving performance

  • Account for edge cases - meetings rejoined……

  • Show actionable content at all times

  • Avoid dead ends


A technique I used to brainstorm content ideas was inspired by the book ‘——’ by ——, which teaches that great and original ideas can emerge if we reverse the polarity of an assumption.



I imagined the worst possible perception of the Zoom dashboard to be analogous to the … - useless metrics that do nothing except show a bunch of numbers.

Then I filtered out the absolutely useful metrics by mapping a network admin’s journey to troubleshoot issues with Zoom.



From this exercise, it became clear that we could derive more value for Zoom if we crafted a curated experience that could be applied to other applications as well.


To get buy-in for this direction, I created a set of dashboards for a few apps, uniquely identifying their issues. Although many of these concepts were not feasible at the time of launch, they were still important in giving the team a vision of the future of ADEM’s diagnostic capabilities.


Survey

The survey helped sniff out important details for understanding the root causes of each type of issue.

  1. There is a high degree of overlap between audio- and video-related issues.

  2. Admins follow a set process to find root causes, one that goes beyond the user interface.


Based on these insights, I designed a cascading UI where actions taken on the top widgets curate the subsequent widgets to show the user a filtered view.
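As a rough illustration of the cascading behavior (the function, field names, and data shapes below are hypothetical, not ADEM's actual API):

```python
# Toy sketch of cascading widgets: a selection made in a top widget narrows
# the data fed to every widget below it. Each selection is applied in order.
def cascade(records, selections):
    """records: list of dicts; selections: ordered dict of field -> value."""
    view = records
    for field, value in selections.items():
        view = [r for r in view if r.get(field) == value]
    return view

calls = [
    {"issue": "audio", "segment": "wifi"},
    {"issue": "audio", "segment": "isp"},
    {"issue": "video", "segment": "wifi"},
]

# Clicking "audio" in the top widget filters the lower widgets to 2 records.
audio_view = cascade(calls, {"issue": "audio"})
```

Each further selection simply composes another filter, which is what keeps the lower widgets consistent with the admin's drill-down path.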


How does this fit in the overall integrated view of the PANW suite of products?

How can we utilize existing design patterns to design this curated experience?


SWOT analysis



Features: 

  • UCaaS user detail view with meeting performance data for a particular ADEM user on a given day

  • Root-cause analysis for UCaaS performance issues

  • Organization-wide UCaaS performance dashboard

  • User alerts and remediation suggestions for upcoming meetings likely to underperform

Data: 

Initially, we want to support UCaaS integration for Zoom. There are critical gaps in the data available from Teams.

Zoom is rolling out a push-based streaming service that will give us per-user QoS telemetry every minute.


Data Volume: 

PAN has about 75k meetings a week across 12k employees - about 1.5 meetings per employee per workday. So for a customer with 100k employees, we should expect about 150k meetings per workday and be prepared to handle peaks of 300k meetings per day.
To support the existing ADEM time ranges, we need to be able to store and retrieve up to 30 days of data.
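The sizing arithmetic above can be laid out explicitly; the 2x peak safety factor and the rounded per-employee rate are the estimate's own assumptions:

```python
# Back-of-the-envelope sizing for UCaaS meeting data, using the PAN-internal
# figures quoted above.
WEEKLY_MEETINGS = 75_000      # observed meetings per week at PAN
EMPLOYEES = 12_000
WORKDAYS_PER_WEEK = 5

meetings_per_employee_per_day = WEEKLY_MEETINGS / EMPLOYEES / WORKDAYS_PER_WEEK
# = 1.25; the estimate above rounds this up to ~1.5

CUSTOMER_EMPLOYEES = 100_000
RATE = 1.5                    # meetings per employee per workday (rounded up)
expected_daily = CUSTOMER_EMPLOYEES * RATE   # 150,000 meetings per workday
peak_daily = expected_daily * 2              # plan for 300,000 per day

RETENTION_DAYS = 30           # existing ADEM time ranges
meetings_retained = peak_daily * RETENTION_DAYS  # worst-case retained volume
```

At peak, 30 days of retention means storing on the order of nine million meeting records for a 100k-employee customer, which is the number the storage layer has to be sized against.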


Root Cause Analysis 

The primary value proposition of ADEM's integration with Zoom is ADEM's ability to leverage existing synthetic test data to interpret the root cause of experience issues users have on Zoom.
We want to identify the most likely root cause, and in some cases there may be more than one. A root cause represents a problem that we believe requires attention and action.

A couple of cases to consider:

  1. There's high latency at the LAN exit - 500 ms. There's 550 ms of latency at the ISP. The ISP only introduced 50 ms of latency, so the LAN exit is the root cause.

  2. There's high latency at the LAN exit - 500 ms. There's 1000 ms of latency at the ISP, meaning the ISP introduced 500 ms, so both the LAN exit and the ISP are root causes.

  3. There's moderate latency at the LAN exit as well as the ISP - introducing 200 ms and 100 ms respectively. Together, this amounts to latency that impacts Zoom, so we flag the LAN exit because it's responsible for the greatest part of the issue.

  4. There's moderate and identical latency at the LAN exit and the ISP - introducing 150 ms each. This impacts Zoom. Because the LAN exit has high latency relative to its normal latency, AND Zoom was impacted, we flag the LAN exit.
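The cases above can be sketched as a small decision rule. This is an illustrative reconstruction, not ADEM's implementation; the "high" threshold and the fallback to the largest contributor are assumptions chosen to reproduce the cases:

```python
# Hypothetical root-cause flagging: measurements are cumulative along the
# path, so the latency a segment *introduced* is its reading minus the
# previous segment's. Any segment introducing "high" latency is a root
# cause; if none is individually high but Zoom was still impacted, blame
# the largest contributor.
HIGH_MS = 300  # assumed threshold for "high" introduced latency

def flag_root_causes(path):
    """path: ordered list of (segment_name, cumulative_latency_ms)."""
    introduced = []
    prev = 0
    for name, cum in path:
        introduced.append((name, cum - prev))
        prev = cum
    high = [name for name, ms in introduced if ms >= HIGH_MS]
    if high:
        return high
    # No single segment is high: flag the greatest contributor (case 3).
    worst = max(introduced, key=lambda x: x[1])
    return [worst[0]]

flag_root_causes([("LAN exit", 500), ("ISP", 550)])   # case 1: LAN exit only
flag_root_causes([("LAN exit", 500), ("ISP", 1000)])  # case 2: both segments
```

Case 4 (identical moderate contributions) would additionally need each segment's historical baseline, which this sketch omits.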







Enriched Self-Serve Notifications 

If an end user has self-serve enabled AND we have Zoom QSS data for that user, we should enrich any self-serve notifications we raise on their device with Zoom information.
The trigger for these notifications should remain the existing self-serve trigger.
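A minimal sketch of that enrichment rule, assuming hypothetical names for the notification payload and the QSS lookup (neither is ADEM's real interface):

```python
# Enrich an already-triggered self-serve notification with Zoom QSS data.
# Importantly, this never decides *whether* to notify - the existing
# self-serve trigger stays in charge; we only add detail when we have it.
def enrich_notification(notification, user, self_serve_enabled, qss_lookup):
    """notification: dict payload; qss_lookup: user -> Zoom QSS metrics."""
    if self_serve_enabled and user in qss_lookup:
        notification = dict(notification)  # copy; don't mutate the original
        notification["zoom_qss"] = qss_lookup[user]
    return notification

# Example: a latency alert gains Zoom call-quality context for "alice".
enriched = enrich_notification(
    {"msg": "High latency detected"},
    "alice",
    True,
    {"alice": {"audio_jitter_ms": 45}},
)
```

Keeping enrichment separate from triggering means the feature degrades gracefully: users without QSS data still get the plain self-serve notification.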

 

  • Isolate problems instantly - increase productivity by quickly isolating problems

  • Reduce ticket escalations - use easy-to-use metrics that reduce escalations to Tier 3 support teams

  • Proactively notify users of performance degradation to reduce service desk ticket volume

Using meeting quality scores and network alerts

Last Updated: December 13, 2021



How to enable meeting quality scores and network alerts

  1. Sign in to the Zoom web portal.

  2. In the navigation panel, click Account Management then Account Settings.

  3. Click the Meeting tab.

  4. In the Admin Options section, verify that Meeting quality scores and network alerts on Dashboard is enabled.

  5. If the setting is disabled, click the toggle to enable it. If a verification dialog displays, click Enable to verify the change.

  6. Click one of these options to enable it:

    • Show meeting quality score and network alerts on Dashboard: Display the standard MOS metric for measuring meeting quality. Alerts will be based on the MOS. See the overview section for more information.

    • Set custom thresholds for network alerts: Set custom thresholds for alerts instead of using the standard MOS metric. See the overview section for more information. Make sure to set custom thresholds after enabling this option.

How to set custom thresholds for network alerts

If you enabled the option to Set custom thresholds for network alerts, follow this section to specify your thresholds.

Tip: To help determine your thresholds, see our recommendations for meeting and phone statistics.

  1. Sign in to the Zoom web portal.

  2. In the navigation panel, click Dashboard.

  3. At the top of the Dashboard screen, click the Meetings or Webinars tab.

  4. In the top-right corner, click Quality Settings.

  5. Click either the Audio, Video, Screen Sharing, or CPU Usage tab.

  6. Click Edit.

  7. Set the values to the desired threshold.

  8. Click Apply.

How to view alerts on the Dashboard

  1. Sign in to the Zoom web portal.

  2. In the navigation panel, click Dashboard.

  3. At the top of the Dashboard screen, click the Meetings or Webinars tab.

  4. (Optional) Click Past Meetings to access historical meeting data.

  5. Take note of the following columns:

    • Health: Displays any Warning level or Critical level issues in the meeting based on the MOS or custom threshold you've set.

    • Issue: Shows any current connection/client health warnings, including unstable audio, video, or screen sharing quality, high CPU usage, or disconnect and reconnect issues. For example, High CPU Usage or Unstable network for video.

    • If you enabled MOS instead of setting custom thresholds, you will also see the Video Quality, Audio Quality, and Screen Share Quality columns. These display a grade (Good, Fair, Poor, or Bad) based on the MOS. Click a participant's display name to view specific MOS details.
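For illustration, a grade-from-MOS mapping might look like the sketch below. The cutoff values are assumptions chosen for the example, not Zoom's published boundaries:

```python
# Map a 1-5 Mean Opinion Score onto the four dashboard grades.
# Cutoffs are illustrative assumptions only.
def mos_grade(mos):
    """mos: float in [1.0, 5.0], where 1 is bad and 5 is good."""
    if mos >= 4.0:
        return "Good"
    if mos >= 3.5:
        return "Fair"
    if mos >= 2.5:
        return "Poor"
    return "Bad"
```

A bucketing like this is what lets the Video Quality, Audio Quality, and Screen Share Quality columns show a readable grade instead of a raw score.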


Once Zoom is enabled, include a summary statistics section below the application experience chart with:

  • Total minutes on Zoom

  • Minutes with poor performance, and the total/percentage of poor-performing minutes associated with different root causes

  • Total meetings in Zoom

  • Total meetings with poor performance (count and percentage), broken down by audio issues, video issues, and screen sharing issues
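A sketch of deriving those summary statistics from per-meeting records; the record fields ("minutes", "poor_minutes", "issues") are assumed names, not the actual data model:

```python
# Compute the Zoom summary-statistics section from per-meeting records.
def summarize(meetings):
    """meetings: list of dicts with 'minutes', 'poor_minutes', 'issues'."""
    total_minutes = sum(m["minutes"] for m in meetings)
    poor_minutes = sum(m["poor_minutes"] for m in meetings)
    poor_meetings = [m for m in meetings if m["poor_minutes"] > 0]

    # Breakdown of poor meetings by issue category.
    breakdown = {"audio": 0, "video": 0, "screen_share": 0}
    for m in poor_meetings:
        for issue in m["issues"]:
            breakdown[issue] += 1

    return {
        "total_minutes": total_minutes,
        "poor_minute_pct": 100 * poor_minutes / total_minutes if total_minutes else 0,
        "total_meetings": len(meetings),
        "poor_meetings": len(poor_meetings),
        "breakdown": breakdown,
    }

stats = summarize([
    {"minutes": 60, "poor_minutes": 6, "issues": ["audio"]},
    {"minutes": 40, "poor_minutes": 0, "issues": []},
])
```

Percentages are computed once here so every widget in the section reads from the same numbers.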

Hurdles to Overcome

Testing against Zoom's dashboard has shown that Zoom does not record every momentary issue. Wherever possible, we should aim to round up the number of minutes during which a user was experiencing problems.

Feedback from stress testing

Meeting issues will be aggregated - if an issue lasted for 5 minutes, we should record a single meeting issue with a 5-minute time range.
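The aggregation rule can be sketched as collapsing consecutive flagged minutes into ranges (the minute-index input format is an assumption for the example):

```python
# Collapse per-minute issue flags into contiguous (start, end) ranges, so a
# problem spanning 5 consecutive minutes becomes one issue, not five.
def aggregate_issue_minutes(minutes):
    """minutes: iterable of minute indices where an issue was flagged."""
    ranges = []
    for m in sorted(minutes):
        if ranges and m == ranges[-1][1] + 1:
            # Extends the current run: widen the last range.
            ranges[-1] = (ranges[-1][0], m)
        else:
            # Gap (or first minute): start a new range.
            ranges.append((m, m))
    return ranges

aggregate_issue_minutes([3, 4, 5, 6, 7])  # one 5-minute issue: [(3, 7)]
```

Non-consecutive flags stay separate, so brief distinct problems are still reported individually.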

Impact

Team level - Built a completely new set of components for this feature and added them to the component library

Org level -

Product level - Extend to other apps like Microsoft Teams, Google Meet, Cisco WebEx

Business level -