About
Self Service Troubleshooting
Watch this session to learn how you can leverage the industry’s only SASE-native Autonomous Digital Experience Management (ADEM) solution with new self-serve capabilities to empower employees to quickly remediate digital experience problems themselves.
Troubleshoot Employee Experience at Lightspeed with ADEM Self Serve
Proactively notifies end users about app issues that require attention
Empowers end users to quickly remediate problems on their own with minimal interruption to their daily work
<Notification about WiFi signal quality being poor. Move closer to your router. workflow show ui, expand chart. collapse chart dismiss ui.>
Future enhancements
Give users more actionable content - collect and display the top memory and CPU-consuming applications
TCP synthetics - for cases where the service provider blocks certain routes. These tests will provide more accurate performance statistics
Enrich end users’ notifications and recommendations and direct users to close tabs that they are not using or apps that are spinning (frozen)
Offer paths to expanded versions of the chart along with added functionality and deeper dive in the data while preserving values, context, and state.
User Feedback
Navigating the interface was very simple; we literally had to go through only 4 clicks to fix the issue
Super straight and simple
Self Serve made it so much quicker; otherwise I would have spent 24 hours waiting for IT personnel to diagnose the issue
Design Decisions
When to use charts?
Show changes in data
Show state of something that’s completing/progressing towards a goal or emptying
Comparing
How will these charts support the core goals of the app?
Charts provide focus and reasoning to take an action
Show only the most important information that will help the user take an action - show actionable information
Key information - Viewing historical data about CPU and memory spikes, state of WiFi and Internet
How to use them?
Place them in a prominent area of the screen after giving some context to the user. Let someone scan the data and decide whether they want to see more.
Show change
Line chart to show time series
Axis labels with chart legend
Title of the chart - CPU usage over the past 3 hours
Time range - 3 hours - most pertinent for measuring the performance of the device, internet, and WiFi
Level of the line chart will tell the user whether the CPU is spiking or not, whether performance is getting better or worse
Important to summarize your data - so I incorporated context-based tooltips
At a macro level - we look for ways to represent the entire dataset - like a total or average connected time
I also wanted to show subsets of data correlating to spikes above the threshold - This subset would show notifications sent
Individual data points - notifications sent to user
Interactivity with precise values in the chart
Encourage exploration to build mental models for certain types of system and user behavior - For example, when I opened a specific Figma file, the app froze or other apps became slower. For more context, we will be providing a list of the top 5 apps that are consuming the highest CPU
Progressively reveal information in the chart - show/ hide CPU/ memory graph. Tooltips for added context
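The three levels of summary described above (the whole dataset, the subsets around spikes, and the individual data points) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the app's actual code: the `ChartSummary` and `summarize` names, the sample format, and the choice of 95% as the spike threshold (borrowed from the notification threshold) are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

WINDOW_SECONDS = 3 * 60 * 60   # the 3-hour range shown in the chart
SPIKE_THRESHOLD_PCT = 95.0     # assumption: same value as the CPU notification threshold


@dataclass
class ChartSummary:
    """Hypothetical container for what the tooltips and chart would surface."""
    average_pct: float             # macro level: average over the whole window
    spikes: List[Tuple[int, int]]  # subsets: (start, end) timestamps of each spike
    points_above: int              # individual samples above the threshold


def summarize(samples: List[Tuple[int, float]], now: int) -> ChartSummary:
    """Summarize (timestamp_seconds, cpu_pct) samples from the last 3 hours.

    Samples are assumed to be sorted by timestamp.
    """
    window = [(t, v) for t, v in samples if now - WINDOW_SECONDS <= t <= now]
    if not window:
        return ChartSummary(0.0, [], 0)

    average = sum(v for _, v in window) / len(window)

    # Group consecutive above-threshold samples into spike intervals.
    spikes: List[Tuple[int, int]] = []
    start: Optional[int] = None
    last_above: Optional[int] = None
    for t, v in window:
        if v > SPIKE_THRESHOLD_PCT:
            if start is None:
                start = t
            last_above = t
        elif start is not None:
            spikes.append((start, last_above))
            start = None
    if start is not None:
        spikes.append((start, last_above))

    points_above = sum(1 for _, v in window if v > SPIKE_THRESHOLD_PCT)
    return ChartSummary(average, spikes, points_above)
```

Each spike interval is a natural place to anchor a "notification sent" marker on the line chart, while the average feeds the macro-level tooltip.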
How do they relate to each other and integrate with the Palo Design System?
Use common colors while ensuring that both charts are unique
Ensure more clarity and appeal
Why Dark Theme?
Why design for both Windows and Mac?
Refactored apps - No visi
Design by accretion
Some background on what happened - Too many IT tickets; the queue was swamped with performance issues that could be solved easily
With the growth of the remote and hybrid workforce, organizations and end users need to rely on consumer-grade networks and ISPs for access to both proprietary and SaaS applications. The integration of ADEM agents into Prisma Access gives IT teams key insights from network performance metrics that are constantly monitored and collected on the endpoints as well as in the cloud. This enables IT admins to troubleshoot, triage, and remediate the issues highlighted. End users, however, are often oblivious to issues that they could resolve themselves, and often launch into troubleshooting mode without the necessary information, only after experiencing significant degradation in user experience and app performance. Today we capture performance metrics across all the hops of the network path and can accurately determine first-mile and last-mile issues. By notifying end users and suggesting remediation actions, we can ensure fast troubleshooting and fewer internal IT tickets. End users need the ability to view and remediate endpoint issues that can be fixed through their own actions.
<Picture of a swamped IT Tickets table>
<Picture of google search for WiFi related issues>
<Long wait times for ISP>
Other apps like CleanMyMac - are too CPU intensive
Application and system performance issues are caused by the following:
High CPU utilization - The device has hit CPU utilization of over 95% due to applications or browser tabs consuming high cycles; prompt users to close unwanted applications or browser tabs (reference for thresholds: Zoom)
High memory utilization - The device has hit memory utilization of over 95% due to applications or browser tabs consuming large amounts of memory; prompt users to close unwanted applications or browser tabs
Poor WiFi quality - Poor application performance caused by poor WiFi quality or a congested network, indicated by a low SNR value (less than 20); prompt the user to move closer to the WiFi router and limit activities that heavily use the internet connection (Edit from the original ask: engineering came back and updated that noise and SNR cannot be collected from Windows machines, so we have to fall back to signal quality only. Notifications will now be sent based on poor signal quality, and the default will be 48%. More updates in the comment section)
Poor WiFi quality with change in SSID - Poor application performance caused by poor WiFi quality or a congested network, indicated by a low SNR value (less than 20) along with a change in SSID; prompt the user to reconnect to the previous SSID (Same here: instead of SNR, it will now use signal quality. More updates in the comment section)
Internet Outages - Notify users of an internet outage
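The notification rules above can be sketched as a small threshold check. This is a minimal illustration, not ADEM's implementation: the `DeviceSample` fields, the `notifications_for` function, and the message strings are hypothetical. Treating signal quality below the 48% default as "poor", and suppressing other prompts during an outage, are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional

# Threshold values come from the notes above. The comparison direction for
# signal quality (below 48% counts as poor) is an assumption.
CPU_THRESHOLD_PCT = 95.0
MEMORY_THRESHOLD_PCT = 95.0
WIFI_SIGNAL_QUALITY_DEFAULT_PCT = 48.0


@dataclass
class DeviceSample:
    """Hypothetical snapshot of endpoint metrics at one point in time."""
    cpu_pct: float
    memory_pct: float
    wifi_signal_quality_pct: float
    ssid: str
    internet_reachable: bool


def notifications_for(sample: DeviceSample,
                      previous_ssid: Optional[str] = None) -> List[str]:
    """Return the user-facing prompts triggered by one sample."""
    if not sample.internet_reachable:
        # Assumption: during an outage, the other prompts are not shown.
        return ["Internet outage detected."]

    prompts: List[str] = []
    if sample.cpu_pct > CPU_THRESHOLD_PCT:
        prompts.append("High CPU usage: close unused applications or browser tabs.")
    if sample.memory_pct > MEMORY_THRESHOLD_PCT:
        prompts.append("High memory usage: close unused applications or browser tabs.")
    if sample.wifi_signal_quality_pct < WIFI_SIGNAL_QUALITY_DEFAULT_PCT:
        if previous_ssid is not None and sample.ssid != previous_ssid:
            prompts.append(
                f"Poor WiFi quality after a network change: reconnect to {previous_ssid}."
            )
        else:
            prompts.append("Poor WiFi quality: move closer to your router.")
    return prompts
```

Keeping the thresholds as named constants makes it easy to swap the WiFi rule from SNR to signal quality, as happened here, without touching the prompt logic.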
Recapture control and focus on important tickets
High-level goals - Lightweight, easy fixes, give users more control over their time and devices, create a platform for innovation and deeper engagement
My Role
Engineers, Product Managers, Data Scientists, Researchers, UX Writers, and Other Designers
Worked with the research team, content writer, and 2 product managers
The app was launched internally in May 2022, and externally in June 2022
Picking up the pieces
Without preexisting insights, I partnered with our researcher to explore how network admins were responding to IT tickets
Checked the most problematic scenarios where Device, WiFi, and Internet have issues
<Visual here>
Discovery
“quote here”
Simple - fewer steps, clearer steps to resolve issues
Simplicity revealed an opportunity to pick up the perfect experience to solve common device and network-related issues
Working backward from perfect
Define success and understand the health of these systems
WiFi SNR value
How much time will the user need to resolve this?
How many steps will they have to perform to resolve this?
Partnered with data scientists to uncover actual numbers
Issue | Signal | Metric
WiFi not working | Webpages not loading, taking longer | User getting frustrated, raising IT ticket, looking for solutions elsewhere
Extra coordination and handholding needed - the “don’t think the user is dumb” approach didn’t work
“Thousands of hours wasted by IT Team trying to resolve simple issues that users can handle themselves”
How might we empower users to solve their own issues and not rely on IT for simple issues?
"Solve your remote work user experience and application performance issues without consulting IT"
Design Principles
Detect
Identify
Effortless
Actionable
Always looking out for you - Background monitoring yet lightweight
Always moving faster -
Sometimes there is a faster way, and that requires simple steps. ADEM understands these situations and gives you an option
Future enhancements:
Collecting feedback directly from users
Telemetry to understand how users interact with the app
Expand to resolving other issues like battery etc.
ADEM gives you complete control when you need it
How we got there?
How do we design for everyone? - Power users/ Naive users
What contexts need to be understood?
What is the perfect user experience?
<Framework for user journey>
<Map for ——>
More inclusive design
To move beyond the existing biases, I tried to educate the team with an approach to designing for everyone, everywhere.
The spectrums attempt to highlight the range of temporary or permanent challenges the users face while interacting with their systems
The situations attempt to highlight situational challenges that everyone experiences. A situation is a temporary context that affects the way any person interacts with their system for a short time.
The key pillars of any copy we write for the app are:
Head: Are we clear, straightforward, and easy to understand?
Heart: Are we warm, relatable, and inclusive?
Hands: Are we providing tangible benefits and active language so users know what to do and why they’re doing it?
<In hindsight, PANW poorly empathized with users who weren’t reflections of people that work in their offices across the world.>
Elevating
Empowering
Enabling
Our data revealed that:
50% of users globally don't explicitly understand thresholds.
Half of all issues are at least situational/ temporary.
Testing the new app
Tested early prototypes with internal folks and a few customers.
To our surprise, not a single participant had trouble understanding the language and iconography. The notification system, built in conjunction with the OS, put users at ease without the overhead of learning a new app. The —— resonated well with participants, confirming our intuition around designing for ease and speed.
Inspiring confidence on the confirmation screen
WHY
Was it influenced by research?
Was it due to engineering challenges?
Was it discussed and agreed upon by the team?
Business decision?
How did thing A influence thing D?
Animations:
Background video with overlay - IBM design system homepage