Cloudwise Power Z Paradise
Bilibili (NASDAQ: $BILI) is now a cultural community and video platform where young Chinese millennials gather. It was established on June 26, 2009, and was listed on NASDAQ on March 28, 2018. As of the second quarter of 2020, the average monthly active users of Bilibili reached 172 million, of which mobile users reached 153 million.
Customer pain points
The growth of the Internet industry brings a growing number of IT service systems. Without effective system monitoring, problems such as slow website responses and crashes cannot be detected quickly. The constant increase in users, especially global users, poses great challenges to website operation. The details are as follows:
The access experience of global users varies, and it is hard to determine whether existing resources are sufficient to support quick access around the world.
User feedback is not comprehensively recorded, and system faults cannot be reproduced on-site.
When service problems occur, both operation and R&D personnel are involved, which lowers efficiency and causes communication issues.
The original monitoring system cannot meet the requirements of the quickly growing business.
The existing open-source monitoring system has low usability, is difficult to operate for non-operation personnel, and is unsuitable for custom development.
Many resources are required to monitor hundreds of CDN nodes, overall data reliability, and end-user network quality.
Cloudwise Synthetic Monitoring is deployed to help Bilibili identify performance problems in real time, quickly locate the causes of faults, and safeguard user experience, avoiding the customer loss caused by poor performance.
User experience problem: Simulate access to the website homepage and key services to monitor availability and operation flow integrity. Initiate simulated access to the official website, live broadcast, payment, app, and the homepage and key pages of major service modules from major cities in China and around the world, and perform HTTP monitoring, page element effect monitoring, and operation flow monitoring to detect in a timely manner where user experience problems occur.
ISP network quality problem: Monitor the network quality of different ISPs, including the route status, DNS performance, ICMP packet loss ratio, and delay, to effectively identify and analyze network problems and measure ISP network quality.
Problem recording based on historical snapshots: Record all services with problems using the historical snapshot function, including the reason, time, region, and system node for the occurrence of the problem.
API performance problem: Raise early alerts on API performance problems. Locate cases where API availability is compromised by load changes caused by a sharp increase in user access.
CDN performance problem: Continuously monitor self-built CDN services in China, compare and analyze the CDN service quality in each location, and locate CDN problems, implementing performance management and monitoring for self-built CDN services.
Server resource problem: Monitor the performance resources of back-end servers supporting the entire operation system, understand the resource utilization rate of distributed servers in real time, and generate early alerts when the resource utilization rate exceeds a specified value.
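The network quality and early-alert items above can be illustrated with a minimal Python sketch. It is not Cloudwise's implementation. The names ProbeSummary, summarize_probes, should_alert, and time_dns_lookup are hypothetical, and the 0.90 threshold in the usage note is an assumed value. The sketch summarizes a batch of probe results into a packet loss ratio and average delay, checks a value against a specified threshold, and times a DNS lookup using only the standard library.

```python
import socket
import time
from dataclasses import dataclass
from statistics import mean
from typing import Optional, Sequence


@dataclass
class ProbeSummary:
    """Aggregate view of one batch of probes (hypothetical structure)."""
    sent: int
    lost: int
    loss_ratio: float
    avg_delay_ms: Optional[float]


def summarize_probes(delays_ms: Sequence[Optional[float]]) -> ProbeSummary:
    """Summarize probe delays; a None entry marks a lost (timed-out) probe."""
    sent = len(delays_ms)
    answered = [d for d in delays_ms if d is not None]
    lost = sent - len(answered)
    loss_ratio = lost / sent if sent else 0.0
    avg_delay = mean(answered) if answered else None
    return ProbeSummary(sent, lost, loss_ratio, avg_delay)


def should_alert(value: float, threshold: float) -> bool:
    """Return True when the measured value exceeds the specified threshold."""
    return value > threshold


def time_dns_lookup(hostname: str, port: int = 443) -> Optional[float]:
    """Time one name resolution in milliseconds; None if resolution fails.

    The result may reflect a local resolver cache rather than the
    ISP's DNS servers, so treat it as a rough indicator only.
    """
    start = time.perf_counter()
    try:
        socket.getaddrinfo(hostname, port)
    except socket.gaierror:
        return None
    return (time.perf_counter() - start) * 1000.0
```

For example, summarize_probes([10.0, None, 30.0, None]) reports two of four probes lost, a loss ratio of 0.5, and an average delay of 20.0 ms over the answered probes. With an assumed threshold of 0.90, should_alert(0.91, 0.90) returns True for a server whose resource utilization rate has reached 91%.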
Perceive faults in a timely manner: A sharp decrease in user access performance is detected. The reason is that the DNS response speed is too slow. The problem is resolved after adjustment by operation personnel.
Track and locate problems: After new services are launched, the page performance monitoring module detects an element performance problem immediately. The problem is urgently resolved before the user access volume surges, ensuring the continuity of new services.
Improve operation efficiency: Cloudwise Synthetic Monitoring alleviates the huge pressure on the operation team due to lack of manpower, reduces the communication costs between operation personnel and R&D personnel, improves working efficiency, and speeds up IT construction.
Enhance product trust: The customer has used hundreds of Cloudwise Synthetic Monitoring quotas since 2015, and has kept renewing the services.