BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//wp-events-plugin.com//7.2.3.1//EN
TZID:Asia/Kolkata
X-WR-TIMEZONE:Asia/Kolkata
BEGIN:VEVENT
UID:87@cds.iisc.ac.in
DTSTART;TZID=Asia/Kolkata:20241219T150000
DTEND;TZID=Asia/Kolkata:20241219T160000
DTSTAMP:20241205T073622Z
URL:https://cds.iisc.ac.in/events/ph-d-thesis-defense-cds-december-19-2024
 -application-service-resilience-in-cloud-an-end-to-end-perspective/
SUMMARY:Ph.D. Thesis Defense: CDS: December 19\, 2024 "Application Service 
 Resilience In Cloud: An End-to-End Perspective"
DESCRIPTION:DEPARTMENT OF COMPUTATIONAL AND DATA SCIENCES\n\nPh.D. Thesis D
 efense\n==================================================================
 ===\nSpeaker : Ms. Dhanya R Mathews\n\nS.R. Number : 06-18-02-10-12-18-1-1
 5855\n\nTitle : "Application Service Resilience In Cloud: An End-to-End Pe
 rspective"\n\nResearch Supervisor: Dr. J. Lakshmi\n\nThesis examiner : Pro
 f. Praveen Tammana\, IIT-Hyderabad\n\nDate & Time : December 19\, 202
 4 (Thursday) at 3:00 PM\n\nVenue : The Thesis Defense will be held in HYBR
 ID Mode\n\n# 102 CDS Seminar Hall /MICROSOFT TEAMS.\n\nPlease click on the
  following link to join the Thesis Defense\n\nMS Teams link\n=============
 ===========================================================\n\nABSTRACT\n\
 nThe idea of computing as a utility was realized with the emergence of the
  cloud computing paradigm. Cloud service providers offer a wide range of s
 ervices that are delivered over the Internet to cloud service consumers. I
 n its current manifestation\, the Cloud services are realized over multipl
 e logical\, virtualized\, and distributed resources\, typically using a mu
 lti-layered architecture. The providers document the non-functional servic
 e level guarantees like availability\, performance\, security\, etc.\, i
 n Service Level Agreements (SLAs) provided to the consumer as Service Lev
 el Objectives (SLOs). The wide adoption of cloud computing\, compounded wi
 th th
 e emergence of microservice architecture\, has resulted in a considerable 
 increase in the number of components involved in service delivery. Manuall
 y addressing failures in real-time is inefficient and often impossible at 
 the cloud scale\, where failures are a norm rather than an exception. Ensu
 ring the quality of an application service\, as documented in the SLA\, th
 erefore requires autonomous mechanisms to enhance cloud services' resilien
 ce.\n\nThough cloud setups rely on highly autonomous service layers for ma
 naging\, provisioning\, and monitoring applications\, most of them focus o
 n a specific cloud service architecture layer or consider only a particula
 r set of faults. Any component across the cloud service stack involved in 
 the service delivery could disrupt the SLO. Further\, as cloud services us
 e shared infrastructure\, monitoring and acting on the individual service 
 layer metrics is limiting. In such a scenario\, the visibility of failure 
 anywhere in the stack can offer effective recovery/remediation strategies\
 ; hence\, an application-oriented approach that takes an end-to-end view o
 f failures makes a case for any resiliency solution. Towards this\, we pro
 pose an end-to-end service resilience framework that employs data-dependen
 t intelligent autonomous mechanisms to deal with cloud service disruptions
  efficiently. The intelligence to reduce the effect of disruptions is base
 d on understanding the complex interconnections and inter-dependencies of 
 end-to-end components in the cloud service stack.\n\nThe different cloud s
 ervice abstraction layers and infrastructure sharing have resulted in incr
 eased occurrence of faults\, more specifically\, saturation faults. The in
 itial phase of this work examines real-world disruption scenarios to under
 stand the faults that could disrupt a cloud service. With ever-changing ap
 plications and environments on which they are hosted\, realizing a failure
  repository for cloud service faults is infeasible. This makes conventiona
 l data-oriented approaches less practical and dynamic observability data-o
 riented methods more desirable. Towards this\, the second phase of this wo
 rk developed a Topology Aware Root Cause Detection Algorithm (TA-RCD) that
  considers the observability data from end-to-end service components and t
 heir interconnectedness. Our results from the fault injection studies show
  that the proposed approach performs better than the state-of-the-art RCD 
 algorithm\, by at least 2x for Top-5 recall and 4x for Top-3 r
 ecall\, on average.\n\nTo autonomously recover a service from its anomalou
 s state\, the remediation should target the root cause of anomalous behavi
 or. The root-cause localizations\, though accurate\, are not restricted to
  a specific component because of causal effects due to service interaction
 s. In order to identify the anomalous component\, the third phase of this 
 work developed a Topology Aware end-to-end failure Recovery framework (TA-
 REC) that identifies the appropriate remediation strategy for an anomaly. 
 The anomaly score assignment and component activity tracking in TA-REC fa
 cilitate the identification of the component and the remediation that nee
 ds to be applied to the component. For the saturation fault scenarios inje
 cted across the stack\, TA-REC can identify a more adequate remediation/
 recovery strategy than the state-of-the-art because its end-to-end view gi
 ves better visibility of the origin of the failure.\n\nIn c
 onclusion\, this work demonstrated the usefulness of the end-to-end topolo
 gy of a cloud application service to efficiently remediate anomalies tha
 t challenge the service quality. The observations prove that looking at t
 he service as a black box restricts the development of intelligent autono
 mous
  approaches to guarantee SLOs. The proof-of-concept evaluations demonstrat
 ed that the intelligence to maintain service resilience effectively is bas
 ed on an accurate understanding of the end-to-end state\, as it facilitate
 s maintaining component serviceability by targeting the cause of failure i
 n the stack. Future work aims to evaluate both TA-RCD and TA-REC for a bro
 ader range of fault scenarios in real-life production deployments.\n\n====
 =====================================================================\nALL
  ARE WELCOME
CATEGORIES:Events,Thesis Defense
END:VEVENT
BEGIN:VTIMEZONE
TZID:Asia/Kolkata
X-LIC-LOCATION:Asia/Kolkata
BEGIN:STANDARD
DTSTART:20231220T150000
TZOFFSETFROM:+0530
TZOFFSETTO:+0530
TZNAME:IST
END:STANDARD
END:VTIMEZONE
END:VCALENDAR