Abstract

In this talk we present a class of techniques, known as mining of likely invariants, which uses the operational data of cloud-based systems to uncover latent anomalies in the execution behaviour of the underlying software. Invariant-based detection and monitoring can support a variety of activities: online anomaly detection through system monitoring, to avert an imminent failure of a software system or of jobs submitted to a compute cluster, and post-mortem troubleshooting. Usage scenarios include capacity planning, detection of execution anomalies, and detection of imminent violations of the Service Level Agreements of a SaaS offering. On two widely different datasets from real-world systems, one from a Google cluster whose traces are publicly available and the other from a SaaS platform, we have built time-series-based models to capture various flow invariants. Furthermore, we performed an empirical analysis of three machine learning techniques for mining value-based invariants: clustering, association rules, and decision lists. The assessment is based on the common metrics of coverage, recall, and precision. Results show that relatively few invariants characterize the majority of operating conditions, that precision and recall may drop significantly when trying to achieve a large coverage, and that the techniques exhibit similar precision, though the supervised one achieves a higher recall. Using these invariants we have been able to characterise job failures in the Google cluster and several extreme cases of silent data corruption in the SaaS application.

Biography

Santonu Sarkar is a professor of Computer Science and Information Systems at BITS Pilani, K.K. Birla Goa Campus, India. Dr. Sarkar received his PhD degree in computer science from the Indian Institute of Technology Kharagpur. He has more than 20 years of experience in the IT industry in applied research, product and application development, architecture consulting for large software systems, and project and client account management. His current research interests include building software engineering techniques to ensure dependability, performance, and ease of use of Cloud and HPC applications. Dr. Sarkar has a total of 15 granted patents and several publications in peer-reviewed journals and conferences, with h and i10 indices of 15 and 10 respectively.