2014 IEEE/ACM 7th International Conference on Utility and Cloud Computing, 2014
Cloud providers and data centers rely heavily on forecasts to accurately predict future workload. This information helps them in appropriate virtualization and cost-effective provisioning of the infrastructure. The accuracy of a forecast greatly depends upon the merit of the performance data fed to the underlying algorithms. One of the fundamental problems faced by analysts in preparing data for use in forecasting is the timely identification of data discontinuities. A discontinuity is an abrupt change in a time-series pattern of a performance counter that persists but does not recur. Analysts need to identify discontinuities in performance data so that they can a) remove the discontinuities from the data before building a forecast model and b) retrain an existing forecast model on the performance data from the point in time where a discontinuity occurred. There exist several approaches and tools to help analysts identify anomalies in performance data. However, there exists no automated approach to assist data center operators in detecting discontinuities in the first place. In this paper, we present and evaluate our proposed approach to help data center analysts and cloud providers automatically detect discontinuities. A case study on the performance data obtained from a large cloud provider and performance tests conducted using an open source benchmark system shows that our proposed approach provides an average precision of 84% and an average recall of 88%. The approach does not require any domain knowledge to operate.
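The abstract does not spell out how discontinuities are detected. Purely as an illustration, the minimal Python sketch below flags a persistent level shift (a "discontinuity") in a single counter series by comparing the mean before and after each point; the window size, threshold, and CPU-utilization example are assumptions, not the paper's method.

```python
import numpy as np

def detect_discontinuities(series, window=30, threshold=3.0):
    """Flag indices where the counter's level shifts abruptly and persists.

    A point i is flagged when the mean of the window after i differs from the
    mean of the window before i by more than `threshold` pooled standard
    deviations. Window and threshold are illustrative, not from the paper.
    """
    x = np.asarray(series, dtype=float)
    hits = []
    for i in range(window, len(x) - window):
        before = x[i - window:i]
        after = x[i:i + window]
        pooled_sd = max(np.sqrt((before.var(ddof=1) + after.var(ddof=1)) / 2), 1e-9)
        if abs(after.mean() - before.mean()) / pooled_sd > threshold:
            hits.append(i)
    return hits

# Example: a hypothetical CPU-utilization counter that jumps from ~40% to ~70% and stays there.
rng = np.random.default_rng(0)
cpu = np.concatenate([rng.normal(40, 2, 200), rng.normal(70, 2, 200)])
print(detect_discontinuities(cpu)[:3])  # indices clustered around the shift at 200
```

A recurring spike would not be flagged by this scheme, because the means on both sides of the spike remain similar; only a shift that persists past the window length produces a large standardized difference.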
2013 35th International Conference on Software Engineering (ICSE), 2013
Load testing is one of the means for evaluating the performance of Large Scale Systems (LSS). At the end of a load test, performance analysts must analyze thousands of performance counters from hundreds of machines under test. These performance counters are measures of run-time system properties such as CPU utilization, disk I/O, memory consumption, and network traffic. Analysts observe counters to find out whether the system is meeting its Service Level Agreements (SLAs). In this paper, we present and evaluate one supervised and three unsupervised approaches to help performance analysts 1) compare load tests more effectively in order to detect performance deviations that may lead to SLA violations, and 2) obtain a smaller, manageable set of important performance counters to assist in root-cause analysis of the detected deviations. Our case study is based on load test data obtained from both a large scale industrial system and an open source benchmark application. The case study shows that our wrapper-based supervised approach, which uses a search-based technique to find the best subset of performance counters and a logistic regression model for deviation prediction, can provide up to an 89% reduction in the set of performance counters while detecting performance deviations with few false positives (i.e., 95% average precision). The study also shows that the supervised approach is more stable and effective than the unsupervised approaches, but it has more overhead due to its semi-automated training phase.
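To make the wrapper idea concrete, here is a small Python sketch of a wrapper-based counter selection wrapped around a logistic regression deviation predictor. It uses scikit-learn's greedy forward selection as a simplified stand-in for the paper's search-based technique; the toy data, counter count, and number of selected counters are assumptions.

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

# Toy data: rows are load-test observations, columns are performance counters,
# y marks whether an observation deviates from the baseline behaviour.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 20))                        # 20 hypothetical counters
y = (X[:, 3] + 0.8 * X[:, 7] + rng.normal(0, 0.5, 300) > 0).astype(int)

# Greedy forward search wrapped around a logistic regression scorer:
# keep adding the counter that most improves cross-validated performance.
model = LogisticRegression(max_iter=1000)
selector = SequentialFeatureSelector(
    model, n_features_to_select=3, direction="forward", cv=5
)
selector.fit(X, y)

kept = np.flatnonzero(selector.get_support())
print("Counters kept for root-cause analysis:", kept)  # expect 3 and 7 among them
model.fit(X[:, kept], y)                               # final deviation predictor
```

The reduced counter set plays both roles described in the abstract: it is the input to the deviation predictor and, because it is small, it is a manageable starting point for root-cause analysis.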
2010 IEEE 34th Annual Computer Software and Applications Conference, 2010
Enterprise systems are load tested for every added feature, software update, and periodic maintenance cycle to ensure that the performance demands on system quality, availability, and responsiveness are met. In current practice, performance analysts manually analyze load test data to identify the components that are responsible for performance deviations. This process is time consuming and error prone due to the large volume of performance counter data collected during monitoring, the analysts' limited operational knowledge of all the subsystems involved and their complex interactions, and the unavailability of up-to-date documentation in rapidly evolving enterprises. In this paper, we present an automated approach based on a robust statistical technique, Principal Component Analysis (PCA), to identify subsystems that show performance deviations in load tests. A case study on load test data of a large enterprise application shows that our approach does not require any instrumentation or domain knowledge to operate, scales well to large industrial systems, generates few false positives (89% average precision), and detects performance deviations among subsystems in limited time.
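The abstract names PCA but not the exact scoring procedure. As a rough sketch only, one common PCA-based way to score a subsystem is to fit PCA on its baseline counters and measure how poorly a new test reconstructs in that subspace; the function name, variance threshold, and synthetic data below are assumptions, not the paper's approach.

```python
import numpy as np
from sklearn.decomposition import PCA

def deviation_score(baseline, target, var_kept=0.95):
    """Reconstruction error of a target load test in the baseline PCA subspace.

    `baseline` and `target` are (observations x counters) matrices for one
    subsystem; a large score suggests its counter correlations have changed.
    """
    pca = PCA(n_components=var_kept).fit(baseline)
    recon = pca.inverse_transform(pca.transform(target))
    return float(np.mean((target - recon) ** 2))

rng = np.random.default_rng(2)
mix = rng.normal(size=(2, 8))                  # 8 counters driven by 2 latent factors
baseline = rng.normal(size=(500, 2)) @ mix + rng.normal(0, 0.1, (500, 8))
healthy = rng.normal(size=(500, 2)) @ mix + rng.normal(0, 0.1, (500, 8))
deviating = rng.normal(size=(500, 8)) * 2      # correlation structure broken
print(deviation_score(baseline, healthy))      # small
print(deviation_score(baseline, deviating))    # much larger -> flag this subsystem
```

Because the score is computed purely from counter matrices, no instrumentation or domain knowledge about the subsystem is needed, which matches the property the abstract emphasizes.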
2008 IEEE International Conference on Software Maintenance, 2008
When changing a source code entity (e.g., a function), developers must ensure that the change is propagated to related entities to avoid the introduction of bugs. Accurate change propagation is essential for the successful evolution of complex software systems, and techniques and tools are needed to support developers in propagating changes. Several heuristics have been proposed in the past for change propagation. Research shows that heuristics based on the change history of a project outperform heuristics based on the dependency graph. However, these heuristics are static and do not account for the dynamic nature of software projects; they need to adapt over time and adjust themselves to the peculiarities of each changed entity. In this paper we propose adaptive change propagation heuristics. These heuristics are metaheuristics that combine various previously researched heuristics to improve the overall performance (precision and recall) of change propagation. Through an empirical case study of four large open source systems: GCC (a compiler), FreeBSD (an operating system), PostgreSQL (a database), and GCluster (a clustering framework), we demonstrate that our adaptive change propagation heuristics achieve a statistically significant improvement of 57% over the top-performing static change propagation heuristics.
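The abstract describes the adaptive heuristics only at a high level. The Python sketch below shows one simple way such a metaheuristic could combine base heuristics and adapt to feedback; the class name, weighting rule, and the two toy heuristics (co-change history and a call graph) are hypothetical, not the paper's algorithm.

```python
from collections import defaultdict

class AdaptivePropagator:
    """Weighted combination of change-propagation heuristics (illustrative only).

    Each heuristic maps a changed entity to a set of candidate co-change
    entities. Weights are adapted from feedback on past changes, so the
    combiner leans on whichever heuristic has recently worked best.
    """

    def __init__(self, heuristics, learning_rate=0.1):
        self.heuristics = heuristics                  # name -> callable(entity) -> set
        self.weights = {name: 1.0 for name in heuristics}
        self.lr = learning_rate

    def suggest(self, entity, threshold=0.5):
        votes = defaultdict(float)
        total = sum(self.weights.values())
        for name, h in self.heuristics.items():
            for candidate in h(entity):
                votes[candidate] += self.weights[name] / total
        return {c for c, v in votes.items() if v >= threshold}

    def feedback(self, entity, actually_changed):
        # Reward heuristics whose predictions matched the true co-changes.
        for name, h in self.heuristics.items():
            predicted = h(entity)
            hitrate = len(predicted & actually_changed) / len(predicted) if predicted else 0.0
            self.weights[name] = max(1e-3, self.weights[name] + self.lr * (hitrate - 0.5))

# Hypothetical base heuristics: co-change history vs. call-graph dependencies.
history = {"parse()": {"lex()", "emit()"}}
callgraph = {"parse()": {"lex()", "error()"}}
prop = AdaptivePropagator({
    "history": lambda e: history.get(e, set()),
    "callgraph": lambda e: callgraph.get(e, set()),
})
print(prop.suggest("parse()"))                 # candidates weighted by both heuristics
prop.feedback("parse()", {"lex()", "emit()"})  # history was right -> its weight grows
```

The per-entity feedback loop is what makes the combination "adaptive": as one heuristic proves more accurate for a given part of the system, its vote counts for more in later suggestions.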
The hypotheses tested were that (1) anxiety and (2) extraversion (exvia) would be negatively related to career decision making ability. Variables defined as contributing to anxiety included ego weakness, excitability, low superego strength, threat sensitivity, and high ergic tension. Extraversion was considered the "general tendency to social interaction" with people. Career decision making ability was considered directly proportional to the quality of strategy used by the individual while planning the future activities of a fictitious person in the fields of education, job, family life, and leisure. The Junior-Senior High School Personality Questionnaire and the Life Career Game (Boocock, 1963) were used to obtain measures of anxiety and exvia, and career decision making ability, respectively. Except for one female subgroup, the hypothesis of a negative relationship between anxiety and career decision making ability was not supported; in fact, for 3 male subgroups, a significant positive relationship was found. The exvia scores and career decision making ability were not related for any of the subgroups. Possible reasons for non-support of the hypotheses are discussed. (KS)