Papers by Syed Aqeel Haider Gillani

High performance top-k processing of non-linear windows over data streams
This year's DEBS Grand Challenge offers two very challenging queries over social network data. These queries -- each for a different reason -- cannot be handled by traditional techniques and therefore call for the development of a specific architecture and data structures. In the first query, the novelty is the non-linearity of the expiration of the elements. Since a traditional sliding window is not suitable, we investigate here the data structures offering the best tradeoffs for all the required operations. In the second query, unlike traditional approaches where no persistent data is stored over the stream, we have to manage a friendship graph which is persistent throughout the system execution. Due to the centrality of this structure, a careful design is therefore required. The common point of the algorithmic approaches that we developed for both queries is the overwhelming usage of bounds -- upper and lower -- in order to execute expensive computations only when required. For Query 1, we devise a bound based on the score decay. For Query 2, we use Turán's theorem to limit the clique computation. The combination of lazy evaluation, careful implementation and thorough testing led to the realization of an efficient stream processing system.
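
The Turán-based pruning mentioned above can be illustrated with a short sketch (my own reconstruction, not the authors' code): Turán's theorem implies that a graph with n vertices and m edges must contain a clique of size at least ceil(n² / (n² − 2m)), so an expensive clique search can be skipped whenever this cheap bound already answers the question at hand.

```python
def turan_clique_lower_bound(n: int, m: int) -> int:
    """Lower bound on the clique number of a graph with n vertices and
    m edges, derived from Turán's theorem: omega(G) >= n^2 / (n^2 - 2m)."""
    if n == 0:
        return 0
    denom = n * n - 2 * m
    if denom <= 0:  # only reachable for a complete graph
        return n
    return -((-n * n) // denom)  # ceiling division

# E.g. a graph on 10 vertices with 41 edges must contain a 6-clique,
# so any search for cliques of size at most 6 can be skipped outright.
```

The bound costs O(1) per evaluation, which is why it pays off in a streaming setting where the expensive clique computation would otherwise run on every update.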

There is a paradigm shift in the nature and processing means of today's data: data used to be mostly static, stored in large databases to be queried. Today, with the advent of new applications and means of collecting data, most applications on the Web and in enterprises produce data in a continuous manner, in the form of streams. Thus, the users of these applications expect to process a large volume of data with fresh, low-latency results. This has resulted in the introduction of Data Stream Management Systems (DSMSs) and the Complex Event Processing (CEP) paradigm -- both with distinctive aims: DSMSs are mostly employed to process traditional query operators (mostly stateless), while CEP systems focus on temporal pattern matching (stateful operators) to detect changes in the data that can be thought of as events. The road to a successful PhD is long and labour intensive. During the last three years or so, I have experienced countless setbacks: each one plunging my hopes and self-confidence to the ground. However, through these setbacks, I have undergone a genuine and powerful transformation -- both intellectually and personally. The main thing I have learned from this experience is that, no matter how difficult and improbable the task is, if you keep on walking in the right direction, and keep on looking for the probable remedies, you will definitely find the solution. Critically, this journey would have been unbearable were it not for the great support from the following people. This thesis owes its existence to the help, support and inspiration of two great advisors: Frédérique Laforest and Gauthier Picard. Without the countless discussions with them, I would not have been able to establish the frontier works that impacted the development of this research.
I am greatly indebted to Frédérique for spending hours remotely (on weekends and during holidays!) to give me advice, proofread, and correct my "s" and "the" mistakes. Over the past three years or so she has not only helped me in writing technical papers, but has also assisted me with the tedious administrative tasks. Gauthier, on the other hand, provided me with a different view of research and offered me some critical suggestions that moulded my research. He has also been extremely supportive and understanding, especially with the choice of my research path. In addition, I would like to thank my advisors for supporting me and making it possible for me to attend numerous summer schools and conferences. No matter how much I write in this note, it is impossible to express my sincere gratitude to my advisors. I would also like to thank all my colleagues from my group, known as "Satinlab": Abderrahmen Kammoun, Christophe Gravier, Julien Subercaze, Kamal Singh, Jules Chevalier. Special thanks to Christophe and Julien for giving me feedback and insightful suggestions to improve my work. In addition, a special thanks to Antoine Zimmerman for offering me his services and insights into the theoretical aspects of my work, and to my thesis reviewing committee (Angela Bonifati and Marie-Christine Rousset) for providing insightful comments and suggestions. Finally, thanks to my great family for giving me so much support and guidance. Mom, Dad and my sisters, you are the best, and you instilled in me the confidence, curiosity and discipline it takes to be successful. Thank you so much. I would also like to thank my two best mates Calum and Adam for providing me with such great company. Last but not least, thanks to my girlfriend and psychiatrist Céline for supporting me during my difficult times and encouraging me to be up to the task. Il n'existe pas de chemin tracé pour mener l'homme à son salut; il doit en permanence inventer son propre chemin.
Mais pour inventer, il est libre, responsable, sans excuse et tout espoir est en lui. There is no traced-out path to lead a man to his salvation; he must constantly invent his own path. But, to invent, he is free, responsible, without excuse, and every hope lies within him. -- Jean-Paul Sartre

The Semantic Web stems from the ideas of Tim Berners-Lee, and recently we have seen its adoption even at the industrial level. The whole Semantic Web ecosystem relies on its data model, RDF, and ontological languages such as RDF Schema and the Web Ontology Language (OWL). RDF data consist of triples, where an RDF triple can be seen as representing an atomic "fact" or "claim", and consists of a subject, a predicate and an object. A set of these triples forms an RDF graph. Early research on processing RDF data had a similar start to that of the relational data model: the data are persisted and indexing techniques are built on top to process it with expressive query languages -- SPARQL is the SQL of RDF, and triple stores are the relational stores of RDF. In recent years, a number of highly efficient RDF triple stores have been engineered, storing billions of triples with high-speed query processing. However, in today's applications the assumption of static data may not hold, and data items arrive in a continuous, ordered sequence. Consider a few examples: on social networks, people continuously collaborate, producing data in a continuous manner; sensors, which are ubiquitous devices crucial for a multitude of applications, continuously produce situational data. Hence, for such applications data are always in motion, describe a dynamic world, and carry an additional attribute of time. Such data are not only produced rapidly, but also continuously -- hence forming data streams.
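
To make the triple model concrete, here is a minimal illustration in plain Python (no RDF library; the `ex:` names are invented for the example): an RDF graph as a set of (subject, predicate, object) triples, with a naive wildcard match playing the role of a single SPARQL triple pattern.

```python
# An RDF graph as a set of (subject, predicate, object) triples.
graph = {
    ("ex:Alice", "ex:knows", "ex:Bob"),
    ("ex:Bob",   "ex:knows", "ex:Carol"),
    ("ex:Alice", "ex:age",   "30"),
}

def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None is a wildcard, like ?x in SPARQL."""
    return sorted(t for t in graph
                  if (s is None or t[0] == s)
                  and (p is None or t[1] == p)
                  and (o is None or t[2] == o))

# match(graph, p="ex:knows") plays the role of:
#   SELECT ?s ?o WHERE { ?s ex:knows ?o }
```

Real triple stores index these triples (typically in several orderings such as SPO, POS, OSP) so that each such pattern lookup is far cheaper than this linear scan.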
This highly dynamic and unbounded nature of data streams requires that a new processing paradigm be found: data from a variety of sources are pushed into the system and are processed by persistent and continuous queries, which continuously produce results as new data items arrive. These characteristics of data streams are especially challenging for both DBMSs and RDF query processors. The reasons are two-fold: first, data streams are usually produced at a very high frequency, often in a bursty manner, can pose real-time requirements on processing applications, and may only allow them one pass over the data. Second, data streams result in a high volume of data, such that not all of it can be stored and processed. Considering these requirements, traditional DBMSs are simply not suitable to process data streams in a timely fashion. Thus, a new research field, Data Stream Management Systems (DSMSs), was introduced in parallel to DBMSs with the following novel requirements:
• The computation is performed in a push-based manner, i.e. it is data driven: newly arrived data items are continuously pushed into the DSMS to be processed.
• The DSMS's queries are persistent, and continuously processed throughout the lifetime of the streams. The results of these continuous queries also take the form of streams.
• Data streams are considered to be unbounded, thus they cannot be stored in their entirety. Instead, a portion of the recent data items is stored and processed, where the boundaries of recency are defined by the users. These boundaries are generally called windows.
• Due to the requirement of a real-time response, DSMSs should employ main memory to process the most recent data items within windows.
• New data models and query languages are required to comply with the above-mentioned requirements.
In this chapter, we provide a broad overview of the history and key concepts of the Semantic Web.
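
The window concept in the requirements above can be sketched as follows (a generic time-based sliding window, not any particular system's implementation): only items whose timestamps fall within the last `width` time units are kept in main memory, and older items are evicted as the stream advances.

```python
from collections import deque

class TimeWindow:
    """Time-based sliding window: retain only the items whose timestamp
    lies within the last `width` time units of the newest arrival."""

    def __init__(self, width):
        self.width = width
        self.items = deque()  # (timestamp, value), timestamps non-decreasing

    def insert(self, ts, value):
        self.items.append((ts, value))
        self.evict(ts)

    def evict(self, now):
        # Expired items sit at the front because timestamps are ordered.
        while self.items and self.items[0][0] <= now - self.width:
            self.items.popleft()

    def contents(self):
        return [v for _, v in self.items]
```

Because arrivals are time-ordered, eviction only ever inspects the front of the deque, so each item is appended once and removed once: O(1) amortised per event.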
These concepts provide crucial background to our discussion of semantically-enabled Stream Processing and Complex Event Processing. The chapter starts with the evolutionary history of the World Wide Web and then presents the case for the Semantic Web.
Complex Event Processing (CEP) deals with matching a stream of events against query patterns to extract complex matches. These matches emerge incrementally over time while partial matches accumulate in memory. The number of partial matches for expressive CEP queries can be polynomial or exponential in the number of events within a time window. Hence, traditional strategies result in extensive memory and CPU utilisation. In this paper, we revisit the CEP problem through the lens of complex queries with expressive operators (skip-till-any-match and Kleene+). Our main result is that traditional approaches, based on the storage of partial matches, are inefficient for these types of queries. We devise a simple yet efficient recomputation-based technique that experimentally outperforms traditional approaches on both CPU and memory usage.
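
A toy sketch of the idea (my own illustration, not the paper's actual algorithm): for a pattern like SEQ(A+, B) under skip-till-any-match, each B event can complete any non-empty, order-preserving subset of the A events in the window, so the set of partial matches grows exponentially. A recomputation-based engine stores only the raw events and enumerates full matches lazily when a B arrives.

```python
from itertools import combinations

def matches_on_demand(window, event):
    """Recomputation-based evaluation of SEQ(A+, B) under skip-till-any-match:
    keep only raw events in `window`; when a B arrives, enumerate every full
    match instead of maintaining the exponential set of partial matches.
    Events are (type, payload) tuples."""
    if event[0] != "B":
        return
    a_events = [e for e in window if e[0] == "A"]   # stream order preserved
    for r in range(1, len(a_events) + 1):
        for combo in combinations(a_events, r):     # order-preserving subsets
            yield combo + (event,)
```

With k A-events in the window, a single B yields 2^k - 1 matches; enumerating them on demand trades repeated CPU work for O(window) memory, which is precisely the trade-off the paper examines.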

Annals of King Edward Medical University, 2014
Objective: To find out the association between shoulder impingement and the morphological characteristics of the acromion. Methods: This descriptive cross-sectional study was conducted at the Department of Orthopedic Surgery and Traumatology (DOST-I), Mayo Hospital Lahore, between 1st January and 30th October 2013. The study population comprised 60 patients with shoulder pain, aged 40 years and above. We analyzed demographic variables for frequencies and the associations between variables using SPSS version 17.0. The significance level was p<0.05. Results: Of the 60 patients, the majority, 36 (60%), were female and 24 (40%) were male. Most of them, 27 (45%), had a type II acromion, which is curved in morphology. Conclusion: We concluded that shoulder impingement syndrome was more common among females than males. It was the Bigliani type II acromion causing shoulder impingement in the age group of 40 years and above.

Federalism and Provincial Autonomy in Pakistan: A Case of Balochistan
Global Legal Studies Review, 2021
The present study intends to explore the various aspects of the issue of provincial autonomy in the federation of Pakistan in general, and in Balochistan in particular, and prescribes possible resolutions. Although a federal system of government remains essential for diverse societies like Pakistan, it has not been able to lessen the grievances of its various units, particularly Balochistan, over time, due to poor implementation of policies. The 18th Amendment to the constitution discourages the centralization of power and demands political and economic autonomy for the smaller provinces like Balochistan, but the authoritarian trend of our political system has never permitted this to happen. The study suggests that the deprivation of the people of Balochistan can be ended by strengthening the federation of Pakistan through a few viable steps, such as decentralization of power and restoration of democratic values.

Primary Care Diabetes, 2015
Introduction: Managing people with diabetes is a health priority worldwide. Cost-benefit attempts at avoiding non-elective admissions (NEA) have had some success. To develop an NEA avoidance service, we audited multiple NEA in those with diabetes. Method: All people with diabetes who had ≥3 NEA to our hospital over 12 months were identified (n = 418); 104 (1 in 4) patients were randomly selected and retrospective data were collected on 98 subjects for their index (latest, 3rd) admission. Results: Of 98 subjects (50 males, 60 Caucasians, 86 type 2 diabetes, aged 69 ± 16 years), conditions contributing to admission included significant co-morbidities in 95 patients (≥2 in 57, ≥4 in 24). Only 14 admissions were directly due to diabetes: hypoglycaemia (5); hyperglycaemia (6); DKA (2); infected foot ulcer (1). 97 admissions were justified at the time of presentation. However, whilst 78 were unavoidable, 19 were deemed avoidable, amongst whom 10 were diabetes related. The majority of re-admissions were due to multi-morbidity and were often non-diabetes related. The concept of avoidability must be distinguished from justification at the time of acute need. This would allow the prospective identification of high-risk patients and requires an integrated working process to avoid NEA.

Proceedings of the 7th ACM Symposium on Information, Computer and Communications Security, 2012
Spam botnets are no longer driven by personal agenda, fun or proof of skills, but by an underlying economic engine. Until recently, intrusion detection techniques approached spambots as a purely behavioural traffic detection problem using statistical features of mail traffic. Recently, some efforts have been made to comprehend the underlying economic engine of spambots. These approaches either present an abstract view of the spambot economy or adopt a purely measurement-based approach to quantify it. No study so far has tried to bridge the gap between spambot detection and spambot economic modeling. We formalize the spambot economic system to relate spammer effort to spammer utility. We use standard consumer economic theory to translate spam activity into spammer utility. We also constrain this spammer utility through the statistical features of mail traffic used by existing spambot detection techniques.

Journal of Diabetes Mellitus, 2012
Research reports have identified a moderate reduction in glycated haemoglobin with education interventions, regardless of age group. Our study objective was to evaluate pharmacist interventions in providing patient home care. A 24-week longitudinal quasi-experimental pretest/post-test study design was used to assess the effectiveness of a diabetes education program to enhance self-care practices. A double-blinded randomized study design was considered but was not feasible, as the investigator was responsible for implementing the intervention and collecting data on outcomes. Since this was a longitudinal study, a 25% attrition rate was included in the calculation of the sample size. Hence, the sample size for the proposed study was 106 subjects, with 53 subjects in each group. All analyses were done using SPSS version 18. The level of significance was set at 0.05. The Research Ethics Committee of the hospital and the Malaysian Medical Research and Ethics Committee approved the study. Of the 109 subjects who met the study-entry criteria, 3 declined to participate due to lack of time and interest. There was no significant relationship between the demographic and clinical characteristics of participants who completed the study, and no significant difference between the intervention and control groups who completed the study in demographic, clinical and psychosocial contexts. Of the 47 subjects from the intervention group who reported adherence to their daily medication intake after the education intervention, 51 subjects (31.9%) reported taking their medication at the wrong time. The recommended times for oral anti-hyperglycemic medication (OAM) are: sulphonylureas 30 minutes before food, acarbose with food, and metformin with or within 30 minutes after food.
This research has shown that a brief structured education program incorporating behavioural science, specifically self-efficacy, was effective in enhancing self-care practices (SMBG and medication adherence) and improving glycaemic control in the intervention group.

International Archives of Medicine, 2014
Background: Hyperglycemia and hypocalcaemia have separately been attributed to adverse outcomes in critically ill patients. This study aimed to determine whether hyperglycemia and hypocalcaemia together affect the post-operative course of thyroidectomy, and to evaluate the impact of gender and age on the extent of the clinical condition. Methods: All patients who underwent thyroidectomy between 1st January 2012 and 30th June 2013 in HPP and HUSM Kelantan, Malaysia, were included. Serum evaluations were made on 4 consecutive readings at 6-hour intervals. A predictive trend was established to identify hypocalcaemic and hyperglycemic conditions. Ethical approvals and patients' consent forms were obtained prior to conducting this study. Results: The incidence of hyperglycemia [≥150 mg/dl (8.3 mmol/L)] and hypocalcaemia [serum calcium <8.5 mg/dl (2.2 mmol/L)] was 39.4% and 43.9% respectively. Hyperglycemia and hypocalcaemia were associated with age and length of stay; a significant association was also found with pre-operative diagnosis. The interaction of hyperglycemia and hypocalcaemia did not show separate effects on mortality. As demonstrated, the prevalence of hyperglycemia and hypocalcaemia in post-thyroidectomy patients is considerably high, and a linear association pattern was shown. However, considering disease severity, the association of hyperglycemia and hypocalcaemia with surgical ward indicators of morbidity could not be verified.

Replantation Versus Revision of Amputated Fingers in Patients Air-Transported to a Level 1 Trauma Center
The Journal of Hand Surgery, 2010
To assess the rate of replantation versus revision of amputated fingers in patients air-transported to a tertiary care hand trauma center. We included 40 consecutive subjects (70 digits) who were transported via air after digit amputation distal to the metacarpophalangeal joint. The primary outcome measure was the type of surgery (attempted replantation vs revision of the amputation). Data were collected prospectively. We identified 3 groups of patients. In group 1 (15 patients, 23 digits), replantation of one or more digits was attempted. In group 2 (6 patients, 8 digits), replantation was not elected. In group 3 (19 patients, 39 digits), no digits were suitable for replantation. The mean age was 36.2 years (range, 5-69 years) and the mean time of transport was 5.15 hours (range, 1-24 hours). Mechanisms of finger injury were crush (n = 34), followed by clean cut (n = 15), avulsion/crush (n = 15), and gunshot (n = 6). No significant differences were found between groups for age or time elapsed from injury to hospital arrival. Most patients (n = 25; 65%) transported via air did not undergo replantation surgery. Injury characteristics (n = 18 patients, 72%) were the main reason not to replant. The most common reason for patient refusal of replantation was the inability to return to work immediately.
The most common reasons for the surgeon's decision not to replant were single-digit amputations proximal to the flexor digitorum superficialis attachment (7 patients) and crush/avulsion type injuries (7 patients), followed by health status and age (5 patients). This study shows that a considerable portion of patients transported via air do not undergo replantation surgery. Further studies are needed to establish whether this service is overused.

D-Lib Magazine, 2013
This paper addresses the problem of name disambiguation in the context of digital libraries that administer bibliographic citations. The problem occurs when multiple authors share a common name or when multiple name variations for an author appear in citation records. Name disambiguation is not a trivial task, and most digital libraries do not provide an efficient way to accurately identify the citation records of an author. Furthermore, the lack of complete metadata information in digital libraries hinders the development of a generic algorithm applicable to any dataset. We propose a heuristic-based, unsupervised and adaptive method that also examines users' interactions in order to include users' feedback in the disambiguation process. Moreover, the method exploits important features associated with author and citation records, such as co-authors, affiliation, publication title, venue, etc., creating a multilayered hierarchical clustering algorithm which adapts itself according to the available information and forms clusters of unambiguous records. Our experiments on a set of researchers' names considered to be highly ambiguous produced high precision and recall results, and decisively affirmed the viability of our algorithm.
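
One layer of such a clustering can be sketched as follows (an illustrative simplification, not the paper's full multilayered algorithm): citation records are greedily merged whenever they share at least a threshold number of co-authors, a single-link pass over just one of the several features the method exploits.

```python
def cluster_by_coauthors(records, threshold=1):
    """Single-link clustering on one feature: merge citation records that
    share at least `threshold` co-authors. Each record is a dict carrying a
    'coauthors' set (other features -- venue, affiliation -- are omitted)."""
    clusters = []
    for rec in records:
        hits = [c for c in clusters
                if any(len(rec["coauthors"] & r["coauthors"]) >= threshold
                       for r in c)]
        merged = [rec]
        for c in hits:            # a new record may transitively join clusters
            merged.extend(c)
            clusters.remove(c)
        clusters.append(merged)
    return clusters
```

A layered method would run a pass like this per feature, descending to weaker features (title similarity, venue) only when the stronger ones leave records ambiguous.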

australasian medical journal, 2008
Objectives: This case series describes new clinical features of rove beetle dermatitis. Methods: Interviews were conducted with four students at Universiti Sains Malaysia with current or past rove beetle skin lesions. Information on the onset of symptoms, complications, treatment and duration of symptoms was recorded. A physician at a local clinic was also interviewed to describe the challenges in diagnosis and therapy for this condition. Results: Redness, swelling, fatigue and localised stretching of the skin were the commonly reported symptoms. However, vibrations, twitching of the skin and difficulty in breathing were new features observed in this study. Three of the four patients were not aware of the aetiology of the condition; antibiotics and topical steroids were prescribed for management and prevention of secondary infection. Conclusion: Rove beetle dermatitis is a common seasonal condition, endemic in Malaysia, with a higher incidence in the months of September and March. This case series highlights the need for a health literacy program aimed at informing the public and medical practitioners about the aetiology, symptoms and complications of this condition.

Semantic Web, 2018
The field of Complex Event Processing (CEP) relates to the techniques and tools developed to efficiently process pattern-based queries over data streams. The Semantic Web, through its standards and technologies, is in constant pursuit of solutions for this paradigm while employing the RDF data model. The integration of Semantic Web technologies in this context can handle the heterogeneity, integration and interpretation of data streams at the semantic level. In this paper, we propose and implement a new query language, called SPA, that extends SPARQL with new Semantic Complex Event Processing (SCEP) operators that can be evaluated over RDF graph-based events. The novelties of SPA include (i) the separation of general graph pattern matching constructs and temporal operators; (ii) the support for RDF graph-based events and multiple RDF graph streams; (iii) the expressibility of temporal operators such as Kleene+, conjunction, disjunction and event selection strategies; and (iv) operators to integrate background information with streaming RDF graphs. Hence, SPA enjoys good expressiveness compared with existing solutions. Furthermore, we provide an efficient implementation of SPA using a non-deterministic finite automaton (NFA) model for the efficient evaluation of SPA queries. We provide the syntax and semantics of SPA and, based on these, show how it can be implemented in an efficient manner. Moreover, we present an experimental evaluation of its performance, showing that it improves over state-of-the-art approaches.
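
The NFA-based evaluation can be illustrated with a deliberately simplified automaton (a sketch of the general idea, not SPA's actual runtime, and deterministic for brevity): the pattern SEQ(A, B+) compiles to states with a Kleene+ self-loop, and events matching no transition are skipped.

```python
# Transition table for SEQ(A, B+): state 0 = start, 1 = seen A, 2 = accepting.
TRANSITIONS = {
    (0, "A"): 1,
    (1, "B"): 2,
    (2, "B"): 2,   # Kleene+ self-loop on B
}

def run(stream):
    """Feed event types through the automaton; report the index of every
    event that puts the run into the accepting state. Skip semantics:
    events with no matching transition leave the state unchanged."""
    state, hits = 0, []
    for i, ev in enumerate(stream):
        nxt = TRANSITIONS.get((state, ev))
        if nxt is not None:
            state = nxt
        if state == 2 and ev == "B":
            hits.append(i)
    return hits
```

In the full model, each incoming RDF graph event is first matched against the query's graph patterns, and the resulting boolean outcome drives the automaton's transitions.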
All-optical 2R regeneration of two-wavelength time-interleaved signals is demonstrated using higher-order four-wave mixing in a fiber. In this regeneration scheme, a single fiber and a pump are shared by the two channels. The results show that interchannel crosstalk can be avoided when pulses in the two channels are time-separated by more than 3 times their pulsewidth. A possible power-monitoring approach to determine a suitable time-interleaving condition is also discussed.

Proceedings of the 9th ACM International Conference on Distributed Event-Based Systems, 2015
In this paper, we describe our novel system named RGraSPA (RDF Graph-based Stream Processing with Actors), which adheres to the realm of RDF graphs and knowledge reasoning, and uses an actor model for the distribution of continuous queries. Furthermore, we present our approach to solving the DEBS Grand Challenge by employing this system. RGraSPA uses an RDF graph-based event model to encapsulate a set of triples and process them in a continuous manner. We also present our synchronised structure traversal algorithm, which uses a Range tree to store results in a sorted view, where each node of the tree maintains a balanced Multimap Binary Search Tree (BST). The range of each node is adaptive and updated according to the incoming values and the defined size of the Multimap BST for each node. In order to solve the DEBS challenge, we provide a formal method to calculate cell IDs from longitude and latitude in a streaming fashion, and use two Range trees for the 10 most frequent routes and the most profitable areas. Our experimental results show that the query execution time can be optimised by carefully adjusting the cardinality values of the Range tree. Our solution processes one year's worth of RDF-ised data (372 GB, approx. 3.4 billion triples) for taxis in 1.8 hours.
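
The streaming cell-ID computation can be sketched generically as below (the grid origin and cell sizes are placeholders, not the challenge's actual constants): each coordinate maps in O(1) to a (row, column) pair on a fixed grid, with rows growing southwards and columns eastwards.

```python
def cell_id(lat, lon, origin_lat, origin_lon, cell_h_deg, cell_w_deg):
    """O(1), streaming-friendly mapping of a coordinate to 1-based grid
    indices. The origin is the north-west corner of cell (1, 1);
    cell_h_deg / cell_w_deg are the cell height and width in degrees."""
    row = int((origin_lat - lat) / cell_h_deg) + 1  # rows grow southwards
    col = int((lon - origin_lon) / cell_w_deg) + 1  # columns grow eastwards
    return row, col
```

Because the computation touches no shared state, it can be applied independently to every incoming event, which is what makes it suitable for a streaming, actor-distributed setting.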

Management and recognition of event patterns is becoming thoroughly ingrained in many application areas of Semantically-enabled Complex Event Processing (SCEP). However, the reliance of state-of-the-art technologies on the relational and RDF triple models, without a notion of time, has severe limitations. This restricts such systems from employing temporal reasoning at the RDF level and from using historical events to predict new situations. Additionally, the semantics of traditional query languages makes it quite challenging to implement distributed event processing. In our vision, SCEP needs to consider RDF as a first-class citizen and should implement parallel and distributed processing to deal with large amounts of data streams. In this paper, we discuss various challenges and limitations of state-of-the-art technologies and propose a possible solution to extend the RDF data model for stream processing and pattern matching. Furthermore, we describe a high-level query design that enables efficient parallel and distributed pattern matching through query rewriting.

British Journal of Diabetes & Vascular Disease, 2014
Aims: Optimisation of glycaemic control in type 1 diabetes often results in unwanted weight gain. Glucagon-like peptide-1 (GLP-1) agonist use is associated with weight reduction in type 2 diabetes, but its use in type 1 diabetes is little studied. Methods: We developed a protocol for GLP-1 agonist use in people with type 1 diabetes and obesity, in which liraglutide was initiated and up-titrated whilst insulin doses were simultaneously titrated according to glycaemic parameters. Results: Of 15 patients offered treatment, 8 proceeded. Baseline parameters were (n=8, mean ± SD): age 50 ± 6 years, BMI 40.4 ± 5.5 kg/m2, weight 123.0 ± 23.9 kg, HbA1c 8.5 ± 1.7%, total daily insulin dose 131 ± 112 units/day. By intention-to-treat analysis (n=8, 12 months), at 3, 6 and 12 months compared to baseline, weight loss was 6.8 ± 4.1 kg, 10.0 ± 5.6 kg and 8.9 ± 8.4 kg (p=0.026). The reductions in insulin dosage were significant over 6 months (n=8, p=0.045) and when analysing only those who completed 12 months of liraglutide therapy (n=6, p=0.044). Conclusions: GLP-1 agonist use in patients with type 1 diabetes may be advantageous where weight reduction is both a constraint and a therapeutic objective.