Training of object detection models using less data is currently the focus of existing N-shot learning models in computer vision. Such methods use object-level labels and take hours to train on unseen classes. There are many cases where a large amount of image-level labels is available for training but cannot be utilized by few-shot object detection models. There is a need for a machine learning framework that can be trained on any unseen class and remain useful in real-time situations. In this paper, we propose an “Unseen Class Detector” that can be trained within a very short time for any possible unseen class without bounding boxes with competitive accuracy. We build our approach on “Strong” and “Weak” baseline detectors, which we trained on existing object detection and image classification datasets, respectively. Unseen concepts are fine-tuned on the strong baseline detector using only image-level labels and further adapted by transferring the classifi...
Object detection is the task of classifying and localizing objects in an image or video. It has gained prominence in recent years due to its widespread applications. This article surveys recent developments in deep learning based object detectors. A concise overview of the benchmark datasets and evaluation metrics used in detection is also provided, along with some of the prominent backbone architectures used in recognition tasks. It also covers contemporary lightweight classification models used on edge devices. Lastly, we compare the performance of these architectures on multiple metrics.
Proceedings of the 2020 International Conference on Multimedia Retrieval
Event recognition is one of the popular areas of smart cities and has attracted great attention from researchers. Since the Internet of Things (IoT) is mainly focused on scalar data events, research is shifting towards the Internet of Multimedia Things (IoMT), which is still in its infancy. At present, multimedia event-based solutions provide low response time, but they are domain-specific and can handle only familiar classes (bounded vocabulary). However, multiple applications within smart cities may require processing of numerous familiar as well as unseen concepts (unbounded vocabulary) in the form of subscriptions. Deep neural network-based techniques are popular for image recognition, but they are limited by the need to train classifiers for unseen concepts and by the requirement of annotated bounding boxes with images. In this work, we explore the problem of training classifiers for unseen/unknown classes while reducing the response time of multimedia event processing (specifically object detection). We propose two domain adaptation based models that leverage Transfer Learning (TL) and Large Scale Detection through Adaptation (LSDA). The preliminary results show that the proposed framework can achieve 0.5 mAP (mean Average Precision) within 30 min of response time for unseen concepts. We expect to improve it further using a modified LSDA with the fastest classification (MobileNet) and detection (YOLOv3) networks, along with elimination of the requirement for annotated bounding boxes. CCS CONCEPTS • Information systems → Multimedia streaming; • Computing methodologies → Neural networks; • Software and its engineering → Publish-subscribe / event-based architectures.
Proceedings of the 11th ACM International Conference on Distributed and Event-based Systems
There has been substantial research in the area of event processing where systems are focused on event processing of structured data. However, in the context of smart cities, a significant number of real-time applications for event-driven systems consist of image data, rather than structured events. Therefore, there is a need for a system that can process multimedia events such as images. This paper discusses challenges with processing images within event-based systems. CCS CONCEPTS • Information systems → Multimedia streaming; • Software and its engineering → Publish-subscribe / event-based architectures;
Image segmentation refers to the separation of objects from the background, and has been one of the most challenging aspects of digital image processing. In practice, it is impossible to design a segmentation algorithm with 100% accuracy, and therefore numerous segmentation techniques have been proposed in the literature, each with certain limitations. In this paper, a novel Falling-Ball algorithm is presented, which is a region-based segmentation algorithm and an alternative to the watershed transform (based on the waterfall model). The proposed algorithm detects the catchment basins by assuming that a ball falling from hilly terrain will stop in a catchment basin. Once catchment basins are identified, the association of each pixel with one of the catchment basins is obtained using multicriterion fuzzy logic. Edges are constructed by dividing the image into different catchment basins with the help of a membership function. Finally, a closed contour algorithm is applied to find closed regions...
Modern-day multimedia content generation and dissemination is moving towards the presentation of more and more 'realistic' scenarios. The switch from 2-dimensional (2D) to 3-dimensional (3D) has been a major driving force in that direction. In the recent past, a large number of approaches have been proposed for creating 3D images/videos, most of which are based on the generation of depth-maps. This paper presents a new algorithm for obtaining depth information about a depicted scene from an available pair of stereoscopic images. The proposed algorithm performs a pixel-to-pixel matching of the two images in the stereo pair to estimate depth. It is shown that the obtained depth-maps improve on their reported counterparts.
3D imaging has lately been one of the fast-growing technologies, finding applications in all walks of life. Stereoscopic images are in great demand as people want to view images in a manner similar to the way the eyes create 3D pictures. Artificial stereoscopy uses left and right images captured by two cameras simultaneously. The shift of the right image relative to the left image gives us the perception of depth. In this work, we have developed a DepthMap from left and right images. The DepthMap conveys depth through a gray-coded image and also gives the accurate depth of each point of the objects. There are many 3D formats, which can be visualized through glasses, 3D televisions, cross-eyed methods, etc. In this work, we have implemented algorithms for Anaglyph images and JPS formats. Anaglyph images can be viewed through Red-Blue glasses and JPS images through cross-eyed methods. These formats can be easily converted to other formats such as MPO. One massively potential applic...
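As a rough illustration of the Anaglyph format mentioned above, the following minimal sketch combines a left and a right view into one red-blue image. It is not the authors' implementation: the function name `make_anaglyph`, the nested-list image representation, and the choice to take red from the left view, blue from the right view, and zero out green are all assumptions made for this example.

```python
def make_anaglyph(left, right):
    """Combine two equally sized RGB images into a red-blue anaglyph.

    Images are lists of rows, each row a list of (r, g, b) tuples.
    Assumption: red comes from the left view, blue from the right view,
    and green is set to 0 (a pure red-blue composite).
    """
    if len(left) != len(right) or any(
        len(l_row) != len(r_row) for l_row, r_row in zip(left, right)
    ):
        raise ValueError("left and right images must have the same size")
    return [
        [(l_px[0], 0, r_px[2]) for l_px, r_px in zip(l_row, r_row)]
        for l_row, r_row in zip(left, right)
    ]


# Tiny 1x2 example images (hypothetical pixel values).
left_view = [[(200, 10, 20), (100, 50, 60)]]
right_view = [[(5, 15, 250), (7, 8, 9)]]
anaglyph = make_anaglyph(left_view, right_view)
```

Viewed through red-blue glasses, each eye then receives mostly the channel taken from its own view, which is what produces the depth impression.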
A Survey on Object Detection for the Internet of Multimedia Things (IoMT) using Deep Learning and Event-based Middleware: Approaches, Challenges, and Future Directions
The enormous growth of multimedia content in the field of the Internet of Things (IoT) leads to the challenge of processing multimedia streams in real time. Event-based systems are constructed to process event streams. They cannot natively consume the multimedia event types produced by Internet of Multimedia Things (IoMT) data to answer multimedia-based user subscriptions. Machine learning-based techniques have enabled rapid progress in solving real-world problems and need to be optimised for the low response time of the multimedia event processing paradigm. In this paper, we describe a classifier construction approach for the training of online classifiers that can handle dynamic subscriptions with low response time and provide reasonable accuracy for multimedia event processing. We find that current object detection methods can be configured dynamically for the construction of classifiers in real time by tuning hyperparameters, even when training from scratch. O...
Event processing systems serve as middleware between the Internet of Things (IoT) and the application layer by allowing users to subscribe to events of interest. Due to the increase in multimedia IoT devices (e.g. traffic cameras), the types of events created are shifting more toward unstructured (multimedia) data. Therefore, there is a growing demand for efficient and effective processing of streams of both structured events (e.g. sensors) and unstructured multimedia events (e.g. images, video, and audio). However, current event processing engines have limited or no support for unstructured event types. In this paper, we describe a generalized approach that can handle Internet of Multimedia Things (IoMT) events as a native event type in event processing engines with high efficiency. The proposed system extends event processing languages with operators for multimedia analysis of unstructured events and leverages a deep convolutional neural network based event matcher to extract features from image events. Furthermore, we show that neural network based object detection models can be further optimized by leveraging subscription constraints to reduce time complexity while maintaining competitive accuracy. Our initial results demonstrate the feasibility of a generalized approach toward IoMT-based event processing. Application areas for generalized event processing include traffic management, security, parking, and supervision activities to enhance the quality of life within smart cities.
Proceedings of the Second International Conference on Information and Communication Technology for Competitive Strategies - ICTCS '16, 2016
The importance of a high-performance sorting algorithm with low time complexity cannot be overstated. Several benchmark algorithms, such as Bubble Sort, Insertion Sort, Quick Sort, and Merge Sort, have tried to achieve these goals, but with limited success in some scenarios. Newer algorithms like Shell Sort, Bucket Sort, and Counting Sort have their own limitations in terms of the category/nature of elements they can process. The present paper is an attempt to enhance the performance of the standard Merge-Sort algorithm by eliminating the partitioning complexity component, thereby resulting in smaller computation times. Both subjective and numerical comparisons are drawn with existing algorithms in terms of time complexity and data sizes, which show the superiority of the proposed algorithm.
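The abstract does not say how the partitioning step is eliminated. For orientation only, one well-known variant that avoids recursive partitioning is bottom-up merge sort, sketched below; this is a swapped-in textbook technique, not the paper's algorithm, and the names `merge` and `bottom_up_merge_sort` are chosen for this example.

```python
def merge(a, b):
    """Merge two already sorted lists into one sorted list (stable)."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def bottom_up_merge_sort(items):
    """Sort without a recursive split: start from single-element runs
    and merge neighbouring runs pairwise until one run remains."""
    runs = [[x] for x in items]
    while len(runs) > 1:
        merged = []
        for k in range(0, len(runs), 2):
            if k + 1 < len(runs):
                merged.append(merge(runs[k], runs[k + 1]))
            else:
                merged.append(runs[k])  # odd run out, carried to next pass
        runs = merged
    return runs[0] if runs else []


sorted_items = bottom_up_merge_sort([5, 2, 9, 1, 5, 6])
```

The merge work is the same as in the recursive version; only the top-down division of the input is replaced by iterating over runs of growing length.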
Ant Colony Optimization (ACO) is a nature-inspired algorithm for solving optimization problems and has proved to be a powerful tool in image processing. It works on the principle that a moving ant leaves pheromones on its path, which serve as a guide for other ants to follow. ACO is complex and time-consuming. In this paper, a multi-threading based implementation of ACO is proposed for identifying edges in images. It combines multi-threading with ACO to increase the randomness among the artificial ants. The algorithm is implemented and its performance is measured in terms of time complexity. Simulation results show that the proposed method has significantly lower execution time than conventional ACO for edge detection.
Image segmentation is used to separate objects from the background, and it has proved to be a powerful tool in bio-medical imaging. In this paper, an Improved Edge Detection algorithm for brain-tumor segmentation is presented. It is based on Sobel edge detection: it combines the Sobel method with an image-dependent thresholding method and finds different regions using a closed contour algorithm. Finally, tumors are extracted from the image using intensity information within the closed contours. The algorithm is implemented in C and its performance is measured objectively as well as subjectively. Simulation results show that the proposed algorithm gives superior performance over conventional segmentation methods. For comparative analysis, various parameters are used to demonstrate the superiority of the proposed method over the conventional ones.
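The first stage described above, Sobel gradients followed by a threshold derived from the image itself, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the paper's C implementation: the threshold rule (mean gradient magnitude) is an assumption, since the abstract does not specify it, and the function names `sobel_magnitude` and `edge_map` are hypothetical.

```python
def sobel_magnitude(img):
    """Sobel gradient magnitude of a grayscale image (list of rows).
    Border pixels are left at 0 because the 3x3 kernel does not fit."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]) - (
                img[y - 1][x - 1] + 2 * img[y][x - 1] + img[y + 1][x - 1]
            )
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]) - (
                img[y - 1][x - 1] + 2 * img[y - 1][x] + img[y - 1][x + 1]
            )
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag


def edge_map(img):
    """Binary edge map: 1 where the gradient magnitude exceeds the
    image's mean magnitude (assumed image-dependent threshold)."""
    mag = sobel_magnitude(img)
    flat = [v for row in mag for v in row]
    threshold = sum(flat) / len(flat)
    return [[1 if v > threshold else 0 for v in row] for row in mag]


# 5x6 test image with a vertical step edge between columns 2 and 3.
step_image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
edges = edge_map(step_image)
```

The closed contour search and the intensity-based tumor extraction described in the abstract would operate on a map like `edges`; they are not sketched here.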
Proceedings of the 2020 International Conference on Multimedia Retrieval
The Internet of Multimedia Things (IoMT) is an emerging concept due to the large amount of multimedia data produced by sensing devices. Existing event-based systems mainly focus on scalar data, and multimedia event-based solutions are domain-specific. Multiple applications may require handling of numerous known/unknown concepts which may belong to the same/different domains with an unbounded vocabulary. Although deep neural network-based techniques are effective for image recognition, the limitation of having to train classifiers for unseen concepts will lead to an increase in the overall response time for users. Since it is not practical to have all trained classifiers available, it is necessary to address the problem of training classifiers on demand for an unbounded vocabulary. By exploiting transfer learning based techniques, evaluations showed that the proposed framework can answer within ∼0.01 min to ∼30 min of response time with accuracy ranging from 95.14% to 98.53%, even when all subscriptions are new/unknown. CCS CONCEPTS • Information systems → Multimedia streaming; • Computing methodologies → Neural networks; • Software and its engineering → Publish-subscribe / event-based architectures.
Papers by asra aslam