Papers by Marina Ivasic-Kos
Proceedings of the Third Workshop on Vision and Language, 2014
In order to bridge the semantic gap between the visual context of an image and the semantic concepts people would use to interpret it, we propose a multi-layered image representation model that accounts for the different amounts of knowledge needed to interpret the image at each layer. Interpretation results on different semantic layers of Corel images related to outdoor scenes are presented and compared. The results show a positive correlation of precision and recall with the abstraction level of the classes used for image annotation, i.e., more generalized classes achieved better results.
In order to exploit the massive amount of image information and to handle overload, techniques for analyzing image content to facilitate indexing and retrieval of images have emerged. In this paper, low-level and high-level semantic image annotation based on a Fuzzy Petri Net is presented. A knowledge scheme is used to define more general and complex semantic concepts and their relations in the context of the examined outdoor domain. A formal description of hierarchical and spatial relationships among concepts from the outdoor image domain is given. The automatic image annotation procedure, based on a fuzzy recognition and inheritance algorithm that maps high-level semantics to an image, is presented together with experimental results.
2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2018
The popularity of surveillance systems is growing, as is the need for better security systems, particularly in bad lighting conditions or at night. The aim of a security system is to collect as many details as possible to enable better recognition of persons. In this paper, a comparison of representative thermal face recognition methods is given, emphasizing their strengths and weaknesses. Then, trends in the development of surveillance and security systems are outlined, such as the fusion of visible and thermal images and the use of convolutional neural networks. Existing challenges of thermal facial recognition and its real-world applications are also pointed out.
2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO)
Object detection is commonly used in many computer vision applications. In our case, we need to apply the object detector as a prerequisite for action recognition in handball scenes. To be successful for this task, object detection should be as accurate as possible and should handle a varying number of objects of various sizes, partial occlusion, bad illumination, and cluttered scenes. The aim of this paper is to provide an overview of current state-of-the-art detection methods that rely on convolutional neural networks (CNNs) and to test their performance on custom sports video material acquired during handball training and matches. A comparison of detector performance under different conditions is given and discussed.

IEEE Access
Due to the growing number of people who take part in adrenaline activities or adventure tourism and stay in the mountains and other inaccessible places, there is an increasing need to organize search and rescue (SAR) operations to provide assistance and health care to the injured. The goal of a SAR operation is to search the largest possible area in the shortest possible time and find a lost or injured person. Today, unmanned aerial vehicles (UAVs or drones) are increasingly involved in search operations, as they can capture a large, controlled area in a short amount of time. However, a detailed examination of the large amount of recorded material remains a problem. Even for an expert, it is not easy to find the people being searched for, who are relatively small compared with the area they are in, are often sheltered by vegetation or blend in with the ground, and may be in unusual positions due to falls, injuries, or exhaustion. Therefore, the automatic detection of persons and objects in images and videos taken by drones in these operations is very significant. In this paper, the reliability of existing state-of-the-art detectors such as Faster R-CNN, YOLOv4, RetinaNet, and Cascade R-CNN was investigated on the VisDrone benchmark and on SARD, a custom-made dataset built to simulate rescue scenes. After training the models on the selected datasets, detection results were compared. Because of its high speed and accuracy and its small number of false detections, the YOLOv4 detector was chosen for further examination. YOLOv4 model results related to different network sizes, different detection accuracies, and transfer learning settings were analyzed. The model's robustness to weather conditions and motion blur was also investigated. The paper proposes a model that can be used in SAR operations because of its excellent results in detecting people in search and rescue scenarios.
INDEX TERMS Convolutional neural networks, object detector, person detection, search and rescue operations, UAV, YOLO.
Detecting objects in drone imagery: a brief overview of recent progress
2020 43rd International Convention on Information, Communication and Electronic Technology (MIPRO)
Ball Detection Using Yolo and Mask R-CNN
2018 International Conference on Computational Science and Computational Intelligence (CSCI)
Player Tracking in Sports Videos
2019 IEEE International Conference on Cloud Computing Technology and Science (CloudCom)
Deep Image Captioning: An Overview
2019 42nd International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO)

2019 42nd International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO)
This paper presents an original thermal dataset designed for training machine learning models for person detection. The dataset contains 7412 thermal images of humans captured in various scenarios while walking, running, or sneaking. The recordings were captured in the LWIR segment of the electromagnetic (EM) spectrum in various weather conditions (clear, fog, and rain), at different distances from the camera, with different body positions (upright, hunched) and movement speeds (regular walking, running). In addition to the standard camera lens, we used a telephoto lens for recording, and we compared the image quality in different weather conditions and at different distances in both cases to set parameters that provide a level of detail sufficient to detect a person.
Task-Technology Fit and Continuance of Use of Web-Based Programming Tool: A Pilot Study
Human Systems Engineering and Design III

IEEE Access
Global terrorist threats and illegal migration have intensified concerns for the security of citizens, and every effort is made to exploit all available technological advances to prevent adverse events and protect people and their property. Because they can be used at night and in weather conditions where RGB cameras do not perform well, thermal cameras have become an important component of sophisticated video surveillance systems. In this paper, we investigate the task of automatic person detection in thermal images using convolutional neural network models originally intended for detection in RGB images. We compare the performance of standard state-of-the-art object detectors such as Faster R-CNN, SSD, Cascade R-CNN, and YOLOv3, which were retrained on a dataset of thermal images extracted from videos that simulate illegal movements around the border and in protected areas. The videos were recorded at night in clear weather, rain, and fog, at different ranges, and with different movement types. YOLOv3 was significantly faster than the other detectors while achieving performance comparable with the best, so it was used in further experiments. We experimented with different training dataset settings in order to determine the minimum number of images needed to achieve good detection results on test datasets. We achieved excellent detection results with respect to average accuracy for all test scenarios, although a modest set of thermal images was used for training. We also tested our trained model on different well-known and widely used thermal imaging datasets. In addition, we present the results of the recognition of humans and animals in thermal images, which is particularly important in the case of sneaking around objects and illegal border crossings. We also present our original thermal dataset used for experimentation, which contains surveillance videos recorded under different weather and shooting conditions.
INDEX TERMS Convolutional neural networks, object detector, person detection, surveillance, thermal imaging, YOLO.

Active Player Detection in Handball Scenes Based on Activity Measures
Sensors
In team sports training scenes, it is common to have many players on the court, each with their own ball, performing different actions. Our goal is to detect all players on the handball court and determine the most active player, who performs the given handball technique. This is a very challenging task for which, apart from an accurate object detector that is able to deal with complex cluttered scenes, additional information is needed to determine the active player. We propose an active player detection method that combines the Yolo object detector, activity measures, and tracking methods to detect and track active players over time. Different ways of computing player activity were considered, and three activity measures are proposed based on optical flow, spatiotemporal interest points, and convolutional neural networks. For tracking, we consider the use of the Hungarian assignment algorithm and the more complex Deep SORT tracker that uses additional visual appearance features to assi...
Human Detection in Thermal Imaging Using YOLO
Proceedings of the 2019 5th International Conference on Computer and Technology Applications - ICCTA 2019
Automatic image annotation refinement
2016 39th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2016
Multimodal Image Retrieval Based on Keywords and Low-Level Image Features
Lecture Notes in Computer Science, 2015
Two-tier image annotation model based on a multi-label classifier and fuzzy-knowledge representation scheme
Pattern Recognition, 2016
A knowledge-based multi-layered image annotation system
Expert Systems with Applications, 2015
Mobile application for finding ATMs
2015 38th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2015
Person de-identification in activity videos
2014 37th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2014