Academia.edu

Fully Convolutional Network

15 papers
2 followers
About this topic
A Fully Convolutional Network (FCN) is a type of deep learning architecture that utilizes convolutional layers without fully connected layers, enabling the model to accept input of arbitrary size and produce spatially structured outputs, commonly used for tasks such as image segmentation and pixel-wise classification.
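The arbitrary-input-size property described above can be illustrated with a minimal pure-Python sketch. It is not taken from any of the listed papers; the function names, the 3x3 averaging kernel, and the input sizes are all hypothetical choices for illustration. Because a convolutional layer only stores kernel weights, the same weights can be applied to inputs of different sizes, and the output keeps a spatial layout.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists), stride 1, no padding.

    As in most deep learning frameworks, this is technically cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def shape(grid):
    """Return (rows, columns) of a rectangular list of lists."""
    return (len(grid), len(grid[0]))


# Hypothetical 3x3 averaging kernel; the same weights serve every input size.
KERNEL = [[1.0 / 9.0] * 3 for _ in range(3)]

small = [[1.0] * 5 for _ in range(4)]  # 4 x 5 input
large = [[1.0] * 9 for _ in range(7)]  # 7 x 9 input

small_out = conv2d_valid(small, KERNEL)
large_out = conv2d_valid(large, KERNEL)

print(shape(small_out))  # (2, 3)
print(shape(large_out))  # (5, 7)
```

A fully connected layer, by contrast, would need a weight matrix whose size is fixed by the input dimensions, which is why removing such layers lets the network accept inputs of varying size.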

Key research themes

1. How can residual learning improve the optimization and depth scalability of fully convolutional networks for visual recognition tasks?

This research area centers on overcoming the degradation problem in deep convolutional networks by using residual learning within fully convolutional architectures. The challenge is to ease the optimization of substantially deeper networks while maintaining or improving accuracy on tasks such as semantic segmentation, object detection, and image classification. Residual networks (ResNets) reformulate convolutional layers as residual functions with identity shortcut connections, which makes very deep architectures easier to train without added complexity and thereby expands the representational capacity of fully convolutional networks.
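The identity shortcut can be sketched in a few lines of pure Python. This is a simplified illustration, not the implementation from any listed paper: the `linear` helper is a hypothetical toy stand-in for a convolutional layer, and normalization and the activation that usually follows the addition are omitted. The block computes x + F(x), so when the learned function F outputs zero the block reduces to the identity mapping.

```python
def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def linear(values, weights):
    """Toy stand-in for a convolutional layer: a dense matrix-vector product."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def residual_block(x, w1, w2):
    """Return x + F(x), where F is two toy layers with a ReLU in between."""
    fx = linear(relu(linear(x, w1)), w2)
    return [xi + fi for xi, fi in zip(x, fx)]


x = [0.5, -1.0, 2.0]
zeros = [[0.0] * 3 for _ in range(3)]

# With all-zero weights F(x) is zero, so the shortcut passes x through unchanged.
identity_out = residual_block(x, zeros, zeros)
print(identity_out)  # [0.5, -1.0, 2.0]
```

This shows why adding such blocks should not make a network harder to fit: a block can fall back to the identity by driving its residual branch toward zero.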

Key finding: Introduced a residual learning framework that explicitly lets layers fit residual mappings instead of unreferenced functions, implemented via identity shortcut connections in deep fully convolutional networks. This approach...
Key finding: Proposed increasing the width of Deep Residual Network in Network (DrNIN) architectures while reducing depth to counteract decreasing feature reuse and training inefficiencies in very deep residual networks. This wider...
Key finding: Developed a deep residual fully convolutional network with residual blocks for blind image denoising under various noise types including Gaussian, Poisson, and Poisson-Gaussian. Using cascade training for stage-wise residual...
Key finding: Presented a fully convolutional network leveraging residual learning for per-pixel free-space detection in automotive scenes, trained via self-supervised online methods using stereo disparity cues as weak labels. The residual...

2. What architectural modifications to fully convolutional networks can improve semantic segmentation accuracy, robustness, and computational efficiency across various domains?

This theme investigates CNN architectural variants that simplify, regularize, or extend fully convolutional networks to better capture spatial context, incorporate multi-scale feature representations, or reduce parameter complexity while maintaining or improving semantic segmentation accuracy. Innovations include replacing pooling layers with strided convolutions, using dilated convolutions, incorporating spatial regularization terms, fusing multi-modal data, and building lightweight designs tuned for real-time biomedical imaging and complex scene parsing.
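Two of these modifications can be compared with the standard output-size arithmetic for convolution and pooling layers. The sketch below is illustrative only: the function names and the layer settings (a 32-pixel input, 2x2 pooling, 3x3 kernels) are assumptions, not values taken from the listed papers. It shows that a stride-2 convolution can downsample by the same factor as max-pooling, while a dilated convolution widens the area each output sees without reducing resolution.

```python
def effective_kernel_size(kernel, dilation=1):
    """Span covered by a kernel once gaps of `dilation - 1` are inserted."""
    return dilation * (kernel - 1) + 1


def conv_output_size(n, kernel, stride=1, padding=0, dilation=1):
    """Output length along one axis for a convolution or pooling window."""
    span = effective_kernel_size(kernel, dilation)
    return (n + 2 * padding - span) // stride + 1


n = 32  # hypothetical input width

pooled = conv_output_size(n, kernel=2, stride=2)              # 2x2 max-pool, stride 2
strided = conv_output_size(n, kernel=3, stride=2, padding=1)  # 3x3 conv, stride 2
dilated = conv_output_size(n, kernel=3, padding=2, dilation=2)

print(pooled, strided)              # 16 16
print(dilated)                      # 32
print(effective_kernel_size(3, 2))  # 5
```

The dilated 3x3 kernel covers a 5-pixel span with only three weights per axis, which is the usual argument for using dilation to gather context in dense prediction.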

Key finding: Demonstrated that max-pooling layers in fully convolutional networks for semantic segmentation and image recognition can be replaced effectively by convolutional layers with stride greater than one, simplifying network...
Key finding: Employed dilated convolutions within fully convolutional networks to perform end-to-end and semi-supervised semantic segmentation of pathological lung interstitial diseases in HRCT scans. The model handled arbitrary image...
Key finding: Introduced spatial total variation (TV) regularization integrated into CNN segmentation models via modified activation functions (e.g., softmax), resulting in spatially smooth and robust segmentation maps within fully...
Key finding: Designed a novel lightweight fully convolutional encoder-decoder architecture incorporating dilated channel-wise CNN modules and feature pyramid representations for real-time segmentation of diverse biomedical images....
Key finding: Incorporated geometric and topological priors, including smoothness and hierarchical label relations, directly into the loss function of fully convolutional networks for histology gland segmentation. This topology-aware...

3. How can fully convolutional networks be extended or fused with specialized modules and learning strategies for enhanced scene understanding and multimodal image analysis?

This theme focuses on augmenting the core fully convolutional architecture with supplementary networks, loss functions, or learning paradigms to handle diverse modalities, improve detail representation, and enable adaptable or online learning. Research explores multi-modal data fusion (e.g., RGB-D), combined fully connected-convolutional layers for GANs, dual-path networks for image restoration, self-supervised training with weak labels, and recurrent or LSTM modules integrated with FCNs for temporal or weather image classification tasks.

Key finding: Developed an FCN-based semantic segmentation pipeline that processes RGB-D camera data to detect and localize objects in warehouse environments, handling limited training datasets via data augmentation, transfer learning, and...
Key finding: Proposed a novel GAN architecture combining deep fully connected layers with convolutional layers within both generator and discriminator networks, diverging from conventional convolution-only GAN designs. The FCC-GAN...
Key finding: Introduced dual CNN architectures composed of complementary subnetworks estimating structures and details separately for image restoration tasks. This approach integrated two fully convolutional networks trained with combined...
Key finding: Proposed an FCN combined with Long Short-Term Memory (LSTM) to leverage spatial and temporal features from weather image datasets for real-time weather condition classification. The FCN component extracted spatial image...

All papers in Fully Convolutional Network

Historically, weather forecasting was unreliable and imprecise, relying on intuition and local knowledge. Inaccurate weather forecasts can cause severe impacts on agriculture, construction, and daily life. Existing methods struggle with... more
Early diagnosis and proper grouping of tumors in the brain are critical for successful therapy and positive outcomes for patients. This work proposes a complete technique for identifying brain tumors that employs sophisticated artificial... more
Fully Convolutional Networks have been achieving remarkable results in image semantic segmentation, while being efficient. Such efficiency results from the capability of segmenting several voxels in a single forward pass. So, there is a... more
Facial landmark detection has been studied over decades. Numerous neural network (NN)-based approaches have been proposed for detecting landmarks, especially the convolutional neural network (CNN)-based approaches. In general, CNN-based... more
Convolutional Neural Networks (CNNs) have become one of the most popular techniques in image classification. CNN models are usually trained on a large amount of data, but this paper discusses CNN usage under data shortage and class... more
Ship plate recognition is challenging due to variations of plate locations and text types. This paper proposes an efficient Fully Convolutional Network based Plate Recognition approach, FCNPR, which uses a CNN (Convolutional Neural Network)... more
The world around us may be viewed as a network of entities interconnected via their social, economic, and political interactions. These entities and their interactions form a social network. A social network is often modeled as a graph... more
Brain tumors present a significant medical concern, posing challenges in both diagnosis and treatment. Deep learning has emerged as an evolving technique for automating the diagnostic process for brain tumors. This research paper... more
Non-intrusive, real-time analysis of the dynamics of the eye region allows us to monitor humans’ visual attention allocation and estimate their mental state during the performance of real-world tasks, which can potentially benefit a wide... more
The analysis of complex structured data like video has been a long-standing challenge for computer vision algorithms. Innovative deep learning architectures like Convolutional Neural Networks (CNNs), however, are demonstrating remarkable... more
Rapid and uncontrolled cellular proliferation is what distinguishes a brain tumor. Unfortunately, brain tumors cannot be prevented other than via surgery. As predicted, deep learning may help diagnose and cure brain cancers. The... more
Pupil localization extracts pupil center coordinates from images and videos of the human eye along with the pupillary boundary. Pupil localization essentially plays a major role in identity verification, disease recognition, visual focus... more
Classification of images has been a widely regarded challenge for the past decade, but a new type of object recognition problem which deals with pixel-level segmentation is posing a more complex task for both computer vision enthusiasts... more
Deep learning has shown state-of-the-art classification performance on datasets such as ImageNet, which contain a single object in each image. However, multi-object classification is far more challenging. We present a unified framework which... more
This project focuses on solving the inpainting problem using both Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) approaches. Popular methods of inpainting include Adobe Photoshop’s Content Aware Fill, which do not... more
Deep learning is a fast-growing machine learning approach to perceive and understand large amounts of data. In this paper, general information about the deep learning approach, which has attracted much attention in the field of machine... more
New ongoing rural construction has resulted in an extensive mixture of new settlements with old ones in the rural areas of China. Understanding the spatial characteristic of these rural settlements is of crucial importance as it provides... more
Accurate face landmark localization is an essential part of face recognition, reconstruction and morphing. To accurately localize face landmarks, we present our heatmap regression approach. Each model consists of a MobileNetV2 backbone... more
Accurate automatic detection of measurement points in ultrasound video sequences is challenging due to noise, shadows, anatomical differences, and scan plane variation. This paper proposes to address these challenges by a Fully... more
Recently proposed methods for weakly-supervised semantic segmentation have achieved impressive performance in predicting pixel classes despite being trained with only image labels which lack positional information. Because image... more
Integrated master's thesis, Biomedical Engineering and Biophysics (Clinical Engineering and Medical Instrumentation), Universidade de Lisboa, Faculdade de Ciências, 2021. In collective behaviour studies, the use of multi-animal tracking systems is... more
In this paper, we give a new double twist to the robot localization problem. We solve the problem for the case of prior maps which are semantically annotated perhaps even sketched by hand. Data association is achieved not through the... more
Segmented brain tissues from magnetic resonance images (MRI) always pose substantive challenges to the clinical research community, especially when making precise estimates of such tissues. In recent years, advancements in... more
In this paper, we present a new iris ROI segmentation algorithm using a deep convolutional neural network (NN) to achieve the state-of-the-art segmentation performance on well-known iris image data sets. The authors' model surpasses the... more
Many computer vision applications, such as Computer Aided Diagnosis, require an accurate and efficient eye detector. In this work, we present an efficient approach for determining the position of the eye in images presenting faces. First, a... more
Structured real world data can be represented with graphs whose structure encodes independence assumptions within the data. Due to statistical advantages over generative graphical models, Conditional Random Fields (CRFs) are used in a... more
This paper presents a novel Transformer-based facial landmark localization network named Localization Transformer (LOTR). The proposed framework is a direct coordinate regression approach leveraging a Transformer network to better utilize... more
Remote eye tracking technology has seen increasing growth in recent years due to its applicability in many research areas. In this paper, a video-oculography method based on convolutional neural networks (CNNs) for pupil center... more
Eye detection algorithms are being used in many fields such as camera applications for entertainment and commercial purposes, gaze detection applications, computer-human interaction applications, and eye recognition applications for... more
For the challenging semantic image segmentation task the most efficient models have traditionally combined the structured modelling capabilities of Conditional Random Fields (CRFs) with the feature extraction power of CNNs. In more recent... more
This paper deals with a mobile phone application that allows the user to analyse their driving habits and therefore recognize whether an electric vehicle would suit the users' requirements. The application analyses the daily trips and... more
In recent years new trends such as industry 4.0 boosted the research and development in the field of autonomous systems and robotics. Robots collaborate and even take over complete tasks of humans. But the high degree of automation... more
Recently, many works have been inspired by the success of deep learning in computer vision for plant diseases classification. Unfortunately, these end-to-end deep classifiers lack transparency which can limit their adoption in practice.... more
In the medical field, landmark detection in MRI plays an important role in reducing medical technician efforts in tasks like scan planning, image registration, etc. First, 88 landmarks spread across the brain anatomy in the three... more
Robust real-time tracking of the human body is crucial to applications that benefit from live visualizations performed on the underlying body. Such applications could fall in the category of Augmented Reality for Human Bodies, finding... more
A key step toward driver safety is observing the driver's activities, and the face is central to this process: it yields information such as head pose, blink rate, yawns, and talking to a passenger, which can then help derive higher... more
Accurate eye segmentation can improve eye-gaze estimation and support interactive computing based on visual attention; however, existing eye segmentation methods suffer from issues such as person-dependent accuracy, lack of robustness,... more