Hadoop Paradigm for Satellite Environmental Big Data Processing

Badr-Eddine Boudriki Semlali; Chaker El Amrani

doi:10.4018/IJAEIS.2020010102

2020, International Journal of Agricultural and Environmental Information Systems

https://doi.org/10.4018/IJAEIS.2020010102

Abstract

The strong growth of industrial, transport, and agricultural activities has led not only to air quality and climate change issues, but also to an increase in potential natural disasters. The emission of harmful gases, measured notably as the Vertical Column Density (VCD) of CO, SO2, and NOx, is one of the major factors causing these environmental problems. Our research aims to contribute to finding a solution to this hazardous phenomenon by using Remote Sensing (RS) techniques to monitor air quality, which may help decision makers. However, RS data are not easy to manage because of their huge size, high complexity, variety, and velocity. Thus, our manuscript explains the different aspects of the satellite data used. Furthermore, this article demonstrates that RS data can be regarded as big data. Accordingly, we have adopted the Hadoop big data architecture and explained how to process RS environmental data efficiently.
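To make the processing idea concrete, the following is a minimal sketch of the MapReduce pattern that Hadoop applies, written in plain Python rather than against any Hadoop API. It is not the authors' implementation: the record layout, the 1-degree grid cell, the function names, and all sample values are assumptions made for illustration only. The job averages VCD readings per gas and grid cell.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (gas, latitude, longitude, VCD value).
# The numbers are invented for illustration, not real measurements.
records = [
    ("CO", 35.76, -5.83, 2.0),
    ("CO", 35.71, -5.80, 4.0),
    ("SO2", 33.57, -7.59, 1.0),
    ("NOx", 33.57, -7.59, 3.0),
    ("NOx", 33.52, -7.61, 5.0),
]


def map_record(record, cell_size=1.0):
    """Map step: emit ((gas, grid cell), vcd) for one reading."""
    gas, lat, lon, vcd = record
    # Floor division keeps negative longitudes in the correct cell.
    cell = (int(lat // cell_size), int(lon // cell_size))
    return (gas, cell), vcd


def shuffle(pairs):
    """Shuffle step: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_group(values):
    """Reduce step: collapse one group of readings to its mean."""
    return mean(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on a single machine."""
    pairs = (map_record(r) for r in records)
    return {key: reduce_group(vals) for key, vals in shuffle(pairs).items()}


result = run_job(records)
for key, value in sorted(result.items()):
    print(key, value)
```

On a real cluster the map and reduce functions would run in parallel on different nodes over data blocks stored in HDFS; the single-process version above only shows how the key design (gas plus grid cell) determines what gets aggregated together.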

International Journal IJRITCC

Searching for information on the web today can be compared to dragging a net across the surface of the earth. While a great deal may be caught in the net, there is still a huge amount of information that is deep and therefore missed. The reason is simple: most of the Web's information is buried in dynamically produced sites, where data are hidden behind query interfaces, and standard search engines never find it. But a direct query is a laborious, "one at a time" way to find information. Several factors contribute to making this problem particularly challenging. The Web is changing at a constant pace: new sources are added, and old sources are removed and modified. Remote sensors generate massive amounts of real-time data from satellites or aircraft. Technology trends for Big Data embrace open source software, commodity servers, and massively parallel distributed processing platforms. Analytics is at the core of extracting value from Big Data to produce consumable insights for business and government. This paper presents an architecture for Big Data analytics and explores Big Data technologies offering SQL databases, the Hadoop Distributed File System, and MapReduce. The intended architecture is able to store incoming raw data to perform offline analysis on largely stored dumps when required. Finally, a detailed analysis of remotely sensed earth observatory Big Data for ground or sea level is offered using Hadoop. The proposed architecture is able to divide, load balance, and process in parallel only the useful data. Thus, it results in efficient analysis of real-time remote sensing Big Data using an earth observatory system.

RealBDA: A REAL TIME BIG DATA ANALYTICS FOR REMOTE SENSING DATA BY USING MAPREDUCE PARADIGM

IJESRT Journal

Satellite sensors generate enormous volumes of data, and storing and processing remote sensing data is a challenging task due to its variety and volume. This paper studies a real-time Big Data analytical architecture for remote sensing satellite applications. To handle remote sensing data, the proposed architecture comprises three main units: the Data Pre-Processing Unit (DPREU), the Data Analysis Unit (DAU), and the Data Post-Processing Unit (DPOSTU). First, DPREU acquires the required data from satellite sensors using filtration, balanced distributed storage, and parallel processing in a Hadoop environment. Second, DAU identifies hidden patterns in the data stored in the distributed file system using Map functions followed by Reduce functions in the MapReduce paradigm. Finally, DPOSTU is the upper-layer unit of the proposed architecture, responsible for compiling and storing the results and for generating decisions based on the results received from DAU.

In-place query driven big data platform: Applications to post processing of environmental monitoring

Lung-Cheng Lee

Concurrency and Computation: Practice and Experience, 2017

In this paper, we describe the work on the in-place query driven big data platform and applications built on the platform, which include processing climate simulation data and air pollution monitoring. The system architecture of this experimental platform comprises the NCHC supercomputer ALPS, a storage pool, one master data node, and 18 slave data nodes. The openSUSE operating system and MariaDB database are installed on all nodes. The master node is responsible for metadata management and information integration, and the 18 slave nodes for the distributed database and parallel model simulation and computation/analysis. The data are distributed to local nodes according to a pre-defined data partition plan. When application software, such as a simulation model or post-processing application, is executed on slave nodes, the relevant input data can be obtained by querying the local database, and computing is conducted locally. We have obtained the performance benchmark of the system from two applications, and both have satisfactory results. When it is applied to global climate simulation, the model simulation is carried out on the ALPS supercomputer, and the resultant temporal data are distributed to slave data nodes for parallel post-processing using MPI. The global forecast data can be further downscaled to regional and local areas through different spatial-scale refinement of models or a statistical data mining approach. When applied to air pollution monitoring, the platform is connected to EPA open data, which are used as an input for air pollution GTx model simulation. The simulation and data post-processing are both carried out by GTx on the slave nodes by way of distributed, shared-nothing processing. The influence weighting between a point source and a receptor or local monitoring station can thus be determined. Air quality monitoring requires considering all kinds of scenarios, including fixed point source and mobile source management. Therefore, it is necessary to run many combinations of cases and to build a knowledge base for fast decision support. Further study will focus on air pollution pre-warning and response in Taichung city, which will be linked with the smart city operation in the city.

IRJET- Validating the Data Acquisition and Serialization for Pollution Data using Big Data Analytics

IRJET Journal

IRJET, 2020

In this digital era, data is generated in great volume, variety and velocity. Not all of the data generated has significance. Insignificant and redundant data must be eliminated to form a quality dataset. This data, generated in terabytes and petabytes, led to the coining of the term Big Data. The data must be optimally analyzed to enable better models to provide high precision recommendations and solutions to serve mankind. A variety of big data tools are employed to facilitate the faster processing of big data. However, there is not enough evidence to prove whether the same tools and methods can be used to improve the analysis of much smaller data. To test this, some big data methods and techniques are applied to pollution data to improve the analysis of small data using big data analytical methods. The effect of quality of air on pollution is analyzed. Poor air quality is one of the major challenges that a country faces and is one of the leading causes of death. We analyse the major constituents of air that cause contamination of air.

A Stream Processing Software for Air Quality Satellite Datasets

Badr-Eddine Boudriki Semlali

2022

There is no doubt that air pollution harms human health. Urban areas are the most affected by the degradation of air quality caused by the discharge of anthropogenic gases from transport and industrial activities. This research collected remote sensing data from numerous satellite sensors to efficiently monitor the air quality in near-real-time. This paper discusses the developed software, which is based on complex event processing and computes the air quality level in streaming mode for Morocco and Spain. This computer program rapidly extracts only useful information from remote sensing big data, helping decision-makers. This investigation also validates the satellite sensor data against the air quality measured by ground stations in the Andalucía and Madrid regions.

A Survey Model of Big Data by Focusing on the Atmospheric Data Analysis

IJSRD - International Journal for Scientific Research and Development

In the Republic of Korea, building-type fish and agricultural farms are expected to emerge in town areas or suburbs. Advanced farming technologies that employ water recirculation equipment or LED lights are becoming more common and convenient. However, there are still some requirements for operating the farms successfully, and these requirements must be identified through analyses of various factors surrounding farms. This study obtains analytical results and investigates their characteristics through visualization of the atmospheric environment data of Gangnam District provided by the Seoul Metropolitan Government, in order to model a preliminary big data analysis of pollutants as a countermeasure to the bioaccumulation of heavy metals in agricultural and marine products. The basic research was performed by visualizing the data obtained from the univariate, simple and multiple regression analyses for easy viewing, finding a log-transformed model, and modeling overall characteristics through categorization of the explanatory variables. We hope that this research will assist farmers in selecting their farming locations.

Towards Remote Sensing Datasets Collection and Processing

Badr-Eddine Boudriki Semlali

International Journal of Embedded and Real-Time Communication Systems, 2019

The world is witnessing important increases in industrial, transport and agriculture activities. This leads to economic growth but, on the other hand, causes substantial damage to urban air, due to emissions of harmful gases, mainly CO, SO2, NO2 and Particulate Matter (PM). The World Health Organization (WHO) confirms that daily exposure to pollutants causes approximately three million deaths. It is therefore necessary to assess the air quality continuously. In this context, a Java-based application was developed to acquire data from EUMETSAT geostationary and polar-orbit satellites through the Mediterranean Dialogue Earth Observatory (MDEO) terrestrial station. This application filters, subsets, processes and visualizes products covering the Morocco zone. Significant correlations were found between emissions and industrial activities related to thermal power plants, factories, transportation and ports.

Cloud Hadoop Map Reduce For Remote Sensing Image Analysis

Xue Tao

Image processing algorithms related to remote sensing have been tested and utilized on the Hadoop MapReduce parallel platform by using an experimental 112-core high-performance cloud computing system that is situated in the Environmental Studies Center at the University of Qatar. Although there has been considerable research utilizing the Hadoop platform for image processing rather than for its original purpose of text processing, it had never been proved that Hadoop can be successfully utilized for high-volume image files. Hence, the successful utilization of Hadoop for image processing has been researched using eight different practical image processing algorithms. We extend the file approach in Hadoop to regard the whole TIFF image file as a unit by expanding the file format that Hadoop uses. Finally, we apply this to other image formats such as the JPEG, BMP, and GIF formats. Experiments have shown that the method is scalable and efficient in processing multiple large images used mostly for remote sensing applications, and the difference between the single PC runtime and the Hadoop runtime is clearly noticeable.

MetOp Satellites Data Processing for Air Pollution Monitoring in Morocco

International Journal of Electrical and Computer Engineering (IJECE)

International Journal of Electrical and Computer Engineering (IJECE), 2018

This paper presents a data processing system based on an architecture of multiple stacked layers of computational processes. It transforms raw binary pollution data, coming directly from two EUMETSAT MetOp satellites to our servers, into a continuous data stream that is ready to interpret and visualise in near real time. The techniques used range from task automation, data preprocessing and data analysis to machine learning using feedforward artificial neural networks. The proposed system handles the acquisition, cleaning, processing, normalizing, and predicting of pollution data in our area of interest, Morocco.

Cloud Computing in Remote Sensing: High Performance Remote Sensing Data Processing in a Big Data Environment

siham aouad

INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL, 2021

Multi-area and multi-faceted remote sensing (SAR) datasets are widely used due to the increasing demand for accurate and up-to-date information on resources and the environment for regional and global monitoring. In general, the processing of RS data involves a complex multi-step processing sequence that includes several independent processing steps depending on the type of RS application. The processing of RS data for regional disaster and environmental monitoring is recognized as computationally and data demanding. By combining cloud computing and HPC technology, we propose a method to solve these problems efficiently with a large-scale RS data processing system suitable for various applications and offered as a real-time, on-demand service. The ubiquitous, elastic, and highly transparent nature of the cloud computing model makes it possible to run massive RS data management and data processing for monitoring dynamic environments in any cloud via the web interface. Hilbert-based da...

Badr-Eddine Boudriki Semlali

International Journal of Technology and Engineering Studies

This research aims to monitor abnormal climate changes and supervise Air Quality (AQ), especially in Morocco. This study aims to contribute to finding a solution to the AQ degradation and climate change issues by using Remote Sensing (RS) techniques. RS Big Data (RSBD) in Near Real-Time (NRT) is collected from six sources: the MDEO ground station of EUMETSAT data, the EOSDIS data of NASA, the NESDIS data of NOAA, the Copernicus platform, some MGS data, and the Raspberry Pi sensor data. The current manuscript explains the different aspects of the satellite data used, proving that satellite data can be regarded as Big Data (BD). Accordingly, this research has proposed a Hadoop BD architecture and explained how to process RS environmental data efficiently. This architecture comprises six main layers: the data sources, data ingestion, data storage, data processing, data visualization, and monitoring layers. The architecture automatically collects, filters, extracts, and stores data into the HDFS. The proposed model would be beneficial in managing adverse climate conditions and preventing natural disasters.
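The filter, extract, and store steps described in this abstract can be illustrated with a small Python sketch. Everything in it is hypothetical: the `Granule` record, the bounding box roughly covering Morocco, the set of wanted products, the `/rs/...` paths, and the plain dictionary that stands in for HDFS are assumptions chosen for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Granule:
    """Hypothetical record describing one downloaded satellite file."""
    name: str
    product: str
    lat: float
    lon: float
    vcd: float


# Assumed area of interest (lat_min, lat_max, lon_min, lon_max), roughly Morocco.
BBOX = (21.0, 36.0, -17.0, -1.0)
# Assumed set of products worth keeping.
WANTED = {"CO", "SO2", "NO2"}


def keep(g):
    """Filter step: keep only wanted products inside the area of interest."""
    lat_min, lat_max, lon_min, lon_max = BBOX
    in_area = lat_min <= g.lat <= lat_max and lon_min <= g.lon <= lon_max
    return g.product in WANTED and in_area


def extract(g):
    """Extract step: retain only the fields needed downstream."""
    return {"source": g.name, "lat": g.lat, "lon": g.lon, "vcd": g.vcd}


def ingest(granules, storage):
    """Store step: append rows under a per-product path.

    `storage` is a dict used as a stand-in for HDFS in this sketch.
    Returns the number of granules kept.
    """
    kept = 0
    for g in granules:
        if not keep(g):
            continue
        storage.setdefault(f"/rs/{g.product}", []).append(extract(g))
        kept += 1
    return kept


granules = [
    Granule("a.nc", "CO", 33.6, -7.6, 2.5),  # wanted product, inside the box
    Granule("b.nc", "CO", 40.4, -3.7, 1.9),  # outside the box
    Granule("c.nc", "O3", 33.6, -7.6, 0.7),  # product not in WANTED
]
storage = {}
kept = ingest(granules, storage)
print(kept, list(storage))
```

Filtering before storage is the design point worth noting: discarding irrelevant files at ingestion keeps the storage and processing layers from handling data that would never be used.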

Satellite Big Data Ingestion for Environmentally Sustainable Development

Badr-Eddine Boudriki Semlali

Currently, many environmental applications take advantage of remote sensing techniques, particularly air quality monitoring, climate changes overseeing, and natural disasters prediction. However, a massive volume of remote sensing data is generated in near-real-time; such data are complex and are provided with high velocity and variety. This study aims to confirm that satellite data are big data and proposes a new big data architecture for satellite data processing. In this paper, we mainly focused on the ingestion layer enabling an efficient remote sensing big data preprocessing. As a result, the developed ingestion layer removed eighty-six percent of the unnecessary daily files. Moreover, it eliminated about twenty percent of the erroneous and inaccurate plots, therefore, reducing storage consumption and improving satellite data accuracy. Finally, the processed data was efficiently integrated into a Hadoop storage system.

Big data and remote sensing: A new software of ingestion

Badr-Eddine Boudriki Semlali, International Journal of Electrical and Computer Engineering (IJECE)

International Journal of Electrical and Computer Engineering (IJECE), 2021

Currently, remote sensing is widely used in environmental monitoring applications, mostly air quality mapping and climate change supervision. However, satellite sensors generate massive volumes of data in near-real-time, stored in multiple formats and provided with high velocity and variety. As a result, the processing of satellite big data is challenging. Thus, this study aims to confirm that satellite data are big data and proposes a new big data architecture for satellite data processing. The developed software enables efficient remote sensing big data ingestion and preprocessing. The experimental results show that 86 percent of the unnecessary daily files are discarded, with a data cleansing of 20 percent of the erroneous and inaccurate plots. The final output is integrated into the Hadoop system, especially the HDFS, HBase, and Hive, for further calculation and processing.

SAT-Hadoop-Processor: A Distributed Remote Sensing Big Data Processing Software for Earth Observation Applications

Badr-Eddine Boudriki Semlali

Applied Sciences

Nowadays, several environmental applications take advantage of remote sensing techniques. A considerable volume of this remote sensing data is generated in near real-time. Such data are diverse and are provided with high velocity and variety; their pre-processing requires large computing capacities, and a fast execution time is critical. This paper proposes new distributed software for remote sensing data pre-processing and ingestion using cloud computing technology, specifically OpenStack. The developed software discarded 86% of the unneeded daily files and removed around 20% of the erroneous and inaccurate datasets. The parallel processing reduced the total execution time by 90%. Finally, the software efficiently processed and integrated data into the Hadoop storage system, notably the HDFS, HBase, and Hive.

Real-Time Big Data Analytical Architecture for Remote Sensing Application

JP INFOTECH PROJECTS

Remote sensing assets generate a massive volume of real-time data every day (commonly referred to as "Big Data"), in which the insight information has potential significance if it is collected and aggregated effectively. Today there is much more to real-time remote sensing Big Data than it seems at first, and extracting the useful information efficiently confronts a system with major computational challenges, such as analyzing, aggregating, and storing data that are collected remotely. In view of these factors, there is a need to design a system architecture that supports both real-time and offline data processing. Therefore, in this paper, we propose a real-time Big Data analytical architecture for remote sensing satellite applications. The proposed architecture comprises three main units: 1) the remote sensing Big Data acquisition unit (RSDU); 2) the data processing unit (DPU); and 3) the data analysis decision unit (DADU). First, RSDU acquires data from the satellite and sends this data to the Base Station, where initial processing takes place. Second, DPU plays a vital role in the architecture for efficient processing of real-time Big Data by providing filtration, load balancing, and parallel processing. Third, DADU is the upper-layer unit of the proposed architecture, which is responsible for compilation, storage of the results, and generation of decisions based on the results received from DPU. The proposed architecture is capable of dividing, load balancing, and parallel processing of only useful data. Thus, it results in efficient analysis of real-time remote sensing Big Data using an earth observatory system. Furthermore, the proposed architecture is capable of storing incoming raw data to perform offline analysis on largely stored dumps when required. Finally, a detailed analysis of remotely sensed earth observatory Big Data for land and sea areas is provided using Hadoop. In addition, various algorithms are proposed for each level of RSDU, DPU, and DADU to detect land as well as sea areas, in order to illustrate how the architecture works.

Hadoop-Based Distributed System for Online Prediction of Air Pollution Based on Support Vector Machine

Abbas Alimohammadi

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2015

The critical impact of air pollution on human health and the environment on the one hand, and the complexity of pollutant concentration behavior on the other, lead scientists to look for advanced techniques for monitoring and predicting urban air quality. Additionally, recent developments in data measurement techniques have led to the collection of various types of data about air quality. Such data is extremely voluminous, and to be useful it must be processed at high velocity. Due to the complexity of big data analysis, especially for dynamic applications, online forecasting of pollutant concentration trends within a reasonable processing time is still an open problem. The purpose of this paper is to present an online forecasting approach based on Support Vector Machine (SVM) to predict the air quality one day in advance. In order to overcome the computational requirements for large-scale data analysis, distributed computing based on the Hadoop platform has been employed to leverage th...

Cloud Computing in Remote Sensing: Big Data Remote Sensing Knowledge Discovery and Information Analysis

Fadoua Bahja

International Journal of Advanced Computer Science and Applications, 2021

With the rapid development of remote sensing technology, our ability to obtain remote sensing data has improved to an unprecedented level. We have entered an era of big data. Remote sensing data clearly show the characteristics of Big Data, such as hyperspectral, high spatial resolution, and high time resolution, resulting in a significant increase in the volume, variety, velocity and veracity of data. This paper proposes a feature-supporting, scalable, and efficient data cube for time-series analysis applications, and uses spatial feature data and remote sensing data for a comparative study of water cover and vegetation change. The spatial-feature remote sensing data cube (SRSDC) is described in this paper. It is a data cube whose goal is to provide a spatial-feature-supported, efficient, and scalable multidimensional data analysis system to handle large-scale RS data. The paper provides a high-level architectural overview of the SRSDC. The SRSDC offers spatial feature repositories for storing and managing vector feature data, as well as feature translation for converting spatial feature information to query operations. The paper describes the design and implementation of a feature data cube and distributed execution engine in the SRSDC. It uses the long time-series remote sensing production process and analysis as examples to evaluate the performance of the feature data cube and distributed execution engine. Big data has become a strategic highland in the knowledge economy as a new strategic resource for humans. The core knowledge discovery methods include supervised learning methods, unsupervised learning methods, and their combinations and variants.

Weather Data Analysis Using Hadoop: Applications and Challenges

mazlina abdul majid

IOP Conference Series: Materials Science and Engineering, 2019

Weather data is crucial in every aspect of daily human life. It plays an important role in many sectors such as agriculture, tourism, government planning, and industry. Weather has a variety of parameters, such as temperature, pressure, humidity and wind speed. The meteorological department deploys sensors for each weather parameter at different geographical locations to collect data. This data is stored mostly in an unstructured format. Thus, a large amount of data has been collected and archived, and storing and processing this big data for accurate weather prediction is a huge challenge. Hadoop, an Apache product, is used to support big data sets in a distributed environment. Its main advantages are scalable and fault-tolerant distributed processing. This paper explains a system that uses the historical weather data of a region and applies MapReduce and Hadoop techniques to analyze these historical data.

SAT-CEP-monitor: An air quality monitoring software architecture combining complex event processing with satellite remote sensing

Badr-Eddine Boudriki Semlali

Computers and Electrical Engineering, 2021

Air pollution is a major problem today that causes serious damage to human health. Urban areas are the most affected by the degradation of air quality caused by anthropogenic gas emissions. Although there are multiple proposals for air quality monitoring, in most cases, two limitations are imposed: the impossibility of processing data in Near Real-Time (NRT) for remote sensing approaches and the impossibility of reaching areas of limited accessibility or low network coverage for ground data approaches. We propose a software architecture that efficiently combines complex event processing with remote sensing data from various satellite sensors to monitor air quality in NRT, giving support to decision-makers. We illustrate the proposed solution by calculating the air quality levels for several areas of Morocco and Spain, extracting and processing satellite information in NRT. This study also validates the air quality measured by ground stations and satellite sensor data.

IRJET- Survey on Application of Big Data in weather monitoring system

IRJET Journal

IRJET, 2020

Weather forecasts are made by collecting as much data as possible about the current state of the atmosphere to determine how the atmosphere will evolve in the future. To handle such huge volumes of data, "Big Data" is introduced. Big Data has become an integral part of all industries and business sectors today. Hence, it is imperative to improve data quality even as it is absorbed and utilized in an industry's Big Data system. In this paper, we propose a Pre-Processing Framework to address quality of data in a weather monitoring and forecasting application that also takes into account global warming parameters and raises alerts/notifications to warn users and scientists in advance.

Cited by

SAT-Hadoop-Processor: A Distributed Remote Sensing Big Data Processing Software for Earth Observation Applications

Badr-Eddine Boudriki Semlali

Applied Sciences
