Academia.edu

Big Data Technologies

747 papers
5,893 followers
About this topic
Big Data Technologies refer to the tools, frameworks, and methodologies used to collect, store, process, and analyze large and complex datasets that traditional data processing applications cannot handle efficiently. These technologies enable organizations to extract valuable insights and support decision-making through advanced analytics and data management techniques.

Key research themes

1. How do distributed computing frameworks address the challenges of scalable big data processing and analytics?

This theme explores how various distributed computing frameworks support efficient storage, processing, and analysis of large-scale big data sets, addressing computational inefficiency, scalability limits, and algorithmic constraints inherent in traditional MapReduce models. Understanding these frameworks is critical for designing big data applications capable of handling exponential data growth and complex analytical tasks.

Key finding: This paper critically evaluates MapReduce-based frameworks like Hadoop MapReduce, Haloop, and Spark, highlighting their limitations in computational inefficiency due to high I/O and communication costs, lack of scalability... Read more
Key finding: The study offers a comprehensive overview of big data system architecture and constituent stages—data sources, data management, computing frameworks, and analysis. It compares distributed file systems and MapReduce-compatible... Read more
Key finding: This paper contextualizes big data origins and examines batch and stream processing technologies in distributed environments. It highlights the evolution of storage architectures and distributed computing paradigms critical... Read more
Key finding: The paper articulates key challenges (volume, velocity, variety) that overwhelm traditional database management systems and surveys distributed big data technologies including NoSQL databases, Hadoop, and cloud computing... Read more
Key finding: This work develops a structured framework for selecting appropriate big data technologies spanning data generation, acquisition, storage, and analytics layers. It highlights the complexity and variety of tools beyond... Read more

2. What are the emerging tools, techniques, and challenges in big data analytics across domains such as healthcare, education, and industry?

This theme focuses on the development and application of advanced big data analytics tools and techniques including machine learning, real-time analytics, and visualization to extract actionable insights in diverse sectors. The significance lies in addressing domain-specific challenges like data heterogeneity, privacy, scalability, and ethical considerations while leveraging big data for strategic decision-making and innovation.

Key finding: Synthesizing 142 peer-reviewed studies, this systematic review identifies critical big data analytics tools (Hadoop, Spark, TensorFlow) and advanced techniques (machine learning, NLP) revolutionizing healthcare by enabling... Read more
Key finding: This paper emphasizes big data’s role in driving innovation and competitiveness amid the Industry 4.0 era, focusing on data-driven decision-making capabilities enabled by analytics of diverse, high-velocity data from IoT and... Read more
Key finding: Through survey research among educators, it finds a low level of awareness and application of big data analytics in educational measurement, identifying critical statistical techniques like clustering and regression... Read more
Key finding: This review demonstrates the pivotal role of big data analytics during the COVID-19 pandemic, particularly in healthcare, education, transportation, and banking sectors. It correlates different types of analytics... Read more
Key finding: Besides exploring technological facets, this paper discusses big data analytics methods and processing types, emphasizing structuralism and functionalism paradigms to understand evolution and current trends. It provides case... Read more

3. How is the concept and discourse of Big Data and Big Tech articulated and critically framed across definitions, ethical debates, and media narratives?

This theme investigates definitional ambiguities in 'big data', explores ethical implications and societal impacts of Big Tech’s expansion particularly in health and privacy contexts, and analyzes media portrayals to understand ideological framing and public discourse. This multifaceted approach is crucial for comprehending Big Data’s conceptual foundations, associated risks, regulatory challenges, and the influence of narratives shaping technology governance and societal perceptions.

Key finding: Utilizing Halliday’s theme-rheme linguistic framework, the paper systematically analyzes 33 definitions of 'big data' from literature, synthesizing them into a comprehensive definition encompassing volume, variety, velocity,... Read more
Key finding: The paper argues for adopting a public health ethics lens to evaluate Big Tech’s expanding role in health and medicine, pointing out risks beyond individual harm including inequities, dependencies on private tech firms, and... Read more
Key finding: Through critical discourse and framing analysis of the 2020 US antitrust hearing and subsequent media coverage, the study reveals that Big Tech frames itself as guardians of American ideals and trustworthy platform managers,... Read more
Key finding: The essay discusses practical legal and ethical dilemmas arising from AI-driven decision-making systems under data privacy laws like GDPR, illustrating challenges in transparency, accountability, and potential discrimination.... Read more
Key finding: The essay proposes integrating stochastic modeling with semantic and process description standards (SPDF) to create adaptable engines capable of simulating human-social-technical interactions within big data systems. It... Read more

All papers in Big Data Technologies

This paper addresses the critical challenge of preserving database integrity during legacy system migrations in the education technology (EdTech) sector. Legacy platforms, often burdened with outdated formats, undocumented business logic,... more
Cloud native ETL pipelines support the extract and transform phases of real time claims processing in large scale insurers. The cloud native approach offers dramatic improvements in scalability, reliability, resiliency and agility as well... more
Abstract: Artificial intelligence (AI) can improve internal control by detecting anomalies and trends that may indicate fraud or human error. Mastery of tasks can increase the... more
Many data sets, such as system logs, are generated from widely distributed locations. Current distributed systems often discard this data because they lack the ability to backhaul it efficiently, or to do anything meaningful with it at... more
This article examines the legal liability of multinational technology companies toward workers in their global supply chains, with particular focus on Kenya's experience as a critical case study in digital labour exploitation. The paper... more
Data engineering is a dynamic and ever-changing discipline that demands strong tools for data lineage management due to the growing complexity of data ecosystems. One of the most important ways to solve problems with data traceability,... more
Abstract: This article investigates the impact of digital colonialism on the field of contemporary art, with an emphasis on digital artistic expressions. Starting from a theoretical analysis of the concept of digital colonialism, understood as a form... more
Telecom data models historically centered on OSS/BSS: customers, products, orders, inventory, usage (CDRs), billing, trouble tickets, and network inventory. That foundation still matters, but modern operators run software-defined,... more
New methods of sharing and collaborating on data within companies are required due to the fast development of data ecosystems. In order to respond to changing business demands, organisations typically struggle with agility, scalability,... more
The current research is an attempt to analyze the relationship between the use of big data technologies and strategic cost management and their impact on the competitive advantage of a sample of private banks listed on the Iraq Stock... more
This paper argues that the corruption of the Roman Senate offers a historical mirror for the dysfunction of today’s corporate middle layers. While media narratives often focus on billionaire founders and owners as the architects of... more
There is an ongoing need to modernize the energy delivery of traditional power grids using intelligent devices and big data technologies, which makes them smart. The modernization is performed by deploying equipment such as sensors, smart... more
Contents: 1. Preamble – 2. Notes on the pledge in the Italian Civil Code and on the theory of the real nature of the pledge – 3. Moving beyond the traditional standard model of the pledge – 4. Balancing credit protection and the circulation of... more
Organizations struggle to attain scalable, secure, and unified data intelligence in an age of exponential data proliferation. Conventional data structures are likely to face the problem of data fragmentation, lack of scalability, little or... more
This book is a complete guide for professionals and data enthusiasts who want to make the most of Microsoft’s cloud-native ecosystem for big data analytics. It covers essential services like Azure Synapse Analytics, Microsoft... more
In an environment marked by accelerated digitalization, Moroccan banks face profound transformations in their relationships with customers. The emergence of new digital uses, the proliferation of... more
The explosive growth of data from mobile devices, social media, the Internet of Things, and other applications has highlighted the emergence of big data. This paper aims to determine the worldwide research trends in the field of big... more
This book offers a critical account of Karl Marx’s dazzling theory of labour power which is also one of the most influential concepts in the history of contemporary philosophy. Labour power is the dark side of the digital revolution.... more
Many organizations are shifting to a data management paradigm called the "Lakehouse," which implements the functionality of structured data warehouses on top of unstructured data lakes. This presents new challenges for query execution... more
Managing big data in cloud environments requires a combination of strong security measures and efficient clustering techniques. With the increasing volume of sensitive data being processed, ensuring secure storage, retrieval, and... more
One of the biggest challenges for businesses experiencing a digital transformation is to build and maintain strong positive relations with employees. To do so, companies should develop a proper organisational culture. Despite the... more
Recently, hybrid multi-site big data analytics (that combines on-premise with off-premise resources) has gained increasing popularity as a tool to process large amounts of data on-demand, without additional capital investment to increase... more
Customers who use internet banking have the flexibility to manage their assets whenever and wherever they choose. Any web-based transaction, however, is vulnerable to privacy risks. The current system makes use of... more
A decade on from the launch of Amazon's Alexa, the smart home's breakout product, the vision of semi-automated, pervasively sensed domesticity remains unrealised; industry losses are mounting; and big tech protagonists are in partial or... more
The processes of generating innovative solutions mostly rely on skilled experts who are usually unavailable and their outcomes have uncertainty. Computer science and information technology are changing the innovation environment and... more
This paper presents the procedure for installing OwnCloud, an open-source platform for file sync and share services, on Windows 10, using XAMPP 5.6.36 as the web stack and ngrok as a reverse proxy in order to publish... more
Most present-day applications process large amounts of data. Computing has followed different trends, such as mainframes, parallel computing, cluster computing, and grid computing, according to the requirements of data size and execution... more
Cloudera Impala is a modern, open-source MPP SQL engine architected from the ground up for the Hadoop data processing environment. Impala provides low latency and high concurrency for BI/analytic read-mostly queries on Hadoop, not... more
In the context of debates about logistics as a critical infrastructure of contemporary capitalism, this article addresses the question of the conditions under which workers' structural power can be translated into the organisation of... more
In the early days of navigation, humans relied on visual landmarks such as mountains, coastlines, and structures to determine their location and direction. With the arrival of tools like compasses and sextants, navigation became more... more
This paper explores how Big Data Analytics is revolutionizing the retail sector. With the exponential growth of data from online and offline sources, smart retail systems are leveraging big data to improve customer experiences, inventory... more
The exponential growth of data and the rising demand for scalable, resilient, and cost-efficient computing resources have driven many enterprises to adopt multi-cloud strategies, leveraging services from multiple cloud vendors such as... more
Data processing in management and research aims to collect, organize, analyze, and interpret relevant information to support decision-making or validate hypotheses. It relies on several key steps: the... more
The exponential growth in modern systems has significantly transformed how organizations monitor, optimize, and understand their operations. Logs, which record critical facts concerning system performance, user interaction, and... more
Big Data refers to the huge amounts of data created by the digitization of everything, which are consolidated and analyzed by specific technologies. Applied to healthcare, it will use specific health data... more
Large consumer technology corporations are becoming increasingly influential in health and medicine. While this is sometimes beneficial to public health, it also raises many risks, like inequitable returns to the public sector in... more
In today's dynamic business environment, project management faces increasing complexities and demands for precision, efficiency, and risk mitigation. Big Data Analytics has emerged as a transformative tool, enabling project managers to... more
The evolution of data platforms is entering a new era characterized by AI-driven automation and self-optimizing capabilities that address the unprecedented challenges of exponential data growth. As organizations struggle with... more
The landscape of consumer behavior in insurance markets has evolved significantly in recent years, driven predominantly by advancements in big data analytics. This paper examines the dynamics inherent in bundled insurance offerings,... more
This paper argues that investing in artificial intelligence (AI) in developing economies involves significant trade-offs requiring ethical, financial, and geopolitical scrutiny. While AI is increasingly seen as a vehicle for technological... more
The evolution of data management has necessitated the development of efficient Change Data Capture (CDC) mechanisms to ensure real-time data synchronization across disparate systems. Traditional CDC methodologies, including log-based and... more
Big data processing and analysis are crucial across various industries. Rapidly increasing data volumes and multiple sources of data generation have made it challenging to analyze such huge amounts of data. Traditional relational... more
This report outlines the initial phase of a machine learning-driven endeavor focused on predicting used car prices. Using a Kaggle dataset of 46,000+ used car listings from Pakistan, we conducted data cleaning, analysis, and... more
The paper examines cognitive and affective considerations in decisions regarding online purchasing, with a focus on the fashion e-commerce sector. It explains key user experience (UX) techniques designed to minimize user abandonment in the... more