Siddhant patil

NMIMS University, Btech, Graduate Student


Papers by Siddhant patil

Caption Generator using CNN and LSTM

Siddhant patil, 2024

Our paper introduces an innovative deep learning-based image caption generator that combines convolutional neural networks (CNN) and long short-term memory (LSTM) networks to produce contextually relevant, descriptive captions for images. The model uses a pre-trained CNN, such as VGG16, to extract high-level visual features, which an LSTM-based language model takes as input to produce natural language descriptions. By combining these two models, our system bridges the gap between natural language processing and computer vision. The model is trained and tested on the Flickr8k benchmark dataset so that it is robust and generalizes well. Experimental results indicate promising caption generation accuracy, with captions that describe object relationships, scene information, and the semantic content of images. We also address key challenges in automatic image captioning, such as handling variation in image content, generating correct sentences, and improving caption variety. To boost performance, we investigate transfer learning, attention-based methods, and hyperparameter tuning. We evaluate our model using standard metrics, including BLEU, METEOR, and CIDEr, to measure the relevance and fluency of the captions. The results advance vision-language modeling, with applications in assistive technology for the blind, content-based image retrieval, and human-computer interaction.


CCDC 2127332: Experimental Crystal Structure Determination

The Cambridge Structural Database, 2023

CCDC 2127333: Experimental Crystal Structure Determination

The Cambridge Structural Database, 2023

Scalable Training of Language Models using JAX pjit and TPUv4

Siddhant patil, 2024

Modern large language models require distributed training strategies due to their size. The challenges of training them efficiently and robustly are met with rapid developments on both the software and hardware frontiers. In this technical report, we explore the challenges and design decisions associated with developing a scalable training framework, and present a quantitative analysis of the efficiency improvements that come from adopting new software and hardware solutions.


Design and Fabrication of Automated scarecrow

Zenodo (CERN European Organization for Nuclear Research), May 21, 2023

To protect his crop from animals and birds, a farmer places a scarecrow in the middle of the field. We have noticed that when birds enter the field, the scarecrow remains motionless. In our design, we modify the scarecrow so that when birds enter the field, it senses them with a sound detector, moves its hands up and down by means of a flapping mechanism, and sounds a buzzer. The flapping mechanism converts the motor's rotational motion into the reciprocating motion of the flapping hands. As the coil rotates and the connecting rods drive the hands up and down, the birds are scared away from the field, keeping the crop safe. The device may be used in gardens as well. A traditional scarecrow is a decoy or dummy, typically in human form. It is usually a constructed figure dressed in worn-out or discarded clothing and set up in open fields to discourage birds such as crows and sparrows from disturbing and feeding on recently sown seeds and developing crops. Farmers use scarecrows all around the world; they are a familiar symbol of farms and, by extension, of nations that engage in widespread agriculture. Windmills and other machinery are often mistaken for scarecrows, although this perception gradually fades as wildlife becomes accustomed to the structures. The nation's economy is largely supported by income from agriculture. When farmers are far away from their crops and risk exposing them to dangers such as birds destroying them, it causes them great anxiety.


Fungal P450 Deconstructs the 2,5-Diazabicyclo[2.2.2]octane Ring En Route to the Complete Biosynthesis of 21R-Citrinadin A

Journal of the American Chemical Society, Jun 23, 2023

Revolutionizing Farming with Innovative Equipment Rental System

2023 2nd International Conference on Edge Computing and Applications (ICECAA)

Fungal P450 Deconstructs the 2,5-Diazabicyclo[2.2.2]octane Ring En Route to the Complete Biosynthesis of 21R-Citrinadin A

Design And Fabrication of Automated Scarecrow

SSRN Electronic Journal

An NmrA-like enzyme-catalysed redox-mediated Diels–Alder cycloaddition with anti-selectivity

Nature Chemistry

Speech Emotion Recognition System Using Recurrent Neural Network in Deep Learning

International Journal for Research in Applied Science and Engineering Technology, Mar 31, 2022

In today's world, machine learning and deep learning together enable around 80% of human interactions through the sheer ubiquity of the solutions this domain provides. One persistent problem, however, is that most people cannot understand the actual emotional meaning behind a person's speech. For instance, people with conditions such as catatonia cannot express themselves clearly, and industries that tailor their marketing strategy to customer mood could also use such a method. To bridge this gap, it is important to develop a system that can assist people by predicting the emotion in their speech. This paper reviews the existing approaches adopted to reduce the barrier of emotional communication and the methodologies they used. In this context, we also present an approach using a Recurrent Neural Network, a class of deep learning algorithms. The whole process of automated systems that continuously learn, adapt, and improve without much instruction is really fascinating. Our primary goal is to create a robust communication system through technologies that enable machines to respond correctly and reliably to human voices and provide useful and valuable services accordingly. This review gives an extensive account of the approaches to speech emotion recognition developed so far. The models and their accuracy are taken into consideration and reported accordingly.


User Feedback based Recommendation Engine using Neural Network

2019 International Conference on Issues and Challenges in Intelligent Computing Techniques (ICICT)

With the paradigm shift of focus towards user behavior in today's e-commerce-driven world, there is a need for an efficient and portable learning methodology that lets machines serve a particular user, or a group of users collectively, based on knowledge acquired about the interests of a pool of users. However, any data-driven engine can be contaminated by redundant entries or bogus feedback intended to push up or scale down a particular commodity on offer so as to promote another one in its stead. This calls for a Neural Network that familiarizes itself with the most popular user choices by virtue of user-provided ratings, together with a Support Vector Machine model that filters the time-stamps and IP addresses associated with each user interaction with the system, to nullify attempts by miscreants to manipulate the user feedback or ratings that could affect the recommendation of popular choices. A Neural Network using 12 neurons, representing various dimensions of a user commodity in the form of a theme or a colour-coded scheme, is used, and a rating system is layered on it so that users can provide feedback. This feedback is used for unsupervised learning. The time-stamps and IP addresses of each user feedback are monitored using a supervised learning technique. This makes the model as a whole semi-supervised, which is the best approach. The Neural Network furthermore adopts a layered learning approach in which it uses the ratings provided by users to learn which foreground colour contrasts best with which background colour. In this research, a comprehensive study of 10 relevant papers has been made to highlight the scope of research and the key challenges facing existing systems employed by various entities to serve a similar purpose.

Visualizing Urban Accessibility: Investigating Multi-Stakeholder Perspectives through a Map-based Design Probe Study

CHI Conference on Human Factors in Computing Systems, Apr 29, 2022

Figure 1: Interview setup and three-part study process. Part 1 presents visualization probes with seven map types. Row by row, we gradually build a 5 x 5 map grid (A & B), where each row shows a different map type. Part 2 involves performing three sensemaking tasks. In (C), a participant completes a task using the map grid. (D) illustrates a task involving three ego-centric isochrone maps. Part 3 critiques map types and gathers opinions for future interactive visualization tools.


CNN Based Approach for Copy-Move Video Forgery Detection

Design Engineering, Oct 13, 2021

Psychological Survey of Color Perceptions for Indian Users

Design Science and Innovation, 2022

Automatic Energy Meter Reading

Automatic energy meter reading, or AEMR, is the technology of automatically collecting consumption, diagnostic, and status data from (electric) energy metering devices and transferring that data to a central database for billing, troubleshooting, and analysis. By using a microcontroller, this technique mainly saves utility providers the expense of periodic trips to each physical location to read a meter. Another advantage is that billing can be based on near-real-time consumption rather than on estimates based on previous or predicted consumption. This timely information, coupled with analysis, can help both utility providers and customers better control the use and production of electric energy, gas, or water. AEMR technologies include handheld devices such as mobile phones, and network technologies based on telephony platforms (wired and wireless) or power line transmission.


Linear with polynomial regression: Overview

International Journal of Applied Research, 2021

In the current world, there is a need to analyze and define the relations in data and to predict outcomes for profit. Regression is a machine learning technique that involves finding correlations between variables and predicting a continuous output. It helps us understand how the value of the dependent variable (target) changes in response to an independent variable (predictor). The aim of this paper is to discuss linear regression and its types, polynomial regression, and how linear and polynomial regression are interrelated. These techniques are widely used to find trends in data and forecast outcomes. They are explained and analyzed based on factors such as the size of the dataset, the type of the dataset, quality, efficiency, consistency, accuracy, variables, and performance. These methods create a visual graph that can be used for predicting various past and future outcomes. The intent of d...
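
The relationship between the two techniques can be illustrated with a minimal sketch: polynomial regression is ordinary linear regression applied to powers of the predictor. The code below is not taken from the paper; the function names (`fit_polynomial`, `predict`), the toy data, and the choice of solving the normal equations by Gaussian elimination are all assumptions made for illustration.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_polynomial(xs, ys, degree):
    """Least-squares fit of y = c0 + c1*x + ... + c_degree*x**degree.

    degree=1 is simple linear regression; a higher degree is still linear
    in the coefficients, so the same normal equations apply.
    """
    features = [[x ** p for p in range(degree + 1)] for x in xs]
    k = degree + 1
    xtx = [[sum(f[i] * f[j] for f in features) for j in range(k)] for i in range(k)]
    xty = [sum(f[i] * y for f, y in zip(features, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** p for p, c in enumerate(coeffs))


# Hypothetical toy data generated from y = 1 + 2x + 3x^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1 + 2 * x + 3 * x * x for x in xs]

linear = fit_polynomial(xs, ys, 1)     # straight line, cannot match the curve
quadratic = fit_polynomial(xs, ys, 2)  # recovers coefficients close to 1, 2, 3
```

Both fits use the same solver; only the feature list changes. On this toy data the straight line leaves a residual at every point, while the degree-2 fit reproduces the generating curve, which is one way to see why the choice of degree depends on the shape of the data.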


Is The Crypto Currency Never Ending World? - IRE Journals

Railway track crack detection based on GSM technique

1 Student of Electrical Engineering; 2 Student of Electrical Engineering; 3 Assistant Professor, Dept. of Electrical Engineering, DES's college of engineering and technology, Maharashtra, India

Abstract: Transport is a key necessity for specialization, as it allows production and consumption of products to occur at different locations. Throughout history, transport has been a spur to expansion, as better transport leads to more trade. In India, rail transport occupies a prominent position in providing the transport infrastructure needed to sustain the ever-burgeoning needs of a rapidly growing economy. Today, India possesses the fourth largest railway network in the world. The principal problem has been the lack of cheap and efficient technology to detect problems in the rail tracks and, of course, the lack of proper maintenance ...


A Review on Multi-Agent Data Mining Systems

Data mining and intelligent agents have emerged as two fields with immense potential for research. Every intelligent agent is self-sufficient, acting independently within its boundary while collaborating with other agents to perform the assigned task efficiently. The ability of agents to learn from their experience complements the data mining process. Agent mining helps to overcome the challenges faced by data mining in a distributed heterogeneous environment. Recently, a lot of research has been conducted on the role of agents in data mining. The paper focuses on the existing multi-agent data mining system architectures and the roles of agents in them. Keywords: Agent, agent mining, multi-agent, pikater, meta learning, distributed data mining, MADM, DMMAS, decision support system, MAS, performance optimization.

