Software Code Clone Detection Model Using Hybrid Approach
2012, INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY
https://doi.org/10.24297/IJCT.V3I2B.2875…
4 pages
Abstract
This study aims to understand and analyze software cloning and its detection. Software cloning is the practice of duplicating source code. Cloning and its detection form an emerging and prominent research area in software engineering, and a number of techniques exist for detecting clones in software. This study focuses on the hybrid clone detection technique. We attempt to devise an algorithm that detects duplication in software using this technique. The algorithm first computes software metrics that provide sufficient information about the application, and then detects clones based on matches between those metrics. Detection operates on lines of code rather than on tokens or words.
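The two-stage idea in the abstract (metrics first, then line matching) might be sketched as follows. The abstract does not name the metrics or the matching rule, so the metric set (line count, branch count, return count), the function names, and the similarity threshold below are illustrative assumptions, not the authors' algorithm.

```python
import re
from itertools import combinations


def normalize_lines(fragment):
    """Drop // comments and all whitespace; skip lines that end up empty.

    Assumption: C-style line comments only; a // inside a string literal
    would be stripped too, which a real tool would have to handle.
    """
    lines = []
    for raw in fragment.splitlines():
        line = re.sub(r"//.*", "", raw)
        line = "".join(line.split())
        if line:
            lines.append(line)
    return lines


def metrics(fragment):
    """Hypothetical metric signature: (lines of code, branches, returns)."""
    code = re.sub(r"//.*", "", fragment)
    loc = len(normalize_lines(fragment))
    branches = len(re.findall(r"\b(?:if|for|while|case)\b", code))
    returns = len(re.findall(r"\breturn\b", code))
    return (loc, branches, returns)


def line_similarity(a, b):
    """Fraction of positions at which the normalized lines are identical."""
    la, lb = normalize_lines(a), normalize_lines(b)
    same = sum(1 for x, y in zip(la, lb) if x == y)
    return same / max(len(la), len(lb), 1)


def detect_clones(fragments, threshold=1.0):
    """Stage 1: keep pairs with equal metrics. Stage 2: compare them by line."""
    signatures = {name: metrics(src) for name, src in fragments.items()}
    clones = []
    for a, b in combinations(sorted(fragments), 2):
        if signatures[a] != signatures[b]:
            continue  # metrics differ, so skip the costlier line comparison
        if line_similarity(fragments[a], fragments[b]) >= threshold:
            clones.append((a, b))
    return clones


fragments = {
    "sum_a": "int sum(int n) {\n  int s = 0; // total\n"
             "  for (int i = 0; i < n; i++) s += i;\n  return s;\n}",
    "sum_b": "int sum(int n) {\n    int s = 0;\n"
             "    for (int i = 0; i < n; i++) s += i;\n    return s;\n}",
    "max_c": "int max(int a, int b) {\n  if (a > b) return a;\n  return b;\n}",
}
print(detect_clones(fragments))
```

Here `sum_a` and `sum_b` differ only in indentation and a comment, so they share a metric signature and match on every normalized line, while `max_c` is filtered out at the metric stage. Filtering on metrics first keeps the pairwise line comparison limited to plausible candidates.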
Related papers
Code clones are entities in software ecosystems that can be unavoidable. Industry demand for software clone detection is rising steadily. Code duplication, that is, copy-and-paste activity, is a recurrent pattern: by editing prewritten code, developers reduce the time and effort of rewriting similar code fragments. Code duplication may affect quality, consistency, maintainability, and comprehensibility. The challenge lies in the variety of syntax, compiler-dependent languages, and the many coding patterns that can solve a single problem. Many software tools and clone detection algorithms exist, but each has limitations that prevent fully accurate detection. Research and tools developed so far can find only Type-I, Type-II, and some Type-III clones. Some tools are slow at comparing code and have low precision. Type-IV clone detection remains a challenge. Type-IV clones, also called semantic clones, are code fragments with similar functionality that may be syntactically different but logically similar. This paper presents an algorithm for clone detection based on comparing parts of the abstract syntax trees of programs and finding semantic coding styles.
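The tree-comparison step mentioned above can be illustrated with Python's standard `ast` module. This is a minimal sketch, not the algorithm of the paper summarized here: it reduces each function to the shape of its syntax tree (node types only) and groups functions with identical shapes. The helper names `shape`, `function_shapes`, and `structural_clones` are hypothetical. Because identifier names and literal values are ignored, it finds renamed copies, but it does not detect Type-IV (semantic) clones.

```python
import ast


def shape(node):
    """Nested tuple of node type names; identifiers and literal values are ignored."""
    return (type(node).__name__,
            tuple(shape(child) for child in ast.iter_child_nodes(node)))


def function_shapes(source):
    """Map each function name in the source to the shape of its syntax tree."""
    tree = ast.parse(source)
    return {node.name: shape(node)
            for node in ast.walk(tree)
            if isinstance(node, ast.FunctionDef)}


def structural_clones(source):
    """Group functions whose syntax trees have exactly the same shape."""
    groups = {}
    for name, tree_shape in function_shapes(source).items():
        groups.setdefault(tree_shape, []).append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


SAMPLE = '''
def total(xs):
    acc = 0
    for x in xs:
        acc += x
    return acc

def add_all(values):
    result = 0
    for v in values:
        result += v
    return result

def biggest(xs):
    return max(xs)
'''

print(structural_clones(SAMPLE))
```

In the sample, `total` and `add_all` differ only in variable names, so their tree shapes coincide, while `biggest` has a different structure and is left out.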
Code clones are regularly used when software is upgraded, so detection strategies need to consider more than the original code. In the state of the art of software clone research, we identified the lack of a systematic survey. We review the earlier research, based on a systematic and extensive database search, and identify the research gaps for further study. Software maintenance costs more than design. Code cloning is useful in several areas, such as detecting library contents, understanding programs, and detecting malicious programs; apart from these benefits, it has several serious impacts on the quality, reusability, and continuity of a software framework. In this paper, we discuss code clones, their evolution, and their classification. Code clones are classified into four types: Type I, Type II, Type III, and Type IV. The original code and the copied code are shown in detail for each type. Several clone detection techniques, including text-based, token-based, metric-based, and hybrid techniques, are compared. Detection tools such as CloneDR, Covet, Duploc, and CLAN are compared according to the techniques they use, and the cloning process is also explained. Code clones are identical segments of source code that may be inserted intentionally or unintentionally. Reusing code snippets by copying and pasting, with or without minor alterations, is a common task in software development. However, the existence of code clones may degrade the design structure and quality attributes of software, such as changeability, readability, and maintainability, and hence increase maintenance costs.
Over the last few decades, researchers have investigated many techniques for detecting duplicated code in programs, each with its own merits and demerits. However, after a decade of this research, there has been little progress in understanding where these techniques fit into the maintenance process and how to detect the evolution of software clones. Code clones are identical fragments of code that occur at various locations in a program's source code. It is necessary to understand the various approaches to clone detection, since detection provides useful information for maintenance, reengineering, program understanding, and reuse. After comparing the text-based, token-based, and tree-based approaches, the analysis finds that the tree-based approach detects clones quickly and is one of the main approaches in which code can be checked both semantically and syntactically.
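For contrast with the tree-based approach, the token-based approach mentioned above might look like the following sketch. It uses Python's standard `tokenize` module on Python source for convenience; the placeholder strings `ID` and `LIT` and the function names are illustrative assumptions and do not come from any of the tools discussed. Replacing identifiers and literals with placeholders lets two fragments match even when variables are renamed or constants change.

```python
import io
import keyword
import tokenize

# Token types that carry layout only and are skipped in this sketch.
LAYOUT = (tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
          tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER)


def normalized_tokens(source):
    """Token strings with identifiers replaced by ID and literals by LIT."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in LAYOUT:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            tokens.append("ID")
        elif tok.type in (tokenize.NUMBER, tokenize.STRING):
            tokens.append("LIT")
        else:
            tokens.append(tok.string)  # keywords and operators stay as written
    return tokens


def same_token_sequence(a, b):
    """True when both fragments normalize to the same token sequence."""
    return normalized_tokens(a) == normalized_tokens(b)


original = "total = price * 2  # double it\n"
renamed = "amount = cost * 10\n"
changed = "total = price + 2\n"

print(normalized_tokens(original))
print(same_token_sequence(original, renamed))
print(same_token_sequence(original, changed))
```

The first two fragments normalize to the same sequence, while the third differs in its operator. Dropping indentation tokens keeps the sketch short but loses block structure, which a real detector would likely want to keep.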
International Journal of Computer Applications, 2017
Most developers working in the coding phase of the SDLC copy code that recurs throughout a program, which makes the cloned code difficult to maintain. If two functions or templates from a single source code are similar, they are referred to as "code clones". Cloning can create obstacles in the maintenance phase of the software, and it increases the probability of bugs. When code is reused by copy-paste, the result is referred to as a "software clone". To detect clones, each template of the code is evaluated against the rest of the source code. Because clone detection is difficult, researchers have developed various detection techniques in previous work. This study gives a brief introduction to code clones, their types, the reasons for cloning, and the clone detection process. The second section describes the clone detection techniques with their limitations and advantages. The third section, on related work, describes earlier work conducted in this field.
iaeme
Code cloning, the act of copying code fragments and making minor, non-functional alterations, is a well-known problem for evolving software systems; it leads to duplicated code fragments known as code clones. A clone detection approach finds the reused fragments of code in an application so that the different types of clones identified by the detection techniques can be maintained. Since clone detection emerged, it has provided better results by reducing complexity. Different clone detection tools make the detection process easier and produce efficient results. Many existing systems focus on line-by-line or token-based detection to find the clones in a system, so they take a long time to process the entire source code. If a fragment of code is not an exact copy but is functionally similar to another, existing systems do not detect that type of clone. This paper proposes a combination of textual and metric analysis of source code to detect all types of clones in a given set of Java source code fragments. Various semantics have been formulated, and their values are used during the detection process. These metrics, combined with textual analysis, reduce the complexity of finding clones and give accurate results.
Concepts, Methodologies, Tools, and Applications
A code clone is a portion of code that is similar to another portion in the same software, regardless of changes such as removal of white space and comments, changes in code syntax, and addition or removal of code. Over the years, many approaches and tools for code clone detection have been proposed. Most of these approaches and tools can detect and analyze code clones that occur in large software. In this chapter, the authors aim to provide a comparative study of the current state of the art in code clone detection approaches and models, together with their corresponding tools. They then perform an empirical evaluation of a selected code clone detection tool and organize the large amount of information in a more systematic way. The authors begin by explaining the background concepts of code clone terminology. A comparison identifies the strengths and weaknesses of existing approaches, models, and tools. Based on this comparison, they select a tool and evaluate it in two dimensions: the number of detected clones and the run-time performance of the tool. The results of the study show that various terminologies are used for code clones. In addition, the empirical evaluation implies that the selected tool (enhanced generic pipeline model) gives better code clone output and run-time performance than its generic counterpart.
IJRAR, 2021
Software developers reuse code fragments by copying and pasting them with or without slight modifications. As a consequence, software systems include very similar code sections known as code clones. Code cloning can be harmful to software evolution and maintenance. Additionally, duplicated fragments greatly increase the amount of work required when adapting or improving code. Various software engineering processes, including software evaluation analysis, code quality analysis, plagiarism detection, program understanding, aspect mining, copyright infringement investigation, code compaction, and bug detection, can require the extraction of code fragments that are semantically or syntactically identical, making clone detection an important and valuable software analysis process. Various clone detection methods have been proposed over the last decade. This article provides an adequate understanding of the text, token, tree, Program Dependency Graph (PDG), and machine learning based clone detection techniques. Their benefits and limitations are also analyzed in tabular form. Based on the analysis, future directions for clone detection are suggested for better software development. Index Terms: software development, clone code, clone code detection, duplicated fragments.
Advances in Mathematics: Scientific Journal, 2020
Code clone detection is the most important aspect of removing repetitive code. Current software has large code bases, and large source code takes more processing time and more memory to store. An automated tool is required that can identify the different types of clones in the code and mark the code that contains a clone. The programmer can then remove the clones by writing general functions that can be called repeatedly. The proposed object-oriented, metrics-based technique for clone detection detects different types of clones. The technique has been implemented in a Java-based tool. It is a generalized tool that converts the code into a sequence of processing steps to identify the types of clones. The proposed technique detects all types of clones: T-1, T-2, T-3, and T-4. The proposed approach achieves accuracies of 100%, 90%, 100%, and 75%, respectively.
2016
Several studies on clones or porting show that about 14% to 21% of software systems can contain duplicated code, which results from copying existing code fragments and using them with or without minor modifications. One of the major drawbacks of such code fragments is that if a bug is detected in one fragment, all similar fragments must be investigated for the same bug. In this paper, we first describe the cloning terminology and the commonly used clone types. Second, we review the existing clone taxonomies and detection approaches. Finally, the paper concludes by pointing out several open problems in clone detection research.
Information and Software Technology, 2013
Context: Reusing software by means of copy and paste is a frequent activity in software development. The duplicated code is known as a software clone and the activity is known as code cloning. Software clones may lead to bug propagation and serious maintenance problems. Objective: This study reports an extensive systematic literature review of software clones in general and software clone detection in particular. Method: We used the standard systematic literature review method based on a comprehensive set of 213 articles from a total of 2039 articles published in 11 leading journals and 37 premier conferences and workshops. Results: Existing literature about software clones is classified broadly into different categories. The importance of semantic clone detection and model based clone detection led to different classifications. Empirical evaluation of clone detection tools/techniques is presented. Clone management, its benefits and cross cutting nature is reported. Number of studies pertaining to nine different types of clones is reported. Thirteen intermediate representations and 24 match detection techniques are reported. Conclusion: We call for an increased awareness of the potential benefits of software clone management, and identify the need to develop semantic and model clone detection techniques. Recommendations are given for future research.
