Advances in communication and Internet technology have led to a mass of online data available on the Internet. People communicate with each other through applications such as Facebook, Twitter, Short Message Service and e-forums. Entries or posts from these applications are known as microtexts. A microtext is normally very short and very noisy, and does not follow correct sentence structure in either English or Malay. A high occurrence of noisy texts decreases accuracy when microtexts are processed. This paper proposes a prototype system known as Sistem Penterjemahan Mesej Atas Talian (SPMAT). The objective of the system is to 'clean' noisy texts in microtexts created online by Malaysians. 5000 Facebook messages, 5000 Twitter messages and 5000 e-forum messages were collected. From these sources, several lists, such as a common noisy texts list, a common acronyms list, an artificial abbreviations list and a bi-gram index, were created and us...
This is an open access article distributed under the Creative Commons Attribution License unported 3.0, which permits unrestricted use, distribution, and reproduction in any medium, provided that original work is
International Journal of Advanced Computer Science and Applications, 2013
The number of messages that can be mined from online entries increases as the number of online application users increases. In Malaysia, online messages are written in a mix of languages known as 'Bahasa Rojak', which makes opinion mining with natural language processing difficult. This study introduces a Malay Mixed Text Normalization Approach (MyTNA) and a feature selection technique based on Immune Network System (FS-INS) for opinion mining with a machine learning approach. The purpose of MyTNA is to normalize noisy texts in online messages, while FS-INS automatically selects relevant features for the opinion mining process. Several experiments involving 1000 positive and 1000 negative movie feedback entries were conducted. The results show that the accuracy of opinion mining using Naïve Bayes (NB), k-Nearest Neighbor (kNN) and Sequential Minimal Optimization (SMO) increases after the introduction of MyTNA and FS-INS.
E-business systems are known for frequent changes in business requirements, and traditional software engineering approaches have difficulty keeping up with this dynamism. Service oriented architecture has become popular in software development because it offers a way to handle frequent changes to business environments in a heterogeneous network. In a service oriented architecture, new systems are developed quickly by combining services developed and owned by different organizations, and one way of realising this architecture is via Web services. Although much research effort has been put into the discovery, invocation and composition of services, testing Web services has only begun to attract interest from researchers and industry players. This paper provides a mapping study of current research on Web services composition testing. Research papers on testing of Web services composition were gathered from various scholarly databases, using the search engines they provide, within a given period of time. The papers were then classified according to the issues they address. The aim is to get a broad overview of the current state of research in Web services composition testing. By looking at the areas existing research focuses on, gaps and untouched areas of Web services composition testing can be discovered.
compared to backward-forward algorithm. However, the sentiment analysis accuracy using AIN with both stemming techniques shows almost similar results. In the future, a thorough study of artificial immune system techniques and a comparative study of other machine learning techniques for sentiment analysis are required for better results.
Normalization of noisy texts in Malaysian online reviews
The gathering of useful information from online messages has increased as more and more people use the Internet and online applications such as Facebook and Twitter to communicate with each other. One of the problems in processing online messages is the high number of noisy texts in these messages. A few studies have shown that noisy texts degrade the results of text mining activities. On the other hand, very few works have investigated the patterns of noisy texts created by Malaysians. In this study, a common noisy terms list and an artificial abbreviations list were created using specific rules and were used to select candidate correct words for a noisy term. The correct term was then selected based on a bi-gram word index. The experiments used online messages created by Malaysians. The results show that normalization of noisy texts using the artificial abbreviations list complements the use of the common noisy texts list.
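The two-stage idea in this abstract (look up candidates in the two lists, then choose among them with a bi-gram index) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the list entries, the function names and the rule of scoring a candidate by how often it follows the previously chosen word are all hypothetical.

```python
from collections import Counter

# Hypothetical lookup lists; the entries are illustrative, not taken from the study.
COMMON_NOISY_TERMS = {"x": ["tidak"], "yg": ["yang"], "sy": ["saya"]}
ARTIFICIAL_ABBREVIATIONS = {"mkn": ["makan", "makin"], "blh": ["boleh"]}


def build_bigram_index(sentences):
    """Count adjacent word pairs in a corpus of clean sentences."""
    index = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        index.update(zip(words, words[1:]))
    return index


def candidates_for(term):
    """Collect candidates from both lists; fall back to the term itself."""
    found = COMMON_NOISY_TERMS.get(term, []) + ARTIFICIAL_ABBREVIATIONS.get(term, [])
    return found or [term]


def normalize(message, bigram_index):
    """Replace each term with the candidate that most often follows the previous word."""
    output = []
    for term in message.lower().split():
        options = candidates_for(term)
        previous = output[-1] if output else None
        # Assumed scoring rule: bi-gram count of (previous word, candidate).
        # Ties, including all-zero counts, resolve to the first candidate.
        best = max(options, key=lambda word: bigram_index[(previous, word)])
        output.append(best)
    return output
```

With an index built from a small clean corpus, `normalize("sy blh mkn", index)` expands each abbreviation and uses the bi-gram counts only where a term has more than one candidate.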
Intelligent decision support system for employee's performance prediction / Hamidah Jantan … [et al.]
Hidden and valuable knowledge can be discovered through the data mining process. In data mining, classification is one of the major tasks for extracting knowledge from huge amounts of data. Knowledge discovered from the data mining classification process can be embedded in Decision Support System (DSS) development, which is known as Intelligent DSS (IDSS). IDSS uses Artificial Intelligence techniques to complement the work of human professionals. Nowadays, data mining techniques are widely used in various fields, but they have not attracted much attention in the Human Resource (HR) field. An HR system is known as a set of integrated and interrelated approaches to managing human resources, and most of its activities involve many unstructured processes such as staffing, training, motivation and maintenance. In addition, human decisions are subject to limitations, as people sometimes forget the crucial details of a problem. Fair and consistent in evaluations are very important for HR professionals in a...
Suhaimi Ibrahim and Naz’ri Mahrin are supported by Research University Grant Vot 00H68. Abstract: Semantic Web Services are Web Services that are semantically annotated to make the services machine understandable, thus allowing service discovery, selection, composition, and invocation to be done automatically or with minimum human intervention. The Semantic Web services research community has been focusing on how these semantics can facilitate service discovery, selection, composition, and invocation. Lately, there has been growing research interest in the area of Semantic Web services testing. However, it is not stated how the semantic annotations in the Web services description can help improve testing, or how this differs from testing normal Web services. This paper discusses current ongoing research on testing Semantic Web services and classifies how testing uses the semantics of the Semantic Web services description.
There is a sheer volume of rich web resources such as digital newspapers, e-forums, blogs, Facebook and Twitter. Mining these digital text resources may reveal interesting knowledge to the respective individuals or organizations. Text mining and sentiment mining or analysis are parts of a new area in sentiment research. Sentiment mining for Malay Newspaper (SAMNews) is built on an artificial immune system technique called the negative selection algorithm, which is able to classify the sentiment of newspaper sentences by polarity (positive, negative or neutral). The sentiment analysis in this project used 1000 sentences from newspapers to evaluate the average accuracy: 900 sentences as training data and another 100 as testing data. An accuracy of 88.5% was achieved. In the future, a comparative study of Artificial Immune System and other techniques or algorithms can be carried out to enhance the performance of the sentiment classification model.
An Investigation of the Small Business Start-ups' Performance
This paper aims to scrutinize the factors that influence the performance of small businesses in the early stages of operation. The sample for this study consists of small enterprises that operate under the Tunas Mekar programme in the states of Terengganu and Kelantan. Multiple regression analysis was conducted to analyse the data. This study reveals that entrepreneurial characteristics, management practices, and training and guidance significantly influence the performance of small business start-ups. This paper proves that entrepreneurial characteristics and management practices are important attributes for the performance in the initial stage of an enterprise. It also reminds theorists and practitioners that to ensure the sustainability of emerging small enterprises, training should not be neglected.
This paper proposes an Artificial Intelligence (AI) technique for solving the RF magnetron sputtering process parameter optimization problem. RF magnetron sputtering is a physical vapor deposition process that is widely used in the manufacturing of thin films. In this research, the optimization of the sputtering process parameters is solved computationally using the gravitational search algorithm (GSA). The study concentrates on four process parameters of the RF magnetron sputtering process: RF power, deposition time, oxygen flow rate and substrate temperature. Zinc oxide (ZnO) was chosen as the material because of its many significant characteristics. For validation, GSA performance was compared with particle swarm optimization (PSO). Based on the results, GSA outperformed PSO in terms of the accuracy of the optimization performance, fitness value and processing time. The results showed that the AI approach in solving this nano-process para...
Defining Image Attributes for Frames Filtering Rule
Image assessment has been a topic of intense research over the last decades. Researchers have presented many image attributes and proposed countless computational methods for image assessment. In this work we define an image attribute measurement method for quantifying frame interestingness. The attributes are used to obtain a set of candidate key frames. The first challenge was to define adequate frame attributes and how they could be measured. Nine attributes were acquired from human experts, who judged them important in an image interestingness measure. In the context of this study, the definitions of the attributes were restricted to obtaining candidate key frames for key frame consideration. Even though this is a preliminary finding, the constraints for each attribute were found to be satisfiable, since the filtering rule allows candidate key frames to be selected based on a human preference factor.
This paper proposes a normalization technique for noisy terms that occur in Malaysian micro-texts. Noisy terms are common in online messages and influence the results of activities such as text classification and information retrieval. Even though many researchers have studied methods to solve this problem, few have looked into it for a language other than English. In this study, about 5000 noisy texts were extracted from 15000 documents created by Malaysians. Normalization was carried out using specific translation rules as part of the preprocessing steps in opinion mining of movie reviews. The results show up to a 5% improvement in the accuracy of opinion mining.
Indonesian Journal of Electrical Engineering and Computer Science
This paper presents the Ant Colony Algorithm (ACO) for text classification in a multicore-multithread environment, within the Artificial Intelligence domain. We developed software that applies the concurrency concept to multiple artificial ants. Pheromone is the main concept ACO uses to solve the text classification problem. The pheromone value changes depending on the solution discovered in the pseudo-random heuristic attempt at selecting a path through the text words. However, ACO can take longer to process larger training documents. Based on the cooperative behaviour of ants living in a colony, the ACO part is examined in a multicore-multithread environment to gain an execution time benefit. When running in the multicore-multithread environment, the modification aims to make artificial ants communicate actively across multiple physical processor cores. The execution time reduction is expected to show an improvement without compromi...
This paper reviews and analyses the limitations of the existing methods used in the IR process for retrieving Malay Translated Hadith documents related to a search request. Traditional Malay Translated Hadith retrieval systems have not focused on semantic extraction from text. The bag-of-words representation ignores the conceptual similarity between the information in the query text and in the documents, which produces unsatisfactory retrieval results. Therefore, a more efficient IR framework is needed. This paper claims that significant information extraction and subject-related information are important because the clues from this information can be used to find the documents relevant to a query. Unimportant information can also be discarded when representing the document content. Semantic understanding of the query and the document is therefore necessary to improve the effectiveness and accuracy of retrieval results in this domain. Further research is needed and will be carried out in future work. It is hoped that this will help users search for and find information in Malay Translated Hadith documents.
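The bag-of-words limitation described above can be illustrated with a small sketch. It is not taken from the paper: the function names and the English example texts are hypothetical, and cosine similarity over raw term counts is assumed as the matching measure. Two texts about the same concept that share no words score zero.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a text as raw term counts, ignoring word order and meaning."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical query and document: related in meaning, but with no shared terms.
query = bag_of_words("charity rewards")
document = bag_of_words("virtue of giving alms")
score = cosine_similarity(query, document)  # no overlap, so the score is 0.0
```

A retrieval system that ranks only by such term overlap would miss this document, which is the gap that semantic understanding of the query and document is meant to close.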
Papers by Mazidah Puteh