Learning predictors for student retention is very difficult. After reviewing the literature, it is evident that there is considerable room for improvement in the current state of the art. As shown in this paper, improvements are possible if we (a) explore a wide range of learning methods; (b) take care when selecting attributes; (c) assess the efficacy of the learned theory not just by its median performance, but also by the variance in that performance; (d) study the delta of student factors between those who leave and those who are retained.
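A minimal sketch of point (c) above: judging a learner by both the median and the spread of its cross-validation scores. It assumes scikit-learn and a hypothetical "students.csv" with a binary "retained" column; the column names and learner choice are illustrative, not from the paper.

# Sketch only: assess a retention predictor by the median AND the spread
# of its cross-validation scores, not a single point estimate.
# Assumes a hypothetical "students.csv" with a binary "retained" column.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

data = pd.read_csv("students.csv")
X, y = data.drop(columns=["retained"]), data["retained"]

scores = cross_val_score(RandomForestClassifier(random_state=1),
                         X, y, cv=10, scoring="accuracy")
print("median accuracy:", np.median(scores))
print("variance:       ", np.var(scores))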
Abstract There are many machine learning algorithms currently available. In the 21st century, the problem no longer lies in writing the learner but in choosing which learners to run on a given data set. We argue that the final choice of learners should not be exclusive; in fact, there are distinct advantages in running data sets through multiple learners. To illustrate our point, we perform a case study on a reuse data set using three different styles of learners: association rule, decision tree induction, and treatment learning.
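A minimal sketch of the "run several learners, then compare" idea. The paper's actual learner styles (association rules, treatment learning) have no standard scikit-learn equivalent, so common classifiers stand in here; the "reuse.csv" file and its "class" column are hypothetical.

# Sketch only: run one data set through several different learners and
# compare, rather than committing to a single algorithm up front.
# "reuse.csv" and its "class" target column are hypothetical.
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

data = pd.read_csv("reuse.csv")
X, y = data.drop(columns=["class"]), data["class"]

for name, learner in [("decision tree", DecisionTreeClassifier()),
                      ("naive bayes", GaussianNB()),
                      ("logistic regression", LogisticRegression(max_iter=1000))]:
    print(name, cross_val_score(learner, X, y, cv=5).mean())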
Abstract Despite the widespread availability of software effort estimation models (e.g., COCOMO [2], Price-S [12], SEER-SEM [13], SLIM [14]), most managers still estimate new projects by extrapolating from old projects [3, 5, 7]. In this delta method, the cost of the next project is the cost of the last project multiplied by some factors modeling the difference between old and new projects [2]. Delta estimation is simple, fast, and, best of all, can take full advantage of local costing information.
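A minimal sketch of the delta method as described above: next cost equals last cost multiplied by factors modeling the differences between projects. The factor names and values below are invented for illustration; they are not drawn from COCOMO or the paper.

# Sketch only: delta estimation = last project's cost scaled by factors
# modeling the differences between the old and new projects.
def delta_estimate(last_cost, factors):
    estimate = last_cost
    for f in factors.values():
        estimate *= f
    return estimate

factors = {"team_experience": 0.9,     # invented: new team slightly more experienced
           "product_complexity": 1.2,  # invented: new product somewhat more complex
           "required_reliability": 1.1}
print(delta_estimate(1000, factors))   # e.g., person-hours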
Knowledge Acquisition for Performance Systems; or: When can "tests" replace "tasks"?
ABSTRACT: Currently, "task analysis" is the dominant paradigm in the knowledge acquisition community. We argue that for performance systems (i.e., systems that do not have to offer a knowledge-level description of their performance at runtime) a simpler "test analysis" approach may suffice. We offer examples where a seemingly naive testing regime gives rise to competent performance systems.
Most process models calibrate their internal settings using historical data. Collecting this data is expensive, tedious, and often incomplete. Is it possible to make accurate software process estimates without historical data? Suppose much of the uncertainty in a model comes from a small subset of the model variables.
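A minimal Monte Carlo probe of that last supposition: sample a model's inputs from plausible ranges and see which few inputs account for most of the output's variation. The toy "process model" and input ranges below are invented for illustration only.

# Sketch only: check whether a small subset of inputs drives most of the
# output uncertainty. The toy effort model and ranges are invented.
import numpy as np

rng = np.random.default_rng(1)
n = 10000
inputs = {"team_size":  rng.uniform(3, 10, n),
          "complexity": rng.uniform(1.0, 2.0, n),
          "tool_use":   rng.uniform(0.8, 1.2, n)}

effort = inputs["complexity"] ** 2 * inputs["team_size"] * inputs["tool_use"]

for name, values in inputs.items():
    r = np.corrcoef(values, effort)[0, 1]
    print(f"{name}: correlation with effort = {r:.2f}")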
Abstract Adaptive systems are systems whose function evolves while adapting to current environmental conditions. Due to the real-time adaptation, newly learned data have a significant impact on system behavior. When online adaptation is included in system control, anomalies could cause abrupt loss of system functionality and possibly result in a failure. In this paper we present a framework for reasoning about the online adaptation problem.
Evaluation issues for visual programming languages
Abstract Many claims are made regarding the benefits of visual frameworks. The case that pictures assist in explaining complicated knowledge seems intuitively obvious. But is it correct? Pre-experimental intuitions must be verified, no matter how compelling they may seem. This article takes a critical look at the available evidence on the efficacy of visual programming (VP) systems.
Abstract Software compiles and therefore is characterized by a parseable grammar. Natural language text rarely conforms to prescriptive grammars and therefore is much harder to parse. Mining parseable structures is easier than mining less structured entities. Therefore, most work on mining repositories focuses on software, not natural language text.
We discuss using a single inference procedure (abduction) for implementing the various modules of an intelligent decision support system.
Abstract It is an interesting and exciting challenge to change programming modalities from a traditional text-based approach to a 2-D screen. Based on a survey of current visual programming systems, we find that numerous software engineering and knowledge engineering techniques are required to meet that challenge. Further, we argue that VP systems can benefit from ongoing knowledge engineering research on the computational complexity of different representations.
Abstract Modern software is often constructed using "spiral specification"; i.e., the specification is a dynamic document that is altered by experience with the current version of the system. Mathematically, many of the sub-tasks within spiral specification belong to the NP-complete class of tasks. In the traditional view of computer science, such tasks are fundamentally intractable and only solvable using incomplete, approximate methods that can be undependable.
The late Herbert Simon characterized design as a search through a space of options [Simon, 1969]. This definition can be extended as follows: given some predicate GOOD that can assess a design, then a design discussion can be characterized as a debate between options in the design space that maximize the score of GOOD. In the usual case, the design discussion is complicated by a set of uncontrollable variables which are set via some nondeterministic process outside the control of the analyst.
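A minimal sketch of that framing: score each design option with a GOOD function while the uncontrollable variables are set at random, then pick the option with the best expected score. The option space, GOOD scoring, and variable ranges below are all invented for illustration.

# Sketch only: design as search for the option maximizing an invented GOOD
# score, averaged over nondeterministic settings of uncontrolled variables.
import random

random.seed(1)
options = [{"redundancy": r, "new_tech": t}
           for r in (0, 1, 2) for t in (False, True)]

def good(option, uncontrolled):
    # invented scoring: reward redundancy, penalize new tech in harsh environments
    score = 10 * option["redundancy"]
    if option["new_tech"] and uncontrolled["environment"] == "harsh":
        score -= 15
    return score

def expected_good(option, trials=1000):
    total = 0
    for _ in range(trials):
        uncontrolled = {"environment": random.choice(["benign", "harsh"])}
        total += good(option, uncontrolled)
    return total / trials

print(max(options, key=expected_good))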
The purpose of assurance activities is to reduce risk, thereby ensuring requirements. However, assurance activities incur costs such as budget, schedule, mass (e.g., radiation shielding), etc. The selection of assurance activities to perform is thus an assurance optimization problem. For example, for a given budget, select the set of assurance activities that will minimize risk (i.e., maximize requirements). Alternately, for a given level of requirements, select the minimal-cost set of assurance activities that will achieve that level of requirements. Our work demonstrates a novel technique for assurance optimization. Users indicate their preferences by assigning relative weights to solution classes (e.g., weighting highly a solution class that is low risk and at or below the users' target cost threshold, and weighting less highly a solution class that is low risk but slightly above the users' target cost threshold). The technique uses machine learning to identify the critical choices that lead to contrastingly different classifications. The net result is near-optimal solutions to assurance optimization problems, even in huge search spaces. Furthermore, the technique reveals which of the many decisions are the most crucial to achieving those optimal results. The technique is realized in an operational computer program. Experiments on assurance datasets of considerable size show promising empirical results. For example, we experimented on an assurance model that arose from a study of an advanced spacecraft technology. This assurance model contained 99 options of risk mitigation actions, i.e., 2^99 (approximately 10^30) possible combinations of these actions. Our technique was successful at determining the 16 actions most crucial to perform, the 14 actions most crucial to not perform, and the remaining 66 actions whose influence was the least on the quality of the solution. A literature review [7], a mathematical analysis [6] and experiments on numerous case studies suggest this technique has broad applicability.
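A small stand-in for the sample-then-learn loop described above: sample random on/off settings for N risk-mitigation actions, score each sample with an invented cost/risk model, label the best-scoring samples as the preferred class, then ask a decision tree which actions most influence that label. All numbers are illustrative, not from the spacecraft study or its tool.

# Sketch only: random sampling over 99 binary actions plus a learner that
# reports which actions matter most. Costs, risk reductions, and weights
# are invented.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
n_actions, n_samples = 99, 5000
costs = rng.uniform(1, 10, n_actions)       # invented cost per action
risk_cut = rng.uniform(0.5, 5, n_actions)   # invented risk reduction per action

X = rng.integers(0, 2, size=(n_samples, n_actions))
cost = X @ costs
risk = 100 - X @ risk_cut

score = -cost - 2 * risk                    # prefer low cost; weight risk more heavily
label = score > np.quantile(score, 0.9)     # "best" decile vs. the rest

tree = DecisionTreeClassifier(max_depth=3).fit(X, label)
top = np.argsort(tree.feature_importances_)[::-1][:5]
print("most influential actions:", top)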
Abstract. Software defect detectors input structural metrics of code and output a prediction of how faulty a code module might be. Previous studies have shown that such metrics may be confused by the high correlation between metrics. To resolve this, feature subset selection (FSS) techniques such as principal components analysis can be used to reduce the dimensionality of metric sets in hopes of creating smaller and more accurate detectors.
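A minimal sketch of the dimensionality-reduction idea above: run correlated code metrics through PCA before training a defect detector. It assumes scikit-learn and a hypothetical "metrics.csv" with a binary "defective" column; the file, column names, and classifier are illustrative.

# Sketch only: PCA to reduce correlated code metrics before training a
# defect detector. "metrics.csv" and its "defective" column are hypothetical.
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = pd.read_csv("metrics.csv")
X, y = data.drop(columns=["defective"]), data["defective"]

detector = make_pipeline(StandardScaler(),
                         PCA(n_components=0.95),  # keep 95% of the variance
                         LogisticRegression(max_iter=1000))
print(cross_val_score(detector, X, y, cv=5, scoring="f1").mean())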
Our research findings are captured in what we call the 21st Century Effort Estimation Methodology (2cee). 2cee has been encoded in a Windows-based tool that can be used both to generate an estimate and to allow the model developer to calibrate and develop models using these techniques.
Abstract Early practical experience, based upon empirical observations, strongly indicates that using classes as the basis for effort estimation can help to achieve DeMarco's holy grail. In providing evidence for this claim, this position paper will describe (i) empirical size observations; (ii) estimation processes based upon these observations; and (iii) initial experiences using these processes. This paper will conclude with lessons learned and further work to be done.
Abstract We believe that the OO model is a useful heuristic view for large networks of inter-dependent attributes. The mistake of the OO modeling world is to assume that this view should be the actual architecture of the software. We will argue that the OO model needs to import numerous concepts from the data modeling literature. Once incorporated, the OO model would become a view of some underlying network of attribute dependencies.
Abstract Many expert systems tasks imply making assumptions. Where assumptions conflict, these assumptions must be managed in separate worlds. We describe an abductive device for managing such conflicts. This device lets us manage the conflicting assumptions, validate the (potentially) conflicting theory, as well as perform expert systems inference across theories that generate conflicts. We find that conflict management is a general framework for building test engines and inference engines in knowledge-based systems.