In RFID (Radio Frequency IDentification) systems, a tag reader communicates with tags, reads their identification codes, and accesses their related databases through the network infrastructure. There is considerable research activity on RFID systems for industrial applications such as delivery and manufacturing, but little on mobile devices such as cellular phones and PDAs. This paper presents the architecture of a multi-protocol RFID reader for mobile devices. We considered several design parameters, such as low power consumption, cost effectiveness, and flexibility. We prototyped our system on the ARM-based Excalibur FPGA with an iPAQ PDA, and also fabricated a chip in 0.18 µm technology to verify our architecture.
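The abstract describes the reader only at the architectural level; as a hedged illustration of what "multi-protocol" can mean in software terms, the C sketch below polls a table of per-protocol handlers. The protocol names and function signatures are assumptions for this example, not details from the paper (the actual design is a hardware datapath on the FPGA).

    /* Hypothetical multi-protocol reader abstraction. The protocol set
       and routine names are illustrative only. */
    #include <stdint.h>
    #include <stddef.h>

    typedef enum { PROTO_ISO14443A, PROTO_ISO15693 } proto_t;

    typedef struct {
        proto_t id;
        /* Fills buf with a tag ID; returns 0 on success, -1 otherwise. */
        int (*read_tag_id)(uint8_t *buf, size_t len);
    } proto_ops_t;

    /* Stubs: a real implementation would drive the RF front end. */
    static int read_iso14443a(uint8_t *buf, size_t len) { (void)buf; (void)len; return -1; }
    static int read_iso15693(uint8_t *buf, size_t len)  { (void)buf; (void)len; return -1; }

    static const proto_ops_t protocols[] = {
        { PROTO_ISO14443A, read_iso14443a },
        { PROTO_ISO15693,  read_iso15693  },
    };

    /* Poll each supported protocol in turn until a tag answers. */
    int poll_tags(uint8_t *buf, size_t len) {
        for (size_t i = 0; i < sizeof protocols / sizeof protocols[0]; i++)
            if (protocols[i].read_tag_id(buf, len) == 0)
                return (int)protocols[i].id;
        return -1; /* no tag responded */
    }

A table-driven dispatch like this mirrors the flexibility goal stated in the abstract: supporting an additional protocol means adding one table entry rather than building a separate reader.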
Recent proposals for multithreaded architectures allow threads with unknown dependences to execute speculatively in parallel. These architectures use hardware speculative storage to buffer uncertain data, track data dependences, and roll back incorrect executions. Because all memory references access the speculative storage, current proposals implement this storage using small memory structures for fast access. The limited capacity of the speculative storage causes considerable performance loss due to speculative storage overflow whenever a thread's speculative state exceeds the storage capacity. Larger threads exacerbate the overflow problem but are preferable to smaller threads, as larger threads uncover more parallelism. In this paper, we discover a new program property called memory reference idempotency. Idempotent references need not be tracked in the speculative storage, and can instead directly access non-speculative storage (i.e., the conventional memory hierarchy). Thus…
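To make the overflow mechanism concrete, here is a minimal software model of a speculative buffer, assuming a toy fixed capacity and a simple address-matching policy; the real storage is cache-like hardware, and the constants and names here are invented for illustration.

    /* Toy model of hardware speculative storage: every speculative store
       must be buffered until the thread commits, and the buffer is small. */
    #include <stdbool.h>
    #include <stdint.h>

    #define SPEC_CAPACITY 4   /* deliberately tiny, like real hardware buffers */

    typedef struct { uintptr_t addr; uint32_t value; } spec_entry_t;

    typedef struct {
        spec_entry_t entries[SPEC_CAPACITY];
        int used;
        bool overflowed;      /* set when speculative state no longer fits */
    } spec_buffer_t;

    /* Buffer a speculative store; flag overflow when capacity is exceeded. */
    void spec_store(spec_buffer_t *sb, uintptr_t addr, uint32_t value) {
        for (int i = 0; i < sb->used; i++)
            if (sb->entries[i].addr == addr) {   /* update existing entry */
                sb->entries[i].value = value;
                return;
            }
        if (sb->used == SPEC_CAPACITY) {         /* no free entry left */
            sb->overflowed = true;
            return;
        }
        sb->entries[sb->used++] = (spec_entry_t){ addr, value };
    }

Once overflowed is set, the hardware has no room to track further speculative state, so the thread must stall or restart; that is the performance loss this work targets.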
Where Does the Speedup Go: Quantitative Modeling of Performance Losses in Shared-Memory Programs
Parallel Processing Letters, 2000
Even fully parallel shared-memory program sections may perform significantly below the ideal speedup of P on P processors. Relatively little quantitative information is available about the sources of such inefficiencies. In this paper we present a speedup component model that is able to fully account for the sources of performance loss in parallel program sections. The model categorizes the gap between measured and ideal speedup into four components: memory stalls, processor stalls, code overhead, and thread management overhead. These components are measured using hardware counters and timers, with which programs are instrumented automatically by our compiler. The speedup component model allows us, for the first time, to quantitatively state the reasons for less-than-optimal program performance on a per-section basis. The overhead components are chosen such that they can be associated directly with software and hardware techniques that may improve performance. Although …
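The abstract names the four components but gives no formula; assuming each component is expressed as speedup lost, the decomposition it describes can be written as the following sketch (the Delta symbols are our notation, not the paper's):

    \[
      P - S_{\text{measured}}
        = \underbrace{\Delta_{\text{mem}}}_{\text{memory stalls}}
        + \underbrace{\Delta_{\text{proc}}}_{\text{processor stalls}}
        + \underbrace{\Delta_{\text{code}}}_{\text{code overhead}}
        + \underbrace{\Delta_{\text{thr}}}_{\text{thread management}}
    \]

Here P is the ideal speedup on P processors and S_measured the observed speedup of a program section; each Delta term is derived from the compiler-inserted hardware counters and timers, so the right-hand side fully accounts for the gap between measured and ideal speedup.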
In this paper, we present several tools for analyzing parallel programs. The tools are built on top of a compiler infrastructure, which provides advanced capabilities for symbolic program analysis and manipulation. The tools can display characteristics of a program and relate this information to data gathered from instrumented program runs and other performance analysis tools. They also support an interactive compilation scenario, giving the user feedback on how the compilation process performed and how to improve it. We present case studies demonstrating the tools' use, including the characterization of an industrial application and the study of new compiler techniques and portable parallel languages.
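The abstract does not show what the automatic instrumentation looks like; below is a minimal C sketch of the kind of timing probe a compiler could insert around a program section. All identifiers are invented for this example and are not the tools' actual interface.

    /* Hypothetical compiler-inserted probe: wall-clock time is accumulated
       per program section and later related to source-level characteristics. */
    #include <time.h>

    static double now_sec(void) {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
    }

    static double section_time[16];   /* accumulated time per section ID */

    void instrumented_section(int id, void (*body)(void)) {
        double t0 = now_sec();        /* probe inserted before the section */
        body();                       /* original user code */
        section_time[id] += now_sec() - t0;  /* probe inserted after */
    }

Data recorded this way is what the display tools would relate back to the program characteristics extracted by the compiler infrastructure.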
Exploiting reference idempotency to reduce speculative storage overflow
ACM Transactions on Programming Languages and Systems, 2006
Recent proposals for multithreaded architectures employ speculative execution to allow threads with unknown dependences to execute speculatively in parallel. The architectures use hardware speculative storage to buffer speculative data, track data dependences, and recover from incorrect executions through roll-backs. Because all memory references access the speculative storage, current proposals implement the speculative storage using small memory structures to achieve fast access. The limited capacity of the speculative storage causes considerable performance loss due to speculative storage overflow whenever a thread's speculative state exceeds the speculative storage capacity. Larger threads exacerbate the overflow problem but are preferable to smaller threads, as larger threads uncover more parallelism. In this article, we discover a new program property called memory reference idempotency. Idempotent references are guaranteed to be eventually corrected, though the references may be temporarily…
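The abstract's definition of idempotency is cut off above, so without restating the paper's formal definition, one simple instance of the idea can be shown in C: loads from data that the compiler can prove is never written during speculation always return the same value, so tracking them in the speculative buffer is unnecessary. The example data below is ours, not the paper's.

    /* coeff[] is provably read-only, so its loads cannot participate in a
       dependence violation; they can bypass the speculative storage. */
    const double coeff[4] = { 0.5, 0.25, 0.125, 0.0625 };

    /* history[] may be written by earlier speculative threads, so its
       loads must still be tracked for dependence checking. */
    double history[1024];

    /* Caller guarantees 3 <= i < 1024. */
    double filter(int i) {
        double acc = 0.0;
        for (int k = 0; k < 4; k++)
            acc += coeff[k] * history[i - k];
        return acc;
    }

Letting references like the coeff[] loads bypass the speculative storage frees its limited capacity for the references that genuinely need dependence tracking, which is how idempotency reduces overflow.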
…, respectively. He has co-authored 15 technical papers. His research interests include heterogeneous distributed computing, ubiquitous computing, computer architecture, performance measures, resource management, evolutionary heuristics, energy-aware computing, and reliable and collaborative computing. He is a member of the IEEE and ACM.