Optimization of the Lifting Scheme DWT on a VLIW processor
2006 International Workshop on Computer Architecture for Machine Perception and Sensing, 2007
ABSTRACT This paper describes a new approach to implementing the 5/3 integer lifting scheme for the wavelet transform on a VLIW CPU core, with the goal of improving computational performance in terms of cycles and memory accesses. The lifting scheme is part of the most recent standard for image coding (JPEG2000), for which a highly optimized software implementation is mandatory on embedded processor systems. We use one such processor as a reference to highlight the requirements on VLIW architectures that offer a limited form of instruction-level parallelism and a fixed ratio of memory to general-purpose instructions within a long word. We show that a careful analysis of the data access pattern typical of the lifting scheme reduces data misses and execution time, measured in clock cycles, by over 60% with respect to a straightforward implementation.
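For orientation, the reversible 5/3 lifting step that the abstract refers to can be sketched in C as below. This is only the textbook two-pass form (predict, then update) with symmetric boundary extension, roughly the kind of "straightforward implementation" the abstract uses as its baseline. It is not the optimized VLIW version described in the paper. The names `lift53_forward` and `floor_div`, the even-length restriction, and the separate output buffers are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Floor division for a positive divisor b (C's '/' truncates toward zero). */
static int floor_div(int a, int b)
{
    int q = a / b;
    if (a % b < 0)
        q--;
    return q;
}

/*
 * One level of the forward 1-D reversible 5/3 lifting transform.
 * x: input samples, n: number of samples (assumed even and >= 2)
 * s: output low-pass coefficients  (n/2 values)
 * d: output high-pass coefficients (n/2 values)
 * Hypothetical helper for illustration; boundaries use symmetric extension.
 */
void lift53_forward(const int *x, size_t n, int *s, int *d)
{
    assert(n >= 2 && n % 2 == 0);
    size_t half = n / 2;

    /* Predict step: d[i] = x[2i+1] - floor((x[2i] + x[2i+2]) / 2) */
    for (size_t i = 0; i < half; i++) {
        int left  = x[2 * i];
        /* Mirror at the right edge: x[n] is taken as x[n-2]. */
        int right = (2 * i + 2 < n) ? x[2 * i + 2] : x[n - 2];
        d[i] = x[2 * i + 1] - floor_div(left + right, 2);
    }

    /* Update step: s[i] = x[2i] + floor((d[i-1] + d[i] + 2) / 4) */
    for (size_t i = 0; i < half; i++) {
        /* Mirror at the left edge: d[-1] is taken as d[0]. */
        int dprev = (i > 0) ? d[i - 1] : d[0];
        s[i] = x[2 * i] + floor_div(dprev + d[i] + 2, 4);
    }
}
```

Each pass in this sketch walks the whole row, so every input sample is loaded more than once. That repeated access pattern is the kind of behavior a data-access analysis like the one in the abstract would target, although the specific restructuring used by the authors is not given here.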
Papers by Marco Ferretti