Improving Proxy Cache Performance: Analysis of Three Replacement Policies. Dilley, J.; Arlitt, M. IEEE Internet Computing, Volume 3, Issue 6, Nov.-Dec. 1999, pp. 44-50
Outline • Introduction • Related work • New replacement policies • Simulation • Conclusion
Introduction • The goals of a web cache • Reducing the bandwidth demand on the external network. • Reducing the average time it takes for a web page to load.
Where a web cache can be located • Client side • Proxy side • Web server side
Cache replacement policy • Why replacement is needed • A cache server has a fixed amount of storage. • When this storage space fills, the cache must choose a set of objects to evict.
Cache replacement policy (cont.) • The goal of a cache replacement policy • To make the best use of available resources, including disk and memory space and network bandwidth.
Web proxy workload characterization • The workload characteristics of interest • The patterns in the number of objects referenced and the relationships among accesses.
Web proxy workload characterization (cont.) • The workload chosen for simulation • Traces of actual live execution • The drawback of such traces • A trace does not capture changing behavior or the behavior of a different user set.
Measures of efficiency • Hit rate • The most popular measure. • Byte hit rate • Matters most when bandwidth is limited and expensive. • CPU and I/O system utilization • Object retrieval latency (or page load time)
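A minimal sketch (not from the paper) of how the first two metrics can be computed over a request trace; the trace format and field names here are assumptions for illustration:

```python
# Hypothetical trace format: each request is (url, size_in_bytes, was_hit).
def cache_metrics(trace):
    hits = sum(1 for _, _, hit in trace if hit)
    hit_bytes = sum(size for _, size, hit in trace if hit)
    total_bytes = sum(size for _, size, _ in trace)
    return hits / len(trace), hit_bytes / total_bytes

# Two of three requests hit, but the miss is the largest object,
# so the byte hit rate is far lower than the hit rate.
trace = [("a", 1_000, True), ("b", 50_000, False), ("a", 1_000, True)]
print(cache_metrics(trace))  # (0.667, 0.038)
```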
An optimal cache replacement policy • It would know each document's future popularity and choose the most advantageous way to use the cache's finite space.
Heuristics that approximate optimal cache replacement • Caching a few documents with high download latency might actually reduce average latency. • To maximize object hit rate, it is better to keep many small popular objects.
Related work • Replacement algorithms • Least Recently Used (LRU) • It works well when there is high temporal locality in the workload. • It is typically implemented with a doubly linked list; a minimal sketch follows.
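A minimal LRU sketch, assuming capacity is counted in objects rather than bytes; Python's OrderedDict stands in for the doubly linked list:

```python
from collections import OrderedDict

class LRUCache:
    """Sketch only: move_to_end() marks a hit as most recently used,
    popitem(last=False) evicts the least recently used entry."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def get(self, key):
        if key not in self.store:
            return None
        self.store.move_to_end(key)         # mark as most recently used
        return self.store[key]

    def put(self, key, value):
        if key in self.store:
            self.store.move_to_end(key)
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the LRU entry
```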
Replacement algorithms (cont.) • Least Frequently Used (LFU) • It evicts the objects with the lowest reference count. • It is typically implemented with a heap. • Its drawback is cache pollution: once-popular objects can linger long after their popularity fades.
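A minimal LFU sketch under the same object-count assumption; the heap holds (count, key) pairs, and stale entries are skipped lazily at eviction time:

```python
import heapq

class LFUCache:
    """Sketch only: heap entries go stale when a count changes; an entry
    is trusted only if its stored count still matches the live count."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.count = {}   # key -> live reference count
        self.heap = []    # (count, key) entries, possibly stale

    def access(self, key):
        self.count[key] = self.count.get(key, 0) + 1
        heapq.heappush(self.heap, (self.count[key], key))
        while len(self.count) > self.capacity:
            c, k = heapq.heappop(self.heap)
            if self.count.get(k) == c:    # entry is current: evict it
                del self.count[k]
```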
Replacement algorithms (cont.) • LFU-Aging (LFUA) • It considers both an object's access frequency and its age in the cache. • It avoids the problem of cache pollution.
Replacement algorithms (cont.) • Greedy Dual-Size (GDS) • It takes size and a cost function for retrieving objects into account.
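A sketch of GDS eviction as I read the slide (byte capacity assumed; cost defaults to 1, which targets hit rate): each object's key is H = L + cost/size, and the inflation value L rises to the victim's key on every eviction:

```python
class GreedyDualSize:
    """Sketch only: linear-scan eviction for clarity, not performance."""
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.L = 0.0                 # inflation value ("cache age")
        self.meta = {}               # key -> (H value, size)

    def access(self, key, size, cost=1.0):
        if key in self.meta:
            _, size = self.meta[key]           # keep the known size
        else:
            self.used += size
        self.meta[key] = (self.L + cost / size, size)
        while self.used > self.capacity:
            victim = min(self.meta, key=lambda k: self.meta[k][0])
            self.L = self.meta[victim][0]      # inflate the clock
            self.used -= self.meta[victim][1]
            del self.meta[victim]
```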
New replacement policies • LFU with Dynamic Aging (LFUDA) • It adds a cache age factor to the reference count when a new object is added or an existing object is referenced.
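A minimal LFUDA sketch under the same reading, reusing the GDS-style inflation value: an object's key is the cache age at access time plus its reference count:

```python
class LFUDA:
    """Sketch only: capacity in objects, linear-scan eviction."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.L = 0.0       # cache age (the dynamic aging factor)
        self.count = {}    # object -> reference count
        self.key = {}      # object -> cache age at access + count

    def access(self, obj):
        self.count[obj] = self.count.get(obj, 0) + 1
        self.key[obj] = self.L + self.count[obj]
        if len(self.key) > self.capacity:
            victim = min(self.key, key=self.key.get)
            self.L = self.key[victim]          # the dynamic aging step
            del self.key[victim]
            del self.count[victim]
```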
New replacement policies (cont.) • GDS-Frequency (GDSF) • It considers reference frequency in addition to object size. • It assigns a key to each object: Key = (reference count / size) + cache age.
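The GDSF key from the slide written out as a function, with an illustrative comparison (the sizes and counts are made up):

```python
def gdsf_key(ref_count, size_bytes, cache_age):
    # key = (reference count / size) + cache age, as on the slide
    return cache_age + ref_count / size_bytes

# A 1 KB object referenced 5 times outranks a 100 KB object referenced
# 20 times at the same cache age, so the small popular object survives.
print(gdsf_key(5, 1_000, 0.0))     # 0.005
print(gdsf_key(20, 100_000, 0.0))  # 0.0002
```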
Simulation • Two types of cache replacement algorithms • Heap-based, such as GDSF, LFUDA, and LRU_H • Implemented with a heap. • Lets memory usage grow to a high watermark before examining any cache metadata.
Two types of cache replacement algorithms (cont.) • List-based, such as LRU_L • Implemented with a linked list. • Examines object metadata more often in order to implement the LRU reference age. A sketch of the watermark behavior follows.
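A sketch of the watermark behavior described above, with assumed thresholds: a heap-based cache defers all metadata work until usage passes a high watermark, then evicts in key order until usage falls below a low watermark:

```python
import heapq

HIGH, LOW = 0.95, 0.80   # illustrative fractions of capacity, not from the paper

def maybe_evict(heap, sizes, used, capacity):
    """heap holds (key, obj) pairs; the smallest key is evicted first."""
    if used < HIGH * capacity:
        return used                    # below the watermark: do nothing
    while heap and used > LOW * capacity:
        _, obj = heapq.heappop(heap)
        used -= sizes.pop(obj)         # evict obj and reclaim its bytes
    return used
```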
Simulation results • Hit rate (comparison chart in the original slides)
Findings • LRU_L has higher and more variable response times than LRU_H because of the extra time spent computing the reference age and choosing the object to evict. • That extra work appears to translate into a higher hit rate for LRU_L. • GDSF shows consistent improvement over LRU.
Conclusion • A more sophisticated policy (heap-based, for example) can be implemented without degrading the cache's performance. • The bottleneck is I/O (network and disk), not CPU utilization. • To achieve the greatest latency reduction, optimize the cache hit rate.
• When bandwidth is dear, the focus must be on improving the byte hit rate. • The impact of consistency validation of cached objects becomes more significant.