
World Wide Web Caching: Trends and Technology

Greg Barish and Katia Obraczka, USC Information Sciences Institute. IEEE Communications Magazine, May 2000. Presented by: Ossama M. Younis, Ph.D. student, Purdue University.


Presentation Transcript


  1. World Wide Web Caching: Trends and Technology. Greg Barish and Katia Obraczka, USC Information Sciences Institute. IEEE Communications Magazine, May 2000. Presented by: Ossama M. Younis, Ph.D. student, Purdue University.

  2. Presentation Outline • Introduction • Limitations and problems • Desirable properties of a Web Caching system • Cache deployment options • Design techniques • Caching architectures • Load balancing and Content-based routing • Conclusion

  3. Introduction • What is Web caching? • Placing proxy servers at selected points in the network to cache Web documents for faster client access. • Comparable to cache memory in a computer system. • Why is it needed? • HTTP traffic has grown rapidly into the largest share of Internet traffic, causing more network congestion and server unavailability. • The number of static Web pages almost doubles every year.

  4. Expected Gains • Bandwidth savings • Improved content availability • Improved Web server availability • Reduced network latency • Server load balancing • Improved user perception of network performance

  5. Main Issues and Problems • Caching system architecture • Proxy placement • Caching contents • Proxy cooperation/Data sharing • Cache resolution/Routing • Prefetching • Cache placement/Replacement • Cache coherency • Security and legal ethics of caching • Control information distribution

  6. Desirable Features: • Fast access • Robustness • Transparency • Scalability • Efficiency • Adaptivity • Stability • Load balancing • Dealing with heterogeneity • Simplicity

  7. Cache Deployment Options • Near the content consumer • Better response time • Local service of requests • Near the content provider • Improves access to logical sets of data • Problem: critical to delay-sensitive content • Problem: security constraints • At strategic points in the network • Problem with administrative control • Caching on a per-user basis • Uses the user’s local file system

  8. Design Techniques • Main Concerns: • Speed • Reliability • Scalability • Main design techniques: • Hierarchical caching • Intercache communication • Other design issues

  9. Hierarchical Caching • Caches are arranged in a tree-like structure • A child cache can query its parent caches and its siblings • A parent cache never queries its children • As a result, information filters gradually down toward the leaves • To avoid swamping parents with requests, clustering may be applied to hierarchies.
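
The lookup order on this slide (local store, then siblings, then parent) can be sketched in a few lines of Python. This is a minimal illustration, not the design of any particular proxy: the `CacheNode` class, its fields, and the `fetch_origin` callback are hypothetical names, and expiry and eviction are left out.

```python
from typing import Callable, Dict, List, Optional, Tuple


class CacheNode:
    """One cache in a tree-like hierarchy (hypothetical class, for illustration)."""

    def __init__(self, name: str, parent: Optional["CacheNode"] = None):
        self.name = name
        self.parent = parent
        self.siblings: List["CacheNode"] = []
        self.store: Dict[str, bytes] = {}

    def lookup(self, url: str, fetch_origin: Callable[[str], bytes]) -> Tuple[bytes, str]:
        """Return (body, source); source names the place that supplied the object."""
        # 1. Local hit: serve from this cache.
        if url in self.store:
            return self.store[url], self.name
        # 2. Ask siblings. They answer only from their own store and never
        #    forward the query, and a parent never asks its children.
        for sibling in self.siblings:
            if url in sibling.store:
                body, source = sibling.store[url], sibling.name
                break
        else:
            # 3. Escalate to the parent, or to the origin server at the root.
            if self.parent is not None:
                body, source = self.parent.lookup(url, fetch_origin)
            else:
                body, source = fetch_origin(url), "origin"
        # Keep a copy so that later requests are served closer to the client.
        self.store[url] = body
        return body, source


def demo_origin(url: str) -> bytes:
    """Stand-in for a real origin server (assumption for the example)."""
    return ("body of " + url).encode()


root = CacheNode("root")
leaf_a = CacheNode("a", parent=root)
leaf_b = CacheNode("b", parent=root)
leaf_a.siblings, leaf_b.siblings = [leaf_b], [leaf_a]
```

Calling `leaf_a.lookup(url, demo_origin)` on a cold hierarchy goes through `root` to the origin and leaves copies in both `root` and `a`; a later request for the same URL at `leaf_b` is answered by its sibling without touching the parent.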

  10. Intercache Communication • Multiple distributed caches arranged in meshes • Improves scalability, availability, and physical locality • Protocols: • ICP (Internet Cache Protocol) [Squid]: Caches query other caches to determine the best location from which to retrieve an object. The main problem is message overhead • CRP (Content Routing Protocol): ICP with a multicast feature for querying cache meshes • Cache digests [Squid]: Compact summaries of the objects a cache holds
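
To make the cache digest idea concrete: a digest lets a cache check locally whether a peer probably holds an object, instead of sending a per-request query as in ICP. Squid's digests are Bloom filters; the sketch below is a generic Bloom filter, and its class name, bit-array size, hash count, and hash function are assumptions chosen for the example rather than Squid's actual parameters.

```python
import hashlib
from typing import Iterator


class CacheDigest:
    """Compact, lossy summary of the URLs a cache holds (illustrative Bloom filter)."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, url: str) -> Iterator[int]:
        # Derive num_hashes bit positions by salting one hash with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{url}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, url: str) -> None:
        for pos in self._positions(url):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, url: str) -> bool:
        # False means "definitely not cached"; True means "probably cached".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(url))


peer_digest = CacheDigest()
peer_digest.add("http://example.com/a.html")
peer_digest.add("http://example.com/b.css")
```

A cache that has downloaded `peer_digest` can call `might_contain(url)` before contacting the peer. The trade-off is that a digest can report false positives and goes stale between exchanges, so a "probably cached" answer still has to tolerate a miss at the peer.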

  11. Intercache Communication • Protocols (cont.) • WCCP (Web Cache Communication Protocol) [Cisco]: Enables transparent redirection of HTTP traffic to a Cisco Cache Engine • CARP (Cache Array Routing Protocol) [Microsoft]: Uses hashing to determine which proxy holds the requested object

  12. Other Design Issues • Hash-based request routing • Maps a key (such as the URL) to a cache within a cluster • Reduces (or eliminates) the need for caches to query each other • Optimized disk I/O • Improves spatial locality • Uses in-memory data structures to avoid disk I/O • Microkernel OS • Designed to optimize cache performance (resource allocation, task execution, disk access, and transfer time)
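
Hash-based request routing can be illustrated with highest-random-weight (rendezvous) hashing, which belongs to the same family of schemes as CARP. The sketch below is an assumption-laden example: the function name `pick_cache`, the cache names, and the use of SHA-256 are all made up for illustration and do not reproduce CARP's actual hash function or its load-factor weighting.

```python
import hashlib
from typing import List


def pick_cache(url: str, caches: List[str]) -> str:
    """Map a URL to one cache in the cluster without any inter-cache queries."""

    def score(cache: str) -> int:
        # Combine cache name and URL so every (cache, url) pair gets its own score.
        digest = hashlib.sha256(f"{cache}|{url}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # Every client computes the same scores, so all agree on the winning cache.
    return max(caches, key=score)


cluster = ["cache-1", "cache-2", "cache-3"]
owner = pick_cache("http://example.com/logo.png", cluster)
```

Because the owner is simply the highest-scoring member, removing any other cache from `cluster` leaves this URL's owner unchanged; only the URLs owned by the removed cache are remapped.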

  13. More Design Issues • Content prefetching • Local based • Server-hint based • Implementation: • Between clients and servers • Between clients and proxies • Between proxies and servers • Improvements: • Lower latency (improvements ranging from 26% to 57%) • Improved access time

  14. More Design Issues • Cache coherency (consistency) • Strong consistency techniques: • Client polling (If-Modified-Since) • Invalidation callbacks • Weak consistency techniques: • TTL and adaptive TTL • Piggyback invalidation
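
As an illustration of adaptive TTL: a common heuristic sets an object's time to live to a fraction of its age when it was fetched, on the reasoning that documents that have not changed for a long time are unlikely to change soon. The sketch below assumes that heuristic; the function names, the default factor, and the clamping bounds are hypothetical values picked for the example.

```python
def adaptive_ttl(last_modified: float, fetched_at: float, factor: float = 0.1,
                 min_ttl: float = 60.0, max_ttl: float = 86400.0) -> float:
    """TTL in seconds, proportional to the object's age at fetch time."""
    age = max(0.0, fetched_at - last_modified)
    # Clamp so very new objects still get a short TTL and very old ones expire eventually.
    return min(max_ttl, max(min_ttl, factor * age))


def is_fresh(now: float, fetched_at: float, last_modified: float,
             factor: float = 0.1) -> bool:
    """True while the cached copy may be served without contacting the server."""
    return (now - fetched_at) < adaptive_ttl(last_modified, fetched_at, factor)
```

When `is_fresh` returns false, the cache would fall back to revalidation, for example a conditional request carrying `If-Modified-Since`, which is the client-polling technique listed above.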

  15. Caching Architectures • Proxy Caching • Deployed at the edges of the network • Unavailable cache => unavailable network • Single point of failure • Browsers must be manually reconfigured when the cache fails

  16. Caching Architectures • Reverse Proxy Caching • Places proxies near the content provider • Transparent Caching • Eliminates the need to manually configure Web browsers

  17. Caching Architectures • Adaptive Web Caching • Uses distributed cache meshes to solve the hot spot problem • Caches dynamically join and leave groups based on content demand • Administrative boundaries must be relaxed • Push Caching • Keeps data close to the clients requesting it • Assumption: we are able to launch caches that may cross administrative boundaries • Incurs cost (storage and transmission)

  18. Caching Architectures • Active Caching • Applies caching to dynamic documents • 30% of client HTTP requests contain cookies • The server provides the cache with the objects and any associated cache applets

  19. Load Balancing and Content-Based Routing • Main goal: Improving performance and scalability • Local load balancing: Incoming requests are intercepted and redirected to one member of a group of servers or caches, all of which are in the same geographic area. This can be achieved with L4 or L5 switches. • Global load balancing: Instead of distributing requests among members of a local group, requests are distributed to servers or caches near the client, to achieve lower network latency. • Examples: Cisco’s WCCP, Distributed Director, DFP, etc.
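
The difference between the two levels can be sketched as two selection functions. The slide does not name a selection policy, so the choices here are assumptions: least active connections within a local group, and lowest measured round-trip time for picking a site. All names and numbers are hypothetical.

```python
from typing import Dict


def pick_local(active_connections: Dict[str, int]) -> str:
    """Local balancing: choose the least-loaded member of one group.

    Ties are broken by name so the choice is deterministic.
    """
    return min(active_connections, key=lambda server: (active_connections[server], server))


def pick_global(client_rtt_ms: Dict[str, float],
                sites: Dict[str, Dict[str, int]]) -> str:
    """Global balancing: choose the site nearest the client, then balance within it."""
    nearest_site = min(client_rtt_ms, key=lambda site: client_rtt_ms[site])
    return pick_local(sites[nearest_site])


sites = {
    "us-west": {"w1": 12, "w2": 7},
    "eu-central": {"e1": 3, "e2": 3},
}
client_rtt_ms = {"us-west": 140.0, "eu-central": 25.0}
```

In this example a client that is 25 ms from `eu-central` is sent there even though the decision ignores load at the other site; a real global balancer would typically weigh proximity against load.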

  20. Conclusions • In this paper, the main web caching techniques, design issues, and architectures are discussed. • The paper doesn’t compare any of these techniques in terms of scalability, availability, or performance. • Some major issues were not addressed: • Content security • Handling more complex objects and real-time data
