Modeling and Optimizing Large-Scale Wide-Area Data Transfers Raj Kettimuthu, Gayane Vardoyan, Gagan Agrawal, and P. Sadayappan
Exploding data volumes • Astronomy: MACHO et al.: 1 TB; Palomar: 3 TB; 2MASS: 10 TB; GALEX: 30 TB; Sloan: 40 TB; Pan-STARRS: 40,000 TB • Climate: 2004: 36 TB; 2012: 2,300 TB • Genomics: 10^5 increase in data volumes in 6 years • 100,000 TB
Data movement • Datasets must frequently be transported over WAN • Analysis, visualization, archival • Data movement bandwidths not increasing at same rate as dataset sizes • Major constraint for data-driven sciences • File transfer - dominant data transfer mode • GridFTP - widely used by scientific communities • 1000s of servers deployed worldwide move >1 PB per day • Characterize, control and optimize transfers
GridFTP • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area networks • Based on FTP protocol - defines extensions for high-performance operation and security • Globus implementation of GridFTP is widely used. • Globus GridFTP servers support usage statistics collection • Transfer type, size in bytes, start time of the transfer, transfer duration etc. are collected for each transfer
Parallelism vs concurrency in GridFTP • [Figure: a Data Transfer Node at Site A and a Data Transfer Node at Site B, each attached to a Parallel File System. Each node runs three GridFTP server processes (Concurrency = 3), and each pair of server processes is linked by three TCP connections (Parallelism = 3).]
Problem formulation • Objective - control bandwidth allocation for transfer(s) from a source to the destination(s) • Most large transfers between supercomputers • Ability to both store and process large amounts of data • Site heavily loaded, most bandwidth consumed by small number of sites • Goal – develop simple model for GridFTP • Source concurrency (SC) - total number of ongoing transfers between the endpoint A and all its major transfer endpoints • Destination concurrency (DC) - total number of ongoing transfers between the endpoint A and the endpoint B • External load (EL) - all other activities on the endpoints including transfers to other sites
Modeling throughput • Linear models: Y’ = a1*X1 + a2*X2 + … + ak*Xk + b • DT = a1*DC + a2*SC + b1 • DT = a3*DC/SC + b2 • Models that consider only source and destination CC • Separate model for each destination • Data to train, validate models – load variation experiments • Errors >15% for most cases • Log models • log(DT) = a4*log(SC) + a5*log(DC) + b3 • DT = SC^{a4} * DC^{a5} * 2^{b3}
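The log model on this slide can be fitted with ordinary least squares on the base-2 logarithms. The sketch below is a minimal illustration in plain Python, not the authors' code: the function names (`solve_linear`, `fit_log_model`, `predict_log_model`) are made up for this example, and base 2 is inferred from the `2^{b3}` factor.

```python
import math

def solve_linear(A, y):
    """Solve a small square linear system by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_log_model(samples):
    """Least-squares fit of log2(DT) = a4*log2(SC) + a5*log2(DC) + b3.

    samples: iterable of (SC, DC, DT) tuples, all values > 0.
    Returns (a4, a5, b3).
    """
    rows = [(math.log2(sc), math.log2(dc), 1.0, math.log2(dt))
            for sc, dc, dt in samples]
    # Normal equations: (X^T X) coef = X^T y
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    y = [sum(r[i] * r[3] for r in rows) for i in range(3)]
    return tuple(solve_linear(A, y))

def predict_log_model(coef, sc, dc):
    """Evaluate DT = SC^a4 * DC^a5 * 2^b3."""
    a4, a5, b3 = coef
    return sc ** a4 * dc ** a5 * 2 ** b3
```

The fit needs at least three samples whose (SC, DC) pairs are not all collinear in log space. Since the slides call for a separate model per destination, a fit like this would be run once per destination.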
Modeling throughput • Log model better than linear models, still high errors • Model based on just SC and DC too simplistic • Incorporate external load • External load - network, disk, and CPU activities outside transfers • How to measure the external load? • How to include external load in model(s)?
External load • Transfers stable over short duration but vary widely over entire day • Multiple training data – same SC, DC - different days & times • Throughput differences for same SC, DC attributed to difference in external load • Three different functions for external load (EL) • EL1 = T − AT, where T is the throughput of transfer t and AT is the average throughput of all transfers with the same SC, DC as t • EL2 = T − MT, where MT is the max throughput with the same SC, DC as t • EL3 = T/MT
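The three external-load functions can be computed by grouping training transfers on their (SC, DC) pair. A minimal sketch, assuming transfers arrive as `(SC, DC, T)` tuples; the function name and the tuple layout are illustrative, not taken from the slides.

```python
from collections import defaultdict

def external_loads(transfers):
    """Compute EL1, EL2 and EL3 for each transfer.

    transfers: list of (SC, DC, T) tuples, where T is the observed throughput.
    Returns one dict per transfer, in input order.
    """
    groups = defaultdict(list)
    for sc, dc, t in transfers:
        groups[(sc, dc)].append(t)

    result = []
    for sc, dc, t in transfers:
        same = groups[(sc, dc)]
        at = sum(same) / len(same)   # AT: average throughput for this (SC, DC)
        mt = max(same)               # MT: max throughput for this (SC, DC)
        result.append({"EL1": t - at, "EL2": t - mt, "EL3": t / mt})
    return result
```

Note that EL1 can be positive or negative, EL2 is never positive, and EL3 lies in (0, 1] for positive throughputs.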
Models with external load • Linear • DT = a6*DC + a7*SC + a8*EL + b4 • Log • DT = SC^{a9} * DC^{a10} * AEL{a11} * 2^{b5} • AEL{a11} = EL^{a11} if EL > 0, |EL|^{−a11} otherwise
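The log model with external load can be evaluated directly once the piecewise term is read as AEL{a11} = EL^a11 for EL > 0 and |EL|^(−a11) otherwise. A minimal sketch under that reading; the function names are illustrative, and the handling of EL = 0 is an assumption because the slide does not cover that case.

```python
def ael(el, a11):
    """Piecewise external-load term: EL^a11 if EL > 0, |EL|^(-a11) otherwise."""
    if el == 0:
        # Assumption: the slide does not say what happens at EL == 0, and
        # 0 ** (-a11) is undefined for a11 > 0, so this sketch rejects it.
        raise ValueError("EL == 0 is not covered by the model as stated")
    return el ** a11 if el > 0 else abs(el) ** (-a11)

def predict_dt_log(sc, dc, el, a9, a10, a11, b5):
    """Evaluate DT = SC^a9 * DC^a10 * AEL{a11} * 2^b5."""
    return sc ** a9 * dc ** a10 * ael(el, a11) * 2 ** b5
```

With a positive `a11`, a positive load value raises the predicted throughput and a negative one lowers it, which matches EL1 and EL2 being defined as throughput differences.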
Calculating external load in practice • DT = a6*DC + a7*SC + a8*EL + b4 (slide callouts on this equation: Unknown, Given, Control) • Unlike SC and DC, external load is unknown • Multiple data points with same SC, DC used to train models • In practice, may not be any recent transfers with same SC, DC • Some recent transfers, no substantial change in external load over few minutes • Most recent transfer’s load as current load • Average load of transfers in past 30 minutes as current load • Average load in the past 30 minutes with error correction
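The first two estimation methods can be sketched as follows. This assumes the load of a finished transfer is obtained by solving the linear model for EL from its observed DT, SC and DC; the function names and the `(end_time_seconds, load)` history layout are hypothetical.

```python
def load_from_transfer(dt, sc, dc, a6, a7, a8, b4):
    """Solve DT = a6*DC + a7*SC + a8*EL + b4 for EL, given an observed transfer."""
    return (dt - a6 * dc - a7 * sc - b4) / a8

def previous_transfer_load(history):
    """Use the most recent transfer's load as the current load.

    history: non-empty list of (end_time_seconds, external_load) tuples.
    """
    if not history:
        raise ValueError("no past transfers available")
    return max(history, key=lambda h: h[0])[1]

def recent_transfers_load(history, now, window_s=30 * 60):
    """Average the load of transfers that ended within the past 30 minutes.

    Returns None when no transfer falls inside the window.
    """
    recent = [el for t, el in history if 0 <= now - t <= window_s]
    if not recent:
        return None
    return sum(recent) / len(recent)
```

Both estimators rely on the slide's observation that external load does not change substantially over a few minutes.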
Recent transfers load with error correction • Previous Transfer Method: DT = a6*DC + a7*SC + a8*EL + b4 (slide callouts: Compute, Known) • Recent Transfers Method: transfers in past 30 minutes • Recent Transfers with Error Correction: DT = a6*DC + a7*SC + a8*EL + b4 + e, using historic transfers
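The slide adds a correction term `e` derived from historic transfers but does not say how it is computed. The sketch below takes `e` to be the mean residual (observed minus predicted throughput) over historic transfers; that choice is an assumption made purely for illustration, and all names are hypothetical.

```python
def predict_linear(sc, dc, el, coef):
    """Evaluate DT = a6*DC + a7*SC + a8*EL + b4 with coef = (a6, a7, a8, b4)."""
    a6, a7, a8, b4 = coef
    return a6 * dc + a7 * sc + a8 * el + b4

def mean_residual(historic, coef):
    """Assumed error term e: mean of (observed - predicted) over historic transfers.

    historic: list of (SC, DC, EL_estimate, observed_DT) tuples.
    """
    if not historic:
        return 0.0
    residuals = [obs - predict_linear(sc, dc, el, coef)
                 for sc, dc, el, obs in historic]
    return sum(residuals) / len(residuals)

def predict_with_correction(sc, dc, el, coef, e):
    """Evaluate DT = a6*DC + a7*SC + a8*EL + b4 + e."""
    return predict_linear(sc, dc, el, coef) + e
```

A correction of this kind compensates for a systematic bias in the load estimate, for example when the 30-minute average consistently under- or over-states the load seen by new transfers.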
Applying models to control bandwidth • Experimental setup: DTNs at 5 XSEDE sites (Source: TACC, Destinations: PSC, NCAR, NICS, Indiana, SDSC) • Goal – control bandwidth allocation to destinations when source is saturated • Models express throughput in terms of SC, DC, and EL • Given target throughput, determine DC to achieve target • Often more than one destination transfers data, so SC is also unknown • Limit DC to 20 to narrow search space • Even then, large number of possible DC combinations (20^n) • Heuristics to limit search space to (SCmax – ND + 1)
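To see why the search space grows exponentially with the number of destinations, consider a brute-force search over all DC combinations. This is not the authors' heuristic, which the slide does not spell out; it is an exhaustive baseline under the assumption that SC equals the sum of the per-destination concurrencies. The function name and the model callables are hypothetical.

```python
import itertools

def pick_concurrencies(models, targets, max_dc=20):
    """Exhaustively search DC combinations for the one closest to the targets.

    models:  one callable f(sc, dc) -> predicted throughput per destination.
    targets: target throughput per destination, same order as models.
    Assumes SC is the sum of all destination concurrencies.
    Returns (best_dcs, total_relative_error).
    """
    best, best_err = None, float("inf")
    # max_dc ** len(models) candidates: this is the blow-up the slide refers to.
    for dcs in itertools.product(range(1, max_dc + 1), repeat=len(models)):
        sc = sum(dcs)
        err = sum(abs(m(sc, dc) - t) / t
                  for m, dc, t in zip(models, dcs, targets))
        if err < best_err:
            best, best_err = dcs, err
    return best, best_err
```

With five destinations and `max_dc=20` this loop visits 3.2 million combinations, which is why a heuristic that prunes the search is attractive.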
Experiments • Ratio experiments – allocate available bandwidth at source to destinations using predefined ratio • Achieve specific fraction of bandwidth for each destination • Four ratio combinations • Factoring experiments – increase destination’s throughput by a factor when source is saturated • Bandwidth increase because of certain priorities • Four models/methods (log EL1/EL3 models and RT/RTEC methods) were used • Effective in predicting the throughputs • 83.6% of the errors are below 15%, and 65.5% of them are below 10%
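Figures such as "83.6% of the errors are below 15%" can be reproduced from predicted and observed throughputs. A minimal sketch, assuming the error is the absolute difference relative to the observed throughput, expressed as a percentage; the slides do not state the exact definition, and the function name is illustrative.

```python
def fraction_below(predicted, observed, threshold_pct):
    """Fraction of predictions whose percentage error is below threshold_pct.

    Percentage error is assumed to be |predicted - observed| / observed * 100.
    """
    errors = [abs(p - o) / o * 100.0 for p, o in zip(predicted, observed)]
    return sum(e < threshold_pct for e in errors) / len(errors)
```

For example, calling it with thresholds of 15 and 10 on the same data yields the two fractions reported on the slide.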
Results – Ratio experiments • Ratios are 4:5:6:8:9 for Kraken, Mason, Blacklight, Gordon, and Yellowstone. Concurrencies picked by the algorithm were {1,3,3,1,1}. Model: log with EL1. Method: RTEC • Ratios are 4:5:6:8:9 for Kraken, Mason, Blacklight, Gordon, and Yellowstone. Concurrencies picked by the algorithm were {1,4,3,1,1}. Model: log with EL3. Method: RT
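A ratio such as 4:5:6:8:9 becomes per-destination throughput targets by normalizing by the sum of the ratio terms. The sketch below shows the arithmetic; the total bandwidth value in the usage note is made up for illustration and does not come from the experiments.

```python
def targets_from_ratio(ratio, total_bandwidth):
    """Split total_bandwidth across destinations in proportion to ratio.

    ratio: dict mapping destination name -> ratio term.
    Returns a dict mapping destination name -> target throughput.
    """
    total = sum(ratio.values())
    return {name: total_bandwidth * part / total for name, part in ratio.items()}
```

With the slide's ratio, the terms sum to 32, so Kraken's share is 4/32 and Yellowstone's is 9/32. For a hypothetical source bandwidth of 8000 units, `targets_from_ratio({"Kraken": 4, "Mason": 5, "Blacklight": 6, "Gordon": 8, "Yellowstone": 9}, 8000)` gives targets of 1000, 1250, 1500, 2000 and 2250.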
Results – Factoring experiments • Increasing Gordon’s baseline throughput by 2x. Concurrency picked by the algorithm for Gordon was 5 • Increasing Yellowstone’s baseline throughput by 1.5x. Concurrency picked by the algorithm for Yellowstone was 3
Related work • Several models for predicting behavior & finding optimal parallel TCP streams • Uncongested networks, simulations • Several studies developed models to find optimal streams, TCP buffer size for GridFTP • Buffer size not needed with TCP autotuning • Major difference - attempt to model GridFTP throughput based on end-to-end behavior • End-system load, destinations’ capabilities, concurrent transfers • Many studies on bandwidth allocation at router • Our focus is application-level control
Summary • Understand performance of WAN transfers • Control bandwidth allocation at FTP level • Transfers between major supercomputing centers • Concurrency more powerful than parallelism • Models to help control bandwidth allocation • Log models that combine total source CC, destination CC, and a measure of external load are effective • Methods that utilize both recent and historical experimental data are better at estimating external load