Evolution of Telecom Software: Perspectives from a Software Engineer
Dr. Jey Veerasamy
Background: Education
• BE Electronics and Communication Engineering, Anna University, India, 1986-90
• MS Computer Science @ UTD, 1991-94
• PhD Computer Science @ UTD, 1994-99
  • Dissertation: Graph algorithms (improved approximation algorithms for tour problems)
Background: Software Engineer
• Mobile Switching Center (MTX) software, BNR/Nortel, 1994-97
  • Developed features on MTX platform
• Base Station & Base Station Controller software, Samsung Telecom America (STA), 1997-2010
  • Limited development, worked more on requirements & post-deployment support
  • Performance trending, troubleshooting & optimization
Wireless network: Block Diagram (figure; source: Wiki)
Snapshot: 1991
• Cost of a one-minute US-to-India phone call?
  • $2.20
• Cost of a one-minute Dallas-to-SFO phone call?
  • Anywhere from $0.25 to $0.60
• Long distance carrier business was great!
Snapshot: 1994
• Telecom companies were doing very well.
• Focus on features, capacity & reliability
• New employees: 6-month honeymoon period
• All UTD CS/EE graduates:
  • First stop: Nortel
  • 2nd stop: Ericsson
  • 3rd: time to think
1994: Software Development Environment
• Waterfall model
• Documentation heavy
• Reviews can be brutal or boring
• Weekly load-build was a big deal
• Proprietary real-time operating systems, HW & programming languages, even homegrown source code configuration control software!
  • Why? Limited processing power, desire to exercise full control, concerns over reliability & source code leaks …
• Reluctance to try new tools/environments
1994: Telecom Software Engineer
• Concerned about marketability of skills, but not worried about job security
• Typical work week:
  • <50% spent on design work
  • ~30% spent on learning standards
  • ~20% spent on testing
• Expensive & complex lab equipment:
  • 4 hours in setup & 2 hours in testing
• Who knew the acronyms?
Concepts: BHCA capacity
• BHCA: Busy Hour Call Attempts
• For 1 million BHCA, the central processor should spend < 2.5 milliseconds per call (assuming 70% load)
• Managing BHCA is a “system engineering” activity, done in every software release.
• Per-call measurements & optimization
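The 2.5 millisecond figure follows from dividing the usable part of the busy hour by the number of call attempts. A minimal sketch of that arithmetic, assuming the hypothetical helper name `per_call_budget_ms` (not from the slides):

```c
#include <assert.h>

/* Per-call CPU budget in milliseconds for a processor that must absorb
 * `bhca` call attempts during the busy hour while staying at or below
 * `target_load` (a fraction of total CPU time, e.g. 0.70).
 * The function name and signature are illustrative assumptions. */
double per_call_budget_ms(double bhca, double target_load)
{
    assert(bhca > 0.0 && target_load > 0.0 && target_load <= 1.0);
    const double busy_hour_ms = 3600.0 * 1000.0;  /* one hour in milliseconds */
    return busy_hour_ms * target_load / bhca;
}
```

For 1 million BHCA at 70% load this gives 3,600,000 ms × 0.70 / 1,000,000 = 2.52 ms per call, which matches the slide's rounded limit of about 2.5 ms.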
Capacity issues
• 2 types of nodes:
  • Control nodes: transaction processing. CPU load can vary a lot (>60% load is a concern). Use watchdog timers that automatically reset the node if sustained 100% CPU load is seen.
  • Traffic nodes: actual traffic processing. Can safely operate at 90% CPU load.
• Power of trending
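The watchdog rule for control nodes can be modelled in a few lines. This is a simplified software sketch, not the platform's actual mechanism: the sampling approach, the `load_watchdog` type and the threshold of 5 consecutive samples are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy values; the slides do not give them. */
#define SATURATED_PCT      100  /* load level treated as "100% CPU" */
#define SUSTAINED_SAMPLES  5    /* consecutive samples that count as "sustained" */

typedef struct {
    int saturated_run;  /* number of consecutive saturated samples seen so far */
} load_watchdog;

/* Feed one periodic CPU-load sample (0..100).
 * Returns true when the node should be reset. */
bool watchdog_sample(load_watchdog *wd, int cpu_load_pct)
{
    assert(wd != NULL);
    if (cpu_load_pct >= SATURATED_PCT) {
        wd->saturated_run++;
    } else {
        wd->saturated_run = 0;  /* any dip below saturation restarts the count */
    }
    return wd->saturated_run >= SUSTAINED_SAMPLES;
}
```

A short spike does not trigger a reset because a single sample below 100% clears the counter; only an unbroken run of saturated samples does.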
Handling Overload
• Overload can occur during mega-events or on New Year's Day
• Similar to a “Denial of Service” attack
• Need to shed call requests with minimal effort.
• Goal is to handle as many requests as possible in a reliable manner.
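"Shedding with minimal effort" means the reject path must cost far less than processing the call. One possible sketch is a constant-time admission check in front of the call-setup queue. The `admission_ctl` type, its fields and the queue limit are hypothetical; the slides do not say which shedding policy was used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    unsigned queued;       /* call requests currently waiting or in progress */
    unsigned queue_limit;  /* hypothetical cap chosen by system engineering */
    unsigned accepted;     /* running totals, useful for trending */
    unsigned shed;
} admission_ctl;

/* Decide whether to accept a new call request.
 * The reject path is a single comparison and a counter increment. */
bool admit_call(admission_ctl *ac)
{
    assert(ac != NULL);
    if (ac->queued >= ac->queue_limit) {
        ac->shed++;        /* shed immediately, before any expensive work */
        return false;
    }
    ac->queued++;
    ac->accepted++;
    return true;
}

/* Called when a previously admitted request has been fully handled. */
void call_done(admission_ctl *ac)
{
    assert(ac != NULL);
    if (ac->queued > 0) {
        ac->queued--;
    }
}
```

Requests that are admitted still get full, reliable handling; the node gives up only the excess it could not have served anyway.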
Magic of CDMA
• Single frequency channel operation
• Soft handoff
• Coverage vs. capacity
Concept: Real-time OS
• Traffic processing: once every 20 msec, for each call
  • Load distributed by frame offsets (1.25 msec)
• Control processing
• Diagnostic processing
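A 20 msec frame divided into 1.25 msec steps gives 16 frame offsets, so the per-call work can be staggered instead of all landing at the same instant. The sketch below assigns each new call to the least-populated offset; that assignment policy and all names are assumptions, since the slides only state that load is distributed by frame offsets.

```c
#include <assert.h>
#include <stddef.h>

#define FRAME_MS        20.0
#define OFFSET_STEP_MS  1.25
#define NUM_OFFSETS     16    /* FRAME_MS / OFFSET_STEP_MS */

typedef struct {
    unsigned calls[NUM_OFFSETS];  /* number of active calls on each offset */
} offset_table;

/* Assign a new call to the offset carrying the fewest calls
 * (lowest index wins ties) and return that offset index. */
int assign_frame_offset(offset_table *t)
{
    assert(t != NULL);
    int best = 0;
    for (int i = 1; i < NUM_OFFSETS; i++) {
        if (t->calls[i] < t->calls[best]) {
            best = i;
        }
    }
    t->calls[best]++;
    return best;
}

/* Time within the 20 msec frame at which calls on `offset` are processed. */
double offset_start_ms(int offset)
{
    assert(offset >= 0 && offset < NUM_OFFSETS);
    return offset * OFFSET_STEP_MS;
}
```

With this policy the first 16 calls each get their own offset, and the 17th wraps back to offset 0.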
Magic of “always connected”
• IP was not designed for mobility.
• All IP traffic towards the mobiles is terminated at a specific node in the wireless network. That node takes care of delivering it to the mobile using tunneling protocols.
• Also known as “Mobile IP”.
Redundancy
• Is it for hardware or software?
• Control nodes: Active/Standby redundancy
• Traffic nodes: N+1 redundancy
• Load sharing algorithm?
  • Round-robin or load-balancing
  • Leaky bucket? …
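The load-sharing options named on this slide can be contrasted in a short sketch. All function and type names here are hypothetical, and the slides do not say which scheme was deployed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Round-robin: hand out nodes in rotation, ignoring their current load. */
size_t pick_round_robin(size_t *next, size_t n_nodes)
{
    assert(next != NULL && n_nodes > 0);
    size_t chosen = *next % n_nodes;
    *next = (chosen + 1) % n_nodes;
    return chosen;
}

/* Load balancing: pick the node with the lowest current load
 * (lowest index wins ties). */
size_t pick_least_loaded(const unsigned load[], size_t n_nodes)
{
    assert(load != NULL && n_nodes > 0);
    size_t best = 0;
    for (size_t i = 1; i < n_nodes; i++) {
        if (load[i] < load[best]) {
            best = i;
        }
    }
    return best;
}

/* Leaky bucket: each accepted request raises the level by one, and the
 * level drains at a fixed rate per tick. Requests are refused while full. */
typedef struct {
    unsigned level;
    unsigned capacity;
    unsigned leak_per_tick;
} leaky_bucket;

void bucket_tick(leaky_bucket *b)
{
    assert(b != NULL);
    b->level = (b->level > b->leak_per_tick) ? b->level - b->leak_per_tick : 0;
}

bool bucket_offer(leaky_bucket *b)
{
    assert(b != NULL);
    if (b->level >= b->capacity) {
        return false;
    }
    b->level++;
    return true;
}
```

Round-robin is the cheapest choice but can overload a node that is already busy with long calls; least-loaded selection needs up-to-date load figures; the leaky bucket limits the rate of requests rather than choosing where they go.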
Interesting SW bugs
• Look at the following code:
    if (sector_id= 1) … Send call setup message to ALPHA
• The code was lab tested in the alpha sector.
• What happens when this code is deployed in the field?
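In C, `=` inside the condition is an assignment whose value is 1, so the test is always true. The sketch below reproduces the effect. The `sector` enum values for BETA and GAMMA and both function names are assumptions added for illustration; the slide shows only the `sector_id` check and ALPHA.

```c
#include <assert.h>

/* Hypothetical sector numbering; only ALPHA = 1 is implied by the slide. */
typedef enum { SECTOR_ALPHA = 1, SECTOR_BETA = 2, SECTOR_GAMMA = 3 } sector;

/* Buggy version: `=` assigns 1 to sector_id, and the assigned value (1)
 * is what the `if` tests, so this branch is taken for every caller.
 * The extra parentheses only silence the compiler warning that would
 * otherwise flag this line. */
sector route_buggy(int sector_id)
{
    assert(sector_id >= SECTOR_ALPHA && sector_id <= SECTOR_GAMMA);
    if ((sector_id = 1)) {
        return SECTOR_ALPHA;
    }
    return (sector)sector_id;
}

/* Fixed version: `==` compares instead of assigning. */
sector route_fixed(int sector_id)
{
    assert(sector_id >= SECTOR_ALPHA && sector_id <= SECTOR_GAMMA);
    if (sector_id == 1) {
        return SECTOR_ALPHA;
    }
    return (sector)sector_id;
}
```

A lab test run only in the alpha sector passes with either version, which is why the bug could survive testing. In the field, the buggy version treats every call as an alpha-sector call regardless of where it originated.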
Blocking printf()
• Debug port used for logs
• printf() was used to output messages, since break points cannot be used due to timers
• CDMA works based on GPS time
• Timing drift is not good for soft handoffs → handoff failures
• More time spent in printf() → less time in actual call processing → less capacity
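A common way to avoid stalling call processing on a slow debug port is to queue log text in memory and let a low-priority task write it out later. The slides do not say how the problem was actually resolved, so the ring buffer below, its sizes and its names are assumptions. It is a single-context sketch; a real system would also need interrupt-safe or atomic updates of the indices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define LOG_SLOTS    8    /* hypothetical sizes */
#define LOG_MSG_MAX  64

typedef struct {
    char     msg[LOG_SLOTS][LOG_MSG_MAX];
    size_t   head;     /* next slot to write */
    size_t   tail;     /* next slot to read */
    size_t   count;    /* slots currently in use */
    unsigned dropped;  /* messages discarded because the ring was full */
} log_ring;

/* Called from call processing: bounded work, never waits on the debug port.
 * When the ring is full the message is dropped instead of blocking. */
bool log_put(log_ring *r, const char *text)
{
    assert(r != NULL && text != NULL);
    if (r->count == LOG_SLOTS) {
        r->dropped++;
        return false;
    }
    snprintf(r->msg[r->head], LOG_MSG_MAX, "%s", text);  /* truncates long text */
    r->head = (r->head + 1) % LOG_SLOTS;
    r->count++;
    return true;
}

/* Called from a low-priority task that owns the slow debug port. */
bool log_get(log_ring *r, char *out, size_t out_size)
{
    assert(r != NULL && out != NULL);
    if (r->count == 0 || out_size == 0) {
        return false;
    }
    snprintf(out, out_size, "%s", r->msg[r->tail]);
    r->tail = (r->tail + 1) % LOG_SLOTS;
    r->count--;
    return true;
}
```

The trade-off is explicit: under heavy logging some messages are lost, but call processing keeps its timing.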
Working with limited pipe
• There are two types of messages over the air:
  • Acknowledgement required
  • No ack required
• I changed the neighbor information message type to improve soft handoff success.
• This resulted in more handoff failures, since the messages related to actual handoff processing could not get through.
BSC crashes
• Unexpectedly long messages or spurious content from mobiles cause buffer overruns
• A fixed-size stack was used in the OS, and more local variables were added over time.
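The first crash class is avoided by checking the length of anything received from a mobile before copying it into a fixed-size buffer. A minimal sketch, assuming a hypothetical `msg_buf` type and a 32-byte limit that are not taken from the slides:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MSG_BUF_SIZE 32  /* hypothetical fixed buffer size */

typedef struct {
    unsigned char data[MSG_BUF_SIZE];
    size_t        len;
} msg_buf;

/* Copy a message received from a mobile into a fixed-size buffer.
 * Oversized messages are rejected and the buffer is left untouched,
 * rather than being copied past the end of `data`. */
bool store_mobile_msg(msg_buf *dst, const unsigned char *payload, size_t payload_len)
{
    assert(dst != NULL);
    if (payload == NULL || payload_len > sizeof dst->data) {
        return false;
    }
    memcpy(dst->data, payload, payload_len);
    dst->len = payload_len;
    return true;
}
```

The length reported by the sender is never trusted on its own; it is compared against the size of the destination before any bytes move.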
Software efficiency: Do we really care?
• For most industry projects, the goal is to make the software work & meet the deadline.
• Game console: algorithm takes longer to run → requires higher-end CPU to keep realism → higher price → product fails amid competition
• Web server: algorithm takes longer to run (consider 5 seconds vs. 20 seconds) → tests web users’ patience & requires more web server capacity.
• Daily data crunching: What if it takes >1 day?
Snapshot: 2011
• All long-distance-only carriers disappeared several years ago.
  • Too efficient for our own good
• All-you-can-eat or bucket plans: data usage picking up, carriers struggling to keep up
• “Cost reduction” or efficiency is the goal!
• New interns help out with testing in the lab on day #1
• Continuous fight between quality & deadlines.
2011
• Smart phones generate a lot of data traffic even when the user is sleeping! “Femtocells” are appealing to carriers.
• IP has become an acceptable protocol.
• Real-time Linux is a popular OS used in lots of telecom nodes.
• Management nodes use Sun WS with Java applications & web browser.
• Real-time nodes tend to use C/C++.
• Focus has shifted to applications for smart phones.
Questions? • jeyv@utdallas.edu