
Safety Critical Software Development


Presentation Transcript


  1. Safety Critical Software Development - Suparna Paruthy

  2. Safety Requirements • Customer interaction • Similar products in the same intended market • Competitive intelligence • Professional assistance

  3. Certification Killers • Unclear requirements • Lack of clear evidence of compliance • Not doing research up front • Lack of dedicated resources • Trying to safety-certify too many things • Not accounting for enough resources to document the safety case • Not using a single contact to interface with the assessor • Not being honest about the weaknesses of the proposed system

  4. Project Planning Strategies 1. Determine the project certification scope early: identify which standards your product needs to meet. 2. Determine the feasibility of certification: answer up front whether the product and solution are technically and commercially feasible. 3. Select an independent assessor: find an assessor who has experience with your market segment.

  5. Project Planning Strategies (Contd.) 4. Understand your assessor’s role: the assessor’s job is to assess your product for compliance with the applicable standards and norms. 5. Assessment communication is key: maintain a clear line of communication between your team and the group controlling the standards. 6. Establish a basis of certification: list all of the standards and directives that your product needs to comply with.

  6. Project Planning Strategies (Contd.) 7. Establish a “fit and purpose” for your product: establishing the fit and purpose up front will prevent future headaches. 8. Establish a certification block diagram: generate a hardware block diagram of the system. 9. Establish communication integrity objectives: determine the “residual error” rate objectives for each digital communication path. 10. Identify all interfaces along the certification boundary: generate a boundary “Interface Control Document”.

  7. Project Planning Strategies (Contd.) 11. Identify the key safety defensive strategies: identify and document the defensive strategies used to achieve the program’s safety objectives. 12. Define built-in test (BIT) capability: identify the planned BIT coverage, including initialization, periodic, conditional, and user-initiated tests. 13. Define fault annunciation coverage: with the system and user interface in mind, define which faults are annunciated. 14. Define reliance on and expectations of the operator/user: clearly define any reliance placed on the operator or user to keep the system safe.

  8. Project Planning Strategies (Contd.) 15. Define a plan for developing software to the appropriate integrity level: address compliance with each element of the applicable standard you are certifying to. 16. Define the artifacts to be used as evidence of compliance: list all of the documents and artifacts you plan to produce as part of the system safety case. 17. Plan for labor-intensive analyses: plan on conducting a piece-part FMEA (failure modes and effects analysis).

  9. Project Planning Strategies (Contd.) 18. Create user-level documentation: plan on having a users’ manual. 19. Plan on residual activity: any change to the configuration must be assessed for its impact on the safety certification. 20. Publish a well-defined, documented certification plan.

  10. Faults, Errors, and Failures • Fault: a characteristic of an embedded system that could lead to a system error. • Error: unexpected, erroneous behavior of the system. • Failure: an event in which the system does not perform its intended function.

  11. Availability and Reliability • Availability: a measure of how much of the time the embedded system is running and delivering its expected services. • Reliability: the probability that the embedded system will deliver the requested services at a given point in time. • Systems that are both highly reliable and highly available are said to be dependable.

  12. Fault Handling • Fault avoidance: developing the system in a way that prevents the introduction of software and hardware faults. • Fault tolerance: a layer of software that is able to “intercept” faults that occur in the system. • Fault removal: modifying the state of the system or removing the fault through debugging and testing. • Fault prediction: the ability to predict a fault that may occur in the future and alert maintenance.

  13. Hazard Analysis • FMEA

  14. Hazard Analysis (Contd.) • Fault Tree Analysis

  15. Hazard Analysis (Contd.) • Event Tree Analysis

  16. Risk analysis • A method in which each identified hazard is evaluated more carefully. • First step: an FMEA is used to make certain that the hazard classification is correct. • Rating values: unacceptable, acceptable, or tolerable. • Redundancy

  17. Is this architecture fit for a safety-critical system?

  18. Safety Critical Architectures • “Do-er” / “check-er”

  19. Safety Critical Architectures (Contd.) • Two Processors

  20. Safety Critical Architectures (Contd.) • Voter

  21. Rules of Software Implementation • Restrict all code to very simple control-flow constructs • Give all loops a fixed upper bound • Do not use dynamic memory allocation after initialization • Keep every function short enough to fit on a single sheet of paper • Use a minimum of two assertions per function

  22. Rules of Implementation (Contd.) • Declare all data objects at the smallest possible level of scope • Check the return value of non-void functions, and check the validity of parameters provided by the caller • Limit preprocessor use to the inclusion of header files and simple macro definitions • Restrict the use of pointers • Compile all code from the first day of development
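
A minimal sketch of several of these rules in combination, assuming a hypothetical sensor-reading module (MAX_SENSORS, read_sensor, and read_all_sensors are invented names): fixed loop bounds, at least two assertions per function, parameter checking, and checked return values, with no dynamic allocation.

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    #define MAX_SENSORS 8U                     /* hypothetical fixed upper bound */

    static bool read_sensor( size_t index, int *out_value )
    {
        assert( out_value != NULL );           /* caller-supplied pointer is valid */
        assert( index < MAX_SENSORS );         /* index stays inside the fixed bound */

        *out_value = 0;                        /* placeholder for a real driver read */
        return true;
    }

    bool read_all_sensors( int values[MAX_SENSORS] )
    {
        for ( size_t i = 0U; i < MAX_SENSORS; i++ ) {   /* fixed loop bound */
            if ( !read_sensor( i, &values[i] ) ) {      /* return value is checked */
                return false;
            }
        }
        return true;
    }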

  23. Software Implementation Strategies Have a well-defined, repeatable peer-review process

  24. Using existing safety coding standards • MISRA C: originally 127 guidelines for using C in safety-critical applications • Later updated to 121 required rules and 20 advisory rules

  25. Handle all combinations of input data • Account for every possible combination of input values • Check all external data coming into the system against every value it can take

  26. if ( input_data_byte == 0 ) {
          Movement = STOP;
      }
      else if ( input_data_byte == 1 ) {
          Movement = GO;
      }
      else {
          Movement = STOP;     // Most restrictive case here
          Log_Error( INP_DATA_BYTE_INV, "Unknown Value" );
      }

  27. Specific variable value checking
      if ( relay_status != RELAY_CLOSED ) {
          DO_Allow_Movement();     // Let the vehicle move, everything OK
      }
      else {
          DO_Stop();               // The relay isn't positioned correctly, stop!
      }
      Checking only that the status is not RELAY_CLOSED means any unexpected or corrupted value of relay_status still allows movement; the next slide checks each specific value instead.

  28. Specific variable value checking (Contd.)
      if ( relay_status == RELAY_OPEN ) {
          DO_Allow_Movement();     // Let the vehicle move, everything OK
      }
      else if ( relay_status == RELAY_CLOSED ) {
          DO_Stop();               // It is closed, so we need to stop
      }
      else {                       // This case shouldn't happen
          DO_Stop();               // The relay isn't positioned correctly, stop!
          Log_Error( REL_DATA_BYTE_INV, "Unknown Value" );
      }

  29. Mark safety-critical code sections
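
One possible way to mark such sections, purely as an illustration (the SAFETY_CRITICAL_BEGIN/END markers and the brake example are assumptions, not taken from the slides): bracket each safety-critical region with labelled markers so reviewers and analysis scripts can locate every one of them.

    /* The macros expand to nothing; they exist only to label the source. */
    #define SAFETY_CRITICAL_BEGIN( tag )   /* SAFETY-CRITICAL: tag - START */
    #define SAFETY_CRITICAL_END( tag )     /* SAFETY-CRITICAL: tag - END   */

    void apply_brakes( void )
    {
        SAFETY_CRITICAL_BEGIN( BRAKE_CONTROL )
        /* ... brake actuation logic, reviewed to the highest integrity level ... */
        SAFETY_CRITICAL_END( BRAKE_CONTROL )
    }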

  30. Timing execution checking • Checking that all intended software is able to run in a timely manner • Making sure that all lower-priority tasks are able to run • Making sure that the system clock rate hasn’t slowed
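
A rough sketch of one way to check this (idle_heartbeat, lowest_priority_task, and timing_check are hypothetical names): the lowest-priority task proves it is still getting CPU time by advancing a counter that a periodic higher-priority check inspects.

    #include <stdbool.h>
    #include <stdint.h>

    static volatile uint32_t idle_heartbeat;   /* advanced only by the background task */

    void lowest_priority_task( void )
    {
        idle_heartbeat++;                      /* proves the lowest priority level ran */
    }

    bool timing_check( void )                  /* called from a periodic high-priority task */
    {
        static uint32_t last_seen;
        bool ok = ( idle_heartbeat != last_seen );   /* did the background task run? */
        last_seen = idle_heartbeat;
        return ok;    /* false may mean starvation or a slowed system clock */
    }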

  31. Stale Data • Making sure there is no stale data in the system • Finding a way to delete data once the safety-critical code has generated its output • Using sequence numbers in the case of serial data • Using a CRC or other error check for large blocks of data
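
A sketch of the sequence-number idea for serial data (the speed_msg_t layout and is_fresh function are assumptions): the sender increments a sequence field on every frame, so a repeated value tells the receiver the data is stale.

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct {
        uint8_t  sequence;     /* incremented by the sender for every frame */
        int16_t  speed_cmd;    /* example payload */
        uint16_t crc;          /* integrity check over the frame */
    } speed_msg_t;

    bool is_fresh( const speed_msg_t *msg )
    {
        static uint8_t last_sequence;
        static bool    first_frame = true;

        bool fresh    = first_frame || ( msg->sequence != last_sequence );
        last_sequence = msg->sequence;
        first_frame   = false;
        return fresh;          /* false: same frame seen twice, treat the data as stale */
    }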

  32. Comparison of outputs • Cross-checking the outputs of safety-critical functions • One processor running in parallel with another • For a serial stream of data, comparing the streams produced by each channel (see the sketch after the next slide)

  33. Comparison of outputs (Contd.)
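
A minimal illustration of the cross-check (command_t and cross_check are invented for this sketch): each processor computes the command independently, and any disagreement falls back to the most restrictive output so the caller can log and annunciate the fault.

    typedef enum { CMD_STOP = 0, CMD_GO = 1 } command_t;

    command_t cross_check( command_t local_cmd, command_t partner_cmd )
    {
        if ( local_cmd != partner_cmd ) {
            /* The channels disagree: force the least permissive command. */
            return CMD_STOP;
        }
        return local_cmd;
    }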

  34. Initializing data to least permissive state • Continuously making decisions on whether to allow any state to be more permissive than the least permissive (“safest”) condition • Initializing the code in the safe state
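
A short sketch of the idea (movement_t, interlocks_ok, and update_movement are hypothetical): every safety-related output starts in, and is re-asserted to, its least permissive state, and only becomes more permissive after an explicit check passes.

    typedef enum { MOVEMENT_STOP = 0, MOVEMENT_GO = 1 } movement_t;

    static movement_t movement      = MOVEMENT_STOP;   /* "safest condition" at start-up */
    static int        interlocks_ok = 0;               /* assume NOT ok until proven */

    void update_movement( void )
    {
        movement = MOVEMENT_STOP;      /* re-assert the safe default every cycle */
        if ( interlocks_ok ) {
            movement = MOVEMENT_GO;    /* only now allow the more permissive state */
        }
    }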

  35. Order of execution • Having safety checks that confirm code sections run in the intended order • Using sequence numbers for the tasks • Using semaphores/flags • Running tasks in fixed time frames
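
A small sketch of the sequence-flag approach (step_flags and the three functions are assumptions): each safety-critical step sets its flag only if the previous step already ran, so out-of-order execution is caught before the outputs are driven.

    #include <stdbool.h>
    #include <stdint.h>

    static volatile uint8_t step_flags;   /* bit 0 = inputs read, bit 1 = inputs validated */

    void read_inputs( void )
    {
        step_flags = 0x01U;                                   /* step 1 */
    }

    void validate_inputs( void )
    {
        if ( step_flags == 0x01U ) { step_flags = 0x03U; }    /* step 2, only after step 1 */
    }

    bool sequence_ok( void )              /* called just before driving the outputs */
    {
        bool ok    = ( step_flags == 0x03U );   /* both steps ran, in the right order */
        step_flags = 0U;                        /* re-arm for the next cycle */
        return ok;
    }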

  36. Volatile data checking • Integrity checks for off-board data using a CRC • The most useful parameter of a CRC is its Hamming distance • Checking all safety-critical volatile data • Data that is updated is set to the least permissive state; data that is not updated is checked using a CRC • Checking safety-critical data wherever it is used throughout the code
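
A sketch of CRC-guarding a safety-critical RAM variable (guarded_data_t, guarded_write, and guarded_read are invented; the CRC-16-CCITT routine is a generic bitwise implementation): the CRC is recomputed on every write and verified on every read, falling back to the least permissive value on corruption.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    typedef struct {
        int16_t  speed_limit;   /* safety-critical value held in RAM */
        uint16_t crc;           /* CRC over the value */
    } guarded_data_t;

    static uint16_t crc16_ccitt( const uint8_t *data, size_t len )
    {
        uint16_t crc = 0xFFFFU;
        for ( size_t i = 0U; i < len; i++ ) {
            crc ^= (uint16_t)( (uint16_t)data[i] << 8 );
            for ( int bit = 0; bit < 8; bit++ ) {
                crc = ( crc & 0x8000U ) ? (uint16_t)( ( crc << 1 ) ^ 0x1021U )
                                        : (uint16_t)( crc << 1 );
            }
        }
        return crc;
    }

    void guarded_write( guarded_data_t *d, int16_t value )
    {
        d->speed_limit = value;
        d->crc = crc16_ccitt( (const uint8_t *)&d->speed_limit, sizeof( d->speed_limit ) );
    }

    bool guarded_read( const guarded_data_t *d, int16_t *out )
    {
        if ( crc16_ccitt( (const uint8_t *)&d->speed_limit, sizeof( d->speed_limit ) ) != d->crc ) {
            *out = 0;           /* least permissive value on corruption */
            return false;
        }
        *out = d->speed_limit;
        return true;
    }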

  37. Non-volatile data checking • Calculating a CRC over the program image at build time • Using multiple CRCs for different sections of the code space • Making sure the low-priority task doing the image check actually runs • Checking the integrity of the image at run time
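
A sketch of the run-time image check (the linker symbols, the stored CRC, and image_ok are assumptions; crc16_ccitt is the routine from the previous sketch): the CRC computed over the flash image is compared against the value the build tools stored.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    extern const uint8_t  __flash_image_start[];    /* hypothetical linker symbols */
    extern const uint8_t  __flash_image_end[];
    extern const uint16_t image_crc_from_build;     /* stored by the build tools */

    uint16_t crc16_ccitt( const uint8_t *data, size_t len );   /* from the previous sketch */

    bool image_ok( void )
    {
        size_t len = (size_t)( __flash_image_end - __flash_image_start );
        /* In practice this scan is split into small chunks and run from a
         * low-priority task so that it never starves other work. */
        return crc16_ccitt( __flash_image_start, len ) == image_crc_from_build;
    }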

  38. Make sure the entire system can run • An RTOS can complicate timing checks in a safety-critical system • Safety-critical tasks running under an RTOS need more checking • Checking is easier with a simple scheduler • Using an external timing circuit as an independent reference • Making sure the measured timing is real

  39. Remove “dead” code • Remove any code not currently being called • Put conditional compiles around the block of code:
      #if defined( LOGDEBUG )
          index = 20;
          LOG_Data_Set( *local_data, sizeof( data_set_t ) );
      #endif

  40. Remove “dead” code (Contd.)
      #if defined( LOGDEBUG )
      #if !defined( DEBUG )
          neverCompile
      #else
          index = 20;
          LOG_Data_Set( *local_data, sizeof( data_set_t ) );
      #endif
      #endif
      The undefined symbol neverCompile forces a compile error if LOGDEBUG is ever enabled in a build where DEBUG is not defined, so the logging block cannot leak into a release image.

  41. Fill unused memory • Filling non-volatile unused memory with meaningful data • Filling the memory with instructions that cause the processor to reset
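
A sketch of the RAM half of this idea (the linker symbols and fill value are assumptions; unused flash is normally filled by the build or linker tools, with a value chosen per architecture so a runaway program counter faults or resets): unused RAM is filled at start-up with a recognizable pattern.

    #include <stdint.h>

    extern uint8_t __unused_ram_start[];   /* hypothetical linker symbols */
    extern uint8_t __unused_ram_end[];

    #define FILL_PATTERN 0xA5U             /* hypothetical value, easy to spot in a dump */

    void fill_unused_ram( void )
    {
        for ( uint8_t *p = __unused_ram_start; p < __unused_ram_end; p++ ) {
            *p = FILL_PATTERN;
        }
    }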

  42. Static code analysis • Running a static code analyzer every time the code is compiled • Aiming for zero warnings at the end of the analysis

  43. Aviation - SIFT • Aircraft control computer system • Probability of a life-threatening failure of less than 10⁻¹⁰ per hour over a 10-hour flight • Replicated processors and adaptive voting • Voting mechanism implemented entirely in software • Verified in two stages

  44. Monitor-Actuator Pattern
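
A rough sketch of the pattern (all names and the tolerance are assumptions): the actuator channel drives the output, while an independent monitor compares commanded and measured behaviour and has its own authority to force the safe state.

    #include <stdint.h>

    #define MAX_SPEED_DEVIATION 5          /* hypothetical tolerance */

    void shutdown_to_safe_state( void );   /* hypothetical independent shutdown path */

    void actuator_channel( int16_t commanded_speed )
    {
        /* ... drive the motor toward commanded_speed ... */
        (void)commanded_speed;
    }

    void monitor_channel( int16_t commanded_speed, int16_t measured_speed )
    {
        int16_t deviation = (int16_t)( measured_speed - commanded_speed );
        if ( deviation < 0 ) {
            deviation = (int16_t)-deviation;
        }
        if ( deviation > MAX_SPEED_DEVIATION ) {
            shutdown_to_safe_state();      /* the monitor can stop the system on its own */
        }
    }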

  45. References • J. Bowen, “Safety-critical systems, formal methods and standards”, Software Engineering Journal, vol. 8, no. 4 • G. J. Holzmann, “The power of 10: rules for developing safety-critical code”, Computer, vol. 39, no. 6 • http://link.springer.com/chapter/10.1007/978-3-642-33678-2_30

  46. THANK YOU
