
English Shellcode

16th ACM CCS. English Shellcode. Joshua Mason, Sam Small (Johns Hopkins University). Fabian Monrose (University of North Carolina). Greg MacManus (iSIGHT Partners). Outline: Introduction, On the arms race, Related work, Our approach, Automatic generation, Implementation, Evaluation.


Presentation Transcript


  1. 16th ACM CCS • English Shellcode • Joshua Mason, Sam Small (Johns Hopkins University) • Fabian Monrose (University of North Carolina) • Greg MacManus (iSIGHT Partners)

  2. Outline • Introduction • On the arms race • Related work • Our approach • Automatic generation • Implementation • Evaluation • Advanced Defense Lab

  3. Introduction • Code-injection attacks • Source code for a scripting language • Bytecode • Machine code • The common component • The injected code, or … • shellcode

  4. Misconception • Shellcode is delivered in tandem with the exploitation. • Store shellcode in memory, then exploit • Shellcode takes the form of directly executable machine code. • Polymorphism

  5. Misconception…? • Even polymorphic shellcode is constrained by an essential component: the decoder. • Shellcode is fundamentally different in structure from non-executable payload data. • This paper! • Diagram labels: Decoder, Encoded data

  6. About This Paper • Automatically produces English shellcode • The output is not indistinguishable from authentic English prose, however. • Do you want to analyze it?

  7. On The Arms Race • Shellcode developers are often faced with constraints that limit the range of byte values accepted. • e.g. printable, alphanumeric, MIME • Encoding • Self-modification
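
A byte-value constraint of the kind named on this slide can be sketched as a simple filter. The slide gives only the category names, so the exact byte sets below (`PRINTABLE`, `ALPHANUMERIC`) and the helper `violations` are assumptions for illustration. The sample payload is an ordinary 32-bit Linux `exit(0)` sequence, not the shellcode used in the paper.

```python
import string

# Assumed byte sets; the slide names the categories but not the exact ranges.
PRINTABLE = frozenset(range(0x20, 0x7F))
ALPHANUMERIC = frozenset((string.ascii_letters + string.digits).encode("ascii"))


def violations(payload, allowed):
    """Return (offset, byte) pairs for every byte outside the allowed set."""
    return [(i, b) for i, b in enumerate(payload) if b not in allowed]


# xor %ebx,%ebx; xor %eax,%eax; inc %eax; int $0x80  (exit(0) on 32-bit Linux)
payload = bytes([0x31, 0xDB, 0x31, 0xC0, 0x40, 0xCD, 0x80])

bad_printable = violations(payload, PRINTABLE)
bad_alnum = violations(payload, ALPHANUMERIC)
```

Most of this payload falls outside both sets, which is why such constraints push shellcode developers toward encoding and self-modification.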

  8. On The Arms Race • Much literature describing code-injection attacks assumes a standard attack template. • A NOP sled, shellcode, and one or more pointers • Emulation and static analysis have been successful in identifying some failings of advanced shellcode. • But … overhead

  9. On The Arms Race • It has been suggested that malicious polymorphic behavior cannot be modeled effectively. • “On the Infeasibility of Modeling Polymorphic Shellcode” • By Y. Song et al.

  10. Related Work • Limiting the spoils of exploitation and preventing developers from writing vulnerable code • Preventing the execution of injected code • Content-based input validation • Polymorphism • Identifying self-decrypting shellcode • But … non-self-contained polymorphic shellcode

  11. Our Approach • Shellcode is simply an ordered list of machine instructions. • “Shake Shake Shake!” • push %ebx; push “ake ”; push %ebx; push “ake ”; push %ebx; push “ake!”; • But what about add, mov, call? • To develop an automated approach • Arbitrary shellcode to English representation
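
The “Shake Shake Shake!” reading can be checked with a tiny decoder. This is a minimal sketch that handles only the two instruction forms the slide uses: the one-byte `push` of a register (0x50 to 0x53, ASCII “P” to “S”) and `push imm32` (0x68, ASCII “h”). The function name `disassemble` is hypothetical, and any other byte is rejected rather than decoded.

```python
import struct

# One-byte IA-32 opcodes whose values are ASCII letters.
PUSH_REG = {0x50: "%eax", 0x51: "%ecx", 0x52: "%edx", 0x53: "%ebx"}  # 'P', 'Q', 'R', 'S'
PUSH_IMM32 = 0x68  # 'h', followed by a 4-byte little-endian immediate


def disassemble(text):
    """Decode a byte string made only of push-register and push-imm32 instructions."""
    out = []
    i = 0
    while i < len(text):
        op = text[i]
        if op in PUSH_REG:
            out.append("push " + PUSH_REG[op])
            i += 1
        elif op == PUSH_IMM32 and i + 5 <= len(text):
            (value,) = struct.unpack("<I", text[i + 1:i + 5])
            out.append("push $0x%08x" % value)
            i += 5
        else:
            raise ValueError("byte 0x%02x at offset %d is outside this sketch" % (op, i))
    return out


listing = disassemble(b"Shake Shake Shake!")
```

Each “S” decodes to `push %ebx`, and each “h” consumes the next four characters (“ake ” or “ake!”) as its immediate operand, matching the instruction list on the slide.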

  12. High-level Overview • English shellcode is completely self-contained.

  13. The Decoder • The decoder must be English-compatible • Cannot use many instructions • e.g. loop instructions • Our decoder has the form: • Initialization • Decoder • Encoded payload

  14. The Decoder Principle • Use only English-compatible instructions • Use English-compatible instructions that can produce useful instructions • Favor instructions that have less-constrained ASCII equivalents • push %eax (“P”) > push %ecx (“Q”)
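
One way to see why “P” is less constrained than “Q” is to count which characters may follow each letter in English text, since “Q” is almost always followed by “u”. The sketch below does this over a toy sentence chosen for illustration; the helper `successors` and the sample text are assumptions, not the paper's corpus or method.

```python
from collections import defaultdict


def successors(corpus):
    """Map each character to the set of characters that directly follow it."""
    follow = defaultdict(set)
    for a, b in zip(corpus, corpus[1:]):
        follow[a].add(b)
    return follow


# Toy sample text (an assumption for illustration only).
sample = "Paris, Peter, Plato and Prague. Quebec, Queens and Quito."
follow = successors(sample)

after_p = follow["P"]  # several different continuations
after_q = follow["Q"]  # only "u" in this sample
```

A letter with more possible continuations leaves the text generator more freedom in choosing the bytes that follow the instruction.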

  15. Decoder - Initialization • Overwriting registers and patching some instructions • Using the inc instruction and manipulating the alignment of the stack


  17. Decoder - Unpacking • “and r/m8, r8” (0x20, the ASCII space character) • add • lods (load string from %esi)

  18. Decoder - Decoding • Two pointers: %esi, %edi • Figure labels: “,” and “ ”; “u” and “decode”; “G”


  20. Decoder – Initializing Registers • Using the popa instruction (ASCII character “a”)

  21. Automatic Generation • Taken as-is, the custom decoder will contain common English characters, but will not have the appearance of English text. • Add some instructions between the decoder instructions • Augment a statistical language generation algorithm.

  22. Automatic Generation • The n-gram model has length 5 • The ith instruction in the decoder has level i • A sentence has score i when it completes level i


  24. Using a Beam Search Algorithm • Keep the best m (= 20,000) candidates during the process • For the encoded payload, observe how many target bytes are encoded
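
The level-based scoring and the beam search can be sketched together on a toy problem. Everything here is a simplified stand-in, not the authors' implementation: the "required" decoder bytes are modeled as characters that must appear in order within the generated text, the language model is a bigram lookup rather than a 5-gram model, and the beam width is 3 rather than 20,000. The names `level`, `make_bigram_scorer`, and `beam_search` are hypothetical.

```python
import heapq


def level(text, required):
    """Count how many required characters have been emitted, in order."""
    i = 0
    for ch in text:
        if i < len(required) and ch == required[i]:
            i += 1
    return i


def make_bigram_scorer(training_words):
    """Toy language model: count word pairs that were seen in the training text."""
    seen = set(zip(training_words, training_words[1:]))

    def score(words):
        return sum((a, b) in seen for a, b in zip(words, words[1:]))

    return score


def beam_search(vocab, required, score_fn, beam_width, max_words):
    """Extend candidates one word at a time, keeping only the best beam_width."""
    beam = [[]]
    for _ in range(max_words):
        expanded = [cand + [w] for cand in beam for w in vocab]
        # Rank by level first (progress through the decoder), then by fluency.
        beam = heapq.nlargest(
            beam_width, expanded,
            key=lambda c: (level(" ".join(c), required), score_fn(c)))
        if level(" ".join(beam[0]), required) == len(required):
            break
    return beam[0]


corpus = "the cat sat on the mat".split()
vocab = sorted(set(corpus))
result = beam_search(vocab, "cst", make_bigram_scorer(corpus),
                     beam_width=3, max_words=4)
```

Ranking by level before fluency mirrors the slide's rule that a sentence scores i once it completes level i; pruning to a fixed beam keeps the search tractable at the cost of possibly discarding a candidate that would have scored better later.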

  25. Implementation • The training data • Over 15,000 Wikipedia articles • 27,000 books from Project Gutenberg • The language engine was constructed in Java using the LingPipe API • The scoring engine uses the ptrace API • Executor • Watcher • Takes 12 hours


  27. An Optimized Design • Emulation • Expands 1 instruction into tens of instructions • Monitored direct execution • Maintain 2 machine states • Use 3 separate stacks • Pause on 2 conditions • Encountering a jump • Changing memory • Takes roughly less than 1 hour

  28. Evaluation • Exit(0) • 2054 bytes

  29. Comparison with Spectrum Analysis • Windows Bind DLL Inject
