290 likes | 390 Views
String Analysis for Dynamic Field Access. Esben Andreasen Magnus Madsen. Department of Computer Science Aarhus University. Motivation. Static Analysis of JavaScript type analysis, bug finding or refactoring a key component is Points-To analysis Analysis of JavaScript is a difficult
E N D
String Analysis for Dynamic Field Access Esben Andreasen Magnus Madsen Department of Computer Science Aarhus University
Motivation • Static Analysis of JavaScript • type analysis, bug finding or refactoring • a key component is Points-To analysis • Analysis of JavaScript is a difficult • a flexible object model (prototype-based) • dynamic field access • coercions and eval • non-standard scope rules • ... Today
Static Field Access Receiver Field Name
Dynamic Field Access (DFA) where e is any expression
Dynamic Reads Object Array v = o["p"] v = o["p" + "q"] v = o[???] o length map pop push reduce reduceRight reverse shift slice some sort splice unshift __defineGetter__ __defineSetter__ __lookupGetter__ __lookupSetter__ constructor hasOwnProperty (4 more) concat every filter forEach indexOf join lastIndexOf
Dynamic Writes o["p"] = v o["p" + "q"] = v o[???] = v All fields of o may now point to v!
Dynamic Reads & Writes o[e1][e2] = v e.g. prototype e.g. toString All fields of Object may now point to v!
Spurious Event Handlers var elm = $("#button"); elm.onclick = function() {} elm[???] = function() {} The function is registered as all possible event handlers
Usage in Practice Survey by Richards et al. [1]: • 8.3% of all reads are dynamic • 10.3% of all writes are dynamic Dynamic field access is prevalent in libraries: • jQuery, Mootools and Prototype: 300+ DFAs [1]: Gregor Richards, Sylvain Lebresne, Brian Burg, and Jan Vitek. An Analysis of the Dynamic Behavior of JavaScript Programs. In PLDI, 2010.
Proposed Solution What we need: A way to distinguish strings flowing into dynamic field accesses. Solution: A set of light-weight lattices. • focus on concatenate, equaland join • compact and efficient • ideally O(1)time and space
(Simple) Lattices • Constant String (CS) single concrete string, e.g. "push". • String Set (SS) set of k strings, e.g. "pop"and "push". • Length Interval (LI) min. and max. length, e.g. the interval [3; 4] for the strings "pop" and "push".
Character Inclusion (CI) Sets of characters which may and mustoccur. Example for the strings "pop" and "push":
Prefixes and Suffixes • Prefix-Suffix Characters (PS) first and last character, e.g. for the strings "pop" and "push": and • Prefix Suffix Inclusion (PSI) may and must sets of characters for the first and last character. (like the character inclusion lattice.)
Index Predicate A boolean valued predicate which may/must hold for each of the first string indices. Examples: • isUppercase (useful for e.g. ”lastIndexOf”) • isUnderscore (useful for e.g. ”__defineGetter__”) • isDigit (useful for array indices)
Index Predicate: Concatenation + 1 0 0 1 0 1 1 0 0 0 1 0 0 1 1 1 0 0 0 may-case 1 0 0 1 1 1 1 0 0 0 1 0 0 1 0 1 1 0 0 0
String Hash • Pick a distributive hash function : where is some universe of a fixed size . • Distributive: • The lattice is the powerset lattice of . • Length Hash (LH) • Hashes the length instead of the string itself.
String Hash: Example foo = (the quick brown fox) 4 33 29 40 13 • String Constant • Prefix Suffix • Character Inclusion • Index Predicate
String Hash: Concatenation Example: Let and Assume a universe of size then: Easy to compute in iterations. Paper has solution in time.
Overview • The Hlattice is the (reduced) product of • the string set lattice (SS) • the character inclusion lattice (CI) • the string hash lattice (SH)
Evaluation Q1: How precise are the lattices, independent of any particular analysis, for reasoning about strings used in DFAs? Q2: To what degree does a more precise string lattice, for DFAs, improve overall precision and performance of a static analysis?
Evaluation: Dynamic Analysis We perform a recordand replay of several popular JavaScript libraries: Record: • The history of every string flowing to a DFA. • The field names of every receiver object at a DFA. Replay: • For every DFA merge the histories of strings and determine for each lattice if it has a false positive. (example next slide)
Dynamic Analysis: Example e = (c ? "a": "b") + "x" o = {"ax": 1, "bx": 2} v = o[e]; + What fields are read by evaluating o[e] in the abstract? join "x" "a" "b"
Evaluation: Dynamic Analysis DFAs with zero false positives Const. Str. Insufficient Hybrid Prefix/Suffic Incl.
Evaluation: Static Analysis Flow-sensitive dataflow analysis for JavaScript • inter-procedural, context-insensitive • instantiated with the constant and hybrid lattices Benchmarks • Mozilla Sunspider + Google Octane • Various GitHub projects
Summary • Dynamic field access is common in JavaScript. • Simple constant propagation is insufficient for reasoning about dynamic field accesses. • The proposed hybridlatticeimproves precision and performance for 7 out of 10 benchmark programs. Thank You!
Arrays var a = [1, 2, 3]; var i = 0; while (...) { var x = a[i++]; }
String Patterns "a|b|c|d".split("|")
Number- & Type String Lattices • Number String (N) A powerset lattice of the strings: {Infinity, -Infinity, NaN, 0, 1, ...} • Type String (T) A powerset lattice of the strings: {Boolean, Function, Object, String, Undefined}