Images, Alternative Text, and Artificial Intelligence
Agenda
• About Us
• About Me
• The Project
• What’s Next
http://amp.ssbbartgroup.com/public/research/Automatic_Image_Classification_090707.doc
http://amp.ssbbartgroup.com/public/research/SSB_BART_Group_Image_Alt_CSUN_2008.ppt
Corporate Overview
History
• Founded in 1997 by engineers with disabilities
• 750 commercial and government customers
• 1,500 enterprise projects successfully completed
• Pioneers of commercial accessibility validation tools
Approach
• Data driven and scalable
• Violation profiling across 5.5M human validated accessibility issues
Scalable Solutions
• One to one million developers
• One to one thousand production systems
• Fifty percent staffing mix of individuals with disabilities
• Appropriately mixed automated, human and code level validation
Supported Platforms
Web
• HTML
• XML
• JavaScript
• CSS
• AJAX
• Adobe Flash and Flex
• Adobe Acrobat Documents
• Streaming Audio and Video
Compiled Software
• JFC and SWT Java Applications
• .Net Applications
• MFC Windows Native Applications
• Macintosh Applications
• BMC Remedy Applications
Standalone Systems
• Telecommunications Hardware
• IVR Systems
• Agent Systems
• Digital Imaging
Industry Solutions
Public Sector
• Federal Solutions
  • United States
  • European Union
• Education
  • K-12
  • Universities
• State and Local Government
• System Integrators
Healthcare
• Primary Care Providers
• Insurance
Information Technology
• Manufacturers
  • Software
  • Hardware
• Web Based Service Providers
Mass Transit
Financial Services
• Consumer Banking
• Insurance
Legal
Web Based Service Providers
Accessibility Management Platform
AMP – SSB’s web based platform for managing all aspects of the accessibility process
Benefits
• Single point for tracking compliance over time
• Scalable solutions from one to one million developers across multiple domestic markets
• Support for all aspects of a successful accessibility initiative
About Me
General Story
• Founder and Managing Director of SSB BART Group
• Also known as President and CEO
• Professional web site developer for 13 years
• Started in 1994 at the dawn of the Web
• BS Computer Science, Leland Stanford Junior University (AKA Stanford)
• Odds on Brad Pitt to play me in the movie
Accessibility Work
• Involved in Web accessibility activities, validation and education since 1999
• Architected and developed first commercial accessibility testing and fixing tool
  • InSight and InFocus 1.x -> 4.x
  • Initial release in mid-200
  • Next release in a few months
• Architected and developed Accessibility Management Platform (AMP)
  • Current Version – 2008 R1
• Personal work with fifty enterprise class software vendors
Project Overview
Project Description
• Create a decision tree to classify images into one of eight types
• Image types are organized by alternative text requirements
• Upon classification, alternative text validity can then be tested via straightforward heuristics
Project Utility
• Alternative text provides a textual description of an image
• Alternative text validity
  • Ensures access to content for people with disabilities
  • Allows pages to be adapted effectively - low resolution, alternative browsers
  • Increases search engine relevance for pages
• Bottom Line – Good alternative text is good for society and good for profits
Automated Testing Tools
A brief note on automated testing tools
• The first generation of automated testing tools, which is where we are now, can test about 25% of requirements accurately
• Another 25% can be tested with so-so accuracy
• The rest need to be checked manually
• We think the next generation of tools can double this efficacy through better AI, more complex page models and better leveraging of human judgment…
• …but ultimately tools can only facilitate the process of human review; they cannot replace it
Image Types
• Layout Element – The image is used solely to lay out elements on the page
• Decorative Picture – The image is a picture that is used solely to make the page more visually appealing and provides no information
• Text – The image is used to stylize text on the page but is not used as an active element on the page
• Picture – The image is a picture that contains information important to the use of the page
• Hidden Link – The image provides a “hidden” link on a page for search engine optimization or screen reader users
• Linked Text – The image is used to stylize text and provide a link to another page
• Skip Link – The image is the root of an inner-document link that provides a means of skipping past page content that is not relevant
• Linked Picture – The image is a picture that provides a link to another page
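Once an image has been assigned one of the eight types above, a per-type check on its alternative text becomes simple. The following is a minimal sketch only. The rules it encodes (empty alternative text for layout and decorative images, non-empty and non-placeholder text for everything else) and the names `EMPTY_ALT_TYPES`, `PLACEHOLDER_ALTS` and `alt_text_is_plausible` are illustrative assumptions, not the heuristics used in the project.

```python
from enum import Enum


class ImageType(Enum):
    """The eight image types listed on the slide."""
    LAYOUT_ELEMENT = "layout element"
    DECORATIVE_PICTURE = "decorative picture"
    TEXT = "text"
    PICTURE = "picture"
    HIDDEN_LINK = "hidden link"
    LINKED_TEXT = "linked text"
    SKIP_LINK = "skip link"
    LINKED_PICTURE = "linked picture"


# Assumption: images that carry no information should have empty alt text.
EMPTY_ALT_TYPES = {ImageType.LAYOUT_ELEMENT, ImageType.DECORATIVE_PICTURE}

# Hypothetical list of values that say nothing about the image.
PLACEHOLDER_ALTS = {"image", "picture", "spacer", "graphic"}

IMAGE_FILE_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")


def alt_text_is_plausible(image_type, alt):
    """Return True if `alt` passes a simple check for `image_type`.

    `alt` is None when the alt attribute is missing altogether.
    """
    if alt is None:
        return False  # a missing attribute fails for every type
    text = alt.strip()
    if image_type in EMPTY_ALT_TYPES:
        return text == ""
    if text == "" or text.lower() in PLACEHOLDER_ALTS:
        return False
    # A bare file name is not a description of the image.
    return not text.lower().endswith(IMAGE_FILE_SUFFIXES)
```

For example, `alt_text_is_plausible(ImageType.LINKED_TEXT, "Contact us")` passes, while the same call with `"logo.gif"` or an empty string does not.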
Project Functionality
Challenge
• No database of relevant image classifications exists
• Subject Matter Experts (SMEs) use experience to determine form of alternative text
• Without a good data set the decision tree isn’t going to decide much
Solution
• Build a spider to crawl sites and gather sample data
• Classify the images using a basic interface
• Store the image classification and additional variables in a database
• Build a decision tree from the database rather than a live site
• Repeat using updated tree
Result
• Created an image database of 1000 images with about an hour of actual data entry
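The storage step above can be sketched with an in-process SQLite database. This is an illustration under assumptions: the table name `image_sample`, its columns and the helper functions are hypothetical, and the slides do not say which database engine or schema the project used.

```python
import sqlite3

# Hypothetical schema: one row per crawled image, holding the measured
# variables and the type assigned by the human classifier.
SCHEMA = """
CREATE TABLE IF NOT EXISTS image_sample (
    id          INTEGER PRIMARY KEY,
    url         TEXT    NOT NULL,
    width       INTEGER NOT NULL,
    height      INTEGER NOT NULL,
    color_depth INTEGER NOT NULL,
    edge_count  INTEGER NOT NULL,
    image_type  TEXT    NOT NULL
)
"""


def open_store(path=":memory:"):
    """Open (or create) the sample database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_sample(conn, url, width, height, color_depth, edge_count, image_type):
    """Store one classified image; the `with` block commits the insert."""
    with conn:
        conn.execute(
            "INSERT INTO image_sample "
            "(url, width, height, color_depth, edge_count, image_type) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (url, width, height, color_depth, edge_count, image_type),
        )


def class_counts(conn):
    """Return {image_type: number of samples}."""
    rows = conn.execute(
        "SELECT image_type, COUNT(*) FROM image_sample GROUP BY image_type"
    )
    return dict(rows.fetchall())


def average_edges_by_type(conn):
    """Return {image_type: mean edge_count}."""
    rows = conn.execute(
        "SELECT image_type, AVG(edge_count) FROM image_sample GROUP BY image_type"
    )
    return dict(rows.fetchall())
```

Building the tree from a table like this, rather than from a live site, keeps the training data fixed between runs.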
Project Functionality
Challenge
• Build the decision tree
• …which became build the decision tree before the end of time
• …which became build the decision tree once and store it for later use
Discussion
• Building the tree is fairly straightforward and involves splitting on variables and analyzing remaining sets
• Implementation uses the decision tree learning algorithm from Russell and Norvig
• More on the tricky parts later
• The “catch” - a lot of the queries involve eliminating groups of images
• SQL doesn’t have good concepts for handling unordered sets of keys so you enumerate out elements for queries…
Project Functionality
Discussion (Continued)
• This results in lots of nasty queries and a fair amount of time to build the tree
• This more or less grows exponentially as you add variables and quanta
Solution
• Build the tree once and persist to disk
• Limit quanta for variables and require minimum information gain
Result
• Creation of the tree takes about forty minutes
• Reading in the tree takes about forty milliseconds
• Resolving against the tree takes about forty nanoseconds
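The build-once, persist, then resolve pattern can be sketched as follows. The nested-dictionary tree layout, the `EXAMPLE_TREE` contents, the attribute name `is_linked` and the use of Python's `pickle` are all assumptions made for illustration. The slides do not describe the on-disk format the project used.

```python
import os
import pickle
import tempfile

# Hypothetical tree: inner nodes are dicts, leaves are image-type strings.
EXAMPLE_TREE = {
    "attribute": "is_linked",
    "branches": {0: "picture", 1: "linked picture"},
    "default": "picture",
}


def save_tree(tree, path):
    """Persist a built tree so the slow build step runs only once."""
    with open(path, "wb") as handle:
        pickle.dump(tree, handle)


def load_tree(path):
    """Read a previously saved tree back into memory (trusted files only)."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def classify(tree, sample):
    """Walk from the root to a leaf and return the leaf's image type."""
    node = tree
    while isinstance(node, dict):
        value = sample.get(node["attribute"])
        # Assumption: unseen values fall back to the node's default label.
        node = node["branches"].get(value, node["default"])
    return node


if __name__ == "__main__":
    tree_path = os.path.join(tempfile.mkdtemp(), "tree.pickle")
    save_tree(EXAMPLE_TREE, tree_path)
    print(classify(load_tree(tree_path), {"is_linked": 1}))
```

Resolving a sample is a handful of dictionary lookups, which is why it is so much cheaper than building the tree.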
Project Functionality
Challenge
• Test the decision tree for accuracy
• Avoid peeking at the data set
Solution
• Always test on new data [Tank!]
• Don’t store the test set so we avoid any temptation to peek
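One way to honor "don't store the test set" is to score the classifier against a stream of freshly labeled samples that is consumed once and then discarded. The sketch below assumes that setup. The function name `accuracy_on_fresh_samples` and the `(features, expected_type)` pair format are hypothetical.

```python
def accuracy_on_fresh_samples(classify, fresh_samples):
    """Score `classify` on an iterable of (features, expected_type) pairs.

    The iterable is consumed exactly once and nothing is retained, so the
    test data cannot leak back into training.
    """
    total = 0
    correct = 0
    for features, expected in fresh_samples:
        total += 1
        if classify(features) == expected:
            correct += 1
    if total == 0:
        raise ValueError("no test samples supplied")
    return correct / total
```

Passing a generator as `fresh_samples` makes the one-pass intent explicit, because a generator cannot be replayed after it is exhausted.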
The Tricky Parts
Information Gain
• Successful classification provides 2.391 bits of information
• Which means, what, exactly?
• Technically – You have enough information to answer 2.391 yes/no questions
• Practically – You can order nodes to split on by information gain
• At each split, choose the node that provides the highest information gain
• Note - The amount of information provided by an attribute will change as you move through the tree
Solution
• Calculate information gain for each split
• This is where the nasty set queries occur
Overfitting
• Observe
  • Permutations of Variable Quanta - 460,800
  • Sample Data Size – 1000
  • 460,800 >> 1000
• Thus the risk of overfitting is significant
Solution
• Require that we gain at least .05 bits to split – otherwise just return the modal value for the remaining set
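The information gain calculation and the minimum-gain stopping rule can be sketched in memory as shown below. This is an illustration in the spirit of textbook decision tree learning, not the project's implementation: a plain list of dictionaries stands in for the SQL queries, and the function names are assumptions. Only the 0.05 bit threshold comes from the slide. As a sanity check on the quoted figure, eight equally likely types would carry log2(8) = 3 bits, so 2.391 bits is consistent with a class distribution that is not uniform.

```python
from collections import Counter
from math import log2

MIN_GAIN_BITS = 0.05  # threshold quoted on the slide


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Bits gained about `target` by splitting `rows` on `attribute`."""
    before = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    after = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return before - after


def modal_value(rows, target):
    """Most common class among the remaining rows."""
    return Counter(row[target] for row in rows).most_common(1)[0][0]


def build_tree(rows, attributes, target, min_gain=MIN_GAIN_BITS):
    """Recursively split on the attribute with the highest gain."""
    labels = {row[target] for row in rows}
    if len(labels) == 1:
        return next(iter(labels))
    if not attributes:
        return modal_value(rows, target)
    # Gains are recomputed at every node because they change down the tree.
    gains = {a: information_gain(rows, a, target) for a in attributes}
    best = max(gains, key=gains.get)
    if gains[best] < min_gain:
        return modal_value(rows, target)  # guard against overfitting
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in {row[best] for row in rows}:
        subset = [row for row in rows if row[best] == value]
        branches[value] = build_tree(subset, remaining, target, min_gain)
    return {"attribute": best, "branches": branches,
            "default": modal_value(rows, target)}
```

With 460,800 possible combinations of quanta and only 1000 samples, most leaves would otherwise be fit to one or two images, which is what the threshold is meant to prevent.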
The Tricky Parts
Variable Quantification
• Strategy
  • Make everything an integer
  • Define ranges for all variables
• Initially picked quanta based on guessed divisions
• These turned out to be wildly inaccurate
Solution
• Picked variable quanta based on image type grouping and average
• SQL AVG and COUNT make this easy
Edge Detection
• Used Sobel edge detection, applied to the images via Java convolution
• Count the number of edges in the image
• Lots of images have edges
Solution
• Count vertical and horizontal edges
• Turns out to be a great proxy for text in the image
• Accuracy goes from 78.23% to 92.63% with this type of edge detection
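Counting vertical and horizontal edges separately can be sketched with the two standard 3x3 Sobel kernels. The project used Java. The pure-Python version below is only an illustration, and the threshold of 128, the rule that assigns a pixel to its stronger direction, and the function names are assumptions.

```python
# Standard 3x3 Sobel kernels.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))    # responds to vertical edges
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))    # responds to horizontal edges


def _apply_kernel(gray, row, col, kernel):
    """Weighted sum of the 3x3 neighbourhood centred on (row, col)."""
    return sum(
        kernel[i][j] * gray[row + i - 1][col + j - 1]
        for i in range(3)
        for j in range(3)
    )


def count_edges(gray, threshold=128):
    """Return (vertical, horizontal) edge-pixel counts.

    `gray` is a list of equal-length rows of 0-255 intensities. Border
    pixels are skipped because they lack a full neighbourhood.
    """
    vertical = 0
    horizontal = 0
    for row in range(1, len(gray) - 1):
        for col in range(1, len(gray[0]) - 1):
            gx = abs(_apply_kernel(gray, row, col, SOBEL_X))
            gy = abs(_apply_kernel(gray, row, col, SOBEL_Y))
            # Assumption: a pixel counts toward its stronger direction only.
            if gx >= threshold and gx > gy:
                vertical += 1
            elif gy >= threshold and gy > gx:
                horizontal += 1
    return vertical, horizontal
```

Rendered text tends to produce many short vertical and horizontal strokes, which may be why these two counts separate text images from photographs better than a single edge total.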
Future Features
Second Order Variables
• First order variables are primary data from images
• Second order variables are derived from one or more primary variables
• Specifically
  • edge_count, color_depth have much more relevance as ratios to size
  • height is more relevant as a ratio to width
Classification Tightening
• Current classifications have some overlap which could be refined out
• Certain classifications evolved over the course of the project and the data set should be updated to reflect the final classification
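The second order variables named above are simple ratios of the first order ones. The sketch below assumes that "size" means pixel area (width times height). The slide does not say so, and file size would be another reading. The function name and the keys of the returned dictionary are hypothetical.

```python
def second_order_variables(width, height, edge_count, color_depth):
    """Derive ratio features from the primary image measurements."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    area = width * height  # assumption: "size" is pixel area
    return {
        "edge_density": edge_count / area,
        "color_depth_per_pixel": color_depth / area,
        "aspect_ratio": height / width,
    }
```

A 100 by 50 image with 250 edge pixels, for instance, has an edge density of 0.05 and an aspect ratio of 0.5, regardless of how large the image is in absolute terms.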
Future Features
Safe Failure
• Better to require alternative text when it is not necessary than to not require it when it is necessary…
• …or is it??
Celebrity Endorsement
• If K-Fed uses it, wouldn’t you?
For More Information
Silicon Valley
• Phone (415) 975-8000
• E-mail sales@ssbbartgroup.com
• Fax (415) 624-2708
• 300 Brannan Street, Suite 608, San Francisco, CA 94107-1876
Washington DC
• Phone (703) 637-8955
• E-mail sales@ssbbartgroup.com
• Fax (703) 734-8381
• 1489 Chain Bridge Road, Suite 204, McLean, VA 22101