Exploring a KMDF Storage Driver • Grigory Lyakhovitskiy • SDE II • Device and Storage Technologies • grigly@microsoft.com • David Burg • Senior SDE Lead • Device and Storage Technologies • daviburg@microsoft.com • Praveen Rao • Senior SDE • Device and Storage Technologies • psrao@microsoft.com
Agenda • Motivation for a KMDF storage driver • Storage driver design with KMDF • Debugging the new storage driver (demo) • Writing high performance drivers in KMDF • Resources
Motivation for using KMDF • KMDF makes storage drivers easier to write than WDM does • I/O cancellation, PnP and power management • Fewer lines of code (LOC) maintained by our team • Support for side-by-side major versions • Support for Windows 2000 onwards
Readiness for ‘re-implementing’ • Know your driver well – how well documented is it? • Interfaces (MSDN), code (WDK sample), device and application workarounds • Test & Partners • How good is your test coverage? (functionality, performance, …) • Do you have a plan with your partners?
So, which Storage Driver? • File system drivers: CDFS.SYS, UDFS.SYS, FASTFAT.SYS • Class drivers: CDROM.SYS, DISK.SYS, TAPE.SYS • Port drivers: USBSTOR.SYS, ATAPORT.SYS, STORPORT.SYS
Design and implement CDROM.SYS with KMDF • David Burg • Senior SDE Lead • Device and Storage Technologies • daviburg@microsoft.com
Driver stack in Vista and Windows 7 • [Figure: driver stack diagram; labels: WDM, WDF, Vista, Windows 7]
IRP Process in Vista and in Windows 7 • Vista & XP – do it all yourself • Windows 7 – focus on value-add
Asynchronous I/Os – watch out for performance • By KMDF when queues are not empty • De-queuing occurs in a worker thread, at Dispatch level • Storage driver updated to perform most requests at Dispatch level, avoiding another context switch • By storage driver when queues are empty • Do as much processing as possible immediately, and create your own completion routine
The KMDF CDROM.SYS ‘adventure’ • Planning in late 2006, implementing in 2007 • Progressive introduction • Large variety of hardware to test (90 vendors, 1614 models …) • Variety of legacy applications to satisfy (burning, A/V playback, game copy protection, drive emulation, …) • The first release vehicle is Windows 7 • Make sure you have adequate ‘Bake Time’
A developer’s dream: Less code, more features • 23k LOC in Vista cdrom.sys & classpnp.sys • No I/O cancellation • No DVD streaming • Primitive timers • 16k LOC in Windows 7 cdrom.sys • Leverage KMDF’s EvtIoCanceledOnQueue callback for both queues, before queuing, and before retrying • DVD streaming added • Use WDF coalescable timers with 500 ms delay tolerance for power saving
KMDF & storage drivers tricky parts • IRPs without FileObject • If you have IRPs without FileObject, set WdfFileObjectCanBeOptional to avoid a KMDF assertion • DeviceEvtFileClose • KMDF will call DeviceEvtFileClose with a NULL file object handle when you set WdfFileObjectCanBeOptional • WdfRequestMarkCancelableEx • For large I/Os broken into smaller I/O requests, use WdfRequestMarkCancelableEx to catch a possible immediate STATUS_CANCELLED • When moving to KMDF, also plan to update your KD extension • !scsikd was updated for KMDF cdrom.sys
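A minimal sketch of the WdfRequestMarkCancelableEx point above, assuming a driver that splits a large transfer into chunks. `EvtSubRequestCancel` and `PrepareNextChunk` are hypothetical names, not taken from cdrom.sys, and the fragment builds only inside a WDK driver project.

```c
#include <ntddk.h>
#include <wdf.h>

// Hypothetical cancel callback; a real one would abort the outstanding chunk.
EVT_WDF_REQUEST_CANCEL EvtSubRequestCancel;

// Hypothetical helper, called before each smaller piece of a large I/O is sent.
BOOLEAN
PrepareNextChunk(
    _In_ WDFREQUEST Request
    )
{
    NTSTATUS status = WdfRequestMarkCancelableEx(Request, EvtSubRequestCancel);

    if (!NT_SUCCESS(status)) {
        // Typically STATUS_CANCELLED: the request was already canceled, so the
        // cancel callback will not run and the driver completes the request.
        WdfRequestComplete(Request, status);
        return FALSE;
    }

    return TRUE;   // still cancelable; safe to send the next chunk
}
```

On the normal completion path the driver would call WdfRequestUnmarkCancelable before completing the request.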
Demo: Debugging KMDF CDROM.SYS with WDK KD extension • Grigory Lyakhovitskiy • SDE II • Device and Storage Technologies • grigly@microsoft.com
Debugging of KMDF drivers • Instrumentation is crucial to problem diagnosability • KMDF has provisions for easy debugging • WDFKD debugger extension is the most powerful tool • Explore driver internals, object states, etc. • Find out the underlying WDM objects • Dump log from the most recent driver events
Summary • KMDF works well for storage drivers • Migration to KMDF is not free • Plan well ahead, analyze dependencies
Material to use back at the office • KMDF cdrom.sys is in Windows 7 WDK! • KMDF homepage on Windows Hardware Developer Central: http://www.microsoft.com/whdc/driver/wdf/KMDF.mspx
Call for action • Ask yourself! • Would my driver benefit from KMDF? • Is my driver ready for KMDF?
Best Practices for a High-performance KMDF Driver • Praveen Rao • Senior SDE • Device and Storage Technologies • psrao@microsoft.com
Outline • Plan for performance • Use appropriate configurations for performance • Minimize object creation • Be cognizant of cost of operations • Use framework-provided mechanisms
Plan for Performance • Performance should be considered at every stage of the development cycle • At design time incorporate • Best practices • Things costly to change later • Incorporate finer-grained optimizations after measurement • Best to have a prototype to do these measurements early • Continuously measure and monitor performance
Plan for Performance (contd.) • Things discussed in this talk are: • Provided as guidelines • Intended to make you aware of the performance consequences of various choices • Not meant to be applied indiscriminately • If an optimization adds complexity, measure performance before applying it
Object - SynchronizationScope • Specified as part of WDF_OBJECT_ATTRIBUTES during object creation • Use WdfSynchronizationScopeNone if object’s callbacks don’t need serialization or if you use your own lock • Use narrowest scope possible. For example: • Use WdfSynchronizationScopeQueue (new in KMDF 1.9) instead of WdfSynchronizationScopeDevice if synchronization is needed only in the queue scope
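A sketch of queue-scoped synchronization, assuming KMDF 1.9 or later so that the scope can be set on the queue's own attributes. `EvtIoRead` and `CreateSerializedQueue` are hypothetical names; the fragment builds only inside a WDK driver project.

```c
#include <ntddk.h>
#include <wdf.h>

// Hypothetical read handler.
EVT_WDF_IO_QUEUE_IO_READ EvtIoRead;

NTSTATUS
CreateSerializedQueue(
    _In_  WDFDEVICE Device,
    _Out_ WDFQUEUE *Queue
    )
{
    WDF_IO_QUEUE_CONFIG   config;
    WDF_OBJECT_ATTRIBUTES attributes;

    WDF_IO_QUEUE_CONFIG_INIT(&config, WdfIoQueueDispatchParallel);
    config.EvtIoRead = EvtIoRead;

    // Serialize only this queue's callbacks rather than every callback
    // on the device (WdfSynchronizationScopeDevice).
    WDF_OBJECT_ATTRIBUTES_INIT(&attributes);
    attributes.SynchronizationScope = WdfSynchronizationScopeQueue;

    return WdfIoQueueCreate(Device, &config, &attributes, Queue);
}
```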
Object - ExecutionLevel • Specified as part of WDF_OBJECT_ATTRIBUTES during object creation • Avoid using WdfExecutionLevelPassive for objects that could cause a work item to be queued for every I/O. For example: • I/O Queue • If passive callback is needed for specific types of I/O, create a separate queue and specify WdfExecutionLevelPassive • Objects that are created and disposed for every single I/O, such as objects parented to request
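One way to confine passive-level callbacks to a dedicated queue might look like the following. `EvtSlowIoctl` and `CreatePassiveIoctlQueue` are hypothetical names, and the choice of a sequential queue is an assumption; the fragment builds only inside a WDK driver project.

```c
#include <ntddk.h>
#include <wdf.h>

// Hypothetical handler for the few IOCTLs that need PASSIVE_LEVEL.
EVT_WDF_IO_QUEUE_IO_DEVICE_CONTROL EvtSlowIoctl;

NTSTATUS
CreatePassiveIoctlQueue(
    _In_  WDFDEVICE Device,
    _Out_ WDFQUEUE *Queue
    )
{
    WDF_IO_QUEUE_CONFIG   config;
    WDF_OBJECT_ATTRIBUTES attributes;

    // Secondary (non-default) queue: only requests forwarded here pay
    // the possible work-item cost of a passive-level callback.
    WDF_IO_QUEUE_CONFIG_INIT(&config, WdfIoQueueDispatchSequential);
    config.EvtIoDeviceControl = EvtSlowIoctl;

    WDF_OBJECT_ATTRIBUTES_INIT(&attributes);
    attributes.ExecutionLevel = WdfExecutionLevelPassive;

    return WdfIoQueueCreate(Device, &config, &attributes, Queue);
}
```

The handler on the main queue can then move the relevant requests across with WdfRequestForwardToIoQueue, leaving the hot path at its default execution level.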
Object – ParentObject • Specified as part of WDF_OBJECT_ATTRIBUTES during object creation • Provides convenient way to manage object lifetime • Use the narrowest scope for parent object • For example, a memory object created for a request should be parented to request and not to device
Object – ParentObject (Contd.) • Not using a narrow scope for ParentObject makes objects unnecessarily accumulate in memory • Eventually they get freed so leak detection mechanisms do not catch this situation • You can use !wdfhandle to dump object hierarchy and see any accumulated objects • Use flag 0x00000010 to dump the hierarchy recursively
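A sketch of parenting a per-request memory object to the request rather than to the device. `AllocatePerRequestBuffer` and the pool tag are hypothetical; the fragment builds only inside a WDK driver project.

```c
#include <ntddk.h>
#include <wdf.h>

#define SKETCH_POOL_TAG 'hctS'   // hypothetical pool tag

NTSTATUS
AllocatePerRequestBuffer(
    _In_  WDFREQUEST Request,
    _In_  size_t     Size,
    _Out_ WDFMEMORY *Memory,
    _Out_ PVOID     *Buffer
    )
{
    WDF_OBJECT_ATTRIBUTES attributes;

    // Parent to the request: the framework deletes the memory object when
    // the request is completed, so nothing accumulates under the device.
    WDF_OBJECT_ATTRIBUTES_INIT(&attributes);
    attributes.ParentObject = Request;

    return WdfMemoryCreate(&attributes,
                           NonPagedPool,
                           SKETCH_POOL_TAG,
                           Size,
                           Memory,
                           Buffer);
}
```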
FileObjectClass • Specified in WDF_FILEOBJECT_CONFIG parameter in WdfDeviceInitSetFileObjectConfig • Specify WdfFileObjectNotRequired if device stack doesn’t require or use file objects • Else specify WdfFileObjectWdfCanUseFsContext or WdfFileObjectWdfCanUseFsContext2 if these are not already used by your device stack • Specifying WdfFileObjectWdfCannotUseFsContexts makes framework perform a linear search to find the framework file object that corresponds to the WDM file object • Use only as the last resort
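A sketch of selecting the file object class during device initialization, assuming the stack leaves FsContext free for the framework. The callback names and `ConfigureFileObjects` are hypothetical; the fragment builds only inside a WDK driver project.

```c
#include <ntddk.h>
#include <wdf.h>

// Hypothetical create/close callbacks.
EVT_WDF_DEVICE_FILE_CREATE EvtDeviceFileCreate;
EVT_WDF_FILE_CLOSE         EvtFileClose;

VOID
ConfigureFileObjects(
    _In_ PWDFDEVICE_INIT DeviceInit
    )
{
    WDF_FILEOBJECT_CONFIG fileConfig;

    WDF_FILEOBJECT_CONFIG_INIT(&fileConfig,
                               EvtDeviceFileCreate,
                               EvtFileClose,
                               WDF_NO_EVENT_CALLBACK);   // no cleanup callback

    // Let the framework keep its handle in FsContext, which avoids the
    // linear search implied by WdfFileObjectWdfCannotUseFsContexts.
    fileConfig.FileObjectClass = WdfFileObjectWdfCanUseFsContext;

    WdfDeviceInitSetFileObjectConfig(DeviceInit,
                                     &fileConfig,
                                     WDF_NO_OBJECT_ATTRIBUTES);
}
```

This must run before WdfDeviceCreate, since it configures the WDFDEVICE_INIT structure.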
I/O Queues • Create different queues for different request types to avoid queue lock contention if several such I/Os are handled concurrently • Create different I/O queues for read, write, IOCTL, and internal IOCTL requests by using WdfIoQueueCreate
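A sketch of one queue per request type, shown for read and write only. The handler names, `CreateQueueFor`, and `CreateIoQueues` are hypothetical; the fragment builds only inside a WDK driver project.

```c
#include <ntddk.h>
#include <wdf.h>

// Hypothetical handlers.
EVT_WDF_IO_QUEUE_IO_READ  EvtIoRead;
EVT_WDF_IO_QUEUE_IO_WRITE EvtIoWrite;

// Create a queue and route one request type to it.
static NTSTATUS
CreateQueueFor(
    _In_ WDFDEVICE            Device,
    _In_ WDF_REQUEST_TYPE     Type,
    _In_ PWDF_IO_QUEUE_CONFIG Config
    )
{
    WDFQUEUE queue;
    NTSTATUS status;

    status = WdfIoQueueCreate(Device, Config, WDF_NO_OBJECT_ATTRIBUTES, &queue);
    if (NT_SUCCESS(status)) {
        status = WdfDeviceConfigureRequestDispatching(Device, queue, Type);
    }
    return status;
}

NTSTATUS
CreateIoQueues(
    _In_ WDFDEVICE Device
    )
{
    WDF_IO_QUEUE_CONFIG config;
    NTSTATUS status;

    WDF_IO_QUEUE_CONFIG_INIT(&config, WdfIoQueueDispatchParallel);
    config.EvtIoRead = EvtIoRead;
    status = CreateQueueFor(Device, WdfRequestTypeRead, &config);
    if (!NT_SUCCESS(status)) {
        return status;
    }

    WDF_IO_QUEUE_CONFIG_INIT(&config, WdfIoQueueDispatchParallel);
    config.EvtIoWrite = EvtIoWrite;
    return CreateQueueFor(Device, WdfRequestTypeWrite, &config);
}
```

IOCTL and internal IOCTL queues would follow the same pattern with WdfRequestTypeDeviceControl and WdfRequestTypeDeviceControlInternal.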
Coalescable Timer • Provide a reasonable tolerance for coalescable timers (new in Windows 7) • Set a reasonable TolerableDelay in the WDF_TIMER_CONFIG structure
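A sketch of a coalescable periodic timer. The 500 ms tolerance echoes the figure quoted earlier in the deck for cdrom.sys; the 2000 ms period, `EvtPollTimer`, and `CreatePollTimer` are assumptions for illustration. The fragment builds only inside a WDK driver project.

```c
#include <ntddk.h>
#include <wdf.h>

// Hypothetical timer callback.
EVT_WDF_TIMER EvtPollTimer;

NTSTATUS
CreatePollTimer(
    _In_  WDFDEVICE Device,
    _Out_ WDFTIMER *Timer
    )
{
    WDF_TIMER_CONFIG      config;
    WDF_OBJECT_ATTRIBUTES attributes;

    // Period of 2000 ms is an assumed value.
    WDF_TIMER_CONFIG_INIT_PERIODIC(&config, EvtPollTimer, 2000);

    // Allow the system to delay expiration by up to 500 ms so it can be
    // batched with other timers (requires KMDF 1.9 on Windows 7).
    config.TolerableDelay = 500;

    WDF_OBJECT_ATTRIBUTES_INIT(&attributes);
    attributes.ParentObject = Device;   // timers need a parent object

    return WdfTimerCreate(&config, &attributes, Timer);
}
```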
Object Creation/Deletion • WDF objects provide good programming model and lifetime management • They also have a performance cost • Object creation/deletion is the primary overhead of WDF over WDM • Hence minimize object creation where possible
Object Reuse • Minimize object creation/deletion overhead by reusing objects • Reuse request objects across I/Os • Children and any associated resources (such as timer used to support timeout) get reused too • Reuse DMA transaction objects across I/Os • Store objects for reuse—although be mindful of memory vs. performance trade-off
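A sketch of recycling a driver-created request instead of deleting it and creating a new one. `RecycleRequest` is a hypothetical helper, and the request is assumed to have come from WdfRequestCreate; the fragment builds only inside a WDK driver project.

```c
#include <ntddk.h>
#include <wdf.h>

// Hypothetical helper, called from the completion routine of a request
// that the driver created itself with WdfRequestCreate.
NTSTATUS
RecycleRequest(
    _In_ WDFREQUEST Request
    )
{
    WDF_REQUEST_REUSE_PARAMS params;

    WDF_REQUEST_REUSE_PARAMS_INIT(&params,
                                  WDF_REQUEST_REUSE_NO_FLAGS,
                                  STATUS_SUCCESS);

    // Resets the request so it can be formatted and sent again; child
    // objects parented to the request stay alive across the reuse.
    return WdfRequestReuse(Request, &params);
}
```

The request still has to be formatted again before the next send.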
Object Hierarchies • Be careful about creating object hierarchies • Adding/removing siblings will contend on the same lock • If any object in the object tree requires Passive cleanup and the tree is deleted at Dispatch level, framework needs to use a work item for tree cleanup • As a corollary, avoid disposing objects at Dispatch level wherever possible
Object Contexts • Avoid using too many object contexts in hot paths • Leads to framework performing a linear search in context list • Consolidate contexts, if possible, instead
Object Reference • Avoid redundant WdfObjectReference/Dereference • Have a good reason to extend the lifetime of an object
Property Get • Avoid calling property Get DDIs repeatedly in hot I/O path • Examples: WdfRequestGetParameters, WdfRequestGetFileObject • Incurs overhead such as: • Going through DDI layer • Handle validation • Store the property in a local variable • If property is needed across threads or functions store in context • Memory vs. performance trade-off, so do this after measurement and do not go overboard
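A sketch of fetching a property once and reusing the local copy. `IsAccessAllowed` and `RecordTransfer` are hypothetical helpers, not KMDF DDIs, and the data transfer itself is elided; the fragment builds only inside a WDK driver project.

```c
#include <ntddk.h>
#include <wdf.h>

// Hypothetical helpers, not part of KMDF.
BOOLEAN IsAccessAllowed(_In_ WDFFILEOBJECT FileObject);
VOID    RecordTransfer(_In_ WDFFILEOBJECT FileObject, _In_ size_t Length);

VOID
EvtIoRead(
    _In_ WDFQUEUE   Queue,
    _In_ WDFREQUEST Request,
    _In_ size_t     Length
    )
{
    // One trip through the DDI layer; later uses read the local variable.
    WDFFILEOBJECT fileObject = WdfRequestGetFileObject(Request);

    UNREFERENCED_PARAMETER(Queue);

    if (!IsAccessAllowed(fileObject)) {
        WdfRequestComplete(Request, STATUS_ACCESS_DENIED);
        return;
    }

    // ... perform the transfer (elided) ...

    RecordTransfer(fileObject, Length);
    WdfRequestCompleteWithInformation(Request, STATUS_SUCCESS, Length);
}
```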
Request Formatting • WdfIoTargetFormatRequest* DDIs provide convenient ways to format a request before sending to next layer • Formatting may incur overhead of preparing common buffer, MDLs, etc. • Reusing requests avoids such overhead in IOCTL case • If performance measurement shows this to be an issue: • Use WdfRequestFormatRequestUsingCurrentType • Modify stack location directly by escaping to WDM • Framework may optimize this in future releases, so please check back
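A sketch of the WdfRequestFormatRequestUsingCurrentType option, assuming a driver that passes a request down unchanged and needs no completion routine. `ForwardRequest` is a hypothetical helper; the fragment builds only inside a WDK driver project.

```c
#include <ntddk.h>
#include <wdf.h>

VOID
ForwardRequest(
    _In_ WDFREQUEST  Request,
    _In_ WDFIOTARGET Target
    )
{
    WDF_REQUEST_SEND_OPTIONS options;

    // Reuse the current request type and parameters for the next
    // driver instead of building a new format from buffers.
    WdfRequestFormatRequestUsingCurrentType(Request);

    WDF_REQUEST_SEND_OPTIONS_INIT(&options,
                                  WDF_REQUEST_SEND_OPTION_SEND_AND_FORGET);

    if (!WdfRequestSend(Request, Target, &options)) {
        // The send failed, so this driver still owns the request.
        WdfRequestComplete(Request, WdfRequestGetStatus(Request));
    }
}
```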
Sending Request with Timeout • One can use WdfRequestSend and specify Timeout in WDF_REQUEST_SEND_OPTIONS to send request with timeout • This option incurs overhead of timer creation if request is sent asynchronously • That is, if WDF_REQUEST_SEND_OPTION_SYNCHRONOUS is not specified • Timer will be reused if request is reused • Try your best to reuse requests if you need to use timeout
Work items • Do not use work items for every single I/O • For example, do not queue a work item from every DpcForIsr • Be careful with blocking waits in work item callbacks • Don’t wait on something that might never be signaled, such as an interrupt or a user-mode application • Work items are a limited resource. Blocking them affects the system adversely; in such cases use your own thread • Avoid dependencies on other work items that run before yours (may lead to deadlock)
Counted Queues • Use counted queues for I/O throttling (new in KMDF 1.9) • Instead of forwarding I/O requests that the driver can’t handle at a given point to a manual queue
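A sketch of a counted queue, assuming KMDF 1.9 or later. The limit of 4, `EvtIoWrite`, and `CreateThrottledQueue` are assumptions for illustration; the fragment builds only inside a WDK driver project.

```c
#include <ntddk.h>
#include <wdf.h>

// Hypothetical write handler.
EVT_WDF_IO_QUEUE_IO_WRITE EvtIoWrite;

NTSTATUS
CreateThrottledQueue(
    _In_  WDFDEVICE Device,
    _Out_ WDFQUEUE *Queue
    )
{
    WDF_IO_QUEUE_CONFIG config;

    WDF_IO_QUEUE_CONFIG_INIT(&config, WdfIoQueueDispatchParallel);
    config.EvtIoWrite = EvtIoWrite;

    // Present at most 4 requests to the driver at a time (assumed limit);
    // the framework holds the rest in the queue, so no manual queue is needed.
    config.Settings.Parallel.NumberOfPresentedRequests = 4;

    return WdfIoQueueCreate(Device, &config, WDF_NO_OBJECT_ATTRIBUTES, Queue);
}
```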
Sending Request with Timeout • To send requests with timeout, avoid creating your own timers • Use WdfRequestSend and specify timeout in WDF_REQUEST_SEND_OPTIONS • Avoids DDI overhead for the timer creation • Timer gets reused if request is reused
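A sketch of sending with a framework-managed timeout rather than a driver-owned timer. The 5-second value and `SendWithTimeout` are assumptions, and the request is assumed to be already formatted with a completion routine set; the fragment builds only inside a WDK driver project.

```c
#include <ntddk.h>
#include <wdf.h>

BOOLEAN
SendWithTimeout(
    _In_ WDFREQUEST  Request,
    _In_ WDFIOTARGET Target
    )
{
    WDF_REQUEST_SEND_OPTIONS options;

    // No SYNCHRONOUS flag: the send is asynchronous, so the framework
    // attaches a timer to the request for the timeout.
    WDF_REQUEST_SEND_OPTIONS_INIT(&options, 0);
    WDF_REQUEST_SEND_OPTIONS_SET_TIMEOUT(&options, WDF_REL_TIMEOUT_IN_SEC(5));

    // FALSE means the request was not sent; the caller can read the
    // reason with WdfRequestGetStatus.
    return WdfRequestSend(Request, Target, &options);
}
```

Combined with WdfRequestReuse, the same timer is carried over to the next send, which is the saving the slide describes.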
Call To Action • Plan for performance early • Use best practices • Measure! Measure! Measure! • Look out for signs of inefficient choices
Additional Resources • Web resources • White papers: http://go.microsoft.com/fwlink/?LinkID=79335 • Channel 9 talk: http://channel9.msdn.com/ShowPost.aspx?PostID=226116 • Blogs • http://blogs.msdn.com/doronh/default.aspx (A Hole In My Head) • http://blogs.msdn.com/peterwie/default.aspx (Pointless Blathering) • http://blogs.msdn.com/iliast/default.aspx (driver writing != bus driving) • Newsgroups and Lists • Microsoft.public.device.development.drivers • OSR NTDev Mailing List • Book: Developing Drivers with the Windows Driver Foundation, http://www.microsoft.com/MSPress/books/10512.aspx