
Pfizer Enterprise Elastic HPC



Presentation Transcript


  1. Pfizer Enterprise Elastic HPC
     Mike Miller, Pfizer Research Business Technology
     May 18th Prism Meeting, Stockholm, Sweden

  2. How do we define HPC?
     • Simply summarized as the computational laboratory
     • Consists of:
       • Desktop/services, integrated with
       • Global high-performance cached file system
       • Centralized large capacity/capability compute resources
     • Used by:
       • Direct: 300-400 expert computational scientists in chemistry, biology, DMPK, stats, pharmsci & clinpharm
       • Indirect: >2,000 lab scientists using desktop apps that utilize HPC compute

  3. The Evolution of HPC at Pfizer
     • 2000: SGI Origins (128 cores)
     • 2004: 150 blades (300 cores)
     • 2009: 6 x3950s (520 cores)
     • 2010: on-demand Amazon VPC

  4. Intersection of “The Cloud” and HPC

  5. Pfizer’s Isolated VPC Resources
     [Architecture diagram: Pfizer Groton DMZ connected over a secure VPN connection across the Internet to the Amazon Web Services cloud (AWS Virginia data center: VPC subnets, router, VPN gateway)]
     Pfizer VPC Overview
     • The Pfizer Virtual Private Cloud (pilot effort) has been implemented as an extension of our physical data center.
     • Infrastructure as a service affords rapid provisioning without compromising on:
       • Security
       • Compatibility
       • Accessibility
       • Agility
       • Utility
       • Implementation

  6. Computing Requirements Come in Many Forms
     Security monitoring: systems are required to be joined to the Pfizer network.

     | Feature                | AWS                            | Internal VMs                    | Data Center                  |
     |------------------------|--------------------------------|---------------------------------|------------------------------|
     | Environment            | PoC                            | Dev / Test                      | Prod                         |
     | Confidentiality        | Moderate                       | High                            | High                         |
     | Complexity             | Low (simple, stand-alone)      | Med                             | High                         |
     | Avail. configurations  | Public AMI (Xen)               | VMware / Xen                    | Bare metal                   |
     | Provisioning SLA       | 1 hr                           | 4 hrs                           | 2-8 wks                      |
     | Min. billable period   | 1 hr                           | 1 mo.                           | 6 mo.                        |
     | Request capacity/wk    | 100-1000s                      | 10-100s                         | 1-10s                        |
     | Provisioning costs     | $0                             | mid-10's $                      | 1000's $                     |
     | Runtime/depreciation   | low-100's $                    | high-10's $                     | low-10's $                   |
     | Support model          | Self / incident                | 8x5                             | 7x24                         |
     | Support SLAs           | None                           | 24 hr                           | Immediate                    |
     | OS configurations      | Linux RHEL 5.x                 | Windows Server 2003/2008, Linux | Solaris, AS/400              |
     | HPC controls           | Black box (root-level access)  | HPC                             | Qualified / validated system |

  7. Security
     • Amazon practices & security measures successfully met audit criteria for Research-level use
     • Pfizer employed the same security systems used internally:
       • IPsec tunnels into AWS
       • Pfizer Global Active Directory
         • Joining machines and managing permissions
         • Linux & Windows

  8. Compatibility
     • To get the most benefit from the cloud it was necessary to align AWS resource offerings with existing internal systems:
       • AMIs (VMs) → Pfizer-qualified RHEL 5 image
         • Centrify/AD provides identification/authorization
         • Kerberos credentials via AD
       • File cache (storage) → OpenAFS volumes accessible
       • IP mappings → Pfizer DNS
         • AMIs have Pfizer network identities & are discoverable
         • Allows AMIs to be part of our LSF cluster
     • Users can do development work accessing the full range of Pfizer resources
       • e.g. software licenses utilize the Pfizer FlexLM server
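The discoverability point above can be illustrated with a toy registry sketch: each cloud instance gets a name in the corporate domain so internal tools (LSF, license servers) can find it by DNS. The `cloud.pfizer.example` domain and the `aws-<id>` naming pattern are invented for illustration, not Pfizer's actual scheme.

```python
# Toy dynamic-DNS registration for cloud instances. A real deployment
# would update the corporate DNS zone; here a dict stands in for it.
registry = {}  # fqdn -> private IP, stand-in for a DNS zone

def register_instance(instance_id, private_ip):
    """Give an instance a discoverable corporate-domain name (hypothetical scheme)."""
    fqdn = f"aws-{instance_id}.cloud.pfizer.example"
    registry[fqdn] = private_ip
    return fqdn

name = register_instance("i-0042", "10.20.30.40")
print(name, "->", registry[name])
# aws-i-0042.cloud.pfizer.example -> 10.20.30.40
```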

  9. Availability
     • AD & DNS give us full range of access to internal systems
     • LSF for job scheduling
     • Oracle / MySQL instances for accessing structured data
     • AFS for secure access to unstructured data
       • High performance via local caching
     • Access to licensed and internally developed software

  10. Agility
     • The $50M decision
       • Required completion of a time-sensitive chemoinformatics task
       • Workload was diverted from internal resources so they could be dedicated
       • Within 30 min, 64 cores were spun up and joined to LSF
       • For 4 days, >50,000 jobs were executed
       • Total cost <$1,500
       • Results were obtained on time and the decision taken
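A back-of-the-envelope check of the economics in that story: the core count, the 4-day duration, and the <$1,500 total come from the slide, while the 8-core instance size and the ~$1.60/hr on-demand rate are illustrative assumptions, not figures from the talk.

```python
# Rough cost model for the 64-core, 4-day burst run described above.
CORES = 64
HOURS = 4 * 24               # 4 days of wall-clock time
CORES_PER_INSTANCE = 8       # assumed instance size
RATE_PER_INSTANCE_HR = 1.60  # assumed on-demand $/hr per instance

instances = CORES // CORES_PER_INSTANCE
cost = instances * HOURS * RATE_PER_INSTANCE_HR
print(f"{instances} instances x {HOURS} h -> ${cost:,.2f}")
# 8 instances x 96 h -> $1,228.80, consistent with the reported <$1,500
```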

  11. Utility
     • Internal application development
       • Tomcat web applications
       • Nightly builds & regression testing
     • HPC capacity
       • Over 250 apps are accessible
       • LSF uses resource specifications to determine suitability and schedules jobs accordingly
       • Over 100,000 jobs run
         • QM, ab initio
         • Virtual screening
         • Systems biology
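The resource-specification scheduling idea above can be sketched as a toy matcher: a job declares what it needs, and only hosts whose advertised attributes satisfy every requirement are candidates. The host names and attributes here are invented; real LSF expresses this with `bsub -R` select strings rather than Python dicts.

```python
# Toy version of LSF-style resource matching. Numeric requirements are
# treated as minimums; string requirements must match exactly.
hosts = {
    "groton-blade01": {"mem_gb": 16, "os": "rhel5"},  # hypothetical internal host
    "aws-node-0042":  {"mem_gb": 64, "os": "rhel5"},  # hypothetical cloud node
}

def suitable_hosts(requirements, hosts):
    """Return the names of hosts meeting every requirement."""
    matches = []
    for name, attrs in hosts.items():
        ok = all(
            attrs.get(key, 0) >= value if isinstance(value, (int, float))
            else attrs.get(key) == value
            for key, value in requirements.items()
        )
        if ok:
            matches.append(name)
    return matches

# A memory-hungry QM job only fits on the large cloud node:
print(suitable_hosts({"mem_gb": 32, "os": "rhel5"}, hosts))
# ['aws-node-0042']
```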

  12. Implementation
     • From PoC → Production
     • Provisioning: exploring commercial solutions that enable:
       • One-time actions
         • Integrate with our procurement system
         • Move to a debit (pre-allocated funding) model
         • Standard configurations
       • Repeatable actions
         • Start/stop instances via a user-centric dashboard
         • Users manage / are accountable for the resources they use
     • LSF
       • Custom code to:
         • Detect workload
         • Start/stop AMIs
         • Leverage accounting
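The custom LSF glue described above (detect workload, start/stop AMIs) reduces to a scaling decision each cycle. A minimal sketch of that decision, with made-up thresholds and without any real LSF or AWS calls:

```python
# Sketch of elastic-capacity logic: size the cloud node pool to the
# pending LSF job backlog, within a budget cap. All numbers are
# illustrative assumptions, not values from the talk.
JOBS_PER_NODE = 50  # assumed jobs a node can absorb per scheduling cycle
MAX_NODES = 64      # assumed budget/safety cap on cloud instances

def target_nodes(pending_jobs):
    """Decide how many cloud nodes should be running next cycle."""
    needed = -(-pending_jobs // JOBS_PER_NODE)  # ceiling division
    return min(max(needed, 0), MAX_NODES)

# A 5,000-job backlog scales out to the cap; an empty queue scales in.
print(target_nodes(5000))  # 64
print(target_nodes(0))     # 0
```

In a real implementation the `pending_jobs` figure would come from querying the LSF queues, and the delta between the current and target pool would drive instance start/stop calls, which is also where per-user accounting would hook in.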
