
Data Management Center


Presentation Transcript


  1. Data Management Center Critical Design Review February 16, 2006

  2. Agenda • Overview and Context (30 min, 8:00 – 8:30) – Swade • Element Architecture (60 min, 8:30 – 9:30) – Swade • Element Design • External Interfaces (30 min, 9:30 – 10:00) – Swade • Data Product Design (30 min, 10:00 – 10:30) – Swade • Data Processing Software Applications (30 min, 10:30 – 11:00) – Swam and Sontag • Calibration Software (30 min, 11:00 – 11:30) – Grumm and Hodge • Archive System Software (30 min, 11:30 – 12:00) – Miller • Lunch (60 min, 12:00 – 1:00) • Hardware (60 min, 1:00 – 2:00) – Singer • Prototype Results (30 min, 2:00 – 2:30) – Sontag and Swam • Functional and Performance Requirements Compliance (30 min, 2:30 – 3:00) – Swade • Development Plans (30 min, 3:00 – 3:30) – Swade • Integration, Test, and Verification Plans (30 min, 3:30 – 4:00) – Goldstein • Operations Support Plans (30 min, 4:00 – 4:30) – Hamilton • Programmatics (30 min, 4:30 – 5:00) – Taylor

  3. Presenters

  4. Overview and Context Daryl Swade

  5. DMC Context in Ground Segment YOU ARE HERE

  6. DMC Background • Located at Space Telescope Science Institute in Baltimore, MD. • Kepler software systems development at the DMC will leverage existing systems for HST, FUSE, GALEX, and other STScI-supported missions. • Kepler operations at DMC will be integrated with HST operations at STScI. • Kepler data will become part of the Multi-Mission Archive at Space Telescope (MAST).

  7. DMC High-Level Functions • Science data processing • Unpack telemetry • Uncompression • Format in FITS • Incorporate ancillary engineering • Incorporate s/c ephemeris • Science data archive • Data ingest • Archive catalog • Data Distribution • Archive user interface • Calibration • Remove cosmic rays from black level • Subtract black level from target and collateral pixels • Calculate dark current • Remove smear

  8. DMC Functional Block Diagram

  9. Results of Prior Reviews Daryl Swade

  10. DMC PDR RFA Status

  11. DMC Peer Review I Pipeline and Calibration • Topics discussed • Pipeline infrastructure • MOC-DMC interface • Science data processing applications • Ancillary Engineering Data Processing • Calibration • Data formats • Data processing hardware • Data processing verification and test plans • Review team • Rick Thompson (NASA/Ames) • David Mayer (NASA/Ames) • Tim Conrow (JPL) • Peg Stanley (STScI) • Melissa Russ (STScI)

  12. DMC Peer Review II Archive • Topics discussed • SOC – DMC data products • Archive hardware • Ingest • Archive catalog • DADS/NSA • Data distribution • Archive user interface • Demonstration queries • Archive verification and test plans • Review team • Rick Thompson (NASA/Ames) • David Mayer (NASA/Ames) • Pam Marcum (NASA/HQ) • Jim Etchison (STScI) • Melissa Russ (STScI)

  13. Driving Requirements Daryl Swade Reference: DMC Requirements Document, KDMC-10001-001A, February 13, 2006

  14. SDP Driving Requirements (1 of 3)

  15. SDP Driving Requirements (2 of 3)

  16. SDP Driving Requirements (3 of 3)

  17. Calibration Driving Requirements

  18. Archive Driving Requirements (1 of 5)

  19. Archive Driving Requirements (2 of 5)

  20. Archive Driving Requirements (3 of 5)

  21. Archive Driving Requirements (4 of 5)

  22. Archive Driving Requirements (5 of 5)

  23. Performance Driving Requirements

  24. Requirement Management and Flowdown • DMC uses the DOORS database at Ball for requirement management. • Allows links with Ground Segment requirements • All Ground Segment requirements allocated to DMC have been traced to DMC requirements within DOORS. • See GSRD to DMC Requirements Traceability Matrix • DMC Requirement Documents generated with DOORS. • DMC Requirements Document, KDMC-10001 • DMC Verification and Test Matrix, KDMC-10020 • DMC Traceability Matrix, KDMC-10021 • Links requirements to high level design in DMC Architecture Document, KDMC-10002

  25. DMC Architecture Daryl Swade Reference: DMC Architecture Document, KDMC-10002-002a, February 13, 2006

  26. DMC Implementation Strategy • Kepler DMC will be implemented by adopting existing science software systems at STScI. • Systems will be tailored for Kepler. • Software has been designed, as much as possible, to isolate the instrument/mission specific code. • The OPUS platform will be used to construct pipelines for data processing and ingest systems. • Science telemetry will be converted into standard astronomical FITS format files. • DADS will be used for the Data Archive and Distribution System. • MAST web archive interface will be used for data retrieval.

  27. Data Processing Infrastructure

  28. Pipeline Architecture

  29. Science Data Processing (1 of 3) • Science Data Receipt • Receive and store science telemetry • Identify data type as long cadence, short cadence, utility target, or FFI • Unpack telemetry • Requires target and aperture definition • Verify data completeness at the pixel level for photometer data • Uncompress • Level of compression determined from telemetry packet header • Partition Data • Data sorted by pixel type: target, collateral, and background • S/c clock to UTC conversion • Based on s/c clock time coefficients supplied by the MOC
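
The s/c clock to UTC conversion above reduces to a small amount of code. A minimal Python sketch, assuming the MOC-supplied clock coefficients take the form of piecewise-linear segments (spacecraft clock at segment start, UTC at segment start, clock rate); the segment layout and names here are hypothetical, not the actual MOC-DMC interface format:

    import bisect

    def sclk_to_utc(sclk, segments):
        """segments: (sclk_start, utc_at_start, rate) tuples sorted by sclk_start."""
        starts = [seg[0] for seg in segments]
        i = max(bisect.bisect_right(starts, sclk) - 1, 0)
        sclk_start, utc_at_start, rate = segments[i]
        # Linear correction within the applicable segment
        return utc_at_start + rate * (sclk - sclk_start)

    # One illustrative segment: a clock running 0.2 ppm fast relative to UTC
    segments = [(0.0, 0.0, 1.0000002)]
    print(sclk_to_utc(86400.0, segments))   # 86400.01728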

  30. Science Data Processing (2 of 3) • Incorporate ancillary engineering • Ancillary data extracted from engineering telemetry will be incorporated into the science data set • Incorporate s/c ephemeris information • Update header keywords • Identify target as PI or GO • Populate photometer operating parameters for calibration • Determine Barycentric time correction • Determined for each CCD channel and stored in the FITS science table extensions • Uses s/c ephemeris data • Determine World Coordinate System parameters • Convert pixel coordinates to RA and Dec for center pixel of each channel • Determined for each CCD channel and stored in the FITS science table extensions
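
The World Coordinate System step above can be illustrated with astropy (our choice of library; the slide does not name an implementation). The reference values below are placeholders, not the actual Kepler pointing solution:

    from astropy.wcs import WCS

    # Tangent-plane WCS for one CCD channel; all numbers are illustrative
    w = WCS(naxis=2)
    w.wcs.ctype = ["RA---TAN", "DEC--TAN"]
    w.wcs.crpix = [550.0, 535.0]            # reference (center) pixel
    w.wcs.crval = [290.67, 44.5]            # RA, Dec at reference pixel (deg)
    w.wcs.cdelt = [-0.0011, 0.0011]         # ~3.98 arcsec pixels, in degrees

    ra, dec = w.all_pix2world(550.0, 535.0, 1)   # origin=1: FITS convention
    print(float(ra), float(dec))                 # 290.67 44.5

Keywords of this kind, stored per channel in the FITS science table extensions, let archive users recover sky positions from pixel coordinates.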

  31. Science Data Processing (3 of 3) • Calibrate • Remove detector specific signatures • Transfer data to the SOC • Make original and calibrated data available to SOC within 12 hours after receipt from the MOC

  32. Calibration • Calibrate data to remove pixel level systematic errors • Remove cosmic rays from black level • Subtract black level from target and collateral pixels • Calculate dark current • Remove smear • Collateral data associated with each observational cadence will be used to remove black level (bias), smear, and dark-current • Preserve temporal and spatial resolution • Comply with FITS format
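
A schematic numpy sketch of the calibration sequence just listed, assuming per-row black measurements and per-column smear measurements; the array shapes, the cosmic-ray threshold, and the variable names are our illustration, not the DMC implementation:

    import numpy as np

    def calibrate(target, black, masked_smear, virtual_smear,
                  dark_per_sec, exptime):
        # Remove cosmic rays from the black level (replace outliers with median)
        med = np.median(black)
        black = np.where(np.abs(black - med) > 5 * np.std(black), med, black)
        # Subtract black level (bias) from target pixels, per CCD row
        cal = target - black[:, np.newaxis]
        # Remove dark current accumulated over the exposure
        cal = cal - dark_per_sec * exptime
        # Remove smear, estimated per CCD column from the smear measurements
        cal = cal - (masked_smear + virtual_smear) / 2.0
        return cal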

  33. Archive (1 of 2) • Data ingest • Archive and catalog all original data • Archive and catalog all calibrated data • Receive and archive relative light curves from the SOC • Preserve Kepler data archive for at least 10 years past EOM • Archive Catalog • Provide on-line science catalog of Kepler metadata for searches, data mining, and data retrieval • Metadata from cadence processing incorporated into archive database • Stage Kepler Input Catalog, Characteristics Table, Kepler Target Catalog, and Results Catalog from SOC for archival use

  34. Archive (2 of 2) • Data Distribution • Make Kepler data available to the astronomical community through the MAST interface at STScI • Ensure proper proprietary access to Kepler data in the archive based on the Science Office defined Kepler Data Release Policy • Make relevant cadence data available to Guest Observer Program • Archive Interface • Provide a Kepler science data archive that is accessible at three access levels: • PI, Co-Is, SOC, and PSP participants • GOs and DAP • general users • Create and maintain software tools required to search the Kepler archive catalog • This includes user access to the target list through the Kepler data archive • Create and maintain software tools required to access original data, calibrated data, and light curves

  35. DMC Pipeline Data Volume Estimates • Assumes 170,000 targets • FFI: 389 MBytes each • Volumes of other data types are very small in comparison: utility targets, ancillary engineering, s/c ephemeris, pixel mapping reference files, gap reports, … • See DMC Architecture Document, KDMC-10002, for details
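
For context, the 389 MByte FFI figure is roughly what the focal plane geometry implies. A quick check, assuming 84 CCD output channels of 1132 × 1070 pixels (science plus collateral) at 4 bytes per pixel; these dimensions are our assumption, not stated on the slide:

    channels, cols, rows, bytes_per_pixel = 84, 1132, 1070, 4
    print(channels * cols * rows * bytes_per_pixel / 1024**2)   # ~388 MB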

  36. Archive Data Volume • Assumes 170,000 targets for entire mission • FFI: 389 MB each, total archive size < 50 GB for ~100 FFIs over mission • Archive volumes of other data types are very small in comparison: utility targets, ancillary engineering, s/c ephemeris, pixel mapping reference files, gap reports, … • Factor of 2 gzip compression anticipated when writing to storage media (not included in the table above) • See DMC Architecture Document, KDMC-10002, for details.

  37. System Performance Estimate • Assumption: Kepler data processing at the DMC will be similar to HST processing with regard to compute cycles and I/O • Actually, the estimates determined by comparisons to HST are probably worst case since HST processing times are dominated by calibration, and Kepler calibration is relatively less complex • An average throughput for original data processing on the system can be assumed to be 55 MBytes/minute • Estimate 5 GBytes per day of Kepler original and calibrated data • 93 minutes to process 1 day, or about 6 hours to process 4 days, of Kepler cadence data (16:1 processing ratio) • 15 minutes to process one FFI
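
The arithmetic behind those figures, spelled out (a consistency check, not DMC code):

    throughput = 55.0                   # MBytes/minute average throughput
    daily_volume_mb = 5.0 * 1024        # 5 GBytes/day, original + calibrated
    minutes = daily_volume_mb / throughput
    print(round(minutes))               # ~93 minutes per day of cadence data
    print(round(4 * minutes / 60, 1))   # ~6.2 hours for 4 days of data
    print(round(24 * 60 / minutes))     # ~15, i.e. roughly the 16:1 ratio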

  38. Proprietary Rights • Data release timeline (from Kepler Data Release and Scientific Publications Policy, KKPO-16001-001): • First 3 months of data – end of year 1 • Second 3 months of data – end of year 2 • Data to year 1 – end of year 3 • Data to year 1.5 – end of year 4 • Data to year 2 – end of year 5 • Data to year 2.5 – end of year 6 • Etc. • Except that all data are released one year after the end of the Kepler operational lifetime • Stars dropped from the target list are released to the public within two months of the drop decision
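
The year-by-year schedule above follows a simple pattern after year 2 (six additional months of data become public each year). A small sketch encoding it; the end-of-mission release and the dropped-target rule are special cases not modeled here:

    def public_data_months(year_end):
        """Months of science data public at the end of mission year N."""
        if year_end == 1:
            return 3                  # first 3 months of data
        if year_end == 2:
            return 6                  # second 3 months of data
        return 6 * year_end - 6       # year 3 -> 12, year 4 -> 18, ...

    print([public_data_months(n) for n in range(1, 7)])   # [3, 6, 12, 18, 24, 30]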

  39. Levels of Data Access

  40. Cumulative Volume of Public Data – No Down Select *Time measured in months from end of commissioning / beginning of science data collection

  41. Cumulative Volume of Public Data – with Down Select 70,000 target down select at end of year one.

  42. Accounting and Information Management • Data Processing reports • MOC -> DMC Data Receipt History • Compression Performance Statistics • Target Missing & Unusable Pixels History • Configuration History • Calibration History • Data processing throughput statistics • DMC -> SOC Data Delivery History • Information posted to web site accessible by GS elements and mission management • Statistics generated and posted as data processed and archived • Driving requirements: • Archive usage statistics • Data Archiving Status • Archive ingest rates • Data completeness in the archive • GO Data Delivery Status • Data distribution rates

  43. Significant Design Changes Daryl Swade

  44. Changes from Concept Study Review • Photometric analysis of all targets using Difference Image Analysis will now be performed at the SOC • Independent photometric analysis on a subset of planetary target stars will be performed at STScI • Data analysis function tracked within Science Working Group, not DMC • P-mode analysis no longer supported under mission baseline

  45. Design Changes Since Preliminary Design Review (PDR) • Pixel mapping reference files • Cadence-level target data • Manual reprocessing in lieu of On-the-Fly Reprocessing • Black level cosmic ray rejection at DMC

  46. Pixel Mapping Reference Files (1 of 2) • For cadence data, each pixel must be tagged with its x and y position within the channel • Channel identified by FITS binary table extension number • Each pixel also needs to be tagged with a target id • Tagging each pixel with associated aperture id also helpful • Pixel location information adds 10 bytes to each pixel • Target id – 4 bytes • Aperture id – 2 bytes • X location – 2 bytes • Y location – 2 bytes • This information is the same for any given pixel for the duration of a target definition (typically 90 days for long cadence data) • Header keywords point to (reference) appropriate PMRF • Extracting pixel location information into a separate reference file saves about 7 TB of file space per year for cadence data
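
The 10-byte per-pixel record described above, written as a numpy structured dtype for illustration (the field names are ours, not the actual PMRF column names):

    import numpy as np

    pmrf_dtype = np.dtype([
        ("target_id",   np.uint32),   # 4 bytes
        ("aperture_id", np.uint16),   # 2 bytes
        ("x",           np.uint16),   # 2 bytes
        ("y",           np.uint16),   # 2 bytes
    ])
    assert pmrf_dtype.itemsize == 10  # the 10 bytes per pixel quoted above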

  47. Pixel Mapping Reference Files (2 of 2)

  48. Cadence-level Target Data - Ingest • It is anticipated that archive users will retrieve Kepler cadence data on a target basis • However, original and calibrated pixel data are processed in files containing data for each long and short cadence • Plan to sort pixels by target during Ingest • The ingest pipeline will read the target pixels from the cadence files and append them to the appropriate target file • On average, 64 values will be appended to each target file per cadence: 32 original pixel values and 32 calibrated flux values • When a target/aperture definition change is implemented, the target files will be permanently closed for writing and archived • If a target/aperture definition covers a 90-day period, the typical target file size will be 2.14 MB • Each cadence-level target file will reference a single pixel mapping reference file • For each cadence, collateral and background pixels will be stored in separate files • Collateral and background pixels are non-proprietary • Archive users will have an option to retrieve the relevant collateral and background pixels on a channel basis for a given target
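
The 2.14 MB figure is roughly reproducible from the numbers on this slide, assuming about 48 long cadences per day and on the order of 8 bytes stored per value (both our assumptions; FITS headers and table padding add a little overhead):

    cadences = 90 * 48                  # 90-day target definition, long cadence
    values_per_cadence = 64             # 32 original + 32 calibrated values
    size_mb = cadences * values_per_cadence * 8 / 1024**2
    print(round(size_mb, 2))            # ~2.11 MB, in line with the 2.14 MB figure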

  49. Cadence-level Target Data - Distribution • Data retrieval options with cadence-level target data include: • All non-proprietary files for that target over the time range specified in request • Default over all time that target observed • All associated pixel mapping reference files • Collateral pixels • All collateral pixels for the target’s channel • Collateral pixels from a projection of target aperture, completely analogous to short cadence collateral pixels • Background pixels for the target’s channel within a given requested radius or for the entire channel • Other non-proprietary targets within that channel within a given radius • Once all targets within a channel are non-proprietary, an option will be provided to distribute all data in the entire channel • Providing original and calibrated pixel data sorted by target in addition to sorted by cadence essentially triples the cadence-level data in the Kepler Data Archive from 1.8 TB/year to 5.4 TB/year (uncompressed, unmirrored)

  50. Distribution of GO Cadence Data • Cadence-level target data files simplify requirement to distribute data to GOs on a target basis • GO targets must be extracted from the cadence data and distributed to a GO without including data from the primary mission or other GO targets • Once data for an individual target is released to the public, the cadence-level data for that target must be made available separate from the proprietary targets • General archive users can request data for just a few individual targets • GOs would most likely want pixel list data from other nearby targets and background pixels for calibration purposes • Such a request can now be satisfied by allowing a GO to select additional non-proprietary data based on the above distribution scheme
