Data Access Layer

  1. US NATIONAL VIRTUAL OBSERVATORY Data Access Layer Doug Tody (NRAO) NVO Summer School, Aspen 9-Sep-2005

  2. Data Access Layer
     • What does it do?
       • Provides access to data
         • data discovery
         • mediation to a standard model
         • data retrieval
         • on-demand data generation
         • server-side computation (subsetting, filtering)
     • What is it for?
       • Supports client data analysis
         • distributed, multiwavelength
     • How does it work?
       • Object (dataset) oriented
         • catalog, image, spectrum, time series, SED, etc.
       • Services
         • cone search (also SkyNode), SIA, SSA

  3. Cone Search

  4. Cone Search
     • Provides basic catalog access
       • Query by position and aperture (cone in space)
       • Query consists of base-URL (service endpoint) plus parameters
         • e.g., http://base-url?RA=12.0&DEC=0.0&SR=1.0
       • Catalog returned as a VOTable
     • Advantages
       • Simple but powerful, provides standard interface
       • Easy to implement and use
     • Limitations
       • Catalog metadata is not defined
       • No data model support
     • Future
       • Supplanted by basic SkyNode (Greene, Saturday)
         • Supports metadata discovery, SQL-like syntactical queries
       • We will continue to support the basic cone search query, however!
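A cone search query as described above is just the service's base URL with RA, DEC and SR appended. The following is a minimal sketch of that URL construction; the function name `cone_search_url` and the `example.org` endpoint are hypothetical, and the separator handling (a base URL that may or may not already end in `?` or `&`) is an assumption about how services are registered.

```python
from urllib.parse import urlencode


def cone_search_url(base_url: str, ra: float, dec: float, sr: float) -> str:
    """Append RA, DEC and SR (search radius), all in decimal degrees, to a base URL."""
    # Assumption: a base URL may already end in '?' or '&', or may already
    # carry its own parameters; otherwise a '?' separator is needed.
    if base_url.endswith(("?", "&")):
        sep = ""
    elif "?" in base_url:
        sep = "&"
    else:
        sep = "?"
    return base_url + sep + urlencode({"RA": ra, "DEC": dec, "SR": sr})


# Hypothetical endpoint, same parameter values as the example on the slide
print(cone_search_url("http://example.org/cone", 12.0, 0.0, 1.0))
```

The response to such a request would be the catalog as a VOTable, which the client then parses.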

  5. Simple Image Access

  6. Simple Image Access (SIA)
     • Basic Usage, Highest Level
       • Client queries Registry to find interesting services
       • Each service is queried (in turn or simultaneously) for data
       • Client collates and analyzes results
       • Selected datasets are retrieved

  7. Simple Image Access (SIA)
     • Basic Usage, Single Service
       • Query
         • find data of interest from a single service
         • http://base-url?POS=12.0,0.0&SIZE=0.2&FORMAT=image/fits
       • Query response
         • VOTable, one row per candidate dataset
         • "access reference" (a URL) points to data
       • Data selection
         • Performed by the client using query response metadata
       • Dataset retrieval
         • Retrieve actual datasets, if any

  8. Service Capabilities
     • Types of Services
       • Atlas    Precomputed survey image (entire image)
       • Pointed  Image from pointed observation (entire image)
       • Cutout   Cutout existing image (pixels unchanged)
       • Mosaic   Reprojected image (pixels resampled)
     • Virtual Data
       • Data model mediation
       • Subsetting, filtering, etc. on the fly
       • Possible to view same data in different ways
     • Interface
       • RESTful interface currently (HTTP GET)
       • Document oriented (VOTable, FITS, JPEG, etc.)

  9. Data Model
     • SIA data model is the familiar "astronomical image"
       • Generally this means a 2D sky projection
       • Data array is logically a regular grid of pixels
       • Encoded as a FITS image, GIF/JPEG, etc.
     • Standardized dataset metadata
       • Provenance
       • Image geometry
       • Scale
       • Format
       • Position, WCS
       • Time of observation
       • Spectral bandpass
       • Access information

  10. Input Parameters
      • Required parameters
        • POS        center of ROI (ra, dec; decimal degrees, ICRS)
        • SIZE       width; or width, height
        • FORMAT     ALL, GRAPHIC, image/fits, image/jpeg, text/html, …
      • Optional parameters
        • INTERSECT  values: covers, enclosed, center, overlaps
        • VERB       table verbosity
      • Service-defined parameters
        • used to further refine queries, but not yet standardized
        • e.g., BAND, SURVEY, etc.
      • Image generation parameters
        • NAXIS, CFRAME, EQUINOX, CRPIX, CRVAL, CDELT, ROTANG, PROJ
        • used for cutout/mosaic services to specify image to be generated
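The input parameters above map directly onto a GET query string. Below is a rough sketch of a query builder covering the required parameters, INTERSECT, and pass-through of service-defined parameters. The function name `sia_query_url`, the `example.org` endpoint, and the choice to reject unknown INTERSECT values on the client side are all assumptions made for illustration, not part of the protocol.

```python
from urllib.parse import urlencode

# INTERSECT values as listed on the slide
INTERSECT_VALUES = {"covers", "enclosed", "center", "overlaps"}


def sia_query_url(base_url, ra, dec, size, fmt="image/fits",
                  intersect=None, **service_params):
    """Build an SIA query URL from POS, SIZE, FORMAT and optional parameters."""
    # SIZE is either a single width or a (width, height) pair, decimal degrees
    if isinstance(size, (tuple, list)):
        size_value = ",".join(str(s) for s in size)
    else:
        size_value = str(size)
    params = {"POS": f"{ra},{dec}", "SIZE": size_value, "FORMAT": fmt}
    if intersect is not None:
        if intersect.lower() not in INTERSECT_VALUES:
            raise ValueError(f"unknown INTERSECT value: {intersect}")
        params["INTERSECT"] = intersect.lower()
    # Service-defined parameters (e.g. BAND, SURVEY) are passed through as given
    params.update(service_params)
    sep = "" if base_url.endswith(("?", "&")) else ("&" if "?" in base_url else "?")
    # Leave ',' and '/' unescaped so the URL reads like the slide's example
    return base_url + sep + urlencode(params, safe=",/")


# Hypothetical endpoint; BAND is one of the service-defined examples
print(sia_query_url("http://example.org/sia", 12.0, 0.0, (0.2, 0.1),
                    intersect="overlaps", BAND="optical"))
```

Keeping service-defined parameters as plain keyword arguments reflects the fact that they are not yet standardized: the client cannot validate them and simply forwards them.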

  11. Query Response
      • Output is a VOTable
        • Must contain a RESOURCE element with the attribute type="results", containing the results of the query.
        • The "results" resource contains a single table
      • Each row of the table describes a single data object which can be retrieved.
      • The fields of the table describe the attributes of the dataset
        • These are the attributes of the SIA data model
        • In SIA 1.0, the UCD is used to identify the data model attribute
        • e.g., POS_EQ_RA_MAIN, VOX:Image_Scale, etc.
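To make the query response structure concrete, here is a small sketch that locates the "results" RESOURCE in a VOTable and turns each row into a dictionary keyed by UCD. The sample document is invented for illustration. The `VOX:Image_AccessReference` UCD is taken from the SIA 1.0 specification rather than from these slides, and identifying the resource by a `type="results"` attribute likewise follows that specification. Only the TABLEDATA serialization is handled.

```python
import xml.etree.ElementTree as ET

# Invented sample response; real services return many more fields
SAMPLE = """<VOTABLE version="1.0">
  <RESOURCE type="results">
    <TABLE>
      <FIELD name="title" ucd="VOX:Image_Title" datatype="char" arraysize="*"/>
      <FIELD name="ra" ucd="POS_EQ_RA_MAIN" datatype="double"/>
      <FIELD name="dec" ucd="POS_EQ_DEC_MAIN" datatype="double"/>
      <FIELD name="url" ucd="VOX:Image_AccessReference" datatype="char" arraysize="*"/>
      <DATA><TABLEDATA>
        <TR><TD>Example field</TD><TD>12.0</TD><TD>0.0</TD><TD>http://example.org/img1.fits</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def local(tag: str) -> str:
    """Strip an XML namespace prefix of the form '{uri}' from a tag name."""
    return tag.rsplit("}", 1)[-1]


def parse_sia_response(xml_text: str) -> list:
    """Return one dict per table row, mapping each field's UCD to its cell text."""
    root = ET.fromstring(xml_text)
    for res in root.iter():
        if local(res.tag) == "RESOURCE" and res.get("type") == "results":
            # The results resource holds a single table
            table = next(el for el in res if local(el.tag) == "TABLE")
            ucds = [f.get("ucd") for f in table.iter() if local(f.tag) == "FIELD"]
            rows = []
            for tr in table.iter():
                if local(tr.tag) == "TR":
                    cells = [td.text for td in tr if local(td.tag) == "TD"]
                    rows.append(dict(zip(ucds, cells)))
            return rows
    raise ValueError("no RESOURCE with type='results' found")


for row in parse_sia_response(SAMPLE):
    print(row["VOX:Image_Title"], row["VOX:Image_AccessReference"])
```

Keying rows by UCD rather than by field name matches the point made above: in SIA 1.0 the UCD, not the column name, identifies the data model attribute.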

  12. Query Response
      • Image metadata
        • Describes the image object (required)
      • Coordinate system metadata
        • Image WCS
      • Spectral bandpass metadata
        • Prototype data model describing spectral bandpass of image
      • Processing metadata
        • Tells whether the service modified the image data
      • Access metadata
        • Tells client how to access the dataset (required)
      • Resource-specific metadata
        • Additional optional service-defined metadata describing image

  13. Image Metadata
      VOX:Image_Title      Brief description of image
      POS_EQ_RA_MAIN       RA (ICRS)
      POS_EQ_DEC_MAIN      Dec (ICRS)
      INST_ID              Instrument name
      VOX:Image_MJDateObs  MJD of observation
      VOX:Image_Naxes      Number of image axes
      VOX:Image_Naxis      Length of each axis
      VOX:Image_Scale      Image scale, deg/pix
      VOX:Image_Format     Image file format

  15. Image Retrieval
      • Completely optional
        • Typically only a fraction of the available images are retrieved
      • Query response
        • If an access reference is provided, the data can be retrieved
        • SIAP can also be used to describe data which is not online
        • The same data may be available in multiple formats
      • Image retrieval
        • Very simple; access reference is a URL
        • Standard tools can be used to fetch the data
          • (browser, wget, curl, i/o library, etc.)
        • Data is often computed on-the-fly
        • All retrieval is synchronous (currently)
        • No provision for restricting access (currently)
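Because the access reference is an ordinary URL, retrieval needs nothing beyond a standard I/O library. The sketch below uses Python's `urllib` as one such library; the function name `fetch_dataset` is hypothetical. To keep the example runnable offline, a local `file://` URI stands in for a real access reference.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


def fetch_dataset(access_reference: str, dest: Path) -> Path:
    """Synchronously download whatever the access reference URL points to."""
    # The call blocks until the service has produced and sent the data,
    # which may take a while if the image is computed on the fly.
    with urllib.request.urlopen(access_reference) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


# Offline demonstration: a temporary local file plays the role of the remote image
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "remote.fits"
    source.write_bytes(b"SIMPLE  =                    T")
    saved = fetch_dataset(source.as_uri(), Path(tmp) / "local.fits")
    print(saved.name, saved.stat().st_size, "bytes")
```

With a real service, `access_reference` would be the URL taken from a row of the query response.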

  16. Service Registration

  17. Future Development
      • SIA V1.1
        • Based on work done on SSA
        • Expanded query interface
          • no longer limited to positional queries
        • Much richer query response
          • generic dataset identification, characterization, etc.
          • metadata extension mechanism
      • Selected features
        • VOTable 1.1 with UCD 1+, GROUP, UTYPE
        • query response can be ordered by "score"
        • logical groupings of related query records
        • compression support
      • Versioning
        • required to make protocol upgrades manageable

  20. Future Development
      • Service verification
        • for testing at development time
        • when registered; level of compliance metric
      • Grid capabilities
        • Data staging
          • asynchronous image generation (long running jobs)
          • batch generation of images (multiple images)
        • Data management
          • support for single sign-on authentication, authorization
          • network data caching, third party delivery (VOStore etc.)
      • Web service interface
        • resource metadata
        • service availability (etc.)
      • ADQL integration
        • Capability to use query language for queries

  21. Simple Spectral Access

  22. Simple Spectral Access (SSA)
      • What is it?
        • Provides access to 1D spectra, time series, SEDs
        • Tabular spectrophotometric data (photometry points)
        • Represents second generation, data model-based DAL interfaces
      • Status
        • Draft V0.9 query interface reviewed in Kyoto (May 05)
        • Revisions in progress; draft PR targeted for Madrid (Oct 05)
        • Much work has been done on the data models; however, they are still being revised
        • Some initial prototypes already exist (services, client apps)
        • IVOA/Madrid discussions will be held immediately after the ADASS and are open to all

  23. Basic Usage
      • SSA specification may be complex, but basic usage is simple
      • Simple query
        • POS, SIZE, FORMAT - like cone search, SIA
        • Possibly refined by spectral or time bandpass, etc.
        • Most metadata in query response is optional
      • Data retrieval
        • Simple retrieval is again URL-based
        • Get back a dataset "document" (VOTable, FITS, JPEG, etc.)
        • In simplest case could be wavelength, flux as text (for Spectrum)
        • Pass-through of external data is permitted
      • Data Analysis
        • Standard data model isolates application from quirks of external project data

  24. Concepts - Dataset-oriented
      • Data object type
        • Spectrum, TimeSeries, SED
      • Dataset creation type
        • Atlas      Whole datasets, uniform survey data
        • Pointed    Whole datasets, variable instrumental data
        • Cutout     Subset, data samples are not modified
        • Resampled  Subset, data samples computed by service
      • Dataset derivation
        • Observed   An observation
        • Composite  Combination of several observations
        • Simulated  Simulated observation made from real data
        • Synthetic  Data from a theoretical model

  25. Data Models
      • Data models used in SSA
        • Spectral data     Spectrum, TimeSeries, SED
        • Dataset           Generic dataset descriptor
        • Target            Astronomical target observed
        • Curation          Origin of data
        • Characterization  Physical characteristics of data
        • Provenance        Instrument which generated the data
      • User defined data models
        • Metadata extension mechanisms
          • additional data model attributes (table fields)
          • additional resources in VOTable, linked back to main table
        • Provide a mechanism to "subclass" dataset to tailor it for a given data collection

  26. Spectral Data (SED)
      [Figure; labels: photometry point, spectrum segment]

  27. Spectral/SED Data Model

  29. Query Interface
      • Mandatory query parameters
        • POS     RA, DEC (ICRS)
        • SIZE    diameter (decimal degrees)
        • TIME    date1,date2 (epoch in decimal years UTC)
        • BAND    wave1,wave2 (meters in vacuum; source or observer)
        • FORMAT  VOTable, fits, xml, text, graphics, html, external
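As with SIA, these draft SSA parameters can be assembled into a GET query. The sketch below is illustrative only: the function name `ssa_query_url` and the `example.org` endpoint are hypothetical, the parameter syntax follows the draft V0.9 listing above and may have changed in later revisions, and treating TIME and BAND as omissible in a given request is an assumption (the slide lists them as mandatory, which may refer to what a service must support).

```python
from urllib.parse import urlencode


def ssa_query_url(base_url, ra, dec, size, fmt, time_range=None, band=None):
    """Build a draft-style SSA query URL from POS, SIZE, TIME, BAND and FORMAT."""
    params = {"POS": f"{ra},{dec}", "SIZE": str(size)}
    if time_range is not None:
        # date1,date2 as epochs in decimal years (UTC)
        params["TIME"] = f"{time_range[0]},{time_range[1]}"
    if band is not None:
        # wave1,wave2 as vacuum wavelengths in meters
        params["BAND"] = f"{band[0]},{band[1]}"
    params["FORMAT"] = fmt
    sep = "" if base_url.endswith(("?", "&")) else ("&" if "?" in base_url else "?")
    # Leave ',' unescaped so range values stay readable
    return base_url + sep + urlencode(params, safe=",")


# Hypothetical endpoint; a 500-700 nm bandpass observed during 2004
print(ssa_query_url("http://example.org/ssa", 12.0, 0.0, 0.2, "fits",
                    time_range=(2004.0, 2005.0), band=(5e-07, 7e-07)))
```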

  30. Query Interface
      • Recommended query parameters
        • APERTURE    approx spatial resolution (decimal degrees)
        • SPECRES     spectral resolution (meters)
        • TOP         number of top-ranked records to return
        • OBJTYPE     mandatory if service returns multiple object types
        • COLLECTION  data collection identifier

  31. Query Interface
      • Optional parameters
        • CREATORID    creator-assigned dataset identifier (at most 1)
        • PUBID        publisher-assigned dataset identifier (at most N)
        • COMPRESS     enable compression (for both data _and_ queries?)
        • SNR          signal-to-noise ratio
        • REDSHIFT     redshift range (dlambda/lambda)
        • TARGETCLASS  star, galaxy, pulsar, PN, QSO, AGN, etc.

  32. Query Response
      • Classes of query metadata
        • Query metadata       Describes the query itself
        • Dataset metadata     Describes data object; object-specific
        • Target metadata      Astronomical target
        • Curation metadata    External identification of dataset
        • Characterization     Coverage, Accuracy, Frame, etc.
        • Instrument metadata  Service-defined; hard to standardize
        • Access metadata      Describes how to access the dataset

  33. Query Response
      • Query Metadata
        • Query.Score     How well object matches query
        • Query.LName     Logical name (identifier)
        • Query.LNameKey  Logical name key (id-ref)
      • Example: LName="MyObj123" LNameKey="server,format"
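One way a client might use Query.Score is to rank the returned records and keep only the best few, mirroring what the TOP query parameter asks of the service. The sketch below assumes that records are plain dictionaries keyed by attribute name, that a higher score means a better match, and that records without a score rank last; none of this is stated in the slides. The function name `top_ranked` is hypothetical.

```python
def top_ranked(records, top=None):
    """Sort records by Query.Score, best first, and optionally keep the first `top`."""
    # Assumption: higher score = better match; a missing score counts as 0.0
    ranked = sorted(records,
                    key=lambda rec: float(rec.get("Query.Score", 0.0)),
                    reverse=True)
    return ranked if top is None else ranked[:top]


# Invented records for illustration
records = [
    {"Query.LName": "MyObj123", "Query.Score": "0.4"},
    {"Query.LName": "MyObj456", "Query.Score": "0.9"},
    {"Query.LName": "MyObj789"},
]
for rec in top_ranked(records, top=2):
    print(rec["Query.LName"], rec.get("Query.Score"))
```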

  34. Query Response
      • Dataset Metadata
        • Dataset.Type              Spectrum, TimeSeries, SED, etc.
        • Dataset.DataModel         DM name, e.g., "SSA-V0.90"
        • Dataset.Title             Brief descriptive title of dataset
        • Dataset.SSA.NSamples      Total samples in dataset
        • Dataset.SSA.Aperture      Characteristic aperture diameter
        • Dataset.SSA.TimeAxis      TimeCoord axis (external data)
        • Dataset.SSA.SpectralAxis  SpectralCoord axis (external data)
        • Dataset.SSA.FluxAxis      Flux axis (external data)
        • Dataset.CreationType      atlas, pointed, cutout, resampled
        • Dataset.Derivation        observed, composite, simulated, synthetic

  35. Query Response
      • Target Metadata
        • Target.Name           Name of astronomical object
        • Target.Class          Target class (star, galaxy, QSO, etc.)
        • Target.SpectralClass  Spectral class (e.g., 'O', 'B', etc.)
        • Target.Redshift       Nominal redshift for object
        • Derived.VarAmpl       Variability amplitude (fraction 0-1)
        • Derived.SNR           Observed signal to noise ratio

  36. Query Response
      • Curation Metadata
        • Curation.Collection   Data collection name (identifier)
        • Curation.Creator      Creator identity (identifier)
        • Curation.CreatorID    Creator-assigned dataset identifier
        • Curation.PublisherID  Publisher-assigned dataset identifier
        • Curation.Date         Dataset creation date (ISO date string)
        • Curation.Version      Dataset version (within same ID)

  37. Query Response
      • Characterization1 - Coverage
        • .Location.Spatial          Position (e.g., RA, DEC)
        • .Location.Time             Observation time characteristic value
        • .Location.Spectral         Spectral bandpass characteristic value
        • .Location.Spectral.BandID  Bandpass ID (band or filter name)
        • .Bounds.Spatial            Aperture footprint (polygon on sky)
        • .Bounds.Time               Low/High time values
        • .Bounds.Spectral           Low/High spectral values
        • .Bounds.Flux               Limiting flux, saturation limit (Jansky)
        • .Fill.Spatial              Spatial sampling filling factor (0-1)
        • .Fill.Time                 Time sampling filling factor (0-1)
        • .Fill.Spectral             Spectral sampling filling factor (0-1)

  38. Query Response
      • Characterization2 - Accuracy
        • Accuracy.*.Calibrated  uncalibrated, relative, absolute
        • Accuracy.*.Resolution  Resolution of measured signal
        • Accuracy.*.StatErr     Statistical error (measured)
        • Accuracy.*.SysErr      Systematic error (estimated)
      ('*' = Spatial, Time, Spectral, Flux)

  39. Query Response
      • Characterization3 - Reference Frames
        • Frame.Spatial.Type     Coordinate frame (default ICRS)
        • Frame.Spatial.Equinox  Coordinate system equinox (J2000)
        • Frame.Time.System      Timescale (TT)
        • Frame.Time.SIDim       SI factor and dimension
        • Frame.Spectral.SIDim   SI factor and dimension
        • Frame.Flux.SIDim       SI factor and dimension
        • Frame.Flux.UCD         UCD of flux value (flux type)
      (These apply only to the query response)
      (SIDim metadata still under construction)

  40. Query Response
      • Instrument Metadata
        • Instrument.Name      Instrument name (identifier)
        • Instrument.Exposure  Total exposure time (seconds)
        • Instrument.<other>   Service-defined
      • Notes
        • Optional; provided for instrumental data collections
        • In general, Collection, Bounds.Time, etc. are preferred
        • In general Instrument metadata is service-defined
        • Use Observation model as a starting point

  41. Query Response
      • Access Metadata
        • Access.Reference  Data access URL
        • Access.Format     MIME type of returned dataset
        • Access.Size       Approximate dataset size (bytes)
        • Access.Server     Server endpoint URL
      • Staging support goes here in the future
        • e.g., will dataset access require asynchronous staging
        • estimated cost to construct dataset
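Since the same dataset may be offered in several formats, a client can use Access.Format to choose which record to retrieve. The following sketch walks a client-side preference list of MIME types and returns the first matching record. The function name `pick_access`, the dictionary representation of records, and the specific MIME strings are assumptions for illustration only.

```python
def pick_access(records, preferred_formats):
    """Return the first record whose Access.Format matches the preference order."""
    for fmt in preferred_formats:
        for rec in records:
            if rec.get("Access.Format") == fmt:
                return rec
    # No record is available in any of the preferred formats
    return None


# Invented records: one dataset offered in two formats
records = [
    {"Access.Reference": "http://example.org/spec1.txt",
     "Access.Format": "text/plain", "Access.Size": "20480"},
    {"Access.Reference": "http://example.org/spec1.fits",
     "Access.Format": "application/fits", "Access.Size": "57600"},
]
chosen = pick_access(records, ["application/fits", "text/plain"])
print(chosen["Access.Reference"])
```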

  42. Service Metadata
      • Usage
        • Describe service type and capabilities
        • Characterize service (data resources served, coverage, etc.)
        • Describe interface (optional query parameters)
      • Interface
        • Requires new service metadata query method
        • Returns resource metadata descriptor (XML)
      • Format
        • Registry resource descriptor (XML)

  43. Data Retrieval
      • Based on GET as with SIA
        • Variety of formats available
        • Compression supported
      • Data representation
        • Data model defines logical content of data
        • The same data object may be represented in various formats
        • Hence we need to specify both the data model and the file format

  44. Data Retrieval
      • Data models
        • SSA data model for fully-compliant data
        • Provider-defined data model for external data
      • Data formats
        • VOTable (a container), native XML (direct serialization)
        • FITS binary table (another container; uses FITS spectral WCS)
        • Text, e.g., CSV
        • Graphics (JPEG etc.)
        • text/html (rendered into browser page)
