
Statistical Cross-Matching Across Distributed Archives

H.-M. Adorf & GAVO Team, MPI für extraterrestrische Physik (adorf@mpe.mpg.de)


Presentation Transcript


  1. Statistical Cross-Matching Across Distributed Archives H.-M. Adorf & GAVO Team MPI f. extraterrestrische Physik adorf@mpe.mpg.de

  2. Statistical cross-matching • Cross-matching of astrometric and photometric catalogues • core functionality of a virtual observatory • Operational modes • on an area of the sky • using an input catalogue (GAVO matcher)

  3. Philosophy • Build a cross-matcher application that • should be usable by scientists and help produce science results • uses what’s there and what works now • doesn’t get stopped by a missing standard • Support the VO process by • helping to generate appropriate VO-standards • adopting new VO-standards whenever feasible

  4. Querying remote archives • Movie

  5. Querying remote archives • Movie • Using up to 10 servers • distributed around the world • operating in parallel • Sneak preview of grid computing • Locally specify your tasks • Execute them remotely at the data centers • Receive results locally for final combination
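
The "specify locally, execute remotely, combine locally" pattern on this slide can be sketched in a few lines of Python. This is only an illustration, not the GAVO matcher's code: `query_archive` is a hypothetical stand-in that returns a dummy record instead of contacting a real archive, and the function and parameter names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def query_archive(server, ra_deg, dec_deg, radius_arcsec):
    """Hypothetical stand-in for a remote cone search.

    A real client would send an HTTP request to the archive here;
    this stub just echoes the request so the sketch runs offline.
    """
    return [{"server": server, "ra": ra_deg, "dec": dec_deg}]


def query_all(servers, ra_deg, dec_deg, radius_arcsec, max_workers=10):
    """Send the same query to every server in parallel and collect the results."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(query_archive, s, ra_deg, dec_deg, radius_arcsec): s
            for s in servers
        }
        for fut in as_completed(futures):
            server = futures[fut]
            try:
                results[server] = fut.result()
            except Exception as exc:
                # Keep the failure so the caller can see which server misbehaved.
                results[server] = exc
    return results


combined = query_all(["SDSS", "VizieR"], ra_deg=150.0, dec_deg=2.2, radius_arcsec=5.0)
```

Threads suit this workload because the client mostly waits on remote servers; the default of 10 workers mirrors the "up to 10 servers" mentioned on the slide.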

  6. Software demo (#1) • Input list • 67 galaxies from FIRST radio catalogue • Query • 2 remote archives: SDSS, VizieR • 20 catalogues: radio, infrared, optical, X-ray • Task • get counterparts for each input coordinate • gather counterparts to form reasonable matches

  7. The matching problem (#1)

  8. The matching problem (#2)

  9. Matcher workflow

  10. Metadata • Querying and cross-matching require metadata about catalogues & archives • astrometric fields and associated uncertainties • photometric fields and associated uncertainties • some metadata … • … are locally generated and stored • … are retrieved from archives in real-time

  11. Software demo (#2) • Issue: false alarms • matching is non-unique • input: 67 sources • output: almost 500 match candidates • many of these match candidates are “false alarms”

  12. Issue: false alarms (#3) • Two fundamental, independent probabilities • Hit probability: p(c|C) • False alarm probability: p(c|not C) • Goal • keep the hit probability high (completeness) • while keeping the false alarm probability low • goodness depends on S/N ratio in the data

  13. Issue: false alarms (#4) • Solution: use statistics (“fuzzy” matching) • compute statistical (Mahalanobis) distance between counterparts and center position • compute reliability measure for match candidate (reduced chi-squared)
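
A minimal sketch of these two quantities, under assumptions the slide does not state: positions are offsets in arcseconds on a local tangent plane, each counterpart has independent per-axis uncertainties (a diagonal covariance), the center is the inverse-variance weighted mean, and the degrees of freedom are taken as 2(n − 1) for n counterparts. All function names are illustrative.

```python
import math


def weighted_center(points):
    """Inverse-variance weighted mean of (x, y, sigma_x, sigma_y) tuples."""
    wx = [1.0 / p[2] ** 2 for p in points]
    wy = [1.0 / p[3] ** 2 for p in points]
    cx = sum(w * p[0] for w, p in zip(wx, points)) / sum(wx)
    cy = sum(w * p[1] for w, p in zip(wy, points)) / sum(wy)
    return cx, cy


def mahalanobis(point, center):
    """Distance from the center in units of the point's own uncertainties."""
    x, y, sx, sy = point
    return math.sqrt(((x - center[0]) / sx) ** 2 + ((y - center[1]) / sy) ** 2)


def reduced_chi2(points):
    """Sum of squared Mahalanobis distances divided by the degrees of freedom."""
    if len(points) < 2:
        raise ValueError("need at least two counterparts to test a match")
    center = weighted_center(points)
    chi2 = sum(mahalanobis(p, center) ** 2 for p in points)
    dof = 2 * (len(points) - 1)  # two coordinates per point, minus the fitted center
    return chi2 / dof


# Two counterparts 2 arcsec apart, each with 1 arcsec uncertainty per axis.
example = reduced_chi2([(0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 1.0, 1.0)])
```

With correlated errors the per-axis division would be replaced by a full 2×2 covariance inversion; the diagonal form is used here only to keep the sketch dependency-free.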

  14. Software demo (#3) • Lower reduced chi-squared from 10,000 to 3

  15. Software demo (#3) • Lower reduced chi-squared from 10,000 to 3 • Result • Hit-rate is still pretty high • False-alarm rate is dramatically reduced
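
The effect of tightening the limit can be illustrated with a small filter. The candidate values below are invented toy numbers, not results from the demo, and the `is_true` labels are an assumption that would only be available in a simulation or a hand-verified sample.

```python
def apply_cut(candidates, max_reduced_chi2=3.0):
    """Keep only match candidates at or below the reduced chi-squared limit."""
    return [c for c in candidates if c["reduced_chi2"] <= max_reduced_chi2]


def rates(candidates, max_reduced_chi2=3.0):
    """Empirical hit rate and false-alarm rate for a given limit."""
    true_total = sum(1 for c in candidates if c["is_true"])
    false_total = len(candidates) - true_total
    accepted = apply_cut(candidates, max_reduced_chi2)
    hits = sum(1 for c in accepted if c["is_true"])
    false_alarms = len(accepted) - hits
    hit_rate = hits / true_total if true_total else float("nan")
    fa_rate = false_alarms / false_total if false_total else float("nan")
    return hit_rate, fa_rate


# Toy, hand-made candidates (not data from the presentation).
toy = [
    {"reduced_chi2": 0.5, "is_true": True},
    {"reduced_chi2": 2.1, "is_true": True},
    {"reduced_chi2": 4.0, "is_true": True},
    {"reduced_chi2": 1.5, "is_true": False},
    {"reduced_chi2": 7.0, "is_true": False},
    {"reduced_chi2": 50.0, "is_true": False},
    {"reduced_chi2": 900.0, "is_true": False},
]

loose = rates(toy, max_reduced_chi2=10000.0)
tight = rates(toy, max_reduced_chi2=3.0)
```

In this toy set the loose limit accepts everything, while the tight limit drops most false candidates at the cost of one true match, which is the trade-off between completeness and false alarms described on the earlier slides.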

  16. Issue: server reliability • An archive server • may be down (easy to detect) • may be slow today (more difficult to detect) • may deliver wrong results (spoils the science)
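
The three failure modes can be told apart with a simple probe, sketched below. The status labels, the `slow_after_s` threshold and the optional `validate` callback are all assumptions made for illustration; the presentation does not describe how monitoring is or will be implemented.

```python
import time


def probe(query, slow_after_s=5.0, validate=None):
    """Classify one server call as 'down', 'suspect', 'slow' or 'ok'.

    query    -- zero-argument callable that performs the request
    validate -- optional callable that returns False for implausible results
    """
    start = time.monotonic()
    try:
        result = query()
    except Exception:
        return "down"  # the easy case: the call fails outright
    elapsed = time.monotonic() - start
    if validate is not None and not validate(result):
        return "suspect"  # the server answered, but the answer fails a sanity check
    if elapsed > slow_after_s:
        return "slow"
    return "ok"


def failing_server():
    raise ConnectionError("no route to host")


def sluggish_server():
    time.sleep(0.05)
    return [1, 2, 3]


status_down = probe(failing_server)
status_slow = probe(sluggish_server, slow_after_s=0.01)
status_ok = probe(lambda: [1, 2, 3])
status_suspect = probe(lambda: [], validate=lambda rows: len(rows) > 0)
```

Note that this probe only measures how long a call took; it does not interrupt a server that never answers, so a real client would also need a request timeout.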

  17. VO Standards • Status • Input • CSV files for data • XML files for query & match process description • Sending plain HTTP/HTML to archive servers • Receiving • CSV file from SDSS SkyServer • VOTable from VizieR (VO-Std) • Output • VOTable with complete match result (VO-Std) - VOPlot • various CSV files

  18. Software demo (#4) • VOPlot

  19. Plans & Ideas • GUI for newcomers • Facilitates selection of catalogues, astrometric & photometric columns, etc. • Generates configuration file • for query including server selection • for core cross-matcher, including chi-squared limit • Automatic monitoring of server response and reliability • Improved matching algorithm • GUI panel for match candidate visualization

  20. Summary • Shown a working cross-matcher application • Operates with distributed archives queried in parallel • Demonstrated that • fuzzy matching is needed • reduced chi-squared is a powerful statistical discriminator • High hit-probability, low false-alarm probability • GAVO cross-matcher currently being used in a first science application

  21. Thanks • Particularly to the folks • from SkyServer/SDSS, and • from VizieR @ CDS and @ mirror sites, who, with their services, have enabled the cross-matcher

  22. The end

  23. Issue: false alarms (#5)

  24. Issue: false alarms (#6)

  25. GAVO • GAVO I • Funded by BMBF • Started end of 2002 • Ended end of March 2005 • GAVO interim • Funded • 50% by Leibniz-prize money • 50% by BMBF

  26. The matching problem (#3)
