Grid Enabled Analysis Workshop, Caltech - June 23-25, 2003
JAS – Distributed Data Analysis

Contents
• JAS2
  • History
  • Client-server mode
  • JAS2 and the Grid
• JAS3
  • What’s new
  • JAS3 and AIDA
  • Plans for Gridification

JAS History
• First version of JAS2 released in 2000.
• Incremental improvements released over time.

JAS2 History – Use Cases
• With WIRED event display
• Online Monitoring

JAS2 History – Use Cases
• Custom Applications
• Web Servlets

JAS Client-Server Mode
[Diagram labels: DATA; Data Analysis Engine; User’s Java Code; Padded Cell; GUI; Experiment Extensions (Event Display); Java Compiler + Debugger]

Distributed Analysis System: Goals
• Prototype for GRID-enabled JAS analysis
• Run analysis on a farm of machines
  • Use multiple CPUs in parallel for CPU-intensive analysis
  • Access multiple I/O channels for data-intensive analysis
• Use the standard JAS client as if we were running a local job
  • Get interactive feedback
  • Create analysis modules (code)
  • Control job execution
  • View results (Plots/Histograms)
• Access distributed datasets as if they were local datasets

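The goals above (split a CPU-intensive job across a farm, then view one combined plot) can be illustrated with a minimal sketch. It is not the JAS2 implementation: threads stand in for the farm machines, and every name in it (ParallelHistogramSketch, Histogram, fillChunk, analyze) is hypothetical. Each worker fills its own partial histogram from one chunk of the dataset, and the partial results are then added bin by bin.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch: fill partial histograms in parallel, then merge them. */
final class ParallelHistogramSketch {

    /** Fixed-width 1D histogram; values outside [low, high) are ignored. */
    static final class Histogram {
        final int[] counts;
        final double low;
        final double high;

        Histogram(int bins, double low, double high) {
            this.counts = new int[bins];
            this.low = low;
            this.high = high;
        }

        void fill(double x) {
            if (!(x >= low && x < high)) {
                return; // out of range (or NaN)
            }
            int bin = (int) ((x - low) / (high - low) * counts.length);
            counts[Math.min(bin, counts.length - 1)]++; // guard against rounding
        }

        /** Adds another histogram with identical binning into this one. */
        void add(Histogram other) {
            for (int i = 0; i < counts.length; i++) {
                counts[i] += other.counts[i];
            }
        }

        int entries() {
            return Arrays.stream(counts).sum();
        }
    }

    /** The work one "farm node" does: histogram a single chunk of data. */
    static Histogram fillChunk(double[] chunk, int bins, double low, double high) {
        Histogram partial = new Histogram(bins, low, high);
        for (double x : chunk) {
            partial.fill(x);
        }
        return partial;
    }

    /** Runs one task per chunk on a small thread pool and merges the results. */
    static Histogram analyze(List<double[]> chunks, int bins, double low, double high) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<CompletableFuture<Histogram>> partials = new ArrayList<>();
            for (double[] chunk : chunks) {
                partials.add(CompletableFuture.supplyAsync(
                        () -> fillChunk(chunk, bins, low, high), pool));
            }
            Histogram merged = new Histogram(bins, low, high);
            for (CompletableFuture<Histogram> partial : partials) {
                merged.add(partial.join()); // wait for each worker, then merge
            }
            return merged;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<double[]> chunks = List.of(
                new double[] {0.5, 1.5, 1.7},
                new double[] {2.2, 3.9},
                new double[] {1.1, 9.0});
        Histogram merged = analyze(chunks, 4, 0.0, 4.0);
        System.out.println(Arrays.toString(merged.counts));
    }
}
```

Because each worker owns its partial histogram, no locking is needed while filling; only the small merged result has to travel back to the client, which is what makes this pattern attractive when the data itself is too large to move.
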
Distributed Analysis System: Architecture
[Diagram labels, in order: JAS DataServers and CatalogServer; Network; ControlServers; Network; JAS Clients; Users]

JAS 2 – GRID interface (Tech-X)

JAS3 Overview
• A completely new version of JAS
• Design based on Application Shell, into which many (optional) modules can be plugged
  • Highly customizable for different application domains
    • HEP/Astrophysics/Other
    • DST analysis/Online Monitoring/GRID analysis
    • Experiment/User specific modules
  • Modules can be updated independently of shell
    • Possible to release bug fixes fast
• Includes support for programming in many languages
  • Scripting: Python, Pnuts, Dynamic Java, …
  • Command prompt
  • Java (compiled)
• Analysis (histograms, tuples, fitting) based on AIDA standard
• Not technically backwards compatible with JAS2
  • But migration is straightforward.

AIDA Overview
• AIDA = Abstract Interfaces for Data Analysis
• Covers key areas for data analysis
  • Histograms, Tuples, Fitting, Data Points, Plotting, Management
• Developed collaboratively at a series of workshops by groups at CERN, LAL, SLAC.
  • Next workshop: June 30-July 4, CERN
• Interfaces developed for C++ and Java (and maybe Python?)
• Several implementations/tools available
  • Anaphe/Lizard/LCG PI – CERN
  • Open Scientist – LAL
  • JAIDA/JAS/AIDAJNI – SLAC

JAS3 and AIDA
• JAS3 has adopted AIDA for analysis
  • AIDA allows us to leverage the experience and skill of other developers
  • AIDA is functionally more complete than the JAS2 analysis package
  • AIDA allows JAS to exchange data with other AIDA tools
  • AIDA provides a bridge to C++ programs (e.g. Geant4)
  • AIDA encourages creativity and innovation
• JAS3 HEP Analysis tools based on JAIDA
  • JAIDA = Java implementation of AIDA
  • JAIDA is part of the FreeHEP library
  • Usable as a standalone library for any Java application
• AIDAJNI = Interface between C++ and Java AIDA
  • Allows C++ programs to use JAIDA, JAS3

JAS3, AIDA and C++
[Diagram labels: C++ program; AIDA; C++ AIDA Implementation; AIDAJNI; Java program; AIDA; JAIDA; .aida file (XML); JAS3]

JAS3 and AIDA
• JAS3 supports all AIDA functionality, including
  • Histograms (includes arithmetic, projections, etc.)
  • Clouds (unbinned histograms, scatterplots)
  • Plotter
  • Tuples
  • Fitting – AIDA interfaces allow for multiple fitters
    • Uncmin: pure Java minimizer
    • Minuit: Fortran, called via the Java Native Interface (JNI)
  • IO
    • AIDA XML, PAW, Root
• JAS3 supports user interaction with AIDA in three ways
  • Scripting (Pnuts, Python etc.)
  • Compiled (Java) code
  • GUI – Plotting, Fitting, Cuts etc.

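Clouds are described above as unbinned histograms. A minimal sketch of why that is useful follows; it is a stand-alone illustration, not the AIDA or JAIDA API, and the class CloudSketch and its methods are hypothetical. Because every filled value is kept, statistics are exact and the binning can be chosen, or changed, after the data has been read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of an unbinned "cloud" that can be binned on demand. */
final class CloudSketch {
    private final List<Double> values = new ArrayList<>();

    /** Stores the raw value; no binning decision is made at fill time. */
    void fill(double x) {
        values.add(x);
    }

    int entries() {
        return values.size();
    }

    /** Exact mean of the stored values (NaN if the cloud is empty). */
    double mean() {
        if (values.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    /** Bins the stored values into equal-width bins over [low, high). */
    int[] toHistogram(int bins, double low, double high) {
        int[] counts = new int[bins];
        for (double v : values) {
            if (!(v >= low && v < high)) {
                continue; // out of range (or NaN)
            }
            int bin = (int) ((v - low) / (high - low) * bins);
            counts[Math.min(bin, bins - 1)]++; // guard against rounding
        }
        return counts;
    }

    public static void main(String[] args) {
        CloudSketch cloud = new CloudSketch();
        for (double x : new double[] {0.2, 0.4, 1.3, 2.8, 2.9}) {
            cloud.fill(x);
        }
        // The same data viewed with two different binnings.
        System.out.println(Arrays.toString(cloud.toHistogram(3, 0.0, 3.0)));
        System.out.println(Arrays.toString(cloud.toHistogram(6, 0.0, 3.0)));
    }
}
```

The trade-off is memory: a cloud grows with the number of entries, whereas a histogram has a fixed size, so keeping raw values is best suited to smaller data sets.
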
JAS3 Scripting
• JAS3 has multi-language OO scripting support
  • Command line, Console, Editor
  • Major components (e.g. AIDA) have scripting interfaces
• Currently have plugins to support
  • Pnuts – syntax almost identical to Java, fast, well documented and feature complete
  • Python (using Jython)
• More scripting languages can be added
  • Not restricted to Java implementations (e.g. could use C-Python, JPE)

JAS3 Lightning Tour
• Tour designed to give you an overview of the capabilities of JAS3; you can try them out for yourself this afternoon.
[Screenshot callouts: Welcome Page, which gives initial info and links to example scripts and programs; Memory monitor]

Opening Files
• Use file menu
• Drag from explorer

Graphical Interface to AIDA
• Histograms, Clouds, Tuples all presented in AIDA tree
• .aida files, .hbook files, .root files all presented as AIDA objects
• Drag items onto page, or use (popup) menus

Printing
• Can send individual plots or full page direct to printer
• Or save as PS, EPS, PDF, SWF, SVG, PNG, GIF…
• Or copy/paste into Word, PowerPoint etc.

Java Editor, Compiler and Loader
• Tree shows loaded programs
• Built-in editor for writing analysis code
• Built-in Java compiler
• Unlike JAS2, which only supported “event analyzers”, JAS3 allows any Java program to be loaded.
• This example “main routine” is taken directly from the AIDA manual

Scripting
• Can also write and run scripts
• Console allows direct interaction with scripting language

Pnuts Language
• Currently support Pnuts scripting language
  • Complete and well documented
    • http://javacenter.sun.co.jp/pnuts/doc/guide.html
  • Fast (although not as fast as compiled Java)
  • Syntax very similar to Java
  • Can easily call compiled Java classes from scripts – best of both worlds
• Plan to support other languages in future
  • In particular Python

Record Sources
• Opening record (or event) based files causes the run control toolbar to appear
• Works similarly to JAS2 Job control, but now also supports random access and “tagged” data sets (mainly for event displays)

Tuple Explorer – Plots
• Works with any tuple, read from file or dynamically created
• Plot types: Histogram, Profile, ScatterPlot, XY Data (more appropriate for smaller data sets)

Tuple Explorer – Define Columns

Tuple Explorer – Cuts

Tuple Explorer – Tabulate

Tuple Explorer – Record Source
• To be used with record loop

JAS3 Spreadsheet
• Simple spreadsheet plugin for
  • Displaying results
  • Calculations
  • Simple Plots
• Supports reading/writing
  • .csv files
  • Excel files
  • Cut/Paste with Excel etc.
• Coming Soon…
  • Scripting interface
  • GUI for building plots
  • User defined functions
    • Java, scripting

Miscellaneous Features
• Save/Restore configuration
• User Preferences
• Plugin Manager

Status
• Currently released JAS3 version 0.7.1
  • AIDA functionality is quite solid
  • Compiler, Loader, Record Loop all quite recently added
    • Certainly still some rough edges
• Documentation limited but available
  • Built-in example scripts and programs
  • Tutorial on web
• If you are used to JAS2 you will find some functionality not yet ported to JAS3
  • Remote (client/server) access to data
  • 3D Lego/Surface plots

JAS3 and the GRID
• We plan to add client-server/distributed capabilities to JAS3 similar to those in JAS2
  • Will be based on (distributed) AIDA
    • Next AIDA workshop (at CERN next week) will discuss this
  • Want to use Grid standards where they exist
    • Work with others (PPDG-CS11,???) to define standards where they do not exist
  • Want to be compatible with C++ servers
• Tech-X have submitted a phase II SBIR and, if it is approved, will work closely with us

JAS3 Links, More Info
• JAS (Java Analysis Studio) – http://jas.freehep.org
• JAS3 – http://jas.freehep.org/jas3
• JAIDA – http://java.freehep.org/jaida/
• AIDA – http://aida.freehep.org
• FreeHEP – http://www.freehep.org
• FreeHEP Java Libraries – http://java.freehep.org
• WIRED – http://wired.freehep.org