Creating a caGrid Data Service. caGrid Data Service backed by caCORE SDK 4.1.1. caGrid Knowledge Center knowledge@cagrid.org May 6, 2009. Introduction.
Creating a caGrid Data Service caGrid Data Service backed by caCORE SDK 4.1.1 caGrid Knowledge Center knowledge@cagrid.org May 6, 2009
Introduction This tutorial walks you through the process of creating a caGrid 1.3 Data Service to share data on the Grid. This data is stored in a database and accessed via the caCORE SDK 4.1.1 APIs. The model is the same model as used in the caCORE SDK 4.1.1 systems. For more details, refer to the other tutorials in the May 5-6, 2009 caBIG Developer Boot Camp. Prerequisites This tutorial has the following prerequisites: • caCORE SDK 4.1.1 is installed locally. • The SDK has been configured with the boot camp model. • Your caCORE installation has been configured with your data source information. • You have deployed the resulting web application to JBoss or Tomcat. • You have started your container to make the caCORE SDK web application available on port 8080. • The caGrid software has been installed on your host at c:\bootcamp\cagrid.
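Before starting, it can help to confirm that something is actually listening on port 8080. The following is a minimal sketch, not part of the boot camp materials: the class name PortCheck is hypothetical, and the host 127.0.0.1 assumes the container runs on the local machine. It only confirms that the port accepts TCP connections, not that the caCORE SDK web application itself is deployed.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of caGrid or the caCORE SDK.
final class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException ex) {
            // Connection refused, timed out, or the host could not be resolved.
            return false;
        }
    }

    public static void main(String[] args) {
        // 127.0.0.1 is an assumption (local container); 8080 comes from the prerequisites.
        boolean up = isReachable("127.0.0.1", 8080, 2000);
        System.out.println(up
                ? "Port 8080 is accepting connections."
                : "Nothing is listening on port 8080; start JBoss or Tomcat first.");
    }
}
```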
caGrid Software Overview The hosts provided for this Boot Camp have been preinstalled with the caGrid 1.3 software. caGrid and required software: • The caGrid 1.3 Installer: C:\bootcamp\cagrid\caGrid-installer-1.3 • caGrid 1.3: C:\bootcamp\cagrid\caGrid • Apache Ant: C:\bootcamp\cagrid\apache-ant-1.7.0 • Globus Toolkit for Java: C:\bootcamp\cagrid\ws-core-4.0.3 Additional software: • Notepad++, a free file editor: C:\bootcamp\cagrid\Notepad++ • Software Distributions: C:\bootcamp\cagrid\dist • Completed caGrid Boot Camp Data Service: C:\bootcamp\cagrid\BootCampDataSvc-Solution
caGrid Development Phases Generating a caGrid Data Service backed by the caCORE SDK involves the following phases: • Deploy a Secure Tomcat Container • Create the Data Service Skeleton • The Data Service caCORE SDK 4.1 Wizard • Set Service Metadata • Create a Test Class • Create Test Files • Deploying the Data Service • Executing the Tests • Add Security • Join the May Bootcamp Group • Re-run the tests
Phase 1: Deploy a Secure Tomcat Container (1) Begin Secure Tomcat Deployment These steps use a caGrid Installer that has been configured to use a local Tomcat installer rather than downloading Tomcat from the Apache web site, which avoids the time required for the download. Note: Your caGrid install has been configured to use the Community Training Grid. You must synchronize with the Community Training Grid trust fabric first. • Open a Windows Command Prompt: click “Start” -> “Run”, type “cmd”, then press the Enter key. • Synchronize with the Community Training Grid trust fabric: %> c:\bootcamp\cagrid\syncWithTrustFabric.bat • Change directory to the caGrid Installer: %> cd c:\bootcamp\cagrid\caGrid-installer-1.3 • Execute the provided installer batch file: %> installLocal.bat The caGrid Installer will launch.
Deploy a Secure Tomcat Container (2) • Select the "I agree to this license" checkbox and click the "Next" button.
Deploy a Secure Tomcat Container (3) • Select the "Install/Configure Grid Service Container" checkbox. De-select the “Install/Configure caGrid Software” checkbox. Click the "Next" button.
Deploy a Secure Tomcat Container (4) • Select Tomcat as the Container you would like to install. • Check the box "Should this container be secure?" and click the "Next" button.
Deploy a Secure Tomcat Container (5) • We will use the default hostname and ports identified by the Installer. Click the "Next" button. Note: In real-world usage, the hostname must be an externally routable, fully qualified name or IP address. For example, the Training Grid Master GTS external hostname is mastergts.training.cagrid.org and its internal, non-routable hostname is cagrid-1_3-training-master-gts.cagrid.org. We specified mastergts.training.cagrid.org.
Deploy a Secure Tomcat Container (6) Obtain Grid account and Host Certificate In order to deploy a secure container to the Grid you must have host certificates that have been created by the Dorian Service. Using these steps you will register an account and obtain host certificates using the GAARDS UI. • Select the “Use GAARDS to obtain host credentials” option to create credentials and click the "Next" button. • The GAARDS UI will automatically open.
Deploy a Secure Tomcat Container (7) Register with GAARDS Note: If you have an NCI account (or KC account), you can skip this step. 11. Click the Account Management menu item, then select Local Accounts -> Registration. 12. Provide the requested information, then click “Apply”. Note on password requirements: A valid password CANNOT contain a dictionary word and MUST contain at least one upper case letter, at least one lower case letter, at least one number, and at least one symbol (~!@#$%^&*()_-+={}[]|:;<>,.?)
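The character-class part of the password rule can be checked before you submit the form. The sketch below is illustrative only: the class and method names are hypothetical, it is not the check that Dorian or GAARDS performs, and it does not attempt the dictionary-word test, which would need a word list. The symbol set is copied from the note above.

```java
// Hypothetical helper; this is not the validation code used by Dorian or GAARDS.
final class PasswordRules {

    // Symbol set copied from the password note in this tutorial.
    private static final String SYMBOLS = "~!@#$%^&*()_-+={}[]|:;<>,.?";

    /**
     * Returns true if the password has at least one upper case letter, one lower
     * case letter, one digit, and one symbol from SYMBOLS.
     * The dictionary-word rule is NOT checked here.
     */
    static boolean hasRequiredCharacterClasses(String password) {
        boolean upper = false, lower = false, digit = false, symbol = false;
        for (char c : password.toCharArray()) {
            if (Character.isUpperCase(c)) {
                upper = true;
            } else if (Character.isLowerCase(c)) {
                lower = true;
            } else if (Character.isDigit(c)) {
                digit = true;
            } else if (SYMBOLS.indexOf(c) >= 0) {
                symbol = true;
            }
        }
        return upper && lower && digit && symbol;
    }

    public static void main(String[] args) {
        // Pass a candidate password as the first argument.
        if (args.length != 1) {
            System.out.println("Usage: PasswordRules <candidate>");
            return;
        }
        System.out.println(hasRequiredCharacterClasses(args[0])
                ? "Character classes OK (dictionary rule not checked)."
                : "Missing an upper case letter, lower case letter, number, or symbol.");
    }
}
```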
Deploy a Secure Tomcat Container (8) Login to the Community Training Grid 13. Click the "Login" button. 14. In the Login dialog, enter your User ID and Password and click the “Login” button. Note: if you are using your NCI/KC username and password, select the NCI organization. Otherwise, use the default of “Training”.
Deploy a Secure Tomcat Container (9) Request Host Certificate Host certificates are used to establish secure communications between clients and services. • Open the "Request Host Certificate" panel via the "My Account" menu and the "Request Host Certificate" menu item.
Deploy a Secure Tomcat Container (10) • Accept the Host name that GAARDS identifies. • Accept the default location for creating the host credentials. On Windows, this will be a path like "C:\Documents and Settings\Administrator\.cagrid\certificates". • Click Request Certificate. A dialog will display the outcome of your request. Note: If you receive an error (see the picture to the right), follow the steps on the next slides. After a successful request, you will see the “Host Certificate Issued” dialog to the lower right. Note the location of the certificate and key as shown in the dialog. • Close the GAARDS UI. • Click the “Next” button on the installer.
Host Certificate Request Error Workaround (1) If you receive the error to the right, type in a new hostname in the Request Host Certificate dialog. You can see the “host” text field to the right, where you should type a new hostname. After successfully retrieving a certificate, you will see the window to the right. Please note the file locations for your certificate and key files. You will use these next. Close GAARDS.
Host Certificate Request Error Workaround (2) Next, you will see the installer error below. Click OK. Click “Previous”. Select “Browse to host credential on the file system”. Click Next.
Host Certificate Request Error Workaround (3) Browse to the certificate and key. Click Next.
Deploy a Secure Tomcat Container (11) Finish Tomcat Installation 21. Please enter C:\bootcamp\cagrid in the "Directory" text box and click the "Next" button.
Deploy a Secure Tomcat Container (12) • The next screen will display a list of tasks that the installer will perform to install and configure Tomcat. Click the "Next" button. • Once the installer has finished installing all the components, click the "Next" button. • The final screen asks you to set the following environment variables: ANT_HOME, GLOBUS_LOCATION and CATALINA_HOME. These should already have been set by the Bootcamp administrators. • Click Finish. • Click Close.
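If you want to confirm that the three environment variables are visible to Java programs, a small check along the following lines may help. This is a sketch only: the class EnvCheck is hypothetical and not part of the caGrid Installer. It reports which of the variables named above are unset or blank and does not verify that the paths they point to exist.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the caGrid Installer.
final class EnvCheck {

    // Variable names taken from the final installer screen described above.
    static final String[] REQUIRED = {"ANT_HOME", "GLOBUS_LOCATION", "CATALINA_HOME"};

    /** Returns the required variable names that are absent or blank in env. */
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> missing = missing(System.getenv());
        if (missing.isEmpty()) {
            System.out.println("All required environment variables are set.");
        } else {
            System.out.println("Not set: " + missing);
        }
    }
}
```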
Phase 2: Create the Data Service Skeleton (1) This phase of the tutorial involves starting the Introduce toolkit and using it to create the skeleton of the new grid data service that will communicate with the caCORE SDK service that you have created. • Open a Windows Command Prompt: click “Start” -> “Run”, type “cmd”, then press the Enter key. • Change directory to the caGrid installation: %> cd c:\bootcamp\cagrid\caGrid • Start Introduce using the provided Ant task: %> ant introduce
Create the Data Service Skeleton (2) Create the Service Skeleton • Click the Create caGrid Service Skeleton button on the toolbar at the top of the Introduce portal. The Create caGrid Service Skeleton screen will appear (see right). • STEP 1: Type "C:\bootcamp\cagrid\BootCampDataSvc" as your service directory. • STEP 2: Type "BootCampDataSvc" in the Service Name field. • STEP 3: Type "gov.nih.nci.cagrid.bootcamp" in the Package Name field. • STEP 4: Verify that the Namespace field contains "http://bootcamp.cagrid.nci.nih.gov/BootCampDataSvc". • Click the “Data Service” radio button in the “Customize the Service” pane. • Click the “Create” button.
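The namespace you verify in STEP 4 looks like the package name with its segments reversed, followed by the service name. The sketch below reproduces that pattern so you can predict the expected value for other package names. Whether Introduce computes the default this way is an assumption based only on the values shown in this tutorial; the class and method names are hypothetical.

```java
// Hypothetical sketch; it mirrors the values shown in this tutorial and is
// not taken from the Introduce source code.
final class NamespaceSketch {

    /** Reverses the dot-separated package segments and appends the service name. */
    static String expectedNamespace(String packageName, String serviceName) {
        String[] parts = packageName.split("\\.");
        StringBuilder host = new StringBuilder();
        for (int i = parts.length - 1; i >= 0; i--) {
            host.append(parts[i]);
            if (i > 0) {
                host.append('.');
            }
        }
        return "http://" + host + "/" + serviceName;
    }

    public static void main(String[] args) {
        // Values from STEP 2 and STEP 3 of this slide.
        System.out.println(expectedNamespace("gov.nih.nci.cagrid.bootcamp", "BootCampDataSvc"));
    }
}
```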
Create the Data Service Skeleton (3) Select the Service Style The caGrid data services extension provides a pluggable framework, known as data service styles, for creating highly customized data services. Styles may be provided by a third party or installed with caGrid. caGrid includes styles for creating data services backed by various versions of the caCORE SDK. • When the Data Service Configuration window appears, use the drop down menu to select the caCORE SDK v 4.1 option. • Leave the check boxes for WS-Enumeration and Bulk Data Transport unchecked for this tutorial. • Click the OK button.
Phase 3: The Data Service caCORE SDK 4.1 Wizard (1) The caCORE SDK v. 4.1 Data Service style provides a wizard interface which is run prior to generation of the data service's source code. This wizard provides facilities to select and configure the client interface to the caCORE SDK, to select a domain model, and to map that domain model to XML schemas for use on the grid. The first panel of the wizard is strictly informational. The wizard displays the current step in the lower left hand corner of the window, and provides simple 'Previous' and 'Next' buttons to navigate the steps.
The Data Service caCORE SDK 4.1 Wizard (2) • Click the “Next: SDK Directory” button.
The Data Service caCORE SDK 4.1 Wizard (3) Selecting the caCORE SDK Directory The second step of the caCORE SDK v 4.1 Data Service Creation Wizard asks the service developer to select the directory in which the caCORE SDK system resides. Note: For the wizard to work, this caCORE SDK installation must have only one project in the “output” folder. • Click the “Browse” button at the top-right. • Use the file browsing dialog to select the “C:\bootcamp\SDK411-Solution\sdk411” directory. • Click “Next: API Type” to continue the wizard.
The Data Service caCORE SDK 4.1 Wizard (4) Choose the API Type There are two API types available when communicating with the caCORE SDK application. The Local API option requires that the grid data service be deployed on the same machine as the caCORE SDK application, which offers better query performance because the queries will not use the network. The Remote API option will work regardless of where the caCORE SDK application and grid data service are deployed (same machine or another machine). We will use the Remote API option.
The Data Service caCORE SDK 4.1 Wizard (5) Choose the API Type (continued) • Select the Remote API option from the API Type group. • Enter “127.0.0.1” as the Hostname. • Enter “8080” as the Port Number. • Leave the Use HTTPS option unchecked. • Click the “Next: Login” button.
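The three Remote API settings combine into the base address of the caCORE SDK container. The following sketch shows that combination only; the class RemoteApiUrl is hypothetical, and the application's context path is not given in this tutorial, so it is deliberately left out rather than guessed.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper; the wizard's own URL handling is not shown in this tutorial.
final class RemoteApiUrl {

    /** Builds scheme://host:port from the wizard's Hostname, Port Number, and Use HTTPS settings. */
    static String baseUrl(String host, int port, boolean useHttps) {
        String scheme = useHttps ? "https" : "http";
        try {
            // Path, query, and fragment are left empty on purpose.
            return new URI(scheme, null, host, port, null, null, null).toString();
        } catch (URISyntaxException ex) {
            throw new IllegalArgumentException("Invalid host or port: " + host + ":" + port, ex);
        }
    }

    public static void main(String[] args) {
        // Values entered in the wizard on this slide.
        System.out.println(baseUrl("127.0.0.1", 8080, false));
    }
}
```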
The Data Service caCORE SDK 4.1 Wizard (6) Log In to the caCORE SDK Application Step four allows the service developer to supply login information to the caCORE SDK Application Service in the form of a username and password combination. If these values are supplied, the caCORE SDK Application Service API will be initialized with them when the grid data service starts up. • Click “Next: Domain Model” as we will not use a login in this tutorial.
The Data Service caCORE SDK 4.1 Wizard (6) Select the Domain Model Step five of the wizard requires the service developer to supply the domain model which the new data service will expose to the grid. Domain Models define the classes, their attributes, and relationships such that data services may be discovered by the types they expose, and CQL queries can be formulated without a priori knowledge of an arbitrary data service's contents. There are three potential sources for the domain model: Default XMI: This setting directs the wizard to use the XMI model of the caCORE SDK system specified in step 2 as the domain model. The XMI will be converted to a domain model XML document. Pre-Generated: This option allows the service developer to specify a pre-generated domain model XML document from the local file system. caDSR: This option lets the service developer select a project and packages from the cancer Data Standards Repository (caDSR) and generate a domain model extract from it for use in their data service.
The Data Service caCORE SDK 4.1 Wizard (7) Select the Domain Model (continued) • For this tutorial, select Default XMI as the Domain Model Source. • Type “1.0” as the Project Version to create a caGrid version of your Domain Model. Verify the other fields are as shown in the picture below. • Click the “Next: Schemas” button.
The Data Service caCORE SDK 4.1 Wizard (7) Map XML Schema to Model Packages Every class included in the domain model must have a corresponding XML schema representation so it may be utilized in the grid. The mapping panel of this wizard streamlines this process by simultaneously generating the mapping from model to schema and configuring serialization of the XML data types to correspond to the domain model's Java beans. The table shows the following information: 1) packages included in the domain model, 2) the current mapping status of the package, and 3) a “Map Schema” button to manually resolve the mapping for each package. The Automatically Map From SDK Generated Schemas button at the bottom of the panel uses the XML schemas provided by the caCORE SDK's output to map each package of the domain model.
The Data Service caCORE SDK 4.1 Wizard (8) Map XML Schema to Model Packages (continued) • Click the Automatically Map From SDK Generated Schemas button to perform the mapping from domain model packages to XML schema. • Verify that the “Status” field changes from “No Schema Assigned” to “OK”. • Click Done to complete the wizard.
The Data Service caCORE SDK 4.1 Wizard (9) Save your Service! You will see a progress bar during creation of your caCORE backed caGrid Data Service. • Click the “Save” button at the bottom of the "Modify Service Interface".
Phase 4: Set Service Metadata (1) Service Metadata contains information about the developer of the service, the maintainer or administrator of the service and the hosting research center as well as optional information such as a web site URL. This information is used when registering your service with the Index Service for the Grid and is accessible via the caGrid Portal. In order to deploy your Grid service via Introduce or the command line you will need to set the service metadata.
Set Service Metadata (2) Open the Introduce Service Metadata Resource Property Editor • In Introduce, click the Services tab of your Grid service. • Underneath "BootCampDataSvc", click the ServiceMetadata Resource Property. • Click the Edit Resource Property button on the right.
Set Service Metadata (3) The Resource Property Editor The Resource Property Editor allows you to define your service metadata. Required fields will be identified by a pink background and a red "x".
Set Service Metadata (4) Hosting Center and Point of Contact The Hosting Center and Point of Contact fields identify where the service is deployed and an individual responsible for the service, such as an administrator. Type “National Cancer Institute” in the Display Name field. Type “NCI” in the Short Name field. Click “<unspecified>” in the Current Points of Contact. Type your first name in the First Name field. Type your last name in the Last Name field. Type your email in the Email address field. Type your Affiliation (for example, your department). Select the “Maintainer” Role.
Set Service Metadata (5) Hosting Center Address Provide the address of your Hosting Center. For this tutorial we'll use the NCI address: 2115 E. Jefferson, Rockville, MD 20852. Click the Address tab. Enter “2115 E. Jefferson” in the Street 1 field. Leave Street 2 empty. Enter “Rockville” in the Locality (City) field. Enter “MD” as the 2-character State abbreviation. Enter “20852” as the Zip Code. Enter "US" as the 2-character Country Code.
Set Service Metadata (6) Hosting Center Additional Information This tab provides you with the ability to supply metadata that allows users to find more information about you. All fields are optional.
Set Service Metadata (7) Service Information Point of Contact The Service Information tab provides information about a POC for the Service itself. Click the “Service Information” tab. Click “<unspecified>” in the Current Points of Contact. Type your first name in the First Name field. Type your last name in the Last Name field. Type your email in the Email address field. Type your Affiliation (for example, your department). Select the “Maintainer” Role. Click “Done” to finish editing metadata. Note: If you have missed a required field, you will get an error message.
Set Service Metadata (8) Save Your Service! Remember to save your service. Introduce will allow you to roll back to a save point. This can be very useful in the event that you run into problems. Click the Save button at the bottom of the "Modify Service Interface"
Phase 5: Create a Test Class (1) Once the tutorial caGrid data service has been created and deployed to a service container, it may be invoked by a grid service client. This portion of the tutorial covers creating the test client Java class that will be used to query our caCORE SDK application. Note: Introduce-generated services are designed to work with the Eclipse development environment and can be imported into it easily. For this tutorial we will use a more lightweight editor, Notepad++, provided in the c:\bootcamp\cagrid\Notepad++ directory.
Create a Test Class (2) Open the Base Test Class File To begin making use of the tutorial data service, we'll need a place to put source code, as well as a Java source file to make use of the generic data service client and handle our query results. While the client class produced with the service itself can be used for very basic testing, domain specific logic should never be placed in the client when used in a production level system. This is because the client may be regenerated, or have methods added and removed by the Introduce toolkit as changes are made to the service model. • Open a Windows Command Prompt: click “Start” -> “Run”, type “cmd”, then press the Enter key. • Execute the makeQueryRunner.bat file: %> c:\bootcamp\cagrid\makeQueryRunner.bat
Create a Test Class (3) Edit the Test Class file • Open Windows Explorer and navigate to C:\bootcamp\cagrid\Notepad++ • Double-click "Notepad++.exe" • Open the file: C:\bootcamp\cagrid\BootCampDataSvc\src\gov\nih\nci\cagrid\bootcamp\test\QueryRunner.java
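The location of QueryRunner.java follows the usual Java convention that a class's package maps to a directory path under the source folder. The sketch below shows that mapping for the file opened above. The class SourcePath is hypothetical, the package name is inferred from the path rather than stated in the tutorial, and the backslash separator is hard-coded because the tutorial hosts run Windows.

```java
// Hypothetical helper illustrating the package-to-directory convention.
final class SourcePath {

    /** Maps a fully qualified class name to its .java file under srcDir (Windows separators). */
    static String toSourceFile(String srcDir, String qualifiedClassName) {
        return srcDir + "\\" + qualifiedClassName.replace('.', '\\') + ".java";
    }

    public static void main(String[] args) {
        // The package name below is inferred from the file path in this tutorial.
        System.out.println(toSourceFile(
                "C:\\bootcamp\\cagrid\\BootCampDataSvc\\src",
                "gov.nih.nci.cagrid.bootcamp.test.QueryRunner"));
    }
}
```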
Create a Test Class (4) Create the Class Constructor Now we need to add some implementation to the test class. Start by creating a very simple class constructor for the QueryRunner class. Copy and paste the following code into the new Java file, immediately after the class declaration and opening brace (i.e., after “public class QueryRunner {”):

    private String serviceUrl;
    private String queryFilename;

    public QueryRunner(String serviceUrl, String queryFilename) {
        this.serviceUrl = serviceUrl;
        this.queryFilename = queryFilename;
    }
Create a Test Class (5) Import Required Classes and implement the performQuery method Now, a method must be added to the class which can handle calling the data service. To make code from caGrid available in this class, some import statements must be added. At the top of the class file, below the package declaration and before the class declaration, add the following:

    import java.util.Iterator;
    import gov.nih.nci.cagrid.common.Utils;
    import gov.nih.nci.cagrid.cqlquery.CQLQuery;
    import gov.nih.nci.cagrid.cqlresultset.CQLQueryResults;
    import gov.nih.nci.cagrid.data.client.DataServiceClient;
    import gov.nih.nci.cagrid.data.utilities.CQLQueryResultsIterator;
Create a Test Class (6) Add the following method after the main() method:

    private void performQuery() {
        try {
            // initialize the generic data service client
            DataServiceClient client = new DataServiceClient(serviceUrl);
            // deserialize the CQL query
            CQLQuery query = (CQLQuery) Utils.deserializeDocument(queryFilename, CQLQuery.class);
            System.out.println("Querying");
            // execute the query on the data service
            CQLQueryResults results = client.query(query);
            System.out.println("Iterating");
            // create a results iterator
            Iterator iter = new CQLQueryResultsIterator(results, true);
            while (iter.hasNext()) {
                // iterate and print XML
                String value = (String) iter.next();
                System.out.println("-- RESULT --");
                System.out.println(value);
            }
            System.out.println("Done");
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
Create a Test Class (7) Modify the main() method with the following content:

    public static void main(String[] args) {
        QueryRunner runner = new QueryRunner(args[0], args[1]);
        runner.performQuery();
    }

Note: C:\bootcamp\cagrid\dist\QueryRunner-final.java contains the complete file. You can copy this over to QueryRunner.java if you so choose.
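The main() method above reads args[0] and args[1] directly, so running it with fewer than two arguments fails with an ArrayIndexOutOfBoundsException. If you prefer a clearer message, a guard along these lines could be called before constructing the QueryRunner. This is an optional sketch and not part of the tutorial's solution file; the class ArgsCheck and its usage text are hypothetical.

```java
// Hypothetical guard; not part of QueryRunner-final.java.
final class ArgsCheck {

    /** Returns an error message if the arguments are unusable, or null if they look fine. */
    static String usageError(String[] args) {
        if (args == null || args.length != 2) {
            return "Usage: QueryRunner <serviceUrl> <queryFilename>";
        }
        if (args[0].trim().isEmpty() || args[1].trim().isEmpty()) {
            return "The service URL and query file name must not be empty.";
        }
        return null;
    }

    public static void main(String[] args) {
        String error = usageError(args);
        if (error != null) {
            System.err.println(error);
            return;
        }
        // In QueryRunner, this is where the runner would be created and performQuery() called.
        System.out.println("Service URL: " + args[0] + ", query file: " + args[1]);
    }
}
```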
Create a Test Class (8) Save the file. • Select “Save” from the File menu or type Ctrl+S
Phase 6: Create Test Files (1) The Test Files To make use of the QueryRunner test class, we'll need CQL queries, an Ant script, and two convenience batch files. These files will be used to execute queries against the data service developed earlier. Copy the prepared test files into your BootCampDataSvc directory: • Open Windows Explorer • Double-click the file c:\bootcamp\cagrid\copyTestFiles.bat The Organism Query (C:\bootcamp\cagrid\BootCampDataSvc\organismCQLQuery.xml) The following query will return a list of the organisms stored in the Organism database table and defined by our Data Model.

    <ns1:CQLQuery xmlns:ns1="http://CQL.caBIG/1/gov.nih.nci.cagrid.CQLQuery">
        <ns1:Target name="gov.nih.nci.training.BootCamp.domain.Organism">
        </ns1:Target>
    </ns1:CQLQuery>
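To see which class a CQL query file targets without running it against the service, you can read the name attribute of its Target element with the standard Java XML parser. The sketch below does this for the Organism query shown above. It uses only JDK classes and not the caGrid Utils.deserializeDocument call from QueryRunner; the class CqlTargetReader is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper using only the JDK's XML parser, not the caGrid API.
final class CqlTargetReader {

    // Namespace URI copied from the query document in this tutorial.
    static final String CQL_NS = "http://CQL.caBIG/1/gov.nih.nci.cagrid.CQLQuery";

    /** Returns the name attribute of the first Target element, or null if none is present. */
    static String targetName(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getElementsByTagNameNS
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList targets = doc.getElementsByTagNameNS(CQL_NS, "Target");
            if (targets.getLength() == 0) {
                return null;
            }
            return ((Element) targets.item(0)).getAttribute("name");
        } catch (Exception ex) {
            throw new IllegalArgumentException("Not a well-formed CQL query document", ex);
        }
    }

    public static void main(String[] args) {
        String query =
                "<ns1:CQLQuery xmlns:ns1=\"" + CQL_NS + "\">"
              + "<ns1:Target name=\"gov.nih.nci.training.BootCamp.domain.Organism\">"
              + "</ns1:Target>"
              + "</ns1:CQLQuery>";
        System.out.println("Query target: " + targetName(query));
    }
}
```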