Web sites vs Web services

Many astronomical Web sites already exist. They provide access to archives of object information, collections of catalogs, and databases of images and spectra. These Web sites are designed mainly with a human user in mind: information is entered manually via Web forms, and query results are presented as HTML or graphics.
To integrate access to archives into a more automated data processing flow, data must be transported from application to application in an efficient manner. Web services are created for this purpose. While Web sites are designed for human consumption, Web services enable machines to communicate with other machines without human intervention.


VO services are Web services that can:

VO services provide a consistent interface to astronomical information by following standards that define the data exchange format. Web services, and therefore VO services, rely on XML, a flexible and easy-to-parse language for describing the information to be transported from application to application. By using XML and related standards, data-consuming applications can dynamically discover the interfaces and services offered by data-producing applications with minimal human interaction.

The VO Registry

Demonstration: Keyword search, Advanced search

A VO Registry is a VO service that provides information on available VO services and other related applications; it is the yellow-pages directory for VO services. Several registries are currently available, and they can themselves be found by querying a registry. In theory, all registries contain the same information, because they mirror each other and propagate the newest changes. They should also implement the same Web service interface.
The VO Registry implementation at STScI (http://nvo.stsci.edu/voregistry/index.aspx) features not only a SOAP interface, but also HTTP-GET and HTTP-POST implementations. The Registry also provides a WSDL (http://nvo.stsci.edu/VORegistry/registry.asmx?WSDL) that defines the syntax of methods and parameters.

There are two ways to query the VO Registry: via HTTP-GET/POST or via SOAP.


When using HTTP-GET or HTTP-POST, the querying program acts as a web browser. Conversely, any browser can act as a client program. Ignoring the HTTP part for a moment, the result is an XML file that starts like this:

<?xml version="1.0" encoding="utf-8"?>
<ArrayOfResource xmlns="http://www.us-vo.org">
  <Resource created="date" updated="date" status="active or inactive or deleted">
    <title xmlns="http://www.ivoa.net/xml/VOResource/v0.10">string</title>
    <shortName xmlns="http://www.ivoa.net/xml/VOResource/v0.10">string</shortName>
    <identifier xmlns="http://www.ivoa.net/xml/VOResource/v0.10">anyURI</identifier>

For example, try this link with your browser:
http://nvo.stsci.edu/VORegistry/registry.asmx/KeywordSearch?keywords=chandra+einstein&andKeys=True

Client programs must parse the XML and extract the desired information: for example, the URLs of VO Services that match the search criteria. The HTTP-GET/POST method does not provide any special mechanism to handle XML content. It is up to the client to convert the XML structure into a more useful form.
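
The parsing step described above might look like the following minimal sketch, which uses only the Python standard library. The function names (keyword_search_url, parse_resources) and the sample response are illustrative assumptions; the sample is modeled on the XML fragment shown earlier and is not a real registry response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Endpoint taken from the example link in the text.
KEYWORD_SEARCH = "http://nvo.stsci.edu/VORegistry/registry.asmx/KeywordSearch"

# The prefixes are our own choice; the namespace URIs come from the XML fragment.
NS = {
    "reg": "http://www.us-vo.org",
    "vr": "http://www.ivoa.net/xml/VOResource/v0.10",
}


def keyword_search_url(keywords, and_keys=True):
    """Build the HTTP-GET URL for a keyword search."""
    query = urlencode({"keywords": " ".join(keywords), "andKeys": str(and_keys)})
    return KEYWORD_SEARCH + "?" + query


def parse_resources(xml_bytes):
    """Turn an ArrayOfResource document into a list of plain dictionaries."""
    root = ET.fromstring(xml_bytes)
    records = []
    for res in root.findall("reg:Resource", NS):
        record = {"status": res.get("status")}
        for field in ("title", "shortName", "identifier"):
            text = res.findtext("vr:" + field, default="", namespaces=NS)
            record[field] = text.strip()
        records.append(record)
    return records


# Hypothetical response, so the sketch runs without network access.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<ArrayOfResource xmlns="http://www.us-vo.org">
  <Resource created="2005-01-01" updated="2005-06-01" status="active">
    <title xmlns="http://www.ivoa.net/xml/VOResource/v0.10">Example Catalog</title>
    <shortName xmlns="http://www.ivoa.net/xml/VOResource/v0.10">EXCAT</shortName>
    <identifier xmlns="http://www.ivoa.net/xml/VOResource/v0.10">ivo://example/excat </identifier>
  </Resource>
</ArrayOfResource>"""

print(keyword_search_url(["chandra", "einstein"]))
for record in parse_resources(SAMPLE):
    print(record["shortName"], record["identifier"])
```

In a real client, the bytes passed to parse_resources would come from fetching the URL, for example with urllib.request.urlopen(url).read().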

With SOAP, the situation is different. The underlying SOAP library (assuming we use one) hides the communication with the server, the XML parsing, and the error handling. This abstraction layer also converts the returned XML into native data structures that the client program can process easily. This is in contrast to the HTTP-GET/POST methods, where the client program handles the XML directly.
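
To make concrete what such a library hides, the sketch below builds a SOAP 1.1 request envelope by hand. The operation name KeywordSearch and its two parameters are taken from the HTTP-GET example above; placing them in the http://www.us-vo.org namespace is an assumption, and the authoritative value should be read from the WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://www.us-vo.org"  # assumption: verify against the WSDL


def build_keyword_search_envelope(keywords, and_keys=True):
    """Return the bytes of a SOAP 1.1 request for a KeywordSearch call."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}KeywordSearch")
    ET.SubElement(call, f"{{{SERVICE_NS}}}keywords").text = keywords
    # XML Schema booleans are written in lower case.
    ET.SubElement(call, f"{{{SERVICE_NS}}}andKeys").text = "true" if and_keys else "false"
    return ET.tostring(envelope, encoding="utf-8")


request = build_keyword_search_envelope("chandra einstein")
print(request.decode("utf-8"))
```

A SOAP library would additionally POST this document with a SOAPAction header, detect fault responses, and unpack the reply into native objects, which is why building envelopes by hand is rarely worthwhile.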

In either case, using HTTP-GET/POST or SOAP, with or without handling XML directly, one needs to know the semantics of the content, its fields (tags) and its values. The WSDL defines only the syntax of the data; human interpretation is still needed to determine its meaning. Fortunately, thanks to the verbosity of XML and WSDL, plain text descriptions and other supporting materials can be embedded, so that the XML/WSDL files become self-documenting. This may lead to a more automated data exchange with less human interaction.
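
As an illustration of how embedded descriptions can be read by a program, the sketch below lists the operations in a WSDL 1.1 document together with their documentation text. The WSDL fragment, its target namespace, and the operation names other than KeywordSearch are hypothetical; the real document is the one served by the registry.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}  # WSDL 1.1 namespace

# Hypothetical WSDL fragment, used so the sketch runs without network access.
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             targetNamespace="http://example.org/registry">
  <portType name="RegistrySoap">
    <operation name="KeywordSearch">
      <documentation>Find resources matching the given keywords.</documentation>
    </operation>
    <operation name="QueryResource">
      <documentation>Find resources matching a predicate.</documentation>
    </operation>
    <operation name="Undocumented"/>
  </portType>
</definitions>"""


def list_operations(wsdl_text):
    """Map each portType operation name to its embedded documentation text."""
    root = ET.fromstring(wsdl_text)
    operations = {}
    for op in root.findall("wsdl:portType/wsdl:operation", WSDL_NS):
        doc = op.findtext("wsdl:documentation", default="", namespaces=WSDL_NS)
        operations[op.get("name")] = doc.strip()
    return operations


for name, doc in list_operations(SAMPLE_WSDL).items():
    print(f"{name}: {doc or '(no description)'}")
```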

One way to understand the interface is to generate data types and functions (classes and methods) by using one of the conversion tools, for example, wsdl2java ($NVOSS_HOME/java/bin) and wsdl2py (not provided). Using the method stubs that these tools generate, users do not have to know about the underlying mechanics of networking and XML parsing.
Sample programs showing how to query the VO-Registry can be found at:

Example:

python VORegistry.py keywordSearch chandra einstein


This keywordSearch example displays only a few fields. The student can experiment with the program and add other fields. But how do we find out the field names?


python VORegistry.py printFields chandra


This example is similar to the last one, but displays all the fields of the first entry. Note that the result is from KeywordSearch, which means that it is a VOResource. Also note that some fields have subfields.
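
Listing the fields and subfields of an entry amounts to walking the XML tree. The following is a minimal sketch of that idea, not the actual VORegistry.py implementation; the sample entry and its curation/publisher subfield are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def field_names(element, prefix=""):
    """Return dotted names of all fields below element, including subfields."""
    names = []
    for child in element:
        name = prefix + local_name(child.tag)
        names.append(name)
        # Recurse so that subfields appear as parent.child.
        names.extend(field_names(child, name + "."))
    return names


# Hypothetical registry response with one entry.
SAMPLE = """<ArrayOfResource xmlns="http://www.us-vo.org">
  <Resource status="active">
    <title xmlns="http://www.ivoa.net/xml/VOResource/v0.10">Example Catalog</title>
    <curation xmlns="http://www.ivoa.net/xml/VOResource/v0.10">
      <publisher>Example Observatory</publisher>
    </curation>
  </Resource>
</ArrayOfResource>"""

first_entry = ET.fromstring(SAMPLE)[0]
for name in field_names(first_entry):
    print(name)
```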

The field names are listed on the VO Metadata page: http://nvo.stsci.edu/VORegistry/ListColumns.aspx.

Using the field names provided by the link, one can make a more advanced query.
See Advanced Query: http://nvo.stsci.edu/VORegistry/QueryRegistry.aspx?advanced=true&startRes=-1

Example:


python VORegistry.py queryResource "ResourceType = 'CONE' and contentlevel like '%research%'"


The queryResource example displays, among other things, the service type and the service URL, which points to the VO service that provides the actual data. The most common values of ResourceType are CONE, SIAP, SKYNODE and TABULARSKYSERVICE, where SIAP types can be SIAP/CUTOUT or SIAP/ARCHIVE.

Another field of interest is CoverageSpectral, which can contain one or more of the following values:
The field CoverageSpatial could provide valuable information that enables applications to check whether the service covers the area of interest, but unfortunately it is not well populated.

Using the service URL provided by the VO Registry, one can proceed to query the VO service. At this point, we know only the type and the spectral coverage of the service; additional information must be retrieved from the service itself. A special Cone Search query with the search radius set to 0 degrees can be used to retrieve information on the table columns, which must include ID and UCD. Other properties, such as a name (for presentation purposes) and plain text descriptions, are optional.
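
A zero-radius Cone Search and the extraction of column information from the returned VOTable could be sketched as follows. The service URL, the RA and DEC values (both set to 0 here, since only the radius matters), and the sample VOTable are assumptions for illustration; FIELD elements are matched by local name so that the sketch works whether or not the VOTable declares a namespace.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def cone_metadata_url(service_url):
    """Append a zero-radius cone search to a service URL from the registry."""
    if service_url.endswith(("?", "&")):
        separator = ""
    else:
        separator = "&" if "?" in service_url else "?"
    return service_url + separator + urlencode({"RA": 0, "DEC": 0, "SR": 0})


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def table_columns(votable_text):
    """List the FIELD elements of a VOTable as dictionaries."""
    columns = []
    for element in ET.fromstring(votable_text).iter():
        if local_name(element.tag) != "FIELD":
            continue
        description = next(
            (c.text or "" for c in element if local_name(c.tag) == "DESCRIPTION"), ""
        )
        columns.append({
            "ID": element.get("ID"),
            "ucd": element.get("ucd"),
            "name": element.get("name"),  # optional, None when absent
            "description": description.strip(),
        })
    return columns


# Hypothetical metadata-only response.
SAMPLE_VOTABLE = """<VOTABLE version="1.1">
  <RESOURCE>
    <TABLE>
      <FIELD ID="id" name="Object ID" ucd="ID_MAIN" datatype="char" arraysize="*">
        <DESCRIPTION>Object identifier</DESCRIPTION>
      </FIELD>
      <FIELD ID="ra" ucd="POS_EQ_RA_MAIN" datatype="double"/>
      <FIELD ID="dec" ucd="POS_EQ_DEC_MAIN" datatype="double"/>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

print(cone_metadata_url("http://example.org/cone?catalog=demo"))
for column in table_columns(SAMPLE_VOTABLE):
    print(column["ID"], column["ucd"], column["name"] or "(unnamed)")
```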

Similarly, SIAP services accept the option 'FORMAT=METADATA', which produces a VOTable with only metadata.

SkyNodes offer a different interface. The operations Tables, Columns and Column can be used to retrieve meta information.
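
A client that walks through registry results needs to pick the metadata request that matches each service type. The sketch below summarizes the three cases described above; the function names and the return convention are illustrative assumptions, and service types for which the text gives no metadata mechanism are rejected.

```python
def append_query(service_url, query):
    """Append a query string to a base URL that may already contain one."""
    if service_url.endswith(("?", "&")):
        return service_url + query
    return service_url + ("&" if "?" in service_url else "?") + query


def metadata_request(resource_type, service_url):
    """Describe how to ask a service of the given type for its metadata.

    Returns ("GET", url) for HTTP-based services, or
    ("SOAP", operation_names) for SkyNodes.
    """
    kind = resource_type.upper()
    if kind == "CONE":
        # Zero search radius; RA and DEC are arbitrary placeholders.
        return ("GET", append_query(service_url, "RA=0&DEC=0&SR=0"))
    if kind.startswith("SIAP"):
        # Covers SIAP, SIAP/CUTOUT and SIAP/ARCHIVE.
        return ("GET", append_query(service_url, "FORMAT=METADATA"))
    if kind == "SKYNODE":
        return ("SOAP", ["Tables", "Columns", "Column"])
    raise ValueError("no metadata query known for type " + resource_type)


print(metadata_request("CONE", "http://example.org/cone?catalog=demo"))
print(metadata_request("SIAP/CUTOUT", "http://example.org/siap?"))
print(metadata_request("SKYNODE", "http://example.org/skynode"))
```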

Shui Hung Kwok
NVO Summer School 2005