NISO Z39.50 Survey

NISO Z39.50 Survey: Summary of Responses. Compiled by the Z39.50 Maintenance Agency, Fall 2000

Questions (summary) • Problems • ASN.1 or XML? • Compatibility with Version 3? 2? • Other Comments

Problems • Based on your experience what are the major problems with the Z39.50 standard and what is required to address them?

ASN.1 or XML • Would you like to see Z39.50 restructured to not require ASN.1 syntax and BER encoding? Would you like to see Z39.50 messages described by and encoded in XML, and transported via web protocols (such as SOAP)?

Compatibility/Version • Would your organization implement a new version of Z39.50 that was not compatible with version 3? With version 2?

Following are responses. First: "Problems" (22 slides)

Problems • Poor quality implementations which offer inferior search facilities to a system's native/proprietary interface. • Need better marketing of the benefits, make Z39.50 more accessible/understandable. • Lack of common profiles (being addressed). • The vast array of options available to implementers is probably the biggest hurdle to implementation; profiles (like Bath) are a vital solution. • Servers may not always be available, addresses may have changed, or other connection information may be required. Need a central database of connection information.

Problems • Lack of uniform implementation. Tighter profiling, like development of the Bath Profile, will help with this. Impeding this effort, however, is the recent discussion of other, competing profiles. • Lack of a defined relationship between the query access points (bib1) and the structure of the records returned in Present Service. Standard provides no way for generating a new query based on the contents of a returned record.

Problems • Web model enables documents to be requested through a URL which results in the display of a document which, in turn, shows new URLs as hyperlinks; which leads to a natural cycle of browse and navigate. There can be no corresponding closure in Z39.50 because the query access points are not derived from an underlying document model. I cannot easily say "show me all the records whose author is the same as the author of this one" because the Author access point cannot be reliably discerned from the record syntax of the displayed record.

Problems • This separation was deliberate in order to enable databases with different physical structures to support a common query syntax, but it left the access points floating and they have never been satisfactorily structured. They need to be derived from a document model hierarchy to which physically disparate databases could be mapped. • The lack of closure in Z39.50 is a barrier to extensibility. Displaying a single record in Z39.50 is a protocol dead end, in contrast to WWW, which invites the user to use a displayed document to progress further.
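The "closure" this respondent asks for can be illustrated with a minimal sketch. It assumes an explicit mapping from record fields to access points, which is exactly what the response says the standard does not define. The mapping table, the function name, and the sample record are hypothetical; the Use attribute values (1003 Author, 4 Title, 21 Subject heading) are taken from the Bib-1 attribute set, and the output uses the prefix query notation of the YAZ toolkit.

```python
# Hypothetical mapping from MARC tags to Bib-1 Use attribute values.
# The standard itself defines no such mapping; this table is an assumption.
MARC_TO_BIB1_USE = {
    "100": 1003,  # main entry, personal name -> Author
    "245": 4,     # title statement -> Title
    "650": 21,    # topical subject -> Subject heading
}


def follow_up_query(record: dict, tag: str) -> str:
    """Build a prefix-notation query that searches for the value of one
    field of a displayed record."""
    use = MARC_TO_BIB1_USE[tag]
    term = record[tag].replace('"', '\\"')  # escape embedded double quotes
    return f'@attr 1={use} "{term}"'


# Illustrative record, reduced to one value per tag.
record = {"100": "Melville, Herman", "245": "Moby Dick", "650": "Whaling"}

# "Show me all the records whose author is the same as the author of this one."
print(follow_up_query(record, "100"))
```

With such a table in place a displayed record is no longer a dead end, but every client would have to agree on the table, which is the gap the response identifies.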

Problems • Difference in implementation from Library Automation vendors as well as differences in what version of Z39.50 each vendor supports. Some of the more interesting possibilities in Z39.50 are not supported by many vendors (like holdings). • Too many concepts. Schemas, Tag Sets, Attribute Sets, ... Separation of physical from logical is wonderful. But surely it can be achieved with fewer semantic concepts. • Explain is too complex and inflexible, and does not address the real need. It cannot be easily used to connect to a remote server and enable someone to search it unless it supports concepts the client understands.

Problems • It is almost impossible to write a generic client that can connect to any server and do queries and get results back. The client has to understand the same concepts as the server, and there are many of them. • It is hard to add new parameterised query operators due to the rigid structure of RPN queries. • Uneven server capabilities. Needs better error processing, increased ability to deal with multimedia, federated database capability?

Problems • We wish Z39.50 would cover more attributes, and allow some local customization as well. • There are no major problems with the protocol itself. The language of the standard puts readers off, as well as the size and the number of options. The latter is a profiling issue. Another obstacle is the lack of standards for indexes; perhaps the new Texas effort will make some inroads.

Problems • Non-Roman text. We download RLIN MARC records with Hebrew 880's into our Innovative OPAC using Z39.50; the records display the Hebrew characters correctly. However, similar records downloaded from LC into our OPAC display strings of Roman characters that seem to be a different code for the Hebrew. I assume this is because the vendors have chosen different ways to encode Hebrew text.

Problems • In addition, searching Hebrew characters through Z39.50 does not work. RLIN clearly indexes these fields (one can search them in Hebrew using the RLIN software), but searches using Hebrew characters do not work for RLIN through the Z39.50 connection in our OPAC. (Does not work for the Z39.50 connection to the Library of Congress through the Web either.) • Don't know if Z39.50 itself has anything to do with choices for encoding non-Roman languages, but it should include provisions that would allow it to handle non-Roman text regardless of vendor.

Problems • The mainstream encoding schemes -- BER and MARC -- cannot be entered directly from keyboards or rendered directly onto displays without special software. This discourages widespread use, application development, and adoption. • Z39.50 documentation is encumbered with excessive formalism. This includes reliance on OSI terminology (e.g., why say Protocol Data Unit and Service Element when you can say Command and Response instead?), the formulation of the query as "RPN" (very confusing and ultimately very irrelevant in parsing ASN.1/BER), …..

Problems • ….. the much-discussed but impoverished result set model, and the unnecessary reliance on ASN.1 (Z39.50 uses only strings and a few integers -- not enough to warrant the formalism). This discourages learning, understanding, hence use and adoption. • Vendors do not implement the standard in the same way and/or they limit what is implemented from the standard. The complexities of different vendor software do not allow the user to establish relationships with a variety of sources. ……..

Problems • ….. Limitation of fields that a vendor supports - e.g. Endeavor doesn't support all of the Bib-1 attribute set and they don't support the ability to search authority files. There should be more mandatory requirements within the standard so that all Z39.50 servers support the same features. • Inability to limit by the three character MARC language code, the inability to do filtering, qualifying, or post-searching of results, and there is no support of the complete MARC21 character set for diacritics and special characters. …..

Problems • The diagnostic (error) messages supplied by the vendor and the standard are rather cryptic to the user; these messages should be more user friendly - e.g. 'Too many records retrieved' should be translated to 'There is a limit of 10,000 records which can be retrieved'. • Indexing differences. Least common denominator indexing is not the answer. LMSs should support multiple indexing schemes: one for Z39.50 access, one for richer, local access.
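The translation suggested in the first response above could be handled on the client side with a small lookup. This is only a sketch: the table, the function name, and the `limit` parameter are hypothetical, and the one entry comes from the respondent's own example rather than from any vendor's message catalogue.

```python
# Hypothetical table of friendlier wordings, keyed by the diagnostic text
# a server returns. The single entry is the respondent's own example.
FRIENDLY_MESSAGES = {
    "Too many records retrieved":
        "There is a limit of {limit:,} records which can be retrieved",
}


def friendly_message(diagnostic: str, limit: int = 10_000) -> str:
    """Return a user-facing wording for a diagnostic, falling back to the
    original text when no friendlier wording is known."""
    template = FRIENDLY_MESSAGES.get(diagnostic)
    if template is None:
        return diagnostic
    return template.format(limit=limit)


print(friendly_message("Too many records retrieved"))
print(friendly_message("Some other diagnostic"))
```

The limit is passed in rather than hard-coded because it is a property of the server being searched, not of the message.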

Problems • Version 3 seems to have many features not yet implemented by the majority of systems, which would make them difficult, if not impossible to test. The benefit of the Z profiles seems to be that they specify exactly which features a group of libraries will use from the standard. The profiles also specify which of the standard's features need to be supported in library systems. We are anxious to implement features which will help our customers.

Problems • Lack of uniform interpretation of search attributes. This problem is related to both semantics (lack of adherence to prescribed semantics coupled with lack of clear semantics) and lack of indexing standards. For the semantics part, the effort to define a bib-2 attribute set to eventually replace bib-1, cast within the new attribute architecture, seems a step in the right direction. In addition, profiling efforts like Bath seem to be very useful. …..

Problems • With respect to indexing: this problem will never be completely solved without indexing standards, and it is not clear that indexing standards are achievable; however, the Texas initiative on indexing standards is certainly worth pursuing. • The standard is misunderstood. People are sometimes intimidated by the scope of Z39.50, and are somehow led to believe that it is necessary to implement the entire standard; while Z39.50 is intended as a comprehensive information retrieval standard, its intention is not that you implement the whole standard, but rather, only what you need. …..

Problems • ….. There aren’t always clear guidelines, however, on how to determine what part should and should not be implemented. Profiling efforts are helping with this problem. • Another area where Z39.50 is misunderstood: it is perceived by many (particularly in the Web community) to be strictly a library or bibliographic standard. This is a perception problem that needs to be addressed by more and better publicity.

Problems • ….. The standard is difficult to understand. (This is different from the problem cited above, that the standard is misunderstood.) Z39.50 is necessarily complex, but it could be described in a manner making it easier to understand. A new service definition should be written to replace the existing service definition. New service-definition techniques could be investigated, for example, a verb-based definition. • More tools needed to allow quicker implementation. These might include additional toolkits as well as a standard API.

Problems • Miscellaneous Z39.50 areas that merit serious study: • The Z39.50 URL: What purposes should a Z39.50 URL serve? Are the existing definitions (z39.50r and z39.50s) serving these purposes? Should they be revised? Should a new Z39.50 URL be defined? And if so should it replace or complement the existing definitions? • Distributed searching: there should be serious study about how results from distributed searching can be integrated. • Explain. It is fairly well-agreed that clients and users need a mechanism to learn details about a particular server, and to discover servers. It is unclear that Explain is serving that purpose in a useful way. Should the existing Explain definition be replaced by something more useful?

Problems • Determination of specific areas of compliance of both client and server software with the current published version of the Z39.50 standard, especially in regard to OPAC record syntax. Also, the major LMS vendors still provide only sketchy documentation on the basic information that is needed to connect any Z39.50 client to their Z39.50 server implementation. Often the client software vendors have to profile Z39.50 server implementations themselves after sessions of debugging/tracing Z39.50 conversations between client and server.

Problems • The standard is difficult to read – certainly the changes proposed in the 5 year review to streamline it will help improve it, especially removing the OSI language. Anything else that can be done to improve readability should be examined. The use of older technology like ASN.1/BER is an impediment to implementation. Using ASN.1/BER requires understanding of some very obscure ISO documents and adds a level of complexity to debugging implementations. …..

Problems • ….. Need a serious look at functionality that has never been implemented or implemented only sparsely – removing these from future versions may reduce confusion to new implementers as to what is really core – and also a sense among them that the standard is complex and difficult to implement. • The command language is not widely known or understood by programmers. The standard lacks market penetration outside of the library community because functionality is too limited. The assumption that all target databases are in MARC format is very limiting. Therefore, integration with other in-house information systems is difficult to achieve.

And Next: the "ASN.1 or XML" question

ASN.1/XML • From a technical standpoint: No benefit. • From the viewpoint of achieving greater ‘mindshare’/accessibility: Maybe, if it achieves this. Analysis necessary, to see if this would increase acceptance. • Complicated technology. Steep learning curve. Impossible to find programmers that already know about ASN.1 and BER. XML-skilled programmers are much easier to find.

ASN.1/XML • Not sure that this would make implementation easier, given the tools available to do the low-level stuff anyway. We're using the YAZ toolkit and I didn't have to deal with this at all. Introducing new encoding methods will give rise to new incompatibility issues. On the plus side though, this would make the messages more readable for debugging purposes. This could be useful but this usage would have to be implemented very quickly. • This seems like an inevitable transition, and NISO should begin this restructuring now. Newer standards like XML and SOAP have been broadly adopted by the computer and web industries. Recent decisions by the NISO Committee AT (NCIP) confirm this trend.

ASN.1/XML • Unlike ASN.1/BER, support for XML is being built into a wide variety of third party software. This, and the fast-growing pool of XML talent, will help speed development. • Could be done, but this would not address the fundamental weakness of the protocol. • What’s needed is a definition of a logical document hierarchy with an abstract Dublin Core document at the most general level, then a succession of document types more specific to user requirements. For instance, in the Library world, one could envisage a hierarchy with a DC document at the top defining simple Creator, Title, Dates etc.

ASN.1/XML • A generalised Bibliographic document could refine the DC definition to specify Personal Author, Corporate Author, Uniform Title etc. At the lowest level we might have a very detailed schema for a MARC record which defines tags and subfields. • Databases would be required to support the most general level but would arbitrarily support more refined structures. The use of inheritance and derivation in XML Schemas would ensure that a query targeted at one level could be logically transformed into a more general level, enabling different databases to respond to the same query.
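The query transformation described in this response can be sketched in a few lines. The sketch uses a plain parent table in place of XML Schema derivation, which is a simplification made here and not the respondent's proposal. The element names follow the examples in the response; the table, the function name, and the "supported" sets are hypothetical.

```python
from typing import Optional

# Each refined access point names the more general one it derives from.
# The top-level (Dublin Core) elements have no parent.
PARENT: dict[str, Optional[str]] = {
    "PersonalAuthor": "Creator",
    "CorporateAuthor": "Creator",
    "UniformTitle": "Title",
    "Creator": None,
    "Title": None,
}


def generalise(access_point: str, supported: set[str]) -> str:
    """Walk up the hierarchy until reaching an access point the target
    database supports, and return that access point."""
    current: Optional[str] = access_point
    while current is not None:
        if current in supported:
            return current
        current = PARENT.get(current)
    raise LookupError(f"no supported ancestor for {access_point!r}")


# A database that supports only the most general (Dublin Core) level.
dc_only = {"Creator", "Title"}
# A database that also supports the refined bibliographic level.
bibliographic = {"Creator", "Title", "PersonalAuthor", "UniformTitle"}

print(generalise("PersonalAuthor", dc_only))        # falls back to Creator
print(generalise("PersonalAuthor", bibliographic))  # stays PersonalAuthor
```

Because every database supports the top level, a query on a refined access point always finds some ancestor it can run against, which is how different databases could respond to the same query.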

ASN.1/XML • Naturally, the display of a result document would make it easy to produce more queries, because the concept of "Subject" for instance, could be readily identified in the result document and therefore it would be straightforward to say, "show me other records on this subject". • XML would be great, if it could handle library holdings information. So far, the implementations I have seen do not. This is vital information for libraries, and means a database that is dynamic.

ASN.1/XML • Pros: • It will make Z39.50 much more accessible. • May make interoperability with other standards easier. • XML allows easier extensibility of packets for new features. • Cons: • XML will result in much longer packets than BER. • XML will be slower to decode (probably).
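The packet-size point in the "Cons" list can be checked on a toy message. The sketch below is not a real Z39.50 PDU: it encodes one Use attribute value and one search term as a BER SEQUENCE using a hand-written tag-length-value helper, then encodes the same two values as a small XML element whose name and attribute are made up for the comparison.

```python
import xml.etree.ElementTree as ET


def ber_tlv(tag: int, content: bytes) -> bytes:
    """Encode one BER tag-length-value triple with a definite length."""
    n = len(content)
    if n < 0x80:
        length = bytes([n])                   # short form: one length octet
    else:
        size = (n.bit_length() + 7) // 8      # long form: count, then length
        length = bytes([0x80 | size]) + n.to_bytes(size, "big")
    return bytes([tag]) + length + content


term = "Melville, Herman"  # illustrative search term

# Toy BER message: SEQUENCE { INTEGER 1003, OCTET STRING term }
ber_msg = ber_tlv(
    0x30,
    ber_tlv(0x02, (1003).to_bytes(2, "big")) + ber_tlv(0x04, term.encode("utf-8")),
)

# The same two values as XML; element and attribute names are hypothetical.
element = ET.Element("term", {"use": "1003"})
element.text = term
xml_msg = ET.tostring(element)

print(len(ber_msg), "bytes as BER:", ber_msg.hex())
print(len(xml_msg), "bytes as XML:", xml_msg.decode())
```

The XML form is longer because it spells out names as text, but it is also readable without a decoder, which is the debugging advantage other respondents mention.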

ASN.1/XML • Yes (to not requiring ASN.1 and BER). And yes, while XML is not perfect, it would be a good start for a better encoding • yes • XML is probably the way to go, yes.

ASN.1/XML • ASN.1 is becoming increasingly obsolete, particularly so for Z39.50, which cannot use an ASN.1-version beyond 1992 (the 1994 or 1998 versions of ASN.1 are not practical for Z39.50). Thus if Z39.50 contemplates migrating to a more contemporary (or even state-of-the-art) description mechanism, it must abandon ASN.1, and the clear choice (currently) to replace ASN.1 would be XML. We think that the question of replacing ASN.1 with XML is worthy of serious consideration by the ZIG and that it would be worth the resources required to devote serious study to this question. …..

ASN.1/XML • ….Any such effort must, however, retain the rich Z39.50 functionality, semantics and modeling. • In conjunction with such an effort, we would also support research into mapping Z39.50 (as an XML protocol) to SOAP. The ZIG should carefully monitor the current W3C activity on XML Protocols whose primary focus is to develop SOAP as a supporting protocol for XML application protocols. This is an opportune time for Z39.50 to be considering this sort of metamorphosis, because of the W3C activity.

ASN.1/XML • There were efforts in the past to map Z39.50 to HTTP, which failed because Z39.50 is not a good match for HTTP; however, these efforts were undertaken because of the desire to make Z39.50 a more web-friendly protocol. We think that SOAP is a much better match for Z39.50 than HTTP, and to render Z39.50 an XML protocol with bindings to SOAP would certainly make it a web-friendly protocol.

ASN.1/XML • Yes, very much so. XML seems to be a good option to pursue. • Yes, we believe that migrating Z39.50 to modern mainstream technologies would help reduce the complexities of implementation and might make the protocol more likely to be implemented in a wider range of application areas and would make it easier to integrate Z39.50 with a lot of the resources and developments going on in the Internet and Web communities. It will also be possible to take advantage of tools and infrastructure being developed for those larger communities in building new Z39.50 implementations.

ASN.1/XML • Don’t know about ASN.1 or BER. Yes to XML, and “transported via web protocols”. • No need for any of these activities. Any changes or new version development must start with a statement of requirements, not a statement of desired encoding methods. • What problem would these changes solve? The reasons I hear are political and marketing - not to be ignored perhaps. If changes such as these would create greater acceptance and expansion of use of the standard, I guess that is the argument for this type of change.

ASN.1/XML • Z39.50 as a query language is not completely successful now (e.g. our Info Warehouse cannot query the UNCAPS database effectively without both supporting MARC formats). Investing in further development of Z39.50 seems counterproductive. Better to move onto a more widely adopted query language, XML/Query, and invest the time in standardizing DTDs for different kinds of information, in addition to bibliographic information.

ASN.1/XML • From a museum point of view, more of the collections management software vendors need to provide a Z39.50 implementation. I would like to see more integration of the Z39.50 standard into XML as a significant portion of our data is held in text rather than database format and we would like to ensure integration. • Unlike ASN.1/BER, support for XML is being built into a wide variety of third party software. This, and the fast-growing pool of XML talent, will help speed future development.

Next up: the question of "version"

Would you implement a new version of Z39.50 not compatible with version 3? 2? • Yes • Absolutely (3)/probably (2). • I don't think it would be good to have a new version that was incompatible with the previous versions. There are already enough compatibility issues without introducing more. If a backward-incompatible version of Z39.50 were approved we would implement the new version, but likely there would be a significantly longer gap between the release of this version and our implementation than there would be if a backward-compatible version were released.

Would you implement a new version of Z39.50 not compatible with version 3? 2? • Yes • No. We would want to stay as compatible with the standard releases as possible. • Yes • If the user community demanded it, yes: but the introduction of a new version not backward compatible would likely result in the whole standard being abandoned. • We are currently compatible with version 2; I would love to implement a version 3 compatible version, but our vendor, epixtech, is not supporting it for Dynix, the library automation system we are using.

Would you implement a new version of Z39.50 not compatible with version 3? 2? • We would look at it seriously. We do lots of XML work. We would have to continue to support V2 and V3 so if it was radically different it would cause additional support overheads. But we do lots of custom local extensions that we may be able to drop with V4, meaning that we might drop the extensions and just use V4 between our own clients and servers. • It would not be a small job to undertake, and we would only do it if there was real benefit in terms of simplification of the standard.

Would you implement a new version of Z39.50 not compatible with version 3? 2? • We transfer data from a lot of vendors and as long as everybody upgrades at once that would work. • If it solves something. For our primary present application I see no reason to change. If a significant number of information resources became available under a new method we would certainly be interested...

Would you implement a new version of Z39.50 not compatible with version 3? 2? • As of now, we have no compelling reason to implement a new version of Z39.50 that is not compatible with version 3 or version 2. • Maybe. Depends on what it is. • Only if the new version followed the development trends of the World Wide Web using XML. • Yes, depending on two issues: sufficient range of features for modern information retrieval must be available; implementation should be easy.

Would you implement a new version of Z39.50 not compatible with version 3? 2? • Not immediately. We have a huge, divergent user base to bring along. We often put things up for demo/trial though, while continuing to support older schemes. • This would be very much dependent on what our LMS vendor offers, as we are pretty much tied to what they provide. • We would hope to, but it would depend on the amount of pressure that can be exerted on vendors, which in turn would depend on how well the influential profiles (e.g. Bath) can adapt to the new version.

Would you implement a new version of Z39.50 not compatible with version 3? 2? • Decisions on implementing new versions of Z39.50 will be based on functionality offered, demand from the communities we serve and a sense of how widely deployed the new functionality will be in those communities, how well that new version integrates with developments taking place in the larger Internet environment, with consideration given to level of difficulty of implementation and how disruptive such implementation is to the currently deployed installed base.
