Special Characters Implementation
Zbigniew Majewski
12th Joint INIS/ETDE Technical Committee Meeting, 21-22 October 2009, Vienna, Austria
Outcome of the 11th JTCM
• XML implementation for INIS output and development of a new input tool should allow the introduction of Unicode.
• Recommendation: develop a detailed plan covering the possible implications of Unicode implementation.
Problem
• INIS allows only the letters a-z and A-Z, digits and a few special characters
• The quality of INIS records is constrained by this limited character set
• Some abstracts, original titles, author names, and conference and journal titles use multilingual characters
• Some INIS records need formulas in their abstracts
• Extra effort is needed to strip the rich character set from electronic input
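The extra effort mentioned in the last bullet can be pictured with a small check that flags characters falling outside a restricted set. This is only a sketch: the slides do not list the permitted special characters, so `ALLOWED` and `find_disallowed` are hypothetical and chosen purely for illustration.

```python
import string

# Hypothetical stand-in for the limited INIS character set. The slide only
# says "a-Z, digits and a few special characters", so the punctuation
# listed here is an assumption.
ALLOWED = set(string.ascii_letters + string.digits + " .,;:()-'/")


def find_disallowed(text):
    """Return (position, character) pairs for characters outside ALLOWED."""
    return [(i, ch) for i, ch in enumerate(text) if ch not in ALLOWED]


if __name__ == "__main__":
    # An original-language title of the kind the slide refers to.
    title = "\u00c9tude des r\u00e9actions nucl\u00e9aires"
    for position, char in find_disallowed(title):
        print(position, repr(char))
```

Every flagged character would have to be transliterated or dropped before the record fits the restricted set, which is the loss of quality the slide describes.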
Impacts
• Storage
  • Databases and data exchange files
• Processing
  • QA (checking rules, authority validation)
  • Retrieval
  • External applications
• Presentation
  • HTML/XML enabled browsers
  • User interface using tool-specific data formats
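One way the processing impact may show up in QA: with Unicode, the same visible name can be stored as different code point sequences, so authority validation would probably need to normalize before comparing. The sketch below assumes a simple set-based authority list; `AUTHORITY`, `normalize_key` and `is_authorized` are hypothetical names and not part of any INIS tool.

```python
import unicodedata


def normalize_key(name):
    """Normalize to NFC and case-fold so equivalent spellings compare equal."""
    return unicodedata.normalize("NFC", name).casefold()


# Hypothetical authority list; the real INIS authorities are not shown in the slides.
AUTHORITY = {normalize_key("M\u00fcller, K.")}


def is_authorized(name):
    """Check a name against the authority list after normalization."""
    return normalize_key(name) in AUTHORITY


composed = "M\u00fcller, K."     # u-umlaut as a single code point
decomposed = "Mu\u0308ller, K."  # 'u' followed by a combining diaeresis

if __name__ == "__main__":
    print(composed == decomposed)  # raw strings differ
    print(is_authorized(composed), is_authorized(decomposed))
```

Both spellings render identically but compare unequal as raw strings, so a checking rule that skips normalization would reject one of them.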
Approach options
• Unicode-enabled storage based
  • Unicode encoding (binary representation) implemented in all layers (storage, processing and presentation)
  • Use of XML for interfaces (like Atomindex)
• Mark-up based
  • ASCII-based mark-up for Unicode characters implemented for storage and presentation
  • Processing modified to recognize mark-up or to become character agnostic
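The difference between the two options can be sketched in a few lines. The slides do not specify the mark-up syntax, so XML numeric character references are used here only as an assumed stand-in, and the helper names `to_utf8`, `to_markup` and `from_markup` are hypothetical.

```python
import html


def to_utf8(text):
    """Option 1: store the Unicode text directly, here as UTF-8 bytes."""
    return text.encode("utf-8")


def to_markup(text):
    """Option 2: keep storage ASCII-only and mark up non-ASCII characters.

    Numeric character references (&#NNN;) are an assumed mark-up syntax.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def from_markup(marked_up):
    """Presentation layer: turn the mark-up back into Unicode characters."""
    return html.unescape(marked_up)


if __name__ == "__main__":
    sample = "Schr\u00f6dinger, \u03b1-decay"
    print(to_utf8(sample))
    print(to_markup(sample))
    print(from_markup(to_markup(sample)) == sample)
```

With the mark-up option every layer that compares, sorts or searches text has to know that a sequence such as `&#246;` stands for one character, which is why the slide calls for processing that recognizes the mark-up or is character agnostic. Text that already contains a literal `&` sequence would also need escaping, which this sketch leaves out.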
Barriers
Actions
• Finalize upgrading the software platform used by INIS applications
• Modify FIBRE and IDPS to allow Unicode characters
• Extend use of XML as the INIS record format throughout the entire INIS process
• Agree on use of Unicode in Atomindex
• Replace the search engine to allow searches with Unicode characters
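As a rough illustration of the third action, a record carrying Unicode text can be written to and read back from UTF-8 encoded XML using only the Python standard library. The element names (`record`, `title`, `author`) are hypothetical and do not reflect the actual INIS XML schema.

```python
import xml.etree.ElementTree as ET


def record_to_xml(title, author):
    """Serialize a minimal record as UTF-8 encoded XML bytes.

    The element names are assumptions for illustration only.
    """
    record = ET.Element("record")
    ET.SubElement(record, "title").text = title
    ET.SubElement(record, "author").text = author
    return ET.tostring(record, encoding="utf-8")


def xml_to_record(data):
    """Parse the XML bytes back into a plain dictionary."""
    root = ET.fromstring(data)
    return {child.tag: child.text for child in root}


if __name__ == "__main__":
    data = record_to_xml("\u00c9tude de la d\u00e9sint\u00e9gration \u03b2", "Dvo\u0159\u00e1k, J.")
    print(data.decode("utf-8"))
    print(xml_to_record(data))
```

Because the XML declaration names the encoding, each stage that exchanges such files can recover the original characters without a separate mark-up convention.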
Thank you!