On the Missing-Characters (Gaiji) of the Taisho Tripitaka Text Database Published by SAT
Author: Moro, Shigeki
Source: 太平洋鄰里協會會論文集=Proceedings of EBTI, ECAI, SEER & PNC Joint Meeting
Date: 1999
Pages: 323 - 328
Publisher: 中央研究院計算中心 (Computing Centre, Academia Sinica)
Location: 臺北市, 臺灣 [Taipei shih, Taiwan]
Content type: 會議論文=Proceeding Article
Language: 英文=English
Note: Conference venue: Academia Sinica, Taipei; organizer and co-organizer: Computing Centre, Academia Sinica
Keyword: Moro, Shigeki; 大正新脩大藏經; 中文文字輸入; 資料庫; 中文文字辨識; 佛經; Taisho Tripitaka; Chinese Character Input; Database; Chinese Character Recognition; 佛教經典=Sutra; Gaiji; SAT
Abstract: In March 1998, the Association for the Computerization of Buddhist Texts (ACBUT) began publishing the electronic text database of the Taisho Tripitaka. SAT is the nickname of this project.

The Taisho Tripitaka includes both classical Chinese and Japanese texts, so SAT texts are currently encoded in the JIS code set. In the not-too-distant future, they will be moved to a larger set such as Unicode. However, there will always be characters that cannot be input. Solving the Gaiji (missing-character) problem is therefore the most important issue for projects like SAT. SAT has now published about 90 e-texts, which together contain over 7 million characters. Over 17,000 characters cannot be input with JIS, and about 1,500 cannot be input with Unicode.
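
To illustrate what "cannot be input with JIS" means in practice, the following minimal Python sketch lists the characters of a string that a JIS-based encoding cannot represent. It is not taken from the SAT project: the function name `find_gaiji` is hypothetical, and Python's `shift_jis` codec is only an assumed stand-in for the JIS code set that SAT uses.

```python
def find_gaiji(text, encoding="shift_jis"):
    """Return the distinct characters in `text` that `encoding` cannot encode.

    Assumption: `shift_jis` approximates the JIS code set mentioned in the
    abstract; the actual SAT code set may differ.
    """
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # Record each unencodable character once, in order of appearance.
            if ch not in missing:
                missing.append(ch)
    return missing


if __name__ == "__main__":
    sample = "仏経\U00020000"  # the last character lies outside the basic JIS repertoire
    print([f"U+{ord(c):04X}" for c in find_gaiji(sample)])
```

Passing a different codec name (for example `euc_jp`) shows how the set of missing characters depends on the chosen code set.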

Following the KanjiBase developed by Dr. Christian Wittern, we now use SGML-style placeholders that are both standardized and system-independent. We are also investigating XML empty-element tags as a new solution.
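
The relationship between the two approaches can be sketched as a small conversion from SGML-style entity placeholders to XML empty-element tags. This is only an illustration under stated assumptions: the placeholder shape `&NAME;`, the identifier `X0001`, the element name `gaiji`, and the attribute name `ref` are all hypothetical and are not the actual KanjiBase or SAT conventions.

```python
import re

# Assumed placeholder shape: "&" + identifier + ";" (SGML entity reference style).
PLACEHOLDER = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")

# Entity names predefined by XML; these are ordinary escapes, not gaiji.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def to_empty_element(text, tag="gaiji", attr="ref"):
    """Rewrite each gaiji placeholder as an XML empty-element tag.

    `tag` and `attr` are hypothetical names chosen for this sketch.
    """
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED:
            return match.group(0)  # leave standard XML escapes untouched
        return f'<{tag} {attr}="{name}"/>'

    return PLACEHOLDER.sub(repl, text)


if __name__ == "__main__":
    print(to_empty_element("如&X0001;是 &amp; 我"))
```

In both forms the text stays plain and system-independent; the missing character is carried only as an identifier, which a later step can resolve to a glyph or a code point.
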
Created date: 2001.02.20
Modified date: 2015.08.11


