Half-Width vs. Full-Width: A Tale of Two Characters
If you have filled out a Japanese online form, you've likely run into Japan's two character input types: 半角 (hankaku, half-width) and 全角 (zenkaku, full-width).
You must alternate between these two character types, depending on the input field.
For example, say an online credit card form asks you to input your mailing address. Typically, this must be entered using zenkaku full-width characters. On the same page, you might be asked to input your phone number, which must be typed using hankaku half-width characters. And on and on it goes.
It's not just Japan's foreign residents who find toggling between the two character types annoying. As evidenced by numerous online discussions, Japanese users, too, dislike the inconvenience of switching between full-width and half-width characters when filling out forms online.
So why hasn't this been fixed? And where did these two character types originate?
Image: Neto Labbo; Man enters information into an online form and receives an error message stating, "Please input alphanumeric in full-width characters."
What are full-width and half-width characters?
Image: Dekiru net
The above image shows full-width in the top row and half-width on the bottom row. Although they look similar, minus a little distortion, they come from different electronic character sets.
An article from Morgan Data System KK, a company that supports data entry and survey data aggregation, provides an explanation of zenkaku and hankaku, full-width and half-width characters, which I have translated and paraphrased below:
“Half-width characters are sometimes called 1-byte characters, and full-width characters are sometimes called 2-byte characters. (A byte is a unit measuring the amount of data used in a computer.)
“1 byte is 8 binary digits (bits) and can represent 256 different combinations. But 256 values are not enough to represent all of the many Japanese characters. However, 2-byte characters (16 binary digits) can represent 65,536 different combinations, which is sufficient for languages with many thousands of characters, such as Japanese and Chinese.”
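The arithmetic in the quoted passage can be checked directly. The short Python sketch below is only an illustration (the function name is made up for this example); it counts how many distinct values fit in one and two bytes.

```python
# Number of distinct values that fit in a given number of bytes.
BITS_PER_BYTE = 8

def combinations(num_bytes: int) -> int:
    """Return how many distinct bit patterns num_bytes can hold."""
    return 2 ** (BITS_PER_BYTE * num_bytes)

print(combinations(1))  # 256
print(combinations(2))  # 65536
```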
The left-hand column of the above image shows a full-width character 「ア」 and the right-hand column shows the half-width character 「ｱ」. Both represent the same phonetic sound, but their source electronic character sets are different, as evidenced by their encoding.
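One way to see this difference for yourself is to look at the two characters in Unicode, where each form has its own code point. The following is a minimal Python sketch using the standard unicodedata module; the describe helper is a hypothetical name introduced here for illustration.

```python
import unicodedata

FULL_WIDTH_A = "\u30a2"  # full-width katakana "a"
HALF_WIDTH_A = "\uff71"  # half-width katakana "a"

def describe(ch: str) -> tuple[str, str, str]:
    """Return the code point, Unicode name, and East Asian Width class."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.east_asian_width(ch),
    )

print(describe(FULL_WIDTH_A))  # ('U+30A2', 'KATAKANA LETTER A', 'W')
print(describe(HALF_WIDTH_A))  # ('U+FF71', 'HALFWIDTH KATAKANA LETTER A', 'H')
```

In the last field, "W" stands for wide and "H" for halfwidth, the classes Unicode uses to record this distinction.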
The full-width half-width origin story
One of the first standardized alphanumeric coding systems (encoding standards), the American Standard Code for Information Interchange (ASCII), was proposed by Bob Bemer in May 1961 and first published in 1963.
Bemer proposed using a common underlying code for alphanumeric characters to enable computers to communicate more easily.
ASCII itself defines 7-bit codes (128 patterns), but computers typically store each character in a byte of 8 slots (8 slots = 8 bits = 1 byte), which allows 256 patterns, each able to represent one character of any language. English needs only 94 printable letters, numbers, and symbols, so 256 combinations were more than enough to accommodate it.
In 1969, Japanese programmers assigned half-width katakana to the remaining patterns next to the 94 alphanumeric characters, establishing the first Japanese electronic character set, then called JIS C 6220.
Image: The first Japanese electronic character set was half-width katakana, est. 1969, Japan Association of Graphic Arts Technology
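To make the layout concrete: in the 8-bit form of this character set (known today as JIS X 0201), the half-width katakana and a few punctuation marks sit in byte values 0xA1 through 0xDF, above the range used for English characters. Python has no dedicated codec for JIS C 6220, so the sketch below assumes the shift_jis codec as a stand-in, since Shift JIS keeps the same single-byte katakana range.

```python
# Byte range assumed for the single-byte katakana block (0xA1-0xDF).
KATAKANA_BYTES = bytes(range(0xA1, 0xDF + 1))

# Decode every byte in the range; each byte becomes exactly one character.
katakana = KATAKANA_BYTES.decode("shift_jis")

print(len(KATAKANA_BYTES))  # 63 byte values
print(len(katakana))        # 63 characters, one per byte

# An English letter and a half-width katakana, one byte each.
mixed = bytes([0x41, 0xB1]).decode("shift_jis")
print([f"U+{ord(ch):04X}" for ch in mixed])  # ['U+0041', 'U+FF71']
```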
Katakana, one of Japan's phonetic alphabets, was deemed sufficient for sending brief telegrams and for the data processing carried out in the public and private sectors at that time.
Naturally, as computer functions advanced and word processing applications became a reality, a full Japanese electronic character set was needed that encompassed all of Japan's many thousands of kanji.
Lecture notes by Mr. Riyuichi Fujimoto from Kanazawa University, available here in Japanese, provide a fascinating history of the evolution in rendering Japanese characters computer-readable. The following is a brief timeline of events summarized from his discourse:
1963. ASCII characters: Computers can only render the English alphabet and numbers.
1969. ASCII characters + half-width katakana (JIS C 6220): Computers can handle half-width katakana characters. However, kanji cannot be displayed.
1978. Full-width characters (JIS C 6226): Kanji can now be displayed; however, there remain inconsistencies depending on the country and manufacturer.
1991. Unicode: Created to standardize character sets by assigning a unique number to every character in the world's writing systems, allowing computers in any country to handle text consistently.
The current situation: no unification
At present, there are many encoding schemes that translate Japanese electronic character sets (both half-width and full-width) into readable text across computers and programs. (This Stack Overflow thread explains the difference between character sets and encoding schemes.)
Notable schemes that make Japanese readable include Shift JIS, EUC-JP, ISO-2022-JP, and UTF-8.
However, as of yet, there has been no unification of Japanese electronic character sets. And so the use of both full-width and half-width persists.
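As a rough illustration of why this matters, the sketch below encodes the same two characters under three commonly used schemes that Python supports (Shift JIS, EUC-JP, and UTF-8). The choice of schemes and the encoded_hex helper are assumptions made for this example.

```python
FULL_WIDTH_A = "\u30a2"  # full-width katakana "a"
HALF_WIDTH_A = "\uff71"  # half-width katakana "a"

def encoded_hex(ch: str, encoding: str) -> str:
    """Return the bytes of ch under the given encoding as a hex string."""
    return ch.encode(encoding).hex()

for encoding in ("shift_jis", "euc_jp", "utf-8"):
    print(
        encoding,
        encoded_hex(FULL_WIDTH_A, encoding),
        encoded_hex(HALF_WIDTH_A, encoding),
    )
# shift_jis 8341 b1
# euc_jp a5a2 8eb1
# utf-8 e382a2 efbdb1
```

The same visible character ends up as different bytes in each scheme, which is why a program has to know which encoding it is reading.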
Today, when filling out an online form, half-width is primarily used for fields where you input alphanumeric characters, such as email addresses, passwords, and phone numbers.
Full-width is used for fields where you need to type Japanese characters (your mailing address, etc.). In those fields, any alphanumeric characters must also be entered in full-width.
If full-width is the newer/more advanced form, why are half-width characters still being used for numbers and email addresses?
Here's one answer proposed by Sphere System Consulting KK: Back in the day when computers' data processing capabilities were low, the focus was on keeping data as light and short as possible. Programmers therefore preferred half-width characters, because a full-width character takes up twice as much data as a half-width one.
Additionally, in some cases, half-width katakana is preferred over full-width for presenting information when space is limited, for example in bank books and on receipts.
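The size difference is easy to measure under a legacy encoding. The sketch below assumes Shift JIS, where half-width characters take one byte and full-width characters take two; to_full_width_digits is a hypothetical helper that shifts ASCII digits into the Unicode full-width block.

```python
def to_full_width_digits(text: str) -> str:
    """Map ASCII digits 0-9 to their full-width forms (U+FF10 to U+FF19)."""
    return "".join(
        chr(ord(c) + 0xFEE0) if "0" <= c <= "9" else c for c in text
    )

half = "0312345678"                # phone number in half-width digits
full = to_full_width_digits(half)  # the same digits in full-width

print(len(half.encode("shift_jis")))  # 10 bytes
print(len(full.encode("shift_jis")))  # 20 bytes
```

Under UTF-8 the gap is wider still: the same full-width digits take three bytes each.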
Where to go from here: solutions to the problem
Japanese users dislike this, too, and have raised the issue of half-width and full-width character sets (1, 2) on Idea Box (デジタル改革アイデアボックス, the "Digital Reform Idea Box"), an open forum where individuals can submit complaints or requests for improvement directly to the central government.
For Japanese users, one frustration is that double-byte characters are not accepted when typing a URL, so they are forced to switch to half-width. Full-width numbers must also be converted to half-width for phone numbers, email addresses, etc.
Foreign residents typing their name in katakana must choose between half-width and full-width kana. The required type varies from form to form and from field to field, which often leads to mistakes that delay completing the task.
Until a unified electronic character set is adopted and enforced by the Japanese government, temporary solutions exist, which can improve the user experience on e-commerce pages and online bank and government forms.
A column written by a senior analyst at Forrester Research back in 2008 proposed a simple solution: automatically convert alphanumeric characters, hiragana, and katakana to a set code at the point of receiving the data.
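One way such server-side conversion might look today is Unicode NFKC normalization, which folds full-width alphanumerics to their half-width forms and half-width katakana to full-width katakana. This is a minimal sketch under that assumption, not the method described in the column. The function names are hypothetical, and NFKC does not convert between hiragana and katakana.

```python
import re
import unicodedata

# ASCII 0x21-0x7E maps to the full-width block by adding 0xFEE0;
# the ASCII space maps to the ideographic space U+3000.
_TO_FULL_WIDTH = {code: code + 0xFEE0 for code in range(0x21, 0x7F)}
_TO_FULL_WIDTH[0x20] = 0x3000

def normalize_field(value: str) -> str:
    """Fold width variants to one canonical form and trim whitespace."""
    return unicodedata.normalize("NFKC", value).strip()

def normalize_phone(value: str) -> str:
    """Keep only half-width digits, whatever width the user typed."""
    return re.sub(r"[^0-9]", "", normalize_field(value))

def to_full_width(value: str) -> str:
    """Produce an all-full-width string for fields that require it."""
    return normalize_field(value).translate(_TO_FULL_WIDTH)

# Full-width "Tokyo" typed into a half-width field.
print(normalize_field("\uff34\uff4f\uff4b\uff59\uff4f"))  # Tokyo
# Full-width digits and hyphens typed into a phone number field.
print(normalize_phone("\uff10\uff13\uff0d\uff11\uff12\uff13\uff14"))  # 031234
```

With conversion like this on the receiving side, the form could accept either width and store whichever form the back-end system expects, instead of rejecting the input with an error.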
Since solutions exist, here are some guesses as to the lack of progress: Japan's government and corporations haven't gotten around to updating their systems, are trying to conserve space for one reason or another, or are stuck in the status quo.
“It's always been like this” is an answer as to why, but it's not a good one.
An IT manager who provided the technical review of this article noted, “There are often hierarchical issues where management (usually older people) lack sufficient knowledge of the technical spaces they manage and are risk-averse. They are promoted by years of service and not personal aptitude or managerial skill. They won't try to fix what doesn't seem broken, and they probably don't create open spaces for their subordinates to offer progressive ideas.
“I'm sure change is also greatly hampered by bureaucratic processes. Getting from idea to implementation is a long game of discussion, forms, approvals, and waiting. Maybe some [programmers] have tried and got rejected along the way. Others may simply just not want to go through the nightmare of dealing with the process.”
A closing thought: Master shortcuts for quick toggling
An immediate help to you while we all hope for better days ahead:
For Windows: Use F10 within an online form to toggle quickly between full-width and half-width characters.
For Mac users: Full-width, zenkaku katakana, is control + k. Half-width, hankaku kana, is control + ;.
Note: If you are an expert in this area and have further insight on this topic to share, feel free to get in touch at [email protected]