Annotation of wikisrc/tutorials/unicode.mdwn, revision 1.2
How to use wide-range characters, a.k.a. UTF-8, in NetBSD.

**Contents**

[[!toc levels=3]]

# Introduction

This tutorial covers using Unicode (UTF-8) on NetBSD: the console, pkgsrc options, shells, X terminal emulators, common utilities, servers, file conversion, and filesystems.

# Note on wscons

wscons doesn't support UTF-8. You'll need **X11** and a proper **X terminal emulator** for this to be of any use, or you get character mash for lunch! Only the [ASCII](http://de.wikipedia.org/wiki/ASCII-Tabelle) part of Unicode, namely the **first 128 characters, will work** in your wscons console, because they are encoded identically in UTF-8 and the ISO-8859 character sets. The printable ones are:

    !"#$%&'()*+,-./0123456789:;<=>?
    @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_
    `abcdefghijklmnopqrstuvwxyz{|}~

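The overlap between the encodings can be checked from any shell. The following is a minimal sketch, assuming `od` and `iconv` are available (both ship with the NetBSD base system); the helper name `hexbytes` is made up for this illustration.

```shell
#!/bin/sh
# hexbytes (hypothetical helper): print stdin as a string of hex bytes.
hexbytes() {
    od -An -tx1 | tr -d ' \n'
    echo
}

# ASCII 'A' is the single byte 41 in UTF-8 and in ISO-8859-1 alike.
printf 'A' | hexbytes

# 'ä' (U+00E4) takes two bytes in UTF-8: c3 a4 ...
printf '\303\244' | hexbytes

# ... but is the single byte e4 once converted to ISO-8859-1.
printf '\303\244' | iconv -f UTF-8 -t ISO-8859-1 | hexbytes
```

Any character outside the ASCII range shows this kind of difference, which is why only ASCII survives on a console that does not decode UTF-8.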
# pkgsrc

To build packages that support it against the wide-character ncurses library, add the following to `/etc/mk.conf`:

    PKG_DEFAULT_OPTIONS+= ncursesw

# Soup up a shell

## ksh

Works. To make it your login shell:

    chsh -s /bin/ksh

## mksh

mksh, the MirBSD Korn shell, is derived from OpenBSD's ksh and works pretty well compared to pdksh.

    cd /usr/pkgsrc/shells/mksh
    make install clean
    chsh -s /usr/pkg/bin/mksh

## zsh

UTF-8 support in the Z shell has been enabled by default since version 4.3.2.

    cd /usr/pkgsrc/shells/zsh
    make install clean
    chsh -s /usr/pkg/bin/zsh

## tcsh

Works out of the box.

    cd /usr/pkgsrc/shells/tcsh
    make install clean
    chsh -s /usr/pkg/bin/tcsh

## bash

Works out of the box.

    cd /usr/pkgsrc/shells/bash
    make install clean
    chsh -s /usr/pkg/bin/bash

## Shell environment

Set the variables `LANG` and `LC_CTYPE` in your shell configuration file:

    export LANG="en_US.UTF-8"
    export LC_CTYPE="en_US.UTF-8"
    export LC_ALL=""

Or, if you have a C-style shell:

    setenv LANG "en_US.UTF-8"
    setenv LC_CTYPE "en_US.UTF-8"
    setenv LC_ALL ""

The other locale variables should be left untouched (they default to "`C`") so as not to confuse programs. Locales other than `en_US` probably won't work too well, since the fonts aren't in the base system yet, but you can install them and try your luck, of course.

The result should look like this:

    % locale
    LANG="en_US.UTF-8"
    LC_CTYPE="en_US.UTF-8"
    LC_COLLATE="C"
    LC_TIME="C"
    LC_NUMERIC="C"
    LC_MESSAGES="en_US.UTF-8"
    LC_ALL=""

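To confirm from a script that the settings took effect, one option is to work out which variable governs character handling. The sketch below applies the POSIX precedence (`LC_ALL`, then `LC_CTYPE`, then `LANG`); the function names `effective_ctype` and `is_utf8_env` are made up for this illustration, and the check only inspects the environment, not whether the locale is actually installed.

```shell
#!/bin/sh
# effective_ctype (hypothetical helper): print the locale name that
# governs character classification. LC_ALL overrides LC_CTYPE, which
# overrides LANG; unset or empty values are skipped, "C" is the fallback.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        echo "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        echo "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        echo "$LANG"
    else
        echo "C"
    fi
}

# is_utf8_env (hypothetical helper): succeed if the effective locale
# name mentions a UTF-8 codeset.
is_utf8_env() {
    case "$(effective_ctype)" in
        *.UTF-8*|*.utf-8*|*.UTF8*|*.utf8*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_utf8_env; then
    echo "UTF-8 locale: $(effective_ctype)"
else
    echo "not a UTF-8 locale: $(effective_ctype)"
fi
```

Because an empty value is skipped, setting `LC_ALL` to the empty string behaves like leaving it unset, while a non-empty `LC_ALL` would override both `LC_CTYPE` and `LANG`.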
# X Terminal emulators

## xterm

* Versions 239 and later work well with the default "fixed" font.
* Also works with the TrueType DejaVu Sans Mono font.
* Appears to have trouble with some other fonts, such as Bitstream Vera Sans Mono (the font DejaVu was derived from, with narrower character coverage).

## gnome-terminal

* Works great with the TrueType Bitstream Vera Sans Mono or DejaVu Sans Mono fonts.
* Somewhat bloated, considering its dependencies.

## urxvt

* Recommended.

    cd /usr/pkgsrc/x11/rxvt-unicode
    make install clean

## uxterm

* Works, as the 'u' might suggest, but performed poorly when last checked. Reports welcome.

## aterm

* Doesn't work and probably never will.

## Eterm

* Doesn't work either. When last checked, the author was too busy with real life.

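To judge a terminal emulator and font combination yourself, it can help to print a few non-ASCII characters and see whether each renders as a single, correct glyph. This is a minimal sketch; the function name `utf8_sample` and the choice of sample characters are made up for this illustration.

```shell
#!/bin/sh
# utf8_sample (hypothetical helper): print characters from a few
# Unicode blocks. Octal escapes keep the printed bytes independent of
# the encoding this script itself is saved in.
utf8_sample() {
    printf 'Latin-1: \303\244 \303\266 \303\274 \303\237\n'   # ä ö ü ß
    printf 'Greek:   \316\261 \316\262 \316\263\n'            # α β γ
    printf 'Box:     \342\224\214\342\224\200\342\224\220\n'  # ┌─┐
}

utf8_sample
```

If the terminal shows two or three junk characters per glyph, it is not decoding UTF-8; if it shows empty boxes or question marks, the font most likely lacks those glyphs.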
# Utilities

## less

* Set the shell environment variable `LESSCHARSET` to "`utf-8`".

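For a Bourne-style shell the setting could look like the following, placed in the same configuration file as the `LANG` and `LC_CTYPE` exports from the shell environment section (csh-style shells would use `setenv LESSCHARSET utf-8` instead).

```shell
# Tell less to treat its input and output as UTF-8.
LESSCHARSET="utf-8"
export LESSCHARSET
```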
## screen

`.screenrc`

    defutf8 on
    encoding UTF-8

## lynx

`.lynxrc`

    character_set=UNICODE (UTF-8)

Or change "Display character set" in the options menu.

## irssi

    /set recode_autodetect_utf8 yes
    /set recode_fallback iso-8859-1
    /set recode_out_default_charset UTF-8
    /set term_charset UTF-8
    /save

Replace `iso-8859-1` in `recode_fallback` with whatever fallback character set seems fit.

## silc-client

    /set term_type utf-8
    /save

Then restart the client.

## vi

* NetBSD's vi is based on nvi. It doesn't support wide-range characters as of version 1.79nb16 from 10/23/96, which is the version in NetBSD-current 4.99.15 and all earlier releases.

## nvi

* pkgsrc's nvi (v1.81.6) works with wide-range characters if built with the `wide-curses` option.

## vim

`.vimrc`

    set encoding=utf-8
    set fileencoding=utf-8

## emacs

`.emacs`

    ; === Set character encoding ===
    (setq locale-coding-system 'utf-8)
    (set-terminal-coding-system 'utf-8)
    (set-keyboard-coding-system 'utf-8)
    (set-selection-coding-system 'utf-8)
    (prefer-coding-system 'utf-8)

This one gives you umlauts:

    ; === Make ä, ö, ü, ß work ===
    (set-language-environment 'german)

## mutt

mutt should work with all of the above. If it doesn't, put something like this in your `.muttrc`:

    set charset="utf-8:iso-8859-1"

If you haven't already set it in `PKG_DEFAULT_OPTIONS`, you may also add this to `mk.conf`:

    PKG_OPTIONS.mutt+= ncursesw

# Servers

## Apache2

`/usr/pkg/etc/httpd/httpd.conf`

    AddDefaultCharset UTF-8

# Converting files

If you have files containing non-ASCII ISO-8859 characters, your system will now assume they are UTF-8. They are not, so the characters in these files will be misinterpreted and tools that read them will start breaking. Use iconv, which is part of the base system, to convert them:

    iconv -f iso8859-1 -t utf-8 file >file.new

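When several files need converting, it helps to skip those that are already valid UTF-8, because converting a file twice corrupts it. The following is a minimal sketch, assuming that every file which fails to decode as UTF-8 really is ISO-8859-1 and that `iconv` exits non-zero on invalid input; the helper names `is_valid_utf8` and `latin1_to_utf8` and the `.orig` backup suffix are made up for this illustration.

```shell
#!/bin/sh
# is_valid_utf8 (hypothetical helper): succeed if the file decodes
# cleanly as UTF-8. Pure-ASCII files also pass, since ASCII is a
# subset of UTF-8.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# latin1_to_utf8 (hypothetical helper): convert one file in place,
# ASSUMING any file that is not valid UTF-8 is ISO-8859-1.
# The original is kept as FILE.orig.
latin1_to_utf8() {
    file=$1
    if is_valid_utf8 "$file"; then
        echo "skip (already UTF-8): $file"
        return 0
    fi
    iconv -f ISO-8859-1 -t UTF-8 "$file" >"$file.new" || return 1
    mv "$file" "$file.orig" && mv "$file.new" "$file"
    echo "converted: $file"
}

# Usage: sh convert.sh file1 file2 ...
for arg in "$@"; do
    latin1_to_utf8 "$arg"
done
```

Keeping the `.orig` copy makes it easy to recover if a file turns out to have been in some other encoding.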
# Filesystems

* Be careful with special characters in filenames: the names are stored as the raw bytes you typed, so they will look weird when you access them from a non-Unicode environment.

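To see which existing files would be affected, one approach is to list names containing bytes outside printable ASCII. This is a sketch, assuming a `find` whose `-name` pattern matching works byte-wise in the `C` locale; the function name `list_non_ascii_names` is made up for this illustration.

```shell
#!/bin/sh
# list_non_ascii_names (hypothetical helper): list entries below a
# directory (default: the current one) whose names contain bytes
# outside printable ASCII. LC_ALL=C makes find compare raw bytes; the
# bracket expression [! -~] matches any byte that is not in the range
# from space to tilde.
list_non_ascii_names() {
    LC_ALL=C find "${1:-.}" -name '*[! -~]*'
}

list_non_ascii_names .
```

Names that show up here are the ones to check, and possibly rename, before sharing the files with systems that use a different encoding.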