Diff for /wikisrc/Attic/unicode.mdwn between versions 1.3 and 1.4

version 1.3, 2014/10/13 05:23:02 version 1.4, 2019/04/27 14:13:56
Line 1 Line 1
 How to use wide-range characters a.k.a. UTF-8 in NetBSD.   How to use wide-range characters a.k.a. UTF-8 in NetBSD. 
   
 [![Just to show off. That's how UTF-8 encoded spam will look like ;-)][3]][4]  
   
    [3]: /images/200px-Unicoded-spam.png  
    [4]: /images/Unicoded-spam.png (Just to show off. That's how UTF-8 encoded spam will look like ;-))  
   
 [![][5]][6]  
   
    [5]: /images/magnify-clip.png  
    [6]: /images/Unicoded-spam.png (Enlarge)  
   
 Just to show off. That's how UTF-8 encoded spam will look like ;-)  
   
 **Contents**  **Contents**
   
 [[!toc levels=3]]  [[!toc levels=3]]
Line 22  This is all about Unicode on NetBSD.  Line 10  This is all about Unicode on NetBSD. 
   
 #  Note on wscons   #  Note on wscons 
   
 wscons doesn't support UTF-8; you'll need **X11** and a proper **X terminal emulator** for this to be of any use, or you get character mash for lunch! Only the [ASCII][40] part of Unicode, namely the **first 128 characters, will work** in your wscons console, as they overlap in both UTF-8 and ISO-8859 character sets:   wscons doesn't support UTF-8; you'll need **X11** and a proper **X terminal emulator** for this to be of any use, or you get character mash for lunch! Only the [ASCII](http://de.wikipedia.org/wiki/ASCII-Tabelle) part of Unicode, namely the **first 128 characters, will work** in your wscons console, as they overlap in both UTF-8 and ISO-8859 character sets: 
           
    [40]: http://de.wikipedia.org/wiki/ASCII-Tabelle (http://de.wikipedia.org/wiki/ASCII-Tabelle)  
   
         !"#$%&'()*+,-./0123456789:;<=>?          !"#$%&'()*+,-./0123456789:;<=>?
        @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_         @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_
        `abcdefghijklmnopqrstuvwxyz{|}~          `abcdefghijklmnopqrstuvwxyz{|}~
           
   
 #  Note on uwscons   
   
 Unofficial patches for 3.0 release can be found here: [ftp://tink.ims.ac.jp/pub/NetBSD/uwscons/][41]  
   
    [41]: ftp://tink.ims.ac.jp/pub/NetBSD/uwscons/ (ftp://tink.ims.ac.jp/pub/NetBSD/uwscons/)  
   
 #  pkgsrc   #  pkgsrc 
   
   * To make packages that support it use the ncurses library with wide-characters, add to /etc/mk.conf   To make packages that support it use the ncurses library with wide-characters, add to /etc/mk.conf 
           
       PKG_DEFAULT_OPTIONS+= ncursesw        PKG_DEFAULT_OPTIONS+= ncursesw
           
Line 48  Unofficial patches for 3.0 release can b Line 28  Unofficial patches for 3.0 release can b
   
 ##  ksh   ##  ksh 
   
   * Works.   Works. 
           
       chsh -s /bin/ksh        chsh -s /bin/ksh
           
   
 ##  mksh   ##  mksh 
   
   * This one is an OpenBSD based Korn shell, works pretty well compared to the pdksh.   This one is an OpenBSD based Korn shell, works pretty well compared to the pdksh. 
           
        cd /usr/pkgsrc/shells/mksh         cd /usr/pkgsrc/shells/mksh
        make install clean         make install clean
Line 64  Unofficial patches for 3.0 release can b Line 44  Unofficial patches for 3.0 release can b
   
 ##  zsh   ##  zsh 
   
   * Note: The stable version 4.2.x won't work. UTF-8 in the Z shell is enabled by default since 4.3.2.   Note: The stable version 4.2.x won't work. UTF-8 in the Z shell is enabled by default since 4.3.2. 
           
        cd /usr/pkgsrc/shells/zsh         cd /usr/pkgsrc/shells/zsh
        make install clean         make install clean
Line 73  Unofficial patches for 3.0 release can b Line 53  Unofficial patches for 3.0 release can b
   
 ##  tcsh   ##  tcsh 
   
   * Works out of the box.   Works out of the box. 
           
        cd /usr/pkgsrc/shells/tcsh         cd /usr/pkgsrc/shells/tcsh
        make install clean         make install clean
Line 82  Unofficial patches for 3.0 release can b Line 62  Unofficial patches for 3.0 release can b
   
 ##  bash   ##  bash 
   
   * Works out of the box.   Works out of the box. 
           
        cd /usr/pkgsrc/shells/bash         cd /usr/pkgsrc/shells/bash
        make install clean         make install clean
Line 91  Unofficial patches for 3.0 release can b Line 71  Unofficial patches for 3.0 release can b
   
 ##  Shell environment   ##  Shell environment 
   
   * Set the variables LANG and LC_CTYPE in your shell configuration file   Set the variables LANG and LC_CTYPE in your shell configuration file 
           
        export LANG="en_US.UTF-8"         export LANG="en_US.UTF-8"
        export LC_CTYPE="en_US.UTF-8"         export LC_CTYPE="en_US.UTF-8"
Line 135  The result should look like  Line 115  The result should look like 
   
 ##  urxvt   ##  urxvt 
   
   * recommended   recommended 
           
        cd /usr/pkgsrc/x11/rxvt-unicode         cd /usr/pkgsrc/x11/rxvt-unicode
        make install clean         make install clean
Line 161  The result should look like  Line 141  The result should look like 
   
 ##  screen   ##  screen 
   
   * .screenrc   `.screenrc` 
           
        defutf8 on         defutf8 on
        encoding UTF-8         encoding UTF-8
Line 169  The result should look like  Line 149  The result should look like 
   
 ##  lynx   ##  lynx 
   
   * .lynxrc   `.lynxrc`
           
        character_set=UNICODE (UTF-8)         character_set=UNICODE (UTF-8)
           
Line 205  and restart.  Line 185  and restart. 
   
 ##  vim   ##  vim 
   
   * .vimrc   `.vimrc`
           
        set encoding=utf-8                    set encoding=utf-8           
        set fileencoding=utf-8         set fileencoding=utf-8
Line 213  and restart.  Line 193  and restart. 
   
 ##  emacs   ##  emacs 
   
   * .emacs   `.emacs`
           
        ; === Set character encoding ===         ; === Set character encoding ===
        (setq locale-coding-system 'utf-8)         (setq locale-coding-system 'utf-8)
Line 231  This one gives you umlauts:  Line 211  This one gives you umlauts: 
   
 ##  mutt   ##  mutt 
   
   * mutt should work with all the above. If it doesn't, put in your .muttrc something like   mutt should work with all the above. If it doesn't, put in your .muttrc something like 
           
       set charset="utf-8:iso-8859-1"        set charset="utf-8:iso-8859-1"
           
Line 245  If you haven't set it in PKG_DEFAULT_OPT Line 225  If you haven't set it in PKG_DEFAULT_OPT
   
 ##  Apache2   ##  Apache2 
   
   * /usr/pkg/etc/httpd/httpd.conf   `/usr/pkg/etc/httpd/httpd.conf`
           
       AddDefaultCharset UTF-8        AddDefaultCharset UTF-8
           
   
 #  Converting files   #  Converting files 
   
   * If you have files containing non-ASCII ISO-8859 characters, your system will now assume they are UTF-8. They are not, so the characters in these files will be misinterpreted and tools that use them will start breaking. Use iconv, which is part of the base system, to convert them.   If you have files containing non-ASCII ISO-8859 characters, your system will now assume they are UTF-8. They are not, so the characters in these files will be misinterpreted and tools that use them will start breaking. Use iconv, which is part of the base system, to convert them. 
           
        iconv -f iso8859-1 -t utf-8 file >file.new         iconv -f iso8859-1 -t utf-8 file >file.new
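
The single-file command above can be extended to a whole directory. The following is only a sketch under stated assumptions: the function name `convert_dir` and the `*.txt` pattern are made up for illustration, and every file that is not already valid UTF-8 is assumed to be ISO-8859-1, since iconv cannot detect the source encoding by itself.

```shell
#!/bin/sh
# Sketch: convert all *.txt files in the current directory to UTF-8.
# Assumption: anything that is not valid UTF-8 is ISO-8859-1.
convert_dir() {
    for f in ./*.txt; do
        # Skip the unexpanded pattern when no file matches.
        [ -f "$f" ] || continue

        # Files that already decode as UTF-8 (this includes plain ASCII)
        # are left alone, so running the function twice is harmless.
        if iconv -f UTF-8 -t UTF-8 "$f" >/dev/null 2>&1; then
            continue
        fi

        # Write to a temporary file first and replace the original
        # only if the conversion succeeded.
        if iconv -f ISO-8859-1 -t UTF-8 "$f" >"$f.new"; then
            mv "$f.new" "$f"
        else
            rm -f "$f.new"
            echo "skipped: $f" >&2
        fi
    done
}
```

Call `convert_dir` from the directory that holds the files. The validity check is a heuristic: a short ISO-8859-1 file can, in rare cases, also be a valid UTF-8 byte sequence and would then be skipped, so keep a backup before converting anything important.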
           
