File:  [NetBSD Developer Wiki] / wikisrc / tutorials / unicode.mdwn
Revision 1.4
Sun Sep 27 12:21:10 2020 UTC (3 months, 4 weeks ago) by leot
Branches: MAIN
CVS tags: HEAD
Setting LANG is enough to adjust user's locale, suggests that.

How to use wide-range characters a.k.a. UTF-8 in NetBSD. 


[[!toc levels=3]]

#  Introduction 

This is all about Unicode on NetBSD. 

#  Note on wscons 

wscons doesn't support UTF-8, so you'll need **X11** and a proper **X terminal emulator** for this to be of any use, or you get character mash for lunch! Only the ASCII part of Unicode, namely the **first 128 characters**, will work in your wscons console, because those characters are encoded identically in UTF-8 and the ISO-8859 character sets.
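
The overlap can be seen by converting a few bytes with iconv and dumping the result as hex. This is only a small sketch; the variable names `ascii_hex` and `latin_hex` are made up for the illustration.

```shell
# Convert one pure-ASCII string and one non-ASCII byte from ISO-8859-1 to
# UTF-8, then dump the resulting bytes as hex.
ascii_hex=$(printf 'abc' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n')
# \344 is the octal escape for byte 0xE4, which is "ä" in ISO-8859-1.
latin_hex=$(printf '\344' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n')

echo "ASCII 'abc' as UTF-8: $ascii_hex"
echo "ISO-8859-1 0xE4 as UTF-8: $latin_hex"
```

The ASCII string comes out as the same three bytes (`616263`), while the single byte `e4` turns into the two-byte sequence `c3a4`, which is what a non-UTF-8 console cannot display properly.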

#  pkgsrc 

To make packages that support it use the ncurses library with wide characters, add to `/etc/mk.conf`:

      PKG_DEFAULT_OPTIONS+= ncursesw

#  Soup up a shell 

##  ksh 

      chsh -s /bin/ksh

##  mksh 

This one is an OpenBSD-based Korn shell; it works pretty well compared to pdksh.

       cd /usr/pkgsrc/shells/mksh
       make install clean
       chsh -s /usr/pkg/bin/mksh

##  zsh 

UTF-8 in the Z shell has been enabled by default since 4.3.2.

       cd /usr/pkgsrc/shells/zsh
       make install clean
       chsh -s /usr/pkg/bin/zsh

##  tcsh 

Works out of the box.

       cd /usr/pkgsrc/shells/tcsh
       make install clean
       chsh -s /usr/pkg/bin/tcsh

##  bash 

Works out of the box.

       cd /usr/pkgsrc/shells/bash
       make install clean
       chsh -s /usr/pkg/bin/bash

##  Shell environment 

Set the variable `LANG` in your shell configuration file:

       export LANG="en_US.UTF-8"

or, if you have a C-style shell:

       setenv LANG "en_US.UTF-8"

To check the result, run:

       % locale
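
For a scripted check, the locale name itself can be inspected. The following is a minimal sketch for Bourne-style shells; `is_utf8_locale` is a hypothetical helper written for this example, not something shipped with NetBSD.

```shell
# Hypothetical helper (not part of NetBSD): succeeds when a locale name
# selects the UTF-8 codeset, e.g. "en_US.UTF-8". The lower-case and
# dash-less spellings are accepted too, since some systems print them.
is_utf8_locale() {
    case "$1" in
        *.[Uu][Tt][Ff]-8*|*.[Uu][Tt][Ff]8*) return 0 ;;
        *) return 1 ;;
    esac
}

# LC_ALL overrides LC_CTYPE, which in turn overrides LANG.
effective=${LC_ALL:-${LC_CTYPE:-${LANG:-}}}
if is_utf8_locale "$effective"; then
    echo "UTF-8 locale in effect: $effective"
else
    echo "not a UTF-8 locale: ${effective:-unset}"
fi
```

The check only looks at the name, so it cannot tell whether the locale is actually installed on the system.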

#  X Terminal emulators 

##  xterm 

  * Versions 239 and later work well with the default "fixed" font 
  * Also works with the TrueType DejaVu Mono font 
  * Appears to have trouble with some other fonts such as Bitstream Vera Sans Mono, which covers fewer characters than DejaVu, the font family derived from it 

##  gnome-terminal 

  * Awesome and works great with the ttf Bitstream Vera Sans Mono or DejaVu Mono. 
  * Somewhat bloated considering the dependencies. 

##  urxvt 

       cd /usr/pkgsrc/x11/rxvt-unicode
       make install clean

##  uxterm 

  * Works, as the 'u' might suggest, but last time I checked it sucked. Anyone? 

##  aterm 

  * Doesn't work and probably never will. 

##  Eterm 

  * Doesn't work either. Last time I checked the author was too busy with real-life. 

#  Utilities 

##  less 

  * Set the shell environment variable `LESSCHARSET` to "`utf-8`". 
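
As a sketch, the setting could look like this in the configuration file of a Bourne-style shell; the C-shell form is shown as a comment because it uses different syntax.

```shell
# Bourne-style shells (sh, ksh, mksh, bash, zsh):
LESSCHARSET=utf-8
export LESSCHARSET

# C-style shells (csh, tcsh) would use this instead:
#   setenv LESSCHARSET utf-8
```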

##  screen 

       defutf8 on
       encoding UTF-8

##  lynx 

       character_set=UNICODE (UTF-8)

Or change "Display character set" in the options menu. 

##  irssi 

       /set recode_autodetect_utf8 yes
       /set recode_fallback iso-8859-1
       /set recode_out_default_charset UTF-8
       /set term_charset UTF-8

Replace `iso-8859-1` with whatever fallback charset seems fit.
##  silc-client 
       /set term_type utf-8

and restart. 

##  vi 

  * NetBSD's vi is based on nvi. It doesn't support wide-range characters as of version 1.79nb16 from 10/23/96, which is the one in -current (4.99.15) and all earlier releases. 

##  nvi 

pkgsrc's nvi (v1.81.6) works with wide-range characters if built with the `wide-curses` option,
e.g. by adding to `mk.conf`:

      PKG_OPTIONS.nvi+= wide-curses

##  vim 

       set encoding=utf-8           
       set fileencoding=utf-8

##  emacs 

       ; === Set character encoding ===
       (setq locale-coding-system 'utf-8)
       (set-terminal-coding-system 'utf-8)
       (set-keyboard-coding-system 'utf-8)
       (set-selection-coding-system 'utf-8)
       (prefer-coding-system 'utf-8)

This one gives you umlauts:

       ; === Make ä, ö, ü, ß work ===
       (set-language-environment 'german)

##  mutt 

mutt should work with all the above. If it doesn't, put something like this in your `.muttrc`:

      set charset="utf-8:iso-8859-1"

If you haven't set it in `PKG_DEFAULT_OPTIONS` already, you may also add to `mk.conf`:

      PKG_OPTIONS.mutt+= ncursesw

#  Servers 

##  Apache2 

      AddDefaultCharset UTF-8

#  Converting files 

Once your locale is UTF-8, your system will assume that files containing non-ASCII ISO-8859 characters are UTF-8. They're not, so the characters in these files will be misinterpreted and tools that use them will start breaking. Use iconv, which is part of the base system, to convert them:

       iconv -f iso8859-1 -t utf-8 file >
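
A complete round trip might look like the sketch below. It assumes the input really is ISO-8859-1; the scratch directory and the file names `latin1.txt` and `utf8.txt` are hypothetical and only serve the illustration.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
src="$workdir/latin1.txt"   # hypothetical input file name
dst="$workdir/utf8.txt"     # hypothetical output file name

# Create sample input: "Grüße" encoded as ISO-8859-1 (\374 = ü, \337 = ß).
printf 'Gr\374\337e\n' > "$src"

# Convert into a separate file; redirecting onto the input file itself
# would truncate it before iconv gets to read it.
iconv -f ISO-8859-1 -t UTF-8 "$src" > "$dst"

# Dump the converted bytes as hex to confirm the two-byte UTF-8 sequences.
converted_hex=$(od -An -tx1 "$dst" | tr -d ' \n')
echo "$converted_hex"
```

Here `ü` and `ß` come out as `c3bc` and `c39f`. Once the new file looks right, it can be moved over the original.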

#  Filesystems 

  * Be careful with special characters in filenames, as they'll look weird when you try to access them from a non-Unicode environment. 
