We successfully incorporated the Argon2 reference implementation into NetBSD/amd64 for our 2019 Google Summer of Coding project. We introduced our project here and provided some hints on how to select parameters here. For our final report, we will provide an overview of what changes were made to complete the project.

Incorporating the Argon2 Reference Implementation

The Argon2 reference implementation, available here, is available under both the Creative Commons CC0 1.0 and the Apache Public License 2.0. To import the reference implementation into src/external, we chose to use the Apache 2.0 license for this project.

During our initial phase 1, we focused on building the libargon2 library and integrating the functionality into the existing password management framework via libcrypt. Toward this end, we imported the reference implementation and created the "glue" to incorporate the changes into /usr/src/external/apache. The reference implementation is found in

m2$ ls /usr/src/external/apache2/argon2                                                                                    
Makefile dist     lib      usr.bin
The Argon2 reference implementation provides both a library and a binary. We build the libargon2 library to support libcrypt integration, and the argon2(1) binary to provide a userland command-line tool for evaluation. To build the code, we add MKARGON2 to
_MKVARS.yes= \
        MKARGON2 \
and add the following conditional build to /usr/src/external/apache2/Makefile
.if (defined(MKARGON2) && ${MKARGON2} != "no")
SUBDIR+= argon2
After successfully building and installation, we have the following new files and symlinks
To incorporate Argon2 into the password management framework of NetBSD, we focused on libcrypt. In /usr/src/lib/libcrypt/Makefile, we first check for MKARGON2
.if (defined(MKARGON2) && ${MKARGON2} != "no")
If HAVE_ARGON2 is defined and enabled, we append the following to the build flags
.if defined(HAVE_ARGON2)
SRCS+=          crypt-argon2.c
CFLAGS+=        -DHAVE_ARGON2 -I../../external/apache2/argon2/dist/phc-winner
LDADD+=         -largon2 
As hinted above, our most significant addition to libcrypt is the file crypt-argon2.c. This file pulls in the functionality of libargon2 into libcrypt. Changes were also made to pw_gensalt.c to allow for parameter parsing and salt generation.

Having completed the backend support, we pull Argon2 into userland tools, such as pwhash(1), in the same way as above

.if ( defined(MKARGON2) && ${MKARGON2} != "no" )
Once built, we can specify Argon2 using the '-A' command-line argument to pwhash(1), followed by the Argon2 variant name, and any of the parameterized values specified in argon2(1). See our first blog post for more details. As an example, to generate an argon2id encoding of the password password using default parameters, we can use the following
m2# pwhash -A argon2id password
To simplify Argon2 password management, we can utilize passwd.conf(5) to apply Argon2 to a specified user or all users. The same parameters are accepted as for argon2(1). For example, to specify argon2i with non-default parameters for user 'testuser', you can use the following in your passwd.conf
m1# grep -A1 testuser /etc/passwd.conf 
        localcipher = argon2i,t=6,m=4096,p=1
With the above configuration in place, we are able to support standard password management. For example
m1# passwd testuser
Changing password for testuser.
New Password:
Retype New Password:

m1# grep testuser /etc/master.passwd  


The argon2(1) binary allows us to easily validate parameters and encoding. This is most useful during performance testing, see here. With argon2(1), we can specify our parameterized values and evaluate both the resulting encoding and timing.
m2# echo -n password|argon2 somesalt -id -p 3 -m 8
Type:           Argon2id
Iterations:     3
Memory:         256 KiB
Parallelism:    3
Hash:           97f773f68715d27272490d3d2e74a2a9b06a5bca759b71eab7c02be8a453bfb9
Encoded:        $argon2id$v=19$m=256,t=3,p=3$c29tZXNhbHQ$l/dz9ocV0nJySQ09LnSiqb
0.000 seconds
Verification ok
We provide one approach to evaluating Argon2 parameter tuning in our second post. In addition to manual testing, we also provide some ATF tests for pwhash, for both hashing and verification. These tests are focus on encoding correctness, matching known encodings to test results during execution.

tp: t_argon2_v10_hash
tp: t_argon2_v10_verify
tp: t_argon2_v13_hash
tp: t_argon2_v13_verify

cd /usr/src/tests/usr.bin/argon2

info: atf.version, Automated Testing Framework 0.20 (atf-0.20)
info: tests.root, /usr/src/tests/usr.bin/argon2


tc-so:Executing command [ /bin/sh -c echo -n password | \
argon2 somesalt -v 13 -t 2 -m 8 -p 1 -r ]
tc-end: 1567497383.571791, argon2_v13_t2_m8_p1, passed



We have successfully integrated Argon2 into NetBSD using the native build framework. We have extended existing functionality to support local password management using Argon2 encoding. We are able to tune Argon2 so that we can achieve reasonable performance on NetBSD. In this final post, we summarize the work done to incorporate the reference implementation into NetBSD and how to use it. We hope you can use the work completed during this project. Thank you for the opportunity to participate in the Google Summer of Code 2019 and the NetBSD project!

Posted Sunday afternoon, January 12th, 2020 Tags:

Upstream describes LLDB as a next generation, high-performance debugger. It is built on top of LLVM/Clang toolchain, and features great integration with it. At the moment, it primarily supports debugging C, C++ and ObjC code, and there is interest in extending it to more languages.

In February 2019, I have started working on LLDB, as contracted by the NetBSD Foundation. So far I've been working on reenabling continuous integration, squashing bugs, improving NetBSD core file support, extending NetBSD's ptrace interface to cover more register types and fix compat32 issues, fixing watchpoint and threading support.

Throughout December I've continued working on our build bot maintenance, in particular enabling compiler-rt tests. I've revived and finished my old patch for extended register state (XState) in core dumps. I've started working on bringing proper i386 support to LLDB.

Generic LLVM updates

Enabling and fixing more test suites

In my last report, I've indicated that I've started fixing test suite regressions and enabling additional test suites belonging to compiler-rt. So far I've been able to enable the following suites:

  • builtins library (alternative to libgcc)
  • profiling library
  • ASAN (address sanitizer, static and dynamic)
  • CFI (control flow integrity)
  • LSAN (leak sanitizer)
  • MSAN (memory sanitizer)
  • SafeStack (stack buffer overflow protection)
  • TSAN (thread sanitizer)
  • UBSAN (undefined behavior sanitizer)
  • UBSAN minimal
  • XRay (function call tracing)

In case someone's wondering how different memory-related sanitizers differ, here's a short answer: ASAN covers major errors that can be detected with approximate 2x slowdown (out-of-bounds accesses, use-after-free, double-free...), LSAN focuses on memory leaks and has almost no overhead, while MSAN detects unitialized reads with 3x slowdown.

The following test suites were skipped because of major known breakage, pending investigation:

  • generic interception tests (test runner can't find tests?)
  • fuzzer tests (many test failures, most likely as a side effect of LSAN integration)
  • clangd tests (many test failures)

The changes done to improve test suite status are:

Repeating the rationale for disabling ASLR/MPROTECT from my previous report: the sanitizers or tools in question do not work with the listed hardening features by design, and we explicitly make them fail. We are using paxctl to disable the relevant feature per-executable, and this makes it possible to run the relevant tests on systems where ASLR and MPROTECT are enabled globally.

This also included two builtin tests. In case of clear_cache_test.c, this is a problem with test itself and I have submitted a better MPROTECT support for clear_cache_test already. In case of enable_execute_stack_test.c, it's a problem with the API itself and I don't think it can be fixed without replacing it with something more suitable for NetBSD. However, it does not seem to be actually used by programs created by clang, so I do not think it's worth looking into at the moment.

Demise and return of LLD

In my last report, I've mentioned that we've switched to using LLD as the linker for the second stage builds. Sadly, this was only to discover that some of the new test failures were caused exactly by that.

As I've reported back in January 2019, NetBSD's dynamic loader does not support executables with more than two segments. The problem has not been fixed yet, and we were so far relying on explicitly disabling the additional read-only segment in LLD. However, upstream started splitting the RW segment on GNU RELRO, effectively restoring three segments (or up to four, without our previous hack).

This forced me to initially disable LLD and return to GNU ld. However, upstream has suggested using -znorelro recently, and we were enable to go back down to two segments and reenable it.

libc++ system feature list update

Kamil has noticed that our feature list for libc++ is outdated. We have missed indicating that NetBSD supports aligned_alloc(), timespec_get() and C11 features. I have updated the feature list.

Current build bot status

The stage 1 build currently fails as upstream has broken libc++ builds with GCC. Hopefully, this will be fixed after the weekend.

Before that, we had a bunch of failing tests: 7 related to profiling and 4 related to XRay. Plus, the flaky LLDB tests mentioned earlier.

Core dump XState support finally in

I was working on including full register set ('XState') in core dumps before my summer vacation. I've implemented the requested changes and finally pushed them. The patch set included four patches:

  1. Include XSTATE note in x86 core dump, including preliminary support for machine-dependent core dump notes.

  2. Fix alignment when reading core notes fixing a bug in my tests for core dumps that were added earlier,

  3. Combine x86 register tests into unified test function simplifying the test suite a lot (by almost a half of the original lines).

  4. Add tests for reading registers from x86 core dumps covering both old and new notes.

NetBSD/i386 support for LLDB

As the next step in my LLDB work, I've started working on providing i386 support. This covers both native i386 systems, and 32-bit executable support on amd64. In total, the combined amd64/i386 support covers four scenarios:

  1. 64-bit kernel, 64-bit debugger, 64-bit executable (native 64-bit).

  2. 64-bit kernel, 64-bit debugger, 32-bit executable.

  3. 64-bit kernel, 32-bit debugger, 32-bit executable.

  4. 32-bit kernel, 32-bit debugger, 32-bit executable (native 32-bit).

Those cases are really different only from kernel's point-of-view. For scenarios 1. and 2. the debugger is using 64-bit ptrace API, while in cases 3. and 4. it is using 32-bit ptrace API. In case 2., the application runs via compat32 and the kernel fits its data into 64-bit ptrace API. In case 3., the debugger runs via compat32.

Technically, cases 1. and 2. are already covered by the amd64 code in LLDB. However, for user's convenience LLDB needs to be extended to recognize 32-bit processes on NetBSD and adjust the data obtained from ptrace to 32-bit executables. Cases 3. and 4. need to be covered via making the code build on i386.

Other LLDB plugins implement this via creating separate i386 and amd64 modules, then including 32-bit branch in amd64 that reuses parts of i386 code. I am following suit with that. My plan is to implement 32-bit process support for case 2. first, then port everything to i386.

So far I have implemented the code to recognize 32-bit processes and I have started implementing i386 register interface that is meant to map data from 64-bit ptrace register dumps. However, it does not seem to map registers correctly at the moment and I am still debugging the problem.

Future plans

As mentioned above, I am currently working on providing support for debugging 32-bit executables on amd64. Afterwards, I am going to work on porting LLDB to run on i386.

I am also tentatively addressing compiler-rt test suite problems in order to reduce the number of build bot failures. I also need to look into remaining kernel problems regarding simultaneous delivery of signals and breakpoints or watchpoints.

Furthermore, I am planning to continue with the items from the original LLDB plan. Those are:

  1. Add support to backtrace through signal trampoline and extend the support to libexecinfo, unwind implementations (LLVM, nongnu). Examine adding CFI support to interfaces that need it to provide more stable backtraces (both kernel and userland).

  2. Add support for aarch64 target.

  3. Stabilize LLDB and address breaking tests from the test suite.

  4. Merge LLDB with the base system (under LLVM-style distribution).

This work is sponsored by The NetBSD Foundation

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL to chip in what you can:

Posted early Monday morning, January 13th, 2020 Tags:
This month I have improved the NetBSD ptrace(2) API, removing one legacy interface with a few flaws and replacing it with two new calls with new features, and removing technical debt.

As LLVM 10.0 is branching now soon (Jan 15th 2020), I worked on proper support of the LLVM features for NetBSD 9.0 (today RC1) and NetBSD HEAD (future 10.0).

ptrace(2) API changes

There are around 20 Machine Independent ptrace(2) calls. The origin of some of these calls trace back to BSD4.3. The PT_LWPINFO call was introduced in 2003 and was loosely inspired by a similar interface in HP-UX ttrace(2). As that was the early in the history of POSIX threads and SMP support, not every bit of the interface remained ideal for the current computing needs.

The PT_LWPINFO call was originally intended to retrieve the thread (LWP) information inside a traced process.

This call was designed to work as an iterator over threads to retrieve the LWP id + event information. The event information is received in a raw format (PL_EVENT_NONE, PL_EVENT_SIGNAL, PL_EVENT_SUSPENDED).


1. PT_LWPINFO shares the operation name with PT_LWPINFO from FreeBSD that works differently and is used for different purposes:

  • On FreeBSD PT_LWPINFO returns pieces of information for the suspended thread, not the next thread in the iteration.
  • FreeBSD uses a custom interface for iterating over threads (actually retrieving the threads is done with PT_GETNUMLWPS + PT_GETLWPLIST).
  • There is almost no overlapping correct usage of PT_LWPINFO on NetBSD and PL_LWPINFO on FreeBSD, and this causes confusion and misuse of the interfaces (recently I fixed such misuse in the DTrace code).

2. pl_event can only return whether a signal was emitted to all threads or a single one. There is no information whether this is a per-LWP signal or per-PROC signal, no siginfo_t information is attached etc.

3. Syncing our behavior with FreeBSD would mean complete breakage of our PT_LWPINFO users and it is actually unnecessary, as we receive full siginfo_t through Linux-like PT_GET_SIGINFO, instead of reimplementing siginfo_t inside ptrace_lwpinfo in FreeBSD-style. (FreeBSD wanted to follow NetBSD and adopt some of our APIs in ptrace(2) and signals.).

4. Our PT_LWPINFO is unable to list LWP ids in a traced process.

5. The PT_LWPINFO semantics cannot be used in core files as-is (as our PT_LPWINFO returns next LWP, not the indicated one) and pl_event is redundant with netbsd_elfcore_procinfo.cpi_siglwp, and still less powerful (as it cannot distinguish between a per-LWP and a per-PROC signal in a single-threaded application).

6. PT_LWPINFO is already documented in the BUGS section of ptrace(2), as it contains additional flaws.


1. Remove PT_LWPINFO from the public ptrace(2) API, keeping it only as a hidden namespaced symbol for legacy compatibility.

2. Introduce the PT_LWPSTATUS that prompts the kernel about exact thread and retrieves useful information about LWP.

3. Introduce PT_LWPNEXT with the iteration semantics from PT_LWPINFO, namely return the next LWP.

4. Include per-LWP information in core(5) files as "PT_LWPSTATUS@nnn".

5. Fix flattening the signal context in netbsd_elfcore_procinfo in core(5) files, and move per-LWP signal information to the per-LWP structure "PT_LWPSTATUS@nnn".

6. Do not bother with FreeBSD like PT_GETNUMLWPS + PT_GETLWPLIST calls, as this is a micro-optimization. We intend to retrieve the list of threads once on attach/exec and later trace them through the LWP events (PTRACE_LWP_CREATE, PTRACE_LWP_EXIT). It's more important to keep compatibility with current usage of PT_LWPINFO.

7. Keep the existing ATF tests for PT_LWPINFO to avoid rot.

PT_LWPSTATUS and PT_LWPNEXT operate over newly introduced "struct ptrace_lwpstatus". This structure is inspired by: - SmartOS lwpstatus_t, - struct ptrace_lwpinfo from NetBSD, - struct ptrace_lwpinfo from FreeBSD

and their usage in real existing open-source software.

#define PL_LNAMELEN 20 /* extra 4 for alignment */

struct ptrace_lwpstatus {
 lwpid_t  pl_lwpid;  /* LWP described */
 sigset_t pl_sigpend;  /* LWP signals pending */
 sigset_t pl_sigmask;  /* LWP signal mask */
 char  pl_name[PL_LNAMELEN]; /* LWP name, may be empty */
 void  *pl_private;  /* LWP private data */
 /* Add fields at the end */

  • pt_lwpid is picked from PT_LWPINFO.
  • pl_event is removed entirely as useless, misleading and harmful.
  • pl_sigpend and pl_sigmask are mainly intended to untangle the cpi_sig* fields from "struct ptrace_lwpstatus" (fix "XXX" in the kernel code).
  • pl_name is an easy to use API to retrieve the LWP name, replacing sysctl() retrieval. (Previous algorithm: retrieve the number of LWPs, retrieve all LWPs; iterate over them; finding the matching ID; copy the LWP name.) pl_name will also be included with the missing LWP name information in core(5) files.
  • pl_private implements currently missing interface to read the TLS base value.

I have decided to avoid a writable version of PT_LWPSTATUS that rewrites signals, name, or private pointer. These options are practically unused in existing open-source software. There are two exceptions that I am familiar with, but both are specific to kludges overusing ptrace(2). If these operations are needed, they can be implemented without a writable version of PT_LWPSTATUS, patching tracee's code.

I have switched GDB (in base), LLDB, picotrace and sanitizers to the new API. As NetBSD 9.0 is nearing release, this API change will land NetBSD 10.0 and existing ptrace(2) software will use PT_LWPINFO for now.

New interfaces are ensured to be stable and continuously verified by the ATF infrastructure.


In the early in the history of libpthread, the NetBSD developers designed and programmed a libpthread_dbg library. It's use-case was initially intended to handle user-space scheduling of threads in the M:N threading model inspired by Solaris.

After the switch of the internals to new SMP design (1:1 model) by Andrew Doran, this library lost its purpose and was no longer used (except being linked for some time in a local base system GDB version). I removed the libpthread_dbg when I modernized the ptrace(2) API, as it no longer had any use (and it was broken in several ways for years without being noticed).

As I have introduced the PT_LWPSTATUS call, I have decided to verify this interface in a fancy way. I have mapped ptrace_lwpstatus::pl_private into the tls_base structure as it is defined in the sys/tls.h header:

struct tls_tcb {   
        void    **tcb_dtv;
        void    *tcb_pthread;
        void    *tcb_self;
        void    **tcb_dtv;
        void    *tcb_pthread;

The pl_private pointer is in fact a pointer to a structure in debugger's address space, pointing to a tls_tcl structure. This is not true universally in every environment, but it is true in regular programs using the ELF loader and the libpthread library. Now, with the tcb_pthread field we can reference a regular C-style pthread_t object. Now, wrapping it into a real tracer, I have implemented a program that can either start a debuggee or attach to a process and on demand (as a SIGINFO handler, usually triggered in the BSD environment with ctrl-t) dump the full state of pthread_t objects within a process. A part of the example usage is below:

$ ./pthreadtracer -p `pgrep nslookup` 
[ 21088.9252645] load: 2.83  cmd: pthreadtracer 6404 [wait parked] 0.00u 0.00s 0% 1600k
DTV=0x7f7ff7ee70c8 TCB_PTHREAD=0x7f7ff7e94000
LID=4 NAME='sock-0' TLS_TSD=0x7f7ff7eed890
pt_self = 0x7f7ff7e94000
pt_tls = 0x7f7ff7eed890
pt_magic = 0x11110001 (= PT_MAGIC=0x11110001)
pt_state = 1
pt_lock = 0x0
pt_flags = 0
pt_cancel = 0
pt_errno = 35
pt_stack = {.ss_sp = 0x7f7fef9e0000, ss_size = 4194304, ss_flags = 0}
pt_stack_allocated = YES
pt_guardsize = 65536

Full log is stored here. The source code of this program, on top of picotrace is here.

The problem with this utility is that it requires libpthread sources available and reachable by the build rules. pthreadtracer reaches each field of pthread_t knowing its exact internal structure. This is enough for validation of PT_LWPSTATUS, but is it enough for shipping it to users and finding its real world use-case? Debuggers (GDB, LLDB) using debug information can reach the same data with DWARF, but supporting DWARF in pthreadtracer is currently harder than it ought to be for the interface tests. There is also an option to revive at some point libpthread_dbg(3), revamping it for modern libpthread(3), this would help avoid DWARF introspection and it could find some use in self-introspection programs, but are there any?


I keep searching for a solution to properly support lld (LLVM linker).

NetBSD's major issue with LLVM lld is the lack of standalone linker support, therefore being a real GNU ld replacement. I was forced to publish a standalone wrapper for lld, called lld-standalone and host it on GitHub for the time being, at least until we will sort out the talks with LLVM developers.

LLVM sanitizers

As the NetBSD code is evolving, there is a need to support multiple kernel versions starting from 9.0 with the LLVM sanitizers. I have introduced the following changes:

  • [compiler-rt] [netbsd] Switch to syscall for ThreadSelfTlsTcb()
  • [compiler-rt] [netbsd] Add support for versioned statvfs interceptors
  • [compiler-rt] Sync NetBSD ioctl definitions with 9.99.26
  • [compiler-rt] [fuzzer] Include stdarg.h for va_list
  • [compiler-rt] [fuzzer] Enable LSan in libFuzzer tests on NetBSD
  • [compiler-rt] Enable SANITIZER_CAN_USE_PREINIT_ARRAY on NetBSD
  • [compiler-rt] Adapt stop-the-world for ptrace changes in NetBSD-9.99.30
  • [compiler-rt] Adapt for ptrace(2) changes in NetBSD-9.99.30

The purpose of these changes is as follows:

  • Stop using internal interface to retrieve the tcl_tcb struct (TLS base) and switch to public API with the syscall _lwp_getprivate(2). While there, I have harmonized the namespacing of __lwp_getprivate_fast() and __lwp_gettcb_fast() in the NetBSD distribution. Now, every port will need to use the same define (-D_RTLD_SOURCE, -D_LIBC_SOURCE or -D__LIBPTHREAD_SOURCE__). Previously these interfaces were conflicting with the public namespaces (affecting kernel builds) and wrongly suggesting that these interfaces might be available to public third party code. Initially I used it in LLVM sanitizers, but switched it to full-syscall _lwp_getspecific().
  • Nowadays almost every mainstream OS implements support for preinit/initarray/finitarray in all ports, regardless of ABI requirements. NetBSD originally supported these features only when they were mandated by an ABI specification. Christos Zoulas in 2018 enabled these features for all CPUs, and this eventually allowed to enable this feature unconditionally for consumption in the sanitizer code. This allows use of the same interface as Linux or Solaris, rather than relying on C++-style constructors that have their own issues (need to abuse priorities of constructors and lack of guarantee that our code will be called before other constructors, which can be fatal).
  • Support for kernels between 9.0 and 9.99.30 (and later, unless there are breaking changes).

There is still one portability issue in the sanitizers, as we hard-code the offset of the link_map field within the internal dlopen handle pointer. The dlopen handler is internal to the ELF loader object of type Obj_Entry. This type is not available to third party code and it is not stable. It also has a different layout depending on the CPU architecture. The same problem exists for at least FreeBSD, and to some extent to Linux. I have prepared a patch that utilizes the dlinfo(3) call with option RTLD_DI_LINKMAP. Unfortunately there is a regression with MSan on NetBSD HEAD (it works on 9.0rc1) that makes it harder for me to finalize the patch. I suspect that after the switch to GCC 8, there is now incompatible behavior that causes a recursive call sequence: _Unwind_Backtrace() calling _Unwind_Find_FDE(), calling search_object, and triggering the __interceptor_malloc interceptor again, which calls _Unwind_Backtrace(), resulting in deadlock. The offending code is located in src/external/gpl3/gcc/dist/libgcc/unwind-dw2-fde.c and needs proper investigation. A quick workaround to stop recursive stack unwinding unfortunately did not work, as there is another (related?) problem:

==4629==MemorySanitizer CHECK failed:
/public/llvm-project/llvm/projects/compiler-rt/lib/msan/msan_origin.h:104 "((stack_id)) != (0)" (0x0, 0x0)

This shows that this low-level code is very sensitive to slight changes, and needs maintenance power. We keep improving the coverage of tested scenarios on the LLVM buildbot, and we enabled sanitizer tests on 9.0 NetBSD/amd64; however we could make use of more manpower in order to reach full Linux parity in the toolchain.

Other changes

As my project in LLVM and ptrace(2) is slowly concluding, I'm trying to finalize the related tasks that were left behind.

I've finished researching why we couldn't use syscall restart on kevent(2) call in LLDB and improved the system documentation on it. I have also fixed small nits in the NetBSD wiki page on kevent(2).

I have updated the list of ELF defines for CPUs and OS ABIs in sys/exec_elf.h.

Plan for the next milestone

Port remaining ptrace(2) test scenarios from Linux, FreeBSD and OpenBSD to ATF and ensure that they are properly operational.

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL to chip in what you can:

Posted Monday evening, January 13th, 2020 Tags:
Add a comment