

This page is a blog mirror of sorts. It pulls in articles from blog's feed and publishes them here (with a feed, too).

Prepared by Michał Górny (mgorny AT gentoo.org).

LLD is the link editor (linker) component of the Clang toolchain. Its main advantages over GNU ld are a much lower memory footprint and faster linking. It is of specific interest to me since currently 8 GiB of memory are insufficient to link LLVM statically (which is the upstream default).

The first goal of LLD porting is to ensure that LLD can produce working NetBSD executables, and be used to build LLVM itself. Then, it is desirable to look into trying to build additional NetBSD components, and eventually into replacing /usr/bin/ld entirely with lld.

In this report, I would like to briefly summarize the issues I have found so far trying to use LLD on NetBSD.


RPATH is used to embed a library search path in the executable. Since it takes precedence over default system library paths, it can be used both to specify the location of additional program libraries and to override system libraries.

Currently, RPATH can be embedded in executables using two tags: the “old” DT_RPATH tag and the “new” DT_RUNPATH tag. The existence of two tags comes from behavior exhibited by some operating systems (e.g. glibc systems): DT_RPATH used to take precedence over LD_LIBRARY_PATH, making it impossible to override the paths specified there. Therefore, a new DT_RUNPATH tag was added that comes after LD_LIBRARY_PATH in precedence. When both DT_RPATH and DT_RUNPATH are specified, the former is ignored.

On NetBSD, DT_RPATH does not take precedence over LD_LIBRARY_PATH. Therefore, there was never a need for DT_RUNPATH, and support for it (as an alias to DT_RPATH) was added only very recently: on 2018-12-30.

Unlike GNU ld, LLD uses the “new” tag by default and therefore produces executables whose RPATHs do not work on older NetBSD versions. Given that using DT_RUNPATH on NetBSD has no real advantage, using the --disable-new-dtags option to suppress them is preferable.

More than two PT_LOAD segments

PT_LOAD segments are used to map the executable image into memory. Traditionally, GNU ld produces exactly two PT_LOAD segments: an RX text (code) segment and an RW data segment. The NetBSD dynamic loader (ld.elf_so) hardcodes the assumption of that design. However, lld sometimes produces an additional read-only data segment, causing the assertions to fail in the dynamic loader.

I have attempted to rewrite the memory mapping routine to allow for an arbitrary number of segments. However, apparently my patch is just “a step in the wrong direction” and Joerg Sonnenberger is working on a proper fix.

Alternatively, LLD has a --no-rosegment option that can be used to suppress the additional segment and work around the problem.
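
To check a linked binary for the two issues above (a DT_RUNPATH-only search path and more than two PT_LOAD segments), the following added example, which is not code from the original post or from LLD/NetBSD, parses a little-endian ELF64 image. The function names, the findings text and the constructed sample image are assumptions made for illustration; /usr/pkg/lib is used only as a sample path.

```python
import struct

PT_LOAD, PT_DYNAMIC = 1, 2
DT_NULL, DT_STRTAB, DT_RPATH, DT_RUNPATH = 0, 5, 15, 29
EHDR = "<16sHHIQQQIHHHHHH"   # Elf64_Ehdr, 64 bytes
PHDR = "<IIQQQQQQ"           # Elf64_Phdr, 56 bytes
DYN = "<qQ"                  # Elf64_Dyn, 16 bytes


def inspect_elf64(data):
    """Return the PT_LOAD count and RPATH/RUNPATH strings of an ELF64 image."""
    if len(data) < 64 or data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    hdr = struct.unpack_from(EHDR, data, 0)
    e_phoff, e_phentsize, e_phnum = hdr[5], hdr[9], hdr[10]
    phdrs = [struct.unpack_from(PHDR, data, e_phoff + i * e_phentsize)
             for i in range(e_phnum)]
    loads = [p for p in phdrs if p[0] == PT_LOAD]

    def file_offset(vaddr):
        # DT_STRTAB holds a virtual address; map it back through PT_LOAD.
        for _type, _flags, off, base, _paddr, filesz, _memsz, _align in loads:
            if base <= vaddr < base + filesz:
                return vaddr - base + off
        raise ValueError("address 0x%x is not backed by any PT_LOAD" % vaddr)

    dyn = {}
    for p in phdrs:
        if p[0] != PT_DYNAMIC:
            continue
        # p[2] is p_offset, p[5] is p_filesz; stop before a truncated entry.
        for pos in range(p[2], p[2] + p[5] - 15, 16):
            tag, val = struct.unpack_from(DYN, data, pos)
            if tag == DT_NULL:
                break
            dyn.setdefault(tag, val)

    info = {"pt_load": len(loads), "rpath": None, "runpath": None}
    if DT_STRTAB in dyn:
        strtab = file_offset(dyn[DT_STRTAB])
        for key, tag in (("rpath", DT_RPATH), ("runpath", DT_RUNPATH)):
            if tag in dyn:
                start = strtab + dyn[tag]
                info[key] = data[start:data.index(b"\0", start)].decode()
    return info


def netbsd_findings(info):
    """Apply the two compatibility checks described in the text above."""
    out = []
    if info["runpath"] is not None and info["rpath"] is None:
        out.append("DT_RUNPATH only: search path is not honoured by ld.elf_so "
                   "predating DT_RUNPATH support; relink with --disable-new-dtags")
    if info["pt_load"] > 2:
        out.append("%d PT_LOAD segments: ld.elf_so expects two; "
                   "relink with --no-rosegment" % info["pt_load"])
    return out


def build_sample(path_tag, n_load):
    """Constructed input: a minimal ELF64 image (not real linker output)."""
    strtab = b"\0/usr/pkg/lib\0"
    phnum = n_load + 1                      # PT_LOADs plus one PT_DYNAMIC
    str_off = 64 + 56 * phnum
    dyn_off = str_off + len(strtab)
    dyn = struct.pack("<qQqQqQ", DT_STRTAB, str_off, path_tag, 1, DT_NULL, 0)
    size = dyn_off + len(dyn)
    ehdr = struct.pack(EHDR, b"\x7fELF\x02\x01\x01" + bytes(9), 3, 62, 1,
                       0, 64, 0, 0, 64, 56, phnum, 0, 0, 0)
    # First PT_LOAD maps the whole file at vaddr 0; the rest are empty fillers.
    phdrs = struct.pack(PHDR, PT_LOAD, 5, 0, 0, 0, size, size, 0x1000)
    for i in range(1, n_load):
        phdrs += struct.pack(PHDR, PT_LOAD, 4, 0, i << 16, i << 16, 0, 0, 0x1000)
    phdrs += struct.pack(PHDR, PT_DYNAMIC, 6, dyn_off, dyn_off, dyn_off,
                         len(dyn), len(dyn), 8)
    return ehdr + phdrs + strtab + dyn


if __name__ == "__main__":
    for label, image in (("lld-like defaults", build_sample(DT_RUNPATH, 3)),
                         ("GNU ld-like layout", build_sample(DT_RPATH, 2))):
        info = inspect_elf64(image)
        print(label, info)
        for finding in netbsd_findings(info):
            print("  -", finding)
```

To inspect a real binary, pass the bytes read from the file to inspect_elf64() instead of the constructed image.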

Clang/LLD driver design issues

Both GCC and Clang use a design based on a front-end driver component. That is, the executable called directly by the user is a driver whose purpose is to perform initial command-line option and input processing, and run appropriate tools performing the actual work. Those tools may include the C preprocessor (cpp), C/C++ compiler (cc1), assembler, link editor (ld).

This follows the original UNIX principle of simple tools that perform a single task well, and a wrapper that combines those tools into complete workflows. Interestingly enough, it makes it possible to keep all system-specific defaults and logic in a single place, without having to make every single tool aware of them. Instead, they are passed to those tools as command-line options.

This also accounts for simpler and more portable build system design. The gcc/clang driver provides a single high-level interface for performing a multitude of tasks, including compiling assembly files or linking executables. Therefore, the build system and the user do not need to be explicitly aware of low-level tooling and its usage. Not to mention it makes it much easier to swap that tooling transparently.

For example, if you are linking an executable via the driver, it takes care of finding the appropriate link editor (and makes it easy to change it via -fuse-ld), preparing appropriate command-line options (e.g. if you do a -m32 multilib build, it sets the emulation for you) and passes necessary libraries to link (e.g. an appropriate standard C++ library when building a C++ program).

The clang toolchain considers LLD an explicit part of this workflow, and — unlike GNU ld — ld.lld is not really suitable for using stand-alone. For example, it does not include any standard search paths for libraries, expecting the driver to provide them in the form of appropriate -L options. This way, all the logic responsible for figuring out the operating system used (including possible cross-compilation scenarios) and using appropriate paths is located in one component.

However, Joerg Sonnenberger disagrees with this and believes LLD should contain all the defaults necessary for it to be used stand-alone on NetBSD. Effectively, we have two conflicting designs: one where all logic is in the clang driver, and the other where some of the logic is moved into LLD. At this moment, LLD is following the former assumption, and the clang driver for NetBSD — the latter. As a result, neither using LLD directly nor via clang works out of the box on NetBSD; to use either, the user would have to pass all appropriate -L and -z options explicitly.

Fixing LLD to work with the current clang driver would require adding target awareness to LLD and changing a number of defaults for NetBSD based on the target used. However, LLD maintainer Rui Ueyama is opposed to introducing this extra logic specifically for NetBSD, and believes it should be added to the clang driver as for other platforms. On the other side, the NetBSD toolchain driver maintainer Joerg Sonnenberger blocks adding this to the driver. Therefore, we have reached an impasse that prevents LLD from working out of the box on NetBSD without local patches.

A work-in-progress implementation of the local target logic approach requested by Joerg can be seen in D56650. Afterwards, additional behavior can be enabled on NetBSD by using target triple properties such as in D56215 (which copies the libdir logic from clang).

For comparison, the same problem solved in a way consistent with other distributions (and rejected by Joerg) can be seen in D56932 (which updates D33726). However, in some cases it will require adding additional options to LLD (e.g. -z nognustack, D56554), and corresponding dummy switches in GNU ld.

Handling of indirect shared library dependencies

When starting a program, the dynamic loader needs to find and load all shared libraries listed via DT_NEEDED entries in order to obtain symbols needed by the program (functions, variables). Naturally, it also needs to process DT_NEEDED entries of those libraries to satisfy their symbol dependencies, and so on. As a result of this, the program can also reference symbols declared in dependencies of its DT_NEEDED libraries, that is its indirect dependencies.

While linking executables, link editors normally verify that all symbols can be resolved in one of the linked libraries. Historically, GNU ld followed the logic used by the dynamic loader and permitted symbols used by the program to be present in either direct or indirect dependencies. However, GNU gold, LLD and newer versions of GNU ld use different logic and permit only symbols provided by the direct dependencies.

Let's take an example: you are writing a program that works with .zip files, and therefore you link -lzip. However, you also implement support for .gz files, and therefore call gzopen() provided by -lz which is also a dependency of libzip.so. Now, with old GNU ld versions you could just use -lzip since it would indirectly include libz.so. However, modern linkers will refuse to link it claiming that gzopen is undefined. You need to explicitly link -lzip -lz to resolve that.

Joerg Sonnenberger disagrees with this new behavior, and explicitly preserves the old behavior in the GNU ld version used in NetBSD. However, LLD does not support the historical GNU ld behavior at all. It will probably become necessary to implement it from scratch to support NetBSD fully.


At this point, it seems that using LLD for the majority of regular packages is a goal that can be achieved soon. The main blocker right now is the disagreement between developers on how to proceed. When we can resolve that and agree on a single way forward, most of the patches become trivial.

Sadly, at this point I really do not see any way to convince either of the sides. The problem was reported in May 2017, and in the same month a fix consistent with all other platforms was provided. However, it is being blocked and LLD can not work out of the box.

Hopefully, we will be able to finally find a way forward that does not involve keeping the upstream clang driver for NetBSD incompatible with upstream LLD, and carrying a number of patches to LLD locally to make it work.

The further goals include attempting to build the kernel and complete userland using LLD, as well as further verifying compatibility of executables produced with various combinations of linker options (e.g. static PIE executables).

Posted late Friday evening, January 18th, 2019 Tags: blog

I've finished the process of upstreaming patches to LLVM sanitizers (almost 2000 LOC of local code) and submitted to upstream new improvements for the NetBSD support. Today out of the box (in unpatched version) we have support for a variety of compiler-rt LLVM features: ASan (finds unauthorized memory access), UBSan (finds unspecified code semantics), TSan (finds threading bugs), MSan (finds uninitialized memory use), SafeStack (double stack hardening), Profile (code coverage), XRay (dynamic code tracing); while other ones such as Scudo (hardened allocator) or DFSan (generic data flow sanitizer) are not far away from completeness.

The NetBSD support is no longer visibly lagging behind Linux in sanitizers, although there are still failing tests on NetBSD that are not observed on Linux. On the other hand there are features working on NetBSD that are not functional on Linux, like sanitizing programs during early initialization process of OS (this is caused by /proc dependency on Linux that is mounted by startup programs, while NetBSD relies on sysctl(3) interfaces that are always available).

Changes in compiler-rt

A number of patches have been merged upstream this month. Part of the upstreamed code was originally written by Yang Zheng during GSoC-2018. My work was about cleaning the patches, applying comments from upstream review and writing new regression tests. Some of the changes were newly written over the past month, like background thread support in ASan/NetBSD or NetBSD compatible per-thread cleanup destructors in ASan and MSan.

Additionally, I've also ported the LLVM profile (--coverage) feature to NetBSD and investigated the remaining failing tests. Part of the failures were caused by a copy of an older runtime already present inside the NetBSD libc (ABIv2 libc vs ABIv4 current). The remaining two tests are affected by incompatible behavior of atexit(3) in Dynamic Shared Objects. Replacing the functionality with destructors didn't work and I've marked these tests as expected failures and moved on.

Changes in compiler-rt:

  • 3d5a3668a Reenable hard_rss_limit_mb_test.cc for android-26
  • decb231c3 Add support for background thread on NetBSD in ASan
  • 3ebc523bb Fix a mistake in previous
  • df1f46250 Update NetBSD ioctl(2) entries with 8.99.28
  • 2de4ff725 Enable asan_and_llvm_coverage_test.cc for NetBSD
  • 4d9ac421b Reimplement Thread Static Data MSan routines with TLS
  • f4a536af4 Adjust NetBSD/sha2.cc to be portable to more environments
  • 21bd4bd9f Adjust NetBSD/md2.cc to be portable to more environments
  • 1f2d0324e Adjust NetBSD/md[45].cc to be portable to more environments
  • 2835fe7cf Add support for LLVM profile for NetBSD
  • 52af2fe7c Reimplement Thread Static Data ASan routines with TLS
  • 79d385b5c Improve the comment in previous
  • 384486fa4 Expand TSan sysroot workaround to NetBSD
  • 81e370964 Enable test/msan/pthread_getname_np.cc for NetBSD
  • 5088473c7 Fix internal_sleep() for NetBSD
  • cd24f2f94 Mark interception_failure_test.cc as passing for NetBSD and asan-dynamic-runtime
  • ececda6ca Set shared_libasan_path in lit tests for NetBSD
  • 429bc2d51 Add a new interceptors for cdbr(3) and cdbw(3) API from NetBSD
  • 71553eb50 Add new interceptors for vis(3) API in NetBSD
  • a3e78a793 Add data types needed for md2(3)/NetBSD interceptors
  • 0ddb9d099 Add interceptors for the sha2(3) from NetBSD
  • 9e2ff43a6 Add interceptors for md2(3) from NetBSD
  • 42ac31ee6 Add new interceptors for FILE repositioning stream
  • 8627b4b30 Fix a typo in the strtoi test
  • 660f7441b Revert a chunk of previous change in sanitizer_platform_limits_netbsd.h
  • 9a087462c Add interceptors for md5(3) from NetBSD
  • 086caf6a2 Add interceptors for the rmd160(3) from NetBSD
  • 8f77a2e89 Add interceptors for the md4(3) from NetBSD
  • 27af3db52 Add interceptors for the sha1(3) from NetBSD
  • 6b9f7889b Add interceptors for the strtoi(3)/strtou(3) from NetBSD
  • 195044df9 Add a new interceptors for statvfs1(2) and fstatvfs1(2) from NetBSD
  • f0835eb01 Add a new interceptor for fparseln(3) from NetBSD
  • 11ecbe602 Add new interceptor for strtonum(3)
  • 19b47fcc0 Remove XFAIL in get_module_and_offset_for_pc.cc for NetBSD-MSan
  • b3a7f1d78 Add a new interceptor for modctl(2) from NetBSD
  • 39c2acc81 Add a new interceptor for nl_langinfo(3) from NetBSD
  • 2eb9a4c53 Update GET_LINK_MAP_BY_DLOPEN_HANDLE() for NetBSD x86
  • e8dd644be Improve the regerror(3) interceptor
  • dd939986a Add interceptors for the sysctl(3) API family from NetBSD
  • 67639f9cc Add interceptors for the fts(3) API family from NetBSD
  • c8fae517a Add new interceptor for regex(3) in NetBSD

Part of the new code has been quickly ported from NetBSD to other Operating Systems, mostly FreeBSD, and when applicable to Darwin and Linux.

Changes in other LLVM projects

In order to eliminate local diffs in other LLVM projects, I've upstreamed two patches to LLVM and two to OpenMP. I've also helped other BSDs to get their support in OpenMP (DragonFlyBSD and OpenBSD).

LLVM changes:

  • 50df229c26a Add NetBSD support in needsRuntimeRegistrationOfSectionRange.
  • 267dfed3ade Register kASan shadow offset for NetBSD/amd64

OpenMP changes:

  • 67d037d Implement __kmp_is_address_mapped() for NetBSD
  • 9761977 Implement __kmp_gettid() for NetBSD
  • a72c79b Add OpenBSD support to OpenMP
  • b3d05ab Add DragonFlyBSD support to OpenMP

NetBSD changes

I've introduced 5 changes to the NetBSD source tree over the past month, not counting updates to TODO lists.

  • Raise the fill_vmentries() E2BIG limit from 1MB to 10MB
  • Correct libproc_p.a in distribution sets
  • compiler_rt: Update prepare-import.sh according to future updates
  • Correct handling of minval > maxval in strtonum(3)
  • Stop mangling __func__ for C++11 and newer

The first change is needed to handle large address space with sysctl(3) operation to retrieve the map. This feature is required in sanitizers and part of the tests were failing because within 1MB it wasn't possible to pass all the information about the process virtual map (mostly due to a large number of small allocations).

The second change was introduced to unbreak the MKPROFILE=no build; I needed this during my work of porting the modern LLVM profile feature.

The third change is a preparation for import of compiler-rt sanitizers into the NetBSD distribution.

The fourth change was a bug fix for strtonum(3) implementation in libc.

The fifth change was intended to reuse native compiler support for the __func__ compiler symbol.

Integration of LLVM sanitizers with the NetBSD basesystem

We are ready to push support for LLVM sanitizers into the NetBSD basesystem as all the needed patches have been merged. I've divided the remaining tests of integration of LLVM sanitizers into three milestones:

  1. Import compiler-rt sources into src/. A complete diff is pending for final acceptance in internal review.
  2. Integrate building of compiler-rt sanitizers under the MKLLVM=yes option. This has been made functional, but it needs polishing and submitting to internal review.
  3. Make MKSANITIZER available out of the box with the toolchain available in "./build.sh tools". This will be continuation of the previous point. All the MKSANITIZER patches independent from compiler type are already committed into the NetBSD distribution, however there will be likely some extra minor adaptation work here too.

Plan for the next milestone

Finish the integration of LLVM sanitizers with the NetBSD distribution.

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL, and chip in what you can:


Posted late Thursday afternoon, January 3rd, 2019 Tags: blog

Prepared by Michał Górny (mgorny AT gentoo.org).

I'm recently helping the NetBSD developers to improve the support for this operating system in various LLVM components. As you can read in my previous report, I've been focusing on fixing build and test failures for the purpose of improving the buildbot coverage.

Previously, I've resolved test failures in LLVM, Clang, LLD, libunwind, openmp and partially libc++. During the remainder of the month, I've been working on the remaining libc++ test failures, improving the NetBSD clang driver and helping Kamil Rytarowski with compiler-rt.

Locale issues in NetBSD / libc++

The remaining libc++ work focused on resolving locale issues. This consisted of two parts:

  1. Resolving incorrect assumptions in re.traits case-insensitivity translation handling: r349378 (D55746),

  2. Enabling locale support and disabling tests failing because of partial locale support in NetBSD: r349379 (D55767).

The first of the problems was related to testing the routine converting strings to common case for the purpose of case-insensitive regular expression matching. The test attempted to translate \xDA and \xFA characters (stored as char) in UTF-8 locale, and expected both of them to map to \xFA, i.e. according to Unicode Latin-1 Supplement. However, this behavior is only exhibited on OSX. Other systems, including NetBSD, FreeBSD and Linux return the character unmodified (i.e. \xDA and \xFA respectively).

I've come to the conclusion that the most likely cause of the incompatible behavior is that both \xDA and \xFA alone do not comprise valid UTF-8 sequences. However, since the translation function can take only a single char and is therefore incapable of processing multi-byte sequences, it is unclear how it should handle the range \x80..\xFF that's normally used for multi-byte sequences or not used at all. Apparently, the OSX implementers decided to map it into \u0080..\u00FF range, i.e. treat equivalently to wchar_t, while others decided to ignore it.

After some discussion, upstream agreed on removing the two tests for now, and treating the result of this translation as undefined implementation-specific behavior. At the same time, we've left similar cases for L'\xDA' and L'\xFA' which seem to work reliably on all implementations (assuming wchar_t is using UCS-4, UCS-2 or UTF-16 encoding).

The second issue was adding libc++ target info for NetBSD which has been missing so far. This I've based on the rules for FreeBSD, modified to use libc++abi instead of libcxxrt (the former being upstream default, and the latter being abandoned external project). This also included a list of supported locales which caused a large number of additional tests to be run (if I counted correctly, 43 passing and 30 failing).

Some of the tests started failing due to limited locale support on NetBSD. Apparently, the system supports locales only for the purpose of character encoding (i.e. LC_CTYPE category), while the remaining categories are implemented as stubs (see localeconv(3)). After some thinking, we've agreed to list all locales expected by libc++ as supported since NetBSD has support files for them, and mark relevant tests as expected failures. This has the advantage that when locale support becomes more complete, the expected failures will be reported as unexpected passes, and we will know to update the tests accordingly.

Clang driver updates

On specific request of Kamil, I have looked into the NetBSD Clang driver code afterwards. My driver work focused on three separate issues:

  1. Recently added address significance tables (-faddrsig) causing crashes of binutils (r349647; D55828),

  2. Passing -D_REENTRANT when building sanitized code (with prerequisite: r349649, D55832; r349650, D55654),

  3. Establishing proper support for finding and using locally installed LLVM components, i.e. libc++, compiler-rt, etc.

The first problem was mostly a side effect of Kamil testing my z3 bump. He noticed that binutils crash on executables produced by recent versions of clang (example backtrace <http://netbsd.org/~kamil/llvm/strip.txt>). Given my earlier experiences in Gentoo, I correctly guessed that this is caused by address significance table support that is enabled by default in LLVM 7. While technically they should be ignored by tooling not supporting them, it causes verbose warnings in some versions of binutils, and explicit crashes in the 2.27 version used by NetBSD at the moment.

In order to prevent users from experiencing this, we've agreed to disable address significance table by default on NetBSD (users can still explicitly enable them via -faddrsig). What's interesting, the block for disabling LLVM_ADDRSIG has grown since to include PS4 and Gentoo which indicates that we're not the only ones considering this default a bad idea.

The second problem was that Kamil indicated that we should only support sanitizing reentrant versions of the system API. Accordingly, he requested that the driver automatically includes -D_REENTRANT when compiling code with any of the sanitizers enabled. I've implemented a new internal clang API that checks whether any of the sanitizers were enabled, and used it to implicitly pass this option from driver to the actual compiler instance.

The third problem is much more complex. It boils down to the driver using hardcoded paths from a few years back assuming LLVM being installed to /usr as part of /usr/src integration. However, those paths do not really work when clang is run from the build directory or installed manually to /usr/local; which means e.g. clang install with libc++ does not work out of the box.

Sadly, we haven't been able to come up with a really good and reliable solution to this. Even if we could make clang find other components reasonably reliably using executable-relative paths, this would require building executables with RPATHs potentially pointing to (temporary) build directory of LLVM. Eventually, I've decided to abandon the problem and focus on passing appropriate options explicitly when using just-built clang to build and test other components. For the purpose of buildbot, this required using the following option:


compiler-rt work

As Kamil is finalizing his work on sanitizers, he left tasks in TODO lists and I picked them up to work on them.

Build fixes

My first two fixes to compiler-rt were plain build fixes, necessary to proceed further. Those were:

  1. Fixing use of variables (whose values could not be directly determined at compile time) for array length: r349645, D55811.

  2. Detecting missing libLLVMTestingSupport and skipping tests requiring it: r349899, D55891.

The first one was a trivial coding slip that was easily fixed by using a constant value for hash length. The second one was more problematic.

Two tests in XRay test suite required LLVMTestingSupport. However, this library is not installed by LLVM, and so is not present when building stand-alone. Normally, similar issues in LLVM were resolved by building the relevant library locally (that's e.g. what we do with gtest). In this case this wasn't feasible due to the library being more tightly coupled with LLVM itself, and adjusting the build system would be non-trivial. Therefore, since only two tests required this we've agreed on disabling this when the library isn't present.

XRay: alignment and MPROTECT problems

The next step was to research XRay test failures. Firstly, all the tests were failing due to PaX MPROTECT. Secondly, after disabling MPROTECT I've been getting the following test failures from check-xray:

XRay-x86_64-netbsd :: TestCases/Posix/fdr-reinit.cc
XRay-x86_64-netbsd :: TestCases/Posix/fdr-single-thread.cc

Both tests segfaulted on initializing thread-local data structure. I've come to the conclusion that somehow initializing aligned thread-local data is causing the issue, and built a simple test case for it:

struct FDRLogWriter {
  FDRLogWriter() {}
};

struct alignas(64) ThreadLocalData {
  FDRLogWriter Buffer{};
};

// Initializing the over-aligned thread-local object is the step that crashed.
void f() { thread_local ThreadLocalData foo{}; }

int main() {
  f();
  return 0;
}

The really curious part of this was that the code produced by g++ worked fine, while the one produced by clang++ segfaulted. At the same time, the LLVM bytecode generated by clang worked fine on both FreeBSD and Linux. Finally, through comparing and manipulating LLVM bytecode I've found the culprit: the alignment was not respected while allocating storage for thread-local data. However, clang assumed it will be and used MOVAPS instructions that caused a segfault on unaligned data.

I wrote about the problem to tech-toolchain and received a reply that TLS alignment is mostly ignored and there is no short term plan for fixing it. As an interim solution, I wrote a patch that disables alignment tips sufficiently to prevent clang from emitting code relying on it: r350029, D56000.

As suggested by Kamil, I discussed the PaX MPROTECT issues with upstream. The direct problem in solving it the usual way is that the code relies on remapping program memory mapped by ld.so, and altering the program code in place. I've been informed that this design was chosen to keep the mechanism simple, and explicitly declaring that XRay is incompatible with system security features such as PaX MPROTECT. As Kamil suggested, I've written an explicit check for MPROTECT being enabled, and made XRay fail with explanatory error message in that case: r350030, D56049.

Improving stdio.h / FILE* interceptor coverage

My final work has been on improving stdio.h coverage in interceptors. It started as a task on adding NetBSD FILE structure support for interceptors (D56109), and expanded into increasing test and interceptor coverage, both for POSIX and NetBSD-specific stdio.h functions.

I have implemented tests for the following functions:

  • clearerr, feof, ferror, fileno, fgetc, getc, ungetc: D56136,

  • fputc, putc, putchar, getc_unlocked, putc_unlocked, putchar_unlocked: D56152,

  • popen, pclose: D56153,

  • funopen, funopen2 (in multiple parameter variants): D56154.

Furthermore, I have implemented missing interceptors for the following functions:

  • popen, popenve, pclose: D56157,

  • funopen, funopen2: D56158.

Kamil also pointed out that devname_r interceptor has wrong return type on non-NetBSD systems, and I've fixed that for completeness: D56150.


At this point, NetBSD LLVM buildbot is building and testing the following projects with no expected test failures:

  • llvm: core LLVM libraries; also includes the test suite of LLVM's lit testing tool

  • clang: C/C++ compiler

  • clang-tools-extra: extra C/C++ code manipulation tools (tidy, rename...)

  • lld: link editor

  • polly: loop and data-locality optimizer for LLVM

  • openmp: OpenMP runtime library

  • libunwind: unwinder library

  • libcxxabi: low-level support library for libcxx

  • libcxx: implementation of C++ standard library

It also builds the lldb debugger but does not run its test suite.

The support for compiler-rt is progressing but it is not ready for buildbot inclusion yet.

Future plans

Next month, I'm planning to work on next items from the TODO. Most notably, this includes:

  • building compiler-rt on buildbot and running the tests,

  • porting LLD to actually produce working executables on NetBSD,

  • porting remaining compiler-rt components: DFSan, ESan, LSan, shadowcallstack.

Posted Sunday night, December 30th, 2018 Tags: blog

Prepared by Michał Górny (mgorny AT gentoo.org).

I'm recently helping the NetBSD developers to improve the support for this operating system in various LLVM components. My first task in this endeavor was to fix build and test issues in as many LLVM projects as possible in a timely manner, and get them all covered by the NetBSD LLVM buildbot.

Including more projects in the continuous integration builds is important as it provides the means to timely catch regressions and new issues in NetBSD support. It is not only beneficial because it lets us find offending commits easily but also because it makes other LLVM developers aware of NetBSD porting issues, and increases the chances that the patch authors will fix their mistakes themselves.

Initial buildbot setup and issues

The buildbot setup used by NetBSD is largely based on the LLDB setup used originally by Android, published in the lldb-utils repository. For the purpose of necessary changes, I have forked it as netbsd-llvm-build and Kamil Rytarowski has updated the buildbot configuration to use our setup.

Initially, the very high memory use in GNU ld combined with high job count caused our builds to swap significantly. As a result, the builds were very slow and frequently were terminated due to no output as buildbot presumed them to hang. The fix for this problem consisted of two changes.

Firstly, I have extended the building script to periodically report that it is still active. This ensured that even during prolonged linking buildbot would receive some output and would not terminate the build prematurely.

Secondly, I have split the build task into two parts. The first part uses full ninja job count to build all static libraries. The second part runs with reduced job count to build everything else. Since most of LLVM source files are part of static libraries, this solution makes it possible to build as much as possible with full job count, while reducing it necessarily for GNU ld invocations later.

While working on this setup, we have been informed that the buildbot setup based on external scripts is a legacy design, and that it would be preferable to update it to define buildbot rules directly. However, we have agreed to defer that until our builds mature, as external scripts are more flexible and can be updated without having to request a restart of the LLVM buildbot.

The NetBSD buildbot is part of LLVM buildbot setup, and can be accessed via http://lab.llvm.org:8011/builders/lldb-amd64-ninja-netbsd8. The same machine is also used to run GDB and binutils build tests.

RPATH setup for LLVM builds

Another problem that needed solving was to fix RPATH in built executables to include /usr/pkg/lib, as necessary to find dependencies installed via pkgsrc. Normally, the LLVM build system sets RPATH itself, using a path based on $ORIGIN. However, I have been informed that NetBSD discourages the use of $ORIGIN, and appropriately I have been looking for a better solution.

Eventually, after some experimentation I have come up with the following CMake parameters:


This explicitly disables the standard logic used by LLVM. Build-time RPATH includes the build directory explicitly as to ensure that freshly built shared libraries will be preferred at build time (e.g. when running tests) over previous pkgsrc install; this directory is afterwards removed from rpath when installing.

Building and testing more LLVM sub-projects

The effort so far was to include the following projects in LLVM buildbot runs:

  • llvm: core LLVM libraries; also includes the test suite of LLVM's lit testing tool

  • clang: C/C++ compiler

  • clang-tools-extra: extra C/C++ code manipulation tools (tidy, rename...)

  • lld: link editor

  • polly: loop and data-locality optimizer for LLVM

  • openmp: OpenMP runtime library

  • libunwind: unwinder library

  • libcxxabi: low-level support library for libcxx

  • libcxx: implementation of C++ standard library

  • lldb: debugger; built without test suite at the moment

Additionally, the following project was considered but it was ultimately skipped as it was not ready for wider testing yet:

  • llgo: Go compiler

My project fixes

During my work, I have been trying to upstream all the necessary changes ASAP, so as to avoid creating an additional local patch maintenance burden. This section provides a short list of all patches that have either been merged upstream or are waiting for review.


Waiting for upstream review:


  • z3 version bump (submitted to maintainer, waiting for reply)

NetBSD portability

During my work, I have met with a few interesting divergences between the assumptions made by LLVM developers and the actual behavior of NetBSD. While some of them might be considered bugs, we determined it was preferable to support the current behavior in LLVM. In this section I briefly describe each of them, and indicate the path I took in making LLVM work.


The problem with unwind.h header is a part of bigger issue — while the unwinder API is somewhat defined as part of system ABI, there is no well-defined single implementation on most of the systems. In practice, there are multiple implementations both of the unwinding library and of its headers:

  • gcc: it implements unwinder library in libgcc; also, has its own unwind.h on Linux (but not on NetBSD)

  • clang: it has its own unwind.h (but no library)

  • 'non-GNU' libunwind: stand-alone implementation of library and headers

  • llvm-libunwind: stand-alone implementation of library and headers

  • libexecinfo: provides unwinder library and unwind.h on NetBSD

Since gcc does not provide unwind.h on NetBSD, using it to build LLVM normally results in the built-in unwinder library from GCC being combined with unwind.h installed as part of libexecinfo. However, the API defined by the latter header is type-incompatible with most of the other implementations, and caused libc++abi build to fail.

In order to resolve the build issue, we agreed to use LLVM's own unwinder implementation (llvm-libunwind) which we were building anyway, via the following CMake option:


I have started a thread about fixing unwind.h to be more compatible.

noatime behavior

noatime is a filesystem mount option that is meant to inhibit atime updates on file accesses. This is usually done in order to avoid spurious inode writes when performing read-level operations. However, unlike the other implementations, NetBSD not only disables automatic atime updates but also explicitly blocks explicit updates via the utime() family of functions.

Technically, this behavior is permitted by POSIX as it permits implementation-defined behavior on updating atimes. However, a small number of LLVM tests explicitly rely on being able to set atime on a test file, and the NetBSD behavior causes them to fail. Without a way to set atime, we had to mark those tests unsupported.

I have started a thread about noatime behavior on tech-kern.

__func__ value

__func__ is defined by the standard to be an arbitrary form of function identifier. On most of the other systems, it is equal to the value of __FUNCTION__ defined by gcc, that is the undecorated function name. However, NetBSD system headers conditionally override this to __PRETTY_FUNCTION__, that is a full function prototype.

This has caused one of the LLVM tests to fail due to matching debug output. Admittedly, this was definitely a problem with the test (since __func__ can have an arbitrary value) and I have fixed it to permit the pretty function form.

Kamil Rytarowski has noted that the override is probably more accidental than expected since the header was not updated for C++11 compilers providing __func__, and started a thread about disabling it.

tar -t output

Another difference I have noted while investigating test failures was in the output of tar -t (listing files inside a tarball). Curiously enough, both GNU tar and libarchive use C-style escapes in the file list output. NetBSD pax/tar outputs the filenames raw.

The test was meant to verify whether a backslash in filenames is archived properly (i.e. not treated as equivalent to a forward slash). It failed because it expected the backslash to be escaped. I was able to fix it by permitting both forms, as the exact treatment of backslash was not relevant to the test case at hand.

I have compared different tar implementations including NetBSD pax in the article portability of tar features.

(time_t)-1 meaning

One of the libc++ test cases was verifying the handling of negative timestamps using a value of -1 (i.e. one second before the epoch). However, this value seems to be mishandled in some of the BSD implementations, FreeBSD and NetBSD in particular. Curiously enough, other negative values work fine.

The easier side of the issue is that some functions (e.g. mktime()) use -1 as an error value. However, this can be easily fixed by inspecting errno for actual errors.

The harder side is that the kernel uses a value of -1 (called ENOVAL) to internally indicate that the timestamp is not to be changed. As a result, an attempt to update the file timestamp to one second before the epoch is going to be silently ignored.

I have fixed the test by extending the FreeBSD workaround to NetBSD, and using a different timestamp. I have also started a thread about (time_t)-1 handling on tech-kern.

Future plans

The plans for the remainder of December include, as time permits:

  • finishing upstreaming of the aforementioned patches

  • fixing flaky tests on NetBSD buildbot

  • upstreaming (and fixing if necessary) the remaining pkgsrc patches

  • improving NetBSD support in profiling and xray (of compiler-rt)

  • porting ESan/DFSan

The long-term goals include:

  • improving support for __float128

  • porting LLD to NetBSD (currently it passes all tests but does not produce working executables)

  • finishing LLDB port to NetBSD

  • porting remaining sanitizers to NetBSD

Posted mid-morning Sunday, December 16th, 2018 Tags: blog
I've been actively working on reducing the delta with the local copy of sanitizers with upstream LLVM sources. Their diff has been reduced to less than 2000 Lines Of Code. I've pushed to review almost all of the local code and I'm working on addressing comments from upstream developers.

LLVM changes

The majority of work was related to interceptors. There was a need to clean up the local code and develop dedicated tests for new interceptors whenever applicable (i.e. always unless this is a syscall modifying the kernel state such as inserting kernel modules).

Detailed list of commits merged with the upstream LLVM compiler-rt repository:

  • Split getpwent and fgetgrent functions in interceptors
  • Try to unbreak the build of sanitizers on !NetBSD
  • Disable recursive interception for tzset in MSan
  • Follow Windows' approach for NetBSD in AlarmCallback()
  • Disable XRay test fork_basic_logging for NetBSD
  • Prioritize the constructor call of __local_xray_dyninit() (investigated with help of Michal Gorny)
  • Adapt UBSan integer truncation tests to NetBSD
  • Split remquol() from INIT_REMQUO
  • Split lgammal() from INIT_LGAMMAL
  • Correct atexit(3) support in MSan/NetBSD (with help of Michal Gorny investigating the failure on Linux)
  • Add new interceptor for getmntinfo(3) from NetBSD
  • Add new interceptor for mi_vector_hash(3)
  • Cast _Unwind_GetIP() and _Unwind_GetRegionStart() to uintptr_t
  • Cast the 2nd argument of _Unwind_SetIP() to _Unwind_Ptr (reverted as it broke "MacPro Late 2013")
  • Add interceptor for the setvbuf(3) from NetBSD
  • Add a new interceptor for getvfsstat(2) from NetBSD

A single patch landed in the LLVM source tree:

  • Swap order of discovering of -ltinfo and -lterminfo (originated by Ryo Onodera in pkgsrc)

Patches submitted upstream and still in review:

  • Add interceptors for the sha1(3) from NetBSD
  • Add interceptors for the md4(3) from NetBSD
  • Add interceptors for the rmd160(3) from NetBSD
  • Add interceptors for md5(3) from NetBSD
  • Add a new interceptor for nl_langinfo(3) from NetBSD
  • Add a new interceptor for fparseln(3) from NetBSD
  • Add a new interceptor for modctl(2) from NetBSD
  • Add a new interceptors for statvfs1(2) and fstatvfs1(2) from NetBSD
  • Add a new interceptors for cdbr(3) and cdbw(3) API from NetBSD
  • Add interceptors for the sysctl(3) API family from NetBSD
  • Add interceptors for the fts(3) API family from NetBSD
  • Implement getpeername(2) interceptor
  • Add new interceptors for vis(3) API in NetBSD
  • Add new interceptor for regex(3) in NetBSD
  • Add new interceptor for strtonum(3)
  • Add interceptors for the strtoi(3)/strtou(3) from NetBSD
  • Add interceptors for the sha2(3) from NetBSD

Patches still kept locally:

  • ASan thread's termination destructor
  • MSan thread's termination destructor
  • Interceptors for getchar(3) API (might be abandoned as FILE/DIR sanitization isn't done)
  • Incomplete interceptor for mount(2) (might be abandoned as unfinished)

This month I've received also a piece of help from Michal Gorny who improved the NetBSD support in LLVM projects with the following changes:

  • [unittest] Skip W+X MappedMemoryTests when MPROTECT is enabled
  • [cmake] Fix detecting terminfo library

Changes to the NetBSD distribution

I've reduced the number of changes to the src/ distribution to corrections related to interceptors.

  • Document SHA1FileChunk(3) in sha1(3)
  • Fix link sha1.3 <- SHA1File.3
  • Define MD4_DIGEST_STRING_LENGTH in <md4.h>
  • Correct the documentation of cdbr_open_mem(3)

Plan for the next milestone

I will keep upstreaming local LLVM patches (less than 2000LOC to go!).

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL, and chip in what you can:


Posted late Sunday afternoon, December 2nd, 2018 Tags: blog
I have presented the state of NetBSD sanitizers during two conferences in the San Francisco Bay Area: Google Summer of Code Mentor Summit (Mountain View) and MeetBSDCa (Santa Clara, Intel Campus SC12). I've also made progress in upstreaming of our local patches to LLVM sanitizers and introducing generic NetBSD enhancements there.

The Bay Area

I took part (together with William Coldwell - cryo@) in the GSOC Mentor Summit as a NetBSD delegate. I've presented during the event a presentation with a quick introduction to NetBSD, track history of GSoC involvement and the LLVM Sanitizers work with a stress of sanitizers.

The MeetBSDCa conference is a continuation of the MeetBSD conferences from Poland. I took part there as a speaker talking about Userland Sanitizers in NetBSD. I've also presented the state of virtualization in NetBSD during a discussion panel. Additionally I've prepared a lightning talk about NetBSD Kernel sanitizers and a quick status update from The NetBSD Foundation. Unfortunately the schedule was changed at the last minute (a BSD history talk was introduced in the slot of the lightning presentations) and the closing ceremony proceeded differently. Nonetheless, I'm sharing these additional quick presentations.

During the former conference it was a great opportunity to meet people from other Open Source projects, a lot of them are in interaction with NetBSD developers during the process of upstreaming local support patches. During the latter conference it was an opportunity to meet BSD people and people closer to hardware companies.

Upstreaming process of LLVM Sanitizers

I've upstreamed a number of patches to the LLVM source tree. The changes can be summarized as:

  • Further reworking the code and approaching the state of installing of sysctl*() interceptors.
  • Fixing or marking failing or hanging tests in the sanitizer test-suites.
  • Adapting definitions of syscalls and ioctl(2) operations for NetBSD 8.99.25.

Detailed list of commits merged with the upstream LLVM compiler-rt repository:

  • Update ioctl(2) operations for NetBSD 8.99.25
  • Update generate_netbsd_ioctls.awk for NetBSD 8.99.25
  • Diable test suppressions-library for NetBSD/i386
  • Disable BufferOverflowAfterManyFrees for NetBSD
  • Mark breaking asan tests on NetBSD
  • Switch getline_nohang from XFAIL to UNSUPPORTED for NetBSD
  • Mark vptr-non-unique-typeinfo as a broken test for NetBSD/i386
  • Mark breaking sanitizer_common tests on NetBSD
  • Handle NetBSD alias for pthread_sigmask
  • Cast the return value of _Unwind_GetIP() to uptr
  • Mark interception_failure_test with XFAIL for NetBSD
  • Disable ASan test asan_and_llvm_coverage_test for NetBSD
  • Adapt ASan test heavy_uar_test for NetBSD
  • Mark breaking TSan tests on NetBSD with XFAIL
  • Cleanup includes in sanitizer_platform_limits_netbsd.cc
  • Regenerate syscall hooks for NetBSD 8.99.25
  • Update generate_netbsd_syscalls.awk for NetBSD 8.99.25
  • Handle pthread_sigmask in DemangleFunctionName()
  • Drop now hidden ioctl(2) operations for NetBSD
  • Handle NetBSD symbol mangling for tzset
  • Handle NetBSD symbol mangling for nanosleep and vfork
  • Mark test/tsan/getline_nohang as XFAIL for NetBSD
  • Disable the GNU strerror_r TSan test for NetBSD
  • Mark test/tsan/ignore_lib5 as unsupported for NetBSD
  • Mark intercept-rethrow-exception.cc as XFAIL on NetBSD
  • Disable failing tests lib/asan/tests on NetBSD
  • Skip unsupported MSan tests on NetBSD
  • Mark 4 MSan tests as XFAIL for NetBSD
  • Mark MSan fork test as UNSUPPORTED on NetBSD
  • Reflect the current reality and disable lsan tests on NetBSD
  • Use PTHREAD_STACK_MIN conditionally in a test
  • Remove remnant code of using indirect syscall on NetBSD
  • Don't harcode -ldl test/sanitizer_common/TestCases
  • Disable TestCases/pthread_mutexattr_get on NetBSD
  • Fix Posix/devname_r for NetBSD
  • Unwind local macro DEFINE_INTERNAL()
  • Introduce internal_sysctlbyname in place of sysctlbyname

Frequently asked question

People keep asking me about rationale of some design decisions in sanitizers and whether something could be done better. One of such places is to reuse more of libc internals and to not keep bypassing it whenever possible. The motivation to keep redoing the same work for NetBSD is to keep close to the upstream (mostly Linux & Android) source code with a minimal delta between NetBSD vs others support. Doing some operations in a more convenient way is tempting, but it's a danger that someone will need to keep maintaining a larger diff, especially since upstream developers will focus on their own OSes rather than trying to adapt their patches for potentially alternative approaches.

Plan for the next milestone

I will keep upstreaming local LLVM patches (almost 2500LOC to go!).

This work was sponsored by The NetBSD Foundation.


Posted in the wee hours of Wednesday night, November 1st, 2018 Tags: blog
I presented the state of NetBSD sanitizers during EuroBSDCon 2018 held in Bucharest, Romania.

I gave two talks, one covered userland sanitizers and the other one kernel sanitizers. Unfortunately video recordings from the conference are not available, but I've uploaded my slides online:

Besides participating in the conference and preparing for the travel and talks I've been researching the libunwind port to NetBSD and further integration of Lua. The libunwind port from the nongnu project has been approached to passing 22 out of 33 tests and the current blocker is the lack of signal trampoline handling or annotation. A signal trampoline is a special libc function, registered into the kernel, that is used as a helper routine to install and use signal handlers. Backtracing the function call stack is not trivial. We need to either annotate the assembly code in the trampoline with DWARF notes or handle it differently inside an unwinder.

I wrote a toy application using the newly created Lua binding for the curses(3) library. The process of writing the Lua bindings resulted in detecting various bugs in the native curses library. A majority of these bugs have already been fixed with the aid of Roy Marples and Rin Okuyama, though they are still waiting for merge. I intend to keep working on the bindings in my spare time, but a shortcoming is that there are a lot of API functions (over 300!), and covering them all is a time-consuming process.

Meanwhile, I've made progress in the upstreaming of local LLVM patches. I've finally upstreamed the switch of indirect syscall (syscall(2)/__syscall(2)) to direct libc calls.

Plan for the next milestone

I will visit the GSoC Mentor Summit & MeetBSDCa in October (California, the U.S.). In the time besides the conference I will keep upstreaming local LLVM patches (almost 3000LOC to go!).

This work was sponsored by The NetBSD Foundation.


Posted Monday night, October 1st, 2018 Tags: blog

This was my first big BSD conference. We also planned - planned might be a big word - thought about doing a devsummit on Friday. Since the people who were in charge of that had a change of plans, I was sure it'd go horribly wrong.

The day before the devsummit and still in the wrong country, I mentioned the hours and venue on the wiki, and booked a reservation for a restaurant.

It turns out that everything was totally fine, and since the devsummit was at the conference venue (that was having tutorials that day), they even had signs pointing at the room we were given. Thanks EuroBSDCon conference organizers!

At the devsummit, we spent some time hacking. A few people came with "travel laptops" without access to anything, like Riastradh, so I gave him access to my own laptop. This didn't hold very long and I kinda forgot about it, but for a few moments he had access to a NetBSD source tree and an 8 thread, 16GB RAM machine with which to build things.

We had a short introduction and I suggested we take some pictures, so here's the ones we got. A few people were concerned about privacy, so they're not pictured. We had a small team to hold the camera :-)

At the actual conference days, I stayed at the speaker hotel with the other speakers. I've attempted to make conversation with some visibly FreeBSD/OpenBSD people, but didn't have plans to talk about anything, so there was a lot of just following people silently.
Perhaps for the next conference I'll prepare a list of questions to random BSD people and then very obviously grab a piece of paper and ask, "what was...", read a bit from it, and say, "your latest kernel panic?", I'm sure it'll be a great conversation starter.

At the conference itself, it was pretty cool to have folks like Kirk McKusick give first person accounts of some past events (Kirk gave a talk about governance at FreeBSD), or the second keynote by Ron Broersma.

My own talk was hastily prepared, it was difficult to bring the topic together into a coherent talk. Nevertheless, I managed to talk about stuff for a whole 40 minutes, though usually I skip over so many details that I have trouble putting together a sufficiently long talk.

I mentioned some of my coolest bugs to solve (I should probably make a separate article about some!). A few people asked for the slides after the talk, so I guess it wasn't totally incoherent.

It was really fun to meet some of my favourite NetBSD people. I got to show off my now fairly well working laptop (it took a lot of work by all of us!).

After the conference I came back with a conference cold, and it took a few days to recover from it. Hopefully I didn't infect too many people on the way back.

Posted late Monday evening, October 1st, 2018 Tags: blog
Peter Wemm's writeup about using acme.sh for FreeBSD.org served as inspiration, but I chose to do a few things differently:
  • using DNS alias mode with sub-domains dedicated to ACME verification
  • delegating the sub-domains to the servers where the certificate will be needed
  • using bind on the servers where the certificate will be needed (where it was running as resolver already anyway)
  • using dns_nsupdate (i.e. dynamic DNS) to add the challenge to the ACME subzone.
Appropriately restricted, that gives the following addition to named.conf on the target server (with an update key named acme-ddns):
options {
        allow-update { localhost; };
};

zone "acme-www.pkgsrc.org" {
        type master;
        file "acme/acme-www.pkgsrc.org";
        update-policy {
                grant acme-ddns name _acme-challenge.acme-www.pkgsrc.org. TXT;
        };
};

And last but not least, deployment of certificates via make, i.e. completely independent of acme.sh.

Due to all of the above, acme.sh does not need to tentacle about in the filesystem and can run as a plain user in a chroot. It's not a tiny chroot, though (20M), since acme.sh needs a bunch of common shell tools:

  • awk basename cat chmod cp curl cut date egrep/grep head mkdir mktemp mv nsupdate od openssl printf readlink rm sed sh sleep stat tail touch tr uname, and their shared libs, /libexec/ld.elf_so and /usr/libexec/ld.elf_so;
  • under the chroot /etc a resolv.conf, the CA cert for Let's Encrypt (mozilla-rootcert-60.pem) and to make openssl complain less an empty openssl.cnf
  • and in the chroot /dev: null, random and urandom.

I call both the acme.sh --cron job and the certificate deployment make from daily.local, which adds the output to the daily mail and makes it easy to keep an eye on things.
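
The setup above confines the update key to a single TXT name in the delegated subzone. The following sketch is an added example, not part of the original post and not acme.sh code: it builds the nsupdate batch for a DNS-01 challenge and refuses anything the grant would not permit or that could inject extra commands into the batch. The host name www.pkgsrc.org, the server address 127.0.0.1, the TTL of 60 and the sample digest are assumptions.

```sh
#!/bin/sh
# Added example: nsupdate batch builder for ACME DNS alias mode, checked
# against the update-policy grant shown in the named.conf fragment.

ALIAS_ZONE="acme-www.pkgsrc.org"                   # zone name from the post
GRANT_NAME="_acme-challenge.acme-www.pkgsrc.org."  # name from the grant line
GRANT_TYPE="TXT"
# Constructed sample value (43 base64url characters), not a real challenge.
SAMPLE_DIGEST="abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQ"

# Lower-case a DNS name and make it end in exactly one dot.
fqdn() {
    printf '%s\n' "$1" | tr 'A-Z' 'a-z' | sed -e 's/\.*$//' -e 's/$/./'
}

# CNAME needed once in the primary zone so that the CA follows the challenge
# name into the delegated alias zone. $1 = certificate host name (assumed).
alias_cname() {
    printf '%s CNAME %s\n' \
        "$(fqdn "_acme-challenge.$1")" "$(fqdn "_acme-challenge.$2")"
}

# Mirror of "grant acme-ddns name <GRANT_NAME> TXT": exact name, TXT only.
policy_allows() {   # $1 = record name, $2 = record type
    [ "$(fqdn "$1")" = "$(fqdn "$GRANT_NAME")" ] && [ "$2" = "$GRANT_TYPE" ]
}

# A DNS-01 TXT value is the base64url-encoded SHA-256 digest of the key
# authorization: 43 characters from [A-Za-z0-9_-]. Rejecting everything else
# keeps quotes and newlines out of the batch.
valid_digest() {
    case "$1" in
        *[!A-Za-z0-9_-]*) return 1 ;;
    esac
    [ "${#1}" -eq 43 ]
}

# Emit the batch that would be piped to "nsupdate -k <keyfile>".
challenge_batch() { # $1 = add|delete, $2 = record name, $3 = digest
    policy_allows "$2" TXT || { echo "refused: $2 is outside the grant" >&2; return 1; }
    valid_digest "$3"      || { echo "refused: malformed TXT value" >&2; return 1; }
    case "$1" in
        add|delete) ;;
        *) echo "refused: unknown action $1" >&2; return 1 ;;
    esac
    printf 'server 127.0.0.1\n'                     # assumed: named on localhost
    printf 'zone %s\n' "$(fqdn "$ALIAS_ZONE")"
    if [ "$1" = add ]; then
        printf 'update add %s 60 IN TXT "%s"\n' "$(fqdn "$2")" "$3"
    else
        printf 'update delete %s TXT "%s"\n' "$(fqdn "$2")" "$3"
    fi
    printf 'send\n'
}

# Demonstration with the constructed values above.
alias_cname www.pkgsrc.org "$ALIAS_ZONE"
challenge_batch add "$GRANT_NAME" "$SAMPLE_DIGEST"
```
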

Posted Monday night, September 17th, 2018 Tags: blog
Over the past month, I was coordinating and coding the remaining post-GSoC tasks. This mostly covers work around honggfuzz and sanitizers.

honggfuzz ptrace(2) features

I've introduced new ptrace(2) tests verifying attaching to a stopped process. This is an important scenario in debuggers: the ability to call a ptrace(2) operation with the PT_ATTACH argument with a process id (PID) of a process entity that is stopped. In typical circumstances PT_ATTACH causes an executing process to stop and emit SIGSTOP for the tracer. An already stopped process is a special case as we cannot stop it again. Not every UNIX-like kernel can handle this scenario in a sensible way, and the modern solution is to keep the process stopped (rather than e.g. resumed) and emit a new signal SIGSTOP to the debugger (rather than e.g. not emitting anything). There used to be complex workarounds for mainstream kernels in debuggers such as GDB to work around kernel bugs of attaching to a stopped process.

honggfuzz is a security oriented, feedback-driven, evolutionary, easy-to-use fuzzer with interesting analysis options. This piece of software is developed by a Google employee, however the product is not official Google software. On featured platforms, honggfuzz uses ptrace(2) to monitor crash signals in traced processes. I've implemented a new backend in the fuzzer for NetBSD using its ptrace(2) API. The backend is designed to follow the existing scenarios and features in Linux & Android:

  • Attaching to a manually stopped process, optionally spawned by the fuzzer.
  • Option to attach to a selected process over its PID.
  • Crash instruction decoding with aid of a disassembler (capstone for NetBSD).
  • Ability to monitor multiple processes with an arbitrary number of threads.
  • Monitor fork(2) and vfork(2) events, however unused in the current fuzzing model.
  • Concurrent execution of multiple processes.
  • Timeout of long-lasting (hanging?) processes in persistent mode (limit: 0.25[sec]).

There are few missing features:

  • Intel BTS - hardware assisted tracing
  • Intel PT - hardware assisted tracing
  • hp libunwind - unwinding stack of a traced process for better detection of unique crashes


I've started researching Kernel Address Sanitizer, checking the runtime internals and differences between its version ABIs. My intention was to join efforts with Siddharth (GSoC student) and head with a sanitizer for EuroBSDCon 2018 in Romania. However, Maxime Villard decided to join the efforts a little bit earlier and he managed to quickly get a functional bare version for NetBSD/amd64. In the end we have decided to leave the kASan work to Maxime for now and let Siddharth work on a kCov (SanitizerCoverage) device. SanCov is a feature of compilers, designed as an aid for fuzzers to ship interesting information from a fuzzing point of view of a number of function calls, comparisons, divisions etc. Successful userland fuzzers (such as AFL, honggfuzz) use this feature as an aid in determining new code paths. It's the same with a renowned kernel fuzzer - syzkaller.

While I'm helping Siddharth to port a kcov(4) device to NetBSD, I've switched to the remaining pending tasks in userland sanitizers. I've managed to switch the sanitizers from syscall(2) and __syscall(2) - indirect system call API - calls to libc routines. The approach of using an indirect generic interface didn't work well in the NetBSD case, as there is the need to handle multiple ABIs, Endians, CPU architectures, and the C language ABI is not a good choice to serialize and deserialize arbitrary function arguments with various types through a generic interface. The discussion on the rationale is perhaps not the proper place, and every low-level C developer is well aware of the problems. It's better to restrict the discussion to the statement that it's not possible (not trivial) to call in a portable way all the needed syscalls, without the aid of per-case auxiliary switches and macros. There are also some cases (such as pipe(2)) when it is not possible to express the system call semantics with syscall(2)/__syscall(2).

I've switched these routines to use internal libc symbols when possible. In the remaining cases I've used a fallback to libc's versions of the routines, with aid of indirect function pointers. I'm trying to detect the addresses of real functions with dlsym(3) calls. In the result, I've switched all the uses of syscall(2) and __syscall(2) and observed no regressions in tests.

I'm also in the process of deduplication of local patches to sanitizers. My current main focus is to finish switching syscall(2) and __syscall(2) to libc routines (patch pending upstream), introduce a new internal version of sysctl(3) that bypasses interceptors (partially merged upstream) and introduces new interceptors for sysctl(3) calls. This is a convoluted process in the internals with the goal to make the sanitizers more reliable across NetBSD targets and manage to sanitize less trivial examples such as rumpkernels. The RUMP code uses internally a modified and private versions of sysctl*() operations and we still must keep the internals in order and properly handle the RUMP code.

Merged commits

The NetBSD sources:

  • Merge FreeBSD improvements to the man-page of timespec_get(3)
  • Remove unused symbols from sys/sysctl.h
  • Add a new ATF ptrace(2) test: child_attach_to_its_stopped_parent
  • Add await_stopped() in t_ptrace_wait.h
  • Add a new ATF test parent_attach_to_its_stopped_child
  • Add a new ATF ptrace(2) test: tracer_attach_to_unrelated_stopped_process
  • Drop a duplicate instruction line [libpthread]
  • Mark kernel-asan as done (by maxv)
  • TODO.sanitizers: Mark switch of syscall(2)/__syscall(2) to libc done

The LLVM sources:

  • Introduce new type for inteceptors UINTMAX_T
  • Add internal_sysctl() used by FreeBSD, NetBSD, OpenBSD and MacOSX
  • Improve portability of internal_sysctl()
  • Try to fix internal_sysctl() for MacOSX
  • Try to unbreak internal_sysctl() for MacOSX


I'm personally proud of the success of the reliability of the ptrace(2) backend in the renowned honggfuzz fuzzer. NetBSD was capable of handling all the needed features and supporting all of them in an issue-free manner. Once I address the remaining ptrace(2) issues on my TODO list, the NetBSD kernel will be capable of hosting more software in a similar fashion, and most importantly a fully featured debugger such as GDB and LLDB, however without the remaining hiccups.

We are also approaching another milestone with the sanitizers' runtime available in the compiler toolchain: sanitizing rumpkernels. It is already possible to execute the rump code against a homegrown uUBSan runtime, but we are heading now to execute the code under the default runtime for the remaining sanitizers (ASan, MSan, TSan).

For the record, it has been reported that kUBSan has been ported from NetBSD to at least two kernels: FreeBSD and XNU.

Plan for the next milestone

I'm in preparation for my visit to EuroBSDCon (Bucharest, Romania) in September and GSoC Mentor Summit & MeetBSDCa in October (California, the U.S.). I intend to rest during this month and still provide added value to the project, porting and researching missing software dedicated for developers. Among others, I'm planning to research the HP libunwind library and if possible, port it to NetBSD.

This work was sponsored by The NetBSD Foundation.


Posted Monday night, September 3rd, 2018 Tags: blog