This page is a blog mirror of sorts. It pulls in articles from the blog's feed and republishes them here (with a feed of its own, too).

Nicolas Joly passed away on 2017-06-07.

Born in 1969, he developed a passion for computer science in his youth but chose to study biology. He then attended a bioinformatics training course at the Pasteur Institute in 1997, at the end of which he was hired by the scientific IT department. He worked there continuously from then on. He was skilled not only in biology but also in system administration, databases, and programming.

He was introduced to NetBSD in 1999 and quickly became a dedicated user and hacker. He became a NetBSD developer in 2007 to work on Linux/i386 (32-bit) emulation on NetBSD/amd64 (64-bit).

Nicolas leaves a wife and three children.

Memorial note submitted by Marc Baudoin.

Posted at teatime on Tuesday, June 20th, 2017 Tags: blog

Hello all,

The repository conversion setup for NetBSD CVS -> Fossil -> Git has found a new home, ironically on former cvs.NetBSD.org hardware. This provides a somewhat faster conversion cycle and removes anoncvs.NetBSD.org from the process, which should avoid occasional problems with incomplete syncs. Two other changes have been applied at the same time:

  • The Fossil repositories have moved to using the SHA512 checksums internally. To avoid accidents, a new project code is used. This requires Fossil 2.x.
  • The Git repositories map user names to @NetBSD.org addresses (src, xsrc) or @pkgsrc.org addresses (pkgsrc). This allows consolidation with user accounts on GitHub, assuming you have the corresponding address registered as a primary or secondary address.

The new locations for the repositories are:

CVS      Fossil                             Git
src      https://src.fossil.NetBSD.org      https://github.com/NetBSD/src
pkgsrc   https://pkgsrc.fossil.NetBSD.org   https://github.com/NetBSD/pkgsrc
xsrc     https://xsrc.fossil.NetBSD.org     https://github.com/NetBSD/xsrc

The old conversions will remain available for the near future, but will likely stop toward the end of the month.

A special thanks goes to Petra and Christos from the admin team for the assistance with the machine setup and low-level CVS clean-ups.

Joerg

-- source https://mail-index.netbsd.org/tech-repository/2017/06/10/msg000637.html

Posted late Saturday evening, June 10th, 2017 Tags: blog

This month I started working on correcting the ptrace(2) layer, as the test suites used to trigger failures on the kernel side. This ended up with sanitizing the LLDB runtime as well, addressing bugs in both LLDB and the NetBSD userland.

More bugs were unveiled along the way, so this is not the final report on LLDB.

The good

Besides the larger enhancements, this month I again cleaned up the ATF ptrace(2) tests. Additionally, I managed to unbreak the LLDB Debug build and to eliminate compiler warnings in the NetBSD Native Process Plugin.

It is worth noting that LLVM can run tests on NetBSD again: the patch in gtest/LLVM has been installed by Joerg Sonnenberger and a more generic one has been submitted to the upstream googletest repository. There was also an improvement in ftruncate(2) on the LLVM side (authored by Joerg).

Since LLD (the LLVM linker) is advancing rapidly, its support for NetBSD has improved and it can now link a functional executable on NetBSD. I submitted a patch that stops it from crashing on startup. It came close to linking LLDB/NetBSD and spotted a real linking error in the process... however, there are further issues that need to be addressed in the future. Currently LLD is not part of the mainline LLDB tasks; it is part of improving the work environment. Compared to the GNU linkers, this linker should reduce the LLDB linking time by a factor of 3 to 10 and save precious developer time. As of now, linking LLDB can take minutes on a modern amd64 machine designed for performance.

Kernel correctness

I have researched (in pkgsrc-wip) initial support for multiple threads in the NetBSD Native Process Plugin. When running the LLDB regression test suite, this code revealed new kernel bugs. These unfortunately affect the usability of a debugger in a multithreaded environment in general and explain why GDB never did its job properly in such circumstances.

One of the first errors was a kernel assertion panic with PT_*STEP when a debuggee has more than a single thread. I narrowed it down to misuse of lock primitives in the do_ptrace() kernel code. The fix has been committed.

LLDB and userland correctness

LLDB introduced support for kevent(2) and it contains the following function:

Status MainLoop::RunImpl::Poll() {
  in_events.resize(loop.m_read_fds.size());
  unsigned i = 0;
  for (auto &fd : loop.m_read_fds)
    EV_SET(&in_events[i++], fd.first, EVFILT_READ, EV_ADD, 0, 0, 0);
  num_events = kevent(loop.m_kqueue, in_events.data(), in_events.size(),
                      out_events, llvm::array_lengthof(out_events), nullptr);
  if (num_events < 0)
    return Status("kevent() failed with error %d\n", num_events);
  return Status();
}

It works on FreeBSD and Mac OS X; however, it broke on NetBSD.

Culprit line:

   EV_SET(&in_events[i++], fd.first, EVFILT_READ, EV_ADD, 0, 0, 0);

FreeBSD defined EV_SET() as a macro this way:

#define EV_SET(kevp_, a, b, c, d, e, f) do {    \
        struct kevent *kevp = (kevp_);          \
        (kevp)->ident = (a);                    \
        (kevp)->filter = (b);                   \
        (kevp)->flags = (c);                    \
        (kevp)->fflags = (d);                   \
        (kevp)->data = (e);                     \
        (kevp)->udata = (f);                    \
} while(0)

The NetBSD version was different:

#define EV_SET(kevp, a, b, c, d, e, f)                                  \
do {                                                                    \
        (kevp)->ident = (a);                                            \
        (kevp)->filter = (b);                                           \
        (kevp)->flags = (c);                                            \
        (kevp)->fflags = (d);                                           \
        (kevp)->data = (e);                                             \
        (kevp)->udata = (f);                                            \
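} while (/* CONSTCOND */ 0)

This resulted in heap corruption: the macro argument kevp is expanded once per field, so the expression &in_events[i++] was evaluated, and i incremented, on every one of the six assignments to (kevp)->.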
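
The effect is easy to reproduce outside of LLDB. The following is a minimal sketch in plain C, not the actual system headers: struct fake_kevent and the macros SET_MULTI_EVAL and SET_SINGLE_EVAL are hypothetical stand-ins, reduced to three fields, for the NetBSD-style and FreeBSD-style definitions shown above. Each helper passes &ev[i++] as the first macro argument and reports how far the index moved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kevent, reduced to three fields. */
struct fake_kevent {
        int ident;
        int filter;
        int flags;
};

/* NetBSD-style at the time: the first argument is expanded once per field. */
#define SET_MULTI_EVAL(kevp, a, b, c) do {      \
        (kevp)->ident = (a);                    \
        (kevp)->filter = (b);                   \
        (kevp)->flags = (c);                    \
} while (0)

/* FreeBSD-style: the first argument is evaluated once into a local pointer. */
#define SET_SINGLE_EVAL(kevp_, a, b, c) do {    \
        struct fake_kevent *kevp = (kevp_);     \
        (kevp)->ident = (a);                    \
        (kevp)->filter = (b);                   \
        (kevp)->flags = (c);                    \
} while (0)

/*
 * Fill "one" event with the multi-evaluating macro and return the index
 * afterwards. The macro touches three elements, hence the precondition.
 */
size_t
fill_multi_eval(struct fake_kevent *ev, size_t nev)
{
        size_t i = 0;

        assert(nev >= 3);
        SET_MULTI_EVAL(&ev[i++], 1, 2, 3);
        return i;
}

/* Same call with the single-evaluating macro. */
size_t
fill_single_eval(struct fake_kevent *ev, size_t nev)
{
        size_t i = 0;

        assert(nev >= 1);
        SET_SINGLE_EVAL(&ev[i++], 1, 2, 3);
        return i;
}
```

With the multi-evaluating macro the three fields land in three different array elements and the index ends up at 3 instead of 1. With the six fields of the real struct kevent, each EV_SET() call in the loop advanced i by six, so the writes quickly ran past the end of in_events.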

Without the GCC asan and ubsan tools, finding this bug would have been much more time-consuming, as the random memory corruption affected an unrelated lambda function in a different part of the code.

To use the GCC sanitizers with packages from pkgsrc, on NetBSD-current, one has to use one or both of these lines:

_WRAP_EXTRA_ARGS.CXX+= -fno-omit-frame-pointer -O0 -g -ggdb -U_FORTIFY_SOURCE -fsanitize=address -fsanitize=undefined -lasan -lubsan
CWRAPPERS_APPEND.cxx+= -fno-omit-frame-pointer -O0 -g -ggdb -U_FORTIFY_SOURCE -fsanitize=address -fsanitize=undefined -lasan -lubsan

While there, I fixed another, generic, bug in the LLVM headers. The constructor of the Triple class did not initialize the SubArch field, which upset the GCC address sanitizer. It was triggered in LLDB by the following code:

void ArchSpec::Clear() {
  m_triple = llvm::Triple();
  m_core = kCore_invalid;
  m_byte_order = eByteOrderInvalid;
  m_distribution_id.Clear();
  m_flags = 0;
}

I have filed a patch for review to address this.

The bad

Unfortunately this is not the full story and there is further mandatory work.

LLDB acceleration

The EV_SET() bug broke upstream LLDB over a month ago, and during this period the debugger was significantly accelerated and parallelized. It is difficult to say definitively, but this might be why the tracer's runtime broke due to threading desynchronization. LLDB behaves differently when run standalone, under ktruss(1) and under gdb(1); what these runs share is that it always fails in one way or another, which makes the problem nontrivial to debug.

The ugly

There are also unpleasant issues at the core of the Operating System.

Kernel troubles

Another bug in the single-step functions affects a different aspect of correctness, this time the reliable execution of a program: processes die in non-deterministic ways when single-stepped. My current impression is that there is no appropriate translation between process and thread (LWP) states under a debugger.

These issues are sibling problems to unreliable PT_RESUME and PT_SUSPEND.

In order to address this appropriately, I have diligently studied the Solaris Internals book this month to get a better picture of the design of NetBSD kernel multiprocessing, which was modeled after this commercial UNIX.

Plan for the next milestone

The current troubles can be summarized as data races in the kernel and at the same time in LLDB. I have decided to port the LLVM sanitizers, as I require the Thread Sanitizer (tsan). Temporarily I have removed the code for tracing processes with multiple threads to hide the known kernel bugs and focus on the LLDB races.

Unfortunately LLDB is not easily bisectable (because of the build time of the LLVM+Clang+LLDB stack and the number of revisions), so the debugging has to be performed on the most recent code from upstream trunk.

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL, and chip in what you can:

http://netbsd.org/donations/#how-to-donate

Posted Tuesday afternoon, June 6th, 2017 Tags: blog

If you've been reading source-changes@, you likely noticed the recent creation of the netbsd-8 branch. If you haven't been reading source-changes@, here's some news: the netbsd-8 branch has been created, signaling the beginning of the release process for NetBSD 8.0.

We don't have a strict timeline for the 8.0 release, but things are looking pretty good at the moment, and we expect this release to happen in a shorter amount of time than the last couple of major releases did.

At this point, we would love for folks to test out netbsd-8 and let us know how it goes. A couple of major improvements since 7.0 are the addition of USB 3 support and an overhaul of the audio subsystem, including an in-kernel mixer. Feedback about these areas is particularly desired.

To download the latest binaries built from the netbsd-8 branch, head to http://daily-builds.NetBSD.org/pub/NetBSD-daily/netbsd-8/

Thanks in advance for helping make NetBSD 8.0 a stellar release!

Posted early Tuesday morning, June 6th, 2017 Tags: blog

QEMU - the FAST! processor emulator - is a generic, Open Source, machine emulator and virtualizer. It defines the state of the art in modern virtualization.

This software has been developed for multiplatform environments, with support for NetBSD since virtually forever. It's the primary tool used by the NetBSD developers and the release engineering team. It is used to run continuous integration tests for daily commits and to execute regression tests through the Automated Testing Framework (ATF).

As the project keeps researching and developing support for various modern trends in computing, the gap between the QEMU feature set on NetBSD and on Linux widened. The lack of active NetBSD maintenance eventually resulted in a broken default build.

The QEMU developers warned the Open Source community - with version 2.9 of the emulator - that they will eventually drop support for suboptimally supported hosts if nobody steps in and takes over the maintainership to refresh the support. This warning was directed at the major BSDs, Solaris, AIX and Haiku.

Thankfully the NetBSD position has been filled, restoring official maintenance for NetBSD.

The current roadmap in QEMU/NetBSD is as follows:

  • address all build failures [all patches sent to review, part of them already merged upstream],
  • address all build warnings,
  • restore the QEMU setup to run regression tests on NetBSD.

The goal is then to move on to maintenance mode: catching up with regressions, adding a NetBSD node to the regression test cluster and reducing the feature set gap. Various features are missing or incomplete on NetBSD, including user-mode emulation (which needs resurrecting), kernel aio(3) support (currently suboptimal), hugepagefs support, hardware-assisted virtualization, PCI passthrough and SR-IOV.

This effort is a spare-time activity, as of now without commercial support. It is possible because the contract for enhancing debuggers during business hours has freed the developer (myself) from more urgent pending NetBSD tasks.

Posted late Tuesday night, May 17th, 2017 Tags: blog

We are very happy to announce that the selection process for this year's Summer of Code, with its bargaining over slots and over which student gets assigned to which project, is over. As a result, the following students will take on their projects:

  • Leonardo Taccari will work on adding multi-package support to pkgsrc.
  • Maya Rashish will work on the LFS cleanup.
  • Utkarsh Anand will make Anita support multiple virtual machine systems and more architectures within them to improve testing coverage.

What follows now is a community bonding period until May 30th, followed by a coding period over the summer (it's Summer of Code, after all :-)) until August 21st, evaluations, code submission and an announcement of the results on September 6th, 2017.

Good luck to all our students and their mentors - we look forward to your work results, and welcome you to The NetBSD Project!

Posted Friday evening, May 5th, 2017 Tags: blog

Coming soon we have a new set of kernel synchronization routines - localcount(9) - which provide a medium-weight reference-counting mechanism. From the manual page, "During normal operations, localcounts do not need the interprocessor synchronization associated with atomic_ops(3) atomic memory operations, and (unlike psref(9)) localcount references can be held across sleeps and can migrate between CPUs. Draining a localcount requires more expensive interprocessor synchronization than atomic_ops(3) (similar to psref(9)). And localcount references require eight bytes of memory per object per-CPU, significantly more than atomic_ops(3) and almost always more than psref(9)."

We'll be adding localcount(9) reference counting to the device driver cdevsw and bdevsw structures, to ensure that a (modular) device driver cannot be removed while it is active. Modular drivers with initializers for these structures need to be modified to initialize their localcount members, using the DEVSW_MODULE_INIT macro (this change is mandatory for all loadable drivers). To take advantage of the reference counting, the drivers also need to replace all calls to bdevsw_lookup() and cdevsw_lookup() with bdevsw_lookup_acquire() and cdevsw_lookup_acquire() respectively, and then release the reference using bdevsw_release() and cdevsw_release().

We'll also be using localcount(9) to provide reference-counting of individual device units, to prevent a unit from being destroyed while it is active. To implement device unit reference-counting, all calls to device_lookup(), device_find_by_driver_unit(), and device_lookup_private() need to be replaced by their corresponding *_acquire() variant; when the caller is finished using the device, it must release the reference using device_release().
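
The calling discipline described above can be sketched in user-space C. This is a simplified model and not the kernel API: it uses one plain counter where localcount(9) keeps per-CPU counters, and toy_device, toy_lookup_acquire(), toy_release() and toy_detach() are hypothetical names that only mirror the pairing of *_acquire() with *_release().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device unit; a single counter replaces per-CPU localcounts. */
struct toy_device {
        bool attached;          /* false once the unit has been destroyed */
        unsigned int refs;      /* references currently held by callers */
};

/* Look the unit up and take a reference; NULL if it is already gone. */
struct toy_device *
toy_lookup_acquire(struct toy_device *dev)
{
        if (dev == NULL || !dev->attached)
                return NULL;
        dev->refs++;
        return dev;
}

/* Drop a reference previously taken by toy_lookup_acquire(). */
void
toy_release(struct toy_device *dev)
{
        assert(dev->refs > 0);
        dev->refs--;
}

/* Destroy the unit, but only when nobody holds a reference. */
bool
toy_detach(struct toy_device *dev)
{
        if (dev->refs != 0)
                return false;   /* still in use: refuse to destroy */
        dev->attached = false;
        return true;
}
```

A caller takes the reference, uses the device, and releases it; a detach attempted in between is refused. The sketch returns false to stay single-threaded, whereas the quoted manual page text speaks of draining a localcount, which this model does not attempt to reproduce.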

More details and examples, including the new localcount(9) manual page, can be found in the prg-localcount2 branch in CVS!

Posted early Wednesday morning, May 3rd, 2017 Tags: blog

Last month I worked on features of the NetBSD Process Plugin and on support for threads in core(5) files.

What has been done in NetBSD

I've managed to accomplish the following:

Introduction of PT_SETSTEP and PT_CLEARSTEP

This makes it possible to:

  • singlestep particular threads,
  • combine PT_STEP with PT_SYSCALL,
  • combine PT_STEP and emission of a signal.

There are equivalent operations in FreeBSD with the same names.

Introduction of helper macro PTRACE_BREAKPOINT_ASM

This code was prepared by Nick Hudson and it was used in ATF tests to verify behavior of software breakpoints.

Addition of new sysctl(2) functions

New defines were added for sysctl(2) on the amd64 and i386 ports. These values are defined in <x86/cpu.h>:

  • CPU_FPU_SAVE (15)
       int: FPU Instructions layout
       * to use this, CPU_OSFXSR must be true
       * 0: FSAVE
       * 1: FXSAVE
       * 2: XSAVE
       * 3: XSAVEOPT
    
  • CPU_FPU_SAVE_SIZE (16)
       int: FPU Instruction layout size
    
  • CPU_XSAVE_FEATURES (17)
       quad: FPU XSAVE features
    
  • Bump CPU_MAXID from 15 to 18.

These values are useful to get FPU (floating point unit) properties in e.g. a debugger. This information is required to properly implement FPR (floating point register) tracer operations on x86 processors.
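
A debugger would typically translate the CPU_FPU_SAVE value into a save-area layout before reading the FPRs. Here is a minimal sketch of that decoding step, using only the numbering listed above. The function name fpu_save_decode() is hypothetical, and the sysctl query that would produce the value is not shown.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Translate a CPU_FPU_SAVE value (0..3, as listed above) into the name of
 * the instruction layout. Returns 0 and stores the name on success, or -1
 * for a value outside the documented range.
 */
int
fpu_save_decode(int value, const char **name)
{
        static const char *const names[] = {
                "FSAVE",        /* 0 */
                "FXSAVE",       /* 1 */
                "XSAVE",        /* 2 */
                "XSAVEOPT"      /* 3 */
        };

        assert(name != NULL);
        if (value < 0 || (size_t)value >= sizeof(names) / sizeof(names[0]))
                return -1;
        *name = names[value];
        return 0;
}
```

Returning an error for unknown values, instead of indexing blindly, keeps a tracer from misreading the register area if further layouts are added later.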

Corrections in ptrace(2) man-page

A few mistakes were corrected to make the documentation more accurate.

ATF tests cleanup in ptrace(2)

New tests were added for the new ptrace(2) operations (PT_SETSTEP and PT_CLEARSTEP).

Also several tests were updated to reflect the current state of "successfully passed" and "expected failure". This is important to mark issues that are already known and quickly catch new regressions in future changes.

F_GETPATH in fcntl(2)

It was decided that NetBSD will not introduce this new fcntl(2) function for compatibility with certain other systems. This means that once LLDB requires this feature, we will need to introduce a workaround in the project.

What has been done in LLDB

The NetBSD Process Plugin in LLDB acquired new capabilities. Additionally, enhancements to LLDB itself were developed, such as handling threads in core(5) files.

Floating point support

The x86_64 architecture supports the FXSAVE processor instructions by default. The FXSAVE feature makes it possible to operate on floating point registers. A thread state (context) is composed of, among other things, general purpose and floating point registers.

The NetBSD Process Plugin acquired the functionality to read these registers and optionally set new values for them.

Watchpoint support

A programmer can use hardware-assisted watchpoints to stop execution of a tracee whenever a certain variable or instruction is read, written or executed. Support for this feature has been implemented on NetBSD with the ptrace(2) operations PT_SETDBREGS and PT_GETDBREGS. These operations are now available in the LLDB Process plugin.
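
On x86, hardware watchpoints are driven by the debug registers that PT_GETDBREGS and PT_SETDBREGS transfer. As an illustration of what a tracer has to compute before setting them, here is a minimal sketch that encodes one watchpoint into the DR7 control register, following the bit layout documented in the x86 architecture manuals. The function and enum names are hypothetical, and the ptrace(2) call itself is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Access types as encoded in the R/W field of the x86 DR7 register. */
enum wp_access { WP_EXECUTE = 0x0, WP_WRITE = 0x1, WP_READWRITE = 0x3 };

/* Region lengths as encoded in the LEN field of DR7. */
enum wp_len { WP_LEN1 = 0x0, WP_LEN2 = 0x1, WP_LEN8 = 0x2, WP_LEN4 = 0x3 };

/*
 * Return dr7 with watchpoint slot 0..3 enabled for the given access type
 * and length. The watched address itself belongs in DR0..DR3.
 */
uint64_t
dr7_enable_watchpoint(uint64_t dr7, unsigned int slot,
    enum wp_access access, enum wp_len len)
{
        assert(slot < 4);

        /* Clear the old R/W and LEN bits of this slot, then set new ones. */
        dr7 &= ~((uint64_t)0xf << (16 + 4 * slot));
        dr7 |= (uint64_t)access << (16 + 4 * slot);
        dr7 |= (uint64_t)len << (18 + 4 * slot);

        /* Local enable bit for this slot (bits 0, 2, 4 and 6). */
        dr7 |= (uint64_t)1 << (2 * slot);
        return dr7;
}
```

For example, a 4-byte write watchpoint in slot 0 yields the control value 0xD0001. Execute watchpoints are expected to use the 1-byte length encoding.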

Threads support in core(5) files

I've included support for LWPs in core(5) files. This means that larger threaded programs, like Firefox, that dumped core for some reason (usually during a crash) can be investigated post mortem.

Demo

I've prepared a recording with the script(1) utility from the NetBSD base system. To replay it:

script -p ./firefox-core.typescript

This recording shows a debugging session of a Firefox core(5) file.

(I was kind enough to prepare a Linux version of the NetBSD script(1) here).

Plan for the next milestone

The plan for the next milestone is to continue developing thread support in the NetBSD Process Plugin. I will need to work more on the correctness of ptrace(2) calls, as new issues resulting in crashes were detected in setups with threads.

There is also ongoing work on a new build node running NetBSD-current (prerelease of 8) and building LLVM+Clang+LLDB. I'm working on enabling unit tests to catch functional regressions quickly. The original LLDB node cluster was privately funded by me over the last two years and has been switched to a machine hosted by The NetBSD Foundation.

To keep this machine (8 CPUs, 24 GB RAM) up and running, community support through donations is required. This is crucial to actively maintain the LLVM toolchain (Clang, LLDB and others) on NetBSD.

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL, and chip in what you can:

http://netbsd.org/donations/#how-to-donate

Posted in the wee hours of Monday night, May 2nd, 2017 Tags: blog
Copyright © 1994-2017 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.
NetBSD® is a registered trademark of The NetBSD Foundation, Inc.