Annotation of wikisrc/projects/project/smp_networking.mdwn, revision 1.2

1.1       jmmv        1: [[!template id=project
                      2: 
                      3: title="SMP Networking (aka remove the big network lock)"
                      4: 
                      5: contact="""
                      6: [tech-kern](mailto:tech-kern@NetBSD.org),
1.2     ! jmmv        7: [tech-net](mailto:tech-net@NetBSD.org),
1.1       jmmv        8: [board](mailto:board@NetBSD.org),
                      9: [core](mailto:core@NetBSD.org)
                     10: """
                     11: 
1.2     ! jmmv       12: category="networking"
1.1       jmmv       13: difficulty="hard"
                     14: funded="The NetBSD Foundation"
                     15: 
                     16: description="""
1.2     ! jmmv       17: **WARNING: THIS IS A DRAFT; THE INFORMATION CONTAINED IN THIS PROJECT AND
        !            18: ANY OF THE SUBPROJECTS LINKED BELOW IS SUBJECT TO CHANGE.**
        !            19: 
        !            20: Traditionally, the NetBSD kernel code was protected by a single,
        !            21: global lock.  This lock ensured that, on a multiprocessor system, two
        !            22: different threads of execution did not access the kernel concurrently and
        !            23: thus simplified the internal design of the kernel.  However, such a design
        !            24: does not scale to multiprocessor machines because, effectively, the kernel
        !            25: is restricted to run on a single processor at any given time.
1.1       jmmv       26: 
                     27: The NetBSD kernel has been modified to use fine-grained locks in many of
                     28: its different subsystems, achieving good performance on today's
                     29: multiprocessor machines.  Unfortunately, these changes have not yet been
                     30: applied to the networking code, which remains protected by the single lock.
1.2     ! jmmv       31: In other words: NetBSD networking has evolved to work in a uniprocessor
        !            32: environment; switching it to use fine-grained locking is a hard and complex
        !            33: problem.
1.1       jmmv       34: 
1.2     ! jmmv       35: # Funding
1.1       jmmv       36: 
                     37: At this time, The NetBSD Foundation is accepting project specifications to
                     38: remove the single networking lock.  If you want to apply for this project,
1.2     ! jmmv       39: please send your proposal to the contact addresses listed above.
        !            40: 
        !            41: What follows is a particular design proposal, extracted from an
        !            42: [original text](http://www.NetBSD.org/~matt/smpnet.html) written by
        !            43: [Matt Thomas](mailto:matt@NetBSD.org).  You may choose to work on this
        !            44: particular proposal or come up with your own.
        !            45: 
        !            46: **Please note that the subtasks listed below are also open for funding
        !            47: individually.**
        !            48: 
        !            49: # Tentative specification
        !            50: 
        !            51: The future of NetBSD network infrastructure has to efficiently embrace two
        !            52: major design criteria: Symmetric Multi-Processing (SMP) and modularity.
        !            53: Other design considerations include not only supporting but taking
        !            54: advantage of the capability of newer network devices to do packet
        !            55: classification, payload splitting, and even full connection offload.
        !            56: 
        !            57: You can divide the network infrastructure into 5 major components:
        !            58: 
        !            59: * Interfaces (both real devices and pseudo-devices)
        !            60: * Socket code
        !            61: * Protocols
        !            62: * Routing code
        !            63: * mbuf code
        !            64: 
        !            65: Part of the complexity is that, due to the monolithic nature of the kernel,
        !            66: each layer currently feels free to call any other layer.  This makes
        !            67: designing a lock hierarchy difficult and likely to fail.
        !            68: 
        !            69: Part of the problem is asynchronous upcalls, which include:
        !            70: 
        !            71: * `ifa->ifa_rtrequest` for route changes.
        !            72: * `pr_ctlinput` for interface events.
        !            73: 
        !            74: Another source of complexity is the large number of global variables
        !            75: scattered throughout the source files.  This makes putting locks around
        !            76: them difficult.
        !            77: 
        !            78: The proposed solution presented here includes the following tasks (in no
        !            79: particular order) to achieve the desired goals of SMP support and
        !            80: modularity:
        !            81: 
        !            82: [[!map show="title" pages="projects/project/* and tagged(project) and tagged(smp_networking)"]]
        !            83: 
        !            84: # Radical thoughts
        !            85: 
        !            86: You should also consider the following ideas:
        !            87: 
        !            88: ## LWPs in user space do not need a kernel stack
        !            89: 
        !            90: Those pages are only used when an exception happens.
        !            91: Interrupts are probably going to their own dedicated stack.  One could just
        !            92: keep a set of kernel stacks around.  Each CPU has one; when a user
        !            93: exception happens, that stack is assigned to the current LWP and removed as
        !            94: the active CPU one.  When that CPU next returns to user space, the kernel
        !            95: stack it was using is saved to be used for the next user exception.  The
        !            96: idle LWP would just use the current kernel stack.
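
The stack hand-off described above can be sketched as a small user-space
model.  This is only an illustration under stated assumptions: the names
`sketch_cpu`, `sketch_lwp`, `sketch_exception_enter` and
`sketch_exception_return`, the pool size, and the free-list layout are all
hypothetical and do not correspond to existing NetBSD interfaces.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NSTACKS 4   /* hypothetical pool size, chosen for illustration */

/* Stand-in for the pages of one kernel stack. */
struct sketch_kstack { struct sketch_kstack *next; };

/* Each CPU keeps one spare stack ready for the next user exception. */
struct sketch_cpu { struct sketch_kstack *spare; };

/* An LWP running in user space owns no kernel stack (kstack == NULL). */
struct sketch_lwp { struct sketch_kstack *kstack; };

static struct sketch_kstack sketch_storage[SKETCH_NSTACKS];
static struct sketch_kstack *sketch_free_list;

/* Put every stack on the shared free list. */
static void sketch_pool_init(void)
{
    sketch_free_list = NULL;
    for (size_t i = 0; i < SKETCH_NSTACKS; i++) {
        sketch_storage[i].next = sketch_free_list;
        sketch_free_list = &sketch_storage[i];
    }
}

static struct sketch_kstack *sketch_pool_get(void)
{
    struct sketch_kstack *s = sketch_free_list;
    if (s != NULL)
        sketch_free_list = s->next;
    return s;
}

static void sketch_pool_put(struct sketch_kstack *s)
{
    s->next = sketch_free_list;
    sketch_free_list = s;
}

/* User exception: the CPU's spare stack becomes the LWP's kernel stack. */
static int sketch_exception_enter(struct sketch_cpu *ci, struct sketch_lwp *l)
{
    assert(l->kstack == NULL);
    if (ci->spare == NULL)              /* a blocked LWP kept the last one */
        ci->spare = sketch_pool_get();
    if (ci->spare == NULL)
        return -1;                      /* pool exhausted */
    l->kstack = ci->spare;
    ci->spare = NULL;
    return 0;
}

/* Return to user space: the stack is saved for the next user exception. */
static void sketch_exception_return(struct sketch_cpu *ci, struct sketch_lwp *l)
{
    assert(l->kstack != NULL);
    if (ci->spare == NULL)
        ci->spare = l->kstack;
    else
        sketch_pool_put(l->kstack);
    l->kstack = NULL;
}
```

In this model an LWP that enters and leaves the kernel without blocking
always reuses the same per-CPU stack; the shared pool is only consulted
when an earlier LWP blocked in the kernel and kept the CPU's stack.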
        !            97: 
        !            98: ## LWPs waiting for a kernel condition shouldn't need a kernel stack
        !            99: 
        !           100: If an LWP is waiting on a kernel condition variable, it is expecting to be
        !           101: inactive for some time, possibly a long time.  During this inactivity, it
        !           102: does not really need a kernel stack.
        !           103: 
        !           104: When the exception handler gets a user-mode exception, it sets the LWP
        !           105: restartable flag that indicates that the exception is restartable, and then
        !           106: services the exception as normal.  As routines are called, they can clear
        !           107: the LWP restartable flag as needed.  When an LWP needs to block for a long
        !           108: time, instead of calling `cv_wait`, it could call `cv_restart`.  If
        !           109: `cv_restart` returns false, the LWP's restartable flag was clear, so
        !           110: `cv_restart` acted just like `cv_wait`.  Otherwise, the LWP and CV have
        !           111: been tied together (big hand wave), the lock has been released, and the
        !           112: routine should return `ERESTART`.  `cv_restart` could also wait for a
        !           113: small amount of time, like 0.5 seconds, and restart only if that timeout expires.
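
A minimal sketch of the calling convention described above, written as a
user-space model.  `cv_restart` does not exist today, so everything here is
an assumption: the `sketch_` names, the structure layouts, the placeholder
value `SKETCH_ERESTART`, and the stub that stands in for the ordinary
blocking `cv_wait` are all invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_ERESTART (-3)   /* placeholder value, an assumption */

struct sketch_lwp;

struct sketch_cv {
    struct sketch_lwp *waiter;   /* LWP tied to this CV, if any */
    int blocking_waits;          /* how often the classic path was taken */
};

struct sketch_mtx { bool held; };

struct sketch_lwp {
    bool restartable;            /* set by the exception handler */
    struct sketch_cv *bound_cv;  /* CV the LWP is tied to, if any */
};

/* Stub for the classic wait: a real cv_wait would sleep on the LWP's stack. */
static void sketch_cv_wait(struct sketch_cv *cv, struct sketch_mtx *m)
{
    (void)m;
    cv->blocking_waits++;
}

/*
 * Returns false if it behaved like cv_wait (flag was clear).
 * Returns true if the LWP was tied to the CV and the lock released;
 * the caller is then expected to unwind with ERESTART.
 */
static bool sketch_cv_restart(struct sketch_lwp *l, struct sketch_cv *cv,
    struct sketch_mtx *m)
{
    if (!l->restartable) {
        sketch_cv_wait(cv, m);
        return false;
    }
    l->bound_cv = cv;            /* the "big hand wave" */
    cv->waiter = l;
    m->held = false;
    return true;
}

/* Hypothetical caller: a routine that must wait for some event. */
static int sketch_wait_for_event(struct sketch_lwp *l, struct sketch_cv *cv,
    struct sketch_mtx *m)
{
    m->held = true;
    if (sketch_cv_restart(l, cv, m))
        return SKETCH_ERESTART;  /* lock already released; unwind */
    m->held = false;             /* classic path: woke up holding the lock */
    return 0;
}
```

The point of the boolean return is that every caller on the path must be
able to unwind cleanly; a routine that cannot do so clears the restartable
flag beforehand and gets the classic blocking behaviour.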
        !           114: 
        !           115: As the stack unwinds, it would eventually return to the exception
        !           116: handler.  The handler would see the LWP has a bound CV, save the LWP's
        !           117: user state into the PCB, set the LWP to sleeping, mark the LWP's stack as
        !           118: idle, and call the scheduler to find more work.  When called,
        !           119: `cpu_switchto` would notice the stack is marked idle, and detach it from
        !           120: the LWP.
        !           121: 
        !           122: When the condition times out or is signalled, the first LWP attached to the
        !           123: condition variable is marked runnable and detached from the CV.  When the
        !           124: `cpu_switchto` routine is called, it would notice the lack of a stack
        !           125: so it would grab one, restore the trapframe, and reinvoke the exception
        !           126: handler.
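
The sleep and wake-up path described in the last two paragraphs can be
modelled as a small state machine.  Again, this is a sketch under
assumptions rather than a proposed implementation: the state names, the
`sketch_` functions, and the `handler_reruns` counter (which stands in for
restoring the trapframe and re-invoking the exception handler) are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_state { SK_ONPROC, SK_SLEEPING, SK_RUNNABLE };

struct sketch_kstack { bool idle; };
struct sketch_cv;

struct sketch_lwp {
    enum sketch_state state;
    struct sketch_cv *bound_cv;     /* set earlier by the restart path */
    struct sketch_kstack *kstack;   /* NULL once detached */
    bool user_state_saved;          /* stands in for saving into the PCB */
    int handler_reruns;             /* stands in for re-running the handler */
};

struct sketch_cv { struct sketch_lwp *first_waiter; };

/* Exception handler after the stack has unwound with ERESTART. */
static void sketch_exception_unwound(struct sketch_lwp *l)
{
    if (l->bound_cv == NULL)
        return;                     /* ordinary return to user space */
    l->user_state_saved = true;
    l->state = SK_SLEEPING;
    l->kstack->idle = true;         /* the switch code will detach it */
}

/* Switching away: detach a stack marked idle and hand it back for reuse. */
static struct sketch_kstack *sketch_switch_away(struct sketch_lwp *old)
{
    struct sketch_kstack *s = NULL;
    if (old->kstack != NULL && old->kstack->idle) {
        s = old->kstack;
        old->kstack = NULL;
    }
    return s;
}

/* Signal or timeout: the first waiter becomes runnable and leaves the CV. */
static void sketch_cv_wakeup_first(struct sketch_cv *cv)
{
    struct sketch_lwp *l = cv->first_waiter;
    if (l == NULL)
        return;
    cv->first_waiter = NULL;
    l->bound_cv = NULL;
    l->state = SK_RUNNABLE;
}

/* Switching to an LWP without a stack: attach one and restart the handler. */
static void sketch_switch_to(struct sketch_lwp *next, struct sketch_kstack *fresh)
{
    if (next->kstack == NULL) {
        fresh->idle = false;
        next->kstack = fresh;
        next->handler_reruns++;
    }
    next->state = SK_ONPROC;
}
```

Note that the LWP may resume on a different stack than the one it slept
on, which is only safe because nothing on the old stack survives the
unwind.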
1.1       jmmv      127: """
                    128: ]]
