File: wikisrc/pkgsrc/gcc.mdwn, revision 1.7, Sat Dec 30 01:29:49 2017 UTC, by gdt
pkgsrc/gcc: Float notion of failing for unexpected options in bootstrap

From Edgar Fuss in private mail.

    1: On many systems pkgsrc supports, gcc is the standard compiler.  In
    2: general, different versions of each OS have different gcc versions,
    3: and some packages require newer gcc versions, in order to support
    4: newer language standards (e.g. c++11, written in the style of
    5: USE_LANGUAGES), or because older versions don't work (infrequently).
    6: 
    7: This page discusses issues related to version selection, and intends
    8: to be a design document for how pkgsrc should address this problem, to
    9: be converted into historical design rationale once implemented.  It
   10: freely takes content from extensive mailing list discussions, and
   11: attempts to follow the rough consensus that has emerged.
   12: 
   13: ## Base system gcc vs pkgsrc gcc
   14: 
   15: Systems using gcc (e.g. NetBSD) have a compiler as /usr/bin/gcc, and
   16: this is usable by pkgsrc without any bootstrapping activity.  One can
   17: build gcc versions (typically newer versions) from pkgsrc, resulting
   18: in a compiler within ${PREFIX}, e.g. /usr/pkg/gcc6/bin/gcc.  This
   19: compiler can then be used to compile other packages.
   20: 
   21: The issue with using base system gcc is typically that it is too old,
   22: such as gcc 4.5 with NetBSD 6, which cannot compile c++11.  Another
   23: example is gcc 4.8 with NetBSD 7.  While this can compile most c++11
   24: programs, it cannot be used for firefox or glibmm (and therefore any
   25: package that links against glibmm).
   26: 
   27: Issues when using pkgsrc gcc are that
   28: 
   29:   - it must be bootstrapped, requiring compiling a number of packages
   30:     with the system compiler
   31:   - C++ packages that are linked together should be built with the
   32:     same compiler, because the standard library ABI is not necessarily
   33:     the same for each compiler version
   34:   - While C packages can be built with mixed versions, the binary
   35:     should be linked with the higher version because the support
   36:     library is backwards compatible but not forward compatible.
   37: 
   38: ## Specific constraints and requirements
   39: 
   40: This section attempts to gather all the requirements.
   41: 
   42:   - By default, pkgsrc should be able to build working packages, even
   43:     for packages that need a newer compiler than that provided in the
   44:     base system.
   45: 
   46:   - The set of packages that are needed when building a bootstrap
   47:     compiler should be minimized.
   48: 
   49:   - All packages that use C should have final linking with the highest
   50:     version used in any included library.
   51: 
   52:   - All packages that use C++ should be built with the same compiler
   53:     version.  Because these in the general case may include C, the
   54:     version used for C++ must be at least as new as the version used
   55:     for any used C package.
   56: 
   57:   - pkgsrc should avoid building gcc unless it is more or less
   58:     necessary to build packages.  (As an example, if the base system
   59:     gcc can build c99 but not c++11, building a c99-only program
   60:     should not trigger building a gcc version adequate for c++11.)
   61: 
   62:   - The compiler selection logic should work on NetBSD 6 and newer,
   63:     and other systems currently supported by pkgsrc, including in-use
   64:     LTS GNU/Linux systems.  It should work on systems that default to
   65:     clang, when set to use GCC, at least as well as the current
   66:     scheme.  It is desirable for this logic to work on NetBSD 5.
   67: 
   68:   - The compiler selection logic should be understandable and not brittle.
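
The two linking requirements above can be expressed as simple checks.
The following is a minimal sketch in Python, not pkgsrc code; the
function names are hypothetical and versions are assumed to be plain
dotted strings such as "4.8" or "5".

```python
def parse_version(text):
    """Turn '4.8' into (4, 8, 0) so that versions compare numerically."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def final_link_version(versions):
    """C requirement: link with the highest version used by any component."""
    return max(versions, key=parse_version)


def cxx_versions_consistent(cxx_versions, c_versions):
    """C++ requirement: one C++ version, at least as new as every C version."""
    distinct = {parse_version(v) for v in cxx_versions}
    if len(distinct) > 1:
        return False          # C++ components built with mixed versions
    if not distinct:
        return True           # no C++ involved, nothing to check
    cxx = next(iter(distinct))
    return all(parse_version(v) <= cxx for v in c_versions)
```

For example, `final_link_version(["4.8", "5", "4.9"])` selects "5", and
`cxx_versions_consistent(["4.8"], ["5"])` is false because the C++
version is older than a C version in use.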
   69: 
   70: ## Design
   71: 
   72: The above requirements could in theory be satisfied in many ways, but
   73: most of them are too complicated.  We present a design that aims to be
   74: sound while minimizing complexity.
   75: 
   76:   - Packages declare what languages they need, with c++, c++11, and
   77:     c++14 being expressed differently.  (This is exactly current
   78:     practice and just noted for completeness.)
   79: 
   80:   - The package-settable variable GCC_REQD will be used only when a
   81:     compiler that generally can compile the declared language version
   82:     is insufficient.  These cases are expected to be relatively rare;
   83:     an example is firefox, which is in c++ (but not c++11) and needs gcc
   84:     4.9.
   85: 
   86:   - A user-settable variable PKGSRC_GCC_VERSION will declare the
   87:     version of gcc to be used for C programs, with an OS- and
   88:     version-specific default.
   89: 
   90:   - A user-settable variable PKGSRC_GXX_VERSION will declare the
   91:     version of gcc to be used for all C++ programs, again with an OS-
   92:     and version-specific default.  It must be at least
   93:     PKGSRC_GCC_VERSION.
   94: 
   95:   - Each of c99, c++, c++11, and c++14 will be associated with a
   96:     minimum gcc version, such that almost all programs declaring that
   97:     language can be built with that version.  (This avoids issues of
   98:     strict compliance with c++11, which requires a far higher version
   99:     of gcc than the version required to compile almost all actual
  100:     programs in c++11.)
  101: 
  102:   - The minimum version inferred from the language tag will be
  103:     combined with any GCC_REQD declarations to find a minimum version
  104:     for a specific package.  If that is greater than
  105:     PKGSRC_GCC_VERSION (programs using only C) or PKGSRC_GXX_VERSION,
  106:     package building will fail.  We call the resulting
  107:     PKGSRC_GCC_VERSION or PKGSRC_GXX_VERSION the chosen version.
  108: 
  109:   - When building a program using C or C++, if the chosen version is
  110:     not provided by the base system, and the chosen version is not
  111:     installed via pkgsrc, then it (and its dependencies) will be built
  112:     from pkgsrc in a special bootstrap mode.  When building in
  113:     bootstrap mode, the version selection logic is ignored and the
  114:     base system compiler is used.  Consistency and reproducible builds
  115:     require that a package built with the normal prefix must be the
  116:     same whether built because of compiler bootstrapping or normal
  117:     use.
  118: 
  119:     There are thus two choices for dealing with bootstrapping.  One is
  120:     to use a distinct prefix, which will ensure that all packages that
  121:     are part of the compiler bootstrap will not be linked into normal
  122:     pkgsrc programs.  This implies that any dependencies of gcc may
  123:     exist twice, once in bootstrap mode and once if built normally.  A
  124:     gcc version itself will be built twice, if it is desired for
  125:     regular use.  This double building and the complexity of a second
  126:     prefix are the negatives of this approach.
  127: 
  128:     The other choice is to mark gcc and all of its dependencies as used
  129:     for compiler bootstrapping, and to always build those with the
  130:     base compiler.  We use the package-settable variable
  131:     PKGSRC_GCC_BOOTSTRAP=yes to denote this.  The negative with this
  132:     approach is possible inconsistency with gcc's dependencies being
  133:     built with the base compiler and used later.
  134: 
  135:   - We hope that the chosen version can be built using the base system
  136:     version, and hope to avoid multi-stage bootstrapping.
  137: 
  138:   - We expect that any program containing C++ will undergo final
  139:     linking with a C++ compiler.  This is not a change from the
  140:     current situation.
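
The version check described in the list above can be sketched as
follows.  This is an illustration in Python rather than pkgsrc make
code, and the per-language minimum versions in the table are
placeholder assumptions (the design does not fix them); only the
variable names GCC_REQD, PKGSRC_GCC_VERSION and PKGSRC_GXX_VERSION come
from the text.

```python
def parse_version(text):
    """Turn '4.8' into (4, 8, 0) so that versions compare numerically."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


# Assumed minimums for illustration only; the real table is undecided.
LANGUAGE_MIN_GCC = {
    "c": "0",
    "c99": "4.5",
    "c++": "4.5",
    "c++11": "4.8",
    "c++14": "5",
}


def chosen_version(use_languages, gcc_reqd, pkgsrc_gcc_version,
                   pkgsrc_gxx_version):
    """Return the chosen version, or raise if it is older than required."""
    uses_cxx = any(lang.startswith("c++") for lang in use_languages)
    chosen = pkgsrc_gxx_version if uses_cxx else pkgsrc_gcc_version
    # Combine the language-derived minimums with any GCC_REQD values.
    required = [LANGUAGE_MIN_GCC[lang] for lang in use_languages]
    required += list(gcc_reqd)
    minimum = max(required, key=parse_version, default="0")
    if parse_version(minimum) > parse_version(chosen):
        raise RuntimeError(
            "package needs gcc >= %s but the chosen version is %s"
            % (minimum, chosen))
    return chosen
```

With these assumptions, a package in c and c++ with GCC_REQD 4.9 (the
firefox case) fails when PKGSRC_GXX_VERSION is 4.8 and succeeds when it
is 5, while a C-only package is checked against PKGSRC_GCC_VERSION
alone.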
  141: 
  142: ## Remaining issues
  143: 
  144: ### gcc dependencies introduction
  145: 
  146: Because gcc can have dependencies, there could be packages built with
  147: the system compiler that are then later used with the chosen version.
  148: For now, we defer worrying about these problems (judging that they
  149: will be less serious than the current situation where all c++11
  150: programs fail to build on NetBSD 6).
  151: 
  152: \todo: Change gcc 4.8 and 4.9 to enable gcc-inplace-math by default.
  153: 
  154: \todo: Analyze what build-time and install-time dependencies actually
  155: exist.  Include old GNU/Linux in this analysis.
  156: 
  157: \todo: Consider if dropping nls would help.  (On NetBSD, it seems that
  158: base system libraries are used, so it would not help.)
  159: 
  160: \todo: Consider failing, when bootstrapping, if options that we want
  161: set one way are set another.
  162: 
  163: ### managing gcc dependencies
  164: 
  165: There are multiple paths forward.
  166: 
  167: \todo Choose one.  Straw proposal is "Don't worry" and recursive
  168: variable for the initial implementation.
  169: 
  170: #### Separate prefix
  171: 
  172: Build compilers in a separate prefix, or a subprefix, so that the
  173: compiler and the packages needed to build it will not be used by any
  174: normal packages.  This completely avoids the issue of building a
  175: package one way in bootstrap and another not in bootstrap, at the cost
  176: of two builds and writing the separate-prefix code.
  177: 
  178: #### Don't worry
  179: 
  180: Don't worry that packages used to bootstrap the needed compiler are
  181: compiled with an older compiler.  Don't worry that they might be
  182: different depending on build order.  If we have an actual problem,
  183: deal with it.  This requires choosing an approach to omit compiler
  184: selection logic when building the compiler:
  185: 
  186: ##### Mark bootstrap packages
  187: 
  188: Mark packages used to build gcc as PKGSRC_GCC_BOOTSTRAP=yes.
  189: Conditionalize this on OPSYS if necessary.  Don't force the compiler
  190: if this is set.
  191: 
  192: ##### Pass a recursive variable
  193: 
  194: As above, but set PKGSRC_GCC_BOOTSTRAP=yes in the environment of the
  195: call to build the compiler, so that all dependencies inherit
  196: permission to skip compiler selection logic.  (Alternatively, use some
  197: other mechanism such as passing a make variable explicitly.)
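
The inheritance idea behind the recursive variable can be sketched as
below.  This is a Python illustration under the assumption that the
flag travels through the process environment; the function names are
hypothetical, and treating the value case-insensitively is also an
assumption.

```python
import os


def skip_compiler_selection(environ=None):
    """True when the build was started as part of a compiler bootstrap."""
    env = os.environ if environ is None else environ
    return env.get("PKGSRC_GCC_BOOTSTRAP", "").lower() == "yes"


def bootstrap_environment(environ):
    """Environment for the recursive build of the compiler.

    Every dependency built from this environment inherits the flag and
    therefore also skips the compiler selection logic.
    """
    child = dict(environ)
    child["PKGSRC_GCC_BOOTSTRAP"] = "yes"
    return child
```

A normal build sees no flag and runs the selection logic; the build of
the compiler, and anything it pulls in, runs with the environment
returned by `bootstrap_environment` and uses the base compiler.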
  198: 
  199: ### Differing GCC and GXX versions
  200: 
  201: Perhaps it is a mistake to allow the chosen GCC and GXX versions to
  202: differ.  If we require them to be the same, then essentially all
  203: systems with a base system compiler older than gcc 5 will have to
  204: bootstrap the compiler.  For now, we allow them to differ and will
  205: permit the defaults to differ.
  206: 
  207: ### gcc versions and number of buildable packages
  208: 
  209: A gcc version that is too old will not build a number of packages.
  210: Anything older than 4.8 fails for c++11.  4.8 fails on some c++11
  211: packages, such as firefox and glibmm.
  212: 
  213: A version that is too new also fails to build packages.  Analyses
  214: posted to tech-pkg indicate that 5 is close to 4.9 in the number of
  215: packages built, and that moving to 6 causes hundreds of additional
  216: failures.
  217: 
  218: Therefore, the current answer to "What is the best version to use" is
  219: 5.
  220: 
  221: \todo Check this with Jason Bacon.
  222: 
  223: ### Default versions for various systems
  224: 
  225: Note that if for any particular system's set of installed packages (or
  226: bulk build), a newer gcc has to be built, it does not hurt to have
  227: built it earlier.
  228: 
  229: When the base system is old (e.g., gcc 4.5 in NetBSD 6, or 4.1 in
  230: NetBSD 5), then it is clear that a newer version must be built.  For
  231: these, PKGSRC_GXX_VERSION should default to a newish gcc, avoiding
  232: being so new as to cause building issues.  Currently, gcc5 is probably
  233: a good choice, with gcc6 compiling significantly but not vastly fewer
  234: packages.  PKGSRC_GCC_VERSION should probably default to the system
  235: version if it can build all C99 programs, or match PKGSRC_GXX_VERSION,
  236: if the system version is too old.  Perhaps gcc 4.5 would be used, but
  237: 4.1 not used.  \todo Discuss.
  238: 
  239: When the base system is almost new enough, the decision about the
  240: default is more complicated.  A key example is gcc 4.8, found in
  241: NetBSD 7.  Firefox requires gcc 4.9, and all programs using c++14 also
  242: need a newer version.  One option is to choose 4.8, resulting in
  243: firefox failing, as well as all c++14 programs.  Another is to choose
  244: 4.9, but this makes little sense because c++14 programs will still
  245: fail, and the general rule of moving to the most recent
  246: generally-acceptable version applies, which currently leads to gcc5.
  247: This is in effect a declaration that "almost new enough" does not
  248: count as new enough.  Thus the plan for NetBSD 7 is to set
  249: PKGSRC_GCC_VERSION to 4.8 and PKGSRC_GXX_VERSION to 5.
  250: 
  251: When the base system is new enough, e.g. gcc 5, 6 or 7, it should
  252: simply be used.  By "new enough", we mean that almost no programs in
  253: pkgsrc fail to build with it (because it is too old), which implies
  254: that it supports (almost all) C++14 programs.  Our current definition
  255: of new enough is gcc 5.
  256: 
  257: ### Limited mixed versions
  258: 
  259: One approach would be to allow limited mixed versions, where
  260: individual programs could force a specific version to be bootstrapped
  261: and used, so that e.g. firefox could use 4.9 even though most programs
  262: use 4.8, which is what happens now on NetBSD 7.  This would rely on
  263: being able to link c++ with 4.9 including some things built with 4.8
  264: (which is done presently).  However, this approach would become
  265: unsound with a library rather than an end program.  We reject this as
  266: too much complexity for avoiding building a newer compiler in limited
  267: situations.
  268: 
  269: ### Fortran
  270: 
  271: Fortran support is currently somewhat troubled.  It seems obvious to
  272: extend to PKGSRC_GFORTRAN_VERSION, and have that match
  273: PKGSRC_GCC_VERSION or PKGSRC_GXX_VERSION, but the Fortran situation is
  274: not worsened by the above design.
  275: 
  276: When building a gcc version, we get gfortran.  Perhaps, because of
  277: fortran, we should require a single version, vs a C and a C++ version.
  278: 
  279: \todo Discuss.
  280: 
  281: ### C++ programs used by C programs
  282: 
  283: The choice of one version for C++ and one for C (e.g. 5, 4.8 on
  284: netbsd-7) breaks down if a C program links against a library that is
  285: written in C++ but provides a C API, because we still need the C++
  286: version's stdlib.
  287: 
  288: \todo Define a variable for such packages to have in their buildlink3,
  289: which will not add c++ to USE_LANGUAGES but will force
  290: PKGSRC_GXX_VERSION to be used.  Or decide that this is a good reason
  291: to really just have one compiler version.
  292: 
  293: ## Path forward
  294: 
  295:  - Modify all gcc packages to have minimal dependencies, and to add
  296:    PKGSRC_GCC_BOOTSTRAP.
  297: 
  298:  - Modify the compiler selection logic to do nothing if
  299:    PKGSRC_GCC_BOOTSTRAP is set.
  300: 
  301:  - Modify the compiler selection logic for USE_LANGUAGES= to fail if
  302:    PKGSRC_GCC_VERSION/PKGSRC_GXX_VERSION is not new enough.
  303: 
  304:  - Modify the compiler selection logic for GCC_REQD to fail if the
  305:    version of GCC/GXX is not new enough.
  306: 
  307:  - Decide on defaults.  The straw proposal is that PKGSRC_GCC_VERSION
  308:    is the base system version if >= 4.5 (or 4.4?), and otherwise 5,
  309:    and that PKGSRC_GXX_VERSION is the base system version if >= 5, and
  310:    otherwise 5.
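
The straw proposal for defaults can be written out as a small function.
This is a sketch in Python, not the eventual make logic; it takes the
4.5 threshold from the proposal (leaving the "or 4.4?" question open)
and the function name is hypothetical.

```python
def parse_version(text):
    """Turn '4.8' into (4, 8, 0) so that versions compare numerically."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def default_versions(base_version):
    """Return (PKGSRC_GCC_VERSION, PKGSRC_GXX_VERSION) defaults."""
    base = parse_version(base_version)
    # C: keep the base compiler when it is at least 4.5, else use 5.
    gcc = base_version if base >= parse_version("4.5") else "5"
    # C++: keep the base compiler only when it is at least 5.
    gxx = base_version if base >= parse_version("5") else "5"
    return gcc, gxx
```

Under this sketch, a base system with gcc 4.8 (NetBSD 7) gets 4.8 for C
and 5 for C++, gcc 4.5 (NetBSD 6) gets 4.5 and 5, gcc 4.1 (NetBSD 5)
gets 5 for both, and a base system with gcc 6 uses 6 for both.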
  311: 
  312: ### Later steps
  313: 
  314:  - Address fortran.
