File:  [NetBSD Developer Wiki] / wikisrc / pkgsrc / gcc.mdwn
Revision 1.12
Thu Jan 4 20:37:58 2018 UTC by gdt
Branches: MAIN
CVS tags: HEAD
Add data.  Promote sections

    1: On many systems pkgsrc supports, gcc is the standard compiler.  In
    2: general, different versions of each OS have different gcc versions,
    3: and some packages require newer gcc versions, in order to support
    4: newer language standards (e.g. c++11, written in the style of
    5: USE_LANGUAGES), or because older versions don't work (infrequently).
    6: 
    7: This page discusses issues related to version selection, and intends
    8: to be a design document for how pkgsrc should address this problem, to
    9: be converted into historical design rationale once implemented.  It
   10: freely takes content from extensive mailing list discussions, and
   11: attempts to follow the rough consensus that has emerged.
   12: 
   13: # Base system gcc vs pkgsrc gcc
   14: 
   15: Systems using gcc (e.g. NetBSD) have a compiler as /usr/bin/gcc, and
   16: this is usable by pkgsrc without any bootstrapping activity.  One can
   17: build gcc versions (typically newer versions) from pkgsrc, resulting
   18: in a compiler within ${PREFIX}, e.g. /usr/pkg/gcc6/bin/gcc.  This
   19: compiler can then be used to compile other packages.
   20: 
   21: The issue with using base system gcc is typically that it is too old,
   22: such as gcc 4.5 with NetBSD 6, which cannot compile c++11.  Another
   23: example is gcc 4.8 with NetBSD 7.  While this can compile most c++11
   24: programs, it cannot be used for firefox or glibmm (and therefore any
   25: package that links against glibmm).
   26: 
   27: Issues when using pkgsrc gcc are that
   28: 
   29:   - on some platforms, pkgsrc gcc does not build and work
   30:   - it must be bootstrapped, requiring compiling a number of packages
   31:     with the system compiler
   32:   - C++ packages that are linked together should be built with the
   33:     same compiler, because the standard library ABI is not necessarily
   34:     the same for each compiler version
   35:   - while C packages can be built with mixed versions, the binary
   36:     should be linked with the higher version, because the support
   37:     library (libgcc) is backwards compatible but not forward compatible
   38: 
   39: # Specific constraints and requirements
   40: 
   41: This section attempts to gather all the requirements.
   42: 
   43:   - By default, pkgsrc should be able to build working packages, even
   44:     for packages that need a newer compiler than that provided in the
   45:     base system.
   46: 
   47:   - The set of packages that are needed when building a bootstrap
   48:     compiler should be minimized.
   49: 
   50:   - All packages that use C should have final linking with the highest
   51:     version used in any included library.
   52: 
   53:   - All packages that use C++ should be built with the same compiler
   54:     version.  Because these in the general case may include C, the
   55:     version used for C++ must be at least as new as the version used
   56:     for any used C package.
   57: 
   58:   - pkgsrc should avoid building gcc unless it is more or less
   59:     necessary to build packages.  (As an example, if the base system
   60:     gcc can build c99 but not c++11, building a c99-only program
   61:     should not trigger building a gcc version adequate for c++11.)
   62: 
   63:   - The compiler selection logic should work on NetBSD 6 and newer,
   64:     and other systems currently supported by pkgsrc, including in-use
   65:     LTS GNU/Linux systems.  It should work on systems that default to
   66:     clang, when set to use GCC, at least as well as the current
   67:     scheme.  It is desirable for this logic to work on NetBSD 5.
   68: 
   69:   - All systems should work at least as well as they do before
   70:     implementation of new compiler selection logic.
   71: 
   72:   - The compiler selection logic should be understandable and not brittle.
   73: 
   74: # Design
   75: 
   76: The above requirements could in theory be satisfied in many ways, but
   77: most of them are too complicated.  We present a design that aims to be
   78: sound while minimizing complexity.
   79: 
   80:   - Packages declare what languages they need, with c++, c++11, and
   81:     c++14 being expressed differently.  (This is exactly current
   82:     practice and just noted for completeness.)
   83: 
   84:   - The package-settable variable GCC_REQD will be used only when a
   85:     compiler that generally can compile the declared language version
   86:     is insufficient.  These cases are expected to be relatively rare;
   87:     an example is firefox that is in c++ (but not c++11) and needs gcc
   88:     4.9.
   89: 
   90:   - A user-settable variable PKGSRC_GCC_VERSION will declare the
   91:     version of gcc to be used for C programs, with an OS-,
   92:     version- and architecture-specific default.
   93: 
   94:   - A user-settable variable PKGSRC_GXX_VERSION will declare the
   95:     version of gcc to be used for all C++ programs, again with an OS-,
   96:     version- and architecture-specific default.  It must be at least
   97:     PKGSRC_GCC_VERSION.
   98: 
   99:   - If PKGSRC_GCC_VERSION and PKGSRC_GXX_VERSION are not set, the
  100:     system will behave much as before.  As a possible exception,
  101:     builds may still fail if the required version is greater than the
  102:     base system version.  So far the only known reason to avoid
  103:     setting these variables is if pkgsrc gcc cannot be built.
  104: 
  105:   - Each of c99, c++, c++11, and c++14 will be associated with a
  106:     minimum gcc version, such that almost all programs declaring that
  107:     language can be built with that version.  (This avoids issues of
  108:     strict compliance with c++11, which requires a far higher version
  109:     of gcc than the version required to compile almost all actual
  110:     programs in c++11.)
  111: 
  112:   - The minimum version inferred from the language tag will be
  113:     combined with any GCC_REQD declarations to find a minimum version
  114:     for a specific package.  If that is greater than
  115:     PKGSRC_GCC_VERSION (programs using only C) or PKGSRC_GXX_VERSION
  116:     (programs using C++), package building will fail.  We call the
  117:     applicable PKGSRC_GCC_VERSION or PKGSRC_GXX_VERSION the chosen version.
  118: 
  119:   - When building a program using C or C++, if the chosen version is
  120:     not provided by the base system, and the chosen version is not
  121:     installed via pkgsrc, then it (and its dependencies) will be built
  122:     from pkgsrc in a special bootstrap mode.  When building in
  123:     bootstrap mode, the version selection logic is ignored and the
  124:     base system compiler is used.  Consistency and reproducible builds
  125:     require that a package built with the normal prefix must be the
  126:     same whether built because of compiler bootstrapping or normal
  127:     use.
  128: 
  129:     There are thus two choices for dealing with bootstrapping.  One is
  130:     to use a distinct prefix, which will ensure that all packages that
  131:     are part of the compiler bootstrap will not be linked into normal
  132:     pkgsrc programs.  This implies that any dependencies of gcc may
  133:     exist twice, once in bootstrap mode and once if built normally.  A
  134:     gcc version itself will be built twice, if it is desired for
  135:     regular use.  This double building and the complexity of a second
  136:     prefix are the negatives of this approach.
  137: 
  138:     The other choice is to mark gcc and all depending packages as used
  139:     for compiler bootstrapping, and to always build those with the
  140:     base compiler.  We use the package-settable variable
  141:     PKGSRC_GCC_BOOTSTRAP=yes to denote this.  The negative with this
  142:     approach is possible inconsistency with gcc's dependencies being
  143:     built with the base compiler and used later.
  144: 
  145:     As an alternative, we could store the list of bootstrap packages
  146:     in a variable, because the list will vary with OS and version,
  147:     and with PREFER_PKGSRC settings.
  148: 
  149:     As a third alternative, we could pass a GCC_BOOTSTRAPPING variable
  150:     recursively.  This is easier but less consistent.
  151: 
  152:   - We hope that the chosen version can be built using the base system
  153:     version, and hope to avoid multi-stage bootstrapping.
  154: 
  155:   - We expect that any program containing C++ will undergo final
  156:     linking with a C++ compiler.  This is not a change from the
  157:     current situation.
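The version-combination rule in the list above can be sketched as a small model.  This is a minimal illustration in Python, not pkgsrc's actual make logic.  The function and exception names are hypothetical, and the entries for c, c99 and c++ in the table are placeholder assumptions; only the c++11 (4.8) and c++14 (5) entries follow figures given elsewhere on this page.

```python
from typing import Iterable, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '4.8' into (4, 8) so that 4.10 sorts after 4.9 and 5.0 equals 5."""
    parts = [int(piece) for piece in text.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


# Hypothetical per-language minimum gcc versions (illustration only).
LANGUAGE_MIN = {
    "c": "0",        # assumption: any gcc will do
    "c99": "4.5",    # assumption
    "c++": "4.5",    # assumption
    "c++11": "4.8",  # this page: anything older than 4.8 fails for c++11
    "c++14": "5",    # this page: "new enough" for almost all C++14 is gcc 5
}


class CompilerTooOld(Exception):
    """Raised when the user-selected version cannot build the package."""


def required_version(languages: Iterable[str], gcc_reqd: Iterable[str] = ()) -> str:
    """Combine the language-derived minimums with any GCC_REQD values."""
    candidates = [LANGUAGE_MIN[lang] for lang in languages] + list(gcc_reqd)
    return max(candidates, key=parse_version, default="0")


def chosen_version(languages, gcc_reqd, pkgsrc_gcc_version, pkgsrc_gxx_version) -> str:
    """Return the chosen version, or fail if it is older than required."""
    if parse_version(pkgsrc_gxx_version) < parse_version(pkgsrc_gcc_version):
        raise ValueError("PKGSRC_GXX_VERSION must be at least PKGSRC_GCC_VERSION")
    languages = list(languages)
    uses_cxx = any(lang.startswith("c++") for lang in languages)
    chosen = pkgsrc_gxx_version if uses_cxx else pkgsrc_gcc_version
    needed = required_version(languages, gcc_reqd)
    if parse_version(needed) > parse_version(chosen):
        raise CompilerTooOld(f"package needs gcc {needed}, chosen version is {chosen}")
    return chosen
```

With PKGSRC_GCC_VERSION at 4.8 and PKGSRC_GXX_VERSION at 5, this model picks 4.8 for a c99-only package and 5 for a c++11 package.  A c++ package with GCC_REQD of 4.9 fails when both variables are 4.8.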
  158: 
  159: # Remaining issues
  160: 
  161: ## gcc dependencies introduction
  162: 
  163: Because gcc can have dependencies, there could be packages built with
  164: the system compiler that are then later used with the chosen version.
  165: For now, we defer worrying about these problems (judging that they
  166: will be less serious than the current situation where all c++11
  167: programs fail to build on NetBSD 6).
  168: 
  169: \todo: Perhaps change gcc 4.8 and 4.9 to enable gcc-inplace-math by
  170: default.  Perhaps decide that if we want to build gcc, we want to
  171: build 5 or 6, and 4.9 is no longer of interest as a bootstrap target.
  172: 
  173: \todo: Analyze what build-time and install-time dependencies actually
  174: exist.  Include old GNU/Linux in this analysis.
  175: 
  176: \todo: Consider if dropping nls would help.  (On NetBSD, it seems that
  177: base system libraries are used, so it would not help.)
  178: 
  179: \todo: Consider failing, when bootstrapping, if options that we want
  180: set one way are set another.
  181: 
  182: ## managing gcc dependencies
  183: 
  184: There are multiple paths forward.
  185: 
  186: \todo Choose one.  Straw proposal is "Don't worry" and recursive
  187: variable for the initial implementation.
  188: 
  189: ### Separate prefix
  190: 
  191: Build compilers in a separate prefix, or a subprefix, so that the
  192: compiler and the packages needed to build it will not be used by any
  193: normal packages.  This completely avoids the issue of building a
  194: package one way in bootstrap and another not in bootstrap, at the cost
  195: of two builds and writing the separate-prefix code.
  196: 
  197: ### Don't worry
  198: 
  199: Don't worry that packages used to bootstrap the needed compiler are
  200: compiled with an older compiler.  Don't worry that they might be
  201: different depending on build order.  If we have an actual problem,
  202: deal with it.  This requires choosing an approach to omit compiler
  203: selection logic when building the compiler:
  204: 
  205: #### Mark bootstrap packages
  206: 
  207: Mark packages used to build gcc as PKGSRC_GCC_BOOTSTRAP=yes.
  208: Conditionalize this on OPSYS if necessary.  Don't force the compiler
  209: if this is set.
  210: 
  211: Alternatively, manage a per-OS list of packages in a central mk file.
  212: 
  213: #### Pass a recursive variable
  214: 
  215: As above, but set PKGSRC_GCC_BOOTSTRAP=yes in the environment of the
  216: call to build the compiler, so that all dependencies inherit
  217: permission to skip compiler selection logic.  (Alternatively, use some
  218: other mechanism such as passing a make variable explicitly.)
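The inheritance that the recursive-variable approach relies on can be shown with a short Python sketch.  It is only a model of environment propagation, not pkgsrc code: the helper names are hypothetical, and a child Python process stands in for a recursive make invocation.

```python
import os
import subprocess
import sys

BOOTSTRAP_VAR = "PKGSRC_GCC_BOOTSTRAP"


def build_environment(bootstrapping: bool) -> dict:
    """Copy the current environment, setting the marker only when bootstrapping."""
    env = dict(os.environ)
    env.pop(BOOTSTRAP_VAR, None)
    if bootstrapping:
        env[BOOTSTRAP_VAR] = "yes"
    return env


def selection_skipped(env: dict) -> bool:
    """Model of the check: skip compiler selection when the marker is 'yes'."""
    return env.get(BOOTSTRAP_VAR, "").lower() == "yes"


def child_skips_selection(env: dict) -> bool:
    """Run a child process (standing in for a dependency build) and report
    whether it sees the marker inherited from its parent."""
    probe = (
        "import os; "
        "print(os.environ.get('PKGSRC_GCC_BOOTSTRAP', '').lower() == 'yes')"
    )
    result = subprocess.run(
        [sys.executable, "-c", probe],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"
```

Because the marker travels in the environment, every process started below the compiler build sees it without any per-package marking.  That is also why the result can depend on build order.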
  219: 
  220: ## Differing GCC and GXX versions
  221: 
  222: Perhaps it is a mistake to allow the chosen GCC and GXX versions to
  223: differ.  If we require them to be the same, then essentially all
  224: systems with a base system compiler older than gcc 5 will have to
  225: bootstrap the compiler.  For now, we allow them to differ and will
  226: permit the defaults to differ.
  227: 
  228: ## gcc versions and number of buildable packages
  229: 
  230: A gcc version that is too old will not build a number of packages.
  231: Anything older than 4.8 fails for c++11.  4.8 fails on some c++11
  232: packages, such as firefox and glibmm.
  233: 
  234: A version that is too new also fails to build packages.  Counts that
  235: Jason Bacon posted to tech-pkg indicate that 5 is close to 4.8 in the
  236: number of packages built, and that moving to 6 causes hundreds of
  237: additional failures.  (Keep in mind that currently, building with 4.8
  238: will build 4.9 for firefox, but in the future will not.)
  239: 
  240:     www/pkgsrc/packages/sharedapps/pkg-2017Q3/RHEL6-gcc48/All	16461
  241:     www/pkgsrc/packages/sharedapps/pkg-2017Q3/RHEL6-gcc6/All	15849
  242: 
  243:     www/pkgsrc/packages/sharedapps/pkg-2017Q3/RHEL7-gcc48/All	16414
  244:     www/pkgsrc/packages/sharedapps/pkg-2017Q3/RHEL7-gcc5/All	16338
  245: 
  246: Therefore, the current answer to "What is the best version to use" is
  247: 5.
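The differences behind that conclusion can be worked out from the four counts listed above.  The sketch below is plain arithmetic in Python, with hypothetical dictionary and function names.  It assumes each count is the number of packages built for that configuration, and note that the two comparisons come from different OS releases.

```python
# Package counts copied from the listing above (pkg-2017Q3).
COUNTS = {
    "RHEL6": {"gcc48": 16461, "gcc6": 15849},
    "RHEL7": {"gcc48": 16414, "gcc5": 16338},
}


def shortfall(platform: str, newer: str) -> int:
    """Packages built with gcc 4.8 minus packages built with the newer gcc."""
    row = COUNTS[platform]
    return row["gcc48"] - row[newer]


def shortfall_percent(platform: str, newer: str) -> float:
    """The same shortfall as a percentage of the gcc 4.8 count."""
    return 100.0 * shortfall(platform, newer) / COUNTS[platform]["gcc48"]
```

On these figures, gcc 6 on RHEL6 yields 612 fewer packages than gcc 4.8 (about 3.7%), while gcc 5 on RHEL7 yields 76 fewer (about 0.5%).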
  248: 
  249: ## Default versions for various systems
  250: 
  251: Note that if for any particular system's set of installed packages (or
  252: bulk build), a newer gcc has to be built, it does not hurt to have
  253: built it earlier.
  254: 
  255: When the base system is old (e.g., gcc 4.5 in NetBSD 6, or 4.1 in
  256: NetBSD 5), then it is clear that a newer version must be built.  For
  257: these, PKGSRC_GXX_VERSION should default to a newish gcc, avoiding
  258: being so new as to cause building issues.  PKGSRC_GCC_VERSION should
  259: probably default to the system version if it can build all C99
  260: programs, or match PKGSRC_GXX_VERSION, if the system version is too
  261: old.  Perhaps gcc 4.5 would be used, but 4.1 not used.  \todo Discuss.
  262: 
  263: When the base system is almost new enough, the decision about the
  264: default is more complicated.  A key example is gcc 4.8, found in
  265: NetBSD 7.  Firefox requires gcc 4.9, and all programs using c++14 also
  266: need a newer version.  One option is to choose 4.8, resulting in
  267: firefox failing, as well as all c++14 programs.  Another is to choose
  268: 4.9, but this makes little sense because c++14 programs will still
  269: fail, and the general rule of moving to the most recent
  270: generally-acceptable version applies, which currently leads to gcc5.
  271: This is in effect a declaration that "almost new enough" does not
  272: count as new enough.  Thus the plan for NetBSD 7 is to set
  273: PKGSRC_GCC_VERSION to 4.8 and PKGSRC_GXX_VERSION to 5.
  274: 
  275: When the base system is new enough, e.g. gcc 5, 6 or 7, it should
  276: simply be used.  By "new enough", we mean that almost no programs in
  277: pkgsrc fail to build with it (because it is too old), which implies
  278: that it supports (almost all) C++14 programs.  Our current definition
  279: of new enough is gcc 5.
  280: 
  281: ## Limited mixed versions
  282: 
  283: One approach would be to allow limited mixed versions, where
  284: individual programs could force a specific version to be bootstrapped
  285: and used, so that e.g. firefox could use 4.9 even though most programs
  286: use 4.8, which is what happens now on NetBSD 7.  This would rely on
  287: being able to link c++ with 4.9 including some things built with 4.8
  288: (which is done presently).  However, this approach would become
  289: unsound with a library rather than an end program.  We reject this as
  290: too much complexity for avoiding building a newer compiler in limited
  291: situations.
  292: 
  293: ## Fortran
  294: 
  295: Fortran support is currently somewhat troubled.  It seems obvious to
  296: extend to PKGSRC_GFORTRAN_VERSION, and have that match
  297: PKGSRC_GCC_VERSION or PKGSRC_GXX_VERSION, but the Fortran situation is
  298: not worsened by the above design.
  299: 
  300: When building a gcc version, we get gfortran.  Perhaps, because of
  301: fortran, we should require a single version, vs a C and a C++ version.
  302: 
  303: \todo Discuss.
  304: 
  305: ## C++ libraries used by C programs
  306: 
  307: The choice of one version for C++ and one for C (e.g. 5 and 4.8,
  308: respectively, on netbsd-7) breaks down if a C program links against a
  309: library that is written in C++ but provides a C API, because we still
  310: need the C++ version's stdlib.
  311: 
  312: \todo Define a variable for such packages to have in their buildlink3,
  313: which will not add c++ to USE_LANGUAGES but will force
  314: PKGSRC_GXX_VERSION to be used.  Or decide that this is a good reason
  315: to really just have one compiler version.
  316: 
  317: # Path forward
  318: 
  319: (This assumes per-package marking of bootstrap packages, but is
  320: reasonably obviously extended to the other schemes.)
  321: 
  322:  - Modify all gcc packages to have minimal dependencies, and to add
  323:    PKGSRC_GCC_BOOTSTRAP.
  324: 
  325:  - Modify the compiler selection logic to do nothing if
  326:    PKGSRC_GCC_BOOTSTRAP is set.
  327: 
  328:  - Modify the compiler selection logic for USE_LANGUAGES= to fail if
  329:    PKGSRC_GCC_VERSION/PKGSRC_GXX_VERSION is not new enough.
  330: 
  331:  - Modify the compiler selection logic for GCC_REQD to fail if
  332:    PKGSRC_GCC_VERSION/PKGSRC_GXX_VERSION is not new enough.
  333: 
  334:  - Decide on defaults.  The straw proposal is that PKGSRC_GCC_VERSION
  335:    is the base system version if >= 4.5 (or 4.4?), and otherwise 5,
  336:    and that PKGSRC_GXX_VERSION is the base system version if >= 5, and
  337:    otherwise 5.  Implement these in platform.mk as they are tested.
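The straw proposal for defaults can be written down as a small function.  This is a sketch in Python rather than the platform.mk implementation.  The function name and its keyword parameters are hypothetical, and the 4.5 threshold is the still-open value from the proposal (4.4 is the other candidate).

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '4.8' into (4, 8) so that versions compare numerically."""
    parts = [int(piece) for piece in text.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def default_versions(
    base_gcc: str,
    c_floor: str = "4.5",   # straw proposal; 4.4 is also under discussion
    cxx_floor: str = "5",   # current definition of "new enough"
    fallback: str = "5",    # version to build from pkgsrc otherwise
) -> Tuple[str, str]:
    """Return (PKGSRC_GCC_VERSION, PKGSRC_GXX_VERSION) defaults for a base gcc."""
    base = parse_version(base_gcc)
    gcc = base_gcc if base >= parse_version(c_floor) else fallback
    gxx = base_gcc if base >= parse_version(cxx_floor) else fallback
    return gcc, gxx
```

For the base compilers mentioned on this page, the sketch gives 4.8 and 5 for NetBSD 7 (gcc 4.8), 4.5 and 5 for NetBSD 6 (gcc 4.5), and 5 for both on NetBSD 5 (gcc 4.1).  In each case the C++ version is at least the C version, as the design requires.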
  338: 
  339: ## Later steps
  340: 
  341:  - Address fortran.  Probably add PKGSRC_GFORTRAN_VERSION, after
  342:    determining how Fortran, C and C++ interact with library ABI
  343:    compatibility.
  344: 
  345: # Data
  346: 
  347: This section has data points that are relevant to the discussion.
  348: 
  349: ## amd64/i386
  350: 
  351: It is believed that pkgsrc gcc generally builds on these systems.
  352: gcc6 builds on netbsd-5/i386.
  353: 
  354: ## macppc
  355: 
  356: On macppc, [lang/gcc5 fails on netbsd-6 and netbsd-7, but succeeds on
  357: netbsd-8](https://mail-index.netbsd.org/tech-pkg/2018/01/03/msg019260.html).
