Skip to content

oaken-source/parabola-cross-bootstrap

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

98 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

parabola-cross-bootstrap
========================

 “Don't cry because it's over, smile because it happened.”
	― Dr. Seuss

1. Introduction
---------------

This project is an attempt to bootstrap a self-contained parabola
GNU/Linux-libre system for the following architectures:

 - riscv64 (HiFive) (complete)
 - riscv32 (PULP) (blocked)
 - sparc (OpenSparc) (planned)
 - powerpc64le (TalosII) (multilib complete)

The scripts are created with the goal to be as architecture agnostic as
possible, to make future porting efforts easier.

The build process is split into four stages, the rationale of which is outlined
in section 2 below.

To initiate a complete build of all stages, run: $> sudo ./create [CARCH]

The builds can be configured to keep going if the build of a single package
fails, by creating the file `build/.KEEP_GOING`. Otherwise, the build will stop
once an error is encountered. This is useful for getting as much work done as
possible unattended, but will make debugging harder in the later stages,
because temporary build fragments and filesystem trees in the build chroots
will be overwritten by the next package.

The complete console output of a package build process can be found in the
corresponding .MAKEPKGLOG file in the build directory of that package.

Target platform configuration is persisted in the config/config.$CARCH.sh
files.

1.1. System Requirements
------------------------

The scripts require, among probably other things, to be running on a fairly
POSIX-conforming GNU/Linux system, and in particular need the following tools
to be present and functional:

 * decently up-to-date GNU build toolchain (gcc, glibc, binutils) most of the
 * things in base-devel pacman, makepkg

I have tried to make the script smart enough to check for required bits and
pieces where needed, and to report when anything is missing ahead of time, but
some requirements may be missing.

1.2. A note to the reader
-------------------------

The scripts assume to be running on a parabola GNU/Linux-libre system, and may
fail in subtle and unexpected ways otherwise. They also may fail in subtle and
unexpected ways anyway, because they are modifying upstream PKGBUILD files,
which are very volatile and change without notice. When adapting this project
to a new architecture, or a different flavour of archlinux, execrcise caution,
pay close attention to any output, and be prepared to fix and modify patches
and scripts.

Also, if you found this project useful, and want to chat about anything, you
can email me at <andreas@grapentin.org>, or find me as <oaken-source> in
#parabola and others on irc.freenode.org.

1.3. Current state of the project
---------------------------------

All four stages of the riscv64 bootstrap are complete, and efforts to add
additional architectures has begun. A pointer where to find future development
efforts for the parabola ports will be added here in due time.

2. Build Stages
---------------

The following subsections outline the reasoning behind the separate bootstrap
stages. More details about *how* things are done may be gathered from reading
the inline comments of the respective scripts.

From Stage 2 onwards, the scripts use DEPTREEs to determine what packages need
to be built. these files are located in the build/$CARCH/stage$X/ directories
and can give an insight into what packages are going to be built next and what
dependencies are still missing.

Since risc-v is a fairly new ISA, some packages were packaged with config.sub
and config.guess files that are too old to recognize the target triplet. This
requires config.sub and config.guess files to be refreshed with newer versions
from upstream. This is done automatically for stage 2 and onwards, if
REGEN_CONFIG_FRAGMENTS is set to yes (the default) in the architecture config
file.

Packages with the `any' architecture are reused from upstream arch for this
bootstrap, since they should work for any architecture and do not need to be
re-(or cross-)compiled. This assumption is not always true, but it is a
reasonable approximation.

Additionally, all checkdepends are ignored, and the check() phase of the builds
is skipped for sanity reasons, since the tests are likely to fail for benign
reasons in a cross-arch makepkg chroot.

Note that the stages use upstream arch and parabola PKGBUILDs and attempt to
apply custom patches to resolve unbuildable or missing dependencies, and fix
architecture specific build issues. Look for these patches in
src/stage$X/patches/* and be prepared to fix and adapt them for future porting
efforts.

2.1. Stage 1
------------

The first stage creates and installs a cross-compile toolchain for the target
triplet defined in $CHOST, consisting of binutils, linux-libre-api-headers, gcc
and glibc. The scripts will check for $CHOST-ar and $CHOST-gcc in $PATH to
determine whether binutils and gcc are installed, and will then proceed to look
for the following files in $CHOST-gcc's sysroot:

  $sysroot/lib/libc.so.6              # for $CHOST-glibc
  $sysroot/include/linux/kernel.h     # for $CHOST-linux-libre-api-headers

If your system contains these files, the toolchain bootstrap process will be
skipped. Otherwise, the scripts will create a new toolchain as follows:

 - compile $CHOST-binutils
 - compile $CHOST-linux-libre-api-headers
 - compile $CHOST-gcc-bootstrap, without glibc dependency
 - cross-compile $CHOST-glibc using $CHOST-gcc-bootstrap
 - compile $CHOST-gcc against the created $CHOST-glibc

The sysroot of the created toolchain is set as /usr/$CHOST.

The $CHOST-{binutils,linux-libre-api-headers,gcc,glibc} packages are installed
as regular pacman packages on the build system, and are not automatically
removed by the scripts after the build is completed (or has failed).

For more information, you might want to check the PKGBUILD files located in
src/stage1/toolchain-pkgbuilds/*.

2.2. Stage 2
------------

Stage 2 uses the toolchain created in Stage 1 to cross-compile a subset of the
packages of the base-devel group plus transitive runtime dependencies.

The script creates an empty skeleton chroot, into which the cross-compiled
packages are going to be installed, and creates a sane makepkg.conf and a
patched makepkg.sh to work in the prepared chroot root directory. To make the
sysroot of the compiler available to builds in the chroot and vice-versa, the
/usr directory of the chroot is mounted into the sysroot. At the end of Stage
2, or in case of an error, this directory is unmounted automatically.

To build the packages, the DEPTREE is traversed and packages are cross-compiled
using upstream PKGBUILDs and custom patches, and the compiled packages are
installed into the chroot immediately. For this stage, the patches are
mandatory for each built package.

Note that this process is a bit fragile and dependent on arbitrary
particularities of the host system, and thus might fail for subtle reasons,
like missing, or superfluous build-time installed packages on the host.
Exercise caution and common sense.

2.3. Stage 3
------------

Stage 3 uses the cross-compiled makepkg chroot created in Stage 2 to natively
recompile the base-devel group of packages. This stage requires to build more
packages, since a reduced set of make-time dependencies need to be present in
the makepkg chroot, as well as runtime dependencies. Additionally, running the
cross-compiled native compiler instead of the cross compiler takes longer,
since user-mode emulation needs to be applied. As a result, stage 3 is expected
to takes much longer than stage 2.

However, since now the process is isolated from the host systems installed
packages, since everything is built cleanly in a chroot, the process is much
more stable and less prone to hard to diagnose problems with the host system.

The scripts create a clean librechroot from the cross-compiled packages
produced in stage 2. A modified libremakepkg script is created to perform
config fragment regeneration and to skip the check() phase, and then packages
are built in order of the DEPTREE.

Since no cross-compilation is needed from stage 3 onwards, the patches are no
longer required for each package. Fewer packages need patching, and the patches
are typically smaller and apply less pressure to the build recipes. As a
consequence, the patches probably need less maintenance in the future, and less
work to adapt to a different architecture.

Note that the cross-compiled packages from stage 3 can be a bit derpy at times,
hence the stage 3 build scripts prioritize building bash and make natively, to
avoid some known issues down the line.

2.4. Stage 4
------------

Stage 4 does a final recompile of the packages of the base-devel group, similar
to stage 3, with the difference that more make-time dependencies are enabled,
and the packages of the base group are added to the DEPTREE.

Stage 4 relies entirely on the packages natively compiled in stage 3, and no
cross-compiled packages are present in the build chroot at any time. This
results in reliable builds and reproducible build failures. However, since the
number of packages to be built in stage 4 is again much larger, expect the
builds to take a long time (days / weeks, not including work required to fix
broken patches and builds).

The result of stage 4 is a repository of packages that should allow to
bootstrap and boot a virtual machine of the bootstrapped architecture, with the
packages required to build the entirety of the arch / parabola package
repository. At this point, the port becomes self-hosting and I consider the
bootstrap process done. Note that a bootloader is missing from the bootstrap
process, but this can easily be built manually after stage4 using the
bootstrapped makepkg chroot.

3. Acknowledgements
===================

I would like to thank the awesome abaumann from the archlinux32 project for
pointers on how to bootstrap a PKGBUILD based system for a new architecture,
and for his work on the bootstrap32 project, which helped a lot in getting this
project started:

  https://github.com/archlinux32/bootstrap32

Further, I would like to thank the amazing people in #riscv and #fedora-riscv
on irc.freenode.org, especially rwmjones, davidlt and sorear for all the help
getting packages to build and work on risc-v, and for providing the source
packages of all fedora-riscv packages, including all patches and build recipes,
which have been of immense help:

  https://fedorapeople.org/groups/risc-v/SRPMS/

Lastly, it was refreshing to see yor names pop up over and over again on the
risc-v ports, patches and pull requests out there, and it made me realize how
much work it really is to get a port like this off the ground. I could not have
made it this far without your work.

	― Andreas Grapentin, 2018-04-18

------------------------------------------------------------------------------

After finishing the risc-v bootstrap, I didn't expect to revisit this
repository so soon. Now, after bootstrapping parabola for powerpc64le, I was
very pleased with how well the scripts held up to the new architecture and how
easily everything came together, despite a couple of bumps in the road.

I realize again that this would not have been possible without the people
listed above - abaumann from archlinux32 and the fedora crew (including, of
course, the amazing people working on ppc64le fedora). Thank you.

	― Andreas Grapentin, 2018-06-22