-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path_results.tex
217 lines (186 loc) · 8.79 KB
/
_results.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
\FloatBarrier
\chapter{Results and discussion}
The development of miq ended up with a working prototype
that can be used to build some packages implemented
in the reference packages repository in Lua. The number of
lines of code of the program is 1200 lines of Rust, and
around 1000 lines of Lua, the latter being much less
complex.
The program is built on GitHub actions, a Continuous
Integration service that builds the application on the
cloud, and attaches it to the GitHub release. The source
code for the project is therefore hosted on GitHub at \url{https://github.com/viperML/miq}.
While the program is a research project, it was developed
with the end-user in mind, by trying to provide a good user
experience via the \ac{CLI}. The application is not a simple
script, but tries to provide a good error messages. The
usage of the Rust programming language also has served as a
learning experience for the author. While the syntax and
semantics of the languages are intimating at first glance,
the type system and library ecosystem greatly boost the
development productivity.
The standard packages' repository implemented in Lua tries
to organize them in multiple stages: |stage0| and |stage1|,
with the purpose of bootstrapping a C compiler by building
it from the previous result, in hopes of getting a result as
pure as possible. As C programs and libraries are the basis
of Linux systems, the C compiler is usually the first target
to try to build with a new package manager. While this was
also true for miq, the complexity of GCC didn't make it
possible to have a working GCC build in the time span of
this project. This is not due to technical limitations of
miq itself, but of properly configuring the build to work
with the new style of hashed paths of miq.
The first stage of the bootstrap process uses nix's
bootstrap tools, downloaded directly from their
repositories. These bootstrap tools contain a sufficiently
recent GCC (version 7), a libc implementation (musl) and
core utilities to be able to bootstrap the C compiler, such
as bash, make, and others. By using nix's bootstrap tools
and utility functions implemented in Lua to abstract the
standard environment, it was possible to compile several of
the dependencies of GCC, including:
\begin{itemize}
\item Nix bootstrap tools.
\item musl libc
\item cc and ld wrappers
\item GMP -- GNU
Multiple Precision
Arithmetic Library
\item MPFR -- GNU Multiple Precision Floating-Point Reliable Library
\item LIBMPC -- GNU Complex floating-point library.
\end{itemize}
The package with the deepest dependency graph is libmpc (one
of GCC's dependencies). Figure \ref{fig:results} shows its
visual representation. Building libmpc can be done with the
following miq invocation:
\begin{minted}[breaklines]{text}
$ miq build ./pkgs/init.lua#stage1.libmpc
/miq/store/bootstrap-tools.tar.xz-9d678d0fc5041f17
/miq/store/toybox-x86_64-69a4327d80d88104
/miq/store/busybox-33a90b67a497c4d6
/miq/store/m4-1.4.19.tar.bz2-6732a25e4458acb
/miq/store/mpc-1.3.1.tar.gz-cf0aa3bd2a0d6fe0
/miq/store/mpfr-4.2.0.tar.bz2-ea1165b7c0959798
/miq/store/unpack-bootstrap-tools.sh-6949dd1f64cfe7b6
/miq/store/gmp-6.2.1.tar.bz2-a8db6558fa4fba6b
/miq/store/musl-1.2.3.tar.gz-828bf8f78328fb26
/miq/store/bootstrap-5f87f2800c8c639e
/miq/store/gmp-6.2.1.tar.bz2-unpack-33b1305b8e698313
/miq/store/m4-1.4.19.tar.bz2-unpack-2dd3c0568e5cf24a
/miq/store/musl-1.2.3.tar.gz-unpack-5f9d5116c4c83592
/miq/store/cc-wrapper-6666c755718e6aa4
/miq/store/ld-wrapper-668a4212dddee39c
/miq/store/stage0-stdenv-d2ecc89c54b1b316
/miq/store/musl-f0dd14ee1ca91c64
/miq/store/cc-wrapper-a948c296a1a6d88a
/miq/store/ld-wrapper-54bd49b0d1298443
/miq/store/stage0-stdenv-cbfc1da815062410
/miq/store/m4-a132d7d257844060
/miq/store/gmp-4dc253ccbc7d9572
/miq/store/mpc-1.3.1.tar.gz-unpack-2032440f15c6d528
/miq/store/mpfr-4.2.0.tar.bz2-unpack-c0aadc49a8faf78f
/miq/store/mpfr-9a80ac127a402980
/miq/store/libmpc-5d2a3c99a73fb6e6
\end{minted}
\begin{figure}[hbt]
\centerfloat
\includesvg[width=500pt]{results.svg}
\caption{Dependency graph of libmpc.}
\label{fig:results}
\end{figure}
While the |ld| and |gcc| wrappers performs the modifications needed by
this deployment model -- namely modifying the |RUNPATH|
dynamic section of the resulting |ELF| -- more work is
needed to handle properly the addition of dependencies to
the |RUNPATH| . This should be handled in the Lua code, that
creates the |ld| wrappers. For example, some libraries are
missing from the |RUNPATH| of the libmpc, while libc is
properly handled:
\begin{minted}[breaklines]{text}
$ eu-readelf -d /miq/store/libmpc-5d2a3c99a73fb6e6/lib/libmpc.so | grep -e RUNPATH -e NEEDED
NEEDED Shared library: [libmpfr.so.6]
NEEDED Shared library: [libgmp.so.10]
NEEDED Shared library: [libc.so]
RUNPATH Library runpath: [/miq/store/musl-f0dd14ee1ca91c64/lib]
\end{minted}
A simpler program like dash, which is a POSIX shell
implementation, is properly built, as it only requires libc.
\begin{minted}[breaklines]{text}
$ miq build ./pkgs/init.lua#stage1.dash
...
/miq/store/dash-fd40df4f5c3b3d7b
$ eu-readelf -d /miq/store/dash-fd40df4f5c3b3d7b/bin/dash | grep -e RUNPATH -e NEEDED
NEEDED Shared library: [libc.so]
RUNPATH Library runpath: [/miq/store/musl-f0dd14ee1ca91c64/lib]
$ /miq/store/dash-fd40df4f5c3b3d7b/bin/dash
$ cd /
$ echo *
bin dev efi etc home lib64 miq mnt nix opt proc root run srv sys tmp usr var
\end{minted}
\section{Security}
One of the main advantages of Linux operating systems, is
that the library components that form the \ac{OS} itself can
be easily replaced or updated, compared to Windows where
updates are served by Microsoft in a monolithic fashion.
For this reason, in the classical Linux distribution model,
packages are dynamically linked to each other and the
\acl{FHS} is used. When a security update is pushed, for
example for openssl, all the packages that are linked to its
library don't need to be replaced. As the library that they
depend on resides in a standard location at |/usr/lib|, an
update just replaces the underlying file, without an
application author needing to do anything. Therefore, one
could say that the deployment model of nix and miq poses a
problem, as the packages are "statically" linked to each
other --- but still using ELF dynamic linking. When an
update is pushed to openssl, all the packages that depend on
it, don't get the update, as the ELF files point into the
absolute path of the older openssl at
|/miq/store/older-openssl-hash|.
However, this can be solved by using the environment
variable |LD_LIBRARY_PATH| or |LD_PRELOAD|. As mentioned in the previous
sections, the link-loader has a list of known locations that
it search libraries for, namely:
\begin{itemize}
\item |LD_PRELOAD| environment variable pointing
directly to libraries to load even if they are not
needed.
\item |LD_LIBRARY_PATH| environment variable pointing to
a path of libraries.
\item |RUNPATH| dynamic section of the binary pointing
to a path of libraries.
\item The default search paths at |/lib| and |/usr/lib|.
\end{itemize}
While miq embeds the library |RUNPATH|, a system
administrator is still able to override it by setting either
|LD_LIBRARY_PATH| or |LD_PRELOAD| at global scope, like with
|/etc/profile|, such that binaries can ignore the embedded
library path to use a different package.
The deployment model of miq would also make it easy to trace
back the dependencies that a package uses by storing them in
the package database. This would make it easy to find which
packages are affected by a security vulnerability, possible
more so than in the classical Linux distribution model.
And in the world of Docker containers, where applications
are built into a micro-operating system constantly, having
to rebuild the whole system because of a package down in the
dependency graph is not so far-fetched from having to
rebuild an entire docker container.
\section{Other applications}
While the scope of this research project was to explore the
problems of the Linux ecosystem, the developed solution
could be useful to other fields. For example, the builder is
not limited to build C libraries, but could also be used to
build python packages with the properties of the bubblewrap
sandbox. This could allow for more reproducible deployments
and easier dependency management. However, it should be
studied how the build should be sandboxed: should python be
a store path? If so, should every dependency be also built
by miq down to libc?
Another application of the builder is to declare files to be
downloaded and cached. By leveraging the Lua scripting
language, it is trivial to programmatically declare files to
build (for example, every major version from N to N + 3).
Fetch Units downloaded by miq are automatically hashed,
making it easy to keep track of different files in the store.