* access control
On the Linux desktop, most people run apps with zero access control enforced. This means that the Discord client, for example, can ~rm -rf ${HOME}~ or slurp up your private data such as your ssh keys. This is not great, but there are several options available to address this problem.
* example program
I'm going to use weechat as an example of a program that we want to isolate from the rest of the system. It's simple but not trivial, and touches a lot of commonly shared directories like ~XDG_CONFIG_HOME~, ~XDG_STATE_HOME~ and similar.
At its core, weechat needs access to a few directories and the network. We will focus on the directories.
Here are the directories/files it needs write access to:
#+BEGIN_SRC
${HOME}/.config/weechat
${HOME}/.cache/weechat
${HOME}/.local/share/weechat
${HOME}/.local/state/weechat
${XDG_RUNTIME_DIR}/weechat
#+END_SRC
It also needs read-only access to some system files such as (assuming a merged-usr system):
#+BEGIN_SRC
/etc/ld.so.cache # the dynamic loader's library cache
/usr/lib{,32,64}
/usr/bin/weechat # weechat executable itself
/usr/share
#+END_SRC
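As a rough sketch of how the writable directories above could be resolved and created ahead of time, the helper below falls back to the usual XDG defaults when the variables are unset. The function names ~weechat_dirs~ and ~ensure_weechat_dirs~ are made up for this post, and the sketch assumes weechat follows the directory layout listed above.

```shell
#!/bin/bash
# Hypothetical helpers (not part of weechat): list and create the
# per-user directories weechat writes to, using XDG defaults when unset.
weechat_dirs() {
    printf '%s\n' \
        "${XDG_CONFIG_HOME:-${HOME}/.config}/weechat" \
        "${XDG_CACHE_HOME:-${HOME}/.cache}/weechat" \
        "${XDG_DATA_HOME:-${HOME}/.local/share}/weechat" \
        "${XDG_STATE_HOME:-${HOME}/.local/state}/weechat"
    # XDG_RUNTIME_DIR has no default, so only list it when it is set.
    if [ -n "${XDG_RUNTIME_DIR:-}" ]; then
        printf '%s\n' "${XDG_RUNTIME_DIR}/weechat"
    fi
}

# Create every directory printed by weechat_dirs.
ensure_weechat_dirs() {
    weechat_dirs | while IFS= read -r dir; do
        mkdir -p -- "${dir}"
    done
}
```

Calling ~ensure_weechat_dirs~ once before sandboxing means later bind mounts or file rules always have something to point at.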
* our options
There are multiple options available to do access control on Linux, but I'm going to cover namespaces (with bubblewrap), Apparmor and SELinux.
** bubblewrap
Bubblewrap is a small C utility used to set up mount namespaces for sandboxing and container purposes.
Mount namespaces are a way to control processes' views of mount points, meaning processes in different mount namespaces cannot see each other's mounts. A simple example would be mounting a USB drive to ~/mnt/usb~. If you mount the USB drive in a separate mount namespace, other processes will not see anything mounted to ~/mnt/usb~ at all.
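One way to check which mount namespace a process lives in is to read its ~/proc/<pid>/ns/mnt~ symlink; two processes see the same mounts only if the link targets match. The helper name ~mnt_ns_of~ is made up for illustration, and this sketch assumes a Linux system with ~/proc~ mounted.

```shell
#!/bin/bash
# Hypothetical helper: print the mount namespace identifier of a process.
# Pass a PID, or "self" for the calling process.
mnt_ns_of() {
    readlink "/proc/${1:-self}/ns/mnt"
}

# Processes started normally inherit their parent's mount namespace,
# so these two lines print the same identifier.
mnt_ns_of self
mnt_ns_of "$$"
```

A process started in a fresh mount namespace should report a different identifier than the shell that launched it.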
Bubblewrap works by creating a new mount namespace, creating a new root mount point on a tmpfs (similar to a chroot), and bind mounting in the directories and files given by its command line parameters.
Let's see an example of how we actually do this:
#+BEGIN_SRC
#!/bin/bash
args=(
--unshare-all
--share-net
--dev /dev
--proc /proc
--tmpfs /tmp
--tmpfs /run
--tmpfs /var
--tmpfs /mnt/sandbox
--ro-bind /usr /usr
--ro-bind /bin /bin
--ro-bind /sbin/ /sbin
--ro-bind /lib /lib
)
# handle lib32 and lib64 for some systems
[[ -e /lib32 ]] && args+=(--ro-bind /lib32 /lib32)
[[ -e /lib64 ]] && args+=(--ro-bind /lib64 /lib64)
# quote the expansion so arguments containing spaces survive intact
exec bwrap "${args[@]}" /bin/sh
#+END_SRC
Running this script should drop you into a shell in the sandbox.
You won't be able to access much since almost everything is mounted read-only, but there are writable tmpfs mounts. The tmpfs mounts do not persist across runs; their contents are deleted when the sandbox is destroyed.
This isn't super useful, but it shows a simple example. Now let's adapt this to run weechat!
#+BEGIN_SRC
#!/bin/bash
# setup the core bind mounts
args=(
--unshare-all
--share-net
--dev /dev
--proc /proc
--tmpfs /tmp
--tmpfs /run
--tmpfs /var
--tmpfs /mnt/sandbox
--ro-bind /usr /usr
--ro-bind /bin /bin
--ro-bind /sbin/ /sbin
--ro-bind /lib /lib
)
# handle lib32 and lib64 for some systems
[[ -e /lib32 ]] && args+=(--ro-bind /lib32 /lib32)
[[ -e /lib64 ]] && args+=(--ro-bind /lib64 /lib64)
# weechat specific bind mounts (make sure these exist before running the script)
args+=(
--tmpfs "${HOME}"
--bind "${HOME}/.config/weechat" "${HOME}/.config/weechat"
--bind "${HOME}/.cache/weechat" "${HOME}/.cache/weechat"
--bind "${HOME}/.local/share/weechat" "${HOME}/.local/share/weechat"
--bind "${HOME}/.local/state/weechat" "${HOME}/.local/state/weechat"
)
# quote the expansion so paths containing spaces survive intact
exec bwrap "${args[@]}" /usr/bin/weechat
#+END_SRC
Hopefully weechat starts up. Now it will only have read-only access to most of the system, and will not be able to access anything else in your ~${HOME}~, such as your ssh keys.
You may want to adapt this script to bind in other things, but this should at least give you a start.
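When binding in more directories, a small helper keeps the "create it first, then bind it" steps together. This is only a sketch: ~add_rw_bind~ is a name made up here, and it assumes the same global ~args~ array the scripts above pass to ~bwrap~.

```shell
#!/bin/bash
# Arguments that will eventually be passed to bwrap.
args=()

# Hypothetical helper: make sure a directory exists, then append a
# read-write bind mount for it (same path inside and outside).
add_rw_bind() {
    local dir=$1
    mkdir -p -- "${dir}"
    args+=(--bind "${dir}" "${dir}")
}

# Example (hypothetical path): expose a directory for file transfers.
# add_rw_bind "${HOME}/Downloads/weechat"
```

Because each path is stored as its own array element, directories with spaces in their names still reach ~bwrap~ as a single argument, as long as the array is expanded as ~"${args[@]}"~.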
There are some caveats with bwrap based sandboxing. The primary issue is that creating mount namespaces requires "root". You might wonder why you were able to run the scripts without root before: this is because bwrap created a user namespace.
User namespaces are similar to mount namespaces, but they unshare IDs rather than mount points. This means you can become UID 0 (root) inside of a sandbox, and perform actions that normally require root access, but outside of the sandbox you are still not-root and have no extra privileges.
User namespaces involve ID mapping. For example, UID 1000 may be mapped to UID 0 inside of the container. Most Linux systems also have a reserved range of IDs for each user, dedicated for mapping into user namespaces. My system has ~notroot:100000:65536~ dedicated for user ~notroot~, so the 65536 UIDs from 100000 to 165535 are reserved for this purpose. If you map 1000:0:1 and 100000:1:65535, files created inside of the sandbox by root will appear as owned by UID 1000 outside, and files owned by UID 1000 in the sandbox will be seen as UID 100999 outside. IDs that are not mapped will be seen as "nobody" inside of the sandbox.
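The arithmetic behind those numbers can be sketched in a few lines of shell. The function name ~inside_to_outside~ is made up for this post, and the mappings use the same ~outside:inside:count~ notation as the paragraph above.

```shell
#!/bin/bash
# Mappings in this post's outside:inside:count notation, taken from the text.
mappings="1000:0:1 100000:1:65535"

# Hypothetical helper: print the outside ID for an inside ID,
# or "unmapped" (and return 1) when no mapping covers it.
# Variables are left global to keep the sketch short.
inside_to_outside() {
    inside_id=$1
    for entry in ${mappings}; do
        outside_start=${entry%%:*}
        rest=${entry#*:}
        inside_start=${rest%%:*}
        count=${rest#*:}
        if [ "${inside_id}" -ge "${inside_start}" ] &&
           [ "${inside_id}" -lt "$((inside_start + count))" ]; then
            echo "$((outside_start + inside_id - inside_start))"
            return 0
        fi
    done
    echo "unmapped"
    return 1
}

inside_to_outside 0      # root inside the sandbox, prints 1000
inside_to_outside 1000   # UID 1000 inside the sandbox, prints 100999
```

The kernel's own view of a process's mapping is in ~/proc/self/uid_map~, whose columns are inside ID, outside ID and count, so the first two fields are swapped relative to the notation used here.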
ID mapping is confusing for me personally, but ~bwrap~ has some flags to help you set up trivial mappings that should work for a lot of simple use cases.
~bwrap~ can also unshare the ipc, pid, net, uts and cgroup namespaces, which all work similarly to the namespaces described above. They provide isolation for things beyond files, which is also an important aspect of sandboxing.
** apparmor
Apparmor is a "Linux Security Module" (LSM), and a mandatory access control (MAC) system.
MAC is different from discretionary access control (DAC) in that a central authority controls the rules, instead of owners of the resource.
Apparmor is a path based LSM. Apparmor profiles define a list of paths that a process can or can't access. The profile syntax supports glob-like "patterns" for matching specific paths that the process might try to access at runtime.
Let's show an example of an Apparmor profile for our IRC client:
#+BEGIN_SRC
#include <tunables/global>
profile weechat /usr/bin/weechat {
#include <abstractions/base>
# read only shared system resources
/etc/fonts/** r,
/usr/share/** r,
owner @{HOME}/.config/weechat/ rw,
owner @{HOME}/.config/weechat/** rw,
owner @{HOME}/.cache/weechat/ rw,
owner @{HOME}/.cache/weechat/** rw,
owner @{HOME}/.local/share/weechat/ rw,
owner @{HOME}/.local/share/weechat/** rw,
owner @{HOME}/.local/state/weechat/ rw,
owner @{HOME}/.local/state/weechat/** rw,
owner @{XDG_RUNTIME_DIR}/weechat/ rw,
owner @{XDG_RUNTIME_DIR}/weechat/** rw,
}
#+END_SRC
The first part of the profile simply includes a file that has "tunables" such as ~@{HOME}~ predefined. The ~#include~ syntax looks like the C preprocessor's, but it is handled by the Apparmor parser itself.
The second part of the profile (the ~profile weechat~ part) defines a profile for the ~/usr/bin/weechat~ executable. Apparmor transitions into confined mode when a process executes an executable that matches the ~/usr/bin/weechat~ pattern (globs are supported here).
The third part of the profile includes the ~base~ abstraction. ~base~ grants access to the basics that nearly every process needs in order to run, such as ~/usr/lib~ or ~/dev/null~. You can technically define these all yourself, but it's quite a lot of boilerplate, and ~base~ should work for most use cases.
The rest of the profile defines paths and patterns, and access rules for them. With this profile, weechat will only be able to access the paths you defined and the things defined in ~base~.
Apparmor is very simple and easy to get started with, but does have a few flaws.
The primary flaw is that apparmor is *path based* rather than *inode based*. Hardlinks to files could allow bypassing the apparmor rules, depending on the exact situation. Apparmor disallows creating links by default though, so the hardlinks would have to be created by something unconfined or something that was explicitly allowed to do so.
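To see why hardlinks matter for a path based system, here is a small sketch that creates two names for one file in a scratch directory (the file names are arbitrary). A rule written against the first path says nothing about the second, even though both reach the same data.

```shell
#!/bin/bash
# Create a file and a hardlink to it in a scratch directory.
scratch="$(mktemp -d)"
echo "secret" > "${scratch}/original"
ln "${scratch}/original" "${scratch}/alias"

# Two different paths, one inode: -ef is true when both names refer
# to the same file on the same device.
if [ "${scratch}/original" -ef "${scratch}/alias" ]; then
    echo "same inode, different paths"
fi
```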
By default, apparmor prevents you from being able to *execute* the paths you gave access to. There are a few ways to give *execute* permissions.
*** execute modes
- ~ix~ starts the subproc under the current profile
- ~ux~ starts the subproc unconfined
- ~px~ starts the subproc under a profile that matches the executable path
- ~cx~ starts the subproc under a subprofile
*** caveat
Until Linux 6.17, apparmor will not be fully functional without Ubuntu kernel patches.
The primary missing feature I am aware of is the ability to restrict access to unix sockets.
** selinux
Selinux is another MAC based LSM, but it is quite different from apparmor.
*** labels
Selinux access control works by labeling subjects (processes) and objects (files etc) with "types". For files, this information is stored in the file's xattrs. An example label is "~sys.id:sys.role:sys.subj:s0~".
Unlike apparmor, Selinux is inode based rather than path based, so hardlinks can't be used as loopholes.
The first part of the label is the *user*, the second is the *role* and the third is the *type*. Mostly we are going to ignore users and roles and focus on types for this.
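Since the fields are colon separated, pulling the type out of a label takes only a few parameter expansions. ~label_type~ is a made-up helper name; the trailing ~s0~ field in the example label is the MLS/MCS level, which we are ignoring here just like users and roles.

```shell
#!/bin/bash
# Hypothetical helper: print the type of an SELinux label,
# which is the third colon-separated field.
label_type() {
    label=$1
    rest=${label#*:}     # drop the user
    rest=${rest#*:}      # drop the role
    echo "${rest%%:*}"   # keep the type, drop the level
}

label_type "sys.id:sys.role:sys.subj:s0"   # prints sys.subj
```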
*** dssp5
This post is going to assume we are basing our policy on [[https://salsa.debian.org/dgrift/dssp5/][dssp5]], a minimal and modular base policy that we create our own types on top of. [[https://salsa.debian.org/dgrift/dssp5/][dssp5]] provides the core types.
*** built in types
[[https://salsa.debian.org/dgrift/dssp5/][dssp5]] provides many core types that we will build our policy on top of.
An example of a core type is ~home.file~. This is a type applied to home directories such as ~/home/john~. There are many base types for various parts of the filesystem.
Here are some major built in types:
- ~conf.file~ for ~/etc~
- ~lib.file~ for ~/usr/lib~
- ~exec.file~ for ~/usr/bin~
- ~run.file~ for ~/run~
- ~var.file~ for ~/var~
There are also "subtypes" for some of these built in types like ~spool.var.file~ for ~/var/spool~.
*** how do files get typed
**** setfiles
Mount points will be labeled with ~setfiles~, and any new files created underneath that mount point should inherit the label by default. This is the default label for files that don't have a filecon defined.
**** filecon
In Selinux policy you will define ~filecon~ expressions like this (ignore the other parts for now):
#+BEGIN_SRC
(block var
(blockinherit .file.template)
(filecon "/var" dir file_context)
(filecon "/var/.*" file file_context))
#+END_SRC
After compiling and loading the policy, you would use the built in ~restorecon~ command to apply these labels.
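The reason the example needs two ~filecon~ lines is that each pattern has to match the whole path, so ~/var/.*~ does not cover ~/var~ itself. The sketch below approximates this with ~grep -E~; ~filecon_matches~ is a made-up helper, it ignores the file type field (~dir~, ~file~), and ~grep~ is only a stand-in for the matching SELinux really performs.

```shell
#!/bin/bash
# Hypothetical helper: succeed when a path matches a filecon-style regex.
# The pattern is anchored at both ends so it must match the whole path.
filecon_matches() {
    printf '%s\n' "$2" | grep -Eq "^($1)\$"
}

filecon_matches "/var" "/var"        && echo "/var matches /var"
filecon_matches "/var/.*" "/var/log" && echo "/var/log matches /var/.*"
filecon_matches "/var/.*" "/var"     || echo "/var does not match /var/.*"
```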
**** type transitions
Files can also change types via type transitions at runtime. For example, we want all of the runtime files weechat creates to be labeled ~agent.weechat~ or similar, so we define a type transition in the weechat selinux module:
#+BEGIN_SRC
(call .agent.weechat.run.file_type_transition_file (.agent.weechat.subj dir "weechat"))
(call .agent.weechat.run.file_type_transition_file (.agent.weechat.subj file "*"))
#+END_SRC
(Don't worry if you don't understand this yet, we will learn more about the *cil* language in a bit.)
Another example would be transitioning from one context to another when executing something. In our later policy, running the weechat executable causes a type transition from ~sys.subj~ to ~weechat.subj~.
*** how do processes get typed
With dssp5, processes will start in the ~sys.subj~ context which is basically unconfined and has access to everything. Processes change types via type transitions or with ~runcon~. We will go over type transitions a bit more later when we define the weechat module.
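To see which context a process is currently running in, you can read its ~/proc/<pid>/attr/current~ file. The helper name ~context_of~ is made up, and the fallback to "unknown" is an assumption for systems where the file cannot be read (for example when no LSM that uses it is active).

```shell
#!/bin/bash
# Hypothetical helper: print the security context a process runs under,
# or "unknown" when it cannot be read. Pass a PID, or "self".
context_of() {
    ctx="$({ tr -d '\0' < "/proc/${1:-self}/attr/current"; } 2>/dev/null)" || true
    echo "${ctx:-unknown}"
}

context_of self
```

With dssp5 I would expect an ordinary shell to report a label whose type is ~sys.subj~, and a process that went through a type transition to report its new type instead.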
*** cil overview
Cil is the language we will write policy in. It's a simple sexpr based language, with namespaces, types, typeattributes (metatypes), macros and templates.
**** cil types
We can define types like this:
#+BEGIN_SRC
(type foo)
#+END_SRC
**** cil namespaces
In cil we will almost always be working in a namespace.
We can define a namespace with the block keyword:
#+BEGIN_SRC
(block foo
(block bar))
#+END_SRC
If a block has already been created and you want to "enter" it, you use the "in" keyword:
#+BEGIN_SRC
(in .foo.bar)
#+END_SRC
You access types with the ~.~ operator. A dot at the beginning of the expression starts searching from the "top" namespace rather than looking for a type in the current namespace.
#+BEGIN_SRC
(in foo.bar
(macro baz ((type ARG1))
(do_something_with ARG1))
;; define a type
(type qux)
;; call our macro using local lookup
(call baz (qux))
;; call our macro using global lookup
(call .foo.bar.baz (.foo.bar.qux)))
#+END_SRC
We will make great use of namespaces in our policy!
**** macros
Macros are sort of like functions. Macros "capture" local types similar to lambdas and interpolate parameters into expressions.
#+BEGIN_SRC
(block foo
;; bar is a typeattribute so that typeattributeset below can add types to it
(typeattribute bar)
;; define our macro (we will cover typeattributes soon)
(macro test ((type ARG1))
(typeattributeset bar ARG1)))
(block baz
(type qux)
;; call our macro
(call .foo.test (qux)))
#+END_SRC
**** templates
Templates are blocks that are inherited by other blocks.
Abstract blocks are blocks which only exist once they are inherited.
You can think of abstract blocks as working like inheritance in OOP: the template defines things once, and every block that inherits it gets its own copy.
#+BEGIN_SRC
(block foo
;; define our abstract block (template)
(block bar
(blockabstract bar)
;; define a type
(type t)))
(block baz
;; inherit the bar block, now the t type will be created and in scope
(blockinherit .foo.bar)
(dothing t))
#+END_SRC
Hint: abstract blocks are very commonly used to define types, so you will often not be defining ~(type foo)~ directly, but instead letting the templates do the work for you.
We will make great use of the built in templates for almost everything we do.
**** type attributes
Type attributes are like "metatypes". They are used to group types together for shared behaviour.
An example here:
#+BEGIN_SRC
(in file
(block user
(macro type ((type ARG1))
;; since it's a macro we can use things before they are defined
(typeattributeset typeattr ARG1))
;; create the type attribute
(typeattribute typeattr)
;; our typeattr can be associated with another one as well
(call .file.home.type (typeattr))
(block base_template
(blockabstract base_template)
(blockinherit .file.base_template)
;; remember the file type is introduced via the template
;; associate the file type with the .file.user.typeattr
(call .file.user.type (file)))))
(block ssh
(blockinherit .file.user.base_template)
;; file and file_context are also introduced via .file.user.base_template, which inherits
;; from .file.base_template (layering templates like this is important for
;; abstracting out boilerplate)
(filecon "HOME_DIR/\.ssh" dir file_context)
(filecon "HOME_DIR/\.ssh/.*" file file_context))
(block gpg
(blockinherit .file.user.base_template)
(filecon "HOME_DIR/\.gnupg" dir file_context)
(filecon "HOME_DIR/\.gnupg/.*" file file_context))
;; Now we can give something access to all userfiles instead of listing each type.
(block userdel
(blockinherit .subj.template)
;; allow access to all userfiles including ssh and gpg files
(call .file.user.type (subj)))
#+END_SRC
A good example of the usefulness of type attributes is the program ~userdel~, which needs access to ~${HOME}~ and all user files underneath. If each type (ssh, gpg, foo) were not associated with the ~file.home.typeattr~ (via associating with ~.file.user.typeattr~), policy for ~userdel~ would need to manually allow each type for it to do its job.
Typeattributes are one of the most important things for abstracting out behavior. You can create hierarchies of types in a way similar to OOP.
**** type transitions
Type transitions are rules in policy that control how types change at runtime. A common desire would be to have files created by weechat end up with a weechat label, or entering a new context when executing something.
I do not fully understand how these work internally, but I will show an example of how to do this:
#+BEGIN_SRC
(block weechat
(block run
(macro file_type_transition_file ((type ARG1) (class ARG2) (name ARG3))
(call .user.run.file_type_transition (ARG1 file ARG2 ARG3)))
;; inherit the template for files in ${XDG_RUNTIME_DIR} (usually /run/user/${UID})
(blockinherit .file.user.run.template)))
(call .agent.weechat.run.file_type_transition_file (.agent.weechat.subj dir "weechat"))
(call .agent.weechat.run.file_type_transition_file (.agent.weechat.subj file "*"))
#+END_SRC
This will cause files created in the weechat context under ~${XDG_RUNTIME_DIR}~ (usually ~/run/user/${UID}~) to be transitioned into the ~agent.weechat.run.file~ type, rather than the "default" ~user.var.file~ (the default type depends on your policy, this is just an example).
*** policy
Let's write some policy now!
**** xdg directories
We want to create some new types for the directories weechat requires access to.
#+INCLUDE: "access-control/xdgfile.cil" src
**** loading policy
You can load dssp5 policy up with:
#+BEGIN_SRC
make modular_install
#+END_SRC
Next run ~restorecon~ to apply our new labels (this could take a while):
#+BEGIN_SRC
restorecon -Rv /
#+END_SRC
If everything went as planned you should be able to use ~ls -alZ ${HOME}~ to see your new labels.
**** weechat policy
Define policy for weechat itself:
#+INCLUDE: "access-control/weechat.cil" src
In dssp5 you will notice that we rarely write ~allow~ rules directly; instead we use macros and templates to do the heavy lifting when we can. The templates and macros can be a little confusing at first, but they make sense once you start to use them for your modules.
Selinux is by far the most verbose of the options I listed, but also the most powerful and flexible, and IMO the most fun.
**** todo
For your real policy you want to create abstractions for common behaviour to cut down on the boilerplate.
A large part of the weechat module could be abstracted out into a new .subj.common module. This would cover common behavior like accessing your own files and accessing things that every process will need, like the dynamic loader and system libraries.
With dssp5 it's up to you to build up abstractions; it only provides a base.
* questions
If you have any questions or problems you can email me (my contact info is on my front page), or join the ~#selinux~ channel on [[https://irc.libera.chat]].