-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC]: IMA Namespace support #3639
base: main
Are you sure you want to change the base?
Conversation
@@ -82,10 +91,23 @@ type Namespace struct { | |||
Path string `json:"path"` | |||
} | |||
|
|||
// GetPath() concatenates given pid and namespace type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: GetPath returns a procfs path for the given POD and namespace type.
@@ -82,10 +91,23 @@ type Namespace struct { | |||
Path string `json:"path"` | |||
} | |||
|
|||
// GetPath() concatenates given pid and namespace type | |||
// into procfs path. A few namespaces may not have procfs entry, | |||
// call HasProcfs() first to make sure this namespace has one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/HasProcfs()/HasProcfs/
func (n *Namespace) GetPath(pid int) string { | ||
return fmt.Sprintf("/proc/%d/ns/%s", pid, NsName(n.Type)) | ||
} | ||
|
||
// HasProcfs() checks if the given namespace has procfs entry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/HasProcfs() checks if/HasProcfs tells whether/
// These flags must be the same according to | ||
// libcontainer/nsenter/namespace.h | ||
var namespaceNonCloneInfo = map[NamespaceType]int{ | ||
NEWIMA: 0x80000000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- cross-i386 CI job complains:
cannot use 0x80000000 (untyped int constant 2147483648) as int value in map literal (overflows)
So I guess we have to use explicit uint64 here.
- I'd rather not use a map here, but write a function with a switch statement, somewhat similar to HasProcfs above, mostly because a map initialization incurs some init overhead, and in case of a single element it is not worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe uint32
, or uint
(depending on what type the kernel is actually using for this).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... and modify bootstrapData (both sides) and data structures accordingly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
depending on what type the kernel is actually using for this
The kernel uses nothing for this (as I've written there's no CLONE_*
or any other similar flags for IMA). This value (0x80000000
) is just the next bit after CLONE_NEWNET
. runC can decide to use any flag or even not use a flag.
I'd rather not use a map here, but write a function with a switch statement
Interesting idea, I'll try to modify code this way.
Thanks!
return errors.New("IMA namespace cannot be created without USER namespace") | ||
} | ||
|
||
if config.Namespaces.Contains(configs.NEWUSER) || config.Namespaces.Contains(configs.NEWIMA) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check for NEWIMA here is redundant, as we error out above if NEWIMA is set and NEWUSER is not.
Also note that Contains
is kinda expensive since it does a linear search, so try to optimize the code to not call it 4 times here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, I think it is possible to move it to the else
statement for NEWUSER
check like this:
if NEWUSER {
// ...
} else {
if NEWIMA {
return errors.New(...)
}
}
This way Contains
will be called only one more time if NEWIMA
is specified and NEWUSER
isn't. I guess this will be sufficient.
Thanks!
libcontainer/nsenter/nsexec.c
Outdated
|
||
strncat(ima_active, IMA_SECURITYFS_REL_MNT, PATH_MAX); | ||
mntflags = MS_NOATIME | MS_NODEV | MS_NOEXEC | MS_NOSUID | MS_RELATIME; | ||
err = mount("securityfs", ima_active, "securityfs", mntflags , NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this code assumes that there is a mnt
directory under container rootfs? This depends on the distro and might not always be the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right this depends on a distro, but most of them follow FHS (Filesystem Hierarchy Standard) which includes /mnt
directory. Anyway, now I see such a solution:
- Check whether rootfs contains
/mnt
directory (may be any other directory) - If not - create it
- Do mount securityfs and enable IMA namespace
- Unmount securityfs
- Remove
/mnt
directory - otherwise use
/mnt
as it is already presented
Another way I see is to create a temporary directory like .secfs
within current rootfs, and after enabling namespace remove it. But I guess this may affect perfomance as we invoke disk I/O.
Thanks!
libcontainer/nsenter/nsexec.c
Outdated
if ((strnlen(ima_active, PATH_MAX) + ima_active_path_len) > PATH_MAX) | ||
bail("ima active path is too long"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps move this check to before doing mount()
? Also, since we can assume (and do a run-time assert in the code) that ima_active_path_len > strlen(IMA_SECURITYFS_REL_MNT)
, we can omit the first length check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that we can completely omit the first check len(ima_active) + len(IMA_SECURITYFS_REL_MNT) > PATH_MAX
, replacing it with len(ima_active) + ima_active_path_len > PATH_MAX
, because before write_file
we'll do strncat(ima_active, IMA_ACTIVE_PATH)
, where ima_active
includes IMA_SECURITYFS_REL_MNT
.
But you're right it is possible for the second check to do it before mount
like this:
if ((len(ima_active) + len(IMA_SECURITYFS_REL_MNT) + ima_active_path_len) > PATH_MAX)
bail(...)
This will tell us whether len('/path/to/rootfs') + len('/mnt/') + len('integrity/ima/active')
is more than PATH_MAX
. So, no need to do any length check after mount
.
Thanks!
libcontainer/nsenter/nsexec.c
Outdated
strncat(ima_active, IMA_ACTIVE_PATH, PATH_MAX); | ||
err = write_file(&data, 1, ima_active); | ||
if (err < 0) | ||
bail("cannot set up IMA namespace"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here (and above) you are bailing without an unmount. This might or might not be OK to do, need to check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be OK, because we can mount procfs
, sysfs
, etc. including secfs
from newly created USER namespace, but we also must have mount
namespace to be able to automatically unmount all mount points when MOUNT namespace is being destroyed (that means when runC bails it will kill the process that's holding MOUNT namespace). Otherwise mount points will be in the init_mnt_ns
, so no unmount will be done for those points.
Since, it's not possible to mount securityfs
without new MOUNT namespace (mount
will return EPERM
) inside a new USER namespace all mount points will be automatically unmounted during namespace destruction process.
Thanks for pointing this out! Here may be the case that IMA namespace need to check MOUNT namespace is also specified, otherwise mount
will return an error. I will post updates if this is really the case, also need to check.
Added new "ima" namespace type for config.json configuration file for runc, e.g.: ... "namespaces": [ ... { "type": "ima" }, ... ], ... IMA namespace will be created only if USER namespace is, too, specified, otherwise runc returns error, so the only allowed scheme for enabling IMA namespace is: ... "namespaces": [ ... { "type": "user" }, { "type": "ima" }, ... ], ... otherwise runc returns an error. Signed-off-by: Ilya Hanov <[email protected]>
3826b14
to
94b23d5
Compare
Review changesAccording to @kolyshkin comments above:
Please, let me know if I missed something. Thanks! |
The Linux kernel community is now working on supporting IMA namespaces
and it is almost done. This is a new kernel feature that allows isolation of Platform Configuration Register (PCR) values, Measurement Logs (ML), etc. So this patch presents an initial support for IMA
namespaces. The related issue is #3638.
In the Linux kernel IMA namespace is created via securityFS instead of
using
clone()
orunshare()
system calls, e.g.:It is important to note that IMA namespace has no
CLONE_*
or othersystem call flags, therefor "Non-clone flags" was presented.
Sure, this does not look like a good architecture solution, we can
discuss it anyway.
Another important side is that IMA namespace has no procFS entry, so
runC needs to know how to check IMA namespace, e.g. during validation processes.
So, the solution here is to redirect this check to USER namespace, but because runC is not
aware of namespaces which are not in procFS this solution looks buggy too, we should discuss it as well.
Now, IMA namespace creation scheme looks like:
As already mentioned IMA namespace must be created in a new USER
namespace, so the order is important for USER namespace, it should be
unshared first, but this should not affect the creation's order of
remain namespaces. Now IMA namespace created after unsharing USER
namespace and other namespaces except Cgroup namespace. We can discuss
it anyway.
Here is an example of creating IMA namespace:
Since this is RFC unit tests and documentation weren't touched.
Signed-off-by: Ilya Hanov [email protected]
Advanced Software Technology Lab
Huawei