Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create temporary files and folders #10966

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Semphriss
Copy link
Contributor

Description

This PR adds three new functions to create temporary files and folders:

SDL_IOStream *SDL_SYS_CreateSafeTempFile(void)
char *SDL_SYS_CreateUnsafeTempFile(void)
char *SDL_SYS_CreateTempFolder(void)

The difference between the "safe" and "unsafe" versions of creating a temporary file is because using paths opens the door to timing attacks, including (but not only) time-of-check-to-time-of-use (TOCTOU). For any purpose requiring security, paths should not be used, but since temporary paths are sometimes needed more than security, I've included the unsafe version as well.

Currently, only Unix has an implementation.

Existing Issue(s)

Closes #10841

@slime73
Copy link
Contributor

slime73 commented Sep 27, 2024

What are the cross-platform guarantees about their lifetimes (including when the app crashes rather than gracefully closes)?

@Semphriss
Copy link
Contributor Author

For the SDL API, I haven't made any guarantee, since I'm not familiar enough (yet, at least) with what guarantees are provided by other systems to decide which ones are reasonable to apply to SDL.

In the specific case of Unix:

  • SDL_SYS_CreateSafeTempFile: Uses tmpfile under the hood, for which my man page says: "The file will be automatically deleted when it is closed or the program terminates." tmpfile(3)
  • SDL_SYS_CreateUnsafeTempFile: Uses mkstemp under the hood. The man page does not say anything, but experimentation shows that on my system, the file survives termination of the parent.
  • SDL_SYS_CreateTempFolder: Uses mkdtemp under the hood; same implications as above.

@slime73
Copy link
Contributor

slime73 commented Sep 27, 2024

What I've read in the past is that Windows doesn't really have safe/possible cleanup of temporary data if an app crashes, so that's what I'm worried about I guess if people start relying on it. Especially if people create unique temporary paths which would then fill up a user's HD.

I'd also want to know how the lifetimes behave on mobile platforms, where users are not encouraged to quit apps themselves.

@slime73
Copy link
Contributor

slime73 commented Sep 27, 2024

SDL_SYS_CreateUnsafeTempFile: Uses mkstemp under the hood. The man page does not say anything, but experimentation shows that on my system, the file survives termination of the parent.

Maybe it could be named to indicate that the file will be unique but not temporary? The name feels kind of misleading as-is.

@Semphriss Semphriss force-pushed the tmp_folder branch 2 times, most recently from 72788a7 to 1ebb531 Compare September 28, 2024 15:40
@Semphriss
Copy link
Contributor Author

Maybe it could be named to indicate that the file will be unique but not temporary? The name feels kind of misleading as-is.

I've named it "temp" because that's how systems chose to name the concept (tmpfile, mkstemp, /tmp, C:\Users\Someone\AppData\Local\Temp, GetTempFileName, GetTempPath, etc). I think it's named "temporary" as in "This file's usage will be temporary [and does not need to persist on the disk like other files]", not as in "This file's existence must be kept as temporary as possible"; the system decides when the temporary files are deleted, most often with a reboot or on user request.

@slime73
Copy link
Contributor

slime73 commented Sep 28, 2024

Yeah, historical naming of those lower level APIs have temp in the name - it's just pretty confusing to me as a user to have an API with the name 'temporary' that creates something which isn't temporary and requires the caller to clean it up manually. Especially when it's right next to another API with the name 'temporary' that creates something which is supposedly truly temporary. IMO SDL shouldn't be required to follow the naming of POSIX APIs here.

Or alternatively, what sort of use cases do temporary files that aren't automatically cleaned up have which aren't as feasible with a regular file?

@Semphriss
Copy link
Contributor Author

I'm not sure how exclusively historical the naming is, given that additions as recent as 2008 are still named with "temp" without any additional note in the documentation. It's also not exclusive to POSIX; the same concept is used at least on Windows. As for SDL_CreateSafeTempFile, the files do not need to be deleted simply as a side effect of not having any path that could be passed to SDL_RemovePath. None of the mechanisms other than this one have any builtin auto-deletion-on-close, although that doesn't make them permanent either; the system is always the one that takes care of deleting the files, probably on a reboot. Additionally, all systems I could check on have a "Delete all temporary files" button.

One notable use case where temporary files would be needed without holding open file descriptors on every temporary file for their entire lifetime would be pretty much any scenario involving SDL_CreateTempFolder (which cannot have a similar lifetime bound to an open fd, and is therefore inherently "unsafe" in that sense). For example, extracting a downloaded archive file to use its contents, AppImage-style but not necessarily just for executables, would require preserving the directory trees for the contents of the archive, and should preferably not require holding up to several thousand open file handles simultaneously, which would be resource-consuming and is often subject to system-imposed restrictions (ulimit -n).

Another use case is temporary files that must be used by multiple processes.

I wouldn't necessarily oppose a better naming, but I'm concerned about people who already are familiar with temporary files in the current definition, and who would be confused when seeing a different naming. If the name is to be changed, I would recommend double-checking that the change is more beneficial than confusing.

@Semphriss
Copy link
Contributor Author

The Unix implementation should work across all Unix-likes, including Haiku, macOS, POSIX, and so on; how should I handle this? Should I copy-paste the code, or add another file for it?

@slouken
Copy link
Collaborator

slouken commented Oct 12, 2024

The Unix implementation should work across all Unix-likes, including Haiku, macOS, POSIX, and so on; how should I handle this? Should I copy-paste the code, or add another file for it?

Like the process subsystem, you can create a windows folder and a posix folder and all the UNIX-likes can use that.

@Semphriss Semphriss force-pushed the tmp_folder branch 2 times, most recently from c731eb2 to 9c67348 Compare October 14, 2024 19:13
@Semphriss Semphriss marked this pull request as ready for review November 24, 2024 20:57
@Semphriss
Copy link
Contributor Author

I believe the implementation is pretty much done, and it would be ready for review.

@icculus
Copy link
Collaborator

icculus commented Nov 26, 2024

What's the use-case for a file that deletes on close that you can't get the path of?

@Semphriss
Copy link
Contributor Author

I don't use temporary files in C a lot so I may not be the best source, but since nobody replied, I'll answer to the best I can:

When I researched for the most secure method of creating a temporary file, I found that several documentation sources (POSIX, the GNU manual, etc.) point to tmpfile() as being the safest way to create a temporary file. I must admit that I took its existence in a standard as evidence that it is useful in some scenarios, but I don't have a lot of personal experience with temporary files in C. Based on a quick research, it seems that the most interesting use cases is when an application has to deal with large amounts of data that wouldn't fit in memory, compatibility with filesystem-based APIs and inter-process communication (parent-to-child and inversely).

@smcv
Copy link
Contributor

smcv commented Jan 8, 2025

This is related to #11887 and would help to avoid similar /tmp symlink attacks in other parts of SDL (although for #11887, the vulnerable code already has GLib and GTK dependencies, so it was just as easy to use GLib's g_mkdtemp() and avoid a dependency on this feature).

* tasks using the temporary folder. If you need one or more temporary files
* for sensitive purposes, use SDL_CreateSafeTempFile().
*
* The path string is owned by the caller and must be freed with SDL_free().
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When in doubt, I think this is the one to recommend. That way, it doesn't matter whether the caller uses atomic or non-atomic I/O inside the temporary directory. Typical stdio-like functions like SDL_SaveBMP() are non-atomic, because it's difficult to do anything other than that on Windows.

I'd suggest calling it a directory rather than a folder, consistent with SDL_GlobDirectory. On Windows, there is a subtle distinction between directories (things that genuinely exist in the filesystem) and folders (a UI construct that behaves like a directory but might be virtual), and this one is specifically a directory. On Unix, folders are not a concept that really exists at all, except perhaps in UIs and user-facing documentation: there are only directories.

any program running as the same user as your program can access the contents of the folders and the files in it

This is equally true for anything that you write to the filesystem, temporary or not!

On Unix, mkdtemp() creates the directory with permissions 0700 (private to the user), so other users cannot interfere with its contents. A typical Unix security model is that all programs running as the same user ID are equally-privileged (they can trace each other, modify files that they trust, and so on) so there is no defending against attacks coming from the same user ID anyway, unless the programs are sandboxed (like Flatpak) in which case they should be using a separate /tmp. The interesting security boundary is that it's necessary to defend against attacks from a different user ID - either a second human user of a multi-user machine, or a privilege-separated system user like avahi, daemon or www-data.

I don't know what the security properties are on Windows, but I'm pretty sure any attempt to defend against attacks coming from a non-sandboxed program running as the same logged-in user would just be security theatre, similar to the situation on Unix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any program running as the same user as your program can access the contents of the folders and the files in it

This is equally true for anything that you write to the filesystem, temporary or not!

Actually, tmpfile() on Unix is specifically designed to help against this. The file does not actually exist on the filesystem; the file descriptor is not linked anywhere on the filesystem, and cannot even be identified by a different process. That's what SDL_CreateSafeTempFile() uses under the hood on Unix.

Hopefully, the name "Unsafe" sounds scary enough to encourage someone to at least take a look at the documentation :)

Comment on lines +494 to +502
* Create a temporary file, with less security considerations.
*
* Unlike SDL_CreateSafeTempFile(), this function provides a path, which can
* then be used like any other file on the filesystem. This has security
* implications; an attacker could exploit the small delay between the moment
* the name is generated and the moment the file is created to create the file
* first and give it undesirable attributes, such as giving itself full
* read/write access to the file, or making the file a symlink to another,
* sensitive file.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function, as documented here, is too dangerous to be useful: the author of a SDL game cannot know whether their game will be run on a multi-user system where another user is attempting to carry out a symlink attack.

Instead, I would suggest modelling it on Python's NamedTemporaryFile, which returns (an object containing) a file handle and a path:

This function operates exactly as TemporaryFile() does, except that the file is guaranteed to have a visible name in the file system (on Unix, the directory entry is not unlinked). That name can be retrieved from the name attribute of the returned file-like object.

For example a good signature might be: SDL_IOStream *SDL_CreateNamedTempFile(char **name_out) (and perhaps SDL_IOStream *SDL_CreateTempFile(void) could just be implemented as syntactic sugar for SDL_CreateNamedTempFile(NULL)).

https://docs.gtk.org/glib/func.mkstemp.html is another implementation of the same idea.

It would be a good idea to document what you can safely do with the temporary file by its path (read it, edit it in-place, or overwrite it), and what you cannot safely do with it (after you delete it, even if it's only for a moment, it is no longer safe to reuse that path because an attacker could have hijacked it).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The moment paths are involved, a temporary file is no longer "safe" in at least one way, because other processes can refer to it and at least read it/write to it (I believe no environment enforce file locks? Linux provide locks, but they require voluntary implementation). I'm worried about developers thinking that the function is merely a more convenient path-based alternative to the current SDL_CreateSafeTempFile, rather than an insecure alternative to be used by APIs that are inherently insecure, like AppIndicator.

I would prefer to stick to the "safe"/"unsafe" naming to make sure this is properly conveyed to people who eagerly copy-paste code without reading the documentation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an insecure alternative to be used by APIs that are inherently insecure

What threat model do you have in mind here - what are you assuming that the attacker can and can't do? (It isn't ever really possible for a component to be "secure" in the abstract, but it can be secure against a specific threat model.)

a temporary file is no longer "safe" in at least one way, because other processes can refer to it and at least read it/write to it

Other processes that are running as the same user can do that, yes; but in the typical Unix security model, there is already no security boundary between processes that are running as the same user. They can ptrace() or kill() each other; they are reading and trusting the same configuration files in $HOME and state in $XDG_RUNTIME_DIR; IPC services like X11, Wayland and D-Bus treat them as essentially equivalent; and so on. The threat model that Unix is/was designed for is that you trust yourself and root, but other users are assumed to be hostile.

Other users can be prevented from reading and writing files by setting appropriate permissions on those files, so if your threat model is that the attacker is running code as a different user ID, "all" we need to do is to avoid time-of-check/time-of-use attacks. Once the file is in place and has our desired permissions, other users can't mess with it - unless we delete it, in which case we no longer "own" that path and all bets are off.

If you have a different threat model in mind, then please describe it, and we should evaluate these APIs against that threat model as well, but it's entirely possible that the answer will be: if that is your threat model, then you have already lost, because the OS was not designed for it.

Copy link
Contributor

@smcv smcv Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe no environment enforce file locks? Linux provide locks, but they require voluntary implementation

Yes, mandatory locking isn't implemented on Unix, and the advisory locking that does exist is a data-integrity mechanism rather than a security mechanism (it doesn't prevent an adversary from attacking you, all it does is to let two cooperating processes agree on how to avoid data loss).

Linux used to have mandatory locking on some filesystems, but it was removed, because it was considered to be a denial-of-service security vulnerability (a malicious user with read access could prevent another user from accessing files).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[...] in the typical Unix security model, there is already no security boundary between processes that are running as the same user.

Actually, on Linux at least, there are some kernel settings that can provide better protection. For example, the YAMA module provides /proc/sys/kernel/yama/ptrace_scope, which allows different levels of restrictions on which process can ptrace which other. It provides 4 levels of security (0 = any process with the same uid, 1 = only parent processes, 2 = only processes with CAP_SYS_PTRACE, 3 = deactivated entirely). On my Ubuntu 24.04, it defaults to a value of 1.

The linked documentation states, towards the beginning, some of the issues with processes having the ability to read the memory of other processes with the same uid. That's the threat I'd like to avoid: leaking potentially sensitive information to other processes. ptrace seems to be mitigated to a certain degree on modern systems, and kill wouldn't leak information from temporary files created with tmpfile(), which deletes on close. Other possible attacks have their own mitigation, though all those I can think of is within the control of the OS, the environment, the app developer, the end user, or an entity other than SDL.

I understand SDL can't change how operating systems work, but operating systems may provide settings to adjust how they work. However, they can't reasonably provide settings to change how SDL works, so if we choose to ignore the problem and leave the issue there, there isn't much that developers and end users can do about it, short of patching SDL manually. I'd rather have SDL not be the cause of the issue, even if some OSes will hold some issues out of SDL's control.

Comment on lines +508 to +509
* OS-specific format, and is guaranteed to finish with a path
* separator.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filenames don't end with a path separator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy-pasting bug, my bad :)

Comment on lines +44 to +45
/* TODO: Check for possible alternatives to /tmp, like $TMP */
char template[] = "/tmp/tmp.XXXXXX";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's conventional to try $TMPDIR before /tmp, but not usually $TMP.

The template can be static const char template[] if you're going to strdup it anyway.

Comment on lines +62 to +63
path, to avoid issues. In this function, security is assumed to be
unimportant, so no need to worry about it. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, on a multi-user system, security is never unimportant: even if the data you're saving is unimportant, symlink attacks like the one described in #11887 can be used to destroy unrelated data owned by the same user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've called the function "unsafe" to convey that :) This function is meant for interoperability with software that require a path and accept nothing else, for example AppIndicator.

Normally, the symlink attack is mitigated within the function by mkstemp and outside the function by the developers being careful with their code - surely the temp path isn't the only place where an arbitrary filename can be written to. If it can be relevant, I can write that in the documentation.

/* Normal usage of mkstemp() would use the file descriptor rather than the
path, to avoid issues. In this function, security is assumed to be
unimportant, so no need to worry about it. */
/* See https://stackoverflow.com/questions/27680807/mkstemp-is-it-safe-to-close-descriptor-and-reopen-it-again */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tl;dr: "no,it is not safe".

Comment on lines +72 to +73
/* TODO: Check for possible alternatives to /tmp, like $TMP */
char template[] = "/tmp/tmp.XXXXXX";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comments as for the function that uses mkstemp()

Comment on lines +392 to +393
/* Note: Someone watching the temp folder could create a file with a
clashing name before it is re-created (TOCTOU). However, this shouldn't
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know the security implications of this on Windows. Is the temp directory per-user (good), or is it shared like /tmp is (bad, unless carefully mitigated)?

Someone who knows Windows' security properties should review this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The temporary folder path is something like C:\Users\User\AppData\Roaming\Temp\ -- the exact path might be slightly different, but it's within the current user's home directory, so presumably they are separate for each user. I think even the system has its own temporary folder.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If filesystem permissions prevent a (non-privileged) user from accessing a different user's temp - e.g. C:\Users\Alice\AppData\Roaming\Temp is inaccessible by Bob - then it isn't necessary to be as paranoid as when dealing with Unix /tmp.

Comment on lines +403 to +406
if (!DeleteFileW(tmp_file)) {
WIN_SetError("Couldn't delete the file created by GetTempFileNameW");
return NULL;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this works anything like it does on Unix, you probably shouldn't delete the file? Better to have the possibility of failing than to be subject to TOCTOU attacks.

glib/gfileutils.c in GLib might be a useful reference for how other people have solved this, although it can't be copied directly unless its copyright holders give permission to relicense (it's LGPL).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per the docs:

Creates a name for a temporary file. If a unique file name is generated, an empty file is created and the handle to it is released; otherwise, only a file name is generated.

Deleting the file and re-opening it has two purposes:

  1. Detecting if the file name really was unique and could be created. Otherwise, the file couldn't be created, likely because a different, similarly-named file already exists; therefore this name shouldn't be used.
  2. For the TOCTOU issue, the problems already exists by the fact that Win32 closes the file, and that between the time Win32 closes the file and SDL re-opens it, an attacker has the time to change the file.
    I've solved it by passing spacial arguments when re-creating the file: CREATE_NEW ensures that the file doesn't already exist before opening it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SDL_filesystem: get temporary folder
6 participants