let memory never written to be zero #66

Draft
wants to merge 1 commit into
base: main
from

morganthomas
Collaborator

This PR allows programs running in the emulator to reference memory before they write it, with the result being zero. I'm not sure how necessary this is; it's come up, but only in buggy code. I don't think programs can normally rely on memory being set to a value before it's written.
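
A minimal sketch of the behavior described above, not the emulator's actual code: it assumes a sparse memory keyed by address, and the names `Memory`, `store`, and `load` are hypothetical. A cell that was never written simply reads as zero instead of triggering a panic.

```rust
use std::collections::HashMap;

/// Hypothetical sparse memory: only cells that have been written are stored.
struct Memory {
    cells: HashMap<u32, u32>,
}

impl Memory {
    fn new() -> Self {
        Memory { cells: HashMap::new() }
    }

    /// Record a write at `addr`.
    fn store(&mut self, addr: u32, value: u32) {
        self.cells.insert(addr, value);
    }

    /// Read `addr`; a cell that was never written reads as zero.
    fn load(&self, addr: u32) -> u32 {
        self.cells.get(&addr).copied().unwrap_or(0)
    }
}
```

With this sketch, `load` on a fresh `Memory` returns 0 for any address, and returns the stored value once `store` has been called for that address.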

@dlubarov
Collaborator

Hmm, I was under the impression that loading from uninitialized memory was undefined behavior, in which case panicking seems valid (and I think failing fast should generally be preferred when it's valid).

Maybe I was mixed up though - does it actually result in an undefined value rather than undefined behavior? If so, I think returning 0 as you suggest makes sense.

@morganthomas
Collaborator Author

Accessing uninitialized memory has undefined behavior in general. In practice it will result in an undefined value for memory within a segment owned by the process, and it will result in a segmentation fault for other memory.

There's an interesting debate on Hacker News about whether reading uninitialized memory is always undefined behavior in C or not. It's a bit intricate and I'm not sure exactly what to make of it. As far as I know it may be that reading uninitialized memory is defined behavior in C sometimes, but if so, those cases are unusual.

Perhaps more relevant than what the C standard specifies is what compilers and programs actually do. If a compiler is not spec-compliant, or if code we want to run has undefined behavior but depends on how the machine executes it to get its intended result, then we need to take that into account.

Overall, I guess my recommendation is that we can table this suggestion until we have clearer evidence one way or another that it matters or doesn't matter. At this point I have evidence that code generated by a buggy compiler reads uninitialized memory, but that's not very interesting; buggy assembly code can have segfaults. Perhaps it's more useful for my purposes that reading uninitialized memory is an error condition, if it's the case that reading uninitialized memory should not happen from the C code I'm compiling.

@dlubarov
Collaborator

Thanks for the info. In terms of LLVM semantics, the closest thing I could find was

"loading from uninitialized memory produces an undefined value."

But that was in the doc of alloca, so I'm not 100% clear on whether the same holds for load, but I would assume so.

Based on that it seems like panicking probably isn't quite valid, so I think we should proceed with your change at some point. Though since this should very rarely come up in well-defined programs, we can hold off for a while if the panic is useful for detecting buggy programs.
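
One way to keep both options open is sketched below, under the assumption that the emulator could carry a configurable policy for such reads. Everything here is hypothetical (`UninitReadPolicy`, `UninitRead`, `Machine`) and is not taken from this PR. The `Fail` policy surfaces reads of never-written cells as an error, which helps when hunting buggy programs; the `Zero` policy treats them as an ordinary value.

```rust
use std::collections::HashMap;

/// Hypothetical policy for reads of cells that were never written.
#[derive(Clone, Copy, Debug, PartialEq)]
enum UninitReadPolicy {
    /// Report the read as an error (useful for catching buggy programs).
    Fail,
    /// Treat the cell as holding zero.
    Zero,
}

/// Hypothetical error describing a read of a never-written cell.
#[derive(Debug, PartialEq)]
struct UninitRead {
    addr: u32,
}

/// Hypothetical emulator state: sparse memory plus the chosen policy.
struct Machine {
    cells: HashMap<u32, u32>,
    policy: UninitReadPolicy,
}

impl Machine {
    fn new(policy: UninitReadPolicy) -> Self {
        Machine { cells: HashMap::new(), policy }
    }

    /// Record a write at `addr`.
    fn store(&mut self, addr: u32, value: u32) {
        self.cells.insert(addr, value);
    }

    /// Read `addr`, applying the policy only when the cell was never written.
    fn load(&self, addr: u32) -> Result<u32, UninitRead> {
        match (self.cells.get(&addr), self.policy) {
            (Some(&value), _) => Ok(value),
            (None, UninitReadPolicy::Zero) => Ok(0),
            (None, UninitReadPolicy::Fail) => Err(UninitRead { addr }),
        }
    }
}
```

Returning a `Result` rather than panicking leaves the caller free to decide whether an uninitialized read aborts the run or is merely logged.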

2 participants