509
One Of The Rust Linux Kernel Maintainers Steps Down - Cites "Nontechnical Nonsense"
(www.phoronix.com)
From Wikipedia, the free encyclopedia
Linux is a family of open source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991 by Linus Torvalds. Linux is typically packaged in a Linux distribution (or distro for short).
Distributions include the Linux kernel and supporting system software and libraries, many of which are provided by the GNU Project. Many Linux distributions use the word "Linux" in their name, but the Free Software Foundation uses the name GNU/Linux to emphasize the importance of GNU software, causing some controversy.
Community icon by Alpár-Etele Méder, licensed under CC BY 3.0
At the cost of sounding naive and stupid, wouldn't it be possible to improve compilers to not spew out unsafe executables? Maybe as a compile time option so people have time to correct the source.
the semantics of C make that virtually impossible. the compiler would have to make some semantics of the language invalid, invalidating patterns that are more than likely highly utilized in existing code, thus we have Rust, which built its semantics around those safety concepts from the beginning. there’s just no way for the compiler to know the lifetime of some variables without some semantic indication
It may be a naive question, but it's a very important naive question. Naive doesn't mean bad.
The answer is that that is not possible, because the compiler is supposed to translate the very specific language of C into mostly very specific machine instructions. The programmers who wrote the code, did so because they usually expect a very specific behavior. So, that would be broken.
But also, the "unsafety" is in the behavior of the system and built into the language and the compiler.
It's a bit of a flawed comparison, but you can't build a house on a foundation of wooden poles, because of the advantages that wood offers, and then complain that they are flammable. You can build it in steel, but you have to replace all of the poles. Just the poles on the left side won't do.
And you can't automatically detect the unsafe parts and just patch those either. If we could, we could just fix them directly or we could automatically transpile them. Darpa is trying that at the moment.
Thank you and all the others that took time to educate me on what is for me a "I know some of those words" subject
Meaning a (current) kernel is actually a C to machine code transpiler?
The problem is that C is a prehistoric language and don't have any of the complex types for example. So, in a modern language you create a String. That string will have a length, and some well defined properties (like encoding and such). With C you have a char * , which is just a pointer to the memory that contains bytes, and hopefully is null terminated. The null termination is defined, but not enforced. Any encoding is whatever the developer had in mind. So the compiler just don't have the information to make any decisions. In rust you know exactly how long something lives, if something try to use it after that, the compiler can tell you. With C, all lifetimes lives in the developers head, and the compiler have no way of knowing. So, all these typing and properties of modern languages, are basically the implementation of your suggestion.
Modern C compilers have a lot of features you can use to check for example for memory errors. Rusts borrow-checker is much stricter as it's designed to be part of the language, but for low-level code like the Linux kernel you'll end up having to use Rust's
unsafe
feature on a lot of code to do things from talking to actual hardware to just implementing certain data structures and then Rust is about as good as C.Compilers follow specs and in some cases you can have undefined behavior. You can and should use compiler flags but should complement that with good programming practices (e.g. TDD) and other tools in your pipeline (such as valgrind).
If you write unsafe code then how should it compile?
I’d like to add that there’s a difference between unsafe and unspecified behavior. Sometimes I’d like the compiler to produce my unsafe code that has specified behavior. In this case, I want the compiler to produce exactly that unsafe behavior that was specified according to the language semantics.
Especially when developing a kernel or in an embedded system, an example would be code that references a pointer from a hardcoded constant address. Perhaps this code then performs pointer arithmetic to access other addresses. It’s clear what the code should literally do, but it’s quite an unsafe thing to do unless you as the developer have some special knowledge that you know the address is accessible and contains data that makes sense to be processed in such a manner. This can be the case when interacting directly with registers representing some physical device or peripheral, but of course, there’s nothing in the language that would suggest doing this is safe. It’s making dangerous assumptions that are not enforced as part of the program. Those assumptions are only true in the program is running on the hardware that makes this a valid thing to do, where that magical address and offsets to that address do represent something I can read in memory.
Of course, pointer arithmetic can be quite dangerous, but I think the point still stands that behavior can be specified and unsafe in a sense.
This has been done to a limited extent. Some compilers can check for common cases and you can enforce these warnings as errors. However, this is generally not possible as others have described because the language itself has behaviors that are not safe, and too much code relies on those properties that are fundamentally unsafe.