this post was submitted on 18 Aug 2023
22 points (92.3% liked)

Programming

17024 readers
322 users here now

Welcome to the main community in programming.dev! Feel free to post anything relating to programming here!

Cross posting is strongly encouraged in the instance. If you feel your post or another person's post makes sense in another community cross post into it.

Hope you enjoy the instance!

Rules

Rules

  • Follow the programming.dev instance rules
  • Keep content related to programming in some way
  • If you're posting long videos try to add in some form of tldr for those who don't want to watch videos

Wormhole

Follow the wormhole through a path of communities !webdev@programming.dev



founded 1 year ago
MODERATORS
 

From Russ Cox

Lumping both non-portable and buggy code into the same category was a mistake. As time has gone on, the way compilers treat undefined behavior has led to more and more unexpectedly broken programs, to the point where it is becoming difficult to tell whether any program will compile to the meaning in the original source. This post looks at a few examples and then tries to make some general observations. In particular, today's C and C++ prioritize performance to the clear detriment of correctness.

I am not claiming that anything should change about C and C++. I just want people to recognize that the current versions of these sacrifice correctness for performance. To some extent, all languages do this: there is almost always a tradeoff between performance and slower, safer implementations. Go has data races in part for performance reasons: we could have done everything by message copying or with a single global lock instead, but the performance wins of shared memory were too large to pass up. For C and C++, though, it seems no performance win is too small to trade against correctness.

you are viewing a single comment's thread
view the rest of the comments
[โ€“] mo_ztt@lemmy.world 1 points 1 year ago (1 children)

Right, exactly. If you're using C in this day and age, that means you want to be one step above assembly language. Saying C should attempt to emulate a particular specific architecture -- for operations as basic as signed integer add and subtract -- if you're on some weird other architecture, is counter to the whole point. From the point of view of the standard, the behavior is "undefined," but from the point of view of the programmer it's very defined; it means whatever those operations are in reality on my current architecture.

That example of the NULL pointer use in the kernel was pretty fascinating. I'd say that's another exact example of the same thing: Russ Cox apparently wants the behavior to be "defined" by the standard, but that's just not how C works or should work. The behavior is defined; the behavior is whatever the processor does when you read memory from address 0. Trying to say it should be something else just means you're wanting to use a language other than C -- which again is fine, but for writing a kernel, I think you're going to have a hard time saying that the language need to introduce an extra layer of semantics between the code author and the CPU.

The behavior is defined; the behavior is whatever the processor does when you read memory from address 0.

If that were true, there would be no problem. Unfortunately, what actually happens is that compilers use the undefined behavior as an excuse to mangle your program far beyond what mere variation in processor behavior could cause, in the name of optimization. In the kernel bug, the issue wasn't that the null pointer dereference was undefined per se, the real issue was that the subsequent null check got optimized out because of the previous undefined behavior.