Programmer Humor

How we see strings Rust/C (lemmy.ml)

submitted 2 years ago by miguel@lemmy.ml to c/programmerhumor@lemmy.ml

[–] pinknoise@lemmy.ml 3 points 2 years ago* (last edited 2 years ago)

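Fun fact: the C standard fixes neither the width nor the signedness of char. sizeof(char) is always 1, but CHAR_BIT only has to be at least 8, and whether plain char is signed is implementation-defined.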
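For the Rust side of the thread, here is a minimal sketch of how the signedness question leaks through FFI. It relies only on `std::ffi::c_char`, which is an alias for `i8` or `u8` depending on the target; the helper name `c_char_is_signed` is made up for illustration. Rust assumes 8-bit bytes, so the width question cannot be observed this way.

```rust
use std::ffi::c_char;
use std::mem::size_of;

/// Hypothetical helper: reports whether the target's C `char`
/// is signed, as seen through Rust's `c_char` alias.
fn c_char_is_signed() -> bool {
    // `c_char` is `i8` on some targets and `u8` on others,
    // so its minimum value is either -128 or 0.
    (c_char::MIN as i32) < 0
}

fn main() {
    println!("size_of::<c_char>() = {}", size_of::<c_char>());
    println!("c_char is signed: {}", c_char_is_signed());
}
```

The answer differs between targets (x86-64 Linux and AArch64 Linux are the usual example pair), which is why the sketch prints the result instead of assuming one.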

[–] Copio@lemmy.ml 1 points 2 years ago

All I see are pointers.

[–] rauba_code@lemmy.ml 0 points 2 years ago* (last edited 2 years ago) (1 children)

const char* should be added to the list as well.

As far as I remember, the Win32 API (C-compatible C++) used to add even more complexity: wchar_t*, LPSTR, LPCSTR, LPWSTR, LPCWSTR, LPTSTR, LPCTSTR, because it preferred 16-bit strings over UTF-8. These were supposed to be safe string wrappers that work independently of ANSI/Unicode. Sick.

[–] pinknoise@lemmy.ml 2 points 2 years ago* (last edited 2 years ago)

I feel bad because I immediately knew what the abbreviations stand for -.-

wchar_t isn't from win32, but from ISO C. There should be a compatibility typedef/define called WCHAR in win32 headers for compilers that don't have wchar_t.
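To make the 16-bit string idea concrete from the Rust side, here is a rough sketch of building the kind of buffer an LPCWSTR-style parameter expects: UTF-16 code units followed by a NUL terminator. It assumes a platform where wchar_t is 16 bits wide (as on Windows); `to_wide_nul` is a made-up helper name, not a Win32 or std function.

```rust
/// Hypothetical helper: encodes a Rust string as UTF-16 code units
/// plus a trailing NUL, the layout a 16-bit wide-string pointer expects.
fn to_wide_nul(s: &str) -> Vec<u16> {
    s.encode_utf16().chain(std::iter::once(0)).collect()
}

fn main() {
    let wide = to_wide_nul("héllo");
    println!("{:?}", wide);

    // Decode again, dropping the terminator first.
    let back = String::from_utf16(&wide[..wide.len() - 1]).expect("valid UTF-16");
    println!("{back}");
}
```

Characters outside the Basic Multilingual Plane take two code units, so the buffer length is not the character count.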

[–] gary_host_laptop@lemmy.ml -1 points 2 years ago (2 children)

Are all of those ways of writing a string in C? Why are there so many?

[–] gun@lemmy.ml 1 points 2 years ago (1 children)

In Rust. And I don't know why there are so many.

[–] gary_host_laptop@lemmy.ml 0 points 2 years ago (2 children)

Isn't it a bad thing?

[–] jet@hackertalks.com 2 points 1 year ago

Not necessarily a bad thing. If your method of invocation gives context about its possible use cases, you can make the program safer, because you know it's being used appropriately. If you're just passing a pointer around, anything could happen to it, so it's hard to help the programmer avoid mistakes.

[–] Ephera@lemmy.ml 1 points 2 years ago

The String vs. &str split is definitely annoying, but a necessary drawback of Rust's ownership mechanics (which bring many benefits compared to C in other places).

The others have their specific use-cases (like file paths or C interop). It's certainly not like you constantly juggle 8 different kinds of strings.
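A small sketch of what the String vs. &str split looks like in practice; the function names `first_word` and `shout` are invented for illustration. Borrowing with &str is enough for reading, while taking a String signals that the function needs ownership of the buffer.

```rust
/// Borrows a string slice: works for literals and for (parts of) a `String`.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

/// Takes ownership so it can grow the buffer and hand it back.
fn shout(mut s: String) -> String {
    s.push('!');
    s
}

fn main() {
    let owned: String = String::from("hello world");

    // `&owned` is a `&String`, which coerces to `&str` automatically.
    let word: &str = first_word(&owned);
    println!("{word}");

    // The borrow above has ended, so `owned` can now be moved.
    println!("{}", shout(owned));
}
```

Most of the time that is the whole story: accept &str in parameters, return or store String when the data has to outlive the caller's buffer.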

[–] nachtigall@feddit.de 0 points 2 years ago* (last edited 2 years ago) (2 children)

The primary reason there are so many ways is the Rust ownership and type system.

String is a 'normal' mutable UTF-8 string that is allocated on the heap.
&str is either a slice reference pointing to a part of a string or a string literal in read-only memory.
&[u8] is a reference to a slice of bytes. It can hold any bytes, including UTF-8 encoded text, but carries no guarantee about the encoding.
&[u8; N] is a reference to a fixed-size array of bytes, where the length N is encoded in the type.
Vec<u8> is an owned, growable, heap-allocated vector of bytes (the former two are shared references and therefore immutable).
&u8 is a reference to a single byte.
OsString is an owned, mutable platform native string in the platform's preferred representation (e.g. non-zero byte UTF-8 on Linux, non-zero byte UTF-16 in Windows).
OsStr is a borrowed version of an OsString.
Path is a borrowed path slice that supports operations like obtaining the root element or file name, checking whether it is absolute or relative, or checking whether the file exists in the file system.
PathBuf is an owned, mutable version of the former one.
CString is an owned representation of a string that should be compatible with C (e.g. null terminated and no zero bytes in between). They are handy if you want to call C libraries from Rust code.
CStr is a borrowed version of CString.
&'static str is a string slice reference with a static lifetime and is therefore valid for the duration of the entire program (in contrast to slices of Strings that might get de-allocated at some point).
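A quick tour of several of the types listed above, as a sketch using only the standard library. The helper names (`byte_view`, `extension_of`, `to_c_string`) and the file name are made up for illustration.

```rust
use std::ffi::{CStr, CString, OsStr};
use std::path::{Path, PathBuf};

/// Views the UTF-8 encoding of a string as plain bytes.
fn byte_view(s: &str) -> &[u8] {
    s.as_bytes()
}

/// Extracts a file extension via `Path`, if it is valid Unicode.
fn extension_of(name: &str) -> Option<&str> {
    Path::new(name).extension().and_then(OsStr::to_str)
}

/// Builds a NUL-terminated C string; fails on interior zero bytes.
fn to_c_string(s: &str) -> Option<CString> {
    CString::new(s).ok()
}

fn main() {
    let owned: String = String::from("héllo.txt");
    let slice: &str = &owned[..6]; // "héllo": 'é' occupies two bytes
    let bytes: &[u8] = byte_view(slice);
    let byte_vec: Vec<u8> = bytes.to_vec();
    let literal: &'static str = "lives for the whole program";

    let path: &Path = Path::new(&owned);
    let path_buf: PathBuf = path.to_path_buf();

    let c_owned: CString = to_c_string(slice).expect("no interior NUL");
    let c_borrowed: &CStr = c_owned.as_c_str();

    println!("{slice} {byte_vec:?} {literal}");
    println!("{path_buf:?} {:?}", extension_of(&owned));
    println!("{c_borrowed:?}");
}
```

Each conversion is either free (a reinterpretation of the same bytes) or an explicit allocation such as `to_vec`, `to_path_buf`, or `CString::new`, which is roughly the ownership distinction the list describes.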

[–] Ephera@lemmy.ml 3 points 2 years ago

&u8 is a reference to a single byte.

I'm not sure this meme really cares to make perfect sense, but I don't think that's ever useful...?

A single byte is going to be smaller than a pointer address, so you can just copy the u8 without loss of efficiency.
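The size argument is easy to check, and it also shows the one place where &u8 does turn up on its own: iterating over a byte slice. This is only a sketch, and `sum_by_value` is a hypothetical helper.

```rust
use std::mem::size_of;

/// Iterating a `&[u8]` yields `&u8` items; `.copied()` turns them
/// into plain `u8` values, which is as cheap as passing the reference.
fn sum_by_value(bytes: &[u8]) -> u32 {
    bytes.iter().copied().map(u32::from).sum()
}

fn main() {
    println!("u8 takes {} byte(s)", size_of::<u8>());
    println!("&u8 takes {} byte(s)", size_of::<&u8>());
    println!("sum = {}", sum_by_value(&[1, 2, 3]));
}
```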

[–] nutomic@lemmy.ml 2 points 2 years ago

Good explanation. However, OsString is not necessarily valid UTF-8/UTF-16 (if it were, it could simply be a String). The docs describe it pretty well.

https://doc.rust-lang.org/std/ffi/struct.OsString.html
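A sketch of what "not necessarily valid" means in practice. It covers Unix and Windows only, using the platform-specific `OsStringExt` traits from std; `non_unicode_os_string` is a made-up helper, and other targets would need their own variant.

```rust
use std::ffi::OsString;

#[cfg(unix)]
fn non_unicode_os_string() -> OsString {
    use std::os::unix::ffi::OsStringExt;
    // 0x80 is a lone continuation byte, so this is not valid UTF-8.
    OsString::from_vec(vec![b'f', b'o', 0x80])
}

#[cfg(windows)]
fn non_unicode_os_string() -> OsString {
    use std::os::windows::ffi::OsStringExt;
    // 0xD800 is an unpaired surrogate, so this is not valid UTF-16.
    OsString::from_wide(&[0x0066, 0x006F, 0xD800])
}

fn main() {
    let name = non_unicode_os_string();

    // Conversion to &str is fallible precisely because of cases like this.
    println!("as &str: {:?}", name.to_str());

    // The lossy conversion substitutes U+FFFD for the invalid part.
    println!("lossy: {}", name.to_string_lossy());
}
```

A String could never hold this value, which is why file names and environment variables come back as OsString.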