this post was submitted on 26 Jan 2024

I'm currently watching the progress of a 4 TB rsync file transfer, and I'm curious why the speeds are lower than the theoretical maximum read/write speeds of the drives involved. I know a lot can affect transfer speeds, so I'm not really asking why my particular transfer isn't going faster. I'm more curious what the typical bottlenecks are.

Assuming a file transfer between 2 physical drives, and:

  • Both drives are internal SATA III drives with ~~5.0GB/s~~ 5.0Gb/s read/write
  • files are being transferred using a simple rsync command
  • there are no other processes running

What would be the likely bottlenecks? Could the motherboard/processor likely limit the speed? The available memory? Or the file structure of the files themselves (whether they are fragmented on the volumes or not)?

mozz@mbin.grits.dev, 8 months ago (last edited 8 months ago)

I mean, yeah, at this point letting it finish regardless seems like the right play. If you’re curious and feel confident mucking around with it, you could hit Ctrl-Z to suspend it, do little experiments, and then resume it with fg.

You can estimate the current speed pretty accurately with something like “df; sleep 30; df” and then do the math.
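To spare yourself the math, the same idea can be wrapped in a few shell functions. This is only a rough sketch: the function names and the /mnt/backup path in the usage note are made up for illustration, and it assumes nothing else is writing to the destination filesystem while you measure.

```shell
#!/bin/sh
# Sketch: estimate transfer speed from how fast the destination fills up.
# All function names here are hypothetical helpers, not standard tools.

# Print the used space, in KiB, of the filesystem that holds the given path.
# -P forces one line per filesystem, -k forces 1024-byte units.
used_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $3 }'
}

# Print the average MiB/s given two used-space readings (KiB)
# and the number of seconds between them.
rate_mib_s() {
    awk -v a="$1" -v b="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", (b - a) / 1024 / s }'
}

# Take two readings INTERVAL seconds apart and print the rate.
measure() {
    dest=$1
    interval=${2:-30}
    before=$(used_kib "$dest")
    sleep "$interval"
    after=$(used_kib "$dest")
    echo "$(rate_mib_s "$before" "$after" "$interval") MiB/s"
}
```

Something like `measure /mnt/backup 30` would then print the average rate over 30 seconds. Longer intervals smooth out bursts from write caching.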

It’s useful to mess around with tar, because it streams: it keeps reading and pushing data into its pipe without waiting for anything from the other end, and only stalls when the pipe is full. So even if that on its own doesn’t fix anything, you can start to eliminate possibilities for where the issue might be. “time dd if=/dev/zero of=file” (with a bs= and count= set so it stops before filling the disk) or similar commands can also measure the speed of individual parts of the pipeline.
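For what it's worth, a tar-through-a-pipe experiment could look roughly like this. It's a sketch, not a recommendation to replace rsync: both function names are invented for the example, and it assumes the destination directory already exists.

```shell
#!/bin/sh
# Sketch only: both function names are made up for this example.

# Copy a tree by streaming it through a pipe:
# the first tar only reads, the second tar only writes.
tar_copy() {
    src=$1
    dst=$2
    tar -C "$src" -cf - . | tar -C "$dst" -xf -
}

# Read-only pass: same reader, but the data is thrown away,
# so the destination drive is out of the picture.
# The extra `cat` is deliberate: GNU tar skips reading file contents
# when it sees that the archive itself is /dev/null.
tar_read_only() {
    src=$1
    tar -C "$src" -cf - . | cat > /dev/null
}
```

In bash you can put `time` in front of either call. If the read-only pass is already slow, the source drive (or lots of small or fragmented files on it) is the likely suspect; if it's fast but the full copy isn't, look at the write side.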

(Edit: If you’re doing the dd test make sure you write or read a ton of data, to make sure you’re dealing with the physical disk and not the memory cache)
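A bounded version of the dd test might look like the sketch below. The function names and the default size are my own assumptions, and it assumes a Linux dd (GNU coreutils or BusyBox) that understands `bs=1M` and `conv=fsync`. Pick a size well above your RAM if you want to be sure the page cache isn't flattering the numbers.

```shell
#!/bin/sh
# Sketch: bounded dd tests. Function names and default size are hypothetical.

# Write MIB mebibytes of zeros to FILE. conv=fsync makes dd flush the data
# to the device before it reports a speed, so the figure is not just
# the speed of writing into memory.
write_test() {
    file=$1
    mib=${2:-8192}
    dd if=/dev/zero of="$file" bs=1M count="$mib" conv=fsync
}

# Read FILE back and discard it. If the file was written moments ago and
# fits in RAM, this may be served from the cache rather than the disk.
read_test() {
    file=$1
    dd if="$file" of=/dev/null bs=1M
}
```

dd prints the throughput on stderr when it finishes. One caveat: on a filesystem with transparent compression, zeros from /dev/zero compress to almost nothing, so the write figure can look far better than the drive really is.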

Best of luck