What is the technology of being able to access PCIe devices over IP called?

this post was submitted on 21 Feb 2024 to Selfhosted
20 points (100.0% liked)


Is it RDMA? Is it a modification of SR-IOV?

I'm having trouble even trying to find out more about this, since the RDMA definition just says "remote access to device memory", and I'd like to confirm whether that includes virtual instances of PCIe devices over the network.

Essentially, I'm looking for a way to share virtual instances of supported PCIe devices over IP. For example, if you have a GPU, you can create virtual slices of it with SR-IOV on KVM-based hypervisors; I'm looking for something that will take those slices and make them available over IP.
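To make that concrete, here's a rough, untested C sketch of the "slicing" step on Linux - the PF driver creates virtual functions when you write to its sriov_numvfs attribute (the PCI address 0000:3b:00.0 is just a made-up example):

```c
/* Untested sketch: enable 4 SR-IOV virtual functions on a hypothetical
 * physical function at PCI address 0000:3b:00.0. Requires root and a
 * device/driver that actually supports SR-IOV. */
#include <stdio.h>

int main(void) {
    const char *path = "/sys/bus/pci/devices/0000:3b:00.0/sriov_numvfs";
    FILE *f = fopen(path, "w");
    if (!f) {
        perror("fopen");        /* the PF address is device-specific; adjust it */
        return 1;
    }
    fprintf(f, "4\n");          /* ask the PF driver to create 4 VFs */
    fclose(f);
    puts("VFs requested; they should now show up as extra PCI functions");
    return 0;
}
```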

I have come across InfiniBand and QLogic, Mellanox and HP and IBM, RDMA support on Debian, and all of that. I just need someone to ELI5 this to me so I know where/what to search, and whether what I want is even possible with FOSS.

I know that Nutanix allows one to serve PCIe hardware over IP on their hypervisor, but I plan to stick with FOSS as far as possible.

Thanks!


Edit: Please let me know what made my post so hard to grasp - the answer was simply RoCE/iWARP. RDMA is definitely the underlying technology: it offers access to the memory of the device whilst bypassing the kernel for good performance. Security considerations aside, this is a very good fit, since RoCE (v2) runs over UDP/IP and iWARP over TCP/IP, making both routable.
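For anyone finding this later: the "routable" part is easiest to see through the RDMA connection manager (librdmacm), which takes plain IP addresses. A rough, untested sketch (build with -lrdmacm; the address 192.0.2.10 and port 7471 are placeholders):

```c
/* Untested sketch: resolve an ordinary IP address into an RDMA route with
 * librdmacm. This works over RoCE or iWARP precisely because both ride on
 * the IP stack. */
#include <stdio.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <rdma/rdma_cma.h>

int main(void) {
    struct rdma_event_channel *ec = rdma_create_event_channel();
    struct rdma_cm_id *id = NULL;
    if (!ec || rdma_create_id(ec, &id, NULL, RDMA_PS_TCP)) {
        perror("rdma setup");
        return 1;
    }

    struct sockaddr_in dst = {0};
    dst.sin_family = AF_INET;
    dst.sin_port = htons(7471);                        /* placeholder port */
    inet_pton(AF_INET, "192.0.2.10", &dst.sin_addr);   /* placeholder peer */

    /* Kicks off the route lookup and binds us to a local RDMA device. */
    if (rdma_resolve_addr(id, NULL, (struct sockaddr *)&dst, 2000)) {
        perror("rdma_resolve_addr");
        return 1;
    }

    struct rdma_cm_event *ev;
    if (rdma_get_cm_event(ec, &ev) == 0) {
        printf("event: %s\n", rdma_event_str(ev->event));
        rdma_ack_cm_event(ev);
    }

    rdma_destroy_id(id);
    rdma_destroy_event_channel(ec);
    return 0;
}
```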

Apologies if my post didn't make the most sense; I tried to describe it as best I could. Thanks!

top 15 comments
[–] olosta@lemmy.world 9 points 7 months ago (2 children)

I have no experience with what you are trying to achieve, but RDMA and related technologies (InfiniBand, QLogic, SR-IOV, RoCE) are not it. These are network technologies that permit high-bandwidth/low-latency data transfer between hosts. Most of them bypass the IP stack entirely.

InfiniBand is a network stack that enables RDMA; its only vendor is now NVIDIA, which acquired Mellanox. QLogic was another vendor, but it was acquired by Intel, which tried to market the technology as Omni-Path before it was spun off to Cornelis Networks.

SR-IOV is a way to share an InfiniBand card with a virtual machine on the same host.

RoCE is an implementation of the RDMA software stack over Ethernet instead of InfiniBand.
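On Linux all of these surface through the same verbs API, and you can tell whether a port is InfiniBand or Ethernet (RoCE/iWARP) from its link layer. A rough, untested libibverbs sketch (build with -libverbs):

```c
/* Untested sketch: list RDMA-capable devices and report whether port 1
 * runs over InfiniBand or Ethernet. */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void) {
    int n = 0;
    struct ibv_device **devs = ibv_get_device_list(&n);
    if (!devs) {
        perror("ibv_get_device_list");
        return 1;
    }
    for (int i = 0; i < n; i++) {
        struct ibv_context *ctx = ibv_open_device(devs[i]);
        if (!ctx)
            continue;
        struct ibv_port_attr port;
        if (ibv_query_port(ctx, 1, &port) == 0)
            printf("%s: link layer %s\n", ibv_get_device_name(devs[i]),
                   port.link_layer == IBV_LINK_LAYER_ETHERNET
                       ? "Ethernet (RoCE/iWARP)" : "InfiniBand");
        ibv_close_device(ctx);
    }
    ibv_free_device_list(devs);
    return 0;
}
```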

[–] MigratingtoLemmy@lemmy.world 2 points 7 months ago (1 children)

I'm fairly sure there's a way to provide compatible PCIe devices over IP on a network, or "some network" (if you're bypassing the IP stack, perhaps). I just don't know what it's called, and I'm getting more confused about whether RDMA support can do this or not. Essentially, I want to leverage what SR-IOV allows me to do (create virtual functions of eligible PCIe devices) and pass them over IP, or some other network tech, to VMs/CTs on a different physical host.

[–] twei@discuss.tchncs.de 2 points 7 months ago (1 children)

Do you mean stuff like PCIeoF (PCIe over Fiber)?

[–] MigratingtoLemmy@lemmy.world 1 points 7 months ago (1 children)
[–] twei@discuss.tchncs.de 2 points 7 months ago (1 children)

I don't think so; at least I haven't heard of it. I guess Ethernet would have too much overhead.

[–] MigratingtoLemmy@lemmy.world 1 points 7 months ago

I suppose RoCE/iWARP were what I was asking for

[–] MigratingtoLemmy@lemmy.world 1 points 7 months ago

I read a bit more and I'd like to add:

RoCE/iWARP are the technologies with which one can route RDMA traffic over the network. The bandwidth of the network is the bottleneck, but we'll ignore that for now.

SR-IOV is a way to share virtual functions of PCIe devices on the same host.

Regardless of whether one uses IB or iWARP, data can also be routed between a PCIe device attached to one host and another host over the network. I still have to research the specifics, but I'm now positive that it can be done.
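To make the "access to memory" part concrete: what RDMA actually exposes to a peer is a registered memory region plus an rkey, not a PCIe function. A rough, untested sketch of the registration step (libibverbs, -libverbs):

```c
/* Untested sketch: register a buffer for remote access. The rkey printed
 * here is what a peer would use in an RDMA read/write; only this registered
 * host memory is exposed, not a PCIe device. */
#include <stdio.h>
#include <stdlib.h>
#include <infiniband/verbs.h>

int main(void) {
    int n = 0;
    struct ibv_device **devs = ibv_get_device_list(&n);
    if (!devs || n == 0) {
        fprintf(stderr, "no RDMA devices found\n");
        return 1;
    }
    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_pd *pd = ctx ? ibv_alloc_pd(ctx) : NULL;
    if (!pd) {
        fprintf(stderr, "could not open device / allocate PD\n");
        return 1;
    }

    size_t len = 4096;
    void *buf = malloc(len);
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ |
                                   IBV_ACCESS_REMOTE_WRITE);
    if (!mr) {
        perror("ibv_reg_mr");
        return 1;
    }
    printf("registered %zu bytes, rkey=0x%x\n", len, mr->rkey);

    ibv_dereg_mr(mr);
    free(buf);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}
```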

Thanks

[–] ramielrowe@lemmy.world 7 points 7 months ago (1 children)

I believe what you're looking for is RoCE: https://en.wikipedia.org/wiki/RDMA_over_Converged_Ethernet

But, I don't know if there's any FOSS/libre/etc hardware for it.

[–] MigratingtoLemmy@lemmy.world 2 points 7 months ago (1 children)

So it is RDMA.

Indeed, I have come across RoCE, and support seems to be quite active on Debian. I was looking at QLogic hardware for this, and whilst I know that firmware for such stuff is really difficult to find, I'm fine with just FOSS support on Debian.

[–] ramielrowe@lemmy.world 3 points 7 months ago (1 children)

I think I misunderstood what exactly you wanted. I don't think you're getting remote GPU passthrough to virtual machines over RoCE without an absolute fuckton of custom work. The only people who can probably do this are Google or Microsoft. And they probably just use proprietary Nvidia implementations.

[–] MigratingtoLemmy@lemmy.world 2 points 7 months ago

Well, I'm not a systems engineer, so I probably don't understand the scale of something like this.

With that said, is it really hard to slap TCP/IP on top of SR-IOV? That is literally what I wanted to know, and I thought RDMA could do that. Can it not?

[–] henfredemars@infosec.pub 2 points 7 months ago* (last edited 7 months ago) (1 children)

I'm somewhat confused about what you're asking here. To my knowledge, the two technologies you mentioned do not provide the ability to share a PCIe device, which is what I understand you wish to do. The first allows network cards to directly access host memory and perform data transfers without consulting the CPU, while the other allows a PCIe root or bus to be shared - it does not allow multiple systems to access the same hardware device at the same time.
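For illustration, the "without consulting the CPU" part looks roughly like this on the initiating side, assuming a queue pair that was already connected somehow (untested fragment; qp, local_mr, remote_addr and rkey would all come from that earlier setup):

```c
/* Untested fragment: post a one-sided RDMA WRITE. The remote CPU is not
 * involved; its NIC places the data directly into the registered buffer
 * identified by remote_addr/rkey. Assumes a queue pair already connected
 * via librdmacm or hand-rolled verbs setup. */
#include <stdint.h>
#include <string.h>
#include <infiniband/verbs.h>

int post_rdma_write(struct ibv_qp *qp, struct ibv_mr *local_mr,
                    size_t len, uint64_t remote_addr, uint32_t rkey) {
    struct ibv_sge sge = {
        .addr   = (uint64_t)(uintptr_t)local_mr->addr,  /* local source buffer */
        .length = (uint32_t)len,
        .lkey   = local_mr->lkey,
    };
    struct ibv_send_wr wr, *bad = NULL;
    memset(&wr, 0, sizeof(wr));
    wr.opcode              = IBV_WR_RDMA_WRITE;
    wr.sg_list             = &sge;
    wr.num_sge             = 1;
    wr.send_flags          = IBV_SEND_SIGNALED;    /* request a completion */
    wr.wr.rdma.remote_addr = remote_addr;          /* peer's registered buffer */
    wr.wr.rdma.rkey        = rkey;

    return ibv_post_send(qp, &wr, &bad);           /* 0 on success */
}
```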

I've heard of proprietary solutions, which makes sense: if you want to virtualize multiple instances of one physical hardware device, I don't see how you can do that efficiently without really intimate knowledge of the device internals. You have to keep separate state for these things, and I think that would be really challenging for an open-source project.

Anyway, just thought I would open up the discussion because I didn't see any other comments. I hope to learn something.

[–] MigratingtoLemmy@lemmy.world 2 points 7 months ago

It seems I have gaps in my understanding. I had assumed that SR-IOV allowed me to "break" PCIe devices (with firmware that supports it) into virtual functions ("slices"), to then be passed through to VMs/used by containers like physical devices.

You're right, in that I didn't really see a mention of TCP/IP in the blogs I've read about RDMA. I understand what it is but unless I can access host memory by bypassing the kernel on other machines on the network, this isn't something I need to consider.

I think support for virtual functions on compatible PCIe devices is chugging along well in the Linux kernel: check out videos of an Nvidia P4 being sliced into virtual functions and passed through to different VMs using KVM. It's either that or I'm completely missing the point somewhere.
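For reference, once the VFs exist they appear as ordinary PCI functions; the PF's sysfs directory gains virtfnN symlinks that point at them, and those are what get passed through to VMs. A rough, untested sketch (the PF address 0000:3b:00.0 is made up again):

```c
/* Untested sketch: print the PCI addresses of the VFs created under a
 * hypothetical physical function at 0000:3b:00.0. Each virtfnN symlink
 * points at a VF that can be passed through to a VM. */
#include <stdio.h>
#include <unistd.h>
#include <libgen.h>

int main(void) {
    for (int i = 0; ; i++) {
        char link[128], target[256];
        snprintf(link, sizeof(link),
                 "/sys/bus/pci/devices/0000:3b:00.0/virtfn%d", i);
        ssize_t n = readlink(link, target, sizeof(target) - 1);
        if (n < 0)
            break;                              /* no more VFs */
        target[n] = '\0';
        printf("VF %d -> %s\n", i, basename(target));   /* e.g. 0000:3b:00.4 */
    }
    return 0;
}
```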

[–] Decronym@lemmy.decronym.xyz 1 points 7 months ago* (last edited 7 months ago)

Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I've seen in this thread:

Fewer Letters More Letters
IP Internet Protocol
PCIe Peripheral Component Interconnect Express
TCP Transmission Control Protocol, most often over IP

3 acronyms in this thread; the most compressed thread commented on today has 5 acronyms.

[Thread #534 for this sub, first seen 22nd Feb 2024, 04:55] [FAQ] [Full list] [Contact] [Source code]
