this post was submitted on 14 Oct 2024
125 points (91.9% liked)

Technology

[–] tal@lemmy.today 2 points 1 month ago* (last edited 1 month ago) (1 children)

If this is you, then build your own home server.

While I don't disagree, there's also a very considerable cost difference here between running locally and remotely.

Suppose a user sets up an AI chatbot and their compute card ends up under an average 24/7 load of about 1%. That is what you'd get from, say, one hour-long session a day with the card averaging 25% of its compute capacity during that session. The hardware cost for that local setup would then be roughly 100x that of a remote setup that spreads load evenly across users and keeps its cards busy.
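To make that arithmetic concrete, here's a quick back-of-the-envelope sketch. The function names are just made up for illustration, and it assumes the idealized case where the shared service keeps its hardware at 100% load, which no real service will quite manage.

```python
# Back-of-the-envelope utilization arithmetic for the scenario above.
# Assumption: the remote service keeps its cards fully loaded (idealized).
HOURS_PER_DAY = 24


def average_utilization(session_hours_per_day: float, load_during_session: float) -> float:
    """Fraction of the card's capacity used, averaged over a full day."""
    return (session_hours_per_day / HOURS_PER_DAY) * load_during_session


def local_cost_multiplier(utilization: float) -> float:
    """How many times more hardware a lone user pays for, per unit of work,
    compared with a fully utilized shared card."""
    return 1.0 / utilization


util = average_utilization(session_hours_per_day=1, load_during_session=0.25)
print(f"average load: {util:.2%}")
print(f"local hardware cost multiplier: {local_cost_multiplier(util):.0f}x")
```

With one hour a day at 25% load, that works out to an average load of about 1.04% and a multiplier of 96x, so "about 1%" and "100x" are fair round numbers.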

That is, if someone can find a commercial service that they trust not to log the contents, the economics leave plenty of room for that service to cost less than running locally.

That becomes particularly significant if one wants to run a model that requires a substantial amount of on-card memory. I haven't been following closely, but it looks like the compute card vendors intend to use the amount of on-card memory to price discriminate between the "commercial AI" and "consumer gaming" markets. That permits charging a relatively large amount for a relatively small amount of additional on-card memory.

So an Nvidia H100 with 80GB onboard runs about (checks) $30k, and a consumer GeForce RTX 4090 with 24GB is about $2k.

An AMD MI300 with 128GB onboard runs about (checks) $20k, and a consumer Radeon RX 7900 XTX with 24GB is about $1k.
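Putting those four cards on a per-gigabyte basis shows the gap more directly. This is only a sketch using the rough prices quoted above, which are ballpark figures from a quick check and not authoritative; the `dollars_per_gb` helper is a made-up name.

```python
# Price per GB of on-card memory, using the approximate prices quoted above.
# These are ballpark figures from the comment, not authoritative pricing.
cards = {
    "Nvidia H100": (30_000, 80),   # (approx. price in USD, memory in GB)
    "Nvidia 4090": (2_000, 24),
    "AMD MI300": (20_000, 128),
    "AMD 7900 XTX": (1_000, 24),
}


def dollars_per_gb(price_usd: float, memory_gb: float) -> float:
    """Card price divided by its on-card memory."""
    return price_usd / memory_gb


for name, (price_usd, memory_gb) in cards.items():
    print(f"{name}: ${dollars_per_gb(price_usd, memory_gb):,.0f}/GB")
```

On these numbers the H100 comes to about $375/GB against about $83/GB for the 4090, and the MI300 to about $156/GB against about $42/GB for the 7900 XTX. The datacenter cards obviously differ in more than memory, so this only illustrates the pricing gap and doesn't fully account for it.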

At current hardware pricing, then, it makes a lot of economic sense to time-share the hardware across multiple users.

[–] criitz@reddthat.com 10 points 1 month ago

You can't truly trust any commercial service.