this post was submitted on 10 Jun 2023
694 points (99.6% liked)

Lemmy

Everything about Lemmy; bugs, gripes, praises, and advocacy.

For discussion about the lemmy.ml instance, go to !meta@lemmy.ml.

How do you feel about the massive influx of users?

[–] admin@lemmy-u3.vm.elestio.app 11 points 1 year ago (4 children)

I think we really need to address the scaling issue. One option could be to use ClickHouse instead of Postgres.

[–] theterrasque@infosec.pub 6 points 1 year ago* (last edited 1 year ago)

This gives me MongoDB flashbacks. Postgres, if properly set up, should easily handle thousands of users.

[–] hglman@lemmy.ml 4 points 1 year ago

That's certainly not the right kind of storage system for a site like this.

[–] 777@lemmy.ml 3 points 1 year ago (1 children)

I think a pluggable storage backend is probably the best move. For example, any cloud-hosted instance could use a native document store such as DynamoDB, which is often quite cheap or free for small use cases.
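
To make "pluggable" a bit more concrete, here is a minimal sketch of what such a seam could look like. Everything in it (`StorageBackend`, `InMemoryBackend`, `post_reply`, the `Comment` fields) is a hypothetical name made up for illustration; it is not Lemmy's actual storage layer, which is written in Rust against Postgres.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Comment:
    thread_id: str
    comment_id: str
    author: str
    body: str


class StorageBackend(Protocol):
    """Hypothetical interface, not Lemmy's real API."""

    def put_comment(self, comment: Comment) -> None: ...

    def get_thread(self, thread_id: str) -> list[Comment]: ...


class InMemoryBackend:
    """Simplest possible backend; a Postgres- or DynamoDB-backed class
    would expose the same two methods."""

    def __init__(self) -> None:
        self._threads: dict[str, list[Comment]] = {}

    def put_comment(self, comment: Comment) -> None:
        self._threads.setdefault(comment.thread_id, []).append(comment)

    def get_thread(self, thread_id: str) -> list[Comment]:
        # Return a copy so callers cannot mutate internal state.
        return list(self._threads.get(thread_id, []))


def post_reply(backend: StorageBackend, comment: Comment) -> int:
    """Application code only sees the interface, never the concrete store.
    Returns the number of comments now in the thread."""
    backend.put_comment(comment)
    return len(backend.get_thread(comment.thread_id))
```

The catch is that the interface tends to grow one method per access pattern (by thread, by community, by user, by rank), and each backend has to support all of them efficiently.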

[–] bobaduk@lemmy.world 2 points 1 year ago (1 children)

Bit of a pain to store in Dynamo, though. You'd need to write a bunch of different views, I think.

One comment thread makes sense as a partition, but listing threads is going to be awkward, and search is basically a no-no.
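
A rough sketch of why that is, using a toy in-memory stand-in rather than real DynamoDB. The `ToyTable` class and the `THREAD#...` / `COMMENT#...` key formats are assumptions for illustration only. It mimics the two relevant operations: a query needs an exact partition key (optionally with a sort-key prefix), while anything that crosses partitions falls back to a full scan.

```python
class ToyTable:
    """Toy model of a table with a partition key (pk) and a sort key (sk)."""

    def __init__(self) -> None:
        self._partitions: dict[str, dict[str, dict]] = {}

    def put(self, pk: str, sk: str, item: dict) -> None:
        self._partitions.setdefault(pk, {})[sk] = {"pk": pk, "sk": sk, **item}

    def query(self, pk: str, sk_prefix: str = "") -> list[dict]:
        # Cheap: touches one partition, results ordered by sort key.
        rows = self._partitions.get(pk, {})
        return [rows[sk] for sk in sorted(rows) if sk.startswith(sk_prefix)]

    def scan(self) -> list[dict]:
        # Expensive: reads every item in every partition.
        return [item for rows in self._partitions.values() for item in rows.values()]


def list_thread_ids(table: ToyTable) -> list[str]:
    """Listing threads has no partition key to query on, so it has to scan."""
    return sorted({item["pk"] for item in table.scan()})
```

With sort keys like `COMMENT#0001#0002` (a reply path), `table.query("THREAD#42")` returns the whole thread in order and `table.query("THREAD#42", "COMMENT#0001")` returns one subtree. `list_thread_ids` is the awkward part: without another view of the data it reads everything.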

[–] AbominableSlinky@lemmy.world 3 points 1 year ago (1 children)

Not necessarily a pain, you just have to model the data very differently in something like DynamoDB. Those views are secondary indexes.
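
Roughly what I mean, as a toy in-memory model and not the real DynamoDB API: a secondary index is the same items kept under a different key and updated on every write. `IndexedTable`, `by_author`, and the attribute names are all made up for the example.

```python
class IndexedTable:
    """Toy table whose secondary indexes are maintained on write."""

    def __init__(self) -> None:
        self._items: dict[tuple[str, str], dict] = {}
        self._specs: dict[str, tuple[str, str]] = {}
        self._indexes: dict[str, dict[object, dict[tuple[str, str], dict]]] = {}

    def add_index(self, name: str, pk_attr: str, sk_attr: str) -> None:
        # Assumption: indexes are declared before any writes (no backfill here).
        self._specs[name] = (pk_attr, sk_attr)
        self._indexes[name] = {}

    def put(self, item: dict) -> None:
        base_key = (item["pk"], item["sk"])
        old = self._items.get(base_key)
        for name, (pk_attr, sk_attr) in self._specs.items():
            index = self._indexes[name]
            # Drop the stale entry if this write replaces an existing item.
            if old is not None and pk_attr in old:
                index.get(old[pk_attr], {}).pop(base_key, None)
            # Only items carrying both index attributes appear in the index.
            if pk_attr in item and sk_attr in item:
                index.setdefault(item[pk_attr], {})[base_key] = item
        self._items[base_key] = item

    def query_index(self, name: str, pk_value: object) -> list[dict]:
        _, sk_attr = self._specs[name]
        rows = self._indexes[name].get(pk_value, {}).values()
        return sorted(rows, key=lambda i: i[sk_attr])
```

So "all comments by a user" becomes `table.query_index("by_author", "bobaduk")` after declaring `table.add_index("by_author", "author", "created")`. The price is that every write also updates every index.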

Search, though, you're right. You'd be running ElasticSearch alongside it, and the cost and complexity start to go up. Or just abandon having a functional search entirely, like Reddit did...

[–] bobaduk@lemmy.world 1 points 1 year ago

Yeah, but you need an index for each thread, some kind of time-partitioned thread index for each community, and the same for all.

Then you need to query all comments or posts by user, so that's another index, then you need some way of querying for hot, or controversial or what have you.

It's doable, but fiddly. Tempted to have a go though!
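
A first stab at the time-partitioned community index, as a plain in-memory sketch. The monthly bucket size, the `COMMUNITY#name#YYYY-MM` key format, and the `CommunityThreadIndex` class are all assumptions I picked for illustration. It only covers "newest first"; hot or controversial would need yet another index keyed on a score.

```python
from datetime import datetime, timezone


def bucket_key(community: str, created: datetime) -> str:
    """One partition per community per month (assumes timezone-aware datetimes)."""
    return f"COMMUNITY#{community}#{created.astimezone(timezone.utc):%Y-%m}"


class CommunityThreadIndex:
    def __init__(self) -> None:
        self._buckets: dict[str, list[tuple[str, str]]] = {}

    def add(self, community: str, thread_id: str, created: datetime) -> None:
        # Zero-padded epoch seconds sort correctly as strings.
        sort_key = f"{int(created.timestamp()):012d}#{thread_id}"
        key = bucket_key(community, created)
        self._buckets.setdefault(key, []).append((sort_key, thread_id))

    def latest(self, community: str, now: datetime, limit: int,
               max_buckets: int = 12) -> list[str]:
        """Walk monthly buckets backwards from `now` until `limit` is reached."""
        now = now.astimezone(timezone.utc)
        year, month = now.year, now.month
        found: list[str] = []
        for _ in range(max_buckets):
            key = f"COMMUNITY#{community}#{year:04d}-{month:02d}"
            rows = sorted(self._buckets.get(key, []), reverse=True)
            found.extend(thread_id for _, thread_id in rows)
            if len(found) >= limit:
                break
            month -= 1
            if month == 0:
                year, month = year - 1, 12
        return found[:limit]
```

The "all" listing could reuse the same structure by writing every thread a second time under a pseudo-community such as `"all"`, at the cost of one very hot partition per month.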

[–] federico3@lemmy.ml 1 points 1 year ago* (last edited 1 year ago)

Indeed, PostgreSQL is not designed for large-scale horizontal sharding with eventual consistency. ClickHouse, meanwhile, is designed for OLAP workloads, which likely makes it even less suitable.

Regardless of database choice, Lemmy is still centralized. Discussion groups are cached across instances but not truly distributed. This is the big blocker.