I don't know enough about how ActivityPub works to be sure, but I suspect the right way to archive a Lemmy instance would be to create software that acts like another instance, federates with the one you want to archive, and saves the raw stream of ActivityPub activities.
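The storage half of that idea is simple either way: whatever a federating archiver receives, it can append each raw activity as one line of newline-delimited JSON. A minimal sketch (the sample activity below is simplified and hypothetical, not real Lemmy output):

```python
import json

def archive_activity(activity: dict, out_path: str) -> None:
    """Append one raw ActivityPub activity to a newline-delimited JSON log."""
    with open(out_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(activity, separators=(",", ":")) + "\n")

# A simplified Create activity, as another instance might deliver it to
# the archiver's inbox (illustrative only).
sample = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Create",
    "actor": "https://example-instance/u/alice",
    "object": {"type": "Note", "content": "hello"},
}
archive_activity(sample, "activities.ndjson")
```

NDJSON keeps the archive append-only and trivially replayable later, one activity per line.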
datahoarder
Who are we?
We are digital librarians. Among us are represented the various reasons to keep data -- legal requirements, competitive requirements, uncertainty of permanence of cloud services, distaste for transmitting your data externally (e.g. government or corporate espionage), cultural and familial archivists, internet collapse preppers, and people who do it themselves so they're sure it's done right. Everyone has their reasons for curating the data they have decided to keep (either forever or For A Damn Long Time). Along the way we have sought out like-minded individuals to exchange strategies, war stories, and cautionary tales of failures.
We are one. We are legion. And we're trying really hard not to forget.
-- 5-4-3-2-1-bang from this thread
Oh yeah, you're probably right. Unfortunately I absolutely don't have the knowledge required to do that, but I'll keep it in mind. Thanks!
Well, since posts are numbered sequentially, you could archive all of them by generating the links. The only issue is that this would include every post that was federated to the server, which seems to be almost 2 million. A bit overkill for a relatively small instance.
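Generating those links is just string formatting over the ID range. A sketch, with a placeholder instance URL (real IDs on a federated instance are sparse, so many URLs will 404 and should simply be skipped):

```python
BASE = "https://lemmy.example"  # hypothetical instance URL

def post_urls(start: int, end: int):
    """Yield candidate post URLs for sequential post IDs."""
    for post_id in range(start, end + 1):
        yield f"{BASE}/post/{post_id}"

urls = list(post_urls(1, 5))
```

Feed the resulting list to whatever fetcher you use (wget, an ArchiveBox import, etc.).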
I think if you filter by Local on the main page and click Next until you reach the end, there aren't that many pages. You could save those with outlinks.
Also, I believe the posts will live on on other instances regardless.
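The paginated local-only listing can also be generated rather than clicked through. The query parameters below mirror what the Lemmy web UI appears to use, but treat them as an assumption and check them against your instance's URLs first:

```python
BASE = "https://lemmy.example"  # hypothetical instance URL

def listing_pages(n_pages: int):
    """Yield local-only listing pages, assuming Lemmy-UI-style query params."""
    for page in range(1, n_pages + 1):
        yield f"{BASE}/?dataType=Post&listingType=Local&page={page}"

pages = list(listing_pages(3))
```

Saving each listing page with outlinks then captures the local posts without touching the federated ones.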
You could reach out to Archive Team. They're not affiliated with archive.org, but the two often work together.
They're reachable on IRC, and if you know how, you can also join the channels from Matrix.
Maybe a plug-in for the Lemmy server could be developed to automatically back up and/or restore instances from Arweave. Some protocol could turn an instance into JSON, which could then be uploaded as documents and parsed, and the same JSON could later be used to restore the instance. There might be many documents for a large instance, but they could be organized in a thoughtful, functional way.
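The backup half of that idea might look like this: flatten each post plus its comments into one self-contained JSON document, which a separate tool could then upload to Arweave and later parse for a restore. The field names here are invented for illustration, not Lemmy's actual schema:

```python
import json

def post_to_document(post: dict, comments: list) -> str:
    """Serialize one post and its comments into a self-contained JSON
    document suitable for uploading to permanent storage. Schema is
    hypothetical, not Lemmy's real data model."""
    doc = {
        "kind": "lemmy-post-archive",
        "post": post,
        "comments": comments,
    }
    return json.dumps(doc, sort_keys=True)

doc = post_to_document(
    {"id": 42, "title": "Hello", "body": "First post"},
    [{"id": 1, "body": "Welcome!"}],
)
restored = json.loads(doc)  # the restore path is just the inverse parse
```

One document per post keeps each upload small and independently restorable, at the cost of many documents for a large instance.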