datahoarder

Who are we?

We are digital librarians. Among us are represented the various reasons to keep data -- legal requirements, competitive requirements, uncertainty of permanence of cloud services, distaste for transmitting your data externally (e.g. government or corporate espionage), cultural and familial archivists, internet collapse preppers, and people who do it themselves so they're sure it's done right. Everyone has their reasons for curating the data they have decided to keep (either forever or For A Damn Long Time). Along the way we have sought out like-minded individuals to exchange strategies, war stories, and cautionary tales of failures.

We are one. We are legion. And we're trying really hard not to forget.

-- 5-4-3-2-1-bang from this thread


[Solved] How to back up an entire blog hosted on medium.com? (merv.news)

submitted 1 year ago* (last edited 1 year ago) by nix@merv.news to c/datahoarder@lemmy.ml

Using Archive.org doesn't work on Medium posts, and ideally I want to archive every post in case they go down. The blog I'm trying to archive is https://itsairborne.com. Googling how to back up Medium posts only turns up articles on how to do it for your own blog.

I found an extension called Monolith of Web that lets you back up a web page using the Rust tool Monolith. I went to each article, clicked the extension, and saved them all one by one.
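For anyone who would rather not click through every article, the one-by-one saving could probably be scripted against the Monolith command-line tool itself. This is only a rough sketch: it assumes `monolith` is installed and on your PATH, that you already have a list of post URLs, and the helper names (`output_name`, `monolith_command`, `archive_all`) are made up for illustration.

```python
import re
import subprocess
from urllib.parse import urlparse


def output_name(url: str) -> str:
    """Turn a post URL into a filesystem-safe .html filename based on its path."""
    path = urlparse(url).path.strip("/")
    slug = re.sub(r"[^A-Za-z0-9._-]+", "-", path).strip("-")
    return (slug or "index") + ".html"


def monolith_command(url: str) -> list[str]:
    """Build the argv for one run, equivalent to: monolith <url> -o <file>."""
    return ["monolith", url, "-o", output_name(url)]


def archive_all(urls: list[str]) -> None:
    """Run monolith once per URL (assumes the monolith binary is on PATH)."""
    for url in urls:
        subprocess.run(monolith_command(url), check=True)


# Usage (hypothetical post URL, not a real article):
# archive_all(["https://itsairborne.com/some-post-abc123"])
```

Each post ends up as a single self-contained HTML file in the current directory, named after the URL path.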

[–] Borger@lemmy.blahaj.zone 5 points 1 year ago

Write a scraper using Python and Selenium or something similar. You may have to log in manually as part of it.
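A minimal sketch of what such a scraper might look like, assuming Selenium 4 with Firefox and a working driver. The function names and the `posts` output folder are placeholders, and the link filter is deliberately crude: it keeps every same-site link it finds on the index page, not just articles, and it does not scroll to trigger any lazy loading.

```python
from urllib.parse import urlparse, urlunparse


def same_site_links(hrefs, base_url):
    """Keep links on the same host as base_url, minus query strings and fragments."""
    host = urlparse(base_url).netloc
    seen, result = set(), []
    for href in hrefs:
        if not href:
            continue
        parts = urlparse(href)
        if parts.netloc != host:
            continue
        clean = urlunparse((parts.scheme, parts.netloc, parts.path, "", "", ""))
        if clean not in seen:
            seen.add(clean)
            result.append(clean)
    return result


def save_posts(index_url, out_dir="posts"):
    """Open the blog index in a real browser, then save each linked page's HTML."""
    # Imported here so the helper above works without Selenium installed.
    from pathlib import Path
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    try:
        driver.get(index_url)
        # If a login is required, it could be done by hand in the opened window here.
        anchors = driver.find_elements(By.TAG_NAME, "a")
        links = same_site_links([a.get_attribute("href") for a in anchors], index_url)
        Path(out_dir).mkdir(exist_ok=True)
        for i, link in enumerate(links):
            driver.get(link)
            Path(out_dir, f"{i:04d}.html").write_text(driver.page_source, encoding="utf-8")
    finally:
        driver.quit()


# Usage: save_posts("https://itsairborne.com")
```

This saves only the rendered HTML of each page; images and stylesheets would still load from the original site unless you fetch them separately.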

[–] despotic_machine@lemmy.world 4 points 1 year ago

HTTrack Website Copier may get the job done.

[–] inspxtr@lemmy.world 3 points 1 year ago* (last edited 1 year ago) (1 children)

Which of the posts didn't work on the archive.org Wayback Machine? I tried your post “How Can You Clean the Air” and it worked, though it took a bit of time due to a couple of redirects.

[–] nix@merv.news 1 points 1 year ago

Oh weird, whenever I tried using it on one of his posts it wouldn't archive. It's not my blog.

[–] AuroraBorealis@pawb.social 1 points 1 year ago

Does this actually modify the files when monolith embeds everything into one file?

[–] oldfart@lemm.ee 1 points 1 year ago

Maybe some alternative frontend and then the regular methods like wget?

https://github.com/mendel5/alternative-front-ends#medium
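A rough sketch of that idea: rewrite each post URL to point at an alternative frontend, then hand it to wget. The frontend host below (`scribe.rip`) is an assumption, as is the idea that it accepts the same path as the original post URL; swap in whichever frontend from the list works for you. The function names and the `mirror` folder are placeholders too.

```python
from urllib.parse import urlparse, urlunparse

# Assumption: this frontend serves posts under the same path as the original URL.
FRONTEND_HOST = "scribe.rip"


def to_frontend(url, host=FRONTEND_HOST):
    """Swap the host of a post URL for the alternative frontend's host."""
    parts = urlparse(url)
    return urlunparse(("https", host, parts.path, "", parts.query, ""))


def wget_command(url, out_dir="mirror"):
    """Build a wget argv that saves the page plus the assets it needs to render."""
    return [
        "wget",
        "--page-requisites",   # also fetch images, CSS, and scripts
        "--convert-links",     # rewrite links so the copy works offline
        "--adjust-extension",  # save pages with an .html suffix
        "--directory-prefix", out_dir,
        to_frontend(url),
    ]


# Usage (hypothetical post URL):
# import subprocess
# subprocess.run(wget_command("https://itsairborne.com/some-post-abc123"), check=True)
```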

[–] wolfshadowheart@kbin.social 1 points 1 year ago (1 children)

Good find on the solution; there are some good alternatives listed on the Monolith GitHub page as well. MoW looks great, though. Too bad it's Chrome-only from what I can see :(

[–] nix@merv.news 1 points 1 year ago

Yeah, I had to get on Chrome just for this :/ Hopefully someone forks it into a Firefox add-on.

[–] keefshape@lemmy.ca -3 points 1 year ago (1 children)

Pose this question to ChatGPT 3.5 or 4. Ask it to help you write a (Python?) script to do this. Feed it the errors you hit, and you can get there pretty quickly and learn along the way.

[–] keefshape@lemmy.ca 2 points 1 year ago

Lol, downvoted for...?