Hazzard

joined 1 year ago
[–] Hazzard@lemm.ee 1 points 1 year ago

Biggest mutant like this I ever made was for a government requirement to export PDFs. The best way I could find to make PDFs from PHP was a tool called wkhtmltopdf, which, as the name suggests, converts HTML to PDF.

Installed a library to let me call a local install of wkhtmltopdf on the command line of the host machine. Wrote a ridiculous HTML template, with all kinds of weird styling and jank to support the older version of WebKit that wkhtmltopdf used, and then would save the output as a file. Then I would run wkhtmltopdf with that file as an argument.
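For anyone curious what that step boils down to, here's a rough sketch in Python rather than the original PHP. It only assumes the basic `wkhtmltopdf <input.html> <output.pdf>` invocation; the function names and the `report.html` / `report.pdf` file names are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_wkhtmltopdf_command(html_path: Path, pdf_path: Path) -> list[str]:
    """Return the argv list for a basic wkhtmltopdf run: input HTML, output PDF."""
    return ["wkhtmltopdf", str(html_path), str(pdf_path)]


def render_pdf(html: str, work_dir: Path) -> Path:
    """Write the rendered template to disk, then convert it with wkhtmltopdf."""
    work_dir.mkdir(parents=True, exist_ok=True)
    html_path = work_dir / "report.html"  # hypothetical file name
    pdf_path = work_dir / "report.pdf"    # hypothetical file name
    html_path.write_text(html, encoding="utf-8")
    # check=True raises CalledProcessError if wkhtmltopdf exits non-zero.
    subprocess.run(build_wkhtmltopdf_command(html_path, pdf_path), check=True)
    return pdf_path
```

Passing the command as a list (instead of one shell string) avoids quoting problems with odd file names.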

Of course, I wasn't done here. They required that I use their existing title page, appendices, etc. Only the data in the middle was to change. So I added a whole "PDF Data" table to the database, with storage locations for them to upload something like 10+ PDFs to append at the front and back of the PDF. Did I mention this whole thing supported two languages?

So then I brought in another command line tool called pdftk, or PDF Toolkit. I used a crazy call to pdftk to append all of these to the front and back of the document, making it look like what they wanted. Save to that same folder, send the file to the client through PHP, and use my "command line from PHP" wizardry to rm all the files I'd made in my "cache" folder, as I called it.
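The append itself is the simple part. A minimal Python sketch (not the original PHP), assuming pdftk's `cat` operation with the input files listed in the order they should appear; the function names and file names are hypothetical.

```python
import subprocess
from typing import Sequence


def build_concat_command(
    front: Sequence[str], body: str, back: Sequence[str], output: str
) -> list[str]:
    """Return a pdftk argv that joins front matter, body, and appendices in order."""
    return ["pdftk", *front, body, *back, "cat", "output", output]


def concat_pdfs(front: Sequence[str], body: str, back: Sequence[str], output: str) -> None:
    """Run pdftk; raises CalledProcessError if it exits non-zero."""
    subprocess.run(build_concat_command(front, body, back, output), check=True)
```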

But of course... we're not done. Turns out appending files like this horribly breaks the PDF table of contents, which was apparently just using page numbers, not any kind of actual linking. Enter pdftk again, and now I'm running it before generating my HTML, on each and every PDF I'm going to add, to get the page count, and saving that value.
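Getting a page count out of pdftk looks roughly like this. A Python sketch, assuming the `dump_data` operation and the `NumberOfPages:` line it prints; the function names are mine, not from the original code.

```python
import re
import subprocess


def parse_page_count(dump_data_output: str) -> int:
    """Pull the page count out of the text printed by `pdftk <file> dump_data`."""
    match = re.search(r"^NumberOfPages:\s*(\d+)", dump_data_output, re.MULTILINE)
    if match is None:
        raise ValueError("no NumberOfPages line found in pdftk output")
    return int(match.group(1))


def count_pages(pdf_path: str) -> int:
    """Ask pdftk for the document metadata and return the page count."""
    result = subprocess.run(
        ["pdftk", pdf_path, "dump_data"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_page_count(result.stdout)
```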

I'd then pass this crazy dictionary into my template and add "fake pages" to the start and end, with headings, and a special margin that wkhtmltopdf interprets as a page break. This even works to add my "additional documents" to the table of contents. Now, my pdftk append commands also deliberately trim the PDF, so as to replace the fake pages, keeping the page count the same, so the links work.
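The trim-and-replace trick can be sketched like this in Python. It assumes pdftk's handle syntax (`A=file.pdf` plus page ranges like `B4-298`) and, to keep it short, a single front document and a single back document instead of the 10+ in the real thing. All names are hypothetical.

```python
def build_splice_command(
    front_pdf: str,
    body_pdf: str,
    back_pdf: str,
    front_pages: int,
    back_pages: int,
    body_total_pages: int,
    output_pdf: str,
) -> list[str]:
    """Swap the body's leading and trailing fake pages for the real documents.

    The body PDF is assumed to start with `front_pages` placeholder pages and
    end with `back_pages` placeholder pages, so the total page count (and every
    page number in the table of contents) stays the same after the swap.
    """
    first_real = front_pages + 1
    last_real = body_total_pages - back_pages
    if first_real > last_real:
        raise ValueError("body has no real pages between the placeholders")
    return [
        "pdftk",
        f"A={front_pdf}",
        f"B={body_pdf}",
        f"C={back_pdf}",
        "cat",
        "A",                          # whole front document
        f"B{first_real}-{last_real}",  # body without the fake pages
        "C",                          # whole back document
        "output",
        output_pdf,
    ]
```

With 3 front pages, 2 back pages, and a 300-page body, the body range comes out as pages 4 to 298, and the final document is still 300 pages.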

So close.... but it turns out wkhtmltopdf doesn't account for when the table of contents is so ridiculously long that it goes on for more than a whole page. Did I mention these PDFs are more than 300 pages long in many instances? Suddenly every link goes to the page after the one it's supposed to, or even a couple after. Not good.

Yeah... this is the beginning of a nightmare where I add fake table of contents pages that I cut out later with pdftk. Which means I have to somehow know in advance how many pages the ToC will be.... estimation time. That's right, I run through all the data I generate the ToC with in advance, and count the number of entries I'm going to be adding to the ToC, and, by literally counting the entries on a full ToC and saving that as a magic number, guess how many pages there will be.

Oh, but what if a line is too long, and wraps to two lines in the ToC? Well, guess who counted the number of characters in a line to produce another guesstimate? No, neither of these heuristics was perfect, and they looked like a spaghetti mess, but with enough tinkering, I got numbers that worked on everything I tested.
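The two guesstimates combine into something like the sketch below (Python, not the original PHP). The two constants are placeholder values I made up for illustration; the real magic numbers came from counting entries and characters on an actual rendered ToC page.

```python
import math
from typing import Iterable

LINES_PER_TOC_PAGE = 40  # hypothetical magic number: lines that fit on a full ToC page
CHARS_PER_TOC_LINE = 80  # hypothetical magic number: characters before an entry wraps


def lines_for_entry(title: str, chars_per_line: int = CHARS_PER_TOC_LINE) -> int:
    """Guess how many ToC lines one entry takes, allowing for wrapping."""
    return max(1, math.ceil(len(title) / chars_per_line))


def estimate_toc_pages(
    titles: Iterable[str],
    lines_per_page: int = LINES_PER_TOC_PAGE,
    chars_per_line: int = CHARS_PER_TOC_LINE,
) -> int:
    """Guess the ToC page count before the PDF has been rendered."""
    total_lines = sum(lines_for_entry(title, chars_per_line) for title in titles)
    return max(1, math.ceil(total_lines / lines_per_page))
```

Character counts are only a proxy for rendered width, which is exactly why this kind of estimate needs tinkering against real output.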

And there you have it, 300+ page PDFs generated from the database, with all the title pages and such that they manually uploaded, in two languages, with a working table of contents. During my time, we never even added a cache to this monstrosity; it did all this every time the user clicked "Download PDF". Took around 30 seconds, and the UI just pretended it was a really big file.

What a wild project, probably the biggest spaghetti mess I've ever written. But hey, actually met all the requirements, no matter how ridiculous, and I'm proud of that monstrosity. Probably still in use today.

[–] Hazzard@lemm.ee 4 points 1 year ago

Thank you! I've just been browsing with NSFW turned off, but: A) I actually would rather turn on the blur function if there wasn't literal porn throughout the "all" feed. B) A bunch of mild soft core stuff like "pretty women" and "celebs" gets through anyway.

Can't believe it never occurred to me to use the block button to shape the all feed.

[–] Hazzard@lemm.ee 1 points 1 year ago

Eh, I'd assume the comparison isn't flattering. I think the point of this article is to argue you don't need Elasticsearch to implement a competent full text search for most applications. Splitting hairs over a few milliseconds would just distract from that point, when most applications should be prioritizing simplicity and maintainability over such tiny gains on a reasonably sized dataset.

Might be interesting to try to analyze at exactly what point Elasticsearch becomes significantly useful, however. Maybe at the point where it saves a full tenth of a second? Or where it's returning in half the time? Could be an interesting follow-up article.

[–] Hazzard@lemm.ee 6 points 1 year ago (1 children)

Sounds like it's just going to be a trophy of some kind, which would be alright. It's a pretty darn impressive accomplishment to keep it together in both qualifying and race sessions, so I'd imagine the teams/drivers may actually be proud of them. Not going to be a big deal for viewers though, obviously.

[–] Hazzard@lemm.ee 3 points 1 year ago

Gotta be Google Play Music I'm still bitter about. YouTube Music doesn't hold a candle to it, and I've never quite been as happy with Spotify or Apple Music. Getting YT Premium with a good music service was great too, but they shot themselves in the foot.

And there was just... no reason for it. They even delayed its death when they realized how crap YT Music was, and then later just... decided to do it anyways.

[–] Hazzard@lemm.ee 2 points 1 year ago* (last edited 1 year ago)

Is this whole comment section really about one comment from six years ago, where all he stated is that he's grateful his wife wasn't aborted (I'm assuming that was considered)?

Because I find it hard to take that as all that damning. From the tone of the conversation here, I was expecting several comments over the years stating extremes like that abortion should be illegal, or that rape victims should be denied access to it.

Do we even know that he knew about any of this scandal with March for Life in 2017? Especially if he's not supported it since (which I assume this article would have bothered to include if he had)?

[–] Hazzard@lemm.ee 1 points 1 year ago

I've recommended it to my friends, but they've instead chosen to take the fall of Reddit as an opportunity to just... step back from the internet. One friend just... isn't browsing Reddit at all on mobile anymore, and is instead just using old.reddit.com on desktop, and culled most of the subs he used to follow, keeping only some small communities.

I respect it, but I still feel something like Reddit has a place in my life. I have no faith whatsoever in Reddit itself, and the apps are better here, so I've completely switched over. I imagine they may still move here in time. Lemmy is a part of my life now, so I'll continue to mention it and share links to things from it, but I've no desire to force the issue.

[–] Hazzard@lemm.ee 3 points 1 year ago (1 children)

Eh, the Python one will probably perform better, because sum is likely implemented in native C under the hood.
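If you want to check that on your own machine, something like this works. It's a minimal sketch that assumes CPython (where the built-in `sum` is implemented in C) and compares it against a hand-written loop; `loop_sum` is just a name I picked, and the timings will vary by machine and interpreter.

```python
import timeit


def loop_sum(values):
    """Add the values up one at a time in pure Python."""
    total = 0
    for value in values:
        total += value
    return total


if __name__ == "__main__":
    values = list(range(100_000))
    builtin_time = timeit.timeit(lambda: sum(values), number=20)
    loop_time = timeit.timeit(lambda: loop_sum(values), number=20)
    print(f"builtin sum: {builtin_time:.4f}s, manual loop: {loop_time:.4f}s")
```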

[–] Hazzard@lemm.ee 5 points 1 year ago (1 children)

I think what it really shows is just how mixed every other team has been. Even if one team was consistently placing 3rd and 4th, that's 27 points, more than Max can earn on any weekend. But every other team has been off and on, scoring big on occasional weekends, and poorly on many others.

[–] Hazzard@lemm.ee 3 points 1 year ago

Great read!

I think a bonus point in favour of composition here is the power of static typing. Introducing advanced features like protocols can bring back some of that safety that this article describes as being exclusive to inheritance.
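As a rough illustration of what I mean, here's a small Python sketch using `typing.Protocol`. The class names are invented for the example and aren't from the article; the idea is just that a static type checker such as mypy can verify the composed dependency has the right shape without any shared base class.

```python
from typing import Protocol


class Logger(Protocol):
    """Anything with a matching log() method satisfies this, no inheritance needed."""

    def log(self, message: str) -> None: ...


class ListLogger:
    """Hypothetical logger that just collects messages in memory."""

    def __init__(self) -> None:
        self.lines: list[str] = []

    def log(self, message: str) -> None:
        self.lines.append(message)


class OrderService:
    """Composes a Logger instead of inheriting logging behaviour."""

    def __init__(self, logger: Logger) -> None:
        self._logger = logger

    def place(self, item: str) -> str:
        self._logger.log(f"placed {item}")
        return item
```

`ListLogger` never mentions `Logger`, yet a type checker would flag `OrderService(object())` because `object` has no `log` method.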

Overall, I think composition will continue to be the future, and we'll find more ways to create that kind of compile-time safety without binding ourselves into overly restrictive or complicated models.

[–] Hazzard@lemm.ee 3 points 1 year ago

TL;DR: People like searching Reddit for answers to their questions, since it's got upvotes and a lot of subreddits with answers to niche opinion questions. "Reddit" is one of the most popular search keywords, and Reddit is unreliable right now.

Also recaps several other recent Reddit headlines, which I'm sure you've seen if you're hanging out in this subreddit.

[–] Hazzard@lemm.ee 1 points 1 year ago

Ah, it makes way more sense for students, absolutely. None of your code is proprietary, so that's not a concern, and student pricing makes things easier.

Plus, your tech stacks are much simpler. Usually just... Java, or Python, or something. Not a Python web server using X framework for templating, Y framework for typing, and Z framework for API calls to some undocumented internal API.
