hamsterbase | 1 year ago

When it comes to web archiving, I've found that Markdown has some real limitations. Sure, it's great for basic text, but it struggles with things like embedded content and non-standard layouts. Try archiving a Twitter thread or an app-style webpage in Markdown, and you'll see what I mean. It just doesn't capture the full picture.

That's why I've come to prefer formats like webarchive, mhtml, or single HTML files for archiving. They're incredibly faithful to the original content - you get almost perfect rendering of the original page, complete with styling and layout. Plus, they can capture stuff behind paywalls or on logged-in pages, which is a huge plus.

The real challenge, though, isn't just about saving the content. It's about making that saved content useful. These archive formats are great for preservation, but they can quickly become a mess of unorganized files that are hard to search through or make sense of.

I think the key is finding ways to organize and interact with these archives more effectively. Things like full-text search across all your saved pages, the ability to add notes or highlights directly on the archived content, and smart tagging systems could go a long way. And it'd be really powerful if we could integrate these archives with other knowledge management tools we use.
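
To make the full-text search idea concrete, here's a rough sketch in Python, assuming each saved page is a single HTML file already read into a string. It isn't how any particular tool does it; the names (`TextExtractor`, `html_to_text`, `build_index`, `search`) are made up for illustration, and a real archive would probably want stemming, ranking, and an on-disk index.

```python
from collections import defaultdict
from html.parser import HTMLParser
import re


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html):
    """Reduce an HTML document to its visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def build_index(pages):
    """Build an inverted index (word -> page names).

    `pages` is a hypothetical dict mapping a page name to its HTML source.
    """
    index = defaultdict(set)
    for name, html in pages.items():
        for word in re.findall(r"\w+", html_to_text(html).lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of pages that contain every word in the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results
```

Calling `search(build_index(pages), "twitter thread")` would return only the pages whose visible text contains both words. Text inside scripts and stylesheets is left out so it doesn't pollute the results.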

I develop a tool called HamsterBase that addresses a lot of these issues. It's a local-first app, which means all your data stays on your own device - no need to worry about your personal archives being stored on someone else's servers. There's no sign-up or registration required, which is refreshing in today's cloud-centric world.

thangalin | 1 year ago

> [Markdown] struggles with things like embedded content and non-standard layouts.

I don't share that experience. I typeset all these documents using Markdown with pandoc's div extension, transformed into XHTML, and then passed to ConTeXt:

* https://impacts.to/downloads/lowres/impacts.pdf

* https://dave.autonoma.ca/blog/2020/04/28/typesetting-markdow...

* https://pdfhost.io/v/4FeAGGasj_SepiSolar_Highlevel_Software_...

From XHTML, the document is transformed into TeX statements, which opens a world of possibilities. In the following video, custom styling is applied to nested contents:

https://youtu.be/3QpX70O5S30?t=35

DidYaWipe | 1 year ago

Those are all PDFs. Why, if Markdown is so great?