
Alternative to archive.is

Works where archive.is is blocked

Text-only, no DDoS directed at blog

   view-source:https://www.nytimes.com/2026/05/09/business/dealbook/ai-notetakers-legal-risk.html
   Save as 1.htm
Then something like:

   # Pull text runs, paragraph markers, link URLs and inline-end markers out of the saved page
   grep -Eo "(\"text\":\"[^\"]+)|(\"textAlign\":\"LEFT\")|(\"url\":\"[^\"]+)|(\"__typename\":\"TextInline\")" 1.htm \
   |sed '/\"url\":\"/{s/?.*//;s/$/\">/;s/.\{7\}/<a href=\"/;};
         /\"__typename\":\"TextInline\"/{s/\"$/<\/a>/;s/.\{24\}//;};
         s/\"textAlign\":\"LEFT\"/<p>/g;/\"text\":\"/s/.\{8\}//' \
   |sed '1s/^/<meta charset=utf-8><meta name=viewport content=width=device-width>/' > 2.htm
   rm 1.htm
   firefox ./2.htm
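To see what the pipeline produces without fetching anything, here is a minimal sketch that wraps the same grep/sed steps in a function and runs it on a made-up JSON fragment. The function name `extract_article`, the sample input, and the key order inside it are assumptions for illustration; the real page data may be laid out differently. The sketch also assumes that query strings should be dropped from link URLs.

```shell
#!/bin/sh
# Hypothetical wrapper around the grep/sed steps; the name is illustrative.
extract_article() {
    # One match per output line: text runs, paragraph markers, URLs, inline ends.
    grep -Eo '("text":"[^"]+)|("textAlign":"LEFT")|("url":"[^"]+)|("__typename":"TextInline")' "$1" |
    sed '/^"url":"/{s/?.*//;s/$/">/;s/^.\{7\}/<a href="/;}
         /^"__typename":"TextInline"$/s/.*/<\/a>/
         s/^"textAlign":"LEFT"$/<p>/
         /^"text":"/s/^.\{8\}//'
}

# Made-up sample input; real pages may order these keys differently.
sample=$(mktemp)
printf '%s\n' '{"textAlign":"LEFT","content":[{"formats":[{"url":"https://example.com/a?x=1"}],"text":"Hello world","__typename":"TextInline"}]}' > "$sample"
extract_article "$sample"
rm -f "$sample"
```

On this sample the function prints four lines: `<p>`, `<a href="https://example.com/a">`, `Hello world`, and `</a>`. Single quotes around the patterns avoid the backslash-escaped double quotes, and the `^` anchors keep each sed rule from touching lines it was not meant for.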
NB. JavaScript and CSS interpreters are needed only for the DataDome challenge. The following DNS data, e.g., A RRs, are required:

   ct.captcha-delivery.com
   geo.captcha-delivery.com
   www.nytimes.com
   g1.nyt.com 
No other DNS data is required
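One way to supply only that DNS data is a hosts file. The following is a rough sketch, not a description of the setup used above: the helper names `resolve_a` and `make_hosts` are hypothetical, and it assumes a glibc-style `getent ahostsv4` is available. The addresses behind these names can change, so the output may need regenerating.

```shell
#!/bin/sh
# The four hostnames listed above.
NAMES="ct.captcha-delivery.com geo.captcha-delivery.com www.nytimes.com g1.nyt.com"

# Hypothetical helper: print the first IPv4 address found for a name.
# Assumes glibc's `getent ahostsv4`; swap in another resolver if needed.
resolve_a() {
    getent ahostsv4 "$1" | awk 'NR==1 {print $1}'
}

# Emit one hosts-file line ("address name") per listed name that resolves.
make_hosts() {
    for name in $NAMES; do
        addr=$(resolve_a "$name")
        if [ -n "$addr" ]; then
            printf '%s %s\n' "$addr" "$name"
        fi
    done
}
```

Running `make_hosts > hosts.nyt` would write the lines to a file (the filename is illustrative) that can be reviewed and then merged into a local hosts file or fed to a local resolver.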


Cool!

Would you be willing to license this code as GPL-3.0-or-later, or some other free license? I'd like to include a JavaScript derivative of this for Haketilo (a userscript manager). I would add it to a collection of scripts that aim to replace proprietary JavaScript here: https://codeberg.org/JacobK/unfinished-site-fixes/



