**Table of Contents**  *generated with [DocToc](https://github.com/thlorenz/doctoc)*

- [A bit opinionated titlefetching](#a-bit-opinionated-titlefetching)
  - [Preparation](#preparation)
  - [Actually enabling it](#actually-enabling-it)
  - [Excluding domains from titlefetching](#excluding-domains-from-titlefetching)
  - [Titlesnarfing ignored users](#titlesnarfing-ignored-users)
  - [Bonus: Fediverse](#bonus-fediverse)

# A bit opinionated titlefetching

## Preparation

```
load Web
config plugins.web.snarfMultipleUrls True
config plugins.web.snarferShowDomain False
config plugins.web.snarferShowTargetDomain False
config supybot.protocols.http.userAgents "Limnoria UrlPreviewBot"
config supybot.protocols.http.peekSize 1048576
```

- enables the plugin (shipped with Limnoria)
- enables titlefetching for all links on a line, not just the first one
- disables showing the domain (a small protection against multiple titlefetchers entering a loop, and it avoids annoying users whose clients generate link previews of their own, Matrix/Telegram bridges/relays included)
- disables showing the redirect target (see the previous point)
- sets the user-agent to "Limnoria UrlPreviewBot" instead of ['Mozilla/5.0 (compatible; utils.web python module)' from 2005](https://github.com/ProgVal/Limnoria/blame/2990fcd302afdc6a3b741594017c3959fd5da2fd/src/utils/web.py#L120)
  - I have heard that it's bad to pretend to be something you aren't, and Twitter will only give you HTML `<title>`s if your user-agent contains `UrlPreviewBot`, [thanks Tulir's Synapse patch](https://mau.dev/maunium/synapse/-/commit/55d926999cffee893cb4951890a33985beaf70ba)
- searches for HTML titles in the first MEGABYTE (1048576 bytes) of the webpage, as the modern web is horrible (looking at you [HS](https://hs.fi) & [YouTube](https://youtube.com))

## Actually enabling it

```
config channel #CHAN plugins.web.titleSnarfer True
```

- enables titlefetching per channel, on #CHAN to be exact (this keeps it off unwanted channels in case of a botloop)
- `channel #CHAN` can be replaced with `network NETWORKNAME` to cover every channel on that network, or omitted entirely (leaving plain `config`) to apply everywhere (channel takes priority over network, which _probably_ takes priority over global)

## Excluding domains from titlefetching

```
config supybot.plugins.Web.nonSnarfingRegexp m/(t.me|matrix.to|facebook.com|instagram.com|imgur.com)/
```

- a regexp that blocks the listed domains, which are the first useless examples I have encountered recently. I just stole the regexp from [canonical Limnoria](https://github.com/ProgVal/Limnoria/wiki/Canonical-%23limnoria-doc)

## Titlesnarfing ignored users

While I personally don't like to do this, it's possible with:

```
config channel #CHAN plugins.web.checkignored False
```

I may have the bot on multiple sides of a relay, or the user may be ignored due to abuse, so this may result in spam.

## Bonus: Fediverse

If [the Fediverse plugin is configured with secure fetch](https://github.com/progval/Limnoria/tree/master/plugins/Fediverse), fetching Fediverse profiles/statuses/usernames can be enabled with:

```
config channel #CHAN plugins.Fediverse.snarfers.profile true
config channel #CHAN plugins.Fediverse.snarfers.status true
config channel #CHAN plugins.Fediverse.snarfers.username true
```
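
A side note on the exclusion regexp from "Excluding domains from titlefetching": it can be tried out in plain Python before it goes into the bot. This is only a rough sketch. It assumes the bot searches the whole URL with the pattern (the way `re.search` does), and the `STRICT` pattern and the `is_excluded` helper are my own illustrative names, not anything that comes from Limnoria.

```python
import re

# Pattern body copied from the nonSnarfingRegexp setting, without the m/.../ wrapper.
# The dots are unescaped, so each one matches any character.
ORIGINAL = re.compile(r"(t.me|matrix.to|facebook.com|instagram.com|imgur.com)")

# Hypothetical stricter variant (an assumption, not from the Limnoria docs):
# dots are escaped and the domain must be the host, or a parent of the host.
STRICT = re.compile(
    r"^https?://(?:[^/]+\.)?"
    r"(?:t\.me|matrix\.to|facebook\.com|instagram\.com|imgur\.com)"
    r"(?:[:/]|$)"
)


def is_excluded(pattern, url):
    """Return True if the pattern is found anywhere in the URL."""
    return pattern.search(url) is not None


if __name__ == "__main__":
    urls = [
        "https://t.me/somechannel",
        "https://i.imgur.com/abc.png",
        "https://example.org/timeline",  # contains "time", which "t.me" matches
    ]
    for url in urls:
        print(url, is_excluded(ORIGINAL, url), is_excluded(STRICT, url))
```

With the original pattern, a URL such as `https://example.org/timeline` is also excluded because `t.me` matches `time`. That may be acceptable for a blocklist this short, but it's worth knowing about. If the stricter variant is used in the bot, the `m/.../` delimiter would probably have to change since the pattern contains slashes; I haven't verified how Limnoria handles that.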