Mirror of https://gitea.blesmrt.net/mikaela/gist.git, synced 2025-10-31 17:37:19 +01:00

<!-- @format -->

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->

- [A bit opinionated titlefetching](#a-bit-opinionated-titlefetching)
  - [Preparation](#preparation)
  - [Actually enabling it](#actually-enabling-it)
  - [Excluding domains from titlefetching](#excluding-domains-from-titlefetching)
  - [Titlesnarfing ignored users](#titlesnarfing-ignored-users)
  - [Bonus: Fediverse](#bonus-fediverse)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

# A bit opinionated titlefetching

## Preparation

```
load Web
config plugins.web.snarfMultipleUrls True
config plugins.web.snarferShowDomain False
config plugins.web.snarferShowTargetDomain False
config supybot.protocols.http.userAgents "Limnoria UrlPreviewBot"
config supybot.protocols.http.peekSize 1048576
```

- enables the Web plugin (shipped with Limnoria)
- enables titlefetching for all links on a line, not just the first one
- disables showing the domain (a small protection against multiple
  titlefetchers entering a loop, or simply to avoid annoying users who have
  clientside link previews (Matrix/Telegram bridges/relays included))
- disables showing the redirect target (see previous point)
- sets the user-agent to "Limnoria UrlPreviewBot" instead of
  ['Mozilla/5.0 (compatible; utils.web python module)' from 2005](https://github.com/ProgVal/Limnoria/blame/2990fcd302afdc6a3b741594017c3959fd5da2fd/src/utils/web.py#L120)
  - I have heard that it's bad to pretend to be something you aren't, and
    Twitter will only give you HTML `<title>`s if your user-agent contains
    `UrlPreviewBot`,
    [thanks to Tulir's Synapse patch](https://mau.dev/maunium/synapse/-/commit/55d926999cffee893cb4951890a33985beaf70ba)
- searches for HTML titles in the first MEGABYTE (1048576 bytes) of the
  webpage, as the modern web is horrible (looking at you, [HS](https://hs.fi) &
  [YouTube](https://youtube.com))

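To illustrate why the peek size matters, here is a minimal Python sketch. It is not Limnoria's actual implementation: `find_title`, `TITLE_RE` and the 4096-byte comparison window are hypothetical names and values chosen for the example. It only shows that a title sitting behind a large chunk of inline script is invisible to a small peek window and visible to a 1048576-byte one.

```python
import re

PEEK_SIZE = 1048576  # 1024 * 1024 bytes, the value configured above

# Hypothetical helper for illustration only, not Limnoria's code.
TITLE_RE = re.compile(rb"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def find_title(body: bytes, peek_size: int = PEEK_SIZE):
    """Return the <title> text found in the first peek_size bytes, else None."""
    head = body[:peek_size]  # only this prefix of the page is inspected
    match = TITLE_RE.search(head)
    if match is None:
        return None
    # Collapse whitespace, since titles are often spread over several lines.
    return " ".join(match.group(1).decode("utf-8", errors="replace").split())


# A page that buries its title behind 8000 bytes of inline script.
page = (
    b"<html><head><script>" + b"x" * 8000 + b"</script>"
    b"<title>Hello\n world</title></head></html>"
)

print(find_title(page, peek_size=4096))  # None: the title is past the window
print(find_title(page))                  # Hello world
```

The trade-off is that a larger window means more data downloaded per link before the bot gives up on finding a title.
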
## Actually enabling it

```
config channel #CHAN plugins.web.titleSnarfer True
```

- enables titlefetching per channel, on #CHAN to be exact (avoiding unwanted
  channels in case of a botloop)
  - `channel #CHAN` could also be replaced with `network NETWORKNAME` for
    every channel on a network, or omitted entirely (leaving just `config`)
    for everywhere (channel takes priority over network, which _probably_
    takes priority over global)

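The precedence described above can be modelled with a short Python sketch. This is a hypothetical model, not Limnoria's registry code: `resolve`, the `settings` dictionary and the `#quiet` channel are made up for the example, and the network-over-global step is the same "probably" as in the text.

```python
# Hypothetical model of the lookup order described above; not Limnoria's code.
def resolve(settings, name, network, channel):
    """Return the most specific value: channel, then network, then global."""
    scopes = (
        ("channel", network, channel),
        ("network", network),
        ("global",),
    )
    for scope in scopes:
        if (scope, name) in settings:
            return settings[(scope, name)]
    raise KeyError(name)


NAME = "plugins.web.titleSnarfer"
settings = {
    (("global",), NAME): False,
    (("network", "NETWORKNAME"), NAME): True,
    (("channel", "NETWORKNAME", "#quiet"), NAME): False,
}

print(resolve(settings, NAME, "NETWORKNAME", "#CHAN"))   # True (network value)
print(resolve(settings, NAME, "NETWORKNAME", "#quiet"))  # False (channel value)
print(resolve(settings, NAME, "OtherNet", "#CHAN"))      # False (global value)
```
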
## Excluding domains from titlefetching

```
config supybot.plugins.Web.nonSnarfingRegexp m/(t.me|matrix.to|facebook.com|instagram.com|imgur.com)/
```

- regexp to block the listed domains, which are the first useless examples I
  have encountered recently. I just stole the regexp from
  [canonical Limnoria](https://github.com/ProgVal/Limnoria/wiki/Canonical-%23limnoria-doc).
  Note that the dots are unescaped, so each one matches any character, and the
  pattern is looser than the domain list suggests.

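To see what this pattern actually matches, the following Python sketch runs it against a few sample URLs. It assumes the plugin searches for the pattern anywhere in the URL (the behaviour of `re.search`), which is not verified against the Web plugin's source here. `STRICT` is a variant with escaped dots that is not from the original configuration, and how the bot's command parser treats backslashes is untested, so take it as an illustration of the regexp semantics only.

```python
import re

# The pattern from the config line above, without the m/.../ wrapper.
LOOSE = re.compile(r"(t.me|matrix.to|facebook.com|instagram.com|imgur.com)")
# Assumption: a stricter variant with escaped dots (not from the original doc).
STRICT = re.compile(r"(t\.me|matrix\.to|facebook\.com|instagram\.com|imgur\.com)")

urls = [
    "https://t.me/somechannel",
    "https://matrix.to/#/#room:example.org",
    "https://example.org/sometimes",  # contains "time", which "t.me" matches
]

for url in urls:
    # True means the URL would be excluded from titlefetching.
    print(url, bool(LOOSE.search(url)), bool(STRICT.search(url)))
```

With the unescaped pattern, the third URL is excluded even though it has nothing to do with the listed domains. Neither variant is anchored to the host part, so a listed domain appearing in a path or query string would also exclude the URL.
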
## Titlesnarfing ignored users

While I personally don't like to do this, it's possible with:

```
config channel #CHAN plugins.web.checkignored False
```

I may have the bot on multiple sides of a relay, or the user may be ignored due
to abuse, so this may result in spam.

## Bonus: Fediverse

If
[the Fediverse plugin is configured with secure fetch](https://github.com/progval/Limnoria/tree/master/plugins/Fediverse),
fetching Fediverse profiles/statuses/usernames can be enabled with:

```
config channel #CHAN plugins.Fediverse.snarfers.profile true
config channel #CHAN plugins.Fediverse.snarfers.status true
config channel #CHAN plugins.Fediverse.snarfers.username true
```