111 Commits

Author SHA1 Message Date
Valentin Lorentz
c8030be71a Web: Need to download even more Javascript from Youtube 2024-04-18 19:33:55 +02:00
Valentin Lorentz
689c633e92 Web: Fix crash on socket.timeout on snarfed URLs 2023-10-29 12:32:33 +01:00
Valentin Lorentz
3f9ab4b89c Web: Fix crash on trailing ';' in Content-Type 2023-10-28 09:47:55 +02:00
Valentin Lorentz
1a7c14f4b3 Web: Decode using the charset advertised in response headers
And fall back to the sniffing when not present
2022-11-26 09:06:47 +01:00
Valentin Lorentz
2cfc821203 Web: Allow configuring higher peekSize on Youtube 2022-10-28 14:18:52 +02:00
Valentin Lorentz
eb6fc932d9 Web: Fix matching for youtube 2022-04-04 23:29:47 +02:00
Valentin Lorentz
66d986e820 Web: Add overrides to support Youtube and Reddit; remove Twitter from tests. 2022-03-03 22:16:53 +01:00
Valentin Lorentz
63eb6672ea Revert generic 'The Limnoria Contributors' in copyright notices
This commit reverts db7ef3f025
(though it keeps the year updates)

After discussion with several people, it seems better to mention
copyright owners explicitly. eg. https://reuse.software/faq/#vcs-copyright
explains the issue of using VCSs to track copyright.

As db7ef3f025 only replaced mentions
of my name with 'The Limnoria Contributors', this commit only needs
to undo that + add one person who contributed to setup.py.
2021-10-17 09:57:55 +02:00
Valentin Lorentz
dc79ab193a Update capitalization of my Github username 2021-09-14 20:30:47 +02:00
Valentin Lorentz
db7ef3f025 all: Add generic 'The Limnoria Contributors' to copyright notices.
No need to bother with details (that are all outdated / out of sync
anyway), just look up the git history.
2021-08-01 21:54:49 +02:00
Valentin Lorentz
5195ff8e12 Web: Add new @location command, to follow HTTP redirects.
Useful to un-tinify URLs.
2020-10-13 22:28:52 +02:00
Valentin Lorentz
0b0da9716d callbacks: honor network-specificity of supybot.reply.whenAddressedBy.
A side-effect is that plugins should now pass 'irc' instead of 'irc.nick'
when they call 'callbacks.addressed()'.
2020-04-11 15:00:46 +02:00
Valentin Lorentz
c457b52067 Deduplicate setting Accept-Language HTTP header.
This adds a new function conf.defaultHttpHeaders that can be used by plugins
to get all the default HTTP headers for a given network/channel.
2020-01-14 19:03:12 +01:00
Tasos Sahanidis
ae5ad2ceab Web: Implement protocols.http.requestLanguage 2020-01-14 18:48:11 +01:00
Valentin Lorentz
8491d0b944 Web: Lower log level when title could not be found. 2019-12-15 18:43:51 +01:00
Valentin Lorentz
dc2068deca Web: Remove leading space if the prefix is empty. 2019-12-15 17:27:47 +01:00
Valentin Lorentz
1a1707420b Web: Add early returns on exception when snarfing titles.
Closes GH-1390.
2019-11-22 18:17:53 +01:00
Rodrigo Nascimento Hernandez
7466058c8f Web: Catch more errors in getTitle. 2019-11-01 09:06:45 +01:00
Valentin Lorentz
c1ae3f5c81 all plugins: Use msg.channel instead of msg.args[0] + give network name to self.registryValue. 2019-08-24 23:35:01 +02:00
Valentin Lorentz
0f82f89eec Web: Fix encoding issue on Python 2. Closes GH-1359. 2019-02-01 21:02:57 +01:00
James Lu
2242aadde9 Web: add trailing space for snarferPrefix at runtime
Before, the trailing space in the default snarferPrefix value disappears after a reload because spaces at the end of config lines are ignored.
2018-07-22 04:01:21 +00:00
Valentin Lorentz
cd479717b8 Web: Add supybot.plugins.snarfMultipleUrls. Also, fix Web's test cases. 2018-04-14 21:50:32 +02:00
Valentin Lorentz
e2180a1e08 Add variable supybot.plugins.Web.snarferPrefix. 2018-03-02 01:26:00 +01:00
Ken Spencer
76c73a57b9 Use a prefix-less help string, don't assume a '@' prefix (#1309)
* Use a prefix-less help string, don't assume a '@' prefix

* Nickometer: follow through on plugin.py with ` -> '
2017-10-25 21:19:37 +02:00
Tasos Sahanidis
8dbf37a173 Web: Fix exception raised due to lack of Content-Type 2017-09-20 04:57:47 +03:00
Johannes Löthberg
07f98d3619 Add timeout to web title command
Signed-off-by: Johannes Löthberg <johannes@kyriasis.com>
2016-12-08 10:11:15 +01:00
Valentin Lorentz
9fe4abec48 Web: Use a timeout to fetch pages. Closes GH-1275.
This is required because the sandbox is not used anymore,
since 9c57199838.
2016-12-08 00:48:11 +01:00
Valentin Lorentz
b9b36d4de5 Improve decorator. 2016-12-08 00:37:12 +01:00
Valentin Lorentz
4acb692f17 Web: Use new-style command wrap (as a decorator). 2016-12-08 00:36:30 +01:00
Valentin Lorentz
9c57199838 Web: Disable the fetch sandbox on Python versions with the _MAXHEADERS fix.
Partial fix to GH-1271.
2016-11-11 12:13:02 +01:00
James Lu
66736b22d5 Web: optionally hide the domain in titleSnarfer
This adds a snarferShowDomain option to optionally hide the domain ("(at site.abc)" text) in titleSnarfer output. Closes #1236.
2016-08-09 11:22:00 -07:00
James Lu
e2dedcc5a4 Web: normalize whitespace in titles
Sample link: http://googleblog.blogspot.com/2015/08/android-wear-now-works-with-iphones.html
Before: <bot> 'Title: \nOfficial Google Blog: Android Wear now works with iPhones\n (at googleblog.blogspot.com)'
After: <bot> Title: Official Google Blog: Android Wear now works with iPhones (at googleblog.blogspot.com)
2015-12-29 17:12:26 -08:00
Valentin Lorentz
bc19a9fc7f Web: fix syntax. 2015-11-30 07:45:05 +00:00
Valentin Lorentz
eaf9e40dc2 Web: increase subprocess memory limit and catch MemoryError appropriately. 2015-11-29 18:34:54 +00:00
Valentin Lorentz
a070b658a0 Web: Fix title fetching. 2015-11-29 17:59:57 +00:00
Valentin Lorentz
1f57c31665 Web: Fix NameError with snarferShowTargetDomain. Closes GH-1177. 2015-10-25 16:20:31 +01:00
Valentin Lorentz
e3ff413734 Web & core: Merge features of Web's title parser and utils.web.HtmlToText + don't unescape HTML twice. Closes GH-1176. 2015-10-23 07:41:36 +02:00
Jussi Timperi
5cf1b34f55 Web: Use title instead of parser.title. 2015-10-22 17:13:47 +03:00
Jussi Timperi
df7689cc2e Web & utils.web: Force HTMLParser to process all buffered data.
Python issue 23144.
2015-10-22 16:56:53 +03:00
Valentin Lorentz
526ffb0ccb Web: Fix code factorization (576a96fb71). Closes GH-1173. 2015-10-17 15:41:20 +02:00
James Lu
6e96f8f8bf Web: actually return the whitespace-stripped title 2015-10-04 12:54:41 -07:00
Valentin Lorentz
20ef13ef9f Web: Ignore SVG titles. Closes GH-1147. 2015-08-29 21:08:35 +02:00
Valentin Lorentz
576a96fb71 Web: Factorize the code of the title snarfer and the title command. 2015-08-29 21:04:38 +02:00
Valentin Lorentz
c3a2c800f1 Remove need for 2to3. 2015-08-11 16:50:23 +02:00
Valentin Lorentz
054953891f Web: check URL whitelist in snarfer. 2015-08-11 14:46:47 +00:00
Valentin Lorentz
be6bc1a734 Remove need for fix_unicode. 2015-08-10 18:52:51 +02:00
Valentin Lorentz
6ceec0c541 Web: HTMLParseError is deprecated/unused since Python 3.3 and removed in Python 3.5. 2015-08-10 18:16:02 +02:00
Valentin Lorentz
c0ac84bb53 Remove need for fix_import, fix_types, and fix_urllib. 2015-08-10 17:55:25 +02:00
Valentin Lorentz
216c5d213f Replace sys.version_info[0] usages with minisix.PY{2,3}. 2015-08-09 00:23:03 +02:00
Valentin Lorentz
a81d3ddae6 Web: add option for having titlesnarfer immune to defaultignore. Closes GH-1101 2015-05-15 12:39:30 +02:00
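
Note on e2dedcc5a4 (Web: normalize whitespace in titles): the before/after sample in that commit message can be reproduced with a minimal sketch like the one below. This is not the plugin's actual code. The function name normalize_title is hypothetical, and collapsing whitespace with a regular expression is only one plausible way to get the "After" output shown in the commit.

```python
import re


def normalize_title(raw_title: str) -> str:
    """Collapse every run of whitespace (spaces, tabs, newlines) into a
    single space and trim both ends.

    Hypothetical helper for illustration; not taken from the Web plugin.
    """
    return re.sub(r"\s+", " ", raw_title).strip()


# Raw title text as it appears in the "Before" line of the commit message.
raw = "\nOfficial Google Blog: Android Wear now works with iPhones\n"

# The "Title: " prefix and "(at <domain>)" suffix mirror the sample output
# quoted in the commit message.
print("Title: %s (at %s)" % (normalize_title(raw), "googleblog.blogspot.com"))
```

An equivalent approach without a regular expression would be " ".join(raw.split()), which splits on any whitespace and rejoins with single spaces.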