Valentin Lorentz
9c57199838
Web: Disable the fetch sandbox on Python versions with the _MAXHEADERS fix.
...
Partial fix to GH-1271.
2016-11-11 12:13:02 +01:00
James Lu
66736b22d5
Web: optionally hide the domain in titleSnarfer
...
This adds a snarferShowDomain option to hide the domain (the "(at site.abc)" text) in titleSnarfer output. Closes #1236.
2016-08-09 11:22:00 -07:00
James Lu
e2dedcc5a4
Web: normalize whitespace in titles
...
Sample link: http://googleblog.blogspot.com/2015/08/android-wear-now-works-with-iphones.html
Before: <bot> 'Title: \nOfficial Google Blog: Android Wear now works with iPhones\n (at googleblog.blogspot.com)'
After: <bot> Title: Official Google Blog: Android Wear now works with iPhones (at googleblog.blogspot.com)
2015-12-29 17:12:26 -08:00
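The before/after output in the commit above can be reproduced with a minimal sketch like the following. It is only an illustration: `normalize_title` is a hypothetical helper, not the plugin's actual code, and the surrounding "Title: ... (at ...)" format string is copied from the sample output rather than from the source.

```python
def normalize_title(title: str) -> str:
    """Collapse whitespace in a page title (hypothetical helper)."""
    # str.split() with no arguments splits on any run of whitespace
    # (spaces, tabs, newlines) and discards leading/trailing whitespace,
    # so joining the pieces with a single space normalizes the title.
    return " ".join(title.split())


# Raw title as it appears in the "Before" sample, newlines included.
raw = "\nOfficial Google Blog: Android Wear now works with iPhones\n"
print("Title: %s (at googleblog.blogspot.com)" % normalize_title(raw))
```

Splitting and re-joining handles leading, trailing, and interior whitespace in one pass, which a plain `strip()` would not.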
Valentin Lorentz
bc19a9fc7f
Web: fix syntax.
2015-11-30 07:45:05 +00:00
Valentin Lorentz
eaf9e40dc2
Web: increase subprocess memory limit and catch MemoryError appropriately.
2015-11-29 18:34:54 +00:00
Valentin Lorentz
a070b658a0
Web: Fix title fetching.
2015-11-29 17:59:57 +00:00
Valentin Lorentz
1f57c31665
Web: Fix NameError with snarferShowTargetDomain. Closes GH-1177.
2015-10-25 16:20:31 +01:00
Valentin Lorentz
e3ff413734
Web & core: Merge features of Web's title parser and utils.web.HtmlToText + don't unescape HTML twice. Closes GH-1176.
2015-10-23 07:41:36 +02:00
Jussi Timperi
5cf1b34f55
Web: Use title instead of parser.title.
2015-10-22 17:13:47 +03:00
Jussi Timperi
df7689cc2e
Web & utils.web: Force HTMLParser to process all buffered data.
...
Python issue 23144.
2015-10-22 16:56:53 +03:00
Valentin Lorentz
526ffb0ccb
Web: Fix code factorization (576a96fb71). Closes GH-1173.
2015-10-17 15:41:20 +02:00
James Lu
6e96f8f8bf
Web: actually return the whitespace-stripped title
2015-10-04 12:54:41 -07:00
Valentin Lorentz
20ef13ef9f
Web: Ignore SVG titles. Closes GH-1147.
2015-08-29 21:08:35 +02:00
Valentin Lorentz
576a96fb71
Web: Factorize the code of the title snarfer and the title command.
2015-08-29 21:04:38 +02:00
Valentin Lorentz
c3a2c800f1
Remove need for 2to3.
2015-08-11 16:50:23 +02:00
Valentin Lorentz
054953891f
Web: check URL whitelist in snarfer.
2015-08-11 14:46:47 +00:00
Valentin Lorentz
be6bc1a734
Remove need for fix_unicode.
2015-08-10 18:52:51 +02:00
Valentin Lorentz
6ceec0c541
Web: HTMLParseError is deprecated/unused since Python 3.3 and removed in Python 3.5.
2015-08-10 18:16:02 +02:00
Valentin Lorentz
c0ac84bb53
Remove need for fix_import, fix_types, and fix_urllib.
2015-08-10 17:55:25 +02:00
Valentin Lorentz
216c5d213f
Replace sys.version_info[0] usages with minisix.PY{2,3}.
2015-08-09 00:23:03 +02:00
Valentin Lorentz
a81d3ddae6
Web: add option for having titlesnarfer immune to defaultignore. Closes GH-1101
2015-05-15 12:39:30 +02:00
Mikaela Suomalainen
64c0e38635
Web: fix unmatched parenthesis and add missing dot
2014-12-20 13:14:33 +02:00
Valentin Lorentz
ba12692fb4
Web: Add support for charrefs. Closes GH-923.
2014-12-11 09:59:08 +01:00
Valentin Lorentz
8ab29fb291
Web: Add explicit error when page encoding cannot be guessed.
2014-10-13 01:13:15 +00:00
Valentin Lorentz
8cd0b4c1e3
Web: Increase timeout to 10 and improve error message.
2014-07-30 11:18:54 +00:00
Valentin Lorentz
b8f31a3fca
Web: disable threading in commands. (They are run in separate processes anyway…)
2014-04-06 14:05:40 +00:00
Valentin Lorentz
35a62b4e77
Continue accelerating the 2to3 step (remove fix_ws_comma, fix_xreadlines, and fix_zip).
2014-01-21 10:40:18 +01:00
Valentin Lorentz
bb7db3ab21
Continue accelerating the 2to3 step (remove fix_except).
2014-01-20 15:49:15 +01:00
nyuszika7h
b5a9aee7a6
Web: Fix exception on timeout
2013-12-25 16:43:41 +01:00
Valentin Lorentz
289f614bfa
Web: Make choice of displayed domain (origin/target) configurable.
2013-11-19 10:20:32 +00:00
Valentin Lorentz
11d8f4655b
Web: Display the target domain in snarfer. Re-implements pull request GH-523.
2013-11-19 10:16:43 +00:00
Valentin Lorentz
790bda4664
Web: Fix nesting of commands (bug introduced in d8a4ef8421).
2013-08-20 11:37:39 +02:00
Kill Your TV
b46a0dd6a2
Unicode fixes for Python 2.x
...
These changes have been tested with Python 3.2.3 and Python 2.7.5.
2013-08-17 14:12:10 +00:00
Valentin Lorentz
18cc1ff3bb
Revert "Web: Disable @title and @doctype for non-HTML documents." (incompatible with Python 2)
...
This reverts commit 34b0e5faad.
2013-08-15 00:14:34 +00:00
Valentin Lorentz
34b0e5faad
Web: Disable @title and @doctype for non-HTML documents.
2013-08-09 18:03:02 +02:00
Valentin Lorentz
d8a4ef8421
Web: Prevent memory bomb when calling commands with a URL to a page sending crafted requests.
2013-08-09 12:16:24 +02:00
Valentin Lorentz
b4402b28ed
utils.web: Rename get_encoding to getEncoding for consistency.
2013-07-09 12:05:51 +00:00
Valentin Lorentz
820113344c
Web: Use utils.web.get_encoding for guessing charset.
2013-07-09 12:02:43 +00:00
Valentin Lorentz
5f1535447c
Web: Use @title's utf8 decoding in the snarfer.
2013-07-02 13:42:53 +02:00
Daniel Folkinshteyn
944f9c3e3f
Web: create a configurable URL whitelist
...
Prevent various forms of abuse via the Web plugin, such as fetching or titling
malicious content, or revealing the bot's IP.
Conflicts:
plugins/Web/plugin.py
plugins/Web/test.py
2013-06-27 07:09:22 +02:00
George Miller
0150c79924
Added a way to have the urlsnarfer report exceptions (host not found, ...)
...
(This should possibly be changed to report only IOExceptions.)
Enable/Disable via 'supybot.plugins.Web.snarferReportIOExceptions'
2013-04-05 10:05:00 +02:00
Valentin Lorentz
9ef83f70cf
Web: Fix encoding in @title.
2013-03-06 12:11:46 +00:00
Valentin Lorentz
603f44129d
Web: Fix Python 3 compatibility.
2013-01-06 17:06:26 +01:00
Valentin Lorentz
2177429618
Web: Remove netcraft (which does not seem to want bots).
2013-01-05 19:14:58 +01:00
Valentin Lorentz
918092a54d
Web: Fix snarfing of titles with UTF-8 characters.
2013-01-05 18:02:35 +01:00
Valentin Lorentz
3dba9088b0
Merge remote-tracking branch 'supybot/master' into testing
...
Conflicts:
INSTALL
plugins/ChannelLogger/README.txt
plugins/ChannelStats/README.txt
plugins/Google/plugin.py
plugins/Google/test.py
plugins/Plugin/test.py
plugins/Web/test.py
setup.py
src/callbacks.py
src/ircdb.py
src/irclib.py
src/utils/str.py
test/test_irclib.py
2013-01-01 21:11:24 +01:00
Valentin Lorentz
22febc4a20
Web: Fix encoding issues in title snarfing and @title.
2012-11-17 15:10:36 +00:00
Valentin Lorentz
6ea2d062b7
Web: Filter special chars in @title, and add --no-filter.
...
I'm adding --no-filter just in case someone wants to use @title to do this on purpose.
2012-10-31 16:35:51 +00:00
Terje Hoås
cb623b2f4e
Web: Fix fetch. Use getUrl instead of getUrlFd.
2012-10-02 18:19:53 +02:00
Valentin Lorentz
ad3bf1302f
Web: Fix compatibility with Python <= 2.6.
2012-09-22 17:43:59 +00:00