23 Commits

Author SHA1 Message Date
Daniel Folkinshteyn
b533290c7a Web: fix problems with title snarfer and unicode due to bug in HTMLParser in python 2.6+
Upstream bug: http://bugs.python.org/issue3932
Rather than override the unescape method with the patch posted, we just convert the page
text to unicode before passing it to the HTMLParser. UTF8 and Latin1 will eat just about
anything.
2011-10-11 13:06:27 -04:00
Daniel Folkinshteyn
2b708f034b Web: add 'timeout' config for web fetch, default 5 sec.
Otherwise, when a site would take a long time to respond, the thread would hang for quite a while.

also needed to mod src/utils/web.py to take the timeout arg.
2011-06-13 16:42:57 -04:00
James Vega
288d7c6e02 Update plugins to ignore all non-ACTION CTCP messages.
Also update commands.urlSnarfer to do the same, which allows us to revert
"Don't bother snarfing URLs from non-Action CTCP messages."

This reverts commit 3282e3407e.

Signed-off-by: James Vega <jamessan@users.sourceforge.net>
2010-01-28 08:14:44 -05:00
Jeremy Fincher
3282e3407e Don't bother snarfing URLs from non-Action CTCP messages. 2010-01-28 06:35:53 -06:00
James Vega
25fc2de643 utils.web: Provide access to the raw httpUrlRe/urlRe strings
Using the compiled regexps for a PluginRegexp method's __doc__ doesn't work.

Closes Sourceforge #2879862

Signed-off-by: James Vega <jamessan@users.sourceforge.net>
2009-10-15 22:16:38 -04:00
James Vega
ca917d3528 Use utils.web.httpUrlRe for the Web/ShrinkUrl snarfer regexes.
Signed-off-by: James Vega <jamessan@users.sourceforge.net>
2009-10-04 21:41:05 -04:00
James Vega
cbc91c6a26 Use a more appropriate message if the URL definitely has no title. 2009-03-11 13:37:25 -04:00
James Vega
74e06ea52a Catch the proper exception when parsing the title fails. 2009-03-11 13:37:24 -04:00
James Vega
ee9aaa89d6 plugins/Web: Switch the title parser back to HTMLParser since sgmllib's parser spins on invalid input. 2006-09-13 19:40:51 +00:00
James Vega
a3e4fc5b1d Change the modeline to use softtabstop instead of tabstop. 2006-02-11 15:52:51 +00:00
James Vega
9d48f2c879 plugins/Web: Update the exception handling for the change in parsers. 2005-09-20 19:06:35 +00:00
James Vega
b375ea9792 plugins/Web: Fixed the title-retrieval parser to actually retrieve the entire title. 2005-07-19 13:55:37 +00:00
James Vega
bc1451e898 plugins/Web: Encountering an HTMLParser exception doesn't mean the title hasn't already been snarfed. Don't bail right away. 2005-06-29 19:05:20 +00:00
Jeremy Fincher
490fb0b140 Changed prefixName to prefixNick, which is more appropriate, and has always bothered me. Better now than later. 2005-06-01 21:08:30 +00:00
James Vega
06800f9fc7 Correctly catch the HTMLParseError 2005-05-07 03:55:14 +00:00
James Vega
47179f8bc6 Catch HTMLParserErrors when we're trying to grab the <title>. 2005-05-07 03:24:10 +00:00
James Vega
fcfda73f64 Bug #1190350, Don't grab fake title. 2005-04-30 12:53:42 +00:00
James Vega
9971e991fe Fix the modelines. 2005-03-23 20:07:45 +00:00
Jeremy Fincher
7e441285c7 Added the Web.fetch command. 2005-03-14 02:44:55 +00:00
Jeremy Fincher
a2e2063d8b Added a callCommand to the Web plugin to catch utils.web.Error. 2005-03-09 07:26:32 +00:00
Jeremy Fincher
b0cb616709 Changed callbacks.Privmsg to be callbacks.Plugin, and callbacks.PrivmsgCommandAndRegexp to be callbacks.Plugin. 2005-02-09 07:04:04 +00:00
James Vega
92839a94e7 Remove supybot.privmsgs imports. 2005-02-01 13:57:14 +00:00
Jeremy Fincher
0c2da03a67 Added the Web plugin (from pieces of Http, Fun, and URL) in the new plugin format. 2005-02-01 09:41:54 +00:00
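
Note on b533290c7a: the message describes decoding the page text to unicode before handing it to HTMLParser, trying UTF-8 and then Latin-1. The following is a minimal sketch of that idea, not the plugin's actual code. It is written for Python 3 (the commit targeted Python 2.6+), and the names decode_page, TitleParser, and page_title are hypothetical, introduced here for illustration only.

```python
from html.parser import HTMLParser


def decode_page(raw: bytes) -> str:
    """Decode fetched bytes to text: try UTF-8 first, then Latin-1."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte value to a code point, so this
        # fallback cannot raise; it may mis-render other encodings.
        return raw.decode("latin-1")


class TitleParser(HTMLParser):
    """Hypothetical helper: collects the text of the first <title>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def page_title(raw: bytes) -> str:
    """Decode first, then parse, so the parser only ever sees text."""
    parser = TitleParser()
    parser.feed(decode_page(raw))
    parser.close()
    return "".join(parser.parts).strip()
```

Decoding up front means the parser never mixes byte strings with unicode, which is the failure mode the commit message attributes to the upstream bug. The Latin-1 fallback always succeeds, which matches the "will eat just about anything" remark, at the cost of possibly garbling pages in other encodings.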