Python urllib2 does not respect timeout
The following two lines of code hang forever:

    import urllib2
    urllib2.urlopen('https://www.5giay.vn/', timeout=5)

This is with Python 2.7, and I have no http_proxy or other environment variables set. Other websites work fine, and I can wget the site without issue. What is the problem?
If you run

    import urllib2
    url = 'https://www.5giay.vn/'
    urllib2.urlopen(url, timeout=1.0)

wait a few seconds, and then use C-c to interrupt the program, you'll see

    File "/usr/lib/python2.7/ssl.py", line 260, in read
        return self._sslobj.read(len)
    KeyboardInterrupt

This shows the program hanging on self._sslobj.read(len).
SSL timeouts raise socket.timeout.

You can control the delay before socket.timeout is raised by calling socket.setdefaulttimeout(1.0).

For example,
    import urllib2
    import socket

    socket.setdefaulttimeout(1.0)
    url = 'https://www.5giay.vn/'
    try:
        urllib2.urlopen(url, timeout=1.0)
    except IOError as err:
        print('timeout')

    % time script.py
    timeout

    real    0m3.629s
    user    0m0.020s
    sys     0m0.024s

Note that the requests module succeeds here although urllib2 did not:
    import requests
    r = requests.get('https://www.5giay.vn/')
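As a hedged aside (not part of the original answer): requests accepts a timeout argument of its own, but like urllib2's it applies to individual connect and read operations rather than to the whole call, so it is still not a hard wall-clock limit:

    import requests

    # timeout bounds each connect/read operation, not the entire request
    try:
        r = requests.get('https://www.5giay.vn/', timeout=1.0)
        print(r.status_code)
    except requests.exceptions.Timeout:
        print('timeout')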
How to enforce a timeout on the entire function call:

socket.setdefaulttimeout only affects how long Python waits before raising an exception when the server has not issued a response.
Neither it nor urlopen(..., timeout=...) enforces a time limit on the entire function call.
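To see the difference, here is a small self-contained demonstration (a hypothetical local server, not part of the original answer): the server drips one byte per second, so every individual socket read finishes well inside the 2-second timeout and socket.timeout never fires, yet the whole fetch takes about 10 seconds.

    import threading
    import time
    import urllib2
    import BaseHTTPServer

    class DripHandler(BaseHTTPServer.BaseHTTPRequestHandler):
        def do_GET(self):
            # Send 10 bytes, one per second: each socket read is fast,
            # but the complete response takes ~10 seconds to arrive.
            self.send_response(200)
            self.send_header('Content-Length', '10')
            self.end_headers()
            for _ in range(10):
                self.wfile.write('x')
                self.wfile.flush()
                time.sleep(1)

        def log_message(self, *args):
            pass  # keep the demo output quiet

    server = BaseHTTPServer.HTTPServer(('127.0.0.1', 0), DripHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever).start()

    start = time.time()
    data = urllib2.urlopen('http://127.0.0.1:%d/' % port, timeout=2.0).read()
    print('%d bytes in %.1f seconds' % (len(data), time.time() - start))
    server.shutdown()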
To do that, you could use eventlet.
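A minimal sketch of that approach (assuming eventlet is installed, and using its green version of urllib2 so the fetch can be interrupted) might look like this:

    import eventlet
    from eventlet.green import urllib2  # cooperative (green) version of urllib2

    url = 'https://www.5giay.vn/'
    try:
        # eventlet.Timeout bounds the whole block, not just each socket read
        with eventlet.Timeout(5):
            print(urllib2.urlopen(url).read()[:100])
    except eventlet.Timeout:
        print('timeout')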
If you don't want to install eventlet, you can use multiprocessing from the standard library, though this solution does not scale as well as an asynchronous solution such as the one eventlet provides.
    import urllib2
    import socket
    import multiprocessing as mp

    def timeout(t, cmd, *args, **kwds):
        pool = mp.Pool(processes=1)
        result = pool.apply_async(cmd, args=args, kwds=kwds)
        try:
            retval = result.get(timeout=t)
        except mp.TimeoutError as err:
            pool.terminate()
            pool.join()
            raise
        else:
            return retval

    def open(url):
        response = urllib2.urlopen(url)
        print(response)

    url = 'https://www.5giay.vn/'

    try:
        timeout(5, open, url)
    except mp.TimeoutError as err:
        print('timeout')

Running this will either succeed or time out within about 5 seconds of wall clock time.
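One note on the design choice (my own observation, not from the original answer): pool.terminate() is what actually reclaims the hung worker, since the blocked urlopen call in the child process may never return on its own; a thread-based timeout could report that the time limit had passed, but it could not kill the blocked thread.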