Our download function currently doesn't do much in the way of retrying downloads. In this next part we will add some code to make our function try to download the page up to 3 times if it fails.
We can use the status code of the response to decide whether or not we should retry the download, e.g. if the page is not found there is no point trying to download it again. Make sure you add import time to the top of your file with the other imports.
import time

import requests


def download(url, header=None, proxy=None, timeout=5, tries=3):
    """Downloads a page using requests and returns the page content."""
    if tries == 0:
        # We have used up all of our attempts
        return False
    page = requests.get(url, headers=header, proxies=proxy, timeout=timeout)
    if 200 <= page.status_code < 300:
        # Success
        return page.content
    elif 300 <= page.status_code < 400:
        # Redirect error: retry with one fewer attempt remaining
        return download(url, header, proxy, timeout, tries - 1)
    elif 400 <= page.status_code < 500:
        # Client error (something wrong with our request), no point retrying
        return False
    elif 500 <= page.status_code < 600:
        # Server error: wait a second, then retry
        time.sleep(1)
        return download(url, header, proxy, timeout, tries - 1)
    else:
        return False
The above code checks the status code of the response. If the code indicates a redirect error we retry straight away, and if it indicates a server error we wait a second before retrying. If the code indicates a client error, that means there is something wrong with our request itself, so there is no point retrying the download.
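To make the decision logic easier to see at a glance, here is a small sketch that maps a status code to the action the function takes. The classify helper is hypothetical, not part of the download function above; it just mirrors the same range checks.

```python
def classify(status_code):
    """Map an HTTP status code to the action download() takes."""
    if 200 <= status_code < 300:
        return "success"            # return the page content
    elif 300 <= status_code < 400:
        return "retry"              # redirect error
    elif 400 <= status_code < 500:
        return "give up"            # client error, retrying won't help
    elif 500 <= status_code < 600:
        return "wait and retry"     # server error
    return "give up"


print(classify(200))  # success
print(classify(404))  # give up
print(classify(503))  # wait and retry
```

Note that a 404 (Not Found) falls in the client-error range, which is exactly why the function gives up immediately rather than wasting its remaining attempts.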