We now have our controller, but it doesn’t do much other than throw an error about not being able to load the model or view. So let’s create a model and a view.
Now that we have a basic understanding of the different parts of CodeIgniter, we can get down to writing our first controller: the Home controller. This will be where users land when they visit our site.
CodeIgniter is a powerful PHP framework with a very small footprint, built for developers who need a simple and elegant toolkit to create full-featured web applications.
Now that we can download our test page, we need to parse the HTML so that our program can read it easily. This tutorial gives a basic overview of the Python module Beautiful Soup.
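As a rough idea of what parsing with Beautiful Soup looks like, here is a minimal sketch. The HTML string is a made-up stand-in for the downloaded test page, and the sketch assumes the `bs4` package is installed and uses Python's built-in `html.parser` backend.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page (not the real test site).
html = """
<html><head><title>Test page</title></head>
<body>
<a href="/page1">First</a>
<a href="/page2">Second</a>
</body></html>
"""

# Build a parse tree using the standard library parser.
soup = BeautifulSoup(html, "html.parser")

# Read the text of the <title> tag.
title = soup.title.string

# Collect the href attribute of every <a> tag on the page.
links = [a["href"] for a in soup.find_all("a")]

print(title)
print(links)
```

Once the page is a `soup` object, we can look elements up by tag name instead of searching the raw text ourselves.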
Now that we have the beginnings of a web scraper, we need to test that it works. The following site is a test site for web scraping.
Our download function currently doesn’t do much in the way of retrying downloads. In this next part we will add some code to make our function attempt the download up to 3 times if it fails.
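One way the retry idea could look is sketched below. This is not the series' actual function: the name `download`, the `max_attempts` parameter, and the `opener` parameter (added only so the function can be tried without a network connection) are all assumptions, and the sketch reads "3 times" as three attempts in total.

```python
import urllib.error
import urllib.request


def download(url, max_attempts=3, opener=urllib.request.urlopen):
    """Return the page body as text, or None if every attempt fails.

    max_attempts and opener are hypothetical parameters for this sketch.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            with opener(url) as response:
                # Decode leniently so one bad byte does not abort the download.
                return response.read().decode("utf-8", errors="replace")
        except urllib.error.URLError as error:
            # HTTPError is a subclass of URLError, so both are caught here.
            print("Attempt", attempt, "failed:", error.reason)
    # Every attempt raised an error.
    return None
```

Calling `download("http://example.com")` would then return the HTML as a string, or `None` once all three attempts have failed. A loop is used here rather than recursion so the number of attempts is easy to see at a glance.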
Web scraping is a very useful technique, and Python makes it really easy to do. In this part of the series I will deal with downloading pages and extracting the HTML.