We now have our controller, but it does little beyond throwing an error about not being able to load the model or view. So let's create a model and a view.
Now that we have a basic understanding of the different parts of CodeIgniter, we can get down to writing our first controller: the Home controller. This will be where users land when they visit our site.
CodeIgniter approaches web development with MVC (Model-View-Controller) in mind. MVC is a way of structuring an application that keeps program logic and presentation separate from each other. This means that there is little PHP scripting in your presentation pages, keeping them clean and focused on looks.
Now that we can download our test page, we need to parse the HTML so that it can be read easily by our program. This tutorial will give you a basic overview of the Python module Beautiful Soup.
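As a rough illustration of what parsing with Beautiful Soup looks like, here is a minimal sketch. It assumes the `beautifulsoup4` package is installed, and the HTML snippet, link text, and variable names are made up for this example rather than taken from the test page.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page.
html = """
<html>
  <head><title>Test page</title></head>
  <body>
    <a href="/first">First link</a>
    <a href="/second">Second link</a>
  </body>
</html>
"""

# Parse the markup with Python's built-in HTML parser.
soup = BeautifulSoup(html, "html.parser")

# Read the text of the <title> tag.
title = soup.title.string

# Collect the text and href of every <a> tag, in document order.
links = [(a.get_text(), a["href"]) for a in soup.find_all("a")]

print(title)
print(links)
```

Once the page is a `soup` object, tags can be looked up by name or searched with `find_all` instead of picking through the raw HTML string.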
Now that we have the beginnings of a web scraper we need to test that it works. The following site is a test site for web scraping.
Our download function currently doesn't do much in the way of retrying downloads. In this next part we will add some code to make our function try to download the page up to 3 times if it fails.
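One way the retry idea could look is sketched below. It is only an illustration: the function name `download`, the `max_attempts` parameter, and the use of `urllib` from the standard library are assumptions for this example, not necessarily the function built in the series.

```python
import urllib.error
import urllib.request


def download(url, max_attempts=3):
    """Return the page body as text, or None if every attempt fails.

    Hypothetical sketch: makes at most max_attempts requests in total.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            with urllib.request.urlopen(url) as response:
                return response.read().decode("utf-8", errors="replace")
        except urllib.error.URLError as error:
            # HTTPError is a subclass of URLError, so HTTP failures land here too.
            print("Attempt", attempt, "failed:", error.reason)
    return None
```

This version retries on any error. A stricter variant might retry only on server-side (5xx) errors, since a missing page is unlikely to appear on the second try.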
In the next part of this series we will deal with downloading a page with options for using proxies or custom user agents.