
Scraping With Laravel

Our customers often ask us which solutions are best for scraping. Our team has extensive experience in this area: we have built many scrapers, from simple scripts for one-off jobs to scraping engines that use proxies to collect data at maximum speed. For scraping scripts we use PHP-based or Python-based solutions, depending on the particular case. But what if a project requires building a scraper as part of a web application? What technical stack do we recommend then? Laravel, of course. Scraping with Laravel is easy and effective.

It is no secret that our all-time favorite for web application development is the Laravel framework. Laravel is one of the leading solutions today, but many of its fans don't know how easily scraping functionality (also known as data mining or data extraction) can be added to a Laravel project.

To get started you need a Laravel 5 project ready to go. The next step is to install the Goutte package, which our scraper depends on. To install it, run the following command: composer require weidner/goutte. Then add the package's alias and service provider to the config/app.php file. Next, create a command by running php artisan make:command, and open the generated file in the App/Commands directory. After that we will write the code that scrapes your target site.
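The registration in config/app.php might look roughly like this, assuming the class names from the weidner/goutte package README:

```php
// config/app.php

'providers' => [
    // ... existing providers ...
    Weidner\Goutte\GoutteServiceProvider::class,
],

'aliases' => [
    // ... existing aliases ...
    'Goutte' => Weidner\Goutte\GoutteFacade::class,
],
```

With the facade registered, the scraper code can call Goutte without constructing a client by hand.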

Next up we'll update the $signature variable; its value is the command we will run in the terminal. Then create a variable that will contain an array of slugs taken from the URLs of all the pages you want to scrape. The handle function also needs to be updated in the following way:

public function handle()
{
    // Iterate over the slugs collected above and scrape each page.
    foreach ($this->collections as $collection) {
        $this->scrape($collection);
    }
}
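For reference, the signature and the array of slugs mentioned above might be declared as class properties like this (the command name and slug values are illustrative, not taken from a real project):

```php
// The terminal command that triggers the scraper, e.g. `php artisan scrape:site`.
protected $signature = 'scrape:site';

// Slugs taken from the URLs of the pages we want to scrape (illustrative values).
protected $collections = [
    'summer-collection',
    'winter-collection',
];
```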
Before we proceed to implementing the scraping function, create an environment variable by adding the target URL to your .env file. After that it's time to write the scraping function itself. Use a GET request to fetch the target site. If the site is paginated, you may need to grab the number of pages first. Optionally, you can integrate your scraper with the database to store the collected details. This is how scraping with Laravel works.
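A minimal sketch of such a scraping function is shown below. The environment variable name (SCRAPE_URL), the CSS selectors, and the Product model are all assumptions for illustration; the Goutte facade's request() call and the Symfony DomCrawler filter()/each() methods are the package's real API.

```php
protected function scrape($collection)
{
    // Base URL comes from the .env file (SCRAPE_URL is an assumed name).
    $url = env('SCRAPE_URL') . '/' . $collection;

    // GET request to the target page via the Goutte facade.
    $crawler = \Goutte::request('GET', $url);

    // If the site is paginated, grab the page count first
    // (the pagination selector is hypothetical).
    $pages = (int) $crawler->filter('.pagination li')->last()->text();

    for ($page = 1; $page <= $pages; $page++) {
        $crawler = \Goutte::request('GET', $url . '?page=' . $page);

        // Extract each item and optionally persist it
        // (selectors and the Product model are hypothetical).
        $crawler->filter('.product')->each(function ($node) {
            \App\Product::updateOrCreate(
                ['title' => $node->filter('h2')->text()],
                ['price' => $node->filter('.price')->text()]
            );
        });
    }
}
```

Storing results with updateOrCreate keeps repeated runs idempotent: re-scraping the same pages updates existing rows instead of duplicating them.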