Projects
Real Estate Scraper
The goal of the project was to build scrapers that regularly collect real estate listings from Redfin and Zillow. The data must be updated twice a day to keep the database relevant. The scrapers grab all property details, including images. We developed two multi-threaded scrapers, one for Zillow and one for Redfin. Both use our cheap and effective custom proxy application to get data quickly without being banned by the sites. The data is collected into a single database, which is then used by realtors in the client’s web application.
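The combination of worker threads and rotating proxies can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `fetch_listing`, the proxy addresses and the listing URLs are hypothetical, and the network request is stubbed out so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle
import threading

# Hypothetical proxy pool; a real deployment would obtain these from the proxy application.
PROXIES = ["proxy-a.example:8080", "proxy-b.example:8080", "proxy-c.example:8080"]

_proxy_cycle = cycle(PROXIES)
_proxy_lock = threading.Lock()


def next_proxy():
    """Return the next proxy in round-robin order (thread-safe)."""
    with _proxy_lock:
        return next(_proxy_cycle)


def fetch_listing(url):
    """Stub for a real HTTP request: records which proxy would serve the URL."""
    proxy = next_proxy()
    # A real scraper would send the request through `proxy` and parse the response here.
    return {"url": url, "proxy": proxy}


def scrape_all(urls, workers=4):
    """Fetch all URLs concurrently and return the results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_listing, urls))
```

Round-robin rotation spreads requests evenly across the pool, so no single IP address carries a suspicious share of the traffic.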
Court directory scraper
A real-time scraper of cases from a District Court website. The website is protected by a captcha, so beating it was one of our challenges, and we resolved it successfully. All scraped cases are shown in the scraper admin application with sorting and filtering options, date range selection to view cases for a specific period, and export to CSV. This is a multi-threaded scraper: our client wanted to monitor multiple districts at the same time, so the scraper behaves as multiple users and combines data into a single database to provide the highest performance and data availability.
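The date range selection and CSV export could look roughly like this. It is a minimal sketch: the field names `case_id`, `district` and `filed` are assumptions made for illustration, not the actual schema of the admin application.

```python
import csv
import io
from datetime import date


def filter_cases(cases, start, end):
    """Return cases whose filing date lies in [start, end], sorted by date."""
    selected = [c for c in cases if start <= c["filed"] <= end]
    return sorted(selected, key=lambda c: c["filed"])


def export_csv(cases):
    """Serialize cases to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["case_id", "district", "filed"], lineterminator="\n"
    )
    writer.writeheader()
    for case in cases:
        # Dates are written in ISO format so the export sorts and parses predictably.
        writer.writerow({**case, "filed": case["filed"].isoformat()})
    return buffer.getvalue()
```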
Kadaster scraper
A multi-threaded scraper of real estate data from the Kadaster directory. The required data set includes information about properties as well as their owners. The scraper automatically logs into multiple user accounts and grabs the required data in real time. The directory has sensitive multi-level anti-bot protection, so the scraper imitates real user behavior in a browser, which provides reliable long-term scraping with no interruptions. The scraper uses proxies, which are updated automatically by our custom proxy solution. Data is collected into a single database with search, filtering and export.
Social Network for Doctors
AmongDoctors is a unique niche social platform designed for medical doctors. Its goal is to bring together peers from around the world for close collaboration, better recruitment opportunities, networking, sharing experience and improving patient care. The platform combines features of a social network, like making posts and uploading photos, with features of a professional community, like a forum for discussing various topics and a personal blog. The Laravel framework was selected to build the platform as a flexible, reliable and secure solution.
Custom CRM for Ecommerce
A custom CRM solution for a commercial store, based on Laravel and the Metronic theme, with MySQL and MongoDB databases. The application has multiple permission levels, from owner to sales agent, each with its own interface and available options. The application pulls records from a connected lead generation source and from call center software. The CRM presents customer data, order data, product data, stock availability and shipment details, with search, filtering and sorting options. The application can send automatic SMS campaigns based on rules. Flexible reports provide detailed stats on sales, performance, agents and managers, resellers, orders and customer activity.
Smart Marketing Platform
A marketing platform for smart advertising, with a backend based on Laravel and a frontend implemented in Angular. Server: AWS; databases: MongoDB and MySQL. Mobile developers, webmasters and web application owners register on the platform, then track their users’ activities and deliver the detailed data into a single database. Advertisers and marketers create campaigns for webpages, mobile applications, SMS, email etc. Campaign creation is flexible and completely rule-based. The advertiser can build the target audience based on age, gender, interests, location, activities, hobbies etc. An ad is shown when triggered by a specific user event or action.
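Rule-based targeting with an event trigger might be modeled along these lines. This is only a sketch: the `AudienceRule` fields and the `matches` function are hypothetical names chosen for illustration (the platform itself was built on Laravel and Angular), and only a few of the listed audience criteria are covered.

```python
from dataclasses import dataclass, field


@dataclass
class AudienceRule:
    """Hypothetical campaign rule: who may see the ad and which event triggers it."""
    trigger_event: str
    min_age: int = 0
    max_age: int = 200
    genders: set = field(default_factory=set)    # empty set means "any gender"
    interests: set = field(default_factory=set)  # empty set means "any interests"


def matches(user, rule, event):
    """Return True if `event` triggers the rule and the user is in the target audience."""
    if event != rule.trigger_event:
        return False
    if not rule.min_age <= user["age"] <= rule.max_age:
        return False
    if rule.genders and user["gender"] not in rule.genders:
        return False
    # The user must share at least one interest with the rule, if any are set.
    if rule.interests and not (rule.interests & set(user["interests"])):
        return False
    return True
```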
Custom Map Plugin
A customized WordPress plugin for a website presenting interesting places to eat, drink and see in San Diego. The site shows guides featuring various places of interest to travelers and even locals. The plugin adds a map with markers that visually show the locations of points of interest (POI). The site editor creates an article and specifies the geolocation of each point of interest when adding it. On the frontend, as you scroll through the article, the map highlights the place you are currently reading about. Similarly, when you click a marker on the map, the page scrolls to the description of that place.
Real Estate Directory
A business presence website for a real estate agency in Nice, France. The website lists properties for rent and for sale. Site visitors can check the availability of a property and book it online for a desired date range. An advanced search lets users search by any property parameter, such as type, size, number of bedrooms and other facilities. The site also has a blog and a news section. The site owner has advertising spaces in the sidebar, which show either featured properties or regular ads, whichever advertisers prefer.
Building company webpage
A website promoting a new community built in Atlanta. The team behind the project was striving to build a thriving community of collaboration, creation, and unique stories. It is intended to unite people and homes, inspire the actions of neighbors, friends, and families, and offer inviting, comfortable spaces: an impromptu conversation in a garden courtyard, an unexpected brainstorm at the coffee shop, a delightful dinner party in an open meadow. The site is built on WordPress with a customized theme. The company team does not need any development skills to add content and news, update existing pages and add new posts.
Data augmentation
The goal of the project was to take data collected by one of our previous scrapers and find additional information about companies in Spain and the people working in them. Company data was stored in MongoDB. We used the company CIF as an identifier to find and grab missing company details (phones, annual accounts) on the infocif.es website. To grab the NIF and website, we built a new scraper, which crawled the axesor.es website and stored the required data in the main database. The final step was to define and save relationships between the companies, which was done with one more scraper for the infoempresa.com website.
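Using the CIF as a join key to fill in missing details can be sketched like this. It is a simplified illustration: the `enrich` function, the field names and the sample values are hypothetical, and plain dictionaries stand in for the MongoDB collection and the scraped pages.

```python
def enrich(companies, extra_by_cif):
    """Fill missing fields of each company from scraped records keyed by CIF."""
    enriched = []
    for company in companies:
        extra = extra_by_cif.get(company["cif"], {})
        merged = dict(company)
        for key, value in extra.items():
            # Only fill gaps; values already present in the database are kept.
            if merged.get(key) in (None, ""):
                merged[key] = value
        enriched.append(merged)
    return enriched
```

Filling only empty fields keeps the existing records authoritative, so a later scrape cannot overwrite data that was already verified.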
Data enrichment
The project consisted of two parts: scraping and data enrichment. We were provided with a list of educational institutions in the USA, and the goal was to find email addresses of people who work at these universities and colleges. We searched Google to find the university and college websites. After that we went through these websites to find contact details and grab emails. The next stage was to determine each email owner’s name and department and link this data in the database. The scraper was implemented in Python with MySQL. The resulting database stores more than 6 million records.
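The email-grabbing step can be illustrated with a short sketch. The regular expression below is a deliberately simple pattern assumed for this example, not the one used in the project, and it will not cover every valid address format.

```python
import re

# Simplified email pattern (an assumption for this sketch, not a full RFC 5322 matcher).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(page_text):
    """Return unique email addresses found in the text, lowercased, in order of appearance."""
    found = (match.lower() for match in EMAIL_RE.findall(page_text))
    # dict.fromkeys removes duplicates while preserving insertion order.
    return list(dict.fromkeys(found))
```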
Idealista data scraper
Idealista is a huge database of properties in Spain with detailed information on each: photos, location, description, territory etc. Our customer requested every listing in two large Spanish provinces. The site is protected against bots, so our script pretends to be an ordinary site visitor, or rather many visitors from different countries using different user agents. To beat the site protection we used a combined PHP and Python solution together with the Crawlera service, which provides different IP addresses automatically. In all cases we used multi-threaded scrapers to grab all necessary data as quickly as possible.
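Presenting each request as a different visitor could be sketched as follows. The profiles and the `build_headers` function are hypothetical and the user agent strings are placeholders. IP rotation, which the project delegated to Crawlera, is not shown here.

```python
import random

# Hypothetical visitor profiles: (user agent string, Accept-Language header).
PROFILES = [
    ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0", "es-ES,es;q=0.9"),
    ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ExampleBrowser/1.0", "en-GB,en;q=0.9"),
    ("Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0", "de-DE,de;q=0.9"),
]


def build_headers(rng=random):
    """Pick a random visitor profile and return HTTP request headers for it."""
    user_agent, language = rng.choice(PROFILES)
    return {"User-Agent": user_agent, "Accept-Language": language}
```

Keeping the user agent and language together in one profile avoids inconsistent combinations that anti-bot systems may flag.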