This is a small project for an individual client. The application requests prices for flight routes generated from a set of cities (airports) and dates. It was implemented with the Scrapy framework in Python 2.7.

Input parameters are read from an .ini file. The user enters IATA location identifiers (3-letter airport codes) as route stops and selects a range of dates. At startup the application generates all permutations of the given locations and applies dates from the range (with a given step) to each permutation. Each resulting route is then searched in a number of air ticket databases. The website reports a busy state for some time, then shows a table of results. The resulting prices, along with the details of each route, are saved as Excel spreadsheets.
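
The route generation step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names are hypothetical, and it assumes that "applying a date" means pairing one start date with each ordering of the stops, which the description does not spell out.

```python
from datetime import date, timedelta
from itertools import permutations


def date_range(start, end, step_days):
    """Yield dates from start to end (inclusive), step_days apart."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=step_days)


def generate_routes(stops, start, end, step_days):
    """Pair every ordering of the stops with every date in the range.

    Assumption: one start date per route; the real date-to-leg mapping
    is not described in the text.
    """
    dates = list(date_range(start, end, step_days))
    return [(d, route) for d in dates for route in permutations(stops)]


# Three stops give 3! = 6 orderings; two dates (1 and 8 January) give 12 routes.
routes = generate_routes(["LHR", "JFK", "NRT"], date(2024, 1, 1), date(2024, 1, 8), 7)
print(len(routes))
```

Because the number of orderings grows factorially with the number of stops, the route count rises quickly: five stops already produce 120 orderings per date.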

The routes for each date, with their extracted prices, are saved in a separate spreadsheet. Each run's spreadsheets are saved in a separate directory named after the run timestamp, so the results are easy to navigate.
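
One possible way to lay out such output paths is shown below. The timestamp format, the `routes_` file prefix, the `.xlsx` extension, and the `results` base directory are all assumptions made for illustration; the description does not state the actual naming scheme.

```python
import os
from datetime import date, datetime


def run_directory(base_dir, started_at):
    """Directory for one run, named after the run's start timestamp."""
    return os.path.join(base_dir, started_at.strftime("%Y-%m-%d_%H-%M-%S"))


def spreadsheet_path(run_dir, route_date):
    """One spreadsheet per route date inside the run directory (hypothetical naming)."""
    return os.path.join(run_dir, "routes_{}.xlsx".format(route_date.isoformat()))


run_dir = run_directory("results", datetime(2024, 1, 1, 12, 30, 0))
path = spreadsheet_path(run_dir, date(2024, 1, 8))
print(path)
```

A timestamp format that sorts lexicographically (year first) keeps the run directories in chronological order in a file browser.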

A set of proxies can be set in the config. In this case the application applies the next proxy from the list to each request, which helps avoid being detected as bot software.
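
The round-robin proxy selection could look something like the sketch below, written in the shape of a Scrapy downloader middleware. The class name and the way the proxy list is passed in are hypothetical. The only Scrapy convention relied on is that a proxy URL placed in `request.meta["proxy"]` is honoured by the built-in HttpProxyMiddleware.

```python
from itertools import cycle


class RotatingProxyMiddleware(object):
    """Assign the next proxy from a fixed list to every outgoing request.

    Hypothetical sketch: the project's own middleware is not shown in the text.
    """

    def __init__(self, proxies):
        # cycle() restarts from the first proxy once the list is exhausted.
        self._proxies = cycle(proxies)

    def process_request(self, request, spider):
        request.meta["proxy"] = next(self._proxies)
        return None  # let Scrapy continue processing the request as usual
```

With two proxies configured, the first, third, and fifth requests go through the first proxy, and the second and fourth through the second.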

All route permutations are displayed when the application starts, so the user can estimate how long the data extraction job will take.

An interesting aspect of the scraping process in this project is that route information is not returned immediately for the input parameters. The portal scans many third-party air ticket databases for the best options, so the process takes some time. The first request returns a session ID, which is then used to monitor the state of the scan. After a number of requests that return a busy state, a JSON object with the resulting table is returned.
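
The session-based polling flow can be illustrated with the plain-Python sketch below. Everything specific in it is an assumption: the `start_search` and `check_status` callables stand in for the real HTTP requests, and the `{"status": "busy"}` reply shape is invented, since the portal's actual JSON format is not described.

```python
import time


def poll_for_results(start_search, check_status, interval=2.0, max_attempts=30,
                     sleep=time.sleep):
    """Start a search, then poll with its session ID until it stops being busy.

    start_search() -> session ID (hypothetical callable wrapping the first request)
    check_status(session_id) -> decoded JSON dict (hypothetical reply shape)
    """
    session_id = start_search()
    for _ in range(max_attempts):
        reply = check_status(session_id)
        if reply.get("status") != "busy":
            return reply  # the reply carrying the results table
        sleep(interval)  # wait before asking again
    raise RuntimeError("search %r still busy after %d attempts"
                       % (session_id, max_attempts))
```

In a Scrapy spider the same logic would normally be expressed by re-issuing the status request from a callback rather than with a blocking sleep, since sleeping stalls Scrapy's event loop. The attempt limit prevents a scan that never finishes from holding up the remaining routes.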