My website scrape quest.

Recently I went through a hiring process, and during the second round of interviews I was asked to pull data from a particular website and compute the average of one column, subject to a few constraints. That may not be the clearest description, but hopefully it will make sense as we go. Below is a screenshot of the website and some of the information I needed.

In total there were roughly 1,200 records like this. I needed to write an application that would take the 125 highest-paid players, find their average pay, and show that output to a user. I knew there were many ways to approach this, and the way I chose might not be the fastest, but I wanted to try a few new things.

The first step was to get all of this information into JSON format so that I could manipulate the data with built-in JavaScript methods. To do this, I used a program I found while googling the best ways to scrape a website. The program is called ParseHub.

After installing it, I was ready to scrape the site and get the information I needed to begin manipulating the data. Here is a screenshot of the program once it is loaded and the website has been scraped.

As you can see, at the bottom there is a tab to get this information back in JSON format. There is a little more to it than that, since the columns have to be matched up, but luckily this site wasn’t too detailed. Once I had the information in JSON format, I had to decide what to do with it. On to React…

To show this information to a user, I decided to make a small React application that displays the average salary of the top 125 players on the screen. To do this, I first had to make a data.json file to pull the information from. I kept all of my code in the App.js file because this was going to be a very small application.

The first step I undertook was to find the top salaries of all the players. Below is a small sample of the JSON that I was going to work with:

I decided to use the JavaScript array method sort to order the players by salary.
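The original code sample did not survive here, so what follows is only a minimal sketch of the sorting step. The field names (name, salary), the sample records, and the helper functions parseSalary and topPaid are all hypothetical; the real keys depend on how the columns were named in ParseHub. It also assumes salaries arrive as strings like "$4,250,000" and treats a blank salary as 0, which may not match the real data.

```javascript
// Hypothetical sample records; the real scrape had roughly 1,200 of these
// and its field names may differ.
const players = [
  { name: "Player A", salary: "$1,500,000" },
  { name: "Player B", salary: "$4,250,000" },
  { name: "Player C", salary: "" },
  { name: "Player D", salary: "$900,000" },
];

// Convert a scraped value such as "$4,250,000" to a number.
// Assumption: a blank value counts as 0.
function parseSalary(text) {
  const digits = String(text).replace(/[^0-9.]/g, "");
  return digits === "" ? 0 : Number(digits);
}

// Return the n highest-paid players, highest first.
// map() builds new objects, so the original array is left untouched.
function topPaid(records, n) {
  return records
    .map((p) => ({ ...p, salary: parseSalary(p.salary) }))
    .sort((a, b) => b.salary - a.salary) // descending by salary
    .slice(0, n);
}

const top2 = topPaid(players, 2);
console.log(top2.map((p) => p.name)); // Player B, then Player A
```

Passing a compare function to sort matters: without one, sort compares values as strings, so numeric salaries would not come out in numeric order. For the real data, the call would be topPaid(players, 125).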

Now that I knew the salaries of the top 125 players, I put only that information into my data.json file. I then needed a way to get the average salary of these players. I decided to use both Math.round and the reduce method to get it:
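The original snippet is not shown, so this is a rough sketch of how reduce and Math.round can be combined for the average. The topSalaries array and the averageSalary helper are hypothetical stand-ins for the 125 values in data.json, and the sketch assumes the salaries are already plain numbers.

```javascript
// Hypothetical salaries, already numeric; stand-ins for the values in data.json.
const topSalaries = [4000000, 2000000, 1000000];

// Add everything up with reduce, divide by the count,
// and round to the nearest whole number with Math.round.
function averageSalary(salaries) {
  if (salaries.length === 0) return 0; // avoid dividing by zero
  const total = salaries.reduce((sum, s) => sum + s, 0);
  return Math.round(total / salaries.length);
}

const avg = averageSalary(topSalaries);
console.log(avg); // 2333333
```

The second argument to reduce (the 0) is the starting value for the running sum. Supplying it keeps the call safe on an empty array, where reduce would otherwise throw.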

With the average calculated, I put {avg} into my return statement to display it on the screen. This was just a quick look at one way to scrape a website, turn the information into JSON, and then manipulate the data in a couple of ways. Hope you enjoyed!
