Python script to extract weather data from a website

BenbenIO/wbgt

Wbgt

This project is a small part of my master's thesis at Keio University. The objective of this script is to collect weather forecasts from a website. The target is the WBGT (Wet Bulb Globe Temperature), an index used to characterize heatwaves. I used this data for my heatstroke prevention system. I hope this script is a useful example of how to do "web scraping", so I will explain every step.

The target

We are going to use the Ministry of the Environment website, specifically the page for the Tokyo area (HERE). The website gives us a nice forecast table for the next 3 days. Three days is short, but it is enough for the proof of concept of our system.

We are targeting the 3-day forecast table:

Developer mode

To extract specific data, it is important to see how the webpage is structured and where the data is encoded. To do so, I recommend exploring the page in developer mode/inspector (Ctrl+Shift+C in Chrome or Firefox on Windows/Linux, Cmd+Option+C on macOS). There we can see how the target table matches the underlying page code. (HTML help HERE)

Extracting the data

To get the webpage content, we use the requests library. We simply send a request to the target URL and use Beautiful Soup to do all the preprocessing :)
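A minimal sketch of this step could look like the block below. The URL is a placeholder (the real forecast page address is not written in this README), and the function names `parse_page` and `fetch_page` are only illustrative, not the names used in the script.

```python
import requests
from bs4 import BeautifulSoup

# Placeholder URL (assumption): replace it with the real forecast page.
FORECAST_URL = "https://example.com/wbgt/tokyo"


def parse_page(html: str) -> BeautifulSoup:
    """Parse raw HTML with Python's built-in parser."""
    return BeautifulSoup(html, "html.parser")


def fetch_page(url: str, timeout: float = 10.0) -> BeautifulSoup:
    """Download a page and return it as a BeautifulSoup tree."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise an exception on HTTP 4xx/5xx
    return parse_page(response.text)


if __name__ == "__main__":
    soup = fetch_page(FORECAST_URL)
    print(soup.title)
```

Keeping the parsing separate from the download makes it possible to try the parsing on a saved copy of the page without hitting the website each time.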

Then we extract the wanted table's data with the find_all function, the HTML tag "tr", and the class used to display the data. We extract the data as 3 strings, one for each forecast day: D0 for today / D1 for tomorrow / D2 for the day after tomorrow. (I really think this could be done in a smarter way, but I am not good enough at scraping to find a solution quickly :D) I find it easier to work with strings.
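Here is a small sketch of the idea on a made-up table. The class name `forecast-row`, the sample HTML, and the helper `extract_day_strings` are all assumptions for illustration; the real page uses its own class name, which you can read in the inspector.

```python
from bs4 import BeautifulSoup

# Made-up sample (assumption): the real table has its own class and layout.
SAMPLE_HTML = """
<table>
  <tr class="forecast-row"><td>07/01</td><td>25</td><td>28</td><td>27</td></tr>
  <tr class="forecast-row"><td>07/02</td><td>26</td><td>30</td><td>29</td></tr>
  <tr class="forecast-row"><td>07/03</td><td>24</td><td>27</td><td>26</td></tr>
  <tr class="other"><td>ignored</td></tr>
</table>
"""


def extract_day_strings(html, row_class="forecast-row"):
    """Return one newline-separated string per matching table row."""
    soup = BeautifulSoup(html, "html.parser")
    rows = soup.find_all("tr", class_=row_class)
    # Join the cells of each row with "\n": the date first, then the values.
    return [row.get_text(separator="\n", strip=True) for row in rows]


d0, d1, d2 = extract_day_strings(SAMPLE_HTML)
print(repr(d0))
```

Each string starts with the date, followed by one WBGT value per line.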

Getting the average on each day

The following code computes the average for each day. I used the fact that the date and each WBGT value are separated by "\n", and used a nested loop to compute the average.
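As a sketch, the averaging can be written as below. Note that this version uses `split` instead of the nested loop from the script, and that the function name `daily_average` and the sample strings are assumptions.

```python
def daily_average(day_string):
    """Split a day string into its date and the mean of its WBGT values."""
    # The first line is the date, every following line is one WBGT value.
    date, *values = day_string.split("\n")
    readings = [float(v) for v in values if v.strip()]
    if not readings:
        raise ValueError(f"no WBGT values found for {date!r}")
    return date, sum(readings) / len(readings)


# Hypothetical day strings, in the D0 / D1 / D2 order.
days = ["07/01\n24\n27\n30", "07/02\n26\n30\n28", "07/03\n25\n25\n25"]
for day in days:
    date, average = daily_average(day)
    print(date, round(average, 1))
```

Empty lines are skipped, and a day without any value raises an error instead of silently dividing by zero.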

Data file creation/update

Finally, we update the data file with today's data (or create a new one if no file exists yet).
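One simple way to do this is to append a row to a CSV file and write the header only when the file is new. This is a sketch under assumptions: the file name `wbgt_data.csv`, the column names, and the function `update_data_file` are illustrative, and the script may store its data in another format.

```python
import csv
from pathlib import Path


def update_data_file(path, date, average):
    """Append one (date, average) row, creating the file if needed."""
    file_path = Path(path)
    is_new = not file_path.exists()
    with file_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        if is_new:
            # Write the header only once, when the file is created.
            writer.writerow(["date", "wbgt_average"])
        writer.writerow([date, f"{average:.1f}"])


if __name__ == "__main__":
    # Hypothetical file name and values, for illustration only.
    update_data_file("wbgt_data.csv", "07/01", 27.0)
```

Opening the file in append mode keeps the earlier days, so running the script once a day builds up the history.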

After a few days

We can plot the updated WBGT values:

In this repository, you can find the Python script. Do not hesitate to contact me if you have any questions or advice :)
