Precipitation forecasting is the application of science and technology to predict the amount of precipitation in a region. Accurate rainfall estimates matter for efficient water use, crop productivity, and advance planning of water structures.
In this article, we will use linear regression to predict rainfall. The fitted model estimates how many inches of precipitation we can expect for a given set of weather conditions.
The dataset is a publicly available weather dataset for Austin, Texas, hosted on Kaggle.
Data cleansing:
Data comes in all forms, most of them messy and unstructured, and it is rarely ready to use. Datasets big and small come with many problems: invalid fields, missing and optional values, and values in a form other than the one we want. To bring the data into a workable, structured form, we need to "cleanse" it and prepare it for use. Common cleanup steps include parsing, converting categorical values to one-hot encoding, and deleting unnecessary data.
In our case, the data has several days on which some factors were not recorded. In addition, the amount of precipitation in inches is marked as T when there were only traces of precipitation. Our algorithm requires numbers, so it cannot work with the letters that appear in the data. We therefore need to clean up the data before applying it to our model.
Clean up the data in Python:
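A minimal sketch of this cleanup, under stated assumptions: the column names (`TempAvgF`, `HumidityAvgPercent`, `PrecipitationSumInches`, `Events`, `Date`) and the `-` marker for missing values are assumptions about the Kaggle file, and a tiny in-memory sample stands in for the CSV so that the snippet runs on its own. Replacing missing values with 0 is a simplification; dropping those rows is another option.

```python
import pandas as pd

# Tiny in-memory stand-in for the Kaggle CSV.
# Column names and the "-" missing-value marker are assumptions.
raw = pd.DataFrame({
    "Date": ["2013-12-21", "2013-12-22", "2013-12-23"],
    "TempAvgF": ["60", "48", "-"],
    "HumidityAvgPercent": ["75", "68", "52"],
    "PrecipitationSumInches": ["0.46", "T", "0"],
    "Events": ["Rain , Thunderstorm", " ", " "],
})

# Drop columns the regression cannot use directly (dates, free text).
data = raw.drop(columns=["Date", "Events"])

# Replace the trace marker "T" and the missing marker "-" with "0",
# then convert every remaining column to a numeric type.
data = data.replace({"T": "0", "-": "0"}).astype(float)

print(data)
```

With the real file, `raw` would instead come from `pd.read_csv` on the downloaded CSV, and the cleaned frame could be saved with `data.to_csv` for the modeling step.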
Once the data has been cleaned up, it can be used as input to our linear regression model. Linear regression is a linear approach to modeling the relationship between a dependent variable and a set of independent explanatory variables. It fits the line that best matches the scatter plot of the data, that is, the line with the smallest error. Substituting the independent values into the equation of that line then gives a prediction of the dependent value, i.e. how much.
We will use scikit-learn's linear regression model to train on our dataset. Once the model is trained, we can provide our own values for columns such as temperature, dew point, and pressure to predict precipitation from these attributes.
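To make the idea of a best-fit line concrete, here is a small sketch of ordinary least squares for a single explanatory variable, written in plain Python. The function names `fit_line` and `predict` and the humidity and rainfall numbers are made up for illustration; they are not taken from the Austin dataset.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Substitute x into the line equation y = slope * x + intercept."""
    return slope * x + intercept


# Made-up sample: average humidity (%) versus precipitation (inches).
humidity = [50.0, 60.0, 70.0, 80.0]
rain = [0.1, 0.3, 0.5, 0.7]

slope, intercept = fit_line(humidity, rain)
print("slope:", slope, "intercept:", intercept)
print("prediction at 90% humidity:", predict(slope, intercept, 90.0))
```

With several explanatory variables the same principle applies, but the line becomes a plane or hyperplane and the coefficients are usually computed by a library.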
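The training and prediction steps might look like the sketch below. It uses synthetic data generated from a hypothetical linear rule instead of the Austin dataset, and the three feature columns (temperature, dew point, pressure) and the sample values are assumptions chosen for illustration, so it will not reproduce the numbers reported in this article.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic features: temperature (F), dew point (F), pressure (inches Hg).
X = rng.uniform([40.0, 30.0, 29.5], [100.0, 75.0, 30.5], size=(50, 3))

# Hypothetical linear rule used only to generate a target for the demo.
true_coef = np.array([0.01, 0.02, -0.5])
y = X @ true_coef + 15.0

# Fit ordinary least squares on the synthetic data.
model = LinearRegression()
model.fit(X, y)

# Predict precipitation for one made-up day.
sample = np.array([[74.0, 60.0, 29.9]])
predicted = model.predict(sample)
print("The precipitation in inches for the input is:", predicted)
```

Because the synthetic target has no noise, the fitted coefficients match the generating rule; on real weather data the fit would only be approximate.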


Output:
The precipitation in inches for the input is: [[1.33868402]]
The precipitation trend graph:
Precipitation graph against selected attributes:
A day with about 2 inches of precipitation (shown in red) is tracked across several parameters, such as temperature and pressure. The x-axis denotes days, and the y-axis denotes the magnitude of the attribute being plotted. The graphs suggest that precipitation tends to be high when both temperature and humidity are high.