A new statistical forecast model developed by the India Meteorological Department predicts a normal monsoon for 2003.

AFTER successful predictions for 14 years running, the India Meteorological Department's statistical model for the long-range forecast (LRF) of the monsoon failed enormously last year (Table 1). It was off by 20 per cent against the claimed model error of +/- 4 per cent. Following this, the IMD has come up with entirely new models for predicting this year's rainfall. The revamped models permit a forecast in mid-April itself as against the fourth week of May in the case of the earlier model. Besides, the new scheme enables a mid-course correction in July, which could help minimise the impact of a deficient monsoon on agriculture. The forecast for the 2003 southwest monsoon that was issued with the help of the new models on April 16 is that it is likely to be "normal".

The above may seem contradictory to other media reports on the IMD forecast which have, by and large, said that it would be a deficient monsoon.

The reason for the differing perceptions with regard to the 2003 monsoon is not far to seek. The IMD has predicted that the total quantum of rainfall across the country will be 96 per cent of the long-term average, or LTA (with an inherent model error of +/-5 per cent). However, for this prediction, the IMD adopted a different nomenclature for the various rainfall categories, apart from using new models. Until now, there were three categories: `normal,' which is defined as 90-110 per cent of the LTA; `excess,' which is defined as more than 110 per cent; and `deficient,' which is defined as less than 90 per cent. The IMD has now replaced these with five categories: `drought' (less than 90 per cent of the LTA); `below normal' (90 to 97 per cent); `near normal' (98 to 102 per cent); `above normal' (103 to 110 per cent) and `excess' (more than 110 per cent). As per these new categories, 96 per cent rainfall is `below normal'. But going by the customary definition in use until last year, 96 per cent is a "normal" monsoon.
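The effect of the relabelling can be checked mechanically. The sketch below encodes both sets of thresholds as quoted above. The function names are illustrative, and the code assumes that rainfall is rounded to a whole-number percentage of the LTA, since the published bands leave gaps between, say, 97 and 98 per cent.

```python
def classify_old(pct_of_lta: float) -> str:
    """Three-category scheme in use until last year."""
    pct = round(pct_of_lta)  # assumption: bands apply to whole percentages
    if pct < 90:
        return "deficient"
    if pct <= 110:
        return "normal"
    return "excess"


def classify_new(pct_of_lta: float) -> str:
    """Five-category scheme introduced with the 2003 forecast."""
    pct = round(pct_of_lta)  # assumption: bands apply to whole percentages
    if pct < 90:
        return "drought"
    if pct <= 97:
        return "below normal"
    if pct <= 102:
        return "near normal"
    if pct <= 110:
        return "above normal"
    return "excess"


if __name__ == "__main__":
    forecast = 96  # the IMD's 2003 forecast, per cent of the LTA
    print("old scheme:", classify_old(forecast))
    print("new scheme:", classify_new(forecast))
```

The same figure of 96 per cent lands in `normal' under the old scheme and in `below normal' under the new one, which is the whole source of the confusion.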

While the use of new models to improve the quantitative forecast is fine, the change of nomenclature in an operational forecast was unnecessary. More than the absolute figures, the terms used for the rainfall categories influence the impression of the public at large. The earlier definition of "normal" accounted for a +/-10 per cent window around the average. Perhaps this stems from what is known in statistical terms as the `standard deviation (SD)' of monsoon rainfall. The SD measures how tightly the events are clustered around the average. Historical monsoon data suggest that the rainfall data points largely bunch around the average within a 10 per cent window, which corresponds to roughly 1 SD. This bunching implies that nearly 70 per cent of the time the monsoon would be within the `normal' window. For this reason, the string of successes of the earlier IMD model in predicting a `normal' monsoon is not surprising. Also, in the 1950s, it was seen that within 1 SD or the 10 per cent window, the total grain output (TGO) remained roughly the same. However, with the Green Revolution and greatly changed farming practices, this does not hold good any more.
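The figure of nearly 70 per cent follows if seasonal rainfall is assumed to be roughly normally distributed, an assumption made here only for illustration. Under that assumption the share of years falling within 1 SD of the average can be computed directly; the function name below is hypothetical.

```python
import math


def prob_within(k_sd: float) -> float:
    """Probability that a normally distributed value lies within
    k_sd standard deviations of its mean."""
    return math.erf(k_sd / math.sqrt(2.0))


if __name__ == "__main__":
    # Assumption: 1 SD of monsoon rainfall is about 10 per cent of the LTA,
    # so the old `normal' window of 90-110 per cent is roughly +/-1 SD.
    print(f"within 1 SD: {prob_within(1.0):.1%}")
```

A forecaster who simply said `normal' every year would therefore be right about two years in three, without any model at all.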

The problem has been compounded by the fact that besides predicting absolute rainfall, a probabilistic model has been used to give the relative likelihood of the 2003 monsoon being in one of the above five categories. According to the IMD, the new categories were necessary for the development of the probabilistic model.

In any statistical exercise, there will be a finite probability of the outcome being in any of these categories. The confidence with which a prediction is made depends on how well one is able to estimate these relative probabilities. The relative chances of this year's monsoon turning out to be in each of these categories have been given as 21 per cent for a `drought', 39 per cent for `below normal' rainfall, 14 per cent for `near normal' rainfall, 23 per cent for `above normal' rainfall and 3 per cent for `excess' rainfall.

It is important to state this from a technical perspective, but from the point of view of an operational forecast meant for public consumption, a mere word of caution about the uncertainties inherent in a statistical prediction exercise should have sufficed. If the numbers had been communicated in simpler terms, the confusion evident in news reports could have been avoided. What these relative probabilities imply is that the chance of the rainfall turning out to be as predicted (namely between 91 per cent and 101 per cent of the LTA) is a little over 50 per cent. Indeed, the probability of it being a "normal" monsoon as per the earlier definition (90 per cent to 110 per cent of the LTA) is a high 76 per cent. In short, this year's monsoon is likely to be reasonably good according to the forecast and not a deficient one as is feared widely.
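The arithmetic behind these two figures is a matter of adding up the IMD's category probabilities. The sketch below uses the published numbers; treating the forecast range of 91 to 101 per cent as the `below normal' and `near normal' bands combined is an approximation, since the bands actually span 90 to 102 per cent.

```python
# Category probabilities (per cent) issued by the IMD for the 2003 monsoon.
probabilities = {
    "drought": 21,
    "below normal": 39,
    "near normal": 14,
    "above normal": 23,
    "excess": 3,
}

# The five categories are exhaustive, so the chances must add up to 100.
total = sum(probabilities.values())

# Approximation: the forecast range (91-101 per cent of the LTA) is taken
# to coincide with the `below normal' and `near normal' bands.
near_forecast = probabilities["below normal"] + probabilities["near normal"]

# The old `normal' window (90-110 per cent) covers the three middle bands.
old_normal = near_forecast + probabilities["above normal"]

if __name__ == "__main__":
    print("total:", total)
    print("chance of rainfall near the forecast value:", near_forecast)
    print("chance of `normal' under the old definition:", old_normal)
    print("chance of a deficient monsoon:", probabilities["drought"])
```

On these numbers a deficient monsoon, at 21 per cent, is between three and four times less likely than a `normal' one under the old definition.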

The IMD was not alone in not being able to predict last year's monsoon behaviour (Frontline, July 30, 2002). The long-range predictions of almost all major forecasting centres of the world failed. Some centres, such as the National Centres for Environmental Prediction (NCEP) of the National Oceanic and Atmospheric Administration (NOAA) in the United States, did, however, pick up the signals of a deficient monsoon based on changing atmospheric data in June. The drought of 2002 prompted the IMD to address the shortcomings and methodological flaws of the model, a concern that has been long voiced by the scientific community (Frontline, July 13, 2000 and November 9, 2001). All along, the response of the IMD to these criticisms seems to have been on the lines of "Why rock the boat?" because the model's prediction had been right, at least in the gross, for 14 monsoons up to 2001.

Since 1988, the IMD had been using a parametric statistical model, which relates the total quantum of the monsoon (the predictand) to the values of 16 global and regional meteorological variables relating to temperature, pressure and wind during the months from December to May (the predictors), through a "power regression" equation (Table 2). The model's success was only apparent; in strict quantitative terms it had failed as many as nine times. In 2002, it failed even in the gross. The model predicted a "normal" monsoon with 101 per cent rainfall, whereas the actual rainfall was only 81 per cent. This jolted the IMD into re-examining its deeply flawed model.
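The equation itself is not reproduced here. Purely as an illustration, one form in which a power regression is often written adds a constant to a sum of terms, each being a coefficient times a predictor raised to its own exponent. The sketch below assumes that form; the function name and every number passed to it are made up and are not the IMD's values.

```python
from typing import Sequence


def power_regression(
    c0: float,
    coeffs: Sequence[float],
    exponents: Sequence[float],
    predictors: Sequence[float],
) -> float:
    """Evaluate c0 + sum(c_i * x_i ** p_i).

    Assumed functional form, for illustration only. Predictor values are
    expected to be non-negative so that fractional exponents stay real.
    """
    if not (len(coeffs) == len(exponents) == len(predictors)):
        raise ValueError("coeffs, exponents and predictors must match in length")
    return c0 + sum(c * x ** p for c, p, x in zip(coeffs, exponents, predictors))


if __name__ == "__main__":
    # Hypothetical two-predictor example, not real model parameters.
    print(power_regression(1.0, [2.0, 1.0], [1.0, 0.5], [3.0, 4.0]))
```

With 16 predictors, such an equation has 33 adjustable numbers (16 coefficients, 16 exponents and a constant) in this assumed form, which gives a sense of why critics thought the data set too small to pin them down.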

A major criticism of the earlier model was that the mutual independence of the parameters chosen was never established and that there were too many of them, given the small data set of 37 years of monsoon. Another reason for a critical re-evaluation of the old parameters was the increasing evidence that over the past two decades the relationship of many of the parameters to the annual rainfall had been weakening and they were losing their predictive potential. Although four parameters were changed in 2000, this did not improve matters much.

Six variables pertaining to the April-May period and four variables pertaining to the winter months were found to be declining in their correlations with the monsoon rainfall and were, therefore, rejected. An extensive search for new stable parameters was carried out to evolve a new model, and the number of parameters was whittled down from 16 to 10. Essentially, four new parameters have been added to the six that remained from the old set (Table 3). According to the IMD, the 10 parameters of the new set have been found to be statistically stable over 38 years (1958-95) of data. Apparently, these have been validated over the seven-year period from 1996 to 2002 and were found to be robust in their correlation to the monsoon rainfall. Validation of a model on an independent data set is important for statistical models, and this was not done with the earlier model; its performance in the gross was taken as its validation. In the case of the monsoon, with correlations changing over time, such validation becomes even more important.
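Validation on independent data means holding back some years when the model is fitted and scoring it only on those. A minimal sketch of such a split follows, assuming that the seven validation years are the ones that follow the 1958-95 fitting period; the helper names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def split_by_year(
    years: Sequence[int], last_training_year: int
) -> Tuple[List[int], List[int]]:
    """Split years into a fitting set and a later, held-out validation set."""
    train = [y for y in years if y <= last_training_year]
    test = [y for y in years if y > last_training_year]
    return train, test


def rmse(forecasts: Sequence[float], actuals: Sequence[float]) -> float:
    """Root-mean-square error of forecasts on the held-out years."""
    if len(forecasts) != len(actuals) or not forecasts:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(f - a) ** 2 for f, a in zip(forecasts, actuals)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    years = list(range(1958, 2003))  # 1958 to 2002 inclusive
    train, test = split_by_year(years, last_training_year=1995)
    print(len(train), "fitting years;", len(test), "validation years")
```

The point of the split is that the validation years play no part in choosing the parameters, so the error measured on them is an honest estimate of how the model behaves on a year it has not seen.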

Unprecedented atmospheric changes in June seem to have been responsible for the failed monsoon last year. Indeed, the behaviour of the 2002 monsoon was intriguing. While the onset over Kerala was on time and the rainfall in June was above normal (+4 per cent), July recorded the lowest rainfall in history (a departure of -49 per cent). A substantial revival of the monsoon in August (-4 per cent) and September (-10 per cent) saved the situation and brought the overall deficit to 19 per cent, a figure comparable to the droughts of 1979 and 1987.

July is usually the wettest month and the most critical for kharif crops. While there is no one-to-one relationship between a shortfall in July rainfall and a shortfall in crop output, the gross impact of a drought in July on agriculture is clear. An update of the forecast, or a mid-course correction in early July based on atmospheric changes in June and the monsoon behaviour up to June, will be immensely useful from the point of view of agriculture.

It is generally believed that the combined effect of El Nino - the anomalous warming of Eastern Pacific waters near the Peruvian coast (Nino 1+2) and its westward motion to the Central Pacific (Nino 3+4) - and the Southern Oscillation (the pressure see-saw between Darwin and Tahiti), known as ENSO, adversely impacts monsoon behaviour. Though this correlation is not an absolute one - the 1997 monsoon was a good one despite that year experiencing the strongest El Nino of the century, and the 2002 drought occurred despite a very weak El Nino - the temperature anomaly over the Pacific is one of the key factors influencing the monsoon. Similarly, sea surface temperatures (SSTs) over the South Indian Ocean are believed to play a significant role in monsoon dynamics.

A statistical model developed by A.K. Sahai at the Indian Institute of Tropical Meteorology (IITM), Pune, using only global SSTs, predicted an 11-12 per cent rainfall deficit last year. Sahai was not willing to disclose the details of this year's forecast because his is not an operational forecast. However, the robustness of a model that is based only on SSTs of 18 regions of the world's oceans is yet to be proven. The exact cause for the 2002 drought remains unclear. However, the revised prediction of the NCEP in June 2002 indicating a deficient monsoon and hindsight predictions based on General Circulation Models (GCMs) by various groups in the country seem to suggest that anomalous changes in the Indian Ocean and Pacific SSTs in June were responsible for the unusually dry break period in July. This was the main point that emerged at the brainstorming session held at the Indian Institute of Science (IISc), Bangalore, in November 2002.

Thus, the four new parameters that have been identified include two which are related to SST, the Nino (3+4) temperature and the South Indian Ocean SST. The other two are the 850 millibar (mb) wind shear between the lower and upper atmosphere over the South Indian Ocean, taken at a height of about 2 km and a consequence of the warming of the oceans, and the N-W Europe temperature. Of these, the values of the Nino (3+4) temperature and the 850 mb wind parameters are recorded in June. These two June variables can be used to tweak the prediction in case a mid-course correction is warranted. With the April-May variables done away with and the two June variables reserved for the mid-monsoon update, an eight-parameter model can be built to predict rainfall in mid-April itself. This is what the IMD has done this year instead of the customary issuance of the forecast on May 25.

The IMD has evolved four models - an eight-parameter power regression model and an eight-parameter probabilistic model for an April forecast and an eight-parameter power regression model and a 10-parameter power regression model for a mid-July update. The three power regression models have inherent statistical error margins. The eight-parameter forecast in April has an error bar of 5 per cent, the 10-parameter forecast in July has an error margin of 4 per cent and the eight-parameter forecast in July has an error margin of 9 per cent.

A curious and an unsettling feature of the new parameter set is that it has no parameter related to the atmosphere over the Indian region. IMD scientists argue that for longer lead times, global rather than regional parameters play the stronger role in the behaviour of the monsoon. While this may have some validity, once the monsoon sets in, its wet and dry spells (or breaks) and the inter-annual variability of rainfall are governed by internal dynamics as well. So, at least for the July forecast, when the monsoon is well set, one would expect any mid-course correction to involve the behaviour of some key regional variables. Apparently, there is a lot of "noise" in the statistical correlation when regional variables are included.

The NCEP's forecast for the entire monsoon period shows rainfall to be substantially deficient. Its forecasts are based on a GCM, which is driven by SSTs predicted by the NCEP's coupled ENSO forecast system. The final forecast is an average of predictions made with 20 different observed initial conditions. Warm conditions (an anomaly of +0.5 degrees C) currently prevail over Nino (3+4). The NCEP's forecast of Nino (3+4) temperatures shows an initial decline during May-June and then an increase during July-September.
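Averaging runs started from different initial conditions is what forecasters call an ensemble mean. The sketch below shows only the averaging step. The 20 member values are randomly generated placeholders, not NCEP output, and the function name is hypothetical.

```python
import random
import statistics
from typing import Sequence


def ensemble_mean(member_forecasts: Sequence[float]) -> float:
    """Average the forecasts of individual ensemble members."""
    if not member_forecasts:
        raise ValueError("at least one ensemble member is required")
    return statistics.fmean(member_forecasts)


# Placeholder data: 20 made-up seasonal rainfall forecasts (per cent of the
# LTA), one per set of initial conditions. These are NOT real model results.
rng = random.Random(0)
members = [90.0 + rng.gauss(0.0, 5.0) for _ in range(20)]

if __name__ == "__main__":
    print(f"ensemble mean: {ensemble_mean(members):.1f}")
    print(f"spread (SD):   {statistics.stdev(members):.1f}")
```

The spread among the members matters as much as the mean: a wide spread says the model's answer depends heavily on the starting conditions, and the averaged forecast deserves correspondingly less confidence.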

Clearly, it is this positive El Nino anomaly that has led to the forecast of a deficient monsoon.

However, it appears that the El Nino trends are not clear as yet. Nevertheless, the NCEP forecast seems to have worried the IMD. Apparently, it has sought more details from the NCEP.

What does a "normal" monsoon (with its broad window of +/-10 per cent) mean for agricultural planning? Vagaries of the monsoon affect the rain-fed kharif crop directly and determine the charging of reservoirs and soil moisture conditions necessary for the rabi crop. But the vulnerability or sensitivity of individual crops to variations in rainfall differs considerably. In the post-Green Revolution period, the percentage variation in the TGO for a 1 per cent deviation of rainfall from the mean (elasticity) has increased from 3 to 5 per cent. That is, crops have become more vulnerable to fluctuations in rainfall.

What is important from the perspective of agricultural strategy planning is not a gross prediction of whether the rainfall will be normal, but its spatial and temporal distribution, which is not within the means of prediction at present, given the extreme complexity of the monsoon system.

For instance, consider the monsoons of 2000 and 2001. The actual rainfall was the same, namely 92 per cent, the lower end of the normal window. And yet, the TGO was vastly different. Food productivity in 2001 was quite good despite apparently poor rainfall. That is, meteorological drought does not necessarily mean an agricultural drought. Similarly, the hydrological situation is not just a function of the year's rainfall. The reservoir status, for example, would depend on the monsoons for the previous couple of years as well.

From the hydrological perspective, this year's monsoon will be critical because the country has had subdued monsoons for four consecutive years since 1999. The reservoir situation and ground water charging would be definitely below optimum. If this year too it turns out to be a subdued monsoon, the country may be headed for a hydrological drought. So, from the perspective of an organisation like the Central Water Commission (CWC), a gross prediction of the monsoon is not very relevant.

The IMD would do well to provide the absolute figures of the predicted rainfall rather than categorise rainfall into different broad categories. Such categories are significant only from the perspectives of researchers and statisticians.