DST’s Super Model: Or How not to Model an Epidemic

The Department of Science and Technology (DST) has sponsored a model, described as a “Super Model”, which, according to government hype, shows that India’s lockdown was highly successful and that we have perhaps even reached herd immunity.

The media has widely reported that according to the DST Super Model, the lockdown saved more than two million lives, and by February 2021, even without a vaccine, the epidemic will be over in India. The only saving grace is that it does not suggest doing away with masks, social distancing, hand washing and all the other precautions recommended everywhere, except among Donald Trump’s followers.

Most models that have been used (and there have been a large number of COVID-19 models) work for short periods and not beyond. This is because epidemics are complicated in real life: they depend on human-to-human contact, our mobility, the density of population, and a host of other factors.

Models are useful for worst-case scenarios and for telling us what measures we can take to reduce transmission of the virus, but they have little predictive value in terms of actual numbers beyond two to three weeks. This has been the experience with all the different kinds of models. SEIR (susceptible, exposed, infectious, recovered) models, of which the DST Super Model is a variant, are data-driven models which look only at data. More complex, agent-driven models try to model human interaction and behaviour.
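To see why numbers from such models drift so quickly, here is a minimal sketch of a textbook SEIR model in Python. It is not the Super Model, and every value in it (the transmission rate `beta`, the incubation rate `sigma`, the recovery rate `gamma`, the population and the starting cases) is a hypothetical chosen only for illustration. Two runs differ only by a modest change in `beta`, and the script compares them at two weeks and at two months.

```python
def simulate_seir(beta, sigma, gamma, population, initial_infectious, days, dt=0.1):
    """Forward-Euler integration of a basic SEIR model.

    Returns a list of (S, E, I, R) tuples, one per day, starting at day 0.
    """
    s = float(population - initial_infectious)
    e, i, r = 0.0, float(initial_infectious), 0.0
    history = [(s, e, i, r)]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt  # S -> E
            new_infectious = sigma * e * dt               # E -> I
            new_recovered = gamma * i * dt                # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


# Hypothetical values for illustration only; none come from the Super Model paper.
POPULATION = 1_000_000
low = simulate_seir(0.30, 1 / 5, 1 / 7, POPULATION, 10, days=60)
high = simulate_seir(0.36, 1 / 5, 1 / 7, POPULATION, 10, days=60)


def infectious_ratio(day):
    """Ratio of infectious counts between the two runs on a given day."""
    return high[day][2] / low[day][2]


for day in (14, 60):
    print(f"day {day}: high/low infectious ratio = {infectious_ratio(day):.2f}")
```

In this toy setting the gap between the two runs keeps widening with time, which is the sense in which a small uncertainty in one parameter can swamp any forecast beyond a few weeks.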

If we want to prove a hypothesis, there are a number of hooks in the model called parameters, which we can tweak to reach the conclusion we want. The more parameters a model has, the more it can be “tortured” to tell us what we want to hear. In this case, the DST Super Modellers wanted to hear that the Modi government has done a super job and all will be hunky dory in a few more months. The Super Model paper, published by the Indian Journal of Medical Research and widely reported in the media, “proves” precisely that: without a lockdown there would have been more than 2.6 million deaths, and the epidemic will now end by February. Remember the earlier Boston Consulting model, quoted extensively by the government, which said similarly nice things?

According to the slides that Professor M. Vidyasagar, the Chairperson of DST’s Super Model Committee, presented to the media, the Committee had 10 members. Only three of them, Professor Manindra Agrawal, Lt Gen. Madhuri Kanitkar and Professor M. Vidyasagar, are authors of the Super Model paper. As Professor Gautam Menon has pointed out, none of the three is an epidemiologist, and that shows in their understanding of the COVID-19 epidemic.

Let us dig a little more into the paper. The key departure of this “Super Model” from other similar models is that it divides the people who are exposed into two categories: one group of asymptomatic people who can infect others, and another group of those who are infected and have symptoms. No explanation has been offered for why these sets have been created, or what the basis of such a division is. Instead, the paper argues that standard models do not account for “…asymptomatic patients, which were a novel feature of COVID-19.”

This is an extraordinary claim. In the case of any infectious disease, a number of people will come into contact with an infected person, and only a small fraction will get infected. This depends on a number of factors, including the amount of virus particles shed, the time duration and closeness of the contact and the immunity of the person exposed. All of these play a part in transmission of the infection, and whether the infected person will show symptoms.

Dr. Satyajit Rath, formerly of the National Institute of Immunology, and Professor Gautam Menon (in The Hindu) have pointed out that no other research group working on COVID-19 has separated people into asymptomatic and symptomatic in this way. Neither the Committee nor the subset of it that authored the paper has advanced any reason for dividing people into symptomatic and asymptomatic groups.

The Super Model authors then proceed to create a model that has four parameters, and uses these to “fit” the past and “predict” the future. But if we dig a little deeper, these four parameters have different values over six phases, each of two weeks, giving us a total of 24 parameters and not four as claimed in the paper!

For a lay person, what difference does it make whether it is four or 24? It matters because the more parameters a model has, the easier it is to “force-fit” any conclusion we want from the data. In other words, the greater the number of parameters, the lower the model’s explanatory value and the easier it is to manipulate it towards a desired result. A 24-parameter model is simply ludicrous and makes a mockery of modelling as a realistic exercise in understanding the past and predicting the future.
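The force-fitting problem can be illustrated with a deliberately simple toy in Python. This is not the Super Model: it is an exponential growth curve with one freely chosen growth rate per two-week phase, and the two case “histories” are invented numbers that contradict each other. The function names and the starting value of 1,000 cases are likewise hypothetical.

```python
import math

PHASE_DAYS = 14  # six two-week phases, mirroring the structure described above


def fit_phase_rates(start, targets):
    """Pick one growth rate per phase so the curve hits every target exactly."""
    rates, level = [], start
    for target in targets:
        rates.append(math.log(target / level) / PHASE_DAYS)
        level = target
    return rates


def run_model(start, rates):
    """Exponential growth with a separate rate per phase; returns end-of-phase values."""
    values, level = [], start
    for rate in rates:
        level *= math.exp(rate * PHASE_DAYS)
        values.append(level)
    return values


# Two invented, mutually contradictory "histories" (made-up numbers).
steady_rise = [2_000, 4_000, 8_000, 16_000, 32_000, 64_000]
rise_and_fall = [5_000, 20_000, 9_000, 3_000, 1_000, 500]

for targets in (steady_rise, rise_and_fall):
    rates = fit_phase_rates(1_000, targets)
    fitted = run_model(1_000, rates)
    worst = max(abs(f - t) / t for f, t in zip(fitted, targets))
    print(f"{len(rates)} phase-wise rates, worst relative error = {worst:.1e}")
```

Because the rates are re-chosen in every phase, the same “model” reproduces both histories essentially perfectly. A perfect fit of this kind says nothing about whether the model captures how the epidemic actually behaves.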

Even worse, in this specific case, the modellers have chosen ε as a parameter. It is the ratio of those infected with the disease to asymptomatic cases, and it varies by a factor of more than 7,000 over a 12-week duration. The authors’ explanation for why a parameter that is expected to be constant should change so dramatically is that the parameter is indeed constant, but that calculating it from inaccurate data creates this large variation. This could have been understood if the variation in the parameter were small, but a 7,000-fold variation in a parameter that is expected to be constant defies all canons of modelling.

Regarding parameters in models, von Neumann famously said, “With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.” Simply put, if we have enough parameters, we can “make” the “model” agree to anything we want it to say. Or, in the case of the DST Super Model, ask it to find ε values in the six phases such that the model will “show” that without a lockdown, there would have been more than 2.5 million deaths and that the epidemic will end by February 2021!

Tucked away in the DST Super Model paper is another astonishing claim: “If the model is correct, we may have reached herd immunity with about 380 million people already infected.” The serosurveys by ICMR, which check for antibodies in the blood, nowhere show that 380 million, or about 30% of Indians, have been infected. The data from serosurveys do not agree with the model’s conclusions. The figures of those infected by August-end were between six and eight per cent for India, with only certain pockets like Dharavi in Mumbai showing high seropositivity. The Super Model has double this number. Even according to the later serosurveys, the numbers are nowhere near the 30% that the Super Model “computes”. The WHO, which has collated global figures, has set an upper bound of 10% for the share of the global population with antibodies.

Apart from grossly overestimating the number of infected, why do Vidyasagar and his colleagues claim herd immunity has been reached with 30% of Indians having been infected? The consensus among epidemiologists is that it requires 60% to 70% of a population to have been infected. This is also where the famous or infamous ε parameter (the ratio of those infected with the disease to asymptomatic cases) introduced by our “Super Modellers” comes in. It reduces the percentage for reaching herd immunity significantly, one of the benefits of ε!
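For a rough sense of scale, the textbook relation between the herd immunity threshold and the basic reproduction number R0 is threshold = 1 - 1/R0. It assumes a uniformly mixing population and is not taken from the Super Model paper. The Python sketch below uses it to ask what R0 each threshold would imply; the population figure of 1.38 billion is a rough 2020 estimate assumed here, not a number from the article.

```python
def herd_immunity_threshold(r0):
    """Classic threshold 1 - 1/R0 under the textbook homogeneous-mixing assumption."""
    if r0 <= 1:
        return 0.0  # an epidemic with R0 <= 1 dies out without any immunity
    return 1.0 - 1.0 / r0


def implied_r0(threshold):
    """Invert the formula: the R0 for which a given immune fraction would suffice."""
    return 1.0 / (1.0 - threshold)


INDIA_POPULATION = 1.38e9  # rough 2020 figure; an assumption, not from the article
share_infected = 380e6 / INDIA_POPULATION

print(f"380 million is about {share_infected:.0%} of the population")
for fraction in (0.30, 0.60, 0.70):
    print(f"a threshold of {fraction:.0%} corresponds to R0 of about {implied_r0(fraction):.2f}")
```

Under this simple relation, a 30% threshold would require a markedly lower R0 than the one implied by the 60% to 70% consensus range.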

The DST Super Model paper, hurriedly published in the Indian Journal of Medical Research, does not seem to have been proofread properly by the authors, or checked seriously by the journal. In a real howler, in the last paragraph of ‘Pandemic progression in India’, it claims: “Using the 1/ε value (=67) of phase 6, the model predicts total population with infection or antibodies to be around 3.5 million.” The same sentence in paragraph five of the section ‘Discussion’ says, “Using the 1/ε value (=67) of phase 6, the model predicts total population with infection or antibodies to be around 350 million”. In other words, the number of infected in the second sentence is a hundred times that in the first!

It is unfortunate that three eminent persons of the stature of the authors have sullied their names in such a shoddy exercise. It only shows that eminence does not prevent people from errors of judgement. In this case, it is not a simple error but one with serious consequences for the people. In spite of the paper’s platitudes about social distancing, its false claims of achieving herd immunity can induce a false sense of security in the government, and have serious consequences in the real world. The authors have done themselves and the country no service by producing this poor example of a “scientific” paper.