Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV2)

Abstract

Estimation of the prevalence and contagiousness of undocumented novel coronavirus (SARS-CoV2) infections is critical for understanding the overall prevalence and pandemic potential of this disease. Here we use observations of reported infection within China, in conjunction with mobility data, a networked dynamic metapopulation model and Bayesian inference, to infer critical epidemiological characteristics associated with SARS-CoV2, including the fraction of undocumented infections and their contagiousness. We estimate that the large majority of infections were undocumented (CI: [82%–90%]) prior to travel restrictions. Per person, the transmission rate of undocumented infections was 65% of documented infections ([46%–62%]), yet, due to their greater numbers, undocumented infections were the infection source for most documented cases. These findings explain the rapid geographic spread of SARS-CoV2 and indicate containment of this virus will be particularly challenging.

The novel coronavirus that emerged in Wuhan, China (SARS-CoV2) at the end of 2019 quickly spread to all Chinese provinces and, as of 1 March 2020, to other countries (1, 2). Efforts to contain the virus are ongoing; however, given the many uncertainties regarding pathogen transmissibility and virulence, the effectiveness of these efforts is unknown.

The fraction of undocumented but infectious cases is a critical epidemiological characteristic that modulates the pandemic potential of an emergent respiratory virus (3–6). These undocumented infections often experience mild, limited or no symptoms and hence go unrecognized, and, depending on their contagiousness and numbers, can expose a far greater portion of the population to virus than would otherwise occur. Here, to assess the full epidemic potential of SARS-CoV2, we use a model-inference framework to estimate the contagiousness and proportion of undocumented infections in China during the weeks before and after the shutdown of travel in and out of Wuhan.

We developed a mathematical model that simulates the spatiotemporal dynamics of infections among Chinese cities (see supplementary materials). In the model, we divided infections into two classes: (i) documented infected individuals with symptoms severe enough to be confirmed, i.e., observed infections; and (ii) undocumented infected individuals. These two classes of infection have separate rates of transmission: β, the transmission rate due to documented infected individuals; and μβ, the transmission rate due to undocumented individuals, which is β reduced by a factor μ. Spatial spread of SARS-CoV2 across cities is captured by the daily number of people traveling from city j to city i and a multiplicative factor.

Specifically, daily numbers of travelers between 797 Chinese cities during the Spring Festival period (“Chunyun”) were derived from human mobility data collected by the Tencent Location-based Service during a previous Chunyun period (1 February to March) (7). Chunyun is a period of 55 days spanning the days before and after the Lunar New Year, during which there are high rates of travel within China. To estimate human mobility during the 2020 Chunyun period, which began in January, we aligned the Tencent data based on relative timing to the Spring Festival. For example, we used Tencent mobility data from 1 February to represent human movement on the January 2020 day that was similarly distant from the Lunar New Year. During the Chunyun period covered by the Tencent data, a total of 1.82 billion travel events were captured, whereas over 2 billion trips are reported (7). To compensate for underreporting and reconcile these two numbers, a travel multiplicative factor, θ, which is greater than 1, is included (see supplementary materials).

To infer SARS-CoV2 transmission dynamics during the early stage of the outbreak, we simulated observations during the January 2020 period before the initiation of travel restrictions (fig. S1) using an iterated filter-ensemble adjustment Kalman filter (IF-EAKF) framework (8). With this combined model-inference system, we estimated the trajectories of four model state variables ($S_i$, $E_i$, $I_i^r$, $I_i^u$: the susceptible, exposed, documented infected, and undocumented infected sub-populations in city i) for each of the cities, while simultaneously inferring six model parameters (Z, D, μ, β, α, θ: the average latent period, the average duration of infection, the transmission reduction factor for undocumented infections, the transmission rate for documented infections, the fraction of documented infections, and the travel multiplicative factor). Details of model initialization, including the initial seeding of exposed and undocumented infections, are provided in the supplementary materials.

To account for delays in infection confirmation, we also defined a time-to-event observation model using a Gamma distribution (see supplementary materials). Specifically, for each new case in group $I_i^r$, a reporting delay $t_d$ (in days) was generated from a Gamma distribution with a mean value of $T_d$. In fitting both synthetic and the observed outbreaks, we performed simulations with the model-inference system using different fixed values of $T_d$ ($T_d$ ≥ 6 days) and different maximum seeding, $Seed_{max}$ (see supplementary materials, fig. S2). The best-fitting model-inference posterior was identified by log-likelihood.

We first tested the model-inference framework against alternate model forms and with synthetic outbreaks generated by the model in free simulation. These tests verified the ability of the model-inference framework to accurately estimate all six target model parameters simultaneously (see supplementary methods and figs. S3 to S22). Indeed, the system could identify a variety of parameter combinations and distinguish outbreaks generated with high α and low μ from those generated with low α and high μ. This parameter identifiability is facilitated by the assimilation of observed case data from multiple (829) cities into the model-inference system and the incorporation of human movement in the mathematical model structure (see supplementary methods and fig. S23).

We next applied the model-inference framework to the observed outbreak before the travel restrictions imposed in January 2020: a total of 2009 documented cases throughout China, as reported by 8 February 2020 (1). Figure 1, A to C, shows simulations of reported cases generated using the best-fitting model parameter estimates. The distribution of these stochastic simulations captures the range of observed cases well. In addition, the best-fitting model captures the spread of infections with the novel coronavirus (COVID-19) to other cities in China (fig. S26). Our median estimate of the effective reproductive number, $R_e$ (equivalent to the basic reproductive number, $R_0$, at the beginning of the epidemic), is above 2 (CI: 2.18–2.90), indicating a high capacity for sustained transmission of COVID-19 (Table 1 and Fig. 1D). This finding aligns with other recent estimates of the reproductive number for this time period (6). In addition, the median estimates for the latent and infectious periods are approximately 3.81 and 3.63 days, respectively.

We also find that, during this January period, only a small fraction (CI upper bound: 26%) of total infections in China were reported. This estimate reveals a very high rate of undocumented infections. This finding is independently corroborated by the infection rate among foreign nationals evacuated from Wuhan (see supplementary materials). These undocumented infections are estimated to have been half as contagious per individual as reported infections (μ; CI: 0.62–0.69). Other model fittings made using alternate values of $T_d$ and $Seed_{max}$ or different distributional assumptions produced similar parameter estimates (see supplementary figures), as did estimations made using an alternate model structure with separate average infection periods for undocumented and documented infections (see supplementary methods, table S1). Further sensitivity testing indicated that α and μ are uniquely identifiable given the model structure and abundance of observations utilized (see supplementary methods and Fig. 1, E and F). In particular, Fig. 1F shows that the highest log-likelihood fittings are centered in the CI estimates for α and μ and drop off with distance from the best-fitting solution (α=0.23 and μ=0.69).
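As a rough illustration of the two-class transmission structure described above, the following Python sketch integrates a single-city version of such a model with a simple Euler step. It is not the authors' implementation: the inter-city travel terms, the stochasticity, and the IF-EAKF inference are all omitted, the compartment equations are an assumed SEIR-like form built only from the parameter definitions in the text, and every numeric value is a hypothetical placeholder rather than an estimate from Table 1.

```python
def simulate(beta, mu, alpha, Z, D, N, E0, days, dt=0.1):
    """Euler integration of an ASSUMED single-city S-E-Ir-Iu model.

    beta  : transmission rate of documented infections (per day)
    mu    : reduction factor applied to undocumented transmission
    alpha : fraction of infections that become documented
    Z, D  : mean latent and infectious periods (days)
    Returns (cumulative documented, cumulative undocumented) infections.
    """
    S, E, Ir, Iu = N - E0, E0, 0.0, 0.0
    cum_r = cum_u = 0.0
    for _ in range(int(round(days / dt))):
        # Documented cases transmit at beta, undocumented at mu * beta.
        new_exposed = (beta * S * Ir / N + mu * beta * S * Iu / N) * dt
        leaving_latent = E / Z * dt
        new_r = alpha * leaving_latent          # become documented
        new_u = (1.0 - alpha) * leaving_latent  # remain undocumented
        S -= new_exposed
        E += new_exposed - leaving_latent
        Ir += new_r - Ir / D * dt
        Iu += new_u - Iu / D * dt
        cum_r += new_r
        cum_u += new_u
    return cum_r, cum_u


# Hypothetical placeholder values, NOT the estimates from Table 1.
params = dict(beta=1.0, alpha=0.2, Z=3.5, D=3.5, N=1_000_000, E0=10.0, days=14)
with_undoc = simulate(mu=0.5, **params)
without_undoc = simulate(mu=0.0, **params)
print("documented cases, mu=0.5:", with_undoc[0])
print("documented cases, mu=0.0:", without_undoc[0])
```

Setting `mu` to 0 switches off transmission from undocumented infections while leaving everything else unchanged, so the two runs differ only in how contagious the undocumented class is.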

Fig. 1. Best-fit model and sensitivity analysis. Simulation of daily reported cases in all cities (A), Wuhan city (B) and Hubei province (C). The blue box and whiskers show the median, interquartile range, and credible intervals derived from 823 simulations using the best-fit model (Table 1). The red x’s are daily reported cases. The distribution of estimated $R_e$ is shown in (D). The impact of varying α and μ on $R_e$ with all other parameters held constant at Table 1 mean values (E). The black solid line indicates parameter combinations of (α, μ) yielding $R_e$=2.60. The estimated parameter combination of α and μ is shown by the red x; the dashed box indicates the credible interval of that estimate. Log-likelihood for simulations with combinations of (α, μ) and all other parameters held constant at Table 1 mean values (F). For each parameter combination, multiple simulations were performed. The best-fit estimated parameter combination is shown by the red x (note that the x is plotted at the lower left corner of its respective heat map pixel, i.e., the pixel with the highest log-likelihood); the dashed box indicates the credible interval of that estimate.
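The relationship between (α, μ) and $R_e$ described for panel E can be made concrete with a small calculation. The sketch below assumes a single-population version of the model in which both infection classes share the same infectious period D, in which case a next-generation argument gives R_e = αβD + (1−α)μβD. That expression is an assumption introduced here for illustration, not a formula quoted from the text above, and the numeric inputs are hypothetical.

```python
def reproductive_number(alpha, mu, beta, D):
    """R_e under the ASSUMED form R_e = alpha*beta*D + (1 - alpha)*mu*beta*D."""
    return alpha * beta * D + (1.0 - alpha) * mu * beta * D


def mu_on_contour(alpha, target_re, beta, D):
    """Value of mu that gives target_re at a given alpha (requires alpha != 1)."""
    return (target_re / (beta * D) - alpha) / (1.0 - alpha)


# Hypothetical values chosen only to show the shape of a constant-R_e line.
beta, D, target = 1.0, 3.5, 2.1
for alpha in (0.1, 0.2, 0.3, 0.4):
    mu = mu_on_contour(alpha, target, beta, D)
    print(f"alpha={alpha:.1f}  mu={mu:.3f}  R_e={reproductive_number(alpha, mu, beta, D):.2f}")
```

Along such a line, a higher documented fraction α is offset by a lower μ, which is why case counts alone could in principle be explained by many (α, μ) pairs and why the identifiability checks in panel F matter.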

Table 1. Best-fit model posterior estimates of key epidemiological parameters for simulation with the full metapopulation model during the January 2020 period before travel restrictions ($Seed_{max}$=3221, $T_d$=9 days).
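The reporting delay $T_d$ in this caption is the mean of the Gamma-distributed confirmation delay used in the observation model. A minimal sketch of how such delays might be drawn and applied is given below. The shape parameter, the helper names, and the rounding of delays to whole days are all assumptions made for illustration; only the use of a Gamma distribution with mean $T_d$ comes from the text.

```python
import random


def sample_reporting_delays(n_cases, mean_delay, shape=2.0, seed=0):
    """Draw one confirmation delay (in days) per new documented case.

    A Gamma(shape, scale) variable has mean shape * scale, so the scale is
    chosen as mean_delay / shape. The shape value is a hypothetical choice.
    """
    rng = random.Random(seed)
    scale = mean_delay / shape
    return [rng.gammavariate(shape, scale) for _ in range(n_cases)]


def reported_by_day(infection_days, delays, horizon):
    """Count cases on the day they are confirmed (infection day + delay)."""
    counts = [0] * horizon
    for day, delay in zip(infection_days, delays):
        report_day = int(day + delay)
        if report_day < horizon:
            counts[report_day] += 1
    return counts


delays = sample_reporting_delays(10_000, mean_delay=9.0)
print("sample mean delay:", sum(delays) / len(delays))
```

Cases whose confirmation day falls beyond the horizon are simply not counted yet, which mimics how recently infected individuals are missing from the reported totals at any given date.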

Using the best-fitting model (Table 1 and Fig. 1), we estimated the total number of new COVID-19 infections (documented and undocumented combined) during this January period in Wuhan city, together with the share of all infections that were infected from undocumented cases. Nationwide, we likewise estimated the total number of infections during the same period and the share infected by undocumented cases.
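The share of infections attributable to undocumented cases follows from comparing the two transmission terms. At any moment, new infections caused by documented cases scale with β·I_r and those caused by undocumented cases with μβ·I_u, so the undocumented share is μ·I_u / (I_r + μ·I_u). The sketch below also includes a shortcut that assumes both classes have the same infectious period, so that I_u / I_r is roughly (1−α)/α. Both the shortcut and the example numbers are illustrative assumptions, not values from the paper.

```python
def undocumented_share(mu, i_r, i_u):
    """Fraction of new infections caused by undocumented cases.

    mu  : relative transmission rate of undocumented infections
    i_r : current number of documented infectious individuals
    i_u : current number of undocumented infectious individuals
    """
    return mu * i_u / (i_r + mu * i_u)


def undocumented_share_from_alpha(mu, alpha):
    """Same share, ASSUMING i_u / i_r = (1 - alpha) / alpha."""
    return mu * (1.0 - alpha) / (alpha + mu * (1.0 - alpha))


# Hypothetical inputs: undocumented cases are half as contagious per person
# but four times as numerous, so they still cause most new infections.
print(undocumented_share(0.5, i_r=20, i_u=80))
print(undocumented_share_from_alpha(0.5, alpha=0.2))
```

This is the arithmetic behind the statement that a less contagious but much larger undocumented group can still be the dominant source of infection.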

To further examine the impact of contagious, undocumented COVID-19 infections on overall transmission and reported case counts, we generated a set of hypothetical outbreaks using the best-fitting parameter estimates but with μ=0, i.e., the undocumented infections are no longer contagious (Fig. 2). We find that without transmission from undocumented cases, reported infections during the January period before travel restrictions are reduced both across all of China and in Wuhan. Further, fewer cities exceed the cumulative documented case threshold shown in Fig. 2C: only 1 city does so, versus the larger number observed in January (Fig. 2). This finding indicates that contagious, undocumented infections facilitated the geographic spread of SARS-CoV2 within China.

Fig. 2. Impact of undocumented infections on the transmission of SARS-CoV2. Simulations generated using the parameters reported in Table 1 with the estimated μ (red) and μ=0 (blue), showing daily documented cases in all cities (A), daily documented cases in Wuhan city (B) and the number of cities with ≥ 20 cumulative documented cases (C). The box and whiskers show the median, interquartile range, and credible intervals derived from simulations.

We also modeled the transmission of COVID-19 in China after the January travel restrictions, when greater control measures were effected. These control measures included travel restrictions imposed between major cities and Wuhan; self-quarantine and contact precautions advocated by the government; and more available rapid testing for infection confirmation. These measures, along with changes in medical care-seeking behavior due to increased awareness of the virus and increased personal protective behavior (e.g., wearing of facemasks, social distancing, self-isolation when sick), likely altered the epidemiological characteristics of the outbreak after the restrictions began. To quantify these differences, we re-estimated the system parameters using the model-inference framework and city-level daily cases reported between the start of travel restrictions and 8 February. As inter-city mobility was restricted during this time, we tested two altered travel scenarios: (i) scenario 1: a reduction of travel in and out of Wuhan and a 97% reduction of travel between all other cities, as indicated by changes in the Baidu Mobility Index (table S2); and (ii) scenario 2: a complete stoppage of inter-city travel (i.e., θ set to 0) (see supplementary methods for more details).

The results of inference for the post-restriction period through 8 February are presented in Table 2, the supplementary figures, and table S3. As control measures have continually shifted, we present estimates for both a period ending 3 February (Period 1) and a period ending 8 February (Period 2). For both periods, the best-fitting model for Scenario 1 had a reduced reporting delay, $T_d$, of 6 days (vs. 9 days before the January travel restrictions), consistent with more rapid confirmation of infections. Estimates of both the latency and infectious periods were similar to those made for the pre-restriction period; however, α, β, and $R_e$ all shifted considerably. The transmission rate of documented cases, β, dropped during both Period 1 and Period 2 to less than half the estimate prior to travel restrictions (Table 2). The fraction of all infections that were documented, α, was estimated to be 0.79 (CI upper bound: 0.81) during Period 1, up from the estimate prior to travel restrictions, and remained nearly the same for Period 2. The reproductive number was between 1 and 2 during Period 1 and below 1 during Period 2 (CI upper bound: 1.46), down from 2.58 prior to travel restrictions. While the estimate for the relative transmission rate, μ, is lower than before the travel restrictions, the contagiousness of undocumented infections, represented by μβ, was substantially reduced, possibly reflecting that only very mild, less contagious infections remain undocumented or that individual protective behavior and contact precautions have proven effective. Similar parameter estimates are derived under Scenario 2 (no travel at all) (table S3). These inference results for both Period 1 and 2 should be interpreted with caution, as care-seeking behavior and control measures were continually in flux at these times.
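The two travel scenarios amount to rescaling the inter-city mobility data before it enters the model. The sketch below shows one way this could be done for a small matrix in which `mobility[i][j]` is the daily number of travelers from city j to city i. The function name, the three-city matrix, and the reduction fractions are hypothetical and chosen only to illustrate the idea; they are not the values used in the study.

```python
def reduce_travel(mobility, wuhan_index, wuhan_cut, other_cut):
    """Return a scaled copy of a mobility matrix.

    mobility[i][j] : daily travelers from city j to city i
    wuhan_cut      : fractional reduction for any route touching Wuhan
    other_cut      : fractional reduction for all other routes
    """
    n = len(mobility)
    reduced = []
    for i in range(n):
        row = []
        for j in range(n):
            cut = wuhan_cut if wuhan_index in (i, j) else other_cut
            row.append(mobility[i][j] * (1.0 - cut))
        reduced.append(row)
    return reduced


# Hypothetical three-city example with Wuhan at index 0.
mobility = [
    [0.0, 200.0, 300.0],
    [100.0, 0.0, 50.0],
    [400.0, 80.0, 0.0],
]
scenario_1 = reduce_travel(mobility, wuhan_index=0, wuhan_cut=0.9, other_cut=0.5)
scenario_2 = reduce_travel(mobility, wuhan_index=0, wuhan_cut=1.0, other_cut=1.0)
print(scenario_1)
print(scenario_2)
```

With both reductions set to 1.0 every entry becomes zero, which has the same effect on the travel terms as setting the multiplicative factor θ to 0.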

Table 2. Best-fit model posterior estimates of key epidemiological parameters for simulation of the model during Period 1 (ending 3 February) and Period 2 (ending 8 February) ($T_d$=9 days before the January travel restrictions; $T_d$=6 days from then until 8 February). Travel to and from Wuhan is reduced, and other inter-city travel is reduced by 95%.

S. Pei, SenPei-CU/COVID-19: COVID-19, Version 1, Zenodo (2020); doi: . 1446853407 / zenodo.

S. Lai, I. Bogoch, N. Ruktanonchai, A. Watts, Y. Li, J. Yu, X. Lv, W. Yang, H. Yu, K. Khan, Z. Li, Assessing spread risk of Wuhan novel coronavirus within and beyond China, January-April 2020: a travel network-based modeling study. medRxiv (5 February 2020). https://doi.org/. / 145883. . .