Summary Analysis of COVID-19 Datasets and Journal Articles

On April 3, 2020, Kaggle introduced a COVID-19 challenge inviting data scientists to explore and research the data. This “UNCOVER COVID-19 Challenge” contained a collection of datasets from 20 global sources, from the United States to South Korea (Kaggle, 2020). The goal is to uncover new aspects of the virus from both proprietary and non-proprietary datasets so that first responders may be better prepared. Due to the urgency of this task, the following is an informal analysis based on the datasets and journal articles gathered, presented in a question-and-answer (Q&A) format.

  1. How is the virus transmitted? The virus (COVID-19) travels through contact with human mucus. It spreads through sneezing, coughing, speech, and contaminated surfaces.
  2. Is a social distance of 6 ft sufficient? A distance of 6 feet is adequate for large particles, such as those in a human sneeze. However, smaller micro-particles may continue to linger in the air (JAMA, 2020).
  3. Is a mask needed? Yes, if you already have COVID-19 or work in proximity to someone infected. No, if you are relatively isolated and keep your distance from dense populations.
  4. Which areas or cities see the most deaths from COVID-19? Densely populated areas, where the outbreak is transmitted through human proximity. The four largest cities in the United States by population are New York, Los Angeles, Chicago, and Houston (Maciag, 2017).
  5. What are the demographics of the affected, unaffected, and at-risk populations? Contact tracing data from China suggest that 30-39-year-olds were the largest group with symptoms (fever). The 0-9 and 70+ age groups, largely without fever, were the smallest groups under surveillance. However, further time-series analysis of virus surveillance data may indicate that the 30-39 age group brought the virus home and thereby infected the elderly. The elderly are most at risk, with the most severe cases in the 60-69 range (Surveillances, Feb. 2020).
  6. What is the incidence of infection with coronavirus among patients? The chances of surviving this pandemic depend on developing broad immunity, as with the flu. Like influenza, a viral infection that attacks the respiratory system, COVID-19 causes fever and cough. COVID-19 differs in its shortness of breath, which can cause severe complications, and in a mortality rate ten times that of influenza (Higgins, 2020). COVID-19 researchers are currently developing drugs, therapeutics, and vaccines; however, at this time there is no cure, only testing, social distancing, and quarantine.
  7. What patterns emerge in infection rate forecasting between symptomatic and pre-symptomatic patients? The pattern emerges from the relationship between the incubation period and the serial interval for COVID-19: the infectee may already show signs before the infector shows symptoms, as shown in Figure 1. The serial interval is also much shorter than that of SARS, which explains the high spike in density per day shown in Figure 2 (Nishiura et al., Apr. 2020).
Fig. 1. Time series showing the shorter serial interval for COVID-19 (Nishiura et al., Apr. 2020)

Fig. 2. COVID-19 time-series serial rate of infection compared to SARS (Nishiura et al., Apr. 2020)
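
The reasoning in question 7 can be illustrated with a minimal Python sketch. The serial interval is the number of days between symptom onset in the infector and symptom onset in the infectee. When that interval is shorter than the incubation period, transmission most likely occurred before the infector showed symptoms. The onset dates, the 5-day incubation value, and the function names below are all hypothetical, chosen only for illustration; they are not taken from Nishiura et al. or from the Kaggle datasets.

```python
from datetime import date


def serial_interval_days(infector_onset, infectee_onset):
    """Days from the infector's symptom onset to the infectee's symptom onset."""
    return (infectee_onset - infector_onset).days


def presymptomatic_share(serial_intervals, incubation_days):
    """Fraction of pairs whose serial interval is shorter than the incubation period.

    A pair like this suggests the infectee was infected before the
    infector's own symptoms appeared.
    """
    if not serial_intervals:
        raise ValueError("at least one serial interval is required")
    early = [s for s in serial_intervals if s < incubation_days]
    return len(early) / len(serial_intervals)


# Hypothetical infector/infectee symptom-onset dates (illustration only).
pairs = [
    (date(2020, 1, 10), date(2020, 1, 13)),
    (date(2020, 1, 12), date(2020, 1, 18)),
    (date(2020, 1, 15), date(2020, 1, 17)),
    (date(2020, 1, 20), date(2020, 1, 24)),
]

# Assumed incubation period in days (an illustrative value, not an estimate).
ASSUMED_INCUBATION_DAYS = 5

intervals = [serial_interval_days(a, b) for a, b in pairs]
share = presymptomatic_share(intervals, ASSUMED_INCUBATION_DAYS)
print(intervals, share)  # prints [3, 6, 2, 4] 0.75
```

With these made-up pairs, three of the four serial intervals fall below the assumed incubation period, which is the kind of pattern that would point to pre-symptomatic transmission. A real analysis would use distributions fitted to observed pairs rather than a single fixed incubation value.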


GitHub. (Apr. 2020). nCoVSerialInterval2020. Retrieved from

JAMA. (Mar. 2020). Respiratory Pathogen Emission Dynamics. Retrieved from

Kaggle. (Apr. 2020). UNCOVER COVID-19 Challenge. Retrieved from

Kaiser, J. (Feb. 8, 2019). EXCLUSIVE: Controversial experiments that could make bird flu more risky poised to resume. Retrieved from https:// make-bird-flu-more-risky-poised-resume

Nishiura, H., Linton, N. M., & Akhmetzhanov, A. R. (Apr. 2020). Serial interval of novel coronavirus (COVID-19) infections. International Journal of Infectious Diseases. Retrieved from

Higgins, N., & Lovelace, B. (Mar. 2020). Top US health official says the coronavirus is 10 times ‘more lethal’ than the seasonal flu. Retrieved from

Maciag, M. (Nov. 2017). Population Density for U.S. Cities Statistics. Retrieved from

Sample, I. (Jun. 11, 2014). Scientists condemn ‘crazy, dangerous’ creation of deadly airborne flu virus. Retrieved from science/2014/jun/11/crazy-dangerous-creation-deadly-airborne-flu-virus

Steenhuysen, J. (Nov. 25, 2019). New Roche flu drug can drive resistance in influenza viruses: researchers. Retrieved from article/us-roche-hldg-flu-resistance/new-roche-flu-drug-can-drive-resistance-in-influenza-viruses-researchers-idUSKBN1XZ27J?feedType=RSS&feedName=healthNews

Surveillances, V. (Feb. 2020). The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19)—China, 2020. China CDC Weekly, 2(8), 113-122. Retrieved from