COVID-19 Number of Tests in Italy

Introduction

This page presents some data about the number of tests and people tested for COVID-19 over time in Italy and compares them with the number of people found positive.

This page was created on <2020-08-20 Thu> and last updated on <2021-01-03 Sun>.

The source code on the COVID-19 pages is distributed under the MIT License; the content is distributed under a Creative Commons Attribution 4.0 license.

Getting data into R

We first read the data from the Civil Protection repository and add the ratio between positives and tests. The ratio is computed on same-day data; the variant computed with tests shifted by two days (on the assumption that tests take two days to complete) is kept in the commented-out lines for the national data.

Different regions report test data with different semantics. Some regions report tests with results, in which case the ratio of new positives to tests on the same day makes sense. Others report the number of tests performed, in which case the correct ratio is between positives and tests performed some days earlier. We assume two days. See the following issue on GitHub for an explanation and some more details: https://github.com/pcm-dpc/COVID-19/issues/577 (in Italian).

PATH="./data"
DIGITS = 4

national = read.csv(file.path(PATH, "dpc-covid19-ita-andamento-nazionale.csv"))
national$data <- as.Date(national$data)

national$nuovi_casi_testati = c(NA, diff(national$casi_testati, 1))
national$p_over_t <- round(national$nuovi_positivi / national$nuovi_casi_testati, digits = DIGITS) * 100

national$nuovi_tamponi = c(NA, diff(national$tamponi, 1))
national$p_tamponi_over_t <- round(national$nuovi_positivi / national$nuovi_tamponi, digits = DIGITS) * 100

# national$nuovi_casi_testati_2 <- c(NA, NA, head(national$nuovi_casi_testati, -2))
# national$p_over_t_2 = round(national$nuovi_positivi / national$nuovi_casi_testati_2, digits = DIGITS) * 100

# national$nuovi_tamponi_2 <- c(NA, NA, head(national$nuovi_tamponi, -2))
# national$p_tamponi_over_t_2 = round(national$nuovi_positivi / national$nuovi_tamponi_2, digits = DIGITS) * 100
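The two ratios can be compared on a small example. This is only a sketch on made-up numbers (the `toy` data frame and its values are assumptions, not data from the Civil Protection repository); it follows the same `diff` and `head` pattern used above.

```r
# Toy cumulative test counts and daily positives (made-up values)
toy <- data.frame(
  nuovi_positivi = c(10, 20, 30, 40, 50),
  tamponi        = c(100, 300, 600, 1000, 1500)
)

# Daily tests: difference between consecutive cumulative values
toy$nuovi_tamponi <- c(NA, diff(toy$tamponi, 1))

# Tests performed two days earlier: pad with two NAs, drop the last two values
toy$nuovi_tamponi_2 <- c(NA, NA, head(toy$nuovi_tamponi, -2))

# Same-day ratio and two-day-shifted ratio, as percentages
toy$p_same_day <- round(toy$nuovi_positivi / toy$nuovi_tamponi, digits = 4) * 100
toy$p_shifted  <- round(toy$nuovi_positivi / toy$nuovi_tamponi_2, digits = 4) * 100

print(toy)
```

The shifted ratio has three leading `NA` values: one from `diff` and two from the shift.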

At the regional level, derived columns such as the number of people tested in a day have to be computed after filtering by region; otherwise diff would subtract values belonging to different regions.

# evolution over time, by Region
data = read.csv(file.path(PATH, "dpc-covid19-ita-regioni.csv"))
data$data <- as.Date(data$data)
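The following sketch illustrates the problem on a made-up two-region data frame (region names and values are assumptions). It also shows `ave` from base R as one possible way to compute per-region differences without filtering first; this page itself filters with `subset` instead.

```r
# Toy regional file: rows of two regions interleaved by date (made-up values)
toy <- data.frame(
  denominazione_regione = c("A", "B", "A", "B", "A", "B"),
  casi_testati          = c(100, 1000, 150, 1100, 220, 1250)
)

# Wrong: diff on the unfiltered column mixes the two regions
wrong <- c(NA, diff(toy$casi_testati, 1))

# Right: filter first, then compute the difference
region_a <- subset(toy, denominazione_regione == "A")
region_a$nuovi_casi_testati <- c(NA, diff(region_a$casi_testati, 1))

# Alternative: per-region difference in a single step with ave()
toy$nuovi_casi_testati <- ave(
  toy$casi_testati, toy$denominazione_regione,
  FUN = function(v) c(NA, diff(v, 1))
)

print(wrong)
print(region_a)
print(toy)
```

The unfiltered version produces negative "daily" values, which is a quick way to spot the mistake.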

These are the columns we are interested in and their translation in English:

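cols = c(
  "data",                # date
  "nuovi_positivi",      # new positives
  "nuovi_tamponi",       # new tests (swabs)
  "nuovi_casi_testati",  # new people tested
  "p_tamponi_over_t",    # positives over tests (%)
  "p_over_t"             # positives over people tested (%)
)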

We now define a function to output the last N rows of the input data frame. The real “challenge” here is transposing the data to get a more natural presentation (with time progressing from left to right).

table_data <- function(df, cols, rows = 10) {
  # get the last `rows` rows (10 by default) and the interesting columns of the data frame
  f  <- tail(df, rows)
  rf <- f[, cols]

  # the labels in the transposed matrix are the column names of the original data.frame
  row_labels  <- colnames(rf)
  # the columns in the transposed matrix are the dates
  col_labels  <- c("Label", format(rf$data, "%a, %b %d"))

  rft <- data.frame(row_labels, t(rf))
  colnames(rft) <- col_labels
  return(rft[-1,])
}
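The transposition step can be tried in isolation. This sketch uses a made-up data frame (`toy` and its values are assumptions) and repeats the same steps as the function above. One detail worth knowing: `t()` applied to a data frame with mixed column types returns a character matrix, so the numbers in the transposed table are strings.

```r
# Toy data frame with a date column and one numeric column (made-up values)
toy <- data.frame(
  data           = as.Date("2021-02-26") + 0:2,
  nuovi_positivi = c(5, 7, 9)
)

# Transpose: original column names become row labels, dates become column names
wide <- data.frame(Label = colnames(toy), t(toy))
colnames(wide) <- c("Label", format(toy$data, "%a, %b %d"))

# Drop the first row, which only repeats the dates
wide <- wide[-1, ]

print(wide)
str(wide)
```

Because the values are strings, any arithmetic should be done on the original data frame before calling the function.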

People Tested and Cases in Italy

Data of the last ten days

table_data(national, cols)
| Label | Fri, Feb 19 | Sat, Feb 20 | Sun, Feb 21 | Mon, Feb 22 | Tue, Feb 23 | Wed, Feb 24 | Thu, Feb 25 | Fri, Feb 26 | Sat, Feb 27 | Sun, Feb 28 |
| nuovi_positivi | 15479 | 14931 | 13452 | 9630 | 13314 | 16424 | 19886 | 20499 | 18916 | 17455 |
| nuovi_tamponi | 297128 | 306078 | 250986 | 170672 | 303850 | 340247 | 353704 | 325404 | 323047 | 257024 |
| nuovi_casi_testati | 94883 | 102150 | 96581 | 57115 | 89156 | 105805 | 104200 | 107952 | 110202 | 104451 |
| p_tamponi_over_t | 5.21 | 4.88 | 5.36 | 5.64 | 4.38 | 4.83 | 5.62 | 6.3 | 5.86 | 6.79 |
| p_over_t | 16.31 | 14.62 | 13.93 | 16.86 | 14.93 | 15.52 | 19.08 | 18.99 | 17.16 | 16.71 |

New Cases

Daily new cases since 2020-08-01, with the last five values labelled.

## add extra space to right margin of plot within frame
par(mar=c(5, 4, 4, 6) + 0.1)

## Allow a second plot on the same graph
# par(new=TRUE)
new_cases_limits = c( min(national[national$data >= "2020-08-01", c("nuovi_positivi")]), max(national[national$data >= "2020-08-01", c("nuovi_positivi")]) )

p = plot(x = national[national$data >= "2020-08-01", c("data")], 
     y = national[national$data >= "2020-08-01", c("nuovi_positivi")], 
     type="l", lwd=6, pch=21, cex=1.5, col=c("#AA0000"),
     axes=FALSE,
     ylim=new_cases_limits,
     ylab="", xlab="")
text(x = tail(national[national$data >= "2020-08-01", c("data")], 5),
     y = tail(national[national$data >= "2020-08-01", c("nuovi_positivi")], 5),
     labels = tail(national[national$data >= "2020-08-01", c("nuovi_positivi")], 5),
     pos = 1, cex = 1, col="#AA0000")
mtext("New Cases", side=4, line=4, col="#AA0000") 
axis(4, ylim=new_cases_limits, las=1)

grid(p, col = "black", lty = "dotted")

# x-axis
dates = national[national$data >= "2020-08-01", c("data")]
axis.Date(1, at=seq(min(dates), max(dates), by="week"), format="%b %d", las=2)
mtext("Day", side=1, line=2.5)

## Add Legend (only new cases are drawn in this plot)
legend("topleft", legend = c("New Cases"),
       text.col = c("#AA0000"), lty = 1, lwd = 2, col = c("#AA0000"))

new_cases_italia.png
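The plotting code above repeats the same date filter for every argument. A possible simplification, shown here only as a sketch on made-up data (`toy` and `recent` are hypothetical names), is to filter once and reuse the result; `range` returns the minimum and maximum in one call.

```r
# Toy stand-in for the national data frame (made-up values)
toy <- data.frame(
  data           = as.Date("2020-07-30") + 0:4,
  nuovi_positivi = c(300, 250, 400, 350, 500)
)

# Filter once, then reuse the filtered data frame
recent <- subset(toy, data >= as.Date("2020-08-01"))

# Same as c(min(...), max(...))
new_cases_limits <- range(recent$nuovi_positivi)

# Rows to label at the end of the series
last_rows <- tail(recent, 2)

print(new_cases_limits)
print(last_rows)
```

The filtered columns (`recent$data`, `recent$nuovi_positivi`) can then be passed to `plot` and `text` directly.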

New Cases Tested

plot(x = national[national$data >= "2020-08-01", c("data")], 
     y = national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 
     type="l", lwd=6, pch=16, cex=2.5, col=c("#3B3176"))
text(x = tail(national[national$data >= "2020-08-01", c("data")], 1),
     y = tail(national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 1),
     labels = tail(national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 1),
     pos = 4, cex = 1.2, col=c("#3B3176"))
grid(col="black")

tests_italia.png

Number of Tests and New Cases Tested

Plot new cases and new people tested together, each on its own y-axis. (Solution taken from "How can I plot with 2 different y-axes?" on Stack Overflow.)

## add extra space to right margin of plot within frame
par(mar=c(5, 4, 4, 6) + 0.1)

## Plot first set of data and draw its axis
tests_limits = c( min(national[national$data >= "2020-08-01", c("nuovi_casi_testati")]), max(national[national$data >= "2020-08-01", c("nuovi_casi_testati")]) )
plot(x = national[national$data >= "2020-08-01", c("data")], 
     y = national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 
     type="l", lwd=6, pch=11, cex=1.5, col=c("#3B3176"),
     axes=FALSE,
     ylim=tests_limits,
     ylab="", xlab="")
text(x = tail(national[national$data >= "2020-08-01", c("data")], 1),
     y = tail(national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 1),
     labels = tail(national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 1),
     pos = 4, cex = 1, col=c("#3B3176"))
mtext("Number of Tests", side=2, col="#3B3176", line=4) 
axis(2, ylim=tests_limits, col="black", las=1)  
box()

## Allow a second plot on the same graph
par(new=TRUE)
new_cases_limits = c( min(national[national$data >= "2020-08-01", c("nuovi_positivi")]), max(national[national$data >= "2020-08-01", c("nuovi_positivi")]) )

p = plot(x = national[national$data >= "2020-08-01", c("data")], 
     y = national[national$data >= "2020-08-01", c("nuovi_positivi")], 
     type="l", lwd=6, pch=21, cex=1.5, col=c("#AA0000"),
     axes=FALSE,
     ylim=new_cases_limits,
     ylab="", xlab="")
text(x = tail(national[national$data >= "2020-08-01", c("data")], 1),
     y = tail(national[national$data >= "2020-08-01", c("nuovi_positivi")], 1),
     labels = tail(national[national$data >= "2020-08-01", c("nuovi_positivi")], 1),
     pos = 4, cex = 1, col="#AA0000")
mtext("New Cases", side=4, line=4, col="#AA0000") 
axis(4, ylim=new_cases_limits, las=1)

grid(p, col = "black", lty = "dotted")

# x-axis
dates = national[national$data >= "2020-08-01", c("data")]
axis.Date(1, at=seq(min(dates), max(dates), by="week"), format="%b %d", las=2)
mtext("Day", side=1, line=2.5)

## Add Legend
legend("topleft", legend = c("Tests", "New Cases"),
       text.col = c("#3B3176", "#AA0000"), pch= c(15, 17), col=c("#3B3176", "#AA0000"))

tests_and_new_cases_italia.png

Positive/Number of Tests

Here we plot the percentage of positives over tests. The standard measurement is the ratio between positives and tests performed (shown in blue). The way I understand it, this number also includes tests performed on people already diagnosed and recovered.

The second graph, in red, shows the ratio of positives over new people tested; that is, of all the people not yet diagnosed, how many tested positive?

plot(national$p_over_t ~ national$data, type="o", lwd=3, pch=21, col="#ff0000", main="Positive over Tests", xlab="Date", ylab="Percentage")
text(y = tail(national, 1)$p_over_t, x = tail(national, 1)$data, lab = paste(tail(national, 1)$p_over_t, "%", sep=""), pos=4, col="#ff0000", cex=1.3)

# Second plot with Positive over tests
lines(national$p_tamponi_over_t ~ national$data, type="o", lwd=3, pch=21, col="#000088")
text(y = tail(national, 1)$p_tamponi_over_t, x = tail(national, 1)$data, lab = paste(tail(national, 1)$p_tamponi_over_t, "%", sep=""), pos=4, col="#000088", cex=1.3)

## Add Legend (symbols and colors match the two series drawn above)
grid(col="black")
legend("bottomleft", legend = c("Positive over new People Tested", "Positive over Tests Performed"),
       text.col = c("#ff0000", "#000088"), pch= c(21, 21), col=c("#ff0000", "#000088"))

positive_over_tests_italia.png

People Tested and Cases in Trentino

region <- subset(data, denominazione_regione == "P.A. Trento")

region$nuovi_casi_testati = c(NA, diff(region$casi_testati, 1))

region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati, digits = DIGITS) * 100
# people tested two days earlier (assumes tests take two days to complete)
region$nuovi_casi_testati_2 <- c(NA, NA, head(region$nuovi_casi_testati, -2))
region$p_over_t_2 = round(region$nuovi_positivi / region$nuovi_casi_testati_2, digits = DIGITS) * 100

region$nuovi_tamponi = c(NA, diff(region$tamponi, 1))
region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi, digits = DIGITS) * 100
# tests performed two days earlier
region$nuovi_tamponi_2 <- c(NA, NA, head(region$nuovi_tamponi, -2))
region$p_tamponi_over_t_2 = round(region$nuovi_positivi / region$nuovi_tamponi_2, digits = DIGITS) * 100

table_data(region, cols)

People Tested and Cases in Liguria

region <- subset(data, denominazione_regione == "Liguria")

region$nuovi_casi_testati = c(NA, diff(region$casi_testati, 1))

region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati, digits = DIGITS) * 100
region$nuovi_casi_testati_2 = c(NA, NA, diff(region$casi_testati, 2))

region$nuovi_tamponi = c(NA, diff(region$tamponi, 1))
region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi, digits = DIGITS) * 100

table_data(region, cols)
| Label | Fri, Feb 19 | Sat, Feb 20 | Sun, Feb 21 | Mon, Feb 22 | Tue, Feb 23 | Wed, Feb 24 | Thu, Feb 25 | Fri, Feb 26 | Sat, Feb 27 | Sun, Feb 28 |
| nuovi_positivi | 274 | 361 | 266 | 136 | 381 | 285 | 452 | 351 | 351 | 248 |
| nuovi_tamponi | 7030 | 6741 | 4549 | 3438 | 7735 | 7707 | 8015 | 6542 | 7033 | 4837 |
| nuovi_casi_testati | 2712 | 2581 | 1799 | 1346 | 3146 | 2829 | 3051 | 2550 | 2693 | 2012 |
| p_tamponi_over_t | 3.9 | 5.36 | 5.85 | 3.96 | 4.93 | 3.7 | 5.64 | 5.37 | 4.99 | 5.13 |
| p_over_t | 10.1 | 13.99 | 14.79 | 10.1 | 12.11 | 10.07 | 14.81 | 13.76 | 13.03 | 12.33 |
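The regional sections repeat the same steps for each region. As a sketch only, they could be wrapped in a helper; `add_daily_ratios` is a hypothetical name that does not appear in the original code, and the data frame below is made up.

```r
DIGITS <- 4

# Hypothetical helper: filter one region, then add daily counts and ratios
add_daily_ratios <- function(data, region_name) {
  region <- subset(data, denominazione_regione == region_name)

  region$nuovi_casi_testati <- c(NA, diff(region$casi_testati, 1))
  region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati,
                           digits = DIGITS) * 100

  region$nuovi_tamponi <- c(NA, diff(region$tamponi, 1))
  region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi,
                                   digits = DIGITS) * 100
  region
}

# Toy regional data with two interleaved regions (made-up values)
toy <- data.frame(
  denominazione_regione = rep(c("A", "B"), times = 3),
  nuovi_positivi        = c(1, 10, 2, 20, 3, 30),
  casi_testati          = c(100, 1000, 120, 1100, 160, 1300),
  tamponi               = c(200, 2000, 250, 2400, 350, 3000)
)

a <- add_daily_ratios(toy, "A")
print(a)
```

With the real data, the result could be passed to `table_data` in the same way as the `region` data frames in these sections.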

People Tested and Cases in Veneto

region <- subset(data, denominazione_regione == "Veneto")

region$nuovi_casi_testati = c(NA, diff(region$casi_testati, 1))
region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati, digits = DIGITS) * 100

region$nuovi_tamponi = c(NA, diff(region$tamponi, 1))
region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi, digits = DIGITS) * 100

table_data(region, cols)
| Label | Fri, Feb 19 | Sat, Feb 20 | Sun, Feb 21 | Mon, Feb 22 | Tue, Feb 23 | Wed, Feb 24 | Thu, Feb 25 | Fri, Feb 26 | Sat, Feb 27 | Sun, Feb 28 |
| nuovi_positivi | 657 | 1244 | 718 | 509 | 1062 | 895 | 1304 | 1174 | 1285 | 911 |
| nuovi_tamponi | 27341 | 44145 | 37790 | 10314 | 37605 | 39954 | 45743 | 38483 | 41303 | 20522 |
| nuovi_casi_testati | 0 | 3377 | 5033 | 1651 | 2960 | 3312 | 4300 | 3331 | 3914 | 2995 |
| p_tamponi_over_t | 2.4 | 2.82 | 1.9 | 4.94 | 2.82 | 2.24 | 2.85 | 3.05 | 3.11 | 4.44 |
| p_over_t | Inf | 36.84 | 14.27 | 30.83 | 35.88 | 27.02 | 30.33 | 35.24 | 32.83 | 30.42 |
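The Inf value for Fri, Feb 19 comes from dividing by a nuovi_casi_testati of 0. One possible guard, shown here as a sketch (`safe_pct` is a hypothetical helper, not part of the original code), treats non-positive denominators as missing so that the ratio becomes `NA` instead of `Inf`.

```r
# Hypothetical helper: percentage ratio that yields NA when the denominator is <= 0
safe_pct <- function(num, den, digits = 4) {
  den[!is.na(den) & den <= 0] <- NA
  round(num / den, digits = digits) * 100
}

# First two Veneto columns from the table above
p <- safe_pct(c(657, 1244), c(0, 3377))
print(p)
```

Whether `NA` is the right choice depends on how the zero should be read (no people tested that day, or a reporting correction); the sketch only avoids the infinite value.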

People Tested and Cases in Lombardia

region <- subset(data, denominazione_regione == "Lombardia")

region$nuovi_casi_testati = c(NA, diff(region$casi_testati, 1))
region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati, digits = DIGITS) * 100

region$nuovi_tamponi = c(NA, diff(region$tamponi, 1))
region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi, digits = DIGITS) * 100

table_data(region, cols)
| Label | Fri, Feb 19 | Sat, Feb 20 | Sun, Feb 21 | Mon, Feb 22 | Tue, Feb 23 | Wed, Feb 24 | Thu, Feb 25 | Fri, Feb 26 | Sat, Feb 27 | Sun, Feb 28 |
| nuovi_positivi | 3724 | 3019 | 2514 | 1491 | 2480 | 3310 | 4243 | 4557 | 4191 | 3529 |
| nuovi_tamponi | 51894 | 44012 | 33148 | 17871 | 35149 | 50268 | 51473 | 46725 | 45865 | 37251 |
| nuovi_casi_testati | 15863 | 13948 | 11331 | 6673 | 9478 | 12549 | 15576 | 15061 | 13513 | 13253 |
| p_tamponi_over_t | 7.18 | 6.86 | 7.58 | 8.34 | 7.06 | 6.58 | 8.24 | 9.75 | 9.14 | 9.47 |
| p_over_t | 23.48 | 21.64 | 22.19 | 22.34 | 26.17 | 26.38 | 27.24 | 30.26 | 31.01 | 26.63 |

Author: Adolfo Villafiorita

Last modified: 2021-01-03 Sun 11:11 (created on: 2020-08-20 Thu 00:00)

Published: 2021-02-28 Sun 20:00