This dataset contains laboratory diagnostics (complete blood count without differential, C-reactive protein and procalcitonin) for patients admitted to the University Hospital Leipzig. Data from 2014 to 2019 form the Training set and data from 2020 to 2021 the Validation set. For external validation, the same laboratory values were collected at the University Hospital Greifswald from 2015 to 2020.

sbcdata

Format

A data.table with 18 columns/variables:

Id

integer, identification number of the case, unique for each center.

Age

integer, age of the patient in years.

Sex

character, "W" for female and "M" for male.

Diagnosis

character, diagnosis, one of "Control", "SIRS" or "Sepsis". See below for details.

Center

character, center, one of "Greifswald" or "Leipzig".

Set

character, set, one of "Training" or "Validation".

Sender

character, sender/origin that sent the blood sample to the laboratory. See sendercodes for a description of all possible codes.

Episode

integer, counter for episodes on intensive care units, is incremented by one after each discharge from the intensive care unit.

Time

integer, (relative) time when the blood was analysed. The first timepoint for each case is always set to zero.

TargetIcu

character, the name/type of the intensive care unit to which the patient/case will be admitted. See sendercodes for a description of all possible codes.

SecToIcu

integer, time in seconds until the patient/case is admitted to the intensive care unit TargetIcu. This time is negative if the patient/case is already on the intensive care unit TargetIcu.

CRP

double, C-reactive protein in mg/l.

HGB

double, hemoglobin in mmol/l.

MCV

double, mean corpuscular volume in fl.

PLT

double, platelets in Gpt/l.

RBC

double, red blood count in Tpt/l.

WBC

double, white blood count in Gpt/l.

PCT

double, procalcitonin in ng/ml.

Source

Data University Hospital Leipzig

R package

Sebastian Gibb

Data Processing

Maria Schmidt, Paul Ahrens and Mark Wernsdorfer

Laboratory Data Collection/Extraction

Maria Schmidt, Thorsten Kaiser

Administration Data Extraction

Maria Schmidt, Thorsten Kaiser

Reference Ethic Committee

214/18ek

Data University Medicine Greifswald

Data Processing/R package

Sebastian Gibb

Laboratory Data Collection/Extraction

Matthias Nauck and Stefan Bollmann

Administration Data Extraction

Thomas Hildebrandt

Reference Ethic Committee

BB133/10

Details

The Diagnosis was based on ICD10-GM codes. "Sepsis" was assumed for:

  • A02.1

  • A20.7

  • A22.7

  • A24.1

  • A26.7

  • A32.7

  • A39.2, A39.3, A39.4

  • A40.0, A40.1, A40.2, A40.3, A40.8, A40.9

  • A41.0, A41.1, A41.2, A41.3, A41.4, A41.51, A41.52, A41.58, A41.8, A41.9

  • A42.7

  • B37.7

  • R57.2

  • R65.1

If the ICD10 code was R65.x (except R65.2) without any of the sepsis-related codes above, the Diagnosis "SIRS" was used. Everything else is labeled as "Control".
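The labeling rule above can be sketched in a few lines of base R. This is only an illustration, not the code used to build the dataset: the helper name labelDiagnosis and the input format (a character vector of dotted ICD10-GM codes for one case) are assumptions, and R65.2 is treated here as not counting towards "SIRS", which is one possible reading of the exception mentioned in the text.

```r
## sepsis-related ICD10-GM codes, copied from the list above
sepsisCodes <- c(
    "A02.1", "A20.7", "A22.7", "A24.1", "A26.7", "A32.7",
    "A39.2", "A39.3", "A39.4",
    "A40.0", "A40.1", "A40.2", "A40.3", "A40.8", "A40.9",
    "A41.0", "A41.1", "A41.2", "A41.3", "A41.4",
    "A41.51", "A41.52", "A41.58", "A41.8", "A41.9",
    "A42.7", "B37.7", "R57.2", "R65.1"
)

## hypothetical helper: label one case from all of its ICD10-GM codes
labelDiagnosis <- function(codes) {
    ## any sepsis-related code wins
    if (any(codes %in% sepsisCodes)) {
        return("Sepsis")
    }
    ## R65.x without sepsis code; R65.2 is excluded (assumed reading)
    isSirs <- startsWith(codes, "R65.") & codes != "R65.2"
    if (any(isSirs)) {
        return("SIRS")
    }
    "Control"
}

labelDiagnosis(c("J18.9", "A41.9"))   # "Sepsis"
labelDiagnosis(c("R65.0", "I10.00"))  # "SIRS"
labelDiagnosis("R65.2")               # "Control"
```

The non-sepsis codes in the calls (J18.9, I10.00) are arbitrary placeholders chosen for the illustration.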

For the Center "Greifswald" there are a few entries with duplicated time points Time for the same Id and Sender but different laboratory values. This happens when multiple blood samples from the same patient are analysed at the same time in the same run of the analyser. It could not be decided which entry is the correct/better one, so removing them is suggested. An example can be found below.

In the Center "Greifswald" the sender codes are not as detailed as in "Leipzig". That is why "OTHER" occurs more often and "AMB" less often than in "Leipzig".

At the Center "Greifswald" the admission/discharge timepoint was recorded in and extracted from the clinical information system. These data were not available for the Center "Leipzig". There, the first/last blood sample taken on an intensive care unit (which is not necessarily part of the dataset) was used as the timepoint of admission/discharge. That is why the first blood sample on an intensive care unit can have a SecToIcu of zero in "Leipzig", whereas it is negative in "Greifswald". If needed, this can be harmonized by shifting all SecToIcu values of each Episode by the SecToIcu of the first blood sample on an intensive care unit, in the "Greifswald" and/or "Leipzig" subset. An example can be found below.
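To make the effect of the harmonization concrete, here is a small base R sketch on made-up values (not taken from sbcdata). It assumes, as the code in the Examples section does, that ICU senders can be recognised by "ICU" in Sender, and it subtracts the SecToIcu of the first ICU blood sample within each Id/Episode block. The helper name shiftToFirstIcuSample is hypothetical.

```r
## toy data, made up for illustration (not taken from sbcdata)
toy <- data.frame(
    Id       = c(1L, 1L, 1L, 1L, 2L, 2L),
    Episode  = c(1L, 1L, 1L, 1L, 1L, 1L),
    Sender   = c("GEN", "GEN", "ICU", "ICU", "ED", "GEN"),
    SecToIcu = c(7200, 3600, -300, -4000, NA, NA)
)

## hypothetical helper: shift one Id/Episode block so that its first
## blood sample on an ICU gets SecToIcu == 0
shiftToFirstIcuSample <- function(d) {
    firstIcu <- which(grepl("ICU", d$Sender))[1L]
    ## blocks without any ICU sample are left unchanged
    if (!is.na(firstIcu)) {
        d$SecToIcu <- d$SecToIcu - d$SecToIcu[firstIcu]
    }
    d
}

## apply the shift separately to each Id/Episode block
blocks <- split(toy, list(toy$Id, toy$Episode), drop = TRUE)
harmonized <- do.call(rbind, lapply(blocks, shiftToFirstIcuSample))
harmonized$SecToIcu
#> [1]  7500  3900     0 -3700    NA    NA
```

In this toy case the first ICU sample of case 1 was recorded 300 seconds after admission; after the shift it sits at zero, which mimics the convention described for "Leipzig".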

In a few cases there is a mismatch between the timepoint of admission/discharge extracted from the clinical information system at Center "Greifswald" and the entry in Sender: the Sender can be a non-ICU ward although the SecToIcu time is negative. According to the admission data the patient was already on an ICU, but the laboratory order came from a non-ICU ward. In most cases this time difference is only a few minutes (the blood sample was taken on the non-ICU ward, but the analysis in the laboratory started after the transfer to the ICU).
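Such entries could be located with a simple filter. The following base R sketch uses made-up values and again assumes that ICU senders contain "ICU" in Sender; it flags rows with a non-ICU Sender and a negative SecToIcu and reports the size of the mismatch in minutes.

```r
## toy data, made up for illustration (not taken from sbcdata)
toy <- data.frame(
    Id       = c(1L, 1L, 1L),
    Sender   = c("GEN", "GEN", "ICU"),
    SecToIcu = c(3600, -120, -900)
)

## non-ICU sender although the case is already on the ICU
isMismatch <- !grepl("ICU", toy$Sender) &
    !is.na(toy$SecToIcu) & toy$SecToIcu < 0
which(isMismatch)
#> [1] 2

## size of the mismatch in minutes
abs(toy$SecToIcu[isMismatch]) / 60
#> [1] 2
```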

Examples

## remove duplicate laboratory entries (see text above for explanation)
greifswald <- subset(sbcdata, Center == "Greifswald")
dup <- duplicated(greifswald[, .(Id, Time)]) |
    duplicated(greifswald[, .(Id, Time)], fromLast = TRUE)
mean(dup)
#> [1] 0.0008293431
greifswald <- greifswald[!dup,]

## adjust SecToIcu for subset Greifswald (see text above for explanation)
greifswald <- subset(sbcdata, Center == "Greifswald")
## create helper columns
greifswald[, isNewWard := (
    c(FALSE, Id[-1] == Id[-.N]) &         # same case
    c(FALSE, Sender[-1] != Sender[-.N])   # new ward
)]
#>             Id   Age    Sex Diagnosis     Center        Set Sender Episode
#>          <int> <int> <char>    <char>     <char>     <char> <char>   <int>
#>      1:      1    25      W   Control Greifswald Validation    AMB       1
#>      2:      2    75      M   Control Greifswald Validation    GEN       1
#>      3:      3    77      W    Sepsis Greifswald Validation  OTHER       1
#>      4:      3    77      W    Sepsis Greifswald Validation  OTHER       1
#>      5:      3    77      W    Sepsis Greifswald Validation  OTHER       1
#>     ---                                                                   
#> 665583: 169140    56      M   Control Greifswald Validation    GEN       1
#> 665584: 169140    56      M   Control Greifswald Validation    GEN       1
#> 665585: 169140    56      M   Control Greifswald Validation    GEN       1
#> 665586: 169141    60      W   Control Greifswald Validation     ED       1
#> 665587: 169141    60      W   Control Greifswald Validation    GEN       1
#>           Time TargetIcu SecToIcu   CRP   HGB   MCV   PCT   PLT   RBC   WBC
#>          <num>    <char>    <num> <num> <num> <num> <num> <int> <num> <num>
#>      1:      0      <NA>       NA  15.5   7.0  80.5    NA   264   4.2  8.40
#>      2:      0      <NA>       NA   7.4   8.4  87.9    NA   260   4.8  8.47
#>      3:      0      <NA>       NA  96.1   4.8  81.7    NA   385   3.0 13.20
#>      4: 318840      <NA>       NA  57.0   4.4  82.2    NA   416   2.8 14.20
#>      5: 578640      <NA>       NA  93.4   5.7  82.0  0.22   437   3.5 13.80
#>     ---                                                                    
#> 665583: 118380      <NA>       NA    NA   8.7  88.1    NA   200   4.7  6.20
#> 665584: 168660      <NA>       NA  95.0   8.7  88.4    NA   233   4.7  6.92
#> 665585: 340440      <NA>       NA  63.6   8.1  87.6    NA   225   4.5  3.70
#> 665586:      0      <NA>       NA    NA   9.1  90.0    NA   337   4.8 10.80
#> 665587:  70740      <NA>       NA   4.7   9.7  91.7    NA   371   5.2 11.60
#>         Excluded   Label isNewWard
#>           <lgcl>  <char>    <lgcl>
#>      1:    FALSE Control     FALSE
#>      2:    FALSE Control     FALSE
#>      3:     TRUE Control     FALSE
#>      4:     TRUE Control     FALSE
#>      5:     TRUE Control     FALSE
#>     ---                           
#> 665583:    FALSE Control     FALSE
#> 665584:    FALSE Control     FALSE
#> 665585:    FALSE Control     FALSE
#> 665586:    FALSE Control     FALSE
#> 665587:    FALSE Control      TRUE
greifswald[, isIcuAdmission := isNewWard & grepl("ICU", Sender)]
#>             Id   Age    Sex Diagnosis     Center        Set Sender Episode
#>          <int> <int> <char>    <char>     <char>     <char> <char>   <int>
#>      1:      1    25      W   Control Greifswald Validation    AMB       1
#>      2:      2    75      M   Control Greifswald Validation    GEN       1
#>      3:      3    77      W    Sepsis Greifswald Validation  OTHER       1
#>      4:      3    77      W    Sepsis Greifswald Validation  OTHER       1
#>      5:      3    77      W    Sepsis Greifswald Validation  OTHER       1
#>     ---                                                                   
#> 665583: 169140    56      M   Control Greifswald Validation    GEN       1
#> 665584: 169140    56      M   Control Greifswald Validation    GEN       1
#> 665585: 169140    56      M   Control Greifswald Validation    GEN       1
#> 665586: 169141    60      W   Control Greifswald Validation     ED       1
#> 665587: 169141    60      W   Control Greifswald Validation    GEN       1
#>           Time TargetIcu SecToIcu   CRP   HGB   MCV   PCT   PLT   RBC   WBC
#>          <num>    <char>    <num> <num> <num> <num> <num> <int> <num> <num>
#>      1:      0      <NA>       NA  15.5   7.0  80.5    NA   264   4.2  8.40
#>      2:      0      <NA>       NA   7.4   8.4  87.9    NA   260   4.8  8.47
#>      3:      0      <NA>       NA  96.1   4.8  81.7    NA   385   3.0 13.20
#>      4: 318840      <NA>       NA  57.0   4.4  82.2    NA   416   2.8 14.20
#>      5: 578640      <NA>       NA  93.4   5.7  82.0  0.22   437   3.5 13.80
#>     ---                                                                    
#> 665583: 118380      <NA>       NA    NA   8.7  88.1    NA   200   4.7  6.20
#> 665584: 168660      <NA>       NA  95.0   8.7  88.4    NA   233   4.7  6.92
#> 665585: 340440      <NA>       NA  63.6   8.1  87.6    NA   225   4.5  3.70
#> 665586:      0      <NA>       NA    NA   9.1  90.0    NA   337   4.8 10.80
#> 665587:  70740      <NA>       NA   4.7   9.7  91.7    NA   371   5.2 11.60
#>         Excluded   Label isNewWard isIcuAdmission
#>           <lgcl>  <char>    <lgcl>         <lgcl>
#>      1:    FALSE Control     FALSE          FALSE
#>      2:    FALSE Control     FALSE          FALSE
#>      3:     TRUE Control     FALSE          FALSE
#>      4:     TRUE Control     FALSE          FALSE
#>      5:     TRUE Control     FALSE          FALSE
#>     ---                                          
#> 665583:    FALSE Control     FALSE          FALSE
#> 665584:    FALSE Control     FALSE          FALSE
#> 665585:    FALSE Control     FALSE          FALSE
#> 665586:    FALSE Control     FALSE          FALSE
#> 665587:    FALSE Control      TRUE          FALSE
## recalculate SecToIcu
## (grouped by case, so that Episode indexes the ICU admissions of that case)
greifswald[, SecToIcu := SecToIcu - SecToIcu[isIcuAdmission][Episode], by = Id]
## drop helper columns
greifswald[, `:=` (isNewWard = NULL, isIcuAdmission = NULL)]
#>             Id   Age    Sex Diagnosis     Center        Set Sender Episode
#>          <int> <int> <char>    <char>     <char>     <char> <char>   <int>
#>      1:      1    25      W   Control Greifswald Validation    AMB       1
#>      2:      2    75      M   Control Greifswald Validation    GEN       1
#>      3:      3    77      W    Sepsis Greifswald Validation  OTHER       1
#>      4:      3    77      W    Sepsis Greifswald Validation  OTHER       1
#>      5:      3    77      W    Sepsis Greifswald Validation  OTHER       1
#>     ---                                                                   
#> 665583: 169140    56      M   Control Greifswald Validation    GEN       1
#> 665584: 169140    56      M   Control Greifswald Validation    GEN       1
#> 665585: 169140    56      M   Control Greifswald Validation    GEN       1
#> 665586: 169141    60      W   Control Greifswald Validation     ED       1
#> 665587: 169141    60      W   Control Greifswald Validation    GEN       1
#>           Time TargetIcu SecToIcu   CRP   HGB   MCV   PCT   PLT   RBC   WBC
#>          <num>    <char>    <num> <num> <num> <num> <num> <int> <num> <num>
#>      1:      0      <NA>       NA  15.5   7.0  80.5    NA   264   4.2  8.40
#>      2:      0      <NA>       NA   7.4   8.4  87.9    NA   260   4.8  8.47
#>      3:      0      <NA>       NA  96.1   4.8  81.7    NA   385   3.0 13.20
#>      4: 318840      <NA>       NA  57.0   4.4  82.2    NA   416   2.8 14.20
#>      5: 578640      <NA>       NA  93.4   5.7  82.0  0.22   437   3.5 13.80
#>     ---                                                                    
#> 665583: 118380      <NA>       NA    NA   8.7  88.1    NA   200   4.7  6.20
#> 665584: 168660      <NA>       NA  95.0   8.7  88.4    NA   233   4.7  6.92
#> 665585: 340440      <NA>       NA  63.6   8.1  87.6    NA   225   4.5  3.70
#> 665586:      0      <NA>       NA    NA   9.1  90.0    NA   337   4.8 10.80
#> 665587:  70740      <NA>       NA   4.7   9.7  91.7    NA   371   5.2 11.60
#>         Excluded   Label
#>           <lgcl>  <char>
#>      1:    FALSE Control
#>      2:    FALSE Control
#>      3:     TRUE Control
#>      4:     TRUE Control
#>      5:     TRUE Control
#>     ---                 
#> 665583:    FALSE Control
#> 665584:    FALSE Control
#> 665585:    FALSE Control
#> 665586:    FALSE Control
#> 665587:    FALSE Control