Details¶

Goal¶

The goal of VESSEL12 is to compare methods for (semi-)automatic segmentation of the vessels in the lungs from chest computed tomography scans taken from both healthy and diseased populations.

Important dates¶

November 25, 2011: Data ready for download
April 1, 2012: Deadline for submission of results
April 7, 2012: Evaluation results sent to participants
April 13, 2012: Deadline for abstract submission
May 2, 2012: Workshop at ISBI

Task¶

The task in this challenge is to identify vessels in CT images of human lungs. The data set contains scans from asymptomatic subjects as well as scans from patients with respiratory diseases that affect the lungs in such a way that identifying vessels becomes challenging.

Rules¶

Rules of participation: if you submit results before April 1, we expect you to submit an abstract before April 13 and, if at all possible, to participate in the ISBI workshop. Rules regarding use of data: for now, this data may only be used for the purpose of participating in this challenge.

Data Description¶

For this challenge, a number of chest CT scans are available for download. The scans come from a variety of sources and represent a variety of clinically common scanners and protocols. The scans have been selected such that a contrast agent was used in approximately half of them. About half of the scans contain abnormalities such as emphysema, nodules or pulmonary embolisms. The maximum slice spacing present is 1 mm and most scans are (near) isotropic. To ensure consistent evaluation, reference vessel segmentations for the data cannot be downloaded and will not be made available in the future.

For each scan in the VESSEL12 dataset a binary lung mask is available on the download page. It is not required to use the lung masks to participate in the challenge, but only voxels inside these lung masks will be used in the evaluation. The lung masks are provided as-is, without a claim of being perfect.

Data format¶

Downloaded files end with .tar.bz2. They should first be decompressed with bzip2 and subsequently untarred. Many programs, for example the free program 7zip, can do this.

Each downloaded file contains CT scans, stored in Meta (or MHD/RAW) format. This format stores an image as an ASCII readable header file with extension .mhd and a separate binary file for the image data with extension .raw. This format is ITK compatible. Documentation is available here. Applications that can read the data include MeVisLab and SNAP. If you want to write your own code to read the data, note that the header file lists the dimensions of the scan and the voxel spacing. In the raw file the values for each voxel are stored consecutively, with the index running first over x, then y, then z.

The voxel type for the CT scans is SHORT (16 bit signed). The voxel type for the lung masks is UCHAR (8 bit unsigned). The lung masks contain only the values 0 for non-lung and 1 for lung.
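For readers who want to write their own reader, the following is a minimal sketch in Python with NumPy, not an official loader. It assumes an uncompressed, little-endian MHD/RAW pair and the usual MetaImage header keys (DimSize, ElementSpacing, ElementType, ElementDataFile); check these against the headers you actually download. The function name read_mhd is made up for this example.

```python
import os
import numpy as np

# Assumed mapping for the two voxel types mentioned above (little-endian data).
MET_TO_DTYPE = {"MET_SHORT": "<i2", "MET_UCHAR": "u1"}

def read_mhd(header_path):
    """Read an uncompressed MHD/RAW pair; return (volume indexed [z, y, x], spacing)."""
    fields = {}
    with open(header_path, "r") as f:
        for line in f:
            if "=" in line:
                key, value = line.split("=", 1)
                fields[key.strip()] = value.strip()

    nx, ny, nz = (int(v) for v in fields["DimSize"].split())
    spacing = tuple(float(v) for v in fields["ElementSpacing"].split())
    dtype = np.dtype(MET_TO_DTYPE[fields["ElementType"]])

    # The raw file is assumed to sit next to the header file.
    raw_path = os.path.join(os.path.dirname(header_path), fields["ElementDataFile"])
    data = np.fromfile(raw_path, dtype=dtype, count=nx * ny * nz)

    # x runs fastest in the file, so the C-ordered array has shape (z, y, x).
    return data.reshape(nz, ny, nx), spacing
```

With this layout, the voxel at coordinates (x, y, z) is volume[z, y, x].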

Example scans¶

For this challenge, three example scans can be downloaded. For each of these scans the lung mask image and the annotation csv file are included in the download. The annotation protocol used for these scans is the same as the one used for the 20 VESSEL12 test scans. Each annotation file contains a list of labeled points for a single scan. Each point has been labeled by three annotators independently. Only points on which all three annotators agreed on the label have been included.

The annotation files are in csv format. Each point is written as "x,y,z,label".
x, y and z designate 0-based voxel coordinates. The point 0,0,0 is the voxel in the upper left corner of the first slice. A label of 1 indicates a vessel voxel and a label of 0 indicates a voxel classified as non-vessel.
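As an illustration of the annotation format, here is a small Python sketch that parses "x,y,z,label" rows. It assumes the files contain only integer rows without a header line; the sample rows are made up and do not come from the challenge data.

```python
import csv
import io

def read_annotations(lines):
    """Parse 'x,y,z,label' rows into a list of ((x, y, z), label) tuples."""
    points = []
    for row in csv.reader(lines):
        if len(row) != 4:
            continue  # skip blank or malformed rows
        x, y, z, label = (int(v) for v in row)
        points.append(((x, y, z), label))
    return points

# Made-up rows, only to show the shape of the data.
sample = io.StringIO("10,20,3,1\n11,21,3,0\n")
points = read_annotations(sample)
vessel_points = [p for p, label in points if label == 1]
print(len(points), len(vessel_points))  # 2 1
```

For a real file, pass an open file object instead of the StringIO. If a volume is stored as a NumPy array indexed [z, y, x], the annotated voxel is volume[z, y, x].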

Submission Requirements¶

All teams are encouraged to submit a probabilistic segmentation (on a scale of 0 to 255) for each image, where 0 indicates a very low probability of being vessel and 255 a high probability. Binary segmentations can also be submitted. In the case of a binary submission, a distance transform will be applied to generate a probabilistic mask. In addition to the segmentations, each team should submit a description of their algorithm in pdf format before April 13. The section Description of the algorithm - checklist contains more information on this.

Each submission should be a single compressed archive containing the vessel segmentations of all images. Any archive that can be decompressed with 7zip is allowed. Segmentation files should be directly in the root of the archive, and not nested in a folder structure. Each segmentation should be a MHD/RAW file of type 8 bit unsigned char. The dimensions of each segmentation should be the same as the scan it is based on. E.g. the dimensions of the segmentation for VESSEL12_01 should be 512,512,355. For convenience, we list the expected filename and file size in bytes for each segmentation:

ResultFileName : Size In bytes
VESSEL12_01.raw : 93061120
VESSEL12_02.raw : 108789760
VESSEL12_03.raw : 139984896
VESSEL12_04.raw : 111673344
VESSEL12_05.raw : 111149056
VESSEL12_06.raw : 98304000
VESSEL12_07.raw : 120848384
VESSEL12_08.raw : 115867648
VESSEL12_09.raw : 142344192
VESSEL12_10.raw : 111673344
VESSEL12_11.raw : 110362624
VESSEL12_12.raw : 116916224
VESSEL12_13.raw : 123469824
VESSEL12_14.raw : 101187584
VESSEL12_15.raw : 99090432
VESSEL12_16.raw : 118226944
VESSEL12_17.raw : 112459776
VESSEL12_18.raw : 106954752
VESSEL12_19.raw : 103809024
VESSEL12_20.raw : 106430464
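Because each segmentation uses one byte per voxel, the expected file size is simply the product of the scan dimensions. The Python sketch below shows this check; the helper names are hypothetical and the code is only a convenience, not part of the submission system.

```python
import os

def expected_raw_size(dim_size, bytes_per_voxel=1):
    """Size in bytes of a .raw file for a scan with dimensions (nx, ny, nz)."""
    nx, ny, nz = dim_size
    return nx * ny * nz * bytes_per_voxel

def raw_size_matches(raw_path, dim_size):
    """True if the file on disk has exactly the size expected for 8 bit voxels."""
    return os.path.getsize(raw_path) == expected_raw_size(dim_size)

# VESSEL12_01 has dimensions 512,512,355 (see above).
print(expected_raw_size((512, 512, 355)))  # 93061120
```

The printed value matches the size listed for VESSEL12_01.raw.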

Evaluation¶

All submissions are evaluated against a manually labeled reference. The goal of the evaluation is to compare the performance of different methods in identifying vessels in the lungs.

Reference Standard Protocol¶

Several axial slices are selected from each image for manual labeling. In each slice, a large number of points within the lung fields (local maxima points and random points) are labeled by human observers in the following categories:
- vessel,
- non-vessel: lung parenchyma,
- non-vessel: fissure,
- non-vessel: airway/airway wall,
- non-vessel: lesion

Note that in this way of constructing the reference, where local maxima are used to select which points to label, many vessel points are identified close to the vessel centerlines. The evaluation is therefore not focused on the accuracy of vessel diameter assessment of the automatic methods. Instead, our focus is on the task of identification of vessels.

Evaluation Metric¶

ROC curve analysis will be used to evaluate the performance of each submission at different probability threshold values. A binary submission by itself yields only a single operating point. For binary submissions, a signed distance transform will therefore be computed so that a ROC curve can still be obtained.

Additionally, ROC curve analysis will be performed to zoom in on performance for segmenting large and small vessels, and on the ability of the method to differentiate vessels from airway walls and from dense lesions, respectively.
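To make the ROC analysis concrete, the following Python sketch computes a ROC curve and the area under it from submitted values at labeled points. It is an illustration under simple assumptions (a point counts as vessel when its value is at or above the threshold; the area uses the trapezoid rule) and not the organizers' evaluation code. The scores and labels are made up.

```python
import numpy as np

def roc_points(scores, labels):
    """Return (fpr, tpr) arrays, sweeping the threshold from high to low."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    thresholds = np.unique(scores)[::-1]  # distinct values, descending
    n_pos = labels.sum()
    n_neg = labels.size - n_pos
    tpr = [(scores[labels] >= t).sum() / n_pos for t in thresholds]
    fpr = [(scores[~labels] >= t).sum() / n_neg for t in thresholds]
    # Start the curve at (0, 0); the lowest threshold already gives (1, 1).
    return np.array([0.0] + fpr), np.array([0.0] + tpr)

def area_under_curve(fpr, tpr):
    """Trapezoid-rule area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Made-up values at four labeled points (1 = vessel, 0 = non-vessel).
scores = [50, 200, 100, 150]
labels = [0, 1, 1, 0]
fpr, tpr = roc_points(scores, labels)
print(area_under_curve(fpr, tpr))  # 0.75
```

In this toy example three of the four vessel/non-vessel pairs are ranked correctly, which is why the area is 0.75.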

Evaluation Result¶

Each team will receive statistics regarding their results. After the workshop, an overview article will be compiled by the organizers of the challenge, with up to three members per participating team as co-authors.

The format of the results table is as follows:

Optimal Threshold is computed to be at: xxx


Dataset	Az	Specificity at optimal threshold	Sensitivity at optimal threshold
Vessels/Non-vessels
VESSEL12_01
VESSEL12_02
...
VESSEL12_20
Small vessels/Non-vessels
Medium vessels/Non-vessels
Large vessels/Non-vessels
Vessels/Airway walls
Vessels/Dense abnormalities
Vessels/Mucus-filled bronchi
Vessels in dense abnormality/Dense abnormalities (Contrast scans only)
Vessels/Nodules (CAD)

Results table Description¶

Metrics: For each dataset, two types of metrics are displayed. First, Area Under the Curve (Az) denotes the area under the ROC curve. For probabilistic submissions this is computed directly from the submitted data. For binary submissions, the ROC curve is computed on a signed distance transform of the binary mask. Second, Specificity/Sensitivity at optimal threshold denotes, for probabilistic submissions, the point on the ROC curve closest to the optimal classifier, i.e. closest to the upper left corner of the ROC graph for the dataset Vessels/Non-vessels. For binary submissions these values denote the actual Specificity/Sensitivity (i.e. operating point) of the binary submission.
Optimal threshold: The values for Specificity/Sensitivity at optimal threshold were computed using this threshold. For binary submissions, this field reads "Specificity/Sensitivity is calculated for your original binary mask". This means that Specificity/Sensitivity is computed on the binary mask that was originally submitted; the signed distance transform of the binary mask is not used for these values.
Vessels/Non-vessels: This dataset consists of all points which were given the same label by three independent observers. These points are called unanimous points in the descriptions below. Metrics are computed over all 20 scans combined.
Small/Medium/Large vessels/Non-vessels: Intensity (i.e. HU value) after blurring with a 1 mm 3D kernel is used as a measure of vessel size. To obtain equally sized small/medium/large vessel datasets, the 33rd and 66th percentiles of the HU value after blurring are used to divide all unanimous points labeled as vessel. Subsequently, all unanimous points not labeled as vessel are added to each of the three datasets as the non-vessel class. Metrics are computed over all 20 scans combined.
Vessels/Airway walls: The positive class in this dataset consists of all unanimous points labeled as vessel. For the negative class, points on the airway walls were sampled throughout the bronchial tree in each scan. Metrics are computed over all 20 scans combined.
Vessels/Dense abnormalities: The positive class in this dataset consists of all unanimous points labeled as vessel. The negative class consists of points sampled inside dense abnormalities (e.g. atelectasis, nodules, fibrosis, adhesive straining). Scans in which no dense abnormalities could be found were excluded from the computation of this score.
Vessels/Mucus-filled bronchi: The positive class in this dataset consists of all unanimous points labeled as vessel. The negative class consists of points sampled inside mucus-filled bronchi. Scans in which no mucus-filled bronchi could be found were excluded from the computation of this score.
Vessels in dense abnormalities/Dense abnormalities (Contrast scans only): For this score, points were sampled inside dense abnormalities in contrast-enhanced scans. Points labeled as vessel are the positive class; points labeled as dense abnormality are the negative class. Scans in which no dense abnormalities could be found and scans which were not contrast-enhanced were excluded from the computation of this score.
Vessels/Nodules (CAD): The positive class in this dataset consists of all unanimous points labeled as vessel. For the negative class, points were sampled inside nodules. A nodule detection algorithm was used to make sure no nodules were overlooked. Scans in which no nodules could be found were excluded from the computation of this score.
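The "closest to the upper left corner" rule for the optimal threshold can be sketched as follows. This Python example is one plausible reading of the description above, assuming Euclidean distance in ROC space and that a point counts as vessel when its value is at or above the threshold; the organizers' exact implementation is not documented here, and the sample values are made up.

```python
import numpy as np

def closest_to_corner(scores, labels):
    """Return (threshold, sensitivity, specificity) for the ROC point nearest (0, 1)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best = None
    for t in np.unique(scores):
        predicted = scores >= t
        sensitivity = (predicted & labels).sum() / labels.sum()
        specificity = (~predicted & ~labels).sum() / (~labels).sum()
        # Distance to the ideal classifier (sensitivity 1, specificity 1).
        distance = np.hypot(1.0 - specificity, 1.0 - sensitivity)
        if best is None or distance < best[0]:
            best = (distance, float(t), float(sensitivity), float(specificity))
    return best[1:]

# Made-up values at seven labeled points (1 = vessel, 0 = non-vessel).
scores = [10, 40, 60, 90, 200, 220, 30]
labels = [0, 0, 1, 0, 1, 1, 0]
print(closest_to_corner(scores, labels))  # (60.0, 1.0, 0.75)
```

When two thresholds are equally close to the corner, this sketch keeps the lower one; that tie-breaking choice is arbitrary.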

Description of the algorithm - checklist¶

Each submission should contain a description file in pdf format. The length of the description should be about 1 - 2 pages. For convenience, we provide a checklist below of items that we believe should be mentioned in a description of a vessel segmentation algorithm.

Briefly describe each step in the structure of the algorithm (If applicable, which type of algorithms were used for preprocessing? How are different types of information combined?).
List limitations of the algorithm. Is the algorithm specifically designed to segment only certain types of scans? Is your algorithm intended for segmenting vessels in pathological lungs? Was it optimized to work for scans with thick or thin slices? Are other technical scan parameters expected to influence segmentation performance?
Was the algorithm trained with example data? If so, describe the characteristics of the training data.
If the algorithm has been tested on other databases, you could consider including those results.
What is the average runtime of your algorithm, and on which system is this runtime achieved?
Is your algorithm automatic or semi-automatic? If user input is used, how much is needed and in what way?

FAQ¶

How thick should vessels be labeled?
The thickness of labeling does not impact the overall score greatly. The reference standard is constructed to focus on identification of vessels and not on the accuracy of vessel diameter assessment. Moreover, we encourage teams to submit a probabilistic segmentation instead of a binary one. We will evaluate using ROC curve analysis.
What does your reference data look like?
The description of the reference data can be found here. Three example scans which include lungmasks and reference annotations can be downloaded here.
Do you make a distinction between arteries and veins?
No. In this challenge we only look at vessels in general.
Both automated and semi-automated methods are allowed to participate. Are they evaluated using the same metric?
Both are evaluated using the same method. The nature of the method (automatic / semi-automatic) will be stated upfront when publishing the results. Teams are required to include a description of their algorithm in their submission.
We want to train/extend our algorithm to work on vessels in the lung. Is there any training data available?
Three example scans including reference annotations can be found in the download section.