Summary: | Poster presentation at the NORA Annual Conference 2023, 5-6 June 2023, Tromsø, Norway. Developing, validating and comparing machine learning methods requires access to data, sometimes large amounts of it. In health applications, data sharing can be restricted to protect patient privacy, which makes the few publicly available data sets even more valuable to the machine learning community. One such type of data is H&E whole slide images (WSIs): scans of stained tumour tissue used in hospitals to detect and classify cancer (see Fig. 1). The Cancer Genome Atlas (TCGA) has made an enormous contribution to publicly available data sets. For breast cancer H&E WSIs, TCGA is by far the largest data set, with more than 1,000 patients, about twice as many as the second largest contributor, the two Camelyon competition data sets [1], with 399 + 200 patients.
|