
The following section describes how the samples listed in the provided dataset were collected and prepared. We then explain how the data were annotated and which methods were used to verify the validity of the dataset.
Ethics approval and consent to participate
This work was approved by the Ethics Committee under approval number 2022Zfyj295-01. The data for this work were obtained through cooperation with the Heilongjiang Maternal and Child Health Hospital (HMCHH). The original data collection was approved by HMCHH and adhered to the principles of the Declaration of Helsinki. The HMCHH Institutional Review Board granted a waiver of patient informed consent because all samples were de-identified.
Sample preparation and digitization
The dataset contains samples from patients who underwent cervical cytology examination at the Heilongjiang Maternal and Child Health Hospital between October 2018 and May 2019. According to the examination reports, a total of 129 TCT slides (see Ethics approval and consent to participate) were collected. As shown in Figure 1, each slide was digitized and divided into 333 non-overlapping patches (2048 × 2048 pixels) at 20× objective magnification using an Olympus BX53 optical microscope. Patches with low information content, such as those dominated by background or out of focus, were removed. The remaining 8,037 patches are stored in the dataset as cytology image files.
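The tiling step can be illustrated with a minimal sketch. The patch size comes from the text; the function names, the example slide dimensions, and in particular the standard-deviation filter and its threshold are assumptions made for illustration, since the text does not state how low-information patches were identified.

```python
from typing import Iterator, List, Tuple

PATCH = 2048  # patch edge length in pixels, as stated in the text


def patch_origins(width: int, height: int, size: int = PATCH) -> Iterator[Tuple[int, int]]:
    """Yield the (x, y) top-left corner of every full, non-overlapping patch."""
    for y in range(0, height - size + 1, size):
        for x in range(0, width - size + 1, size):
            yield x, y


def is_informative(gray_pixels: List[int], min_std: float = 10.0) -> bool:
    """Hypothetical filter: keep a patch only if its gray-level spread is large.

    A nearly uniform patch (for example, pure background) has a small
    standard deviation. The threshold of 10.0 is an arbitrary assumption.
    """
    n = len(gray_pixels)
    mean = sum(gray_pixels) / n
    variance = sum((p - mean) ** 2 for p in gray_pixels) / n
    return variance ** 0.5 >= min_std


# Hypothetical slide size: three patches wide (plus a partial column) and two high.
origins = list(patch_origins(width=3 * PATCH + 100, height=2 * PATCH))
print(len(origins))   # 6
print(origins[:2])    # [(0, 0), (2048, 0)]
```

Partial patches at the right and bottom borders are dropped in this sketch; whether the original pipeline did the same is not stated.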
Figure 1. Workflow for data generation and experimental evaluation. (a) TCT samples were digitized into whole slide images (WSIs) and divided into patches of 2048 × 2048 pixels. (b) All patches were annotated and reviewed by two pathologists, then checked by an experienced pathologist to finalize the annotation of abnormal cells. (c) Several representative detection models were adopted to validate our dataset. The experiment consists of splitting the dataset, training the model, and evaluating the prediction results.
Annotation process
Three pathologists took part in the bounding-box annotation of abnormal cells for this dataset, with the aim of annotating the abnormal cervical cells in each patch comprehensively. We refer to the three pathologists as A, B, and C; reader A had about 33 years of experience in reading cervical cytology, and readers B and C each had about 10 years of experience. Cell abnormality was determined according to the ACOG guidelines [14]. Following these guidelines, the readers drew bounding boxes around the abnormal cells in each image patch using the Colabeler tool (http://www.jingbiaozhu.com/), as shown in Figure 1b.
The final annotation files were generated in three steps: initial annotation, verification, and final check. Each image was first randomly assigned to reader B or reader C. Once the annotation was completed, the image and its annotations were passed to the other reader for review. Finally, the annotations were checked and exported by reader A.
All annotation files are stored in .xml format and share the same file name as the corresponding images.
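As a rough illustration of how such annotation files might be read, the sketch below parses bounding boxes from XML. The tag layout (`object`, `name`, `bndbox`, `xmin`, and so on) follows the widely used Pascal VOC convention and is an assumption; the text only states that the files are XML. The sample content, the label `abnormal`, and the file name are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical annotation content. The Pascal VOC-style tag names are an
# assumption and are not confirmed by the dataset description.
SAMPLE_XML = """
<annotation>
  <filename>patch_0001</filename>
  <object>
    <name>abnormal</name>
    <bndbox><xmin>100</xmin><ymin>150</ymin><xmax>300</xmax><ymax>400</ymax></bndbox>
  </object>
  <object>
    <name>abnormal</name>
    <bndbox><xmin>900</xmin><ymin>20</ymin><xmax>1000</xmax><ymax>180</ymax></bndbox>
  </object>
</annotation>
"""

Box = Tuple[str, int, int, int, int]  # (label, xmin, ymin, xmax, ymax)


def read_boxes(xml_text: str) -> List[Box]:
    """Return every labelled bounding box found in one annotation document."""
    root = ET.fromstring(xml_text.strip())
    boxes: List[Box] = []
    for obj in root.iter("object"):
        label = obj.findtext("name", default="")
        bndbox = obj.find("bndbox")
        if bndbox is None:
            continue  # skip objects that carry no box
        xmin, ymin, xmax, ymax = (
            int(float(bndbox.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((label, xmin, ymin, xmax, ymax))
    return boxes


for box in read_boxes(SAMPLE_XML):
    print(box)
```

If the exported files use a different schema, only the tag names inside `read_boxes` would need to change.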
Evaluation methods
To verify the validity of the proposed dataset, we used several previously published representative detection models to perform abnormal cervical cell detection: the two-stage architectures Faster R-CNN [15], Cascade R-CNN [16], and Sparse R-CNN [17]; the one-stage architectures SSD [18], RetinaNet [19], and FCOS [20]; and the end-to-end architectures YOLOv3 [21], YOLOv7 [22], and DETR [23]. First, we randomly selected 20% of all patients as the test subset. The remaining patients were then randomly divided into five folds to perform five-fold cross-validation on the training set. Note that all image patches belonging to the same patient are placed in the same subset in this process.
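The patient-level split can be sketched as follows. The 20% test fraction and the five folds come from the text; the function names, the fixed seed, the rounding rule, and the example of 129 patient identifiers (one per slide) are assumptions for illustration only.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_patients(
    patient_ids: Sequence[str],
    test_fraction: float = 0.2,
    n_folds: int = 5,
    seed: int = 0,  # arbitrary seed, chosen only to make the sketch repeatable
) -> Tuple[List[str], List[List[str]]]:
    """Hold out a test subset of patients, then split the rest into folds."""
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    test_ids, remaining = ids[:n_test], ids[n_test:]
    folds = [remaining[i::n_folds] for i in range(n_folds)]
    return test_ids, folds


def images_for(patients: Sequence[str], patient_of_image: Dict[str, str]) -> List[str]:
    """Collect all images whose patient is in the given subset."""
    chosen = set(patients)
    return [img for img, pid in patient_of_image.items() if pid in chosen]


# Hypothetical identifiers: one patient per slide is an assumption.
patients = [f"P{i:03d}" for i in range(129)]
test_ids, folds = split_patients(patients)
print(len(test_ids), [len(f) for f in folds])  # 26 [21, 21, 21, 20, 20]
```

For fold k, `folds[k]` would serve as the validation subset and the other four folds as the training subset. Because the split is made over patient identifiers and images are assigned afterwards through `images_for`, patches from one patient cannot leak across subsets.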
In each fold, we trained for 30 epochs on the training subset and retained the model that performed best on the validation subset during training. After training was completed, inference was run on the test subset to evaluate the model. Model parameters were updated with the AdamW [24] optimizer using a batch size of 8, an initial learning rate (LR0) of 2 × 10⁻⁵, and a weight decay of 1 × 10⁻⁴. To avoid overfitting, the cosine annealing [25] algorithm was used to adjust the learning rate.
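For reference, a cosine annealing schedule can be written in a few lines. The initial learning rate and the number of epochs are taken from the text; the minimum learning rate of zero, the per-epoch update, and the use of a single cycle without warm restarts are assumptions, as the text does not give these details.

```python
import math

LR0 = 2e-5      # initial learning rate, from the text
EPOCHS = 30     # training length, from the text
LR_MIN = 0.0    # assumption: the text does not state a minimum learning rate


def cosine_annealing_lr(
    epoch: int, lr0: float = LR0, lr_min: float = LR_MIN, total: int = EPOCHS
) -> float:
    """Learning rate after `epoch` epochs of a single cosine cycle."""
    return lr_min + 0.5 * (lr0 - lr_min) * (1.0 + math.cos(math.pi * epoch / total))


for epoch in (0, 10, 20, 30):
    print(epoch, cosine_annealing_lr(epoch))
```

The rate starts at LR0, falls slowly at first, fastest around the midpoint, and flattens out as it approaches the minimum at the final epoch.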
Data augmentation was also adopted to mitigate overfitting. In our experiments, we applied random horizontal flipping, vertical flipping, rotation, brightness change, grayscale conversion, and Gaussian blurring. These transformations helped the model learn a more robust representation of the input image because it saw the images in many transformed views. The input images were randomly rotated by 90, 180, or 270 degrees. A range of [0.8, 1.5] was used for the brightness change, and a sigma value in [0, 5] was used for the Gaussian blur.
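In a detection setting, the geometric augmentations must also be applied to the bounding boxes. The sketch below samples augmentation parameters in the stated ranges and transforms a box accordingly. The rotation angles and the brightness and sigma ranges come from the text; the 0.5 flip probability, the counter-clockwise rotation direction, the order of operations, and all function names are assumptions.

```python
import random
from typing import Dict, Tuple

Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax) in pixels
SIZE = 2048                      # square patch edge length, from the text


def hflip(box: Box, size: int = SIZE) -> Box:
    """Mirror a box left-to-right."""
    x0, y0, x1, y1 = box
    return (size - x1, y0, size - x0, y1)


def vflip(box: Box, size: int = SIZE) -> Box:
    """Mirror a box top-to-bottom."""
    x0, y0, x1, y1 = box
    return (x0, size - y1, x1, size - y0)


def rot90_ccw(box: Box, size: int = SIZE, k: int = 1) -> Box:
    """Rotate a box counter-clockwise by k * 90 degrees (direction is an assumption)."""
    for _ in range(k % 4):
        x0, y0, x1, y1 = box
        box = (y0, size - x1, y1, size - x0)
    return box


def sample_params(rng: random.Random) -> Dict[str, object]:
    """Draw one set of augmentation parameters in the ranges given in the text."""
    return {
        "hflip": rng.random() < 0.5,          # probability is an assumption
        "vflip": rng.random() < 0.5,          # probability is an assumption
        "rot_k": rng.choice([1, 2, 3]),       # 90, 180, or 270 degrees
        "brightness": rng.uniform(0.8, 1.5),  # multiplicative factor
        "blur_sigma": rng.uniform(0.0, 5.0),  # Gaussian blur sigma
    }


def transform_box(box: Box, params: Dict[str, object], size: int = SIZE) -> Box:
    """Apply the sampled geometric operations to a box (photometric ones leave it unchanged)."""
    if params["hflip"]:
        box = hflip(box, size)
    if params["vflip"]:
        box = vflip(box, size)
    return rot90_ccw(box, size, int(params["rot_k"]))


box = (100, 200, 400, 600)
print(hflip(box))      # (1648, 200, 1948, 600)
print(rot90_ccw(box))  # (200, 1648, 600, 1948)
```

Brightness, grayscale, and blur change only pixel values, so the box coordinates are affected by the flips and rotations alone.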