Classifying Cancer Pathology Reports with Hierarchical Self-Attention Networks [electronic resource]


Author:

Language: eng

Classification number: 616.99 Tumors and miscellaneous communicable diseases

Publication information: Washington, D.C. : Oak Ridge, Tenn. : United States. Dept. of Energy. Office of Science ; Distributed by the Office of Scientific and Technical Information, U.S. Dept. of Energy, 2019

Physical description: Size: Article No. 101726 : digital, PDF file.

Collection: Metadata

ID: 259975


Abstract

We introduce a deep learning architecture, hierarchical self-attention networks (HiSANs), designed for classifying pathology reports and show how its unique architecture leads to a new state-of-the-art in accuracy, faster training, and clear interpretability. We evaluate performance on a corpus of 374,899 pathology reports obtained from the National Cancer Institute's (NCI) Surveillance, Epidemiology, and End Results (SEER) program. Each pathology report is associated with five clinical classification tasks: site, laterality, behavior, histology, and grade. We compare the performance of the HiSAN against other machine learning and deep learning approaches commonly used on medical text data: Naive Bayes, logistic regression, convolutional neural networks, and hierarchical attention networks (the previous state-of-the-art). We show that HiSANs are superior to other machine learning and deep learning text classifiers in both accuracy and macro F-score across all five classification tasks. Compared to the previous state-of-the-art, hierarchical attention networks, HiSANs not only are an order of magnitude faster to train, but also achieve about 1% better relative accuracy and 5% better relative macro F-score.
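The abstract reports results in terms of accuracy, macro F-score, and relative improvement. As a minimal sketch of how these quantities are commonly computed (not the authors' evaluation code), the Python below uses hypothetical toy labels for a laterality-style task; the function names and the data are assumptions made for illustration only.

```python
def accuracy(y_true, y_pred):
    """Fraction of reports whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted members contributes 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def relative_improvement(new, old):
    """Improvement of `new` over `old`, expressed as a fraction of `old`."""
    return (new - old) / old


# Hypothetical, imbalanced toy data: a classifier that always predicts "left".
y_true = ["left"] * 8 + ["right"] * 2
y_pred = ["left"] * 10

print(accuracy(y_true, y_pred))   # high, because "left" dominates
print(macro_f1(y_true, y_pred))   # much lower, because "right" is never found
```

On imbalanced label sets such as this one, macro F-score penalizes a classifier that ignores rare classes even when its accuracy looks good, which is one reason the two metrics are usually reported together.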

Subjects

1. 60 applied life sciences
2. 97 mathematics and computing
3. Cancer pathology reports
4. Clinical reports
5. Deep learning
6. Natural language processing
7. Text classification
