SUMMARY: Watson for Oncology is an Artificial Intelligence (AI) computer system developed by IBM in collaboration with Memorial Sloan Kettering Cancer Center. This revolutionary tool can analyze the meaning and context of structured and unstructured data in the patient’s chart, assimilate key patient information, and then deliver evidence-based treatment recommendations through analytical approaches. The authors conducted this study to assess concordance between the Artificial Intelligence platform, Watson for Oncology (WFO), and their own multidisciplinary tumor board, a group of 12 to 15 oncologists who met weekly to review cases from their hospital system. The goal of the study was to understand how Watson for Oncology would affect oncologists’ day-to-day practice, and how Watson’s recommendations compared with the decisions of their team of experts.
The researchers studied 638 patients with breast cancer treated at Manipal Comprehensive Cancer Center in Bengaluru, India. Patient data were entered into the WFO computer system, and the researchers analyzed the degree of concordance between WFO’s recommendations and those of the tumor board, as well as the time each took to arrive at its recommendations. In this study, WFO analyzed more than 100 patient attributes for breast cancer and provided treatment options ranked as follows: Recommended Standard Treatment (REC), For Consideration (FC) and Not Recommended (NREC). The recommendations provided by WFO were evidence-based, and the computer system allowed the treating physicians to learn more about the recommendations and the rationale behind them.
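To make the concordance measure concrete, the following is a minimal Python sketch. It assumes that a case counts as concordant when the treatment chosen by the tumor board falls in WFO's REC or FC tier, which is how the results in this summary are phrased; the study's exact scoring rules are not given here. The function names and the toy cases are hypothetical and are not data from the study.

```python
from collections import defaultdict

# Assumption: a tumor board choice that WFO ranked REC or FC counts as concordant.
CONCORDANT_TIERS = {"REC", "FC"}


def concordance_rate(tiers):
    """Fraction of cases whose tumor board choice landed in a concordant WFO tier."""
    if not tiers:
        raise ValueError("at least one case is required")
    hits = sum(1 for tier in tiers if tier in CONCORDANT_TIERS)
    return hits / len(tiers)


def concordance_by_subgroup(cases):
    """Compute the concordance rate separately for each subgroup label."""
    grouped = defaultdict(list)
    for subgroup, tier in cases:
        grouped[subgroup].append(tier)
    return {name: concordance_rate(tiers) for name, tiers in grouped.items()}


# Invented toy data: (subgroup, WFO tier of the tumor board's chosen treatment).
toy_cases = [
    ("non-metastatic", "REC"),
    ("non-metastatic", "FC"),
    ("non-metastatic", "NREC"),
    ("non-metastatic", "REC"),
    ("metastatic", "REC"),
    ("metastatic", "NREC"),
]

overall = concordance_rate([tier for _, tier in toy_cases])
by_group = concordance_by_subgroup(toy_cases)
print(f"overall concordance: {overall:.0%}")
for name, rate in by_group.items():
    print(f"{name}: {rate:.0%}")
```

Splitting the rate by subgroup, as in the second function, mirrors the way the study reports separate figures for metastatic and non-metastatic disease.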
It was noted that 90% of WFO’s Recommended Standard Treatment (REC) and For Consideration (FC) options were concordant with the recommendations of the tumor board. WFO recommendations were concordant nearly 80% of the time in non-metastatic breast cancer, but only 45% of the time in metastatic disease. In patients with triple-negative breast cancer, WFO agreed with the physicians 68% of the time, but in HER-2 negative cases, WFO’s recommendations matched the physicians’ recommendations only 35% of the time. The authors attributed the difference in concordance to the smaller number of treatment options for triple-negative breast cancer compared with HER-2 negative breast cancer. Further, including HER-2 patients made more treatment options available, which would increase the demands on human thinking capacity. Additionally, more complicated cases lead to more divergent opinions on the recommended treatment.
This study also compared the amount of time it took to provide recommendations after the data had been captured and analyzed. The manual process took an average of 20 minutes, which decreased to about 12 minutes as the tumor board gained more familiarity with the cases. Watson for Oncology, by comparison, took a median time of 40 seconds to capture and analyze data and give a treatment recommendation.
It was concluded that while Artificial Intelligence is a step towards personalized medicine, it should not be viewed as a replacement for a physician, but rather as a complement. In the end, the best treatment option for the patient should be determined together by the treating physician and the patient.
Double blinded validation study to assess performance of IBM artificial intelligence platform Watson for oncology in comparison with Manipal multidisciplinary tumor board - first study of 638 breast cancer cases. Somashekhar SP, Kumar R, Rauthan A, et al. Presented at: San Antonio Breast Cancer Symposium, Friday, Dec. 9, 2016; San Antonio, TX. Abstract S6-07.