Wang and Li of Sentrana approached the modeling challenge by assuming that a non-linear causal relationship exists between the disease progression rate and the initial condition. They then used the different dimensions of the available data to develop new variables that better predict the progression of ALS, looking for patterns behind the data such as relationships between different body parts. For example, they subdivided the ten component sub-scores of the ALS Functional Rating Scale (ALSFRS) into five body-part-related categories (including face, arms, legs, and chest). They found that the rapidity or stability of change in the face-related scores (which include tests of speech and swallowing) best predicted the overall disease trajectory, and that a decline in chest-related scores signaled the end stage of the disease. These new variables, created using a common-sense approach rather than a purely statistical one, allowed for better prediction of progression when used along with other variable sets. Because the variables created by the Sentrana team are meaningful on their own, they offer the added advantage that clinical trial experts can easily understand the data and interpret the results.
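To make the idea of body-part variables concrete, the following is a minimal Python sketch, not the Sentrana team's actual code. It groups the ten ALSFRS items into five categories and computes, for one patient, how fast each category's score changes over time. The category named "trunk", the assignment of items to categories, the item names, and the function names are all assumptions made for illustration; the source names only face, arms, legs, and chest.

```python
from statistics import mean

# Hypothetical grouping of the ten ALSFRS items into five categories.
# Only face, arms, legs, and chest are named in the source; "trunk" and
# the exact item assignments are assumptions for illustration.
BODY_PART_ITEMS = {
    "face": ["speech", "salivation", "swallowing"],
    "arms": ["handwriting", "cutting_food"],
    "trunk": ["dressing_hygiene", "turning_in_bed"],
    "legs": ["walking", "climbing_stairs"],
    "chest": ["breathing"],
}


def body_part_scores(visit):
    """Average the item scores (0 to 4 each) within every category."""
    return {
        part: mean([visit[item] for item in items])
        for part, items in BODY_PART_ITEMS.items()
    }


def slope(times, values):
    """Least-squares slope of values over time (score points per month)."""
    t_bar, v_bar = mean(times), mean(values)
    denom = sum((t - t_bar) ** 2 for t in times)
    if denom == 0:
        return 0.0  # a single time point carries no trend information
    num = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    return num / denom


def body_part_slopes(visits):
    """visits: list of (month, {item_name: score}) pairs for one patient."""
    months = [month for month, _ in visits]
    per_visit = [body_part_scores(scores) for _, scores in visits]
    return {
        part: slope(months, [s[part] for s in per_visit])
        for part in BODY_PART_ITEMS
    }
```

Calling `body_part_slopes` on a patient's visit history returns one rate of change per category. A strongly negative "face" value would correspond to the rapid face-related decline that the team found most predictive, and each value can be read directly by a clinician without reference to the model.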
The team selected a random forest method because of the relative accuracy of the learning algorithm and its ability to identify interactions among a variety of predictor variables. Random forest algorithms build many decision trees, each from a random sample of the data, and then aggregate the trees' results for greater predictive accuracy. Rather than use just one random forest, Wang and Li opted for a mixed model that incorporates multiple random forest models, in order to mitigate the potential impact of inconsistent or missing measurements across certain important predictor variables. The random forest models developed for each subset of the data were then combined to produce a final result.
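The approach described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the team's implementation: each "forest" here is a bagged set of one-split regression trees (stumps) rather than a full random forest, the combination rule is a plain average, and every function name and feature name is hypothetical. What it does show is the core idea of training one model per variable subset on only the patients who have those measurements, then combining whichever models apply to a new patient.

```python
import random
from statistics import mean


def fit_stump(rows, targets, feature):
    """Best single threshold split on one feature (minimum squared error)."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, targets) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, targets) if r[feature] > threshold]
        if not left or not right:
            continue
        l_mean, r_mean = mean(left), mean(right)
        sse = sum((y - l_mean) ** 2 for y in left)
        sse += sum((y - r_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, l_mean, r_mean)
    if best is None:  # no usable split: always predict the sample mean
        m = mean(targets)
        return (feature, float("inf"), m, m)
    return (feature, best[1], best[2], best[3])


def fit_forest(rows, targets, features, n_trees, rng):
    """Bagging: each stump gets a bootstrap sample and one random feature."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        sample = [rows[i] for i in idx]
        ys = [targets[i] for i in idx]
        forest.append(fit_stump(sample, ys, rng.choice(features)))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all stumps in one forest."""
    return mean([lo if row[f] <= t else hi for f, t, lo, hi in forest])


def fit_mixed(rows, targets, feature_subsets, n_trees=50, seed=0):
    """Train one forest per feature subset, using only fully observed rows."""
    rng = random.Random(seed)
    models = []
    for features in feature_subsets:
        keep = [i for i, r in enumerate(rows)
                if all(r.get(f) is not None for f in features)]
        if keep:
            forest = fit_forest([rows[i] for i in keep],
                                [targets[i] for i in keep],
                                features, n_trees, rng)
            models.append((features, forest))
    return models


def predict_mixed(models, row):
    """Average the forests whose features are all present in this row."""
    preds = [predict_forest(forest, row) for features, forest in models
             if all(row.get(f) is not None for f in features)]
    return mean(preds) if preds else None
```

Rows are dictionaries mapping a feature name to a number, with `None` marking a missing measurement. A patient who lacks one measurement is still scored by the forests built on the variables that are available, which is one way to limit the impact of inconsistent or missing data. Weighted averaging or a second-stage model are alternative ways to combine the forests.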
“It’s very exciting to use something that we do every day and apply those techniques in other really meaningful areas, especially considering the treatment limitations for this disease,” said Quantitative Modeler Guang “Eric” Li. “Also, the problem itself was interesting because there is not a great deal of existing research to refer to, meaning we needed to make certain assumptions and then revise them as we continued to learn with each modeling iteration. This discovery process was very rewarding,” he added.
“Our goal is not necessarily to have the greatest accuracy, but rather to raise questions that can help inform and perhaps guide future research. If we can ask questions that we haven’t asked before, then we can really drive new research discoveries for those in the field,” said Principal Scientist Liuxia Wang, PhD. “There are two key areas our research can inform: the possibility of stages in the disease and the relationship between body parts. We hope the questions we raised around these two areas will help guide the direction of new research,” added Wang.
To watch or download the presentation, please click here: http://sentrana.wistia.com/
To view the presentation slides, please visit: http://www.sentrana.com/
The RECOMB conference series was founded in 1997 to provide a scientific forum for theoretical advances in computational biology and their applications in molecular biology and medicine. The conference solicits research contributions from all areas of computational molecular biology. Learn more at http://recomb.org/
Sentrana is a scientific marketing company. We help our clients make more informed and accurate decisions about pricing, promotions & advertising, product assortment, and sales force alignment with the data and capabilities they already possess. Our holistic demand optimization solutions integrate advanced predictive technology with the qualitative insights derived from human knowledge and experience. Sentrana’