MiRoR project highlight – Assisted authoring for avoiding inadequate claims in scientific reporting
by Anna Koroleva, MiRoR PhD Fellow at the Computer Science Laboratory for Mechanics and Engineering Sciences (LIMSI – CNRS, France)
My PhD project tackles the problem of misleading reporting of research results in scientific articles, primarily in the field of biomedicine. This phenomenon is known as “spin”. It consists of presenting the studied treatment as more effective and/or safer than the experiments actually showed. For example, an article’s conclusion may say that “The experimental treatment is well tolerated and efficacious”, while during the trial a high percentage of patients experienced negative side effects related to the treatment. Such an overly optimistic presentation of research results may have a highly negative impact on health care: it leads both physicians and patients to believe in the beneficial effect of the presented treatment, and thus encourages wider use of it than is reasonable when the effect has not actually been confirmed by the evidence.
The most widespread forms of spin involve reporting only part of the planned and obtained research results: focusing on secondary outcomes or patient subgroups for which the results are statistically significant, while omitting, or only briefly mentioning, those that did not reach significance. This means that the targeted results, for which the trial was designed and conducted, are simply left out of the report if they do not confirm a positive effect of the treatment (or even show negative effects).
Spin is a very common problem; authors may use it even unintentionally, for example because of a common desire to present “meaningful” and “important” results. Spin is often not recognized by peer reviewers and readers of scientific articles. There is therefore a need to help both authors and readers check whether the research results are adequately reported in a given article. My goal is to create an algorithm that detects spin automatically using relevant Natural Language Processing (NLP) techniques. Broadly speaking, NLP deals with creating computational models of natural language phenomena so that a computer can understand and analyze text, ideally the way people do, or at least as closely as possible. Automatic text processing saves the time and effort that would be required to process data manually. Even for tasks that cannot be completely automated at the moment, NLP can still provide valuable assistance, helping human readers analyze texts faster and more efficiently.
In my project, we regard spin as a linguistic phenomenon that can be modeled, and thus detected, on the basis of certain features of the text of scientific articles (the words and constructions used, similarity or dissimilarity between pieces of the text, semantic relationships between words and particular linguistic constructions, etc.). Spin is a complex phenomenon, and a spin-detecting algorithm should combine many NLP techniques, such as text classification, information extraction, paraphrase analysis, and semantic analysis. We do not aim to detect all existing types of spin in all types of scientific publications; we focus on a few types of spin that are most common in reports of randomized controlled trials. At one stage of my project I plan to assess the complexity of the spin detection task and the highest level of performance that our algorithm can achieve.
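To make the idea of "similarity or dissimilarity between pieces of the text" more concrete, here is a minimal sketch in Python. It is only an illustration, not the algorithm developed in the project: it assumes that the planned outcome and the conclusion sentences have already been extracted, and it uses simple word overlap (Jaccard similarity) as a stand-in for real paraphrase analysis. The function names, the stop-word list, and the threshold value are all hypothetical.

```python
import re

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "a", "an", "of", "in", "at", "and", "is",
              "was", "were", "for", "to", "with"}


def tokens(text):
    """Return the set of lowercase alphabetic words, minus stop words."""
    return {w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOP_WORDS}


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(outcome, sentences):
    """Return (score, sentence) for the sentence closest to the outcome."""
    target = tokens(outcome)
    scored = ((jaccard(target, tokens(s)), s) for s in sentences)
    return max(scored, default=(0.0, ""))


def outcome_missing(outcome, sentences, threshold=0.2):
    """Flag the outcome as possibly unreported in the given sentences.

    The threshold is an arbitrary assumption, not a validated value.
    """
    score, _ = best_match(outcome, sentences)
    return score < threshold


if __name__ == "__main__":
    planned = "overall survival at 12 months"  # invented example outcome
    conclusion = [
        "The experimental treatment is well tolerated and efficacious.",
        "Quality of life improved in the subgroup of younger patients.",
    ]
    print(outcome_missing(planned, conclusion))
```

In this invented example, no conclusion sentence shares a content word with the planned outcome, so the outcome is flagged as possibly unreported. Word overlap of this kind would miss paraphrases and synonyms, which is one reason a real spin detector needs the more advanced techniques listed above.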
Why am I interested in this project? I have a degree in theoretical and applied linguistics, and after graduating I worked in computational linguistics as part of a team developing a multi-task NLP system. My work gave me the opportunity to see practical applications of NLP to different tasks in various fields and to understand its importance. Medicine is one of the domains where NLP is most in demand: the amount of existing data is too large to be processed manually, yet information should not be missed because patients’ health is often at stake. My project gives me an exciting chance to take on the highly challenging task of developing a complex, multi-level NLP system that could contribute to a solution for automatic spin detection. It also offers me an opportunity to work in an international environment, collaborating with researchers from France, the Netherlands, and the UK who work in the fields of medicine, computer science, and linguistics. I hope that in the course of this project we will be able to create an effective approach to automatic spin detection that supports correct reporting of scientific research results.