Consortium for Healthcare Informatics Research CHIR

United States

Luther S., Consortium for Healthcare Informatics Research CHIR | Berndt D., Consortium for Healthcare Informatics Research CHIR | Berndt D., University of South Florida | Finch D., Consortium for Healthcare Informatics Research CHIR | And 5 more authors.
Journal of Biomedical Informatics | Year: 2011

Statistical text mining (STM) was used to supplement efforts to develop a clinical vocabulary for post-traumatic stress disorder (PTSD) in the VA. Outpatient progress notes were collected for a cohort of 405 unique veterans with PTSD and a comparison group of 392 veterans with other psychological conditions at one VA hospital. Two methods were employed: (1) "multi-model term scoring", which used stepwise logistic regression to develop 21 separate models by varying three frequency weight options and seven term weight options, and (2) "iterative term refinement", which used a standard stop list followed by clinical review to eliminate non-clinical terms and terms not related to PTSD. Two clinicians reviewed the combined results of the two methods, yielding 226 unique PTSD-related terms. The STM results were compared with ongoing efforts to identify terms through literature review, focus groups with clinicians treating PTSD, and review of an existing vocabulary; the comparison lends support to the contributions of the STM analyses. © 2011.
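
The count of 21 models follows from crossing the three frequency weight options with the seven term weight options (3 x 7 = 21). The following is a minimal sketch of that grid, not the authors' implementation. The abstract does not name the options, so the three frequency weights shown (raw, binary, log) and the seven placeholder term weight names are assumptions chosen only for illustration.

```python
import math
from itertools import product

# ASSUMPTION: the abstract gives only the counts (3 and 7), not the option names.
# These three frequency weightings are common choices, used here for illustration.
FREQUENCY_WEIGHTS = {
    "raw": lambda tf: float(tf),                    # term count as-is
    "binary": lambda tf: 1.0 if tf > 0 else 0.0,    # presence or absence
    "log": lambda tf: math.log2(tf + 1),            # dampened count
}

# ASSUMPTION: hypothetical placeholder names for the seven term weight options.
TERM_WEIGHTS = [f"term_weight_{i}" for i in range(1, 8)]


def model_grid():
    """Return every (frequency weight, term weight) pairing, one per model."""
    return list(product(FREQUENCY_WEIGHTS, TERM_WEIGHTS))


def apply_frequency_weight(counts, name):
    """Transform raw term counts for one note with the named frequency weight."""
    weight = FREQUENCY_WEIGHTS[name]
    return {term: weight(tf) for term, tf in counts.items()}


if __name__ == "__main__":
    grid = model_grid()
    print(len(grid), "model configurations")
    # Toy term counts for a single note (made-up values, not study data).
    print(apply_frequency_weight({"nightmare": 3, "flashback": 0}, "binary"))
```

In a fuller pipeline, each configuration would produce its own weighted term-by-document matrix, and a separate stepwise logistic regression would be fit to each one; that modeling step is omitted here.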


