Objective: To investigate a construction strategy for a knowledge base of health technology assessment (HTA) indicators based on a multi-granularity knowledge representation model, in order to meet users' diverse demands for HTA knowledge services. Methods: First, we constructed a multi-granularity HTA indicator knowledge representation model by systematically analyzing the content and structure of HTA indicator systems in the literature. Second, we extracted multi-granularity HTA indicator knowledge from the literature and conducted subject indexing in a human-computer collaborative way. Finally, based on HTA knowledge service requirements, we designed and developed a prototype HTA indicator knowledge base, HTA Indicators. Results: A multi-granularity HTA indicator knowledge representation model was constructed, covering 5 core knowledge units (indicator systems, indicator items, formulas, measurement variables, and subjects), 20 types of attributes, and 12 types of relationships. The model represents the intrinsic characteristics of, and connections among, multi-granularity indicator knowledge units. Knowledge extraction and subject indexing of multi-granularity HTA indicators were conducted on 227 HTA indicator documents to form instance data. Finally, a prototype HTA indicator knowledge base, named HTA Indicators, was developed. HTA Indicators provides services such as multi-granularity HTA indicator knowledge retrieval, navigation, and linking. Conclusion: The construction strategy for an HTA indicator knowledge base based on the multi-granularity knowledge representation model is feasible. The indicator knowledge base can achieve multi-dimensional semantic organization of indicator knowledge, provide multi-level and multi-dimensional indicator knowledge retrieval and discovery services, and meet users' demand for precise HTA knowledge.
In the future, we will explore cutting-edge technologies such as large language models to automate the construction of large-scale HTA knowledge, thereby improving the efficiency and intelligence of knowledge base construction.
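The multi-granularity model described above can be pictured as typed knowledge units connected by named relationships. The following is a minimal sketch only: the five unit types come from the abstract, but the class names, the attribute key (`abbreviation`), the relationship names (`has_item`, `computed_by`, `uses_variable`, `indexed_under`), and the ICER example are all hypothetical illustrations, since the abstract does not enumerate the 20 attribute types or 12 relationship types.

```python
from dataclasses import dataclass, field
from enum import Enum


class UnitType(Enum):
    """The five core knowledge unit types named in the abstract."""
    INDICATOR_SYSTEM = "indicator system"
    INDICATOR_ITEM = "indicator item"
    FORMULA = "formula"
    MEASUREMENT_VARIABLE = "measurement variable"
    SUBJECT = "subject"


@dataclass
class KnowledgeUnit:
    uid: str
    unit_type: UnitType
    label: str
    # Attribute names are hypothetical; the source does not list them.
    attributes: dict = field(default_factory=dict)


@dataclass
class IndicatorKnowledgeBase:
    units: dict = field(default_factory=dict)
    # Each relation is a (source_uid, relation_name, target_uid) triple.
    relations: list = field(default_factory=list)

    def add_unit(self, unit: KnowledgeUnit) -> None:
        self.units[unit.uid] = unit

    def relate(self, source_uid: str, relation: str, target_uid: str) -> None:
        # Refuse links to units that have not been registered.
        for uid in (source_uid, target_uid):
            if uid not in self.units:
                raise KeyError(f"unknown unit: {uid}")
        self.relations.append((source_uid, relation, target_uid))

    def linked(self, uid: str, relation: str = None) -> list:
        """Units reachable from `uid`, optionally filtered by relation name."""
        return [
            self.units[target]
            for source, rel, target in self.relations
            if source == uid and relation in (None, rel)
        ]


def build_example() -> IndicatorKnowledgeBase:
    """Build a tiny illustrative instance (not data from the study)."""
    kb = IndicatorKnowledgeBase()
    kb.add_unit(KnowledgeUnit("sys1", UnitType.INDICATOR_SYSTEM, "Example HTA indicator system"))
    kb.add_unit(KnowledgeUnit("item1", UnitType.INDICATOR_ITEM,
                              "Incremental cost-effectiveness ratio",
                              {"abbreviation": "ICER"}))
    kb.add_unit(KnowledgeUnit("f1", UnitType.FORMULA, "ICER = delta_cost / delta_effect"))
    kb.add_unit(KnowledgeUnit("v1", UnitType.MEASUREMENT_VARIABLE, "delta_cost"))
    kb.add_unit(KnowledgeUnit("v2", UnitType.MEASUREMENT_VARIABLE, "delta_effect"))
    kb.add_unit(KnowledgeUnit("subj1", UnitType.SUBJECT, "Cost-effectiveness"))
    kb.relate("sys1", "has_item", "item1")
    kb.relate("item1", "computed_by", "f1")
    kb.relate("f1", "uses_variable", "v1")
    kb.relate("f1", "uses_variable", "v2")
    kb.relate("item1", "indexed_under", "subj1")
    return kb
```

Calling `build_example().linked("f1", "uses_variable")` returns the two measurement variables attached to the formula, which is the kind of navigation from a coarse unit to finer-grained ones that the abstract describes. Storing relations as plain triples keeps the sketch short; a production knowledge base would more likely use a graph database or an RDF store.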
Objective: To summarize and explore the application of machine learning models to survival data with non-proportional hazards (NPH), and to provide a methodological reference for large-scale, high-dimensional survival data. Methods: First, the concept of NPH and related testing methods were outlined. Then, the advantages and disadvantages of machine learning-based NPH survival analysis methods were summarized from the relevant literature. Finally, using real-world clinical data, a case study applied two ensemble machine learning models and two deep learning models to survival data with NPH, examining the risk of death within 30 days among stroke patients in the ICU. Results: Eight commonly used machine learning-based NPH survival analysis methods were identified, including five traditional machine learning models (e.g., random survival forest) and three deep learning models based on artificial neural networks (e.g., DeepHit). In the case study, the random survival forest model performed best (C-index = 0.773, integrated Brier score [IBS] = 0.151), and the permutation importance algorithm identified age as the most important feature affecting the risk of death in stroke patients. Conclusion: Big survival data exhibiting NPH are common in the era of precision medicine, and machine learning-based survival analysis can be used when the data are more complex and the analytical requirements are more demanding.
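To make the reported C-index concrete, here is a minimal sketch of Harrell's concordance index for right-censored data. It is an illustration under stated assumptions, not the study's implementation: the abstract does not say which C-index variant was used, and under NPH a single risk score may not rank patients consistently over time, so time-dependent variants are often preferred. The function name and the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times       -- observed follow-up time per subject
    events      -- 1 if death was observed, 0 if censored
    risk_scores -- model output, higher means higher predicted risk

    A pair (i, j) is comparable when subject i had an observed event
    strictly before subject j's follow-up time. Pairs with tied times
    are skipped in this simplified version.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if i != j and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier death had the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (not from the study): four patients, the third is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.4, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.773 indicates that the random survival forest ordered roughly three out of four comparable patient pairs correctly. The double loop is quadratic in the number of subjects; library implementations are preferable for large cohorts.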