
A Study on Effective Machine Learning Approaches Using Interpretability

Other Titles
해석가능성을 고려한 효과적인 기계학습 접근방법에 대한 연구
Kiburm Song
Machine learning applications are now developed in many fields of industry. The neural network-based algorithms in common use are known for their black-box nature. In line with the growing emphasis on the right to know, interpretable machine learning (IML), also called explainable artificial intelligence (XAI), is attracting attention as a major issue in machine learning and artificial intelligence. The two keywords, interpretability and explainability, are often used interchangeably. In the absence of a unified term, we regard interpretability and explainability as having the same meaning and treat the two terms as equivalent.

There are two main approaches in interpretable ML. Intrinsically interpretable models are machine learning models regarded as interpretable because of their simple structure, such as decision trees or linear models. Post hoc interpretability refers to applying model-agnostic interpretability methods after model training; permutation feature importance, for instance, is a post hoc interpretability method. In this study, we covered both intrinsic interpretability and model-agnostic interpretability. Accordingly, we proposed three perspectives on effective machine learning approaches using interpretability:

∙ Development of an intrinsically interpretable model (PCAR)
∙ Development of feature summary statistics and visualization (CR2D plot)
∙ Application of model-agnostic methods to a post hoc interpretable model (SAX-TM and its statistical properties)

First, for the development of an intrinsically interpretable model, we proposed predictability-based collective class association rule mining (PCAR). Because PCAR is fundamentally a rule-based classifier, it makes the rationale for classifying each database record explicit. Above all, PCAR extracts the predictability of each rule in the rule evaluation phase and gives preference to rules with high predictability.
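The idea of preferring rules with high predictability can be illustrated with a minimal sketch. The abstract does not define how PCAR computes predictability, so the score below is only a stand-in: the fraction of matching validation records that a rule labels correctly. The `Rule` type, the function names, and the toy data are all hypothetical and are not taken from the thesis.

```python
from collections import namedtuple

# Hypothetical rule type: if every attribute in `conditions` matches, predict `label`.
Rule = namedtuple("Rule", ["conditions", "label"])


def matches(rule, record):
    """True when the record satisfies every condition of the rule."""
    return all(record.get(attr) == val for attr, val in rule.conditions.items())


def predictability(rule, records, labels):
    """Stand-in score (an assumption, not the PCAR definition):
    share of matching validation records that the rule labels correctly."""
    hits = [lab == rule.label for rec, lab in zip(records, labels) if matches(rule, rec)]
    return sum(hits) / len(hits) if hits else 0.0


def rank_rules(rules, records, labels):
    """Order rules from highest to lowest score."""
    return sorted(rules, key=lambda r: predictability(r, records, labels), reverse=True)


def classify(ranked_rules, record, default=None):
    """Return the label of the first matching rule, plus the rule that fired."""
    for rule in ranked_rules:
        if matches(rule, record):
            return rule.label, rule
    return default, None


# Toy data, invented for illustration only.
rules = [
    Rule({"outlook": "sunny"}, "no"),
    Rule({"windy": False}, "yes"),
]
val_records = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "overcast", "windy": False},
]
val_labels = ["yes", "no", "yes", "yes"]

ranked = rank_rules(rules, val_records, val_labels)
label, fired = classify(ranked, {"outlook": "sunny", "windy": False})
```

Returning the rule that fired alongside the label is what makes the rationale for each record visible, which is the interpretability property the abstract attributes to rule-based classification.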
We verified that PCAR reduces classification error compared with other popular rule-based algorithms.

Second, for the development of feature summary statistics and visualization, we proposed an algorithm to validate the robustness of the representative pattern. We created two feature summary statistics, coverage rate (CR) and ranking distance (RD). We then plotted CR and RD together to make these statistics easier to use and named the result the CR2D plot. The CR2D plot can distinguish the normal majority of streams from outliers. After off-line training, we can monitor whether a new stream is an outlier in an on-line environment.

Lastly, for applying model-agnostic methods to a post hoc interpretable model, we proposed the symbolic aggregate approximation with transition matrix (SAX-TM) algorithm. The original SAX cannot capture trend information about the shape of a time series. SAX-TM represents trend information by appending a transition matrix to basic SAX. In addition, we identified intrinsic statistical properties of SAX-TM, selecting information loss, KL-divergence, and information embedding cost as the important measures on the basis of a literature review. SAX-TM markedly reduces classification error compared with SAX and shows clear gains in accuracy even though it merely appends transition information to the original SAX.
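Appending a transition matrix to SAX can be sketched as follows. This is a rough illustration under stated assumptions, not the thesis's implementation: it uses standard SAX (z-normalization, piecewise aggregate approximation over equal-width segments, Gaussian breakpoints) and assumes the transition matrix is a first-order, row-normalized count of moves between consecutive SAX symbols. The function names and the choice to flatten the matrix into a feature vector are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax_symbols(series, n_segments, alphabet_size):
    """Standard SAX: z-normalize, average over equal-width segments (PAA),
    then map each segment mean to a symbol index using Gaussian breakpoints.
    Assumes len(series) is divisible by n_segments."""
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    width = len(z) // n_segments
    paa = [mean(z[i * width:(i + 1) * width]) for i in range(n_segments)]
    breakpoints = [NormalDist().inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]
    return [bisect_right(breakpoints, value) for value in paa]


def transition_matrix(symbols, alphabet_size):
    """Row-normalized counts of transitions between consecutive symbols
    (an assumed construction; rows with no outgoing transitions stay zero)."""
    counts = [[0] * alphabet_size for _ in range(alphabet_size)]
    for current, following in zip(symbols, symbols[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def sax_tm_features(series, n_segments, alphabet_size):
    """SAX symbols followed by the flattened transition matrix."""
    symbols = sax_symbols(series, n_segments, alphabet_size)
    tm = transition_matrix(symbols, alphabet_size)
    return symbols + [p for row in tm for p in row]


# Toy series, invented for illustration only.
series = [0, 0, 1, 1, 2, 2, 3, 3]
symbols = sax_symbols(series, n_segments=4, alphabet_size=2)
tm = transition_matrix(symbols, alphabet_size=2)
features = sax_tm_features(series, n_segments=4, alphabet_size=2)
```

The resulting feature vector has `n_segments + alphabet_size ** 2` entries, so the added transition information grows with the alphabet size rather than with the length of the series.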