A Study on Effective Machine Learning Approaches Using Interpretability
- Title
- A Study on Effective Machine Learning Approaches Using Interpretability
- Other Titles
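- 해석가능성을 고려한 효과적인 기계학습 접근방법에 대한 연구 (Korean title; literally "A Study on Effective Machine Learning Approaches Considering Interpretability")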
- Author
- Kiburm Song
- Alternative Author(s)
- Kiburm Song (송기범)
- Advisor(s)
- Kichun Lee (이기천)
- Issue Date
- 2019-02
- Publisher
- Hanyang University (한양대학교)
- Degree
- Doctor
- Abstract
- Machine learning applications are now developed across many fields of
industry. The widely used neural network-based algorithms are known for
their black-box nature. In line with the growing emphasis on the right to
know, interpretable machine learning (IML), also called explainable
artificial intelligence (XAI), is attracting considerable attention in the
field of machine learning / artificial intelligence.
The two keywords, interpretability and explainability, are often used
interchangeably. Although there is no unified term, we regard
interpretability and explainability as having the same meaning and treat
the two terms as equivalent.
There are two main approaches in the field of interpretable ML.
Intrinsically interpretable models are machine learning models that are
regarded as interpretable because of their simple structure, such as
decision trees or linear models. Post hoc interpretability refers to the
application of model-agnostic interpretability methods after model
training. For instance, permutation feature importance is a post hoc
interpretability method.
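As an illustration of the post hoc idea, the following is a minimal sketch of permutation feature importance, not code from the thesis. The function name `permutation_importance`, the toy data, and `toy_model` are hypothetical; the sketch assumes a classifier exposed as a `predict` function and measures importance as the mean increase in classification error after shuffling one feature column.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in classification error when each feature is shuffled."""
    rng = np.random.default_rng(seed)
    base_error = np.mean(predict(X) != y)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        errors = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its link to the label
            # while keeping its marginal distribution.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            errors.append(np.mean(predict(X_perm) != y))
        importances[j] = np.mean(errors) - base_error
    return importances

# Toy data (assumption): the label depends only on feature 0; feature 1 is noise.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(500, 2))
y = (X[:, 0] > 0).astype(int)

def toy_model(X):
    # Stand-in for any trained black-box classifier.
    return (X[:, 0] > 0).astype(int)

scores = permutation_importance(toy_model, X, y)
print(scores)
```

Because the method only calls `predict`, it applies to any trained model, which is what makes it model-agnostic. In this toy setup the score for feature 0 is large and the score for feature 1 is zero, since the stand-in model never reads feature 1.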
In this study, we cover both intrinsic interpretability and model-agnostic
interpretability. Accordingly, we propose three perspectives on effective
machine learning approaches using interpretability:
∙ Development of an intrinsically interpretable model (PCAR)
∙ Development of a feature summary statistic and visualization (CR2D plot)
∙ Application of model-agnostic methods to a post hoc interpretable model
(SAX-TM and its statistical properties)
First, for the development of an intrinsically interpretable model, we
propose predictability-based collective class association rule mining
(PCAR). Because it is fundamentally a rule-based classifier, it makes the
rationale for classifying each record of the database explicit. Above all,
PCAR computes the predictability of each rule in the rule evaluation phase
and gives priority to rules with high predictability. We verified that PCAR
reduces classification error compared with other popular rule-based
algorithms.
Second, for the development of a feature summary statistic and
visualization, we propose an algorithm to validate the robustness of the
representative pattern. We created two feature summary statistics,
coverage rate (CR) and ranking distance (RD). We then plotted CR and RD
together to make meaningful use of these statistics and named the result
the CR2D plot. The CR2D plot could distinguish the normal majority of
streams from outliers. After off-line training, we can monitor whether a
new stream is an outlier in an on-line environment.
Lastly, in applying model-agnostic methods to a post hoc interpretable
model, we propose the symbolic aggregate approximation with transition
matrix (SAX-TM) algorithm. The original SAX cannot capture trend
information about the shape of a time series. SAX-TM represents trend
information by appending a transition matrix to basic SAX. In addition, we
identified intrinsic statistical properties of SAX-TM. Based on a
literature review, we selected information loss, KL-divergence, and
information embedding cost as important measures. SAX-TM remarkably
reduces classification error compared with SAX and shows clear gains in
accuracy even though it merely appends transition information to the
original SAX.
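The abstract does not spell out how SAX-TM is constructed, so the following is only a rough sketch of the general idea under stated assumptions: standard SAX (z-normalization, piecewise aggregate approximation, Gaussian breakpoints) followed by a row-normalized matrix of transitions between consecutive symbols. The function names `sax_symbols` and `transition_matrix` and all parameter values are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def sax_symbols(series, n_segments, alphabet_size):
    """Standard SAX: z-normalize, average over segments (PAA), discretize."""
    x = np.asarray(series, dtype=float)
    x = (x - x.mean()) / x.std()  # assumes the series is not constant
    paa = np.array([seg.mean() for seg in np.array_split(x, n_segments)])
    # Breakpoints that cut the standard normal into equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(k / alphabet_size)
                   for k in range(1, alphabet_size)]
    return np.searchsorted(breakpoints, paa)  # integers in 0..alphabet_size-1

def transition_matrix(symbols, alphabet_size):
    """Row-normalized counts of moves between consecutive symbols (assumed form)."""
    counts = np.zeros((alphabet_size, alphabet_size))
    for a, b in zip(symbols[:-1], symbols[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)

# A rising and a falling series use the same symbols, in opposite order.
rising = np.arange(16)
falling = rising[::-1]
sym_up = sax_symbols(rising, n_segments=4, alphabet_size=4)
sym_down = sax_symbols(falling, n_segments=4, alphabet_size=4)
tm_up = transition_matrix(sym_up, 4)
tm_down = transition_matrix(sym_down, 4)
print(sym_up, sym_down)
print(tm_up)
print(tm_down)
```

In this toy case the two series contain each symbol equally often, yet their transition matrices differ: the rising series only moves to the next higher symbol and the falling series only to the next lower one. This is one way transition information can encode trend direction.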
- URI
- https://repository.hanyang.ac.kr/handle/20.500.11754/99309
- http://hanyang.dcollection.net/common/orgView/200000434367
- Appears in Collections:
- GRADUATE SCHOOL[S](대학원) > INDUSTRIAL ENGINEERING(산업공학과) > Theses (Ph.D.)
- Files in This Item:
There are no files associated with this item.