๋ณธ๋ฌธ ๋ฐ”๋กœ๊ฐ€๊ธฐ
HYU/๋ฐ์ดํ„ฐ์‚ฌ์ด์–ธ์Šค

11. Rule Based Classification

by Jaeguk 2024. 4. 14.

Rule์— ๊ธฐ๋ฐ˜ํ•œ ๋ถ„๋ฅ˜๊ธฐ

  • ๊ธฐ๋ณธ์ ์ธ ์•„์ด๋””์–ด๋Š” IF-THEN์„ ์‚ฌ์šฉํ•˜๋Š” ๊ฒƒ
    • Ex) IF age = youth AND student = false THEN buys_computer = no
  • ๋ฐ์ดํ„ฐ์˜ ์ˆ˜๊ฐ€ ๊ทธ๋ ‡๊ฒŒ ๋งŽ์ง€ ์•Š์€ ๊ฒฝ์šฐ์— ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋‹ค
    • ์ด Rule๋“ค์€ Domain Experts (Human Experts)์— ์˜ํ•ด์„œ ๋งŒ๋“ค์–ด์ง„๋‹ค
  • ํŠน์ • ๋ฐ์ดํ„ฐ๊ฐ€ ์—ฌ๋Ÿฌ ๊ฐœ์˜ Rule์— ๋ถ€ํ•ฉํ•˜๋Š” ๊ฒฝ์šฐ์—๋Š” Conflict Resolution์ด ํ•„์š”ํ•˜๋‹ค
    • Size Ordering
      • Rule์˜ Size๋ผ๋Š” ๊ฒƒ์€ IF๋ฌธ์— ๊ฑธ๋ ค์žˆ๋Š” Feature์˜ ์ˆ˜
      • ์ฆ‰, Size๊ฐ€ ํฌ๋‹ค๋Š” ๊ฒƒ์€ Rule์ด ๊ตฌ์ฒด์ ์ด๊ณ  Toughest ํ•˜๋‹ค๋Š” ๊ฒƒ์ด๋‹ค
      • Size๊ฐ€ ํฐ Rule์ผ์ˆ˜๋ก ๋” ๋†’์€ ์šฐ์„ ์ˆœ์œ„๋ฅผ ์ฃผ๋Š” ๋ฐฉ์‹
    • Class-based Ordering
      • Misclassification cost๊ฐ€ ๋‚ฎ์€ Rule์„ ์„ ํƒํ•˜๋Š” ๋ฐฉ์‹
    • Rule-based Ordering
      • Rule์— ๋Œ€ํ•œ ์šฐ์„ ์ˆœ์œ„๋ฅผ ๋ฏธ๋ฆฌ ์ •ํ•ด๋‘” List๋ฅผ ๊ฐ€์ง€๊ณ  ํŒ๋‹จํ•˜๋Š” ๋ฐฉ์‹
      • ๊ฐ Rule์— ๋Œ€ํ•œ ์šฐ์„ ์ˆœ์œ„ ์—ญ์‹œ Domain Experts๊ฐ€ ์ •ํ•œ๋‹ค

 

Rule Extraction from a Decision Tree


Decision Tree๊ฐ€ ์‚ฌ์‹ค Rule-based๋ผ๊ณ  ๋ณผ ์ˆ˜ ์žˆ๋‹ค

  • Deicision Tree์˜ ๊ฐ Path๋ฅผ Rule๋กœ ๋ณ€๊ฒฝ ํ•˜๋ฉด ๋œ๋‹ค
  • Rule์€ ๊ฑฐ๋Œ€ํ•œ ํŠธ๋ฆฌ๋ณด๋‹ค ์ดํ•ดํ•˜๊ธฐ๊ฐ€ ์‰ฝ๋‹ค
  • ๋ฃจํŠธ๋ถ€ํ„ฐ ๋ฆฌํ”„๋…ธ๋“œ๊นŒ์ง€์˜ Path ๊ฐ๊ฐ์„ ํ•˜๋‚˜์˜ Rule๋กœ ์ •์˜ํ•œ๋‹ค
    • Path ์ƒ์— ์žˆ๋Š” Feature-Value ์Œ์€ ๊ฐ๊ฐ Rule์—์„œ ํ•˜๋‚˜์˜ ์ ‘์†์‚ฌ๋ฅผ ํ˜•์„ฑํ•œ๋‹ค
    • ๋ฆฌํ”„๋…ธ๋“œ๋Š” ์ตœ์ข…์ ์ธ Class Prediction์„ ์˜๋ฏธํ•œ๋‹ค
    • Ex) age <= 30 AND
  • Decision Tree๋กœ๋ถ€ํ„ฐ Rule์„ ์ถ”์ถœํ•˜๋ฉด, Rule๊ฐ„์— Conflict๊ฐ€ ๋ฐœ์ƒํ•˜์ง€ ์•Š๋Š”๋‹ค (Mutually Exclusive)
    • Decision Tree์—์„œ ๋‚ด๋ ค๊ฐˆ ๋•Œ, ์—ฌ๋Ÿฌ ๊ฐœ์˜ Branch์— ๊ฑธ์น˜๋Š” ๊ฒฝ์šฐ๋Š” ์—†๊ธฐ ๋•Œ๋ฌธ ์—

 

Rule Extraction from Association Rule Mining


Generate rules based on Association Rule Mining

  • Also called Associative Classification
    • Rules are generated based on Frequent Pattern Mining
    • By tuning the min_sup and min_conf values, strong associations can be found between Feature-Values (the Condition) and the Class (the Prediction)
      • Associations with high Support and Confidence are strong associations
    • Since the rules are not Mutually Exclusive, Conflict Resolution is needed (see the sketch after this list)
  • Benefits and Limits
    • If min_conf is set high, fewer frequent patterns satisfy it, so fewer rules are generated and conflicts decrease
    • However, in that case coverage becomes low

 

Coverage and Accuracy of a Rule R


Rule R์˜ Coverage์™€ Accuracy

  • Cover๋ฅผ ํ•œ๋‹ค๊ณ ํ•ด์„œ ๋ฐ˜๋“œ์‹œ ์ •ํ™•ํžˆ ๋ถ„๋ฅ˜๋˜๋Š” ๊ฒƒ์€ ์•„๋‹ˆ๋‹ค
  • Coverage๋Š” ์ „์ฒด ๋ฐ์ดํ„ฐ ์ค‘์— ์–ผ๋งˆ๋‚˜ ๋งŽ์€ ๋ฐ์ดํ„ฐ๊ฐ€ R์— ์˜ํ•ด์„œ ํŒ๋ณ„์ด ๋  ์ˆ˜ ์žˆ๋Š”์ง€
  • Accuray๋Š” R์ด ์–ผ๋งˆ๋‚˜ ์ •ํ™•ํ•˜๊ฒŒ ๋ถ„๋ฅ˜๋ฅผ ํ•˜๋Š”์ง€

 

Lazy vs Eager Learning


Lazy Learning๊ณผ Eager Learning์˜ ์ฐจ์ด๋ฅผ ์•Œ์•„๋ณด์ž

  • Eager Learing
    • ์ฃผ์–ด์ง„ Training Data๋ฅผ ๊ฐ€์ง€๊ณ  ๋ฏธ๋ฆฌ Classification Model์„ ํ•™์Šต ์‹œ์ผœ๋‘”๋‹ค
    • ๊ทธ๋ฆฌ๊ณ  Test Data๊ฐ€ ๋“ค์–ด์˜ค๋ฉด, ํ•ด๋‹น ๋ชจ๋ธ์„ ํ†ตํ•ด์„œ ๋ถ„๋ฅ˜๋ฅผ ํ•œ๋‹ค
    • ๋ฏธ๋ฆฌ ๋ชจ๋ธ์„ ๋งŒ๋“ค์–ด๋‘๊ณ  ํ•ด๋‹น ๋ชจ๋ธ์„ ํ†ตํ•ด ์˜ˆ์ธก ์„ ํ•˜๊ธฐ ๋•Œ๋ฌธ์— Eagerํ•˜๋‹ค๊ณ  ํ•œ๋‹ค
    • Machine Learning์˜ ์ „ํ˜•์ ์ธ ๋ฐฉ์‹
    • Ex) Decision Tree
  • Lazy Learning
    • Test Data๊ฐ€ ๋“ค์–ด์˜ค๊ธฐ ์ „๊นŒ์ง€๋Š” ์•„๋ฌด๊ฒƒ๋„ ํ•˜์ง€ ์•Š๋Š”๋‹ค
    • Training Data๋ฅผ ๋‹จ์ง€ ์ €์žฅ๋งŒ ํ•˜๊ณ  ์žˆ๋Š”๋‹ค
    • Test Data๊ฐ€ ๋“ค์–ด์˜ค๋ฉด, Training Data๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ๋ถ„๋ฅ˜ํ•œ๋‹ค
    • Training์—๋Š” ์‹œ๊ฐ„์ด ๋ณ„๋กœ ์†Œ์š”๋˜์ง€ ์•Š์ง€๋งŒ, ์˜ˆ์ธก์„ ํ•˜๋Š”๋ฐ ๋งŽ์€ ์‹œ๊ฐ„ ์ด ๊ฑธ๋ฆฐ๋‹ค
    • Ex) KNN Algorithm

 

K-Nearest Neighbor(KNN) Algorithm


๋Œ€ํ‘œ์ ์ธ Lazy Learning ๋ฐฉ์‹์ธ KNN ์•Œ๊ณ ๋ฆฌ์ฆ˜์— ๋Œ€ํ•ด ์•Œ์•„๋ณด์ž

K=5์ธ ๊ฒฝ์šฐ์˜ KNN ์•Œ๊ณ ๋ฆฌ์ฆ˜

  • ์œ„์˜ ์˜ˆ์‹œ๋Š” K = 5์ธ ๊ฒฝ์šฐ์ด๋‹ค
  • ๋ชจ๋“  ๋ฐ์ดํ„ฐ๋Š” N์ฐจ์› ์ƒ์—์„œ ๋‚˜ํƒ€๋‚ด์ง„๋‹ค (N์€ Data์˜ Feature์˜ ์ˆ˜)
  • ๋‘ ๋ฐ์ดํ„ฐ ์‚ฌ์ด์˜ ๊ฑฐ๋ฆฌ(dist(X1, X2))๋Š” ๊ณต๊ฐ„์ƒ์—์„œ ์ •์˜๋œ๋‹ค
    • ์ด๋•Œ dist function์€ ์–ด๋–ค ๊ฒƒ์„ ์‚ฌ์šฉํ•ด๋„ ์ƒ๊ด€์—†๋‹ค
    • Ex) Euclidean, Manhattan, ...
  • Test Data์™€ ๊ฐ€์žฅ ๊ฐ€๊นŒ์šด K๊ฐœ์˜ ์ด์›ƒ๋“ค์˜ Class Label์„ ๋ณด๊ณ , ๋” ๋งŽ์€์ชฝ์œผ๋กœ ์˜ˆ์ธก (Voting)
    • Majority Voting (๋‹ค์ˆ˜๊ฒฐ)
    • ์œ„์˜ ์˜ˆ์‹œ์—์„œ๋Š” 5๊ฐœ์˜ ์ด์›ƒ๋“ค ์ค‘์— - ๊ฐ€ ๋” ๋งŽ๊ธฐ ๋•Œ๋ฌธ์— - ๋กœ ๋ถ„๋ฅ˜ํ•œ๋‹ค
  • ๊ทธ๋Ÿฐ๋ฐ K๊ฐœ์˜ ๋ฐ์ดํ„ฐ ์ค‘์—์„œ๋„ ๋” ๊ฐ€๊นŒ์šด ๋ฐ์ดํ„ฐ๋“ค์ด ์กด์žฌํ•˜์ง€ ์•Š๋Š๋ƒ! ๋ผ๋Š” ์ƒ๊ฐ์ด ๊ฐ€๋Šฅ
    • ๊ทธ๋ž˜์„œ ๋‚˜์˜จ ๋ฐฉ์‹์ด Weighted Voting
    • K๊ฐœ์˜ ์ด์›ƒ๋“ค์— ๋Œ€ํ•ด์„œ ๊ฐœ์ˆ˜๋งŒ ์„ธ๋Š” ๊ฒƒ์ด ์•„๋‹ˆ๋ผ ์ด์›ƒ๊ฐ„์˜ ๊ฑฐ๋ฆฌ๋„ ๊ณ ๋ คํ•ด์„œ ํŒ๋‹จ
