K-medoids clustering (R PAM)

2019. 9. 30. 21:37

BioinformaticsAndMe

K-medoids clustering (PAM; Partitioning Around Medoids)

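: K-means clustering, introduced in an earlier post, relies on cluster means and is therefore sensitive to outliers

: PAM (Partitioning Around Medoids, 1987) selects a representative data point (medoid) for each cluster instead of a mean, and swaps it for another point whenever that yields a better clustering

: PAM is a more robust method than K-means

: PAM works well on small datasets but does not scale to large ones (non-scalable), because its computational cost grows quickly with the number of observations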
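
As a small illustration of why a medoid is less sensitive to outliers than a mean, the following sketch compares the two on a toy vector. The vector x and the variable names are made up for this example; it only shows the idea behind a medoid and is not how pam works internally.

```r
# Toy data (hypothetical): four nearby values and one outlier
x <- c(1, 2, 3, 4, 100)

# Pairwise absolute distances between all points
d <- as.matrix(dist(x))

# The medoid is the observed point with the smallest total distance to all others
medoid <- x[which.min(rowSums(d))]

mean(x)   # 22, pulled toward the outlier
medoid    # 3, an actual data point near the bulk of the data
```

The mean is dragged toward 100, while the medoid stays at one of the observed values in the dense region.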

K-medoids clustering procedure

1) Install the package

install.packages("cluster")
library(cluster)

2) Apply the pam function

pam.result <- pam(iris, 3)

3) Check the cluster assignments with the table function

table(pam.result$clustering, iris$Species)

    setosa versicolor virginica
  1     50          0         0
  2      0          3        49
  3      0         47         1
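
Note that pam(iris, 3) above receives the whole data frame, including the Species column. If the goal is to cluster on the four measurements alone, one option is to drop that column first, as sketched below. The names iris.num and pam.num are illustrative and not from the original post, and the resulting table may differ from the one shown above.

```r
library(cluster)

# Keep only the four numeric measurement columns (drops Species)
iris.num <- iris[, 1:4]

# Partition into 3 clusters around medoids
pam.num <- pam(iris.num, 3)

# Medoids are actual observations; id.med holds their row indices
pam.num$medoids
pam.num$id.med

# Compare cluster assignments with the known species
table(pam.num$clustering, iris$Species)
```

Because each medoid is a real row of the data, pam.num$id.med can be used to look up the representative flower of each cluster directly in iris.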

4) Display both clustering plots on one screen

par(mfrow=c(1,2))
plot(pam.result) 
par(mfrow=c(1,1))

clusterplot

: The iris data are divided into 3 clusters

: The pink lines show the distances between clusters

Silhouette plot of pam

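: Silhouette of the iris data

: Cluster 1 contains 50 observations (0.80 is its average silhouette width, a measure of how well its members are clustered)

#The closer the value is to 1.00, the better the observations fit their cluster

#A negative value suggests the observation was assigned to the wrong cluster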
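
The numbers shown in the silhouette plot can also be read directly from the fitted object. A minimal sketch, assuming pam.result is created as in step 2; the variable sil is an illustrative name. For observation i, the silhouette width is s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the average distance to the members of its own cluster and b(i) is the average distance to the nearest other cluster.

```r
library(cluster)
pam.result <- pam(iris, 3)   # same call as in step 2

# Silhouette information stored by pam
sil <- pam.result$silinfo

sil$clus.avg.widths   # average silhouette width per cluster
sil$avg.width         # average silhouette width over all observations

# Number of observations with a negative silhouette width
sum(sil$widths[, "sil_width"] < 0)
```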
