Lecture 7 | Edge Detection + HoG

Type

Computer Vision

Lecture date

2022/09/26

Canny Edge Detection

Sobel Filtering

Non-Maximum Suppression

Tracking by Hysteresis

Final

•

Effect of σ

◦

A large σ detects large-scale edges.

◦

A small σ detects fine features.

•

Effect of threshold

◦

A high threshold detects only strong, unambiguous edges.

◦

A low threshold also detects less certain edges, including noise.
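The two effects above can be seen on a 1-D toy signal. This is only a sketch of the smoothing and thresholding stages, not a full Canny detector: the signal (a step edge plus alternating noise), the helper names `gaussian_kernel` and `edge_responses`, and the kernel radius of 3σ are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian, truncated at 3 sigma (an assumed cutoff)."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def edge_responses(signal, sigma, threshold):
    """Smooth, differentiate, and return indices where |gradient| > threshold."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(signal, r, mode="reflect")      # avoid artificial border edges
    smoothed = np.convolve(padded, k, mode="valid")  # same length as `signal`
    grad = np.diff(smoothed)                         # forward difference
    return np.flatnonzero(np.abs(grad) > threshold)

# Toy signal: one true step edge at index 32 plus alternating "noise".
idx = np.arange(64)
signal = (idx >= 32).astype(float) + 0.2 * (-1.0) ** idx

fine = edge_responses(signal, sigma=0.5, threshold=0.1)    # small sigma, low threshold
coarse = edge_responses(signal, sigma=2.0, threshold=0.1)  # large sigma, low threshold
strict = edge_responses(signal, sigma=0.5, threshold=0.5)  # small sigma, high threshold
print(len(fine), len(coarse), len(strict))
```

With the small σ and low threshold, the noise itself produces responses along the whole signal. Raising σ smooths the noise away and leaves a short run of responses around the true step, while raising the threshold keeps only the strongest response.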

Histogram of Oriented Gradients

•

To find the region of an image that matches a template, slide over the image and compute the correlation between the template and each patch.

\frac{u \cdot v}{\|u\|\|v\|}=\cos\theta

◦

Typically, an M × N region is reshaped into an MN × 1 vector before taking the dot product.

◦

However, this kind of matching only works when the shape, size, orientation, and viewpoint are exactly the same, whereas real scenes vary in rotation, scale, color, lighting, occlusion, and more.

◦

As a result, as in the example below, plain correlation does not guarantee a high score for semantically similar images.
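A minimal sketch of the sliding-window matching described above, using the normalized correlation from the formula. The function names `cosine_similarity` and `match_template`, the small `eps` guard, and the toy image are assumptions for illustration.

```python
import numpy as np

def cosine_similarity(u, v, eps=1e-12):
    """Normalized correlation (u . v) / (||u|| ||v||) of two flattened patches."""
    u, v = u.ravel(), v.ravel()  # M x N patch -> MN vector
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + eps))

def match_template(image, template):
    """Score every template-sized patch of `image` against `template`."""
    th, tw = template.shape
    rows = image.shape[0] - th + 1
    cols = image.shape[1] - tw + 1
    scores = np.zeros((rows, cols))
    for y in range(rows):
        for x in range(cols):
            patch = image[y:y + th, x:x + tw]
            scores[y, x] = cosine_similarity(patch, template)
    return scores

# Toy data: a cross-shaped template pasted into a random image at row 2, column 3.
rng = np.random.default_rng(0)
template = np.array([[0.0, 1.0, 0.0],
                     [1.0, 1.0, 1.0],
                     [0.0, 1.0, 0.0]])
image = rng.uniform(0.1, 1.0, size=(8, 8))
image[2:5, 3:6] = template

scores = match_template(image, template)
best = np.unravel_index(np.argmax(scores), scores.shape)
print(best, scores[best])
```

The best score lands on the pasted location only because the patch there is an exact copy of the template. Any rotation, rescaling, or occlusion of the patch would lower the score, which is the limitation noted above.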

•

Gradient information is largely invariant to factors such as illumination, so it is frequently used as a feature.

◦

As in the figure below, two images of a human figure have very similar histograms of gradient magnitude binned by direction.

◦

However, similar histograms over the global region do not guarantee similar spatial relationships → histograms over local regions are what matter.
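The global-versus-local point can be checked on two made-up gradient fields. In this sketch the fields, the helper `orientation_histogram`, and the choice of 4 bins over [0, π) are assumptions. Field A has horizontal gradients on the left and vertical gradients on the right; field B has the two halves swapped.

```python
import numpy as np

def orientation_histogram(magnitude, orientation, n_bins=4):
    """Sum gradient magnitude into n_bins equal orientation ranges over [0, pi)."""
    bins = np.minimum((orientation / np.pi * n_bins).astype(int), n_bins - 1)
    return np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)

h, w = 4, 8
magnitude = np.ones((h, w))

ori_a = np.zeros((h, w))
ori_a[:, w // 2:] = np.pi / 2      # A: left half 0, right half pi/2
ori_b = ori_a[:, ::-1].copy()      # B: same values, halves swapped

global_a = orientation_histogram(magnitude, ori_a)
global_b = orientation_histogram(magnitude, ori_b)

left = slice(0, w // 2)
left_a = orientation_histogram(magnitude[:, left], ori_a[:, left])
left_b = orientation_histogram(magnitude[:, left], ori_b[:, left])

print(global_a, global_b)  # identical global histograms
print(left_a, left_b)      # different histograms for the left half
```

The global histograms agree even though the spatial layouts differ, while the per-region histograms tell the two fields apart.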

•

Steps for building a Histogram of Oriented Gradients

Compute the gradients in the x and y directions with a Sobel filter.

From the x- and y-direction gradients, compute the gradient magnitude and the gradient orientation.
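A sketch of the two gradient steps above in NumPy. The helper names `filter2d` and `gradients`, the edge-replicated border handling, and the use of unsigned orientation in [0, π) are assumptions; the notes do not specify them.

```python
import numpy as np

SOBEL_X = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])
SOBEL_Y = SOBEL_X.T

def filter2d(image, kernel):
    """Cross-correlate `image` with `kernel`, replicating border pixels."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros_like(image, dtype=float)
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + image.shape[0],
                                           dx:dx + image.shape[1]]
    return out

def gradients(image):
    """Return (magnitude, orientation) with orientation folded into [0, pi)."""
    gx = filter2d(image, SOBEL_X)
    gy = filter2d(image, SOBEL_Y)
    magnitude = np.hypot(gx, gy)
    orientation = np.arctan2(gy, gx) % np.pi  # unsigned orientation (assumption)
    return magnitude, orientation

# Toy image: dark on the left, bright on the right (a vertical edge).
image = np.zeros((5, 6))
image[:, 3:] = 1.0
magnitude, orientation = gradients(image)
print(magnitude[0])
```

For this vertical edge the response is confined to the two columns next to the intensity jump, and the gradient there points along x.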

Perform orientation binning.

Divide the image into cells of size cell_size × cell_size and discard the leftover region → M × N cells

Within each cell, split the gradient angles into # of bins ranges and build a histogram whose values are the sums of the gradient magnitudes in each range; store these # of bins values per cell along an additional dimension → M × N × # of bins
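Orientation binning as described above might look like the following. This sketch assumes unsigned orientations in [0, π), equal-width bins, and hard assignment of each pixel to a single bin; the function name `cell_histograms` and the default values are also illustrative.

```python
import numpy as np

def cell_histograms(magnitude, orientation, cell_size=8, n_bins=6):
    """Return an (M, N, n_bins) array of magnitude-weighted orientation histograms."""
    m = magnitude.shape[0] // cell_size
    n = magnitude.shape[1] // cell_size
    # Discard the leftover rows and columns that do not fill a whole cell.
    mag = magnitude[:m * cell_size, :n * cell_size]
    ori = orientation[:m * cell_size, :n * cell_size]
    # Map each orientation in [0, pi) to a bin index in 0..n_bins-1.
    bins = np.minimum((ori / np.pi * n_bins).astype(int), n_bins - 1)

    hist = np.zeros((m, n, n_bins))
    for i in range(m):
        for j in range(n):
            rows = slice(i * cell_size, (i + 1) * cell_size)
            cols = slice(j * cell_size, (j + 1) * cell_size)
            hist[i, j] = np.bincount(bins[rows, cols].ravel(),
                                     weights=mag[rows, cols].ravel(),
                                     minlength=n_bins)
    return hist

# Toy input: unit magnitude everywhere, all gradients at orientation pi/2.
magnitude = np.ones((17, 9))
orientation = np.full((17, 9), np.pi / 2)
hist = cell_histograms(magnitude, orientation, cell_size=4, n_bins=6)
print(hist.shape)
```

A 17 × 9 input with cell_size 4 gives 4 × 2 cells, and the last row and column of pixels are dropped.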

Perform block normalization. (The scale of each histogram can vary with contrast, so the histograms are normalized.)

Slide a block_size × block_size window over the oriented histogram obtained from the orientation binning step, and concatenate the bin vectors inside each window into one long vector.

Normalize each resulting vector of length # of bins × block_size² with the L2 norm → (M − block_size + 1) × (N − block_size + 1) × (# of bins × block_size²)

Finally, reshape the resulting matrix into a long one-dimensional K × 1 vector; this vector is the HoG descriptor.
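Block normalization and the final flattening can be sketched as follows. The function name `block_normalize` and the small constant `eps` (added inside the norm to avoid division by zero) are assumptions not stated in the notes.

```python
import numpy as np

def block_normalize(hist, block_size=2, eps=1e-3):
    """L2-normalize overlapping blocks of an (M, N, n_bins) cell histogram."""
    m, n, n_bins = hist.shape
    out = np.zeros((m - block_size + 1,
                    n - block_size + 1,
                    n_bins * block_size ** 2))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Concatenate the bin vectors of all cells in the block.
            v = hist[i:i + block_size, j:j + block_size].ravel()
            out[i, j] = v / np.sqrt(np.sum(v ** 2) + eps ** 2)
    return out

# Toy cell histogram with M = 4, N = 3, and 6 bins.
hist = np.ones((4, 3, 6))
normalized = block_normalize(hist, block_size=2)
hog = normalized.reshape(-1)  # final K x 1 HoG descriptor
print(normalized.shape, hog.shape)
```

Here each block vector has 6 × 2² = 24 entries, so the descriptor length is K = 3 × 2 × 24 = 144.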

Face Detection

First, compute the HoG of a face-shaped template.

Slide over the image, crop each region, compute its HoG, and measure its similarity to the template HoG; if the similarity exceeds a preset threshold ε, mark the region as a face candidate.

\frac{u\cdot v}{\|u\| \|v \|} > \epsilon

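Sort the face candidates by similarity in descending order, pick the highest-scoring candidate, and delete every other candidate whose IoU with it is greater than 0.5; repeat until every candidate has been either selected or deleted. (Non-Maximum Suppression)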
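The suppression loop above, as a short sketch. Boxes are assumed to be `(x1, y1, x2, y2)` tuples, and the function names and the toy boxes are illustrative.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def non_maximum_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)  # highest remaining similarity
        keep.append(best)
        # Delete candidates that overlap the selected box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = non_maximum_suppression(boxes, scores)
print(kept)
```

The second box overlaps the first with an IoU of 81/119, so it is deleted, while the third box does not overlap either and is kept.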

Object Recognition with HoG

•

HoG features can be separated into positive and negative samples using an SVM.

◦

An SVM is useful when a nearest-neighbor classifier is not sufficient, and it is most advantageous when the samples are linearly separable.
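For intuition, a linear SVM can be trained with plain subgradient descent on the soft-margin objective. This is a sketch, not the solver used in the lecture: the function names, hyperparameters, and the 2-D toy vectors standing in for HoG descriptors are all assumptions.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Full-batch subgradient descent on
    lam/2 * ||w||^2 + mean(max(0, 1 - y * (X @ w + b)))."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1  # samples that violate the margin
        grad_w = lam * w - (y[active, None] * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    """Label +1 on one side of the decision boundary, -1 on the other."""
    return np.where(X @ w + b >= 0, 1, -1)

# Toy, linearly separable "features": positives and their mirrored negatives.
X = np.array([[2.0, 2.0], [3.0, 3.0], [2.0, 3.0], [3.0, 2.0],
              [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0], [-3.0, -2.0]])
y = np.array([1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0])

w, b = train_linear_svm(X, y)
print(w, b, predict(X, w, b))
```

In practice each row of `X` would be the flattened HoG descriptor of a positive or negative training window.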

HoG Extension: Deformable Part Model (DPM)

•

Use templates that can recognize each individual body part to extract HoG features, and for each body part use an SVM to find a decision boundary that separates that part from everything else.

•

Use the learned decision boundaries to decide whether each region of the image is one of the body parts.

•

(Flexibility in how the parts connect to one another?)