Matrix Factorization

Field
Recommendation System
Review date
2021/04/01
This post was originally written for the Humanscape tech blog and later moved here.
In this post I review the paper that introduced Matrix Factorization, a learned latent-factor approach to recommendation that Netflix has also used. The title of the paper is:
"Matrix Factorization Techniques For Recommender Systems"
If you would like to read the paper directly, please refer to the original paper.

Objective

The background of the paper starts with the Netflix Prize Competition.
The Netflix Prize Competition is a contest that Netflix launched in 2006. For it, Netflix released a dataset of about 100 million ratings (star ratings) given by roughly 500,000 users to about 17,000 movies.
What Netflix wanted in return for releasing so much data was a method that outperforms its own recommender system in terms of RMSE. It promised 1 million dollars for a method that improves RMSE by 10% or more and, as long as no such method existed, 50,000 dollars to the team with the largest improvement.
The paper I introduce here describes the method of the team that took first place in 2007 with an 8.43% improvement and again in 2008 with a 9.46% improvement.

Background

Traditional recommender systems fall broadly into two categories: Content Filtering and Collaborative Filtering.

Content Filtering

First, Content Filtering is the one usually introduced as the most basic recommender system. Its core idea is to build a profile for each product.
If that leaves you thinking "huh, what does that mean???", here is a bit more detail: you describe users and products by their characteristics, and then look for products whose characteristics the user is likely to enjoy.
In case that is still unclear, let me give an example. Feel free to skip it if you already get the idea :) Think of products as movies. A product's profile is then information about the movie, such as its genre, its cast, and its box office ranking.
Using these profiles, the system looks at the profiles of products the user preferred in the past and recommends products with similar profiles.
However, Content Filtering has one big problem: writing an explicit profile for every product is tedious and difficult. On top of that, it is unclear which profile attributes actually lead to satisfying recommendations.
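To make the idea of profiles concrete, here is a minimal sketch of content-based recommendation. The movie titles, the three genre features, and the `recommend` helper are all hypothetical, and cosine similarity is just one reasonable choice for comparing profiles, not something the paper prescribes.

```python
import numpy as np

# Hypothetical hand-written profiles: [thriller, romance, comedy].
profiles = {
    "Thriller A": np.array([1.0, 0.0, 0.0]),
    "Thriller B": np.array([0.9, 0.1, 0.0]),
    "Romance A": np.array([0.0, 1.0, 0.2]),
    "Comedy A": np.array([0.0, 0.2, 1.0]),
}


def cosine(a, b):
    """Cosine similarity between two profile vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def recommend(liked, profiles, top_n=1):
    """Rank unseen products by similarity to the mean profile of liked ones."""
    user_profile = np.mean([profiles[t] for t in liked], axis=0)
    candidates = [t for t in profiles if t not in liked]
    ranked = sorted(
        candidates,
        key=lambda t: cosine(user_profile, profiles[t]),
        reverse=True,
    )
    return ranked[:top_n]


content_pick = recommend(["Thriller A"], profiles)
```

Notice that every number in `profiles` had to be written by hand, which is exactly the burden described above.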

Collaborative Filtering

Collaborative Filtering emerged as an alternative to the Content Filtering problem described above. It does not require explicit product profiles. Instead, it analyzes the relationships between users and the interdependencies between users and products to uncover new user-product associations.
Once again, for those thinking "huh, what does that mean???": the system measures the similarity between users, and decides what to recommend based on how similar users rated a product that is new to the target user. If a similar user rated a product highly, the target user is likely to rate it highly as well, so the product gets recommended.
Here is another example, and please read this one even if you understood the explanation above!! Again, think of products as movies. user1 and user2 are close friends with very similar taste in movies. Specifically, when user1 liked thriller movie 1, user2 liked it too, and when user1 liked thriller movie 2, so did user2. They share more than 100 thrillers that they both liked. Then user1 suddenly rates romance movie 1 highly. The recommender system treats user1 and user2 as people with similar taste and recommends romance movie 1 to user2, who has not seen it yet.
The advantage of Collaborative Filtering that this example reveals is that it is domain-free. Domain-free means that the system does not need to know the detailed characteristics of a product (what I called the profile earlier). As a result, it can pick up on aspects of the data that are hard to profile and still make good recommendations.
I brought a romance movie into the example out of nowhere precisely to show that recommendation with Collaborative Filtering can handle data that is hard to profile. A product's profile would certainly include "genre", but Collaborative Filtering was able to recommend based on a fine-grained preference of the user that "genre" alone could not capture. In an extreme case, the trait that Collaborative Filtering picks up on could be something like "a story where the heroine tragically dies late in the movie and the awakened hero turns into a villain", a feature that is hard to even put into words.
So far we have looked at what Collaborative Filtering is and what advantages it has over Content Filtering. From here, to get closer to the method the paper proposes, I will go through the specific approaches within Collaborative Filtering.

Collaborative Filtering: Neighborhood Methods

The first approach within Collaborative Filtering, the neighborhood method, is not very different from the description of Collaborative Filtering above.
What I described earlier is the User-Oriented Neighborhood Method, which focuses on user similarity and recommends according to the preferences of similar users. The only thing to add is that there is also a Product-Oriented Neighborhood Method, which does the reverse: it finds products that users tend to rate similarly to a given product, and predicts a user's preference for that product from the same user's ratings of its neighboring products.
User-Oriented Neighborhood Method
The figure above illustrates the User-Oriented Neighborhood Method. For Joe, on the far left of the figure, the system finds three other users who like the same three movies that Joe likes, and recommends to Joe the other movies that those users also like.
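The Joe example can be sketched in a few lines. This is only an illustration under my own assumptions: the `likes` matrix is made up, "liking" is reduced to 0/1, and cosine similarity with the top `k` neighbors is one common choice rather than the paper's exact procedure.

```python
import numpy as np

# Hypothetical like matrix: rows are users, columns are movies (1 = liked).
likes = np.array([
    [1, 1, 1, 0, 0],  # Joe
    [1, 1, 1, 1, 0],  # user A: likes all three of Joe's movies, plus movie 3
    [1, 1, 0, 1, 0],  # user B: likes two of Joe's movies, plus movie 3
    [0, 0, 0, 0, 1],  # user C: no overlap with Joe
], dtype=float)


def user_based_scores(likes, target, k=2):
    """Score unseen movies for `target` from its k most similar users."""
    norms = np.linalg.norm(likes, axis=1)
    sims = likes @ likes[target] / (norms * norms[target])
    sims[target] = -np.inf                    # never pick the user themselves
    neighbours = np.argsort(sims)[::-1][:k]   # k most similar users
    scores = sims[neighbours] @ likes[neighbours]
    scores[likes[target] > 0] = -np.inf       # skip movies already liked
    return scores


scores = user_based_scores(likes, target=0)
neighbour_pick = int(np.argmax(scores))
```

Here users A and B end up as Joe's neighbors, and movie 3, which both of them like and Joe has not seen, gets the highest score.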

Collaborative Filtering: Latent Factor Models

The second approach within Collaborative Filtering, Latent Factor Models, extracts what I have been calling similarity as explicit latent factors for both users and products, and uses them to measure how close a particular user is to a particular product.
Latent Factor Models
Let me use the figure above to explain what a latent factor is.
The x axis can be read as a scale for whether a movie is aimed at men or at women, and the y axis as a scale for whether it is serious (realistic) or escapist (unrealistic). Movies are placed on the plane according to these two scales, and people are placed according to how much they like movies of each kind.
Naturally, the closer a movie is to a person, the more that person likes it, so they would rate it highly, right??
The remarkable part is that the degree of this preference can be quantified as the dot product of the movie's coordinates and the person's coordinates. For Gus, for example, the dot product with Dumb and Dumber is large, whereas the dot product with The Color Purple is small.
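As a quick worked example of the dot product, the snippet below uses coordinates I made up for Gus and the two movies; the figure in the paper does not give numeric values, so treat these purely as an illustration.

```python
import numpy as np

# Axis 0: female-oriented (-) to male-oriented (+).
# Axis 1: escapist (-) to serious (+).
# All coordinates are hypothetical.
gus = np.array([0.9, -0.8])
dumb_and_dumber = np.array([0.8, -0.9])
the_color_purple = np.array([-0.8, 0.9])

# 0.9 * 0.8 + (-0.8) * (-0.9): both terms are positive, about 1.44 in total.
affinity_dumb = float(gus @ dumb_and_dumber)
# 0.9 * (-0.8) + (-0.8) * 0.9: both terms are negative, about -1.44 in total.
affinity_purple = float(gus @ the_color_purple)
```

A person and a movie that point in the same direction on both axes get a large positive value, and opposite directions give a negative one.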
Still, you might have a question at this point.
"OK, maybe movies can be sorted along axes like that, but how on earth do you place people on them???"
If that crossed your mind, you are following along very well. The method that answers this question is the one the paper proposes: Matrix Factorization.

Matrix Factorization

Matrix Factorization is said to be among the most successful realizations of Latent Factor Models.
For something with that reputation, though, the concept is quite simple.
In a nutshell, it learns the latent factors of a Latent Factor Model from the data instead of defining them by hand.
That clears up the question right away, doesn't it??
Placing people along the axes is not our job. The model finds latent factors that we could not have named ourselves, based on the preferences people expressed explicitly or implicitly, and places them "on its own".
Recommendation then works just as it did for Latent Factor Models: the prediction is the dot product of the user's latent factor and the product's latent factor. As a formula:
$$\hat{r}_{ui}=q_i^Tp_u$$
Here \hat{r}_{ui} is the predicted rating, q_i is the latent factor vector of item i (what I have been calling a product so far), and p_u is the latent factor vector of user u.
If you have studied linear algebra, you might think: couldn't we work backwards from the ground-truth ratings to obtain q_i and p_u, using something like EVD or SVD? The problem is that there are many items but each user rates only a few of them, so the rating matrix is sparse, and recovering the latent factors through a direct decomposition is not feasible.
With the predicted rating in this form, we can write the SSE with a regularization term as below, and minimizing it becomes the training objective.
$$\min_{q^*,p^*}\sum_{(u,i)\in K}(r_{ui}-q_i^Tp_u)^2+\lambda(\|q_i\|^2+\|p_u\|^2)$$
Here K is the set of (u, i) pairs for which the rating r_ui is known.
For those not familiar with SSE: it stands for Sum of Squared Errors, the sum of the squares of the errors. And for those not familiar with regularization: it is one way to prevent overfitting. By adding the magnitude of the weights being learned to the loss, it keeps the model from fitting only the training data exactly and pushes it toward the general pattern. Lambda controls how strong this effect is.
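The objective above translates almost line by line into code. The sketch below is my own illustration: the `observed` triples and the factor sizes are made up, and the function simply evaluates the loss over the known ratings in K rather than training anything.

```python
import numpy as np


def regularized_sse(ratings, P, Q, lam):
    """Evaluate the objective over observed (u, i, r_ui) triples only."""
    total = 0.0
    for u, i, r in ratings:
        err = r - Q[i] @ P[u]                       # r_ui - q_i^T p_u
        total += err ** 2 + lam * (Q[i] @ Q[i] + P[u] @ P[u])
    return total


rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(3, 2))   # 3 hypothetical users, 2 factors
Q = rng.normal(scale=0.1, size=(4, 2))   # 4 hypothetical items, 2 factors
observed = [(0, 0, 5.0), (0, 1, 3.0), (1, 1, 4.0), (2, 3, 1.0)]

loss_plain = regularized_sse(observed, P, Q, lam=0.0)
loss_reg = regularized_sse(observed, P, Q, lam=0.1)
```

Only the four observed pairs contribute, which is how the sparsity of the rating matrix is handled: missing entries are simply never visited.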
That covers the general outline of Matrix Factorization. The paper also describes five ways to improve it, which I will introduce from here.

Improvement: Learning Algorithms

One optimization algorithm that is also widely used in deep learning is SGD (Stochastic Gradient Descent).
If you want to learn about SGD in detail, my earlier post on it should help. Here I will only point out that SGD performs one update per training example.
The process of updating q_i and p_u with SGD is as follows.
$$e_{ui}\overset{\mathrm{def}}{=}r_{ui}-q_i^Tp_u$$
The expression above defines the prediction error, and
$$\begin{aligned} q_i &\gets q_i+\gamma\cdot(e_{ui}\cdot p_u-\lambda\cdot q_i)\\ p_u &\gets p_u+\gamma\cdot(e_{ui}\cdot q_i-\lambda\cdot p_u) \end{aligned}$$
these are the update rules. The partial derivatives of the loss function with respect to q_i and p_u are minus two times the expressions in parentheses, so stepping against the gradient adds those expressions, with the factor of 2 absorbed into the learning rate gamma.
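Here is a minimal sketch of these update rules on a toy problem. The data, the factor dimension, and the values of `gamma` and `lam` are assumptions I picked for illustration, not settings from the paper.

```python
import numpy as np


def sgd_epoch(ratings, P, Q, gamma=0.01, lam=0.02):
    """One pass over the observed ratings, one update per training example."""
    for u, i, r in ratings:
        e = r - Q[i] @ P[u]                      # e_ui
        q_old = Q[i].copy()                      # keep q_i from before the step
        Q[i] += gamma * (e * P[u] - lam * Q[i])
        P[u] += gamma * (e * q_old - lam * P[u])


def sse(ratings, P, Q):
    """Sum of squared errors over the observed ratings."""
    return sum((r - Q[i] @ P[u]) ** 2 for u, i, r in ratings)


rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(3, 2))   # hypothetical user factors
Q = rng.normal(scale=0.1, size=(4, 2))   # hypothetical item factors
observed = [(0, 0, 5.0), (0, 1, 3.0), (1, 1, 4.0), (1, 2, 2.0), (2, 3, 1.0)]

sse_before = sse(observed, P, Q)
for _ in range(200):
    sgd_epoch(observed, P, Q)
sse_after = sse(observed, P, Q)
```

Copying `q_i` before the step makes both updates use the values from before the step, matching the pair of formulas above.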
But there is a clever alternative here.
If we fix either q_i or p_u, doesn't the loss function become quadratic in the other, so that we can compute the minimizer directly without gradient descent?
ALS (Alternating Least Squares), described in the paper, puts this idea into practice.
The essence of ALS is to fix one set of factors and update the other, alternating until the process is judged to have converged. As you may have sensed from the explanation, though, it is generally slower than SGD, which updates both at once.
Even so, ALS is said to be preferable in two cases.
1. When parallelization is available: ALS can compute each item's q_i independently of the other items, and each user's p_u independently of the other users, so the work can be done in parallel and finishes quickly.
2. When the recommender system relies mainly on implicit data: the training data can no longer be treated as sparse, so looping over every single training case with SGD becomes computationally expensive.
Considering ALS where appropriate is the first improvement the paper presents.
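The alternating scheme can be sketched as below. This is a small illustration under stated assumptions: the rating matrix `R` is made up, zeros mark missing ratings, and the regularizer is counted once per observed rating (as the sum in the objective is written), which gives the `lam * n` term in each least-squares solve.

```python
import numpy as np


def regularized_sse(R, M, P, Q, lam):
    """Objective over observed entries; M is 1 where a rating is known."""
    E = M * (R - P @ Q.T)
    n_u, n_i = M.sum(axis=1), M.sum(axis=0)      # ratings per user / item
    reg = (n_u * (P ** 2).sum(axis=1)).sum() + (n_i * (Q ** 2).sum(axis=1)).sum()
    return (E ** 2).sum() + lam * reg


def solve_rows(R, M, fixed, lam):
    """Closed-form least-squares solve for every row, with `fixed` held constant."""
    f = fixed.shape[1]
    out = np.zeros((R.shape[0], f))
    for r in range(R.shape[0]):                  # rows are independent of each other
        idx = M[r] > 0
        if not idx.any():
            continue                             # no ratings: leave the factor at zero
        F = fixed[idx]
        A = F.T @ F + lam * idx.sum() * np.eye(f)
        out[r] = np.linalg.solve(A, F.T @ R[r, idx])
    return out


# Hypothetical 4 users x 4 items rating matrix, 0 = not rated.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
M = (R > 0).astype(float)
rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(4, 2))
Q = rng.normal(scale=0.1, size=(4, 2))

losses = [regularized_sse(R, M, P, Q, 0.1)]
for _ in range(10):
    P = solve_rows(R, M, Q, 0.1)        # fix item factors, solve for users
    Q = solve_rows(R.T, M.T, P, 0.1)    # fix user factors, solve for items
    losses.append(regularized_sse(R, M, P, Q, 0.1))
```

Because each half-step solves its subproblem exactly, the recorded loss never increases, and since the loop inside `solve_rows` touches one row at a time, it is the natural place to parallelize.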

Improvement: Adding Biases

The loss function from the Matrix Factorization section was designed to learn from the interaction between users and items.
In other words, it does not account for the variation in rating values that comes from the user alone or from the item alone.
For example, user1 may simply be a more generous rater than other users, or item1 may be an item that generally gets better ratings than other items. The model so far ignores such effects, and this is what the improvement addresses.
To that end, the paper adds bias terms for individual users and individual items.
$$b_{ui}=\mu+b_i+b_u$$
As shown above, the bias for a rating is the sum of the item bias b_i, the user bias b_u, and mu, the overall average rating across all users and all items.
$$\hat{r}_{ui}=\mu+b_i+b_u+q_i^Tp_u$$
The paper then folds this bias into the predicted rating as shown above, and this is the second improvement.
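A small numeric sketch may help here. The numbers below are illustrative assumptions: an overall average of 3.7, an item rated 0.5 above average, and a critical user who rates 0.3 below average; the factor vectors are likewise made up.

```python
import numpy as np


def predict_with_bias(mu, b_i, b_u, q_i, p_u):
    """Predicted rating: mu + b_i + b_u + q_i^T p_u."""
    return mu + b_i + b_u + float(q_i @ p_u)


mu = 3.7        # overall average rating (illustrative)
b_item = 0.5    # this item is rated above average
b_user = -0.3   # this user rates more harshly than average

# With zero factor vectors, only the baseline remains: 3.7 + 0.5 - 0.3.
baseline = predict_with_bias(mu, b_item, b_user, np.zeros(2), np.zeros(2))

# Hypothetical factors add the user-item interaction: 0.5*0.4 + (-0.2)*0.1 = 0.18.
full = predict_with_bias(mu, b_item, b_user,
                         np.array([0.5, -0.2]), np.array([0.4, 0.1]))
```

The baseline already explains a rating of about 3.9 before any interaction is considered, so the factors only need to model what is left over.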

Improvement: Additional Input Sources

The user ratings discussed so far were limited to explicit ratings. But users also give the system implicit feedback, and the paper uses it to improve the model.
$$|N(u)|^{-0.5}\sum_{j\in N(u)}x_j$$
Let N(u) be the set of items for which user u expressed implicit feedback, and let x_j be a factor vector associated with item j that is learned along with the other parameters. The paper proposes reflecting this in training. To keep the amount of feedback from skewing the result, the sum is normalized as shown above before it enters the model.
$$\sum_{a\in A(u)} y_a$$
Similarly, let A(u) be the set of attributes of user u, such as gender, age group, zip code, and income level, and let y_a be a factor vector associated with attribute a. This sum is also reflected in training.
$$\hat{r}_{ui}=\mu+b_i+b_u+q_i^T\left[p_u+|N(u)|^{-0.5}\sum_{j\in N(u)}x_j+\sum_{a\in A(u)} y_a\right]$$
Changing the predicted rating into the form above is the third improvement the paper presents.
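The extended prediction can be sketched as follows. The matrices `X` and `Y`, the index lists, and all the numbers are hypothetical; in a real model they would be learned together with p_u and q_i.

```python
import numpy as np


def predict_extended(mu, b_i, b_u, q_i, p_u, X, N_u, Y, A_u):
    """Prediction with implicit-feedback and attribute factors.

    X[j] is the implicit factor of item j, N_u lists the items with implicit
    feedback from the user, Y[a] is the factor of attribute a, and A_u lists
    the user's attributes.
    """
    profile = p_u.copy()
    if len(N_u) > 0:
        profile += len(N_u) ** -0.5 * X[N_u].sum(axis=0)   # normalized sum
    if len(A_u) > 0:
        profile += Y[A_u].sum(axis=0)
    return mu + b_i + b_u + float(q_i @ profile)


# Hypothetical factors with 2 dimensions.
X = np.array([[0.2, 0.0], [0.0, 0.4], [0.1, 0.1]])
Y = np.array([[0.1, 0.0], [0.0, -0.1]])
q_i = np.array([1.0, 1.0])
p_u = np.array([0.5, 0.5])

pred_plain = predict_extended(3.5, 0.0, 0.0, q_i, p_u, X, [], Y, [])
pred_full = predict_extended(3.5, 0.0, 0.0, q_i, p_u, X, [0, 1], Y, [0, 1])
```

With empty `N_u` and `A_u` the function falls back to the biased prediction from the previous section, which is a handy sanity check.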

Improvement: Temporal Dynamics

The system described so far has been static. In other words, it was designed with components that do not depend on time.
In reality, however, a user's taste keeps changing as time passes, and an item's popularity, perception, and appeal also change dynamically over time.
For this reason, the paper argues that a recommender system needs to handle dynamic, time-drifting effects, and proposes an improvement for it.
Specifically, it proposes b_i, b_u, and p_u as the quantities that should vary as functions of time.
To give an example for each: for b_i, an item's bias can shift over time, for instance when one of its actors appears in another movie; for b_u, a user can become more generous or more strict on average as they get older; and for p_u, a user who used to like thrillers can shift toward romance movies.
$$\hat{r}_{ui}(t)=\mu+b_i(t)+b_u(t)+q_i^Tp_u(t)$$
Changing the predicted rating into the form above is the fourth improvement the paper presents.
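The post does not go into how the time-dependent terms are parameterized, so the sketch below uses drift models I made up purely for illustration: a time-binned item bias, a linearly drifting user bias, and user factors that slide from a thriller taste to a romance taste.

```python
import numpy as np


def predict_at(t, mu, b_i_t, b_u_t, q_i, p_u_t):
    """Prediction at time t: mu + b_i(t) + b_u(t) + q_i^T p_u(t)."""
    return mu + b_i_t(t) + b_u_t(t) + float(q_i @ p_u_t(t))


# Hypothetical item bias: one value per 10-day bin.
item_bias_bins = np.array([0.2, 0.5, 0.1])


def b_i_t(t):
    return float(item_bias_bins[min(int(t // 10), len(item_bias_bins) - 1)])


def b_u_t(t):
    return -0.3 + 0.01 * t          # hypothetical: the user slowly gets more generous


# Hypothetical taste drift; factor axes are [thriller, romance].
p_start, p_end = np.array([1.0, 0.0]), np.array([0.0, 1.0])


def p_u_t(t, horizon=30.0):
    w = min(max(t / horizon, 0.0), 1.0)
    return (1 - w) * p_start + w * p_end


q_romance = np.array([0.0, 1.0])    # a romance movie; q_i itself stays static
early = predict_at(0, 3.5, b_i_t, b_u_t, q_romance, p_u_t)
late = predict_at(30, 3.5, b_i_t, b_u_t, q_romance, p_u_t)
```

The same romance movie gets a much higher prediction at the later time, because both the user's bias and the user's factors have moved while the movie's factor has not.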

Improvement: Varying Confidence Levels

The last improvement in the paper starts from the observation that not all observed ratings deserve the same level of confidence.
For example, if an item gets good ratings because of heavy advertising, that should be seen as a temporary phenomenon with low confidence, and it is a stretch to treat it as reflecting a latent factor. Also, when a recommender system is built mainly on implicit feedback, there is a need to quantify how much confidence to place in each observation.
To address this, the paper attaches a value called the confidence score c_ui to each squared error, so that each observation contributes to the loss function according to its confidence.
$$\min_{p^*,q^*,b^*}\sum_{(u,i)\in K}c_{ui}(r_{ui}-\mu-b_u-b_i-p_u^Tq_i)^2+\lambda(\|q_i\|^2+\|p_u\|^2+b_u^2+b_i^2)$$
The paper presents this change to the loss function as its final improvement, and adds that the confidence score can be derived from numerical values that describe the frequency of an action, such as how long a user watched a show or how often they bought an item.
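To see what the confidence score does, here is a minimal sketch that evaluates the weighted loss. The data and the confidence values are hypothetical; the paper leaves the choice of c_ui to the application.

```python
import numpy as np


def confidence_weighted_loss(ratings, mu, b_u, b_i, P, Q, lam):
    """Objective with a confidence weight; ratings holds (u, i, r_ui, c_ui)."""
    total = 0.0
    for u, i, r, c in ratings:
        err = r - mu - b_u[u] - b_i[i] - P[u] @ Q[i]
        reg = Q[i] @ Q[i] + P[u] @ P[u] + b_u[u] ** 2 + b_i[i] ** 2
        total += c * err ** 2 + lam * reg
    return total


# One hypothetical user and item, all parameters at zero, so the error is 5 - 3 = 2.
P, Q = np.zeros((1, 2)), np.zeros((1, 2))
b_u, b_i = np.zeros(1), np.zeros(1)

loss_trusted = confidence_weighted_loss([(0, 0, 5.0, 1.0)], 3.0, b_u, b_i, P, Q, 0.1)
loss_doubted = confidence_weighted_loss([(0, 0, 5.0, 0.25)], 3.0, b_u, b_i, P, Q, 0.1)
```

The same error of 2 costs 4 at full confidence but only 1 at a confidence of 0.25, so a doubtful observation pulls on the parameters far less.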

Result

So far we have looked at Matrix Factorization and the ways to improve it.
As I mentioned at the very start of this post, the authors took part in the Netflix Prize Competition and took first place, and the paper describes the results.
The first results figure in the paper plots movies using two of the learned latent factors as the x and y axes.
Interpreting the axes after the fact gives quite interesting results.
Toward the left, the x axis corresponds to lowbrow comedies aimed at a male or adolescent audience, and toward the right, to dramas and comedies with a serious tone and women at the center.
The y axis corresponds, toward the top, to independent, quirky, critically acclaimed movies, and toward the bottom, to mainstream movies.
Accordingly, the top left holds movies that mix indie with lowbrow, and the bottom right holds serious, female-centered mainstream movies.
In addition, movies such as Annie Hall and Citizen Kane sit close together even though they look stylistically very different, which may seem odd. According to the paper, they are in fact separated by the third factor.
The second results figure shows the RMSE measured after applying the improvements to Matrix Factorization.
It shows that
1. increasing the number of parameters to learn,
2. accounting for biases,
3. accounting for implicit feedback, and
4. accounting for temporal dynamics
each actually reduced the RMSE.

Conclusion

That wraps up my brief summary of the paper "Matrix Factorization Techniques For Recommender Systems".
I only knew recommender systems at roughly the level of collaborative filtering, and although this paper is old, it was a great help in understanding how recommender systems have developed overall.
I originally set out to read a paper on YouTube's recommender system, but felt I lacked too much background knowledge, so I went looking for an earlier paper and read this one. I learned more than I expected and found it a satisfying read. When I get the chance, I will take another shot at conquering YouTube's recommender system. I recommend that you give it a try too, once you are comfortable with the content of this post.