By: Susheel

Susheel — Thu, 13 Nov 2008 07:12:44 +0000

There is a paper published on this:
http://www.esprockets.com/papers/adsorption-yt.pdf

By: Lawrence Miao

Lawrence Miao — Tue, 20 May 2008 14:55:10 +0000

I read a paper today that solves the same problem on a similar site. The model is a bipartite graph: one vertex set is the videos, the other is the keywords. The keywords are extracted with natural language processing tools from the user-provided tags and from the title. The paper co-clusters the graph using information-theoretic metrics. Once we have the co-clusters, ranking them gives the ‘hot’ topics, and traversing the bipartite graph (following its connectivity) gives the related videos.

Applying that technique here:
The tags of Material Girl are:
self-parody economic metaphor high camp golddigga bling ice cashmoney holla OG madonna material girl

The tags of Girls Just Want to Have Fun:
Girls Just Want to Have Fun Cyndi Lauper Music

The tags of Diamonds Are A Girl’s Best Friend:
marilyn monroe diamonds girl best friend gentlemen prefer blondes song songs movie movies

All of them contain ‘girl’, so all three are connected in the bipartite graph. Other techniques can then be used for ranking, and there are lots of ranking schemes available. For example, if YouTube's ranking uses data mining techniques such as frequent sequential patterns mined from users' click histories, there is a high probability that a user likes all three and watches them one after another.
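The connection through ‘girl’ can be sketched in a few lines of Python. This is only an illustration of the bipartite idea, not the method from the paper and not anything YouTube is known to run: the function names (`keywords`, `build_bipartite`, `related`) are made up for this example, and the keyword extraction is a crude stand-in for real NLP (lowercasing plus stripping a trailing "s", which is what lets the tag "Girls" match "girl").

```python
from collections import defaultdict

# Tags copied from the comment above, one string per video.
VIDEO_TAGS = {
    "Material Girl": "self-parody economic metaphor high camp golddigga "
                     "bling ice cashmoney holla OG madonna material girl",
    "Girls Just Want to Have Fun": "Girls Just Want to Have Fun Cyndi Lauper Music",
    "Diamonds Are A Girl's Best Friend": "marilyn monroe diamonds girl best friend "
                                         "gentlemen prefer blondes song songs movie movies",
}

def keywords(tag_string):
    """Hypothetical keyword extraction: lowercase, split on whitespace,
    and strip a trailing 's' from words longer than three letters."""
    result = set()
    for word in tag_string.lower().split():
        if len(word) > 3 and word.endswith("s"):
            word = word[:-1]
        result.add(word)
    return result

def build_bipartite(video_tags):
    """Return both sides of the graph: video -> keywords and keyword -> videos."""
    video_to_kw = {video: keywords(tags) for video, tags in video_tags.items()}
    kw_to_video = defaultdict(set)
    for video, kws in video_to_kw.items():
        for kw in kws:
            kw_to_video[kw].add(video)
    return video_to_kw, kw_to_video

def related(video, video_to_kw, kw_to_video):
    """Videos reachable in two hops (video -> keyword -> video),
    mapped to the keywords they share with the starting video."""
    shared = defaultdict(set)
    for kw in video_to_kw[video]:
        for other in kw_to_video[kw]:
            if other != video:
                shared[other].add(kw)
    return dict(shared)

video_to_kw, kw_to_video = build_bipartite(VIDEO_TAGS)
for other, kws in related("Material Girl", video_to_kw, kw_to_video).items():
    print(other, sorted(kws))
```

With these three tag lists, the only keyword linking "Material Girl" to the other two videos is "girl", so any ordering among the related videos would have to come from a separate ranking step.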

This is the first time I have commented here. Your site is very useful to me. Thanks for your work. : )

By the way, the paper is "Web video topic discovery and tracking via bipartite graph reinforcement model" by Lu Liu, Lifeng Sun, Yong Rui, Yao Shi, and Shiqiang Yang, which appeared in WWW 2008.

By: Curt Monash

Curt Monash — Tue, 20 May 2008 06:54:51 +0000

Better theory than any I came up with. Thanks!

CAM

By: rtl

rtl — Tue, 20 May 2008 02:50:44 +0000

Almost certainly normalized co-occurrences of favorites or views.

The tags are completely different, so it’s not shared tags. No freaking way it’s visual comparison. There’s no place to manually enter the “historical” connection, and that obviously wouldn’t scale. Mining the comment text is messy. One other option would be to show candidate videos randomly in that space and rank them by eCTR, but the candidate list would still have to be generated using one of these techniques.
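For what "normalized co-occurrences of favorites" might look like, here is a rough Python sketch. Everything in it is an assumption for illustration: the comment does not say which normalization is meant, so this uses a cosine-style one (pair count divided by the square root of the product of the two videos' favorite counts), and the user names, video ids, and the function `normalized_cooccurrence` are invented.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Made-up data: which videos each (hypothetical) user has favorited.
FAVORITES_BY_USER = {
    "u1": {"material_girl", "girls_fun", "diamonds"},
    "u2": {"material_girl", "girls_fun"},
    "u3": {"material_girl", "popular_clip"},
    "u4": {"popular_clip"},
}

def normalized_cooccurrence(favorites_by_user):
    """Score each video pair by how often the two are favorited together,
    divided by sqrt(count_a * count_b) so that widely favorited videos
    do not dominate (an assumed normalization, one of several possible)."""
    item_count = Counter()
    pair_count = Counter()
    for favorites in favorites_by_user.values():
        items = sorted(set(favorites))  # sorted so each pair has one canonical order
        item_count.update(items)
        pair_count.update(combinations(items, 2))
    return {
        (a, b): n / sqrt(item_count[a] * item_count[b])
        for (a, b), n in pair_count.items()
    }

scores = normalized_cooccurrence(FAVORITES_BY_USER)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

The same counting would work with views instead of favorites; only the input dictionary changes.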

Comments on: How is YouTube relating videos?
