Bootstrapping a recommendation product
Since February, I've been working on recommendations for OpenGraph entities. Specifically, the vertical I focus on is bootstrapping the recommendation product for Facebook Events (news here). Each of our backend ranking engineers owns one vertical: Restaurants, Movies, Events, Places, or Books. It's quite fun to work alone on the Events vertical, and I had a lot of autonomy to try different ML techniques. Although I worked day and night over the past couple of months, it was a really rewarding experience to see this product ship! More importantly, in the past I had gained a lot of knowledge about ML algorithms. This time, I learned something beyond algorithms and systems: Product :)
To build a recommendation model, you need to start with a hand-tuned model, both to accumulate training data and to make sure the initial recommendations are of decent quality. (Because it's a recommendation product, we couldn't just suggest random Events to users.)
We cold-started with a linear scoring function with preliminarily tuned weights over a first set of selected features, such as overall popularity, a similarity score against the user's previously attended events, and popularity among friends.
This gave us decent recommendation quality and let us gather initial training data.
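To make the cold-start idea concrete, here is a minimal sketch of a hand-tuned linear scorer. The feature names, the weight values, and the helpers `score_event` and `rank_events` are all made up for illustration; the real features were normalized and weighted differently.

```python
from typing import Dict, List

# Hypothetical hand-tuned weights; the actual values are not in this post.
WEIGHTS: Dict[str, float] = {
    "overall_popularity": 0.3,
    "similarity_to_attended": 0.5,
    "friend_popularity": 0.2,
}

def score_event(features: Dict[str, float]) -> float:
    """Linear score: weighted sum of features, missing features count as 0."""
    return sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())

def rank_events(candidates: Dict[str, Dict[str, float]], k: int = 10) -> List[str]:
    """Return the ids of the top-k candidate events by linear score."""
    ordered = sorted(candidates, key=lambda eid: score_event(candidates[eid]), reverse=True)
    return ordered[:k]
```

Something this simple is easy to tune by hand, and every impression and click it produces can be logged as training data for the learned model that replaces it.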
As soon as we had collected training data, we started to notice that the model was biased towards whatever we had favored in the initial cold start. The positive labels came only from events the hand-tuned scorer had surfaced, so they always carried its skewed feature distribution.
We tried a couple of solutions, but the most effective was to allocate a small portion of the FB Events feed to randomly picked Events. This gave us more signals.
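The random allocation can be sketched roughly like this. The function name `mix_in_exploration`, the 10% default, and the choice to swap random slots are my assumptions for illustration, not the production logic.

```python
import random
from typing import List, Optional, Set, Tuple

def mix_in_exploration(
    ranked: List[str],
    pool: List[str],
    explore_fraction: float = 0.1,  # hypothetical value
    rng: Optional[random.Random] = None,
) -> Tuple[List[str], Set[str]]:
    """Replace a small share of ranked slots with random events from the pool.

    Returns the mixed feed and the set of exploration events, so that their
    impressions can be logged separately as less biased training samples.
    """
    rng = rng or random.Random()
    feed = list(ranked)
    in_feed = set(feed)
    unseen = [e for e in pool if e not in in_feed]
    n_explore = max(1, int(len(feed) * explore_fraction)) if feed else 0
    n_explore = min(n_explore, len(unseen))
    picks = rng.sample(unseen, n_explore)
    slots = rng.sample(range(len(feed)), n_explore)
    for slot, event in zip(slots, picks):
        feed[slot] = event
    return feed, set(picks)
```

Keeping the fraction small protects the overall feed quality, while the randomly picked events supply labels for parts of the feature space the model would otherwise never show.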
Internal dogfooding is our major feedback channel. Very often, we found that explainability is a major part of suggesting events to users. Users would ask: "Why are you suggesting this event to me? I'm not very interested," even though the event had the same topic or host as one they had previously attended.
This posed a challenge for the model choice itself, since we were using gradient boosted decision trees. We solved the explainability problem by adding a post-processing step that computes a "reason" backward from the recommendation. This change significantly boosted users' product satisfaction and action conversion.
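One way to picture that post-processing step is a list of rule templates checked in priority order, after the tree model has already picked the event. The `Event` fields, the rule order, and the wording of the reasons below are hypothetical; this is a sketch of the idea, not what we shipped.

```python
from dataclasses import dataclass, field
from typing import List, Set

@dataclass
class Event:
    name: str
    host: str
    topics: Set[str] = field(default_factory=set)
    friends_going: int = 0

def explain(event: Event, attended: List[Event]) -> str:
    """Derive a human-readable reason for an already-ranked recommendation."""
    # Rule 1: same host as an event the user attended.
    for past in attended:
        if past.host == event.host:
            return f"Because you attended {past.name}, also hosted by {event.host}"
    # Rule 2: shared topic with an attended event.
    for past in attended:
        shared = event.topics & past.topics
        if shared:
            return f"Because you attended {past.name} ({sorted(shared)[0]})"
    # Rule 3: social proof.
    if event.friends_going > 0:
        return f"{event.friends_going} of your friends are going"
    # Fallback when no specific reason applies.
    return "Popular event"
```

The reason is computed independently of the model, so it does not have to mirror the trees' internals; it only has to be true and recognizable to the user.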
Recommendation quality can be read at two levels: (a) relevance and (b) spam. Relevance could be addressed by our explainability work. Spam was a separate problem: as the product rolled out more broadly, spammy hosts ramped up their event creation and ads to get traffic from our recommendation entry point. This harmed our product's user experience and hurt the impressions of good events.
I spent one month building separate text classifiers and image classifiers to flag spammy event content based on the event description and event photo. One thing I learned is that you can affect your model's performance by changing the product flow. A large portion of good events didn't have a decent cover photo, so we made adding a cover photo a prominent step in the event creation flow before submitting. This gives us much better feature coverage across our whole events inventory.
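For the text side of the spam work, a toy version of a description classifier looks like the sketch below. It uses multinomial naive Bayes with add-one smoothing purely as a stand-in; I'm not describing the actual model here, and the labels `"spam"` / `"ok"` and the function names are illustrative.

```python
import math
from collections import Counter
from typing import Dict, List, Set, Tuple

Model = Tuple[Dict[str, Counter], Counter, Set[str]]

def tokenize(text: str) -> List[str]:
    return text.lower().split()

def train(examples: List[Tuple[str, str]]) -> Model:
    """Count words per label from (description, label) pairs."""
    word_counts: Dict[str, Counter] = {}
    label_counts: Counter = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def predict(model: Model, text: str) -> str:
    """Return the label with the highest smoothed log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_logp = "", -math.inf
    for label, n in label_counts.items():
        logp = math.log(n / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in tokenize(text):
            if w in vocab:  # ignore words never seen in training
                logp += math.log((word_counts[label][w] + 1) / denom)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label
```

A classifier like this only helps when the field it reads is actually filled in, which is why the product-flow change mattered: more events with descriptions and cover photos means more events the classifiers can score.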
The key to scale & model improvement
The more time I spent on training and tuning, the more I appreciated the training infra and tooling provided by FB Learner. I could easily push out a new model any time I wanted, which made my model iterations much faster.
Another aspect is the volume of data. Due to the scale of Facebook, I could accumulate enough training data just 10 to 20 minutes after rolling out a new model! I could literally watch the CTR and action conversion metrics start moving up when the new model performed better, and I could easily roll back if the trend was not good. It's all because the scale of FB gives instant feedback on your model updates.