Thoughts on RecSys 2015
And so with RecSys 2015 in Vienna, my conference season comes to an end. Time now to buckle down to running experiments and writing papers. Hopefully next year I'll be presenting at conferences and not just attending them!
RecSys is fundamentally different from the other academic ML conferences I've attended (NIPS, KDD, ICML, ICLR). Pretty much everything you see at RecSys you could start using straight away in your own system / application if it meets your needs. The other conferences are more theoretical, and a paper / poster will often be a dense discourse on the theory / math without any discussion of the application. You leave those conferences gestating a lot of abstract material and only later apply it to your data / domain. That's not an endorsement of RecSys or a criticism of the other conferences, just something to note if you're wondering how applied RecSys is.
RecSys is also, well... focused squarely on Recommender Systems (RS)! For sure, there are tracks organised along facets such as locality / mobile, crowd-sourcing, user interface design, content vs user, cold start, and social signals / data, but they all converge on the core task of suggesting relevant content or actions to the end-user by filling in the latent blanks in a user + content + environment model. For me personally, it's a great way to ramp up on the state of the art in just a few days, since RS is a big part of Holistic.
This year there were two trends that I spotted:
A move away from a slavish quest for ever-better accuracy and towards better explainability
More effort expended on user interface / experience (UI/UX) design, both to give the end-user more control over the RS they are using and to elicit more information from them to help improve algorithm performance - a sign, I think, that even the best RS algorithms need some end-user feedback to do a good job.
A few papers caught my eye and I call them out here. The conference proceedings are here:
Gaussian Ranking by Matrix Factorization by Harald Steck from Netflix. This paper reformulates the ranking / matrix factorisation problem as a neural network training problem, which is interesting. Of course the next step is to stack loads of these networks together so Netflix too can be learning deeply :)
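To make the connection concrete: classic matrix factorisation can itself be read as a tiny neural network - a user embedding and an item embedding feeding a dot product - trained by SGD. Here's a minimal toy sketch of that view (random data and made-up dimensions; this is my illustration of the general idea, not Steck's actual formulation, which targets ranking rather than squared error):

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 50, 40, 8

# toy observed ratings: (user, item, rating) triples
ratings = [(rng.integers(n_users), rng.integers(n_items), rng.uniform(1, 5))
           for _ in range(500)]

# the factor matrices are the "weights" of a two-embedding network
U = 0.1 * rng.standard_normal((n_users, k))
V = 0.1 * rng.standard_normal((n_items, k))

def rmse():
    errs = [r - U[u] @ V[i] for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errs))))

rmse_before = rmse()

lr, reg = 0.02, 0.05
for epoch in range(20):
    for u, i, r in ratings:
        pred = U[u] @ V[i]                      # forward pass: embedding dot product
        err = r - pred                          # squared-error residual
        U[u] += lr * (err * V[i] - reg * U[u])  # SGD step with L2 regularisation
        V[i] += lr * (err * U[u] - reg * V[i])

rmse_after = rmse()
```

Training error drops over the epochs, exactly as it would for any small network trained by backprop; the "network" just happens to have one embedding layer per side and no nonlinearity.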
HyPER: A Flexible and Extensible Probabilistic Framework for Hybrid Recommender Systems by Pigi Kouki et al. I like this paper simply because of the PSL (Probabilistic Soft Logic) modelling language used, which generates hinge-loss Markov random fields (HL-MRFs). It's a nice idea and somehow reminds me of the promise of PROLOG: rules that are easy to understand but powerful / high capacity at the same time. There is also software on GitHub to clone, run and evaluate, which is always good!
RecSys Challenge 2015: ensemble learning with categorical features by Peter Romov et al from Yandex. These guys won the challenge this year, but with a partially black-box algorithm. I'm not sure about this.. I understand the need for opaque data sets, but surely opaque algorithms go against the point of an academic conference. Nevertheless, it represents the state of the art in predicting buy events from click-stream data in 2015.
Neural Modeling of Buying Behaviour for E-Commerce from Clicking Patterns by Wu et al. This paper applied LSTM to the same problem as the previous paper and came 10th in the competition. Given that most of the other entries used ensemble learning methods / gradient boosting machines, the novelty factor here is nice. LSTM is racking up state-of-the-art scores in sequence-to-sequence translation, handwriting recognition and generative models, so the logical question is how well it transfers out of those domains and into domains like RS. Maksim Volkovs also used neural nets, but not specifically LSTM (the paper doesn't seem to mention whether he used RNNs or regular feed-forward nets).
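For readers unfamiliar with how an LSTM applies here: the idea is to run the recurrent cell over a session's click events in order and read a buy probability off the final hidden state. A rough, untrained sketch of that forward pass (random weights and invented dimensions - a real model like Wu et al.'s would of course learn the parameters, and I'm not claiming this matches their architecture):

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

d_in, d_h = 4, 6   # click-feature and hidden sizes (arbitrary for the sketch)

# randomly initialised LSTM parameters, gates stacked as i, f, o, g
W = 0.1 * rng.standard_normal((4 * d_h, d_in + d_h))
b = np.zeros(4 * d_h)
w_out = 0.1 * rng.standard_normal(d_h)   # buy-probability readout weights

def lstm_buy_prob(clicks):
    """Run the LSTM over one session's click vectors; return P(buy)."""
    h = np.zeros(d_h)
    c = np.zeros(d_h)
    for x in clicks:
        z = W @ np.concatenate([x, h]) + b
        i = sigmoid(z[:d_h])              # input gate
        f = sigmoid(z[d_h:2 * d_h])       # forget gate
        o = sigmoid(z[2 * d_h:3 * d_h])   # output gate
        g = np.tanh(z[3 * d_h:])          # candidate cell update
        c = f * c + i * g                 # cell state carries session memory
        h = o * np.tanh(c)
    return float(sigmoid(w_out @ h))

session = rng.standard_normal((5, d_in))  # five clicks, each a feature vector
p = lstm_buy_prob(session)
```

The point of the recurrence is that the cell state can remember signals from early in the session (say, lingering on a product page) when scoring the buy decision at the end - something a bag-of-clicks feature set hands to a GBM only via manual feature engineering.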
In summary, a great few days to really double down on the latest developments and focus of the RS community.