Presentation: People You May Know: Fast Recommendations Over Massive Data

Track: Predictive Architectures in the Real World

Location: Cyril Magnin I + II

Duration: 1:20pm - 2:00pm

Day of week: Tuesday


The “People You May Know” (PYMK) recommendation service helps LinkedIn’s members identify other members they might want to connect with, and it is the major driver of growth for LinkedIn’s social network. The principal challenge in developing a service like PYMK is the sheer scale of computation needed to make precise recommendations with high recall. The PYMK service at LinkedIn has been operational for over a decade. During that time it has evolved from an Oracle-backed system that took weeks to compute recommendations, to a Hadoop-backed system that took a few days, to its most modern embodiment, which computes recommendations in near real time.

This talk will present the evolution of PYMK to its current architecture. We will focus on the various systems we built along the way, with an emphasis on those built for our most recent architecture: Gaia, our real-time graph computing capability, and Venice, our online feature store with scoring capability. We will show how we integrate these systems to generate recommendations in a timely and agile manner while remaining cost-efficient. We will also briefly discuss the lessons learned about the scalability limits of our past and current design choices, and how we plan to tackle the scalability challenges of the next phase of growth.

Speaker: Sumit Rangwala

Staff Software Engineer - Artificial Intelligence @LinkedIn

Speaker: Felix GV

Staff Software Engineer @LinkedIn

Felix GV is a software engineer working on LinkedIn’s data infrastructure. He leads the Venice project and keeps a close eye on Hadoop, Kafka, Samza, Azkaban, ZooKeeper, Helix, and Avro.

2019 Tracks

  • Grokking Timeseries & Sequential Data

    Techniques, practices, and approaches around time series and sequential data. Expect topics including image recognition, NLP/NLU, preprocessing, and related algorithms.

  • Deep Learning in Practice

    Deep learning use cases around edge computing, deep learning for search, explainability, fairness, and perception.