eshvk | 7 years ago
1. Decide whether you are okay with a batch approach, an online learning approach, or a hybrid.
2. Start simple with a batch approach (similar to what you are doing):
a) Get features ready from your dataset (assuming you have interaction data): pre-process via some big data framework (MapReduce, Dataflow, etc.).
b) Build a vector space and nearest-neighbor data structures.
c) Stick both into a database optimized for reads.
d) Stick a service in front of it and serve.
Once you are happy with step 2, you can try variations such as online updates to your recommender system, which may change the type of database you want to optimize for, etc.
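To make steps (b)–(d) concrete, here is a minimal sketch of the vector-space / nearest-neighbor piece. The item vectors and the brute-force search are illustrative only; a production system would build the vectors in the batch job from step (a) and use an approximate-nearest-neighbor index (e.g. Annoy or FAISS) persisted to a read-optimized store.

```python
import numpy as np

# Hypothetical toy item-feature matrix (rows = items); in practice these
# vectors would come out of the batch preprocessing job.
item_vectors = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
], dtype=float)

def nearest_neighbors(query, vectors, k=2):
    """Brute-force cosine-similarity nearest neighbors.

    Fine for a sketch; at scale you would replace this with an
    approximate index built offline and served behind an API.
    """
    norms = np.linalg.norm(vectors, axis=1) * np.linalg.norm(query)
    sims = vectors @ query / np.where(norms == 0, 1.0, norms)
    return np.argsort(-sims)[:k]

# Items most similar to item 0 should include its near-duplicate, item 1.
print(nearest_neighbors(item_vectors[0], item_vectors))
```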
eshvk|7 years ago
In the past, I have helped build Lambda architectures where we used a batch model to build a content vector space, built estimates of users in batch, and updated those in realtime (via PubSub/Kafka) based on user feedback.
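The realtime half of that setup can be sketched as a small incremental update applied to a batch-built user embedding whenever a feedback event arrives off the stream. The update rule and the `lr` learning rate here are illustrative assumptions, not a specific production recipe:

```python
import numpy as np

def update_user_vector(user_vec, item_vec, reward, lr=0.1):
    """Incrementally update a batch-built user embedding from one
    feedback event (e.g. consumed from a Kafka/PubSub topic).

    Positive reward nudges the user vector toward the item; negative
    reward nudges it away. A real system would tune or decay `lr`.
    """
    return user_vec + lr * reward * (item_vec - user_vec)

user = np.array([0.5, 0.5])        # estimate from the last batch run
liked_item = np.array([1.0, 0.0])  # item the user just interacted with

user = update_user_vector(user, liked_item, reward=1.0)
print(user)  # moved toward the liked item
```

The batch layer periodically recomputes the embeddings from scratch, overwriting these realtime drifts, which is what keeps the two layers of the Lambda architecture consistent.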
Another online mechanism is Contextual Bandits: use user interactions as context, with the arms of the bandit being recommendation choices, etc. This interaction data can be used to continuously improve your policy. The key benefit over a Matrix Factorization setup, where the interaction matrix is continuously rebuilt as new data arrives, is the built-in exploration, which minimizes regret.
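As a simplified illustration of the exploration/exploitation idea, here is a non-contextual epsilon-greedy bandit over recommendation arms. A true contextual bandit (e.g. LinUCB) would additionally condition each arm's value estimate on a user-context feature vector; everything here (arm counts, epsilon, the running-mean update) is a generic sketch, not a specific library API:

```python
import random

class EpsilonGreedyBandit:
    """Minimal epsilon-greedy bandit: each arm is a recommendation choice."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = [0] * n_arms    # pulls per arm
        self.values = [0.0] * n_arms  # running mean reward per arm

    def select_arm(self):
        if self.rng.random() < self.epsilon:
            # Explore: pick a random arm; this is what bounds regret
            # versus a purely greedy (exploit-only) policy.
            return self.rng.randrange(len(self.values))
        # Exploit: pick the arm with the best estimated reward so far.
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def update(self, arm, reward):
        # Incremental running mean of observed reward for this arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

bandit = EpsilonGreedyBandit(n_arms=3, seed=42)
bandit.update(1, reward=1.0)  # user clicked recommendation 1
bandit.update(2, reward=0.0)  # user ignored recommendation 2
```

The policy improves continuously as interaction data flows in, with no periodic matrix rebuild required.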