Recommendation Systems

Chap 1: Introduction
  • Algorithmic Personalization:
    • Understanding Customer Preference, Interest and Intent
    • Providing Relevant Tips to help them
  • Taxonomy: 
    • Data 
      • Limited Data from Experience and Feedback Loop 
    • Compute 
      • Limited Compute to Learn and Adapt simultaneously
    • Interest 
      • Conflicting signals make it hard to understand the customer's real interests
    • Action
      • Incongruity between the truly desired action and the action actually given
    • Content
      • Getting relevant content to serve the intentions of customers
  • Differences
    • Prediction - Estimate of an action (Rating) based on certain past actions
      • Helps quantify items
    • Recommendation - Suggestion of Items (Top n) of a certain Category
      • Provides good choices to start with
Non Personalized Recommendation:

Chap 2: Information Retrieval 
  • Concept 
    • Item to Item Attributes
      • Similarity, Adjacency
    • Customer to Customer Attributes 
      • Similarity, Adjacency
  • Algorithm 
  • Formula 
  • Pros 
  • Cons 
Chap 3: Content Based Filtering (CBF)
  • Concept 
    • Based on building attributes around products and the Customer's preferences for them
    • Built using likes, purchases, clicks
  • Algorithm
    • Calculate Affinity of customer for each transaction
    • Assign weight to each attribute (Positive or Negative)
    • Decay the old profile, keep introducing new ratings with higher weight
  • Formula 
    • TF-IDF (Term Frequency - Inverse Document Frequency)
      • Term Frequency - Number of occurrences of a term in a document
      • Inverse Document Frequency - How few documents contain this term
        • TF-IDF Weight = (# of occurrences in doc) * log(Total # of Docs / # of Docs containing the term)
      • Used to down-weight Stop Words and bring out the Core Word Set
  • Pros 
    • Structured way of building Customer Profile 
    • Based on Content 
    • Simpler Computation  
    • Query-Based System; a Case-Based System
  • Cons 
    • Attribute and Weight Factors 
      • Too many attributes can confuse the algorithm
      • Attribute structure is rigid and requires manual filtering
    • Cold Start Problem - Without interactions to build the model from, there is nothing to recommend.
    • Recency Factor - Does not adjust quickly to changed user behaviour
    • Computationally Expensive - as it requires recalculation at each change in rating or transaction
    • Can't handle Abstract Concepts - It is an exact science
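The TF-IDF weighting and the decayed customer profile from this chapter can be sketched as follows. This is a minimal illustration, not a reference implementation: the item descriptions, the whitespace tokenizer, the `update_profile` helper and the decay value of 0.9 are all assumptions made for the example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    weight = (# occurrences in doc) * log(total # docs / # docs with the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    return [
        {term: count * math.log(n_docs / doc_freq[term])
         for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]


def update_profile(profile, item_weights, signal, decay=0.9):
    """Decay the old profile, then add the item's attributes scaled by signal.

    signal is positive for a like/purchase/click and negative for a dislike.
    The decay factor is a hypothetical value chosen for illustration.
    """
    terms = set(profile) | set(item_weights)
    return {
        term: decay * profile.get(term, 0.0) + signal * item_weights.get(term, 0.0)
        for term in terms
    }


# Hypothetical item descriptions.
docs = [
    "the red running shoes",
    "the blue running jacket",
    "the red wool scarf",
]
weights = tf_idf(docs)

profile = update_profile({}, weights[0], signal=1.0)        # customer liked item 0
profile = update_profile(profile, weights[1], signal=-1.0)  # customer disliked item 1
```

A term such as "the" that appears in every document gets weight log(1) = 0, which is how stop words drop out, while a term unique to one document gets the largest weight.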
Chap 4: User to User Collaborative Filtering (U-U CF) 
  • Concept
    • Recommend by looking at rating similarity between users for a set of items. 
    • (User x Item) Rating Matrix
      • To find similar users by looking at their rating patterns around a set of items
      • To predict a user's rating for items they have not yet rated
      • Rows are compared for similarity of rating pattern between users
    • Based on User to User similarity - Which tends to change frequently  and widely
  • Formula 
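    • Predicted Rating = User's Mean Rating + Sum over Neighbors ((Neighbor's Rating for Item - Neighbor's Mean Rating) * Weight Factor) / Sum over Neighbors (|Weight Factor|)
    • Weight Factor = Similarity between the user and the neighbor (e.g. Pearson correlation of their ratings)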
  • Algorithm 
    • Selecting Customer set using
      • Similarity Factor
      • Neighbor Factor 
  • Pros
    • Suits cases where a high number of user ratings is available
  • Cons
    • Sparsity of Data:
      • Low number of user ratings/reviews per product item
      • Smaller set of users available to predict the rating for an item (unless an adjustment is taken into consideration)
    • Computationally Expensive:
      • Needs to be recomputed regularly as user behavior changes
      • A higher number of users will make these computations more expensive
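A minimal sketch of user-to-user prediction, assuming the common mean-centered form: the target user's mean rating plus the similarity-weighted average of each neighbor's deviation from their own mean. The ratings data is hypothetical, Pearson correlation is assumed as the weight factor, and every user who rated the item is treated as a neighbor (no neighborhood size or similarity threshold is applied).

```python
import math


def pearson(ratings, u, v):
    """Pearson correlation of two users over the items both have rated."""
    common = sorted(set(ratings[u]) & set(ratings[v]))
    if len(common) < 2:
        return 0.0
    mean_u = sum(ratings[u][i] for i in common) / len(common)
    mean_v = sum(ratings[v][i] for i in common) / len(common)
    dev_u = [ratings[u][i] - mean_u for i in common]
    dev_v = [ratings[v][i] - mean_v for i in common]
    denom = math.sqrt(sum(d * d for d in dev_u)) * math.sqrt(sum(d * d for d in dev_v))
    if denom == 0:
        return 0.0
    return sum(a * b for a, b in zip(dev_u, dev_v)) / denom


def predict(ratings, user, item):
    """Mean-centered, similarity-weighted prediction of user's rating for item."""
    user_mean = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        weight = pearson(ratings, user, other)
        other_mean = sum(other_ratings.values()) / len(other_ratings)
        num += weight * (other_ratings[item] - other_mean)
        den += abs(weight)
    return user_mean if den == 0 else user_mean + num / den


# Hypothetical ratings: {user: {item: rating}} on a 1-5 scale.
ratings = {
    "ann": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 2, "c": 3, "d": 5},
    "cat": {"a": 1, "b": 5, "c": 2, "d": 1},
}
predicted = predict(ratings, "ann", "d")
```

Here "ann" rates like "bob" and opposite to "cat", so both neighbors push the prediction for item "d" above her mean. The raw value is not bounded by the rating scale, so a real system would typically clip it.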
Chap 5: Evaluation
  • Methods
    • Accuracy Metrics 
      • Mean Absolute Error = Sum |P - R| / # Ratings
      • Mean Squared Error = Sum (P - R)^2 / # Ratings
      • Root Mean Squared Error = sqrt( Sum (P - R)^2 / # Ratings )
    • Error Metrics
    • Decision Support 
    • User/Usage Centric Metrics 
  • Prediction vs Top N 
    • Decision Support 
    • Accuracy vs Ranking
    • Focus - Locally vs Comparatively 
  • Accuracy Metrics
    • Prediction Accuracy - Estimating Preference
    • Decision Support Accuracy  - Finding useful/good things
    • Rank Accuracy  - Estimate Relative Preferences 
  • Testing 
    • Live vs Dead Recs
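The three accuracy metrics from this chapter can be sketched in a few lines, with P as the predicted rating and R as the actual rating. The rating values below are made up for illustration.

```python
import math


def mae(predicted, actual):
    """Mean Absolute Error: average of |P - R|."""
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(actual)


def mse(predicted, actual):
    """Mean Squared Error: average of (P - R)^2."""
    return sum((p - r) ** 2 for p, r in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root Mean Squared Error: square root of the MSE."""
    return math.sqrt(mse(predicted, actual))


# Hypothetical predicted and actual ratings for four items.
predicted = [4.0, 3.0, 5.0, 2.0]
actual = [5.0, 3.0, 4.0, 4.0]

mae_value = mae(predicted, actual)    # errors are -1, 0, 1, -2
mse_value = mse(predicted, actual)
rmse_value = rmse(predicted, actual)
```

Because the errors are squared before averaging, MSE and RMSE penalize the single error of 2 more heavily than MAE does.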
Chap 6: Item to Item Collaborative Filtering (I-I CF)   
  • Concept
    • Recommend by looking at rating similarity between items across a set of users.
    • (User x Item) Rating Matrix
      • To find similar items by looking at rating patterns around a set of users
      • To predict Item ratings for a set of users
      • Columns are compared for similarity of rating pattern between Items
    • Based on Item to Item similarity - Which does not change much over time
  • Algorithm 
    •  
  • Formula
    •  
  • Pros
    • Computationally Economical: 
      • Can be Pre-computed and need not be recomputed frequently, as Item to Item similarity stays stable
    • Sparsity of Data:
      • Can work with a smaller number of Ratings
  • Cons  
    •  
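Since the Algorithm and Formula entries above are left open, the following is only one common way to do item-to-item filtering, not necessarily the one these notes intended. It assumes cosine similarity between item columns (missing ratings counted as 0) and predicts a rating as the similarity-weighted average of the user's own ratings. The data and function names are hypothetical.

```python
import math


def item_columns(ratings):
    """Turn {user: {item: rating}} into {item: {user: rating}} (the columns)."""
    columns = {}
    for user, user_ratings in ratings.items():
        for item, rating in user_ratings.items():
            columns.setdefault(item, {})[user] = rating
    return columns


def cosine(col_a, col_b):
    """Cosine similarity of two item columns; missing ratings count as 0."""
    dot = sum(col_a[u] * col_b[u] for u in set(col_a) & set(col_b))
    norm_a = math.sqrt(sum(r * r for r in col_a.values()))
    norm_b = math.sqrt(sum(r * r for r in col_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_similarity(ratings):
    """Pre-compute the item-to-item similarity table once, offline."""
    columns = item_columns(ratings)
    return {
        a: {b: cosine(columns[a], columns[b]) for b in columns if b != a}
        for a in columns
    }


def predict(ratings, sim, user, item):
    """Similarity-weighted average of the ratings the user already gave."""
    num = den = 0.0
    for rated_item, rating in ratings[user].items():
        weight = sim[item].get(rated_item, 0.0)
        num += weight * rating
        den += abs(weight)
    return num / den if den else None


# Hypothetical ratings: {user: {item: rating}} on a 1-5 scale.
ratings = {
    "ann": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 2, "c": 3, "d": 5},
    "cat": {"a": 1, "b": 5, "c": 2, "d": 1},
}
sim = build_similarity(ratings)
predicted = predict(ratings, sim, "ann", "d")
```

The similarity table depends only on the item columns, which is why it can be built ahead of time and reused across many predictions.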
Chap 7: Dimensionality Reduction Recommendation
  • Concept 
    • Instead of a User to Item Rating based Matrix calculation, it works with a reduced set of features.
  • Algorithm
    • Identify Concepts instead of Keywords from Information Retrieval
    • Calculate their weights for the Equation
    • Calculate User rating for Concepts (K Feature)
    • Calculate Item contents for Concepts (K Feature) - Information Filtering
  • Formula 
    • Singular Value Decomposition (SVD): 
      • Breaking the matrix around k feature vectors 
        • User Matrix for K Feature Vector 
        • Item Matrix for K Feature Vector
        • Weight Diagonal Matrix for K Feature
  • Pros 
    • Works on Concepts instead of Keyword intensity
    • Computationally Economical at Run-time
      • Time complexity O(m*n + n^3)
      • Expensive in totality 
  • Cons 
    • Model Refresh Frequency 
      • Depends on User Ratings, so the model will require a refresh as ratings change
    • Tolerance for Missing Values
      • Assumes Matrix is full
      • Need to Impute missing values (e.g. with the Mean Value)
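A minimal sketch of the SVD approach using NumPy, assuming a small hypothetical rating matrix, item-mean imputation for the missing cells, and k = 2 features. It shows the three pieces named above: the user matrix, the diagonal weight matrix and the item matrix for the K features.

```python
import numpy as np

# Hypothetical ratings: rows are users, columns are items, NaN = not rated.
R = np.array([
    [5, 3, 4, np.nan],
    [4, 2, 3, 5],
    [1, 5, 2, 1],
    [2, 4, np.nan, 2],
], dtype=float)

# SVD assumes a full matrix, so impute each missing cell with its item mean.
item_means = np.nanmean(R, axis=0)
filled = np.where(np.isnan(R), item_means, R)

# Decompose, then keep only the k strongest features.
U, s, Vt = np.linalg.svd(filled, full_matrices=False)
k = 2
user_factors = U[:, :k]        # User Matrix for K Feature Vector
weights = np.diag(s[:k])       # Weight Diagonal Matrix for K Feature
item_factors = Vt[:k, :]       # Item Matrix for K Feature Vector

# The rank-k reconstruction gives a predicted rating for every cell.
approx = user_factors @ weights @ item_factors
predicted = approx[0, 3]       # user 0, item 3 (originally missing)
```

Mean imputation is the simplest choice and biases the factors toward the item averages. The value of k is a tuning choice that trades reconstruction error against model size.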
Chap 8: Other Recommender Viewpoints
  • Context-Aware Recommender
    • Types 
      • Personal Context (Mood, preferences, ....)
      • External Context (Weather, office, driving, location)
      • Social Context (People around you)
    • Interface 
      • Live Interaction 
      • Mobile Interfaces 
      • Implicit Behaviour vs Explicit Rating 
    • Technique 
      • User, Item, Context and Rating
      • Pre Context Filtering 
        • Filter on Context 
        • Recommend on U, I, R 
      • Post Context Filtering 
        • Recommend UIR 
        • Filter on Context 
      • Modelled Context Filtering 
        • Considering all four 
        • Building Multi Dimensional Model Processing  
  • Netflix Recommender
    • Learning to Rank 
      • Pairwise and List-wise Approaches
    • Core 
      • Category - Personalized 
      • Rating - Personalized  
    • Function 
      • Popularity 
      • Rating (Implicit and Explicit) 
    • Formula
      • Linear Regression - Determine
        • f(u, v) = w1 * P(v) + w2 * R(u, v) + b
          • P - Popularity of item v
          • R - Predicted Rating of item v for user u
          • w1, w2 - Weights; b - Bias
        • Weight - Giving preference between axes
        • Determine Weight - Classification = Logistic Regression
      • Classification -
      • Decision trees -
      • Gradient Descent -
    • Notes 
      • Learning from Implicit Action 
      • Explicit Ratings are corrupted
  • LinkedIn Recommender
    • Types 
      • Content Filtering 
      • Collaborative Filtering (SVD)
      • Popularity (Trending)
      • Social
    • Approach 
      • Feature Extraction 
      • Entity Resolution 
      • Meta Data Enrichment  
    • Technique 
      • Interaction Splits 
        • Selecting Model based on Features - Using Decision Tree Mechanism 
        • Algorithm Families
          • Decision Tree
          • Simple Tree - Linear Regression
        • Model Coefficient
          • Demographics Models 
    • Evaluation
      • Model Fitting Technique Metrics (Quantitative):
        • CV Error (Cross Validation)
        • Precision@K
        • AUC
        • PR-AUC
        • RMSE
        • Multivariate Testing
      • A/B Testing
        • Presentation Bias Effect
        • Impression Discounting - Removing items that were shown but not responded to
        • Effect of other A/B testing
          • Role > Divide A/B Testing
          • Lazy Orthogonal Multivariate Testing - Quantify the effect of other testing.
        • Novelty Effect:
          • New Algorithm gets spike in interaction - Need to ignore it for Evaluation
          • Burn In Period - Let it get normalized
        • Network Effect:
          • Effect of one cluster of customers getting affected
        • Power Analysis:
          • To determine Duration and Amount of Traffic allocation to run this test.
            • Depending Factors
              • Variance of metrics
              • Sample Size
              • Effect Size
    • Notes 
      • Feature vs Models 
    • Technology
      • Mahout / R Hadoop
  • Dialogue-Based Recommender (Critique / Case Based)
    • Useful 
      • Large Cost Items - Where Purchase Cycle demands Research and longer cycle 
    • Technique 
      • Break the Category into Features
      • Ask the user for their Requirement and its Importance for Each
      • Internally assign weights to interrelations
      • Based on this, Recommend
      • Adjust the Recommendation - If not liked by Users
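The dialogue-based technique above can be sketched as a weighted score over features, where a critique from the user changes the importance weights and triggers a new ranking. The catalogue, the 0-1 feature scores (higher meaning better for the user) and the importance values are all hypothetical.

```python
def recommend(items, importance):
    """Rank items by the importance-weighted average of their feature scores."""
    total = sum(importance.values())
    scores = {
        name: sum(importance[f] * features[f] for f in importance) / total
        for name, features in items.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical catalogue: each feature is scored 0-1, higher is better.
laptops = {
    "model_x": {"battery": 0.9, "weight": 0.4, "price": 0.3},
    "model_y": {"battery": 0.5, "weight": 0.9, "price": 0.6},
    "model_z": {"battery": 0.6, "weight": 0.6, "price": 0.9},
}

# Step 1: the user states how important each feature is.
importance = {"battery": 3, "weight": 1, "price": 1}
first = recommend(laptops, importance)

# Step 2: the user critiques the result ("too expensive"), so the
# importance of price is raised and the list is re-ranked.
importance["price"] = 4
adjusted = recommend(laptops, importance)
```

With battery life weighted highest, the first ranking leads with "model_x". After the price critique, the cheaper "model_z" moves to the top.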
Appendix:  
Terminology: 
  • Unary: 
    • Represented as a series of 1s
    • Analytic Types 
    • Exploratory
    • Descriptive 
  • Vector 
    • Characteristic Variable to show a behaviour - Could be a set of values representing 
  • Model
Machine Learning
  • Feature - Each Token Occurrence 
