Learning to Rank in Personalized E-Commerce Search


1. Learning to Rank in Personalized E-Commerce Search
Wu Chen (Search BG: Natural Artificial Intelligence)
2016.10.22
2. Outline
① Background
② Learning to Rank
③ Personalized E-Commerce Search
④ Summary
⑤ Reference
3. Background
Predict relevance scores and re-rank products returned by an e-commerce search engine on the search engine result page (SERP).
• Data used
  – Search, browsing, and transaction histories for all users, and specifically for the user interacting with the search engine in the current session
  – Product properties and metadata
• Methods used
  – Machine learning (e.g. RankSVM, LambdaMART)
  – Ranking functions (e.g. BM25, cosine similarity)
4. LEARNING TO RANK
5. Introduction
• Ranking problem
  – Learning to match?
• Methods
  – Pointwise
  – Pairwise
  – Listwise
• Theory (PAC)
  – Generalization
  – Stability
• Applications
  – Search
  – Recommender systems
  – Question answering
  – Sentiment analysis
6. Formulation
• Machine learning
  – Supervised learning with labeled data
• Ranking of objects by subject
  – Feature-based ranking function
• Approach
  – Traditional
    • BM25 (probabilistic model)
  – New
    • A query and its associated products form a group (training data)
    • Groups are i.i.d.
    • Features (query and product) within a group are not i.i.d.
    • The model is a function of the features
7. Issues
• Data labeling
  – Relevance metric (pointwise)
  – Ordered pairs
  – Ordered lists
• Feature extraction
  – Relevance (user/query-product features)
  – Semantic (user/query-product features)
  – Importance (product features)
• Learning method
  – Model
  – Loss function
  – Optimization algorithms
• Evaluation measure
  – NDCG@k (see the sketch below)
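A minimal NDCG@k sketch, assuming graded relevance labels and the common 2^rel − 1 gain with a log2 position discount:

```python
import numpy as np

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k ranked items."""
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))  # log2(rank + 1)
    return np.sum((2 ** rel - 1) / discounts)

def ndcg_at_k(relevances, k):
    """NDCG@k: DCG of the predicted order divided by the ideal DCG."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Relevance labels of products, in the order the model ranked them:
print(ndcg_at_k([3, 2, 3, 0, 1, 2], k=5))
```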
8. Methods
• Machine learning
  – Classification
  – Regression
  – Ordinal classification/regression
• Ordinal regression
  – Pointwise
    • Casts ranking as regression
    • Ignores group information
• Learning to rank
  – Pairwise
    • Casts ranking as binary classification (see the sketch below)
  – Listwise
    • Represents the ranked list directly in learning
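A sketch of the pairwise reduction: within each query group, every pair of items with different labels becomes one binary example on the feature difference:

```python
import numpy as np

def to_pairwise(X, y):
    """Turn one query group (features X, graded labels y) into
    binary classification data: each pair with different labels
    yields the difference x_i - x_j, labeled +1 if item i is more
    relevant than item j, else -1."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    X_pairs, y_pairs = [], []
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            if y[i] == y[j]:
                continue  # ties carry no ordering information
            X_pairs.append(X[i] - X[j])
            y_pairs.append(1 if y[i] > y[j] else -1)
    return np.array(X_pairs), np.array(y_pairs)
```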
9. Pointwise Model
• McRank (2007)
• Ordinal linear regression
  – (Staged) logistic regression (see the sketch below)
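A generic pointwise baseline, not McRank itself: fit the graded labels with a regressor and sort candidates by predicted score, ignoring the group structure:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Pointwise ranking reduced to regression on graded labels.
X_train = np.random.rand(100, 5)         # toy feature vectors
y_train = np.random.randint(0, 4, 100)   # graded relevance 0..3
model = LinearRegression().fit(X_train, y_train)

# Rank new candidates by predicted relevance, best first:
X_candidates = np.random.rand(10, 5)
order = np.argsort(-model.predict(X_candidates))
```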
10. Pairwise Model
• RankSVM (2000)
  – Pairwise classification (see the sketch below)
• IR SVM (2006)
  – Cost-sensitive pairwise model
  – Uses a modified hinge loss
• RankBoost (2003)
• RankNet (2005)
• LambdaMART (2008)
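A rough RankSVM sketch under the above reduction, reusing to_pairwise from the earlier sketch: a linear SVM (hinge loss) on the pairwise differences yields a weight vector whose dot product with a feature vector is the ranking score:

```python
import numpy as np
from sklearn.svm import LinearSVC

def rank_svm_fit(groups, c=1.0):
    """groups: iterable of (X, y) per query, as numpy arrays.
    Returns w such that score(x) = w . x orders the items."""
    X_all, y_all = [], []
    for X, y in groups:
        X_pairs, y_pairs = to_pairwise(X, y)  # from the earlier sketch
        if len(y_pairs):
            X_all.append(X_pairs)
            y_all.append(y_pairs)
    clf = LinearSVC(C=c, fit_intercept=False)
    clf.fit(np.vstack(X_all), np.concatenate(y_all))
    return clf.coef_.ravel()
```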
11. Listwise Model
• Plackett-Luce model
• ListMLE (2008)
• ListNet (2007)
  – Parameterized Plackett-Luce model (see the sketch below)
• AdaRank (2007)
• PermuRank (2008)
• SVM-MAP (2007)
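A minimal sketch of ListNet's top-one approximation: the loss over one query group is the cross entropy between the softmax of the true labels and the softmax of the model scores:

```python
import numpy as np

def softmax(v):
    e = np.exp(v - np.max(v))
    return e / e.sum()

def listnet_top1_loss(scores, labels):
    """Cross entropy between the top-one probabilities induced by
    the labels and those induced by the model scores."""
    p_true = softmax(np.asarray(labels, dtype=float))
    p_model = softmax(np.asarray(scores, dtype=float))
    return -np.sum(p_true * np.log(p_model + 1e-12))

# Loss is lower when scores order the items the way the labels do:
print(listnet_top1_loss([2.0, 1.0, 0.1], [3, 1, 0]))  # small
print(listnet_top1_loss([0.1, 1.0, 2.0], [3, 1, 0]))  # larger
```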
12. Optimize
• Direct optimization
  – AdaRank
  – SVM-MAP
• Approximation
  – SoftRank
  – LambdaRank
• Learning framework
  – Data representation
  – Expected risk
  – Empirical risk
  – Generalization analysis
• Evaluation
  – The pairwise and listwise approaches perform better than the pointwise approach
13. Applications
• Search
  – Re-ranking
• Recommender systems
  – Collaborative filtering
14. PERSONALIZED E-COMMERCE SEARCH
15. Introduction
• Pertinence
  – Log analysis
  – Conversion in e-commerce: give a greater score to clicks that eventually got converted into a sale (see the sketch below)
• Data
  – User info
  – List of the terms that form the query
  – Displayed items and their domains
  – Items on which the user clicked
  – Timing of all of these actions
  – History of behaviors, day 28 to day 30
• Ensemble model
  – Boosting
  – Bagging
  – Stacking
• Trap
  – Position bias
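One possible graded labeling scheme for such implicit feedback; the exact grade values are illustrative assumptions, not necessarily those used in the talk:

```python
def relevance_label(clicked, purchased):
    """Grade implicit feedback: a converted click outranks a
    plain click, which outranks an ignored impression."""
    if purchased:
        return 2  # click that converted into a sale
    if clicked:
        return 1  # click without a purchase
    return 0      # item was shown but ignored

events = [(False, False), (True, False), (True, True)]
print([relevance_label(c, p) for c, p in events])  # [0, 1, 2]
```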
16. Related Work
• Click feedback
• When to personalize?
  – Long term
  – Short term
• Past interaction timescales
• Search behaviors beyond clicks
• Learning from all repeated results
17. Features
• Aggregate features
  – User-specific / anonymous
• Query features
• User click habits
  – Number of times the user clicked on the item in the past
• Session features
• Non-personalized rank
  – Read linearly
  – Computed from the available information
• Inhibiting/promoting features
  – Query click entropy (see the sketch below)
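A sketch of query click entropy, a common signal for deciding when personalization can help: low entropy means most users click the same items for the query, so there is little to personalize:

```python
import math
from collections import Counter

def click_entropy(clicked_items):
    """Entropy of the click distribution observed for one query."""
    counts = Counter(clicked_items)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values())

print(click_entropy(["a", "a", "a", "b"]))  # low (~0.81): one dominant intent
print(click_entropy(["a", "b", "c", "d"]))  # high (2.0): diverse intents
```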
18. Methodology
• Classification will be used
  – The classifier's parameters should be tuned to optimize the NDCG score on the cross-validation set (see the sketch below)
• Query-full
  – SERPs returned in response to a query
• Query-less
  – SERPs returned in response to the user clicking on some product category
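A hedged sketch of that tuning loop, reusing ndcg_at_k and rank_svm_fit from the earlier sketches; the candidate C values are illustrative:

```python
import numpy as np

def tune_c(train_groups, valid_groups, cs=(0.01, 0.1, 1.0, 10.0)):
    """Pick the C that maximizes mean NDCG@5 on held-out groups."""
    best_c, best_ndcg = None, -1.0
    for c in cs:
        w = rank_svm_fit(train_groups, c=c)
        ndcgs = [ndcg_at_k(y[np.argsort(-(X @ w))], 5)
                 for X, y in valid_groups]
        if np.mean(ndcgs) > best_ndcg:
            best_c, best_ndcg = c, float(np.mean(ndcgs))
    return best_c
```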
19. Ensemble Model
A very powerful technique for increasing accuracy on a variety of ML tasks.
• Boosting
• Bagging
• Ensemble correlation
  – Voting
  – Weighting
  – Averaging
  – Rank averaging
• Stacking/blending (see the sketch below)
  – Split the training set into A and B
  – Fit first-stage models on A and create predictions for B
  – Fit the same models on B and create predictions for A
  – Finally, fit the models on the entire training set and create predictions for the test set
  – Train a second-stage stacker model on the probabilities from the first-stage model(s); stacking with logistic regression is one of the more basic and traditional ways of stacking
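A compact sketch of that recipe, with cross_val_predict generalizing the A/B split to out-of-fold predictions; GradientBoostingClassifier stands in here for a first-stage model such as LambdaMART:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

def stack_predict(X_train, y_train, X_test):
    first_stage = [GradientBoostingClassifier(),
                   LogisticRegression(max_iter=1000)]
    # Out-of-fold probabilities on the train set (cv=2 is the A/B split):
    Z_train = np.column_stack([
        cross_val_predict(m, X_train, y_train, cv=2,
                          method="predict_proba")[:, 1]
        for m in first_stage])
    # Refit on all training data to produce test-set features:
    Z_test = np.column_stack([
        m.fit(X_train, y_train).predict_proba(X_test)[:, 1]
        for m in first_stage])
    # Second-stage stacker on the first-stage probabilities:
    stacker = LogisticRegression().fit(Z_train, y_train)
    return stacker.predict_proba(Z_test)[:, 1]
```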
20. Scenario I: Query-Full
Relevance match + semantic match: LambdaMART, a deep match model, and RankSVM as first-stage models, combined with a stacking ensemble.
21. Scenario I: Query-Full – DNN Model
Architecture (bottom to top): query and item tokens → Word2Vec matrix lookup → query and item vectors → stacked neural network layers on each side → semantic scores 1..N, trained with a pairwise loss function (see the sketch below).
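A much-simplified numpy sketch of such a pairwise semantic match model: mean-pooled Word2Vec embeddings, one tanh layer per side (the real model stacks several), a cosine semantic score, and a pairwise hinge loss; all sizes and names here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
V, D, H = 5000, 100, 64                    # vocab, embedding, hidden sizes
W2V = rng.normal(scale=0.1, size=(V, D))   # pretrained Word2Vec in practice
W_query = rng.normal(scale=0.1, size=(D, H))
W_item = rng.normal(scale=0.1, size=(D, H))

def encode(token_ids, W):
    """Mean-pool Word2Vec vectors, then one tanh layer."""
    return np.tanh(W2V[token_ids].mean(axis=0) @ W)

def semantic_score(query_ids, item_ids):
    q = encode(query_ids, W_query)
    i = encode(item_ids, W_item)
    return (q @ i) / (np.linalg.norm(q) * np.linalg.norm(i))  # cosine

def pairwise_loss(query_ids, pos_ids, neg_ids, margin=1.0):
    """Hinge loss pushing the clicked item above the skipped one."""
    return max(0.0, margin - semantic_score(query_ids, pos_ids)
                           + semantic_score(query_ids, neg_ids))

print(pairwise_loss([1, 2], [10, 11], [12, 13]))
```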
22. Scenario II: Query-Less
Click model + recommender system: LambdaMART, TreeLink, and RankSVM as first-stage models, combined with a (linear) stacking ensemble.
23. Work Flow
Data construction → feature engineering → model training → model validation
24. ACM CIKM 2016 Competition
Improved on the challenging non-personalized baseline by 21.28%.
25. Further Work
• Learning from implicit data
  – Generating labeled data
• Model (feature) learning
  – Models as features
• Scenario-dependent ranking
26. Summary
• A branch of machine learning
• Feature extraction
• Ensemble methods
• Engineering
  – Dataflow
  – Workflow
• Production
  – The new sort order is greatly influenced by the initial sort.
  – The initial sort can probably be considered as not holding much pertinence information.
  – A practical solution is to zero out all rank features before prediction (see the sketch below).
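A tiny sketch of that production tip: if the model was trained with the engine's initial rank as a feature, neutralize that column at serving time so the re-ranker does not simply echo the initial sort (the column index is an assumption):

```python
import numpy as np

def zero_rank_feature(X, rank_col=0):
    """Zero the initial-rank feature column before prediction."""
    X = np.array(X, dtype=float, copy=True)
    X[:, rank_col] = 0.0
    return X
```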
27. Reference
• C. J. C. Burges. From RankNet to LambdaRank to LambdaMART: An Overview. Technical report, Microsoft Research, 2010.
• Hang Li. Learning to Rank. ACML Tutorial, 2009.
• Zhengdong Lu, Hang Li. A Deep Architecture for Matching Short Texts. In Proceedings of Neural Information Processing Systems 26 (NIPS), 1367-1375, 2013.
• Wei Wu, Hang Li, Jun Xu. Learning Query and Document Similarities from Click-through Bipartite Graph with Metadata. In Proceedings of the Sixth ACM International Conference on Web Search and Data Mining (WSDM), 687-696, 2013.
• Zhihua Zhou. Machine Learning (机器学习), 171-190, 267-287, 2016.
• Hang Li, Zhengdong Lu. Deep Learning for Information Retrieval. SIGIR Tutorial, 2016.
• http://mlwave.com/kaggle-ensembling-guide/
• http://machinelearningmastery.com/machine-learning-ensembles-with-r/
28. E-Mail: wuchen.wc@alibaba-inc.com