Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track: European Conference, ECML PKDD 2020, Ghent, Belgium, September 14-18, 2020, Proceedings, Part V

Bibliographic Details
Main Authors Dong, Yuxiao; Ifrim, Georgiana; Mladenić, Dunja; Saunders, Craig; Van Hoecke, Sofie
Format eBook
Language English
Published Cham: Springer International Publishing AG, 2021
Edition 1

Table of Contents:
  • B Choosing Hyperparameter -- References -- Multi-future Merchant Transaction Prediction -- 1 Introduction -- 2 Notations and Problem Definition -- 3 Model Architecture -- 3.1 Shape Sub-network -- 3.2 Scale Sub-network -- 4 Training Algorithm -- 5 Experiments -- 5.1 Description of the Datasets -- 5.2 Evaluation of Architecture Design Choice -- 5.3 Evaluation of the Multi-future Learning Scheme -- 6 Related Work -- 7 Conclusion -- References -- Think Out of the Package: Recommending Package Types for E-Commerce Shipments -- 1 Introduction -- 1.1 Contributions -- 2 Related Work -- 2.1 Existing Packaging Selection Process -- 2.2 Why Not Ordinal Regression? -- 2.3 Comparison with Standard Machine Learning Approaches for Package Planning -- 3 Two-Stage Approach for Optimal Package Selection -- 3.1 Stage 1: Estimating the Transit Damage Probability of a Product Given a Package Type -- 3.2 Stage 2: Identifying the Optimal Package Type for Each Product -- 4 Experimental Results -- 4.1 Calibration -- 4.2 Package Type Recommendation -- 4.3 Impact Analysis from Actual Shipment Data -- 5 Conclusion and Future Work -- References -- Topics in Financial Filings and Bankruptcy Prediction with Distributed Representations of Textual Data -- 1 Introduction -- 2 Literature Reviews -- 3 Data and Methods -- 3.1 Data -- 3.2 Latent Dirichlet Allocation -- 3.3 Topic Modelling in Embedding Spaces -- 3.4 Number of Topics Assessments -- 3.5 Bankruptcy Prediction Feature Sets -- 3.6 Experimental Setup -- 4 Experimental Results and Discussions -- 4.1 Number of Topics in MD&A -- 4.2 Topics in MD&A and Their Evolution -- 4.3 Predictive Performance -- 5 Conclusions -- References -- Why Did My Consumer Shop? Learning an Efficient Distance Metric for Retailer Transaction Data -- 1 Introduction -- 2 A Framework for Learning the Distance Metric
  • Intro -- Preface -- Organization -- Contents - Part V -- Applied Data Science: Social Good -- Confound Removal and Normalization in Practice: A Neuroimaging Based Sex Prediction Case Study -- 1 Introduction -- 2 Sex Classification and Brain Size -- 3 Experimental Setup -- 3.1 Study Design -- 3.2 Confound Regression -- 3.3 Predictive Modelling -- 4 Data Samples and Features -- 4.1 Data Samples -- 4.2 Pre-processing and Feature Extraction -- 5 Results -- 5.1 Generalization Performance Estimates -- 5.2 Predictive Features -- 5.3 Out-of-Sample Performance -- 6 Conclusion -- References -- Energy Consumption Forecasting Using a Stacked Nonparametric Bayesian Approach -- 1 Introduction -- 1.1 Energy Forecasting Across Major Australian States -- 1.2 Challenges and Related Work -- 1.3 Contribution of Our Approach -- 2 Overview of Gaussian Process -- 3 Proposed Data Analytic Model for Energy Prediction -- 3.1 Feature Selection -- 3.2 Modelling for Energy Forecasting -- 4 Experimental Setup -- 4.1 Data Description and Preparation -- 4.2 Baselines and Other Machine Learning Models Used for Comparison -- 5 Results and Discussion -- 6 Conclusion -- References -- Reconstructing the Past: Applying Deep Learning to Reconstruct Pottery from Thousands Shards -- 1 Introduction -- 2 Related Work -- 3 Our Approach -- 3.1 Dataset Generation Method -- 3.2 Our Proposed Model -- 4 Experimental Setup -- 5 Result -- 6 Discussions and Limitations -- 7 Conclusion -- References -- CrimeForecaster: Crime Prediction by Exploiting the Geographical Neighborhoods' Spatiotemporal Dependencies -- 1 Introduction -- 2 Datasets -- 3 Problem Definition -- 4 Methodology -- 4.1 CrimeForecaster Framework Overview -- 5 Experiment -- 5.1 CrimeForecaster Experiment Data and Setup -- 5.2 Performance Comparison -- 5.3 Parameter Study -- 6 Related Work -- 7 Conclusion -- References
  • 4.3 Lagrangian Dual Framework for Constrained Predictors -- 5 Experiments -- 5.1 Constrained Optimization Problems -- 5.2 Constrained Predictor Problems -- 6 Related Work -- 7 Conclusions -- References -- Applied Data Science: Healthcare -- Few-Shot Microscopy Image Cell Segmentation -- 1 Introduction -- 2 Related Work -- 2.1 Cell Segmentation -- 2.2 Few-Shot Learning -- 3 Few-Shot Cell Segmentation -- 3.1 Problem Definition -- 3.2 Few-Shot Meta-learning Approach -- 3.3 Meta-learning Algorithm -- 3.4 Task Objective Functions -- 3.5 Fine-Tuning -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Microscopy Image Databases -- 4.3 Assessment Protocol -- 4.4 Results Discussion -- 5 Conclusion -- References -- Deep Reinforcement Learning for Large-Scale Epidemic Control -- 1 Introduction -- 2 Related Work -- 3 Epidemiological Model -- 3.1 Intra-patch Model -- 3.2 Inter-patch Model -- 3.3 Calibration and Validation -- 4 Learning Environment -- 5 PPO Versus Ground Truth -- 6 Multi-district Reinforcement Learning -- 7 Discussion -- References -- GLUECK: Growth Pattern Learning for Unsupervised Extraction of Cancer Kinetics -- 1 Background -- 1.1 Tumor Growth and Its Implications -- 1.2 Mechanistic Models of Tumor Growth -- 1.3 Predictive Models of Tumor Growth -- 1.4 Peculiarities of Tumor Growth Data -- 2 Materials and Methods -- 2.1 Introducing GLUECK -- 2.2 Datasets -- 2.3 Procedures -- 3 Results -- 4 Conclusion -- References -- Automated Integration of Genomic Metadata with Sequence-to-Sequence Models -- 1 Introduction -- 2 Related Work -- 3 Approaches -- 3.1 Multi-label Classification Approach -- 3.2 Translation-Based Approach -- 4 Experiments -- 4.1 Datasets: GEO, Cistrome and ENCODE -- 4.2 Experimental Setup -- 4.3 Experiments 1 and 2 -- 4.4 Experiment 3: Randomly Chosen GEO Instances -- 5 Conclusions and Future Work -- References
  • 2.1 Weighing a Product Hierarchy
  • PS3: Partition-Based Skew-Specialized Sampling for Batch Mode Active Learning in Imbalanced Text Data -- 1 Introduction -- 2 Related Work -- 3 PS3: Partition-Based Skew-Specialized Sampling for Batch-Mode Active Learning -- 3.1 Batch-Mode Imbalance Learning -- 3.2 Human in the Loop - Assessment of Labeling Effort for Hate-Speech Detection -- 4 Experimental Methodology -- 4.1 Datasets -- 4.2 Baseline Methods -- 4.3 Evaluation Criteria -- 5 Results and Discussion -- 5.1 Performance Evaluation -- 5.2 Computational Time Analysis -- 6 Conclusion and Future Direction -- References -- An Uncertainty-Based Human-in-the-Loop System for Industrial Tool Wear Analysis -- 1 Introduction -- 2 Foundations and Related Work -- 3 Methodology -- 3.1 Image Segmentation -- 3.2 Model: Dropout U-Net -- 3.3 Loss Function and Performance Evaluation Metric -- 3.4 Uncertainty Estimation -- 4 Experiments -- 4.1 Datasets, Preprocessing and Training Procedure -- 4.2 Performance Results -- 4.3 Evaluation: Uncertainty-Based Human-in-the-Loop System -- 5 Discussion and Outlook -- References -- Filling Gaps in Micro-meteorological Data -- 1 Introduction -- 2 Related Work -- 3 Filling Gaps -- 3.1 Architecture -- 3.2 Feed Forward Layer and Copy Task -- 3.3 Positional Encoding -- 3.4 Training -- 4 Experiments -- 4.1 Datasets -- 4.2 Evaluation -- 4.3 Neural Networks and Training -- 5 Results -- 5.1 Toy Case -- 5.2 Evapotranspiration Data -- 6 Conclusion -- References -- Lagrangian Duality for Constrained Deep Learning -- 1 Introduction -- 2 Preliminaries: Lagrangian Duality -- 3 Learning Constrained Optimization Problems -- 3.1 Motivating Applications -- 3.2 The Learning Task -- 3.3 Lagrangian Dual Framework for Constrained Optimization Problems -- 4 Learning Constrained Predictors -- 4.1 Motivating Applications -- 4.2 The Learning Task
  • Explaining End-to-End ECG Automated Diagnosis Using Contextual Features -- 1 Introduction -- 2 Related Works -- 3 Methodology -- 3.1 Case Study Method -- 3.2 Segmentation-Based Noise Insertion -- 4 Contextual Features for a Convolutional Network -- 4.1 Model Description -- 4.2 Model Evaluation -- 5 Discussion -- 6 Conclusions -- References -- Applied Data Science: E-Commerce and Finance -- A Deep Reinforcement Learning Framework for Optimal Trade Execution -- 1 Introduction -- 2 Limit Order Book and Market Microstructure -- 3 A DQN Formulation to Optimal Trade Execution -- 3.1 Preliminaries -- 3.2 Problem Formulation -- 3.3 DQN Architecture and Extensions -- 3.4 Experimental Methodology and Settings -- 4 Experimental Results -- 4.1 Data Sources -- 4.2 Training and Stability -- 4.3 Main Evaluation and Backtesting -- 5 Conclusion and Future Work -- A Hyperparameters -- References -- Detecting and Predicting Evidences of Insider Trading in the Brazilian Market -- 1 Introduction -- 2 Literature Review -- 3 Dataset Preparation -- 3.1 Impactful Events in 2017 -- 3.2 Classifying Possible Evidences of Insider Trading -- 3.3 Expanding the Dataset for 2018 -- 4 Recognising Suspicious Trades Before Events Unfold -- 4.1 Monitoring Negotiations -- 4.2 Predicting Relevant Events -- 5 Conclusion and Future Work -- References -- Mend the Learning Approach, Not the Data: Insights for Ranking E-Commerce Products -- 1 Introduction -- 2 Related Work -- 3 E-Com Dataset for LTR -- 3.1 Need of a New Dataset -- 3.2 Scope of the Dataset -- 3.3 Dataset Construction -- 4 Problem Formulation -- 5 Experiments -- 5.1 Experimental Setup -- 5.2 Comparison of CRM and Full-Info Approaches (RQ1) -- 5.3 Learning Progress with Increasing Number of Bandit Feedback (RQ2) -- 5.4 Effect of the DNN Architecture (RQ3) -- 6 Conclusion -- A Comparison of Counterfactual Risk Estimators