Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track: European Conference, ECML PKDD 2020, Ghent, Belgium, September 14-18, 2020, Proceedings, Part V

Bibliographic Details
Main Authors	Dong, Yuxiao, Ifrim, Georgiana, Mladenić, Dunja, Saunders, Craig, Van Hoecke, Sofie
Format	eBook
Language	English
Published	Cham Springer International Publishing AG 2021
Edition	1
Subjects	Data mining-Congresses; Machine learning-Congresses

Table of Contents:

B Choosing Hyperparameter -- References -- Multi-future Merchant Transaction Prediction -- 1 Introduction -- 2 Notations and Problem Definition -- 3 Model Architecture -- 3.1 Shape Sub-network -- 3.2 Scale Sub-network -- 4 Training Algorithm -- 5 Experiments -- 5.1 Description of the Datasets -- 5.2 Evaluation of Architecture Design Choice -- 5.3 Evaluation of the Multi-future Learning Scheme -- 6 Related Work -- 7 Conclusion -- References -- Think Out of the Package: Recommending Package Types for E-Commerce Shipments -- 1 Introduction -- 1.1 Contributions -- 2 Related Work -- 2.1 Existing Packaging Selection Process -- 2.2 Why Not Ordinal Regression? -- 2.3 Comparison with Standard Machine Learning Approaches for Package Planning -- 3 Two-Stage Approach for Optimal Package Selection -- 3.1 Stage 1: Estimating the Transit Damage Probability of a Product Given a Package Type -- 3.2 Stage 2: Identifying the Optimal Package Type for Each Product -- 4 Experimental Results -- 4.1 Calibration -- 4.2 Package Type Recommendation -- 4.3 Impact Analysis from Actual Shipment Data -- 5 Conclusion and Future Work -- References -- Topics in Financial Filings and Bankruptcy Prediction with Distributed Representations of Textual Data -- 1 Introduction -- 2 Literature Reviews -- 3 Data and Methods -- 3.1 Data -- 3.2 Latent Dirichlet Allocation -- 3.3 Topic Modelling in Embedding Spaces -- 3.4 Number of Topics Assessments -- 3.5 Bankruptcy Prediction Feature Sets -- 3.6 Experimental Setup -- 4 Experimental Results and Discussions -- 4.1 Number of Topics in MD&A -- 4.2 Topics in MD&A and Their Evolution -- 4.3 Predictive Performance -- 5 Conclusions -- References -- Why Did My Consumer Shop? Learning an Efficient Distance Metric for Retailer Transaction Data -- 1 Introduction -- 2 A Framework for Learning the Distance Metric
Intro -- Preface -- Organization -- Contents - Part V -- Applied Data Science: Social Good -- Confound Removal and Normalization in Practice: A Neuroimaging Based Sex Prediction Case Study -- 1 Introduction -- 2 Sex Classification and Brain Size -- 3 Experimental Setup -- 3.1 Study Design -- 3.2 Confound Regression -- 3.3 Predictive Modelling -- 4 Data Samples and Features -- 4.1 Data Samples -- 4.2 Pre-processing and Feature Extraction -- 5 Results -- 5.1 Generalization Performance Estimates -- 5.2 Predictive Features -- 5.3 Out-of-Sample Performance -- 6 Conclusion -- References -- Energy Consumption Forecasting Using a Stacked Nonparametric Bayesian Approach -- 1 Introduction -- 1.1 Energy Forecasting Across Major Australian States -- 1.2 Challenges and Related Work -- 1.3 Contribution of Our Approach -- 2 Overview of Gaussian Process -- 3 Proposed Data Analytic Model for Energy Prediction -- 3.1 Feature Selection -- 3.2 Modelling for Energy Forecasting -- 4 Experimental Setup -- 4.1 Data Description and Preparation -- 4.2 Baselines and Other Machine Learning Models Used for Comparison -- 5 Results and Discussion -- 6 Conclusion -- References -- Reconstructing the Past: Applying Deep Learning to Reconstruct Pottery from Thousands Shards -- 1 Introduction -- 2 Related Work -- 3 Our Approach -- 3.1 Dataset Generation Method -- 3.2 Our Proposed Model -- 4 Experimental Setup -- 5 Result -- 6 Discussions and Limitations -- 7 Conclusion -- References -- CrimeForecaster: Crime Prediction by Exploiting the Geographical Neighborhoods' Spatiotemporal Dependencies -- 1 Introduction -- 2 Datasets -- 3 Problem Definition -- 4 Methodology -- 4.1 CrimeForecaster Framework Overview -- 5 Experiment -- 5.1 CrimeForecaster Experiment Data and Setup -- 5.2 Performance Comparison -- 5.3 Parameter Study -- 6 Related Work -- 7 Conclusion -- References
4.3 Lagrangian Dual Framework for Constrained Predictors -- 5 Experiments -- 5.1 Constrained Optimization Problems -- 5.2 Constrained Predictor Problems -- 6 Related Work -- 7 Conclusions -- References -- Applied Data Science: Healthcare -- Few-Shot Microscopy Image Cell Segmentation -- 1 Introduction -- 2 Related Work -- 2.1 Cell Segmentation -- 2.2 Few-Shot Learning -- 3 Few-Shot Cell Segmentation -- 3.1 Problem Definition -- 3.2 Few-Shot Meta-learning Approach -- 3.3 Meta-learning Algorithm -- 3.4 Task Objective Functions -- 3.5 Fine-Tuning -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Microscopy Image Databases -- 4.3 Assessment Protocol -- 4.4 Results Discussion -- 5 Conclusion -- References -- Deep Reinforcement Learning for Large-Scale Epidemic Control -- 1 Introduction -- 2 Related Work -- 3 Epidemiological Model -- 3.1 Intra-patch Model -- 3.2 Inter-patch Model -- 3.3 Calibration and Validation -- 4 Learning Environment -- 5 PPO Versus Ground Truth -- 6 Multi-district Reinforcement Learning -- 7 Discussion -- References -- GLUECK: Growth Pattern Learning for Unsupervised Extraction of Cancer Kinetics -- 1 Background -- 1.1 Tumor Growth and Its Implications -- 1.2 Mechanistic Models of Tumor Growth -- 1.3 Predictive Models of Tumor Growth -- 1.4 Peculiarities of Tumor Growth Data -- 2 Materials and Methods -- 2.1 Introducing GLUECK -- 2.2 Datasets -- 2.3 Procedures -- 3 Results -- 4 Conclusion -- References -- Automated Integration of Genomic Metadata with Sequence-to-Sequence Models -- 1 Introduction -- 2 Related Work -- 3 Approaches -- 3.1 Multi-label Classification Approach -- 3.2 Translation-Based Approach -- 4 Experiments -- 4.1 Datasets: GEO, Cistrome and ENCODE -- 4.2 Experimental Setup -- 4.3 Experiments 1 and 2 -- 4.4 Experiment 3: Randomly Chosen GEO Instances -- 5 Conclusions and Future Work -- References
2.1 Weighing a Product Hierarchy
PS3: Partition-Based Skew-Specialized Sampling for Batch Mode Active Learning in Imbalanced Text Data -- 1 Introduction -- 2 Related Work -- 3 PS3: Partition-Based Skew-Specialized Sampling for Batch-Mode Active Learning -- 3.1 Batch-Mode Imbalance Learning -- 3.2 Human in the Loop - Assessment of Labeling Effort for Hate-Speech Detection -- 4 Experimental Methodology -- 4.1 Datasets -- 4.2 Baseline Methods -- 4.3 Evaluation Criteria -- 5 Results and Discussion -- 5.1 Performance Evaluation -- 5.2 Computational Time Analysis -- 6 Conclusion and Future Direction -- References -- An Uncertainty-Based Human-in-the-Loop System for Industrial Tool Wear Analysis -- 1 Introduction -- 2 Foundations and Related Work -- 3 Methodology -- 3.1 Image Segmentation -- 3.2 Model: Dropout U-Net -- 3.3 Loss Function and Performance Evaluation Metric -- 3.4 Uncertainty Estimation -- 4 Experiments -- 4.1 Datasets, Preprocessing and Training Procedure -- 4.2 Performance Results -- 4.3 Evaluation: Uncertainty-Based Human-in-the-Loop System -- 5 Discussion and Outlook -- References -- Filling Gaps in Micro-meteorological Data -- 1 Introduction -- 2 Related Work -- 3 Filling Gaps -- 3.1 Architecture -- 3.2 Feed Forward Layer and Copy Task -- 3.3 Positional Encoding -- 3.4 Training -- 4 Experiments -- 4.1 Datasets -- 4.2 Evaluation -- 4.3 Neural Networks and Training -- 5 Results -- 5.1 Toy Case -- 5.2 Evapotranspiration Data -- 6 Conclusion -- References -- Lagrangian Duality for Constrained Deep Learning -- 1 Introduction -- 2 Preliminaries: Lagrangian Duality -- 3 Learning Constrained Optimization Problems -- 3.1 Motivating Applications -- 3.2 The Learning Task -- 3.3 Lagrangian Dual Framework for Constrained Optimization Problems -- 4 Learning Constrained Predictors -- 4.1 Motivating Applications -- 4.2 The Learning Task
Explaining End-to-End ECG Automated Diagnosis Using Contextual Features -- 1 Introduction -- 2 Related Works -- 3 Methodology -- 3.1 Case Study Method -- 3.2 Segmentation-Based Noise Insertion -- 4 Contextual Features for a Convolutional Network -- 4.1 Model Description -- 4.2 Model Evaluation -- 5 Discussion -- 6 Conclusions -- References -- Applied Data Science: E-Commerce and Finance -- A Deep Reinforcement Learning Framework for Optimal Trade Execution -- 1 Introduction -- 2 Limit Order Book and Market Microstructure -- 3 A DQN Formulation to Optimal Trade Execution -- 3.1 Preliminaries -- 3.2 Problem Formulation -- 3.3 DQN Architecture and Extensions -- 3.4 Experimental Methodology and Settings -- 4 Experimental Results -- 4.1 Data Sources -- 4.2 Training and Stability -- 4.3 Main Evaluation and Backtesting -- 5 Conclusion and Future Work -- A Hyperparameters -- References -- Detecting and Predicting Evidences of Insider Trading in the Brazilian Market -- 1 Introduction -- 2 Literature Review -- 3 Dataset Preparation -- 3.1 Impactful Events in 2017 -- 3.2 Classifying Possible Evidences of Insider Trading -- 3.3 Expanding the Dataset for 2018 -- 4 Recognising Suspicious Trades Before Events Unfold -- 4.1 Monitoring Negotiations -- 4.2 Predicting Relevant Events -- 5 Conclusion and Future Work -- References -- Mend the Learning Approach, Not the Data: Insights for Ranking E-Commerce Products -- 1 Introduction -- 2 Related Work -- 3 E-Com Dataset for LTR -- 3.1 Need of a New Dataset -- 3.2 Scope of the Dataset -- 3.3 Dataset Construction -- 4 Problem Formulation -- 5 Experiments -- 5.1 Experimental Setup -- 5.2 Comparison of CRM and Full-Info Approaches (RQ1) -- 5.3 Learning Progress with Increasing Number of Bandit Feedback (RQ2) -- 5.4 Effect of the DNN Architecture (RQ3) -- 6 Conclusion -- A Comparison of Counterfactual Risk Estimators