Data Quality and Model Under-Specification Issues
Nowadays, we are witnessing an increasing demand in both industry and academia for exploiting Deep Learning (DL) to solve complex real-world problems. However, the performance of these high-capacity learners is currently bounded by the quality and volume of their underlying training data. The use of incomplete, erroneous, or inappropriate training data, together with poor data management practices in a training pipeline, often results in unreliable, biased, or under-specified models. In this talk, I will report on recent research that we have conducted to identify best practices of data management for DL. I will also report on recent techniques and tools that we have developed to help detect the root cause of model under-specification issues early in a DL training process.
Foutse Khomh is a Full Professor, a Canada CIFAR AI Chair, and FRQ-IVADO Research Chair at Polytechnique Montréal, where he heads the SWAT Lab (http://swat.polymtl.ca/). He received a Ph.D. in Software Engineering from the University of Montreal in 2011. His research interests include software maintenance and evolution, cloud engineering, machine learning systems engineering, empirical software engineering, software analytics, and dependable and trustworthy AI/ML. He has published over 180 conference and journal papers. His work has received four ten-year Most Influential Paper (MIP) Awards and six Best/Distinguished Paper Awards. He has served on the program committees of several international conferences, including ICSE, FSE, ICSM(E), SANER, MSR, ICPC, SCAM, and ESEM, and has reviewed for top international journals such as SQJ, JSS, EMSE, TSE, and TOSEM. He was program chair for Satellite Events at SANER 2015; program co-chair of SCAM 2015, ICSME 2018, PROMISE 2019, and ICPC 2019; general chair of ICPC 2018 and SCAM 2020; and general co-chair of SANER 2020. He initiated and co-organized the Software Engineering for Machine Learning Applications (SEMLA) symposium. He is one of the organizers of the RELENG workshop series (http://releng.polymtl.ca) and an Associate Editor for IEEE Software, EMSE, and JSEP.
Thu 17 Nov (displayed time zone: Beijing, Chongqing, Hong Kong, Urumqi)
14:00 - 15:30
14:00 | 60m | Keynote | Data Quality and Model Under-Specification Issues | SEA4DQ | Foutse Khomh (Polytechnique Montréal)
15:00 | 15m | Paper | Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production | SEA4DQ | Maryna Waszak (SINTEF), Terje Moen (SINTEF), Sølve Eidnes (SINTEF), Alexander Stasik (SINTEF), Anders Hansen (SINTEF), Gregory Bouquet (SINTEF), Antoine Pultier (SINTEF), Xiang Ma (SINTEF), Idar Tørlen (Elkem), Bjørn Rune Henriksen (Elkem), Arianeh Aamodt (Elkem), Dumitru Roman (SINTEF)
15:15 | 15m | Talk | InterQ Research Project Presentation | SEA4DQ | Nicolas Jourdan (Technical University of Darmstadt)