Maliheh (Mali) Izadi, PhD, Andrea Di Sorbo, and Sebastiano Panichella co-chaired the 3rd Intl. Workshop on NL-based Software Engineering, held on April 20, 2024, in Lisbon, Portugal.
1. NL-based Software Engineering (NLBSE) '24
April 20th, 2024, 09:15 - 17:35 UTC+1
https://nlbse2024.github.io/
@NLBSE_workshop
Sebastiano Panichella
Andrea Di Sorbo
Maliheh Izadi
2. General format
Presentations:
Live presentations
In-person: Rooms Maria Helena Vieira da Silva & Glicínia Quartin
Questions:
After each presentation, we have allocated time for Q&A.
Participants:
- We encourage you to ask the speakers questions
Mali
5. Thanks to our sponsors!
IEEE Technical Community on Software Engineering
(TCSE)
ACM Special Interest Group on Software Engineering
(SIGSOFT)
Andrea
6. Schedule (UTC+1)
09:15 - 09:30 → Opening
09:30 - 10:30 → Live Keynote: "Neuro-Symbolic Developer Tools for Analyzing,
Executing, and Repairing Code" by Michael Pradel
10:30 - 11:00 → Break
Sebastiano
Plenary Session (Maria Helena Vieira da Silva)
https://conf.researchr.org/program/icse-2024/program-icse-2024/ https://nlbse2024.github.io/
7. Schedule (UTC+1)
Morning Parallel Session - Maria Helena Vieira da Silva
11:00 - 12:10 → Research Session: Language and Code Dynamics
Morning Parallel Session - Glicínia Quartin
11:00 - 12:30 → Tool Competition
→ Opening (11:00 - 11:15)
→ Tool Paper Presentations (11:15 - 12:15)
→ Closing (12:15 - 12:30)
12:30 - 14:15 → Lunch
Sebastiano
Parallel Sessions
8. Schedule (UTC+1)
14:15 - 15:30 → Discussion Panel - Challenges and Opportunities of LLMs
14:15 - 14:20 → Panel Introduction
14:20 - 14:35 → Presentation of a Book on Security for LLMs
by Andrei Kucharavy
14:35 - 15:30 → Panel discussion
15:30 - 16:00 → Coffee Break
16:00 - 17:25 → Research Session: Frontiers of Collaborative Development
17:25 - 17:35 → Closing and Awards
Sebastiano
Plenary Session (Maria Helena Vieira da Silva)
9. Keynote (09:30)
Neuro-Symbolic Developer Tools for Analyzing,
Executing, and Repairing Code
Developer productivity and software quality critically depend on
effective software development tools. Traditional, symbolic
program analysis tools are often limited in their ability to
understand developer intention and rely on various hand-crafted
heuristics. Neural software analysis addresses these limitations, but
remains unaware of the formal semantics of a program and hence
easily misses facts and rules that are actually well known. This talk
argues that carefully combining neural and symbolic reasoning
provides an effective means to address various challenging
software development problems. To illustrate this point, I will
describe our 8-year-long journey of creating neuro-symbolic
developer tools, ranging from learning-based bug detectors and
type predictors, to our most recent work on learning-guided
execution and program repair based on an autonomous
LLM-based agent. I will discuss lessons learned on this journey and
conclude with an outline of open challenges waiting to be
addressed in order to close the gap between symbolic and neural
software developer tools.
Bio:
Michael Pradel is a full professor at the
University of Stuttgart. His research interests
span software engineering, programming
languages, security, and machine learning, with
a focus on tools and techniques for building
reliable, efficient, and secure software. Michael
has been recognized through the Ernst-Denert
Software Engineering Award, an Emmy
Noether grant by the German Research
Foundation (DFG), an ERC Starting Grant,
best/distinguished paper awards at FSE (3x),
ISSTA, ASE, and ASPLOS, and by being named
an ACM Distinguished Member.
Mali
Michael Pradel
11. Research papers
Full papers (20 minutes):
- 15 minutes for talk
- 5 minutes for questions
Short papers (15 minutes):
- 10 minutes for talk
- 5 minutes for questions
Tool papers (10 minutes):
- 7 minutes for talk
- 3 minutes for questions
Sebastiano
12. Session 1 - Language and Code Dynamics: schedule (11:00)
Aligning Programming Language and Natural Language: Exploring Design Choices in Multi-Modal Transformer-Based Embedding for Bug Localization (full)
Partha Chakraborty, Venkatraman Arumugam and Meiyappan Nagappan (University of Waterloo)
What’s in a Display Name? An Empirical Study on the Use of Display Names in Open-Source JUnit Tests (full)
Yining Qiao and José Miguel Rojas (University of Sheffield)
Software Vulnerability and Functionality Assessment using Large Language Models (short)
Rasmus Ingemann Tuffveson Jensen, Vali Tawosi and Salwa Alamir (J.P. Morgan AI Research)
Towards Automatic Translation of Machine Learning Visual Insights to Analytical Assertions (short)
Arumoy Shome, Luis Cruz and Arie van Deursen (Delft University of Technology)
Andrea/Seba
13. Tool competition (11:00) - Glicínia Quartin
Competition reports:
- 10 minutes per paper (including questions)
Tool Competition Co-chairs
Tool Chairs
14. Tool Competition schedule
Opening
Rafael Kallis [1], Giuseppe Colavito [2], Pooja Rani [3], Ali Al-Kaswan [4], Luca Pascarella [5] and Oscar Chaparro [6]
[1] Rafael Kallis Consulting, [2] University of Bari, [3] University of Zurich, [4] Delft University of Technology, [5] ETH Zurich, [6] College of William and Mary
Few-Shot Issue Report Classification with Adapters
Fahad Ebrahim and Mike Joy (University of Warwick)
Lessons from the NLBSE 2024 Competition: Towards Building Efficient Models for GitHub Issue Classification
Daniel Gómez-Barrera, Luccas Rojas Becerra, Juan Pinzón Roncancio, David Ortiz Almanza, Juan Arboleda, Mario Linares and Ruben Manrique (Universidad de los Andes)
ClassifAI: Automating Issue Reports Classification using Pre-Trained BERT (Bidirectional Encoder Representations from Transformers) Models
Khubaib Amjad Alam, Ashish Jumani, Harris Aamir and Muhammad Uzair (National University of Computer and Emerging Sciences)
Text-To-Text Generation for Issue Report Classification
Gokul Rejithkumar, Preethu Rose Anish and Smita Ghaisas (TCS Research)
Applying Large Language Models API to Issue Classification Problem
Gabriel Aracena [1], Kyle Luster [1], Fabio Santos [1], Igor Steinmacher [2] and Marco A. Gerosa [2]
[1] Grand Canyon University, [2] Northern Arizona University
Dopamin: Transformer-based Comment Classifiers through Domain Post-Training and Multi-level Layer Aggregation
Nam Le Hai [1] and Nghi D. Q. Bui [2]
[1] FPT Software AI Center, [2] Fulbright University
Closing
Rafael Kallis [1], Giuseppe Colavito [2], Pooja Rani [3], Ali Al-Kaswan [4], Luca Pascarella [5] and Oscar Chaparro [6]
[1] Rafael Kallis Consulting, [2] University of Bari, [3] University of Zurich, [4] Delft University of Technology, [5] ETH Zurich, [6] College of William and Mary
Tool Chairs
18. Session 2 - Frontiers of Collaborative Development: schedule (16:00)
Unveiling Disparities: NLP Analysis of Software Industry and Vocational Education Gaps (full)
Emil Bäckstrand [1], Rasmus Djupedal [1], Lena-Maria Öberg [1] and Francisco Gomes de Oliveira Neto [2]
[1] Mid Sweden University, [2] Chalmers and the University of Gothenburg
Towards LLM-Generated Code Tours for Onboarding (short)
Martin Balfroid, Benoît Vanderose and Xavier Devroey (NADI, University of Namur)
Automated Extraction of Compliance Elements in Software Engineering Contracts Using Natural Language Generation (short)
Gokul Rejithkumar, Preethu Rose Anish, Pratik Sonar and Smita Ghaisas (TCS Research)
Emotion Classification In Software Engineering Texts: A Comparative Analysis of Pre-trained Transformers Language Models (full)
Mia Mohammad Imran (Virginia Commonwealth University)
Understanding Emojis :) in Useful Code Review Comments (short) (video)
Sharif Ahmed and Nasir Eisty (Boise State University)
Sebastiano
22. Thanks to: our Keynote Speaker
for giving an enlightening and instructive talk!
Andrea
23. Thanks to: the Tool Competition Co-chairs
Rafael Kallis, Giuseppe Colavito, Pooja Rani, Luca Pascarella, Oscar Chaparro and Ali Al-Kaswan
for organizing two exciting and relevant tool competitions!
Andrea
25. Thanks to: our Web Chairs
Arnaldo Sgueglia and Tamara Toma
for their support with the website!
Sebastiano
26. Thanks to: the Program Committee members
for their support in reviewing papers!
Sebastiano
27. What’s Next?
Special issue at Science of Computer Programming 2024:
“NLBSE’24: Natural Language-based Software to
Support Software Engineering Processes”
Open Call!
Short papers with a strong focus on software and replication packages
Submission date: October 1st, 2024
The workshop slides will be made available on the web page
A link to the submission page will be posted on Twitter and the NLBSE web page
Sebastiano
28. What’s Next?
• Coordinate with similar workshops (e.g., LLM4Code) to continuously promote
research in the field.
• Organize a symposium exploring the intersection of language and code
models to foster knowledge sharing and community growth.
• Involve more industrial subjects and practitioners.
• Promote discussion around current and relevant themes (e.g., AI language
models) and new competitions in other relevant NLBSE areas.
• Encourage the design, implementation, and public availability of usable and
high-quality tools to deal with NLBSE-related challenges.
• We are generally open to ideas for new NLBSE tool competitions/challenges
(contact us)!
Sebastiano
29. Thank you all for participating!
See you next year in Ottawa
at NLBSE 2025!