Master Thesis Topic Bank
Welcome to the Master Thesis Topic Bank at GPT-Lab! This resource provides a curated list of thesis topics for master’s students at Tampere University who wish to conduct their thesis in the fields of artificial intelligence and software engineering. Here, you’ll find a variety of predefined topics that reflect the cutting-edge research happening at GPT-Lab. Each topic is designed to align with our core focus areas, providing you with opportunities to contribute to innovative projects.
You are encouraged to select a topic from the list or propose your own in the registration form, provided it falls within the scope of AI and software engineering. Whether you’re passionate about natural language processing, machine learning, or software development, you’ll find exciting opportunities that match your academic and research interests.
For those proposing their own topics, please ensure they meet the criteria and are relevant to our ongoing research themes.
How It Works:
– Browse Topics: Review the list of available thesis topics, each with a brief description.
– Select or Propose: Choose a topic or propose your own idea during the application process.
– Get Started: Once your application is accepted, you will be paired with an advisor to begin your research journey.
Disclaimer: For the best viewing experience, please access this table on a desktop device.
Last updated: 23.10.2025
Status | Thesis topic | Description | Supervisor/Thesis advisor |
---|---|---|---|
Available | From RPA Code to BPMN Process Diagrams | This thesis investigates methods for automatically transforming robotic process automation (RPA) code into BPMN process diagrams to improve process transparency, maintainability, and integration with business process management tools. The work involves analysing RPA scripts, defining mappings to BPMN constructs, and prototyping a converter. The topic comes from River IT, a Seinäjoki-based IT consulting company, and the details of the collaboration can be agreed upon. Prerequisites include solid programming skills, understanding of RPA platforms and BPMN modelling, and the ability to read and formalise code-to-model transformations. | Dr. Jussi Rasku |
Available | AI-BASS Real Options Subsystem for Startup Decision Modeling | This thesis focuses on developing a microservice-based subsystem within the AI-BASS platform to help startups make smarter investment and resource allocation decisions. The system will identify decision points during AI-BASS chat sessions, model them as real options, and visualize their dependencies to support data-driven decision-making. The student should be fluent in Python and LLM APIs, and comfortable with microservice architecture and MongoDB. They must understand, or be willing to learn, basic decision analysis and real options theory, and be able to translate these into computational models. Experience with data integration, network models, visualization frameworks, and applied AI systems is essential. The work requires technical autonomy, analytical rigor, and the ability to validate and document a novel subsystem through prototype development. | Dr. Jussi Rasku, Mikko Auranen |
Available | AI-Supported Strategy in Software Companies: A Qualitative Mapping of Practices and Implications | This thesis explores how artificial intelligence is used in strategic planning and decision-making within software companies. The student will review academic and industry sources, analyze patterns in how AI supports strategic processes, and develop a conceptual framework linking AI-assisted strategy to decision-making quality and outcomes. The student should understand strategic management and AI in business, be able to conduct systematic or grounded literature analysis using qualitative or text analysis tools such as NVivo, Atlas.ti, or Python, and be capable of synthesizing findings into a conceptual framework. Strong academic writing skills and independent analytical thinking are essential. | Dr. Jussi Rasku, Mikko Auranen |
Available | Adaptive Language Models for Game Rule Generation and Quality Evaluation | This thesis will explore how language models can be adapted to generate and evaluate novel game rules in the context of both card and abstract board games. Inspired by Browne and Maire's work on evolutionary game design, the research aims to create a simulation environment where language models generate game rules and assess the "quality" of these games based on parameters like drama, uncertainty, and playability. The models will be benchmarked against a suite of metrics to optimize for engaging gameplay, informed by principles from recent studies in game design and AGI experimentation (e.g., ARC-AGI, AlphaCode). Prerequisites: Proficiency in LLM use and APIs, experience with Python, and an interest in evolutionary algorithms. Familiarity with game design concepts is an advantage. | Dr. Jussi Rasku |
Available | Smart and Sustainable Energy Consumption Optimization with IoT and LLMs | Are you interested in developing AI-powered solutions for energy efficiency? This Master's thesis project focuses on creating an IoT-based energy management system where Large Language Models (LLMs) analyze and optimize energy consumption. The project could involve integrating energy meters and IoT sensors, such as temperature, humidity, and electricity usage monitors. It could also include developing AI-driven analytics to provide smart recommendations for optimizing energy consumption. Additionally, a user-friendly interface could be created for energy management and forecasting. By reducing energy consumption and improving decision-making, this project contributes to sustainability and the circular economy. If you are passionate about AI, IoT, and green technology, this is your chance to make an impact. Prerequisites: Python, LLMs, and IoT devices. Possibility to use/implement APIs from https://github.com/Green-Software-Foundation. | Dr. Mika Saari |
Available | Evaluation Agent for Fact-Checking in RAG Systems | This thesis focuses on creating an evaluation agent: a separate AI component that reviews the generated answer, checks it against the original documents, and determines whether each part is factually supported. The student will design and implement this agent, test it in one or more domains such as software engineering, healthcare, agriculture, or governance, and measure how well it can detect unsupported claims compared to human evaluation (a minimal illustrative sketch of such an agent follows the table). The final outcome will be a functional prototype and a clear understanding of how such agents can improve the reliability and trustworthiness of RAG systems. Prerequisites: Python programming skills, basic knowledge of ML with natural language processing tools such as Hugging Face Transformers, familiarity with information retrieval concepts and vector databases, ability to work with LLM APIs, and basic skills in text data processing. | Toufique Hasan |
Available | Optimizing RAG Pipelines for Real-Time Query Processing in Software Engineering | Retrieval-Augmented Generation (RAG) helps AI models provide more accurate and context-aware answers by pulling in relevant external information. In software engineering, quick and reliable query responses are crucial for tasks like debugging, code generation, and documentation retrieval. This thesis will focus on optimizing RAG pipelines to improve speed, accuracy, and efficiency in real-time query processing. You’ll explore ways to reduce latency using caching, fine-tune vector search, and optimize LLM prompts. The goal is to develop a more efficient RAG system tailored for software engineering use cases. Prerequisites: Python programming skills. Basic understanding of NLP and LLMs. Familiarity with vector databases. | Toufique Hasan |
Available | Gameful LLM-Based Multi-Agent Systems | This thesis aims to assess which gameful elements can effectively influence a multi-agent system's performance in coding tasks. By introducing gameful elements into a multi-agent system, we hope to encourage collaboration (or competition), leading to more refined output. Prerequisites: Familiarity with how ML/LLM models work (training, evaluation, datasets, etc.). Willingness to learn or experience with interviews, thematic analysis, or survey design. | José Siqueira de Cerqueira |
Available | Can I Trust This Model? A Study on How Documentation Shapes Trust in Open-Source AI | This thesis explores how open-source AI model documentation, especially model cards, influences developers’ trust in the models themselves. As developers rely on documentation to assess risks and reliability, the study investigates which elements (e.g., training data, use cases, benchmarks, limitations, licensing) most impact perceived trust. Through semi-structured interviews, it will assess how developers interpret documentation. The study also examines whether transparency and explainability increase model trust, and whether developers distinguish between trusting the documentation versus the model. Findings will inform better documentation practices to support trustworthy AI development. Prerequisites: Familiarity with how ML/LLM models work (training, evaluation, datasets, etc.). Willingness to learn or experience with interviews, thematic analysis, or survey design. | José Siqueira de Cerqueira |
Available | Continuous Data Integration in RAG Systems | Continuous Data Integration (CDI) in Retrieval-Augmented Generation (RAG) systems refers to the real-time, automatic update and synchronization of the data sources used for retrieval, which is then leveraged to improve the generative model's responses. Prerequisites: Experience with data engineering and LLMs, with an understanding of RAG. | Ayman Khan |
Available | Automation of assignment evaluation using AI | AI can be effectively used as a teaching aid for evaluating student assignments. As a teacher, I have participated in such experiments, and now there is an opportunity to offer similar projects as master's thesis topics for students. The topic can be approached through practical experiments in courses or by examining how legislation and regulations impact the implementation of such solutions. Prerequisites: Especially suitable for students who have previously worked as teaching assistants. | Dr. Mika Saari |
Available | Benchmarking Ethics in Code Generated by LLMs: A Literature Review | Research in this area primarily focuses on ensuring fairness, mitigating biases, and addressing the risks associated with hallucinations in generated code. Ethical benchmarking of code generated by LLMs is a complex and relatively new area of research that requires considering multiple dimensions, from fairness and bias mitigation to communication and contextual understanding. The student will conduct a comparative study of the existing literature on benchmarking ethics and may propose a benchmark of their own. Prerequisites: Understanding of the EU AI Act, LLMs, and benchmarking. | Ayman Khan |
Available | Impact of generated PlantUML diagrams on large language model generated code accuracy | The performance of an LLM generating or fixing code depends on the correct context in which the edits are made. Hence, the research question is: can the performance of the coder AI be improved by enriching the prompt with generated UML diagrams? The diagrams will be generated from relevant project source files and given to the LLM using a suitable DSL such as PlantUML (a minimal illustrative sketch follows the table). The feasibility of the approach is tested on SWE-bench, a subset of it, or some other coding benchmark. Prerequisites: The student has experience using coding LLMs such as GitHub Copilot, Qwen 2.5, Mistral, or DeepSeek Coder. Also, one should be fluent in creating and reading UML, know how to write PlantUML, and be interested in empirical testing of software tools. | Dr. Jussi Rasku |
Available | Reducing Hallucinations in LLMs Using Knowledge Graphs and RAG | This thesis will focus on a problem that many large language models (LLMs) have, which is providing incorrect or confusing information, known as hallucinations. To solve this, the study will explore how to use knowledge graphs and retrieval-augmented generation (RAG) methods together. Knowledge graphs are models that store facts in a structured way, helping LLMs base their answers on real information. RAG allows LLMs to pull relevant information from external sources before generating text. This research will study how well different types of knowledge graphs and RAG techniques reduce hallucinations and may suggest new ways to combine these methods for better results (a minimal illustrative sketch follows the table). The goal is to make LLMs more reliable and trustworthy in areas like customer service, content creation, and information access by addressing the hallucination problem. Prerequisites: This thesis assumes familiarity with AI concepts, particularly LLMs. Basic understanding of knowledge graphs and experience with information retrieval techniques, especially RAG, are beneficial. While no deep expertise is required, an analytical mindset and willingness to engage with these methods are crucial. | Toufique Hasan |
Available | Ethical issues in AI agents and multi-agent collaboration: a systematic literature review | As AI continues to progress, hypothetical future risks become the reality of today. AI agents, both lone agents and AI agents working collaboratively, present various promises across a wide variety of industries, but also various risks. Discussion on these systems has accelerated with recent advances in Generative AI (GenAI) and Large Language Models (LLMs). A study systematically reviewing this research, so as to gain an overview of ethical risks and mitigation measures already acknowledged in literature, presents a timely contribution to the field of AI ethics. | Dr. Kai-Kristian Kemell |
Available | Assessment of transparency in multi-modal generative models | Transparency in AI relates to approaches that make it possible for humans to understand the results of AI systems and the motivations or reasons behind such results. Research on transparency and explainability of generative AI solutions is in its infancy, with some work reported on the transparency of LLMs. A proposed thesis topic would focus on transparency approaches for multi-modal generative AI models. The preliminary study might survey what has been done concretely for such models, and the main topic would then be to propose and explore/evaluate a good candidate approach. See the following for a similar focus limited to LLMs: https://hdsr.mitpress.mit.edu/pub/aelql9qy/release/2 Prerequisites: Strong foundation in machine learning and deep learning, particularly generative models such as GANs, LLMs, and multi-modal networks, as well as familiarity with AI ethics, transparency, and explainability in AI systems. | Prof. Niklas Lavesson |
Available | Speech-to-text transcription of live audio by using an LLM | Live captioning of audio and video streams is a challenging and common problem faced in many use cases, for example news and sports broadcasts. Many of the current (automatic) solutions struggle with transcribing terms, places, and names. The use of modern LLM technologies is one of the more promising solutions to these challenges. Prerequisites: Knowledge of Python or a similar programming language used for implementing AI systems. Experience using REST APIs. Basic knowledge of how LLMs work. Knowledge of video and audio processing can be seen as an advantage. | Dr. Petri Rantanen |
Available | Investigating the Impact of LLM Quantization on Trustworthiness and Ethical Outcomes | This thesis will explore how quantization—a key method for compressing large language models (LLMs) to run on constrained hardware—affects the ethical behavior and trustworthiness of such models. The student will conduct an empirical study to determine whether quantization introduces or amplifies biases, reduces fairness across demographic groups, or compromises the reliability of LLM outputs in high-stakes use cases such as healthcare, finance, or education. The project will benchmark quantized vs. full-precision models using fairness and trustworthiness metrics (e.g., counterfactual fairness, demographic parity, calibration drift). The outcome will inform best practices for ethically deploying compressed LLMs. Prerequisites: Experience with Python and common LLM frameworks (e.g., Hugging Face, OpenAI, or similar). Interest in AI ethics and bias evaluation. Desired: familiarity with quantization and benchmarking. | José Siqueira de Cerqueira |
Available | Automated Software Quality Assurance Using LLMs | This thesis explores the use of Large Language Models to automate various aspects of Software Quality Assurance, such as code review, bug detection, and test case generation. The goal is to enhance the efficiency and accuracy of the SQA process by integrating LLMs to analyze codebases, detect vulnerabilities, and improve overall software quality. The project will involve experimenting with state-of-the-art LLMs like GPT to fine-tune their capabilities in understanding and validating code. The research will also focus on developing practical methods for integrating LLMs into existing software development pipelines, making the process more seamless and accessible for software engineers. The student will work with industry-standard tools and frameworks, collaborating closely with experts in software engineering and AI/ML to achieve innovative solutions for automating quality assurance tasks. Prerequisites: Strong programming skills, understanding of quality assurance and proficiency in software testing, knowledge of LLMs, and familiarity with AI/ML. | Shahbaz Siddeeq |
Available | Level Up Your AI Assistant: A Gamified VS Code GenAI Extension to Support Developer Motivation | This thesis will focus on the design and development of a gamified Visual Studio Code (VSCode) Generative AI (GenAI) Extension tool aimed at making the integration of AI-based tools more engaging for developers. While generative AI has the potential to significantly enhance productivity, its integration into existing workflows can often feel cumbersome or uninspiring. By introducing game-like elements, such as points, badges, and levels, into the extension, this research seeks to incentivize developers to interact with AI-based features more fluidly, thus improving their overall experience. Prerequisites: This thesis requires a foundational understanding of AI, particularly LLMs, and experience with software development workflows. Familiarity with Visual Studio Code and knowledge of gamification techniques is beneficial. Strong problem-solving skills and the ability to engage with both technical development and user experience design will be key to success in this project. | José Siqueira de Cerqueira |
Available | Empowering Companies to Integrate AI into their Business Processes using BPMN | The student will develop a prototype low-code interface that uses LLMs to automate business processes modeled in BPMN. The goal is to investigate how LLM-based tooling built with frameworks like LangChain or LangGraph can be used to translate process models into Python code for automation and agent orchestration. Prerequisites: Understanding of UML, BPMN, and AI, as well as experience with Python programming and REST APIs, is required. | Dr. Jussi Rasku |
Available | Comparative Study of Finnish-Capable Language Models across Several NLP Tasks | The thesis will focus on evaluating language models fluent in Finnish, such as Poro and Viking, across various tasks such as text generation, translation, and question answering. These are compared against state-of-the-art commercial offerings such as the ChatGPT models. The student will benchmark the models in both CPU and GPU environments and assess their performance and accuracy. Prerequisites: Familiarity with large language models, natural language processing, and machine learning concepts. Experience with benchmarking tools and Hugging Face models is recommended. | Dr. Jussi Rasku |
Available | Fast-Slow LLM Chimera: Real-Time Language Model Switching for Interactive Low-Latency Applications | This thesis will investigate a hybrid approach where an LLM with minimal latency initiates a response, and a more powerful LLM continues the response after the first few words (a minimal illustrative sketch follows the table). The goal is to provide quick initial feedback while maintaining the quality of the response in applications such as real-time communication systems. Prerequisites: Knowledge of LLM architectures and local LLMs, as well as experience with Python and latency optimization, is needed. Access to multiple LLMs and computational resources is recommended. | Dr. Jussi Rasku |
Available | A Lightweight Command Line GPU Time Allocation System for a Multi-Organization Shared Server | This project aims to design and implement a lightweight, single-server, command-line-based GPU allocation system used by multiple organizations. The system will enable users to reserve GPU time efficiently, show the reservations, and prevent the use of unallocated resources, ensuring fair usage across various projects. The student will also compare the solution against existing systems in terms of performance and ease of use. Prerequisites: Strong programming skills (preferably C), knowledge of Linux systems and command line interfaces, and experience with distributed computing environments. | Dr. Jussi Rasku |
Available | Sustainable Software Engineering (with AI) | Examples of possible topics: "Reducing the Carbon Footprint of AI Model Training: A Study on Pruning, Quantization, and Transfer Learning", "Life Cycle Assessment of AI Models: From Training to Deployment", "AI-Specific Hardware for Green Computing: A Comparative Study of GPUs, TPUs, and AI Chips". Prerequisites: See https://github.com/Green-Software-Foundation | Dr. Mika Saari |
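Illustrative sketches for selected topics:

Evaluation Agent for Fact-Checking in RAG Systems: a minimal sketch of the agent idea, assuming an OpenAI-compatible chat API. The model name, judge prompt, and naive sentence splitting are placeholder choices, not part of the topic definition.

```python
# Minimal fact-checking loop: each sentence of a RAG answer is judged against
# the retrieved passages by an LLM acting as a verifier.
# Assumes an OpenAI-compatible chat API; model name and prompt are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JUDGE_PROMPT = """You are a strict fact-checker.
Claim: {claim}
Sources:
{sources}
Reply with exactly one word: SUPPORTED or UNSUPPORTED."""

def check_answer(answer: str, retrieved_passages: list[str]) -> dict[str, bool]:
    """Return a per-sentence verdict: True if the sentence is backed by the sources."""
    verdicts: dict[str, bool] = {}
    sources = "\n---\n".join(retrieved_passages)
    # Naive sentence split; a real agent would use a proper claim extractor.
    for sentence in (s.strip() for s in answer.split(".") if s.strip()):
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user",
                       "content": JUDGE_PROMPT.format(claim=sentence, sources=sources)}],
            temperature=0,
        )
        verdicts[sentence] = response.choices[0].message.content.strip().upper().startswith("SUPPORTED")
    return verdicts
```

The thesis would replace the placeholder judge with a systematically evaluated component and compare its verdicts against human annotations.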
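Impact of generated PlantUML diagrams on LLM-generated code accuracy: a minimal sketch of the prompt-enrichment step, assuming the project under study is written in Python. The diagram covers only class and method names, and the prompt layout and the "src" directory in the usage example are placeholders.

```python
# Derive a rough PlantUML class diagram from Python sources using the
# standard-library ast module, then prepend it to the coding prompt.
import ast
from pathlib import Path

def plantuml_from_sources(paths: list[Path]) -> str:
    """Build a minimal PlantUML class diagram (class and method names only)."""
    lines = ["@startuml"]
    for path in paths:
        tree = ast.parse(path.read_text())
        for node in ast.walk(tree):
            if isinstance(node, ast.ClassDef):
                lines.append(f"class {node.name} {{")
                for item in node.body:
                    if isinstance(item, ast.FunctionDef):
                        lines.append(f"  +{item.name}()")
                lines.append("}")
    lines.append("@enduml")
    return "\n".join(lines)

def enriched_prompt(task: str, paths: list[Path]) -> str:
    """Place the generated diagram before the coding task, as extra context."""
    return f"Project structure (PlantUML):\n{plantuml_from_sources(paths)}\n\nTask:\n{task}"

if __name__ == "__main__":
    # Placeholder usage: enrich a prompt with diagrams from a local src/ folder.
    print(enriched_prompt("Fix the failing unit test.", list(Path("src").glob("*.py"))))
```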
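Reducing Hallucinations in LLMs Using Knowledge Graphs and RAG: a minimal sketch of one way to combine the two, assuming an OpenAI-compatible chat API. The tiny in-memory triple list, the model name, and the prompt layout are placeholders; the thesis would use real knowledge graphs and retrievers.

```python
# Ground the answer by injecting verbalized knowledge-graph triples alongside
# retrieved passages, and instruct the model to answer only from those facts.
from openai import OpenAI

client = OpenAI()

# Placeholder triple store: (subject, predicate, object).
TRIPLES = [
    ("GPT-Lab", "is_part_of", "Tampere University"),
    ("Tampere University", "located_in", "Finland"),
]

def kg_facts(entity: str) -> list[str]:
    """Return triples mentioning the entity, verbalized as short sentences."""
    return [f"{s} {p.replace('_', ' ')} {o}." for s, p, o in TRIPLES if entity in (s, o)]

def grounded_answer(question: str, entity: str, passages: list[str]) -> str:
    context = "\n".join(kg_facts(entity) + passages)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "system",
                   "content": "Answer using only the facts below. If they are insufficient, say so.\n" + context},
                  {"role": "user", "content": question}],
        temperature=0,
    )
    return response.choices[0].message.content
```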
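Fast-Slow LLM Chimera: a minimal sketch of the handover idea, assuming an OpenAI-compatible streaming API. The model names and the word-count threshold are placeholders, and how faithfully a provider continues a pre-started assistant message varies, so this is only an illustration of the control flow.

```python
# A small, low-latency model streams the first words; a larger model then
# continues the same answer. A real system would overlap the two calls to
# hide the larger model's startup latency.
from openai import OpenAI

client = OpenAI()
FAST_MODEL = "gpt-4o-mini"   # placeholder for a small, fast (possibly local) model
SLOW_MODEL = "gpt-4o"        # placeholder for a larger, higher-quality model
HANDOVER_WORDS = 8           # placeholder handover threshold

def respond(user_message: str) -> str:
    prefix_words: list[str] = []
    stream = client.chat.completions.create(
        model=FAST_MODEL,
        messages=[{"role": "user", "content": user_message}],
        stream=True,
    )
    for chunk in stream:
        token = chunk.choices[0].delta.content or ""
        print(token, end="", flush=True)      # immediate feedback to the user
        prefix_words.extend(token.split())
        if len(prefix_words) >= HANDOVER_WORDS:
            break
    prefix = " ".join(prefix_words)
    # Hand the started answer to the larger model and ask it to continue.
    completion = client.chat.completions.create(
        model=SLOW_MODEL,
        messages=[{"role": "user", "content": user_message},
                  {"role": "assistant", "content": prefix}],
    )
    return prefix + " " + completion.choices[0].message.content
```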
