Project ideas for MSc and BSc projects

My interests:

  • Artificial intelligence, particularly hard mathematical search problems such as searching for an optimal route or schedule; learning of optimisation algorithms
  • Large Language Models (LLMs) and their applications for automation of processes
  • Data science and machine learning, particularly interpretability (so that people can understand the model) and hyper-parameter optimisation (finding the parameter values that make a learning algorithm perform better)
  • Algorithm engineering, particularly optimisation algorithms and combinatorics
  • Data visualisation, analysis and manipulation
  • Other interests: physics; engineering; aviation; public transport; space exploration; chess; coffee

Below are a few specific project ideas. I am also happy to discuss your ideas or come up with new projects based on our joint interests.

Please note that all these projects will involve substantial amounts of programming and research.

LLM-based personal tutor

Target programmes: CS, CS(AI)

Large Language Models (LLMs) have seen major advancements over the last few years, demonstrating a certain level of intelligence, a broad range of knowledge, and the ability to interpret natural-language prompts and generate text that closely resembles human writing.

In this project, you will utilise an LLM to develop an artificial personal tutor. You will focus on one subject (e.g., teaching Python, but you can choose any subject). The artificial personal tutor will assess the student, looking for gaps in their knowledge and skills, provide teaching materials tailored to the student, produce exercises, mark the answers and provide feedback. All the content (lectures, exercises, marking, feedback, etc.) will be generated by an LLM, whereas your software will provide a GUI, maintain a database to store the tutoring history, generate LLM prompts and parse LLM responses to feed the GUI. It will also verify the responses of the LLM whenever possible.
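To give a flavour of the prompt generation and response verification, here is a minimal Python sketch. It assumes that the LLM is asked to reply in JSON; the function names, the prompt wording and the three JSON keys are all hypothetical choices made for illustration, and the call to the LLM itself is deliberately left out.

```python
import json
from typing import List, Optional

# Hypothetical response schema: the keys we ask the LLM to produce.
REQUIRED_KEYS = {"question", "reference_answer", "explanation"}


def build_exercise_prompt(topic: str, known_gaps: List[str]) -> str:
    """Compose a prompt asking the LLM for one exercise in JSON form."""
    gaps = ", ".join(known_gaps) if known_gaps else "none identified yet"
    return (
        f"You are a tutor teaching {topic}. "
        f"The student's known gaps are: {gaps}. "
        "Reply with JSON only, using the keys "
        '"question", "reference_answer" and "explanation".'
    )


def parse_exercise(response_text: str) -> Optional[dict]:
    """Return the exercise as a dict, or None if the reply is unusable."""
    try:
        data = json.loads(response_text)
    except json.JSONDecodeError:
        return None  # the LLM did not return valid JSON
    if not isinstance(data, dict) or not REQUIRED_KEYS.issubset(data):
        return None  # valid JSON, but not in the requested shape
    return data
```

A rejected reply (None) could trigger a retry with a stricter prompt; a real system would also check the content itself, for example by running a generated Python exercise against its reference answer.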

Programming
Algorithms and Data Structures
Artificial Intelligence
Mathematics
Research

Querying unstructured data with the help of LLMs

Target programmes: CS, CS(AI)

Much of the existing data is unstructured or semi-structured; in other words, it is not stored in neat tables, ready to be processed by SQL-like queries. Usually, handling unstructured data requires human input and so is time-consuming. This project is about using LLMs to intelligently query unstructured data.

There are two possible directions for this project:

  • Query semi-structured data that is provided in the form of tables, although those tables may not be fully coherent.
  • Query unstructured data (think of the Student Services section of the UoN website).

You will use existing (pre-trained) LLMs in conjunction with your own code to create an intelligent querying system capable of completing tasks that LLMs cannot do on their own.

Programming
Algorithms and Data Structures
Artificial Intelligence
Mathematics
Research

Automated implementation of software using LLMs

Target programmes: CS, CS(AI), Data Science

LLMs are capable of generating fragments of program code when given a precise enough specification. By combining a user-provided description with predefined instructions and guiding an LLM, it is possible to build a relatively advanced software system.

In this project, you will focus on one class of software and develop a system that will take a user description and, using an LLM, build the corresponding software. Examples:

  • Your system will take a description of game rules and develop a video-game (including the engine, GUI and an AI agent).
  • Your system will take a dataset and a description of what needs to be learnt, and output an ML model. Your system will have to automatically apply data cleaning (utilising an LLM where necessary), use Auto-ML and, if appropriate, visualise data.

Programming
Algorithms and Data Structures
Artificial Intelligence
Mathematics
Research

Learning Interpretable Strategies for the 2048 Videogame

Target programmes: CS, CS(AI), Data Science

2048 is a simple yet addictive videogame (http://2048game.com/). It is not too difficult to play, although learning how to play it well takes a lot of time.

There have been very successful attempts to develop AI agents to play the game. The problem is that we do not understand how these agents play, and so we cannot use them to teach a human player.

In this project, you will design an automated method to learn a simple (interpretable) yet efficient policy for an AI agent to play 2048. The hypothesis is that an agent based on a small set of easy-to-understand rules can play the game relatively well. Testing this hypothesis and developing the learning methodology will contribute to the game industry and to the solution of the long-standing issue of interpretability in AI.

Instead of 2048, you can suggest some other one-player videogame, or we can consider two-player games.
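To illustrate what a small set of easy-to-understand rules could look like for 2048, here is a minimal Python sketch. It assumes the standard slide-and-merge rules and represents the board as a list of rows with 0 for an empty cell. The policy shown ("try the moves in a fixed order of preference") is only a hypothetical example of an interpretable rule, not a recommended strategy; the learning method in the project would search for better rules.

```python
def slide_row_left(row):
    """Slide one row to the left, merging each pair of equal neighbours once."""
    tiles = [v for v in row if v != 0]
    merged = []
    i = 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(2 * tiles[i])  # merge the pair into one tile
            i += 2
        else:
            merged.append(tiles[i])
            i += 1
    return merged + [0] * (len(row) - len(merged))


def move(board, direction):
    """Return the board after a move: 'left', 'right', 'up' or 'down'."""
    if direction == "left":
        return [slide_row_left(r) for r in board]
    if direction == "right":
        return [slide_row_left(r[::-1])[::-1] for r in board]
    # Vertical moves: transpose, move horizontally, transpose back.
    columns = [list(c) for c in zip(*board)]
    moved = move(columns, "left" if direction == "up" else "right")
    return [list(r) for r in zip(*moved)]


def rule_based_policy(board, priority=("down", "left", "right", "up")):
    """Pick the first move in the priority order that changes the board."""
    for direction in priority:
        if move(board, direction) != board:
            return direction
    return None  # no legal move: the game is over
```

A human can read and follow this policy directly, which is the point; the sketch leaves out the random tile that the game adds after every move, and the scoring.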

Programming
Algorithms and Data Structures
Artificial Intelligence
Data Science
Mathematics
Research

Automated Design of Interpretable Stock Market Strategies

Target programmes: CS, CS(AI)

A trading strategy is key to one's success in stock markets, and so a lot of research has been done to support trading decisions or even implement completely automated trading systems. However, most of this research has focused on maximising profits; the predictions and decisions are usually impossible for a human to understand.

This project is to apply artificial intelligence (AI) methods to stock market trading. Specifically, you will develop a system that can automatically learn efficient yet interpretable long-term investment policies. For example, the system could identify that buying a stock that has risen in price by more than 5% within a day and then did not change its price for the next two days is, on average, an efficient policy. You will use back-testing to evaluate policies and an optimisation algorithm to search in the space of possible policies.
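As an illustration, the example policy above could be back-tested along the following lines. This is only a sketch under several assumptions that are not part of the project description: prices are daily closing prices, "did not change" means a move of at most 0.5%, each position is held for a fixed five days, and transaction costs are ignored. The function and parameter names are hypothetical.

```python
def backtest_rise_then_flat(prices, rise=0.05, flat_tol=0.005, hold_days=5):
    """Back-test: buy after a jump of more than `rise` followed by two flat days.

    prices: daily closing prices, oldest first.
    Returns (number_of_trades, mean_return); mean_return is None if no trade fired.
    """
    returns = []
    # t is the day of the jump; we buy at the close of day t + 2
    # and sell at the close of day t + 2 + hold_days.
    for t in range(1, len(prices) - 2 - hold_days):
        jump = prices[t] / prices[t - 1] - 1
        flat1 = abs(prices[t + 1] / prices[t] - 1)
        flat2 = abs(prices[t + 2] / prices[t + 1] - 1)
        if jump > rise and flat1 <= flat_tol and flat2 <= flat_tol:
            buy, sell = prices[t + 2], prices[t + 2 + hold_days]
            returns.append(sell / buy - 1)
    if not returns:
        return 0, None
    return len(returns), sum(returns) / len(returns)
```

The parameters rise, flat_tol and hold_days are exactly the kind of values that an optimisation algorithm could tune, while the rule itself stays readable.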

Programming
Algorithms and Data Structures
Artificial Intelligence, Optimisation
Finance
Data Science
Mathematics
Research

Chess Game Assistant for Beginners

Target programmes: CS, CS(AI)

Computers are exceptionally good at playing chess; the estimated rating of AlphaZero (developed by DeepMind) is above 3500 whereas the best human players can only achieve around 2700–2900. The moves that computers make are often hard to understand, particularly for beginners.

This project is to apply artificial intelligence (AI) methods to explain, in English, moves suggested by AI chess engines. For example, you could develop an assistant for beginners that explains why certain moves are good or bad, e.g. ‘Moving the rook to e6 would leave it hanging.’ You will most likely use an existing AI chess engine to identify the tactics that make certain moves good or bad.

Programming
Algorithms and Data Structures
Artificial Intelligence
Chess
Data Science
Mathematics
Research

Semi-Automated Plagiarism Detection Tool for Programming Languages

Target programmes: CS, CS(AI), Data Science, HCI

Standard plagiarism detection tools are designed for natural languages and mostly look for precise matches of text fragments. Detecting plagiarism in programming languages is much more challenging, mainly for three reasons: (i) two programs may share a lot of code simply because it is some standard code (e.g., a piece of code given in a lecture or auto-generated code); (ii) there may be one most natural way to implement a function, and different people may arrive at this implementation independently; (iii) it is easy to modify a piece of code by reshuffling lines, renaming variables, etc., to make it less recognisable. There exist automated tools to tackle point (iii) but they cannot offer a solution to points (i) and (ii).

In this project, you will develop a tool that uses a completely different approach to plagiarism detection. The plagiarism will be detected by searching for similar anomalies in program code. For example, an unusual solution method or an identical mistake found in several programs is a good indicator of plagiarism. If several such anomalies are shared by two programs, this is a strong indicator of plagiarism.

Your tool will load all the programs that need to be analysed and let the user select anomalies by hand. It can then search for similar anomalies across all the programs, identify the most likely plagiarism cases and generate reports.
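Once anomalies have been matched across programs, identifying the most likely plagiarism cases can be as simple as counting shared anomalies per pair. Here is a minimal Python sketch of that last step; it assumes that each program has already been reduced to a set of anomaly labels (the hard part of the project), and the function name and the min_shared threshold are hypothetical.

```python
from itertools import combinations


def rank_pairs_by_shared_anomalies(anomalies_by_program, min_shared=2):
    """Rank pairs of programs by the number of anomalies they share.

    anomalies_by_program: dict mapping a program name to a set of anomaly labels.
    Returns a list of (program_a, program_b, shared_labels), most suspicious first.
    """
    ranked = []
    for a, b in combinations(sorted(anomalies_by_program), 2):
        shared = anomalies_by_program[a] & anomalies_by_program[b]
        if len(shared) >= min_shared:
            ranked.append((a, b, sorted(shared)))
    # Pairs sharing more anomalies are stronger plagiarism candidates.
    ranked.sort(key=lambda item: len(item[2]), reverse=True)
    return ranked
```

The list of shared labels for each pair could go straight into the generated report, so that the user can inspect the evidence rather than just a score.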

Programming
Algorithms and Data Structures
Artificial Intelligence
Research

AI-based search for compact formula approximations

Target programmes: CS, CS(AI)

Some functions cannot be expressed by a compact formula. For example, the logarithmic function is typically computed as the sum of an infinite series. Similarly, the prime-counting function and the Bell numbers do not have compact formulas.

In this project, you will develop a system to automatically search for compact approximations of formulas. You will define the problem as an optimisation problem and design an optimisation algorithm to search the space of formulas.
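To show how this can be framed as an optimisation problem, here is a toy Python sketch that searches a very small space of formulas for an approximation of the prime-counting function. Everything specific in it is an assumption made for illustration: the space contains only formulas of the form "A op B" built from three building blocks, the objective is the mean relative error on a few sample points, and the search is plain exhaustive enumeration rather than a real optimisation algorithm.

```python
import itertools
import math


def prime_count(n):
    """Exact prime-counting function via a simple sieve (the target, n >= 1)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return sum(sieve)


# Hypothetical search space: formulas "A op B" over these building blocks.
LEAVES = {"x": lambda x: x, "ln(x)": math.log, "sqrt(x)": math.sqrt}
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}


def search_formula(target_values):
    """Return (error, formula) with the lowest mean relative error.

    target_values: dict mapping a sample point x to the exact value at x.
    """
    best = None
    for (name_a, fa), (op, fop), (name_b, fb) in itertools.product(
        LEAVES.items(), OPS.items(), LEAVES.items()
    ):
        try:
            error = sum(
                abs(fop(fa(x), fb(x)) - y) / y for x, y in target_values.items()
            ) / len(target_values)
        except (ZeroDivisionError, ValueError):
            continue  # formula undefined on some sample point
        if best is None or error < best[0]:
            best = (error, f"{name_a} {op} {name_b}")
    return best


samples = {x: prime_count(x) for x in (100, 1000, 10000)}
best_error, best_formula = search_formula(samples)
```

On these three sample points the search should settle on x / ln(x), the classical approximation of the prime-counting function. In the project, the space of formulas would be far too large to enumerate, which is where the optimisation algorithm comes in, and the objective would probably also penalise long formulas.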

Inspired by this video.

Programming
Algorithms and Data Structures
Artificial Intelligence
Mathematics
Machine Learning
Research

Automated Design of Optimisation Algorithms

Target programmes: CS, CS(AI)

Mathematical optimisation algorithms are at the core of many decision support and decision-making systems. Developing optimisation algorithms is resource-intensive and time-consuming, and hence there is a lot to gain from generating them automatically.

In this project, you will apply a recent artificial intelligence framework to automatically develop an optimisation algorithm for some combinatorial optimisation problem.

Programming
Algorithms and Data Structures
Artificial Intelligence
Mathematics
Research