graduation_projects

This page may give you an idea of the type of graduation projects you could do under my supervision. Suggested topics involve various combinations of geometry, computational geometry, graph algorithms, (evolutionary?) search algorithms, memory/cache utilization, geoinformatics, cartography, architecture (of buildings), automated driving, and sonification (“visualization” through sound). This list is by no means exhaustive: if you would like to do a graduation project on a specific algorithmic topic that interests you, you are welcome to come and see me to discuss whether I could be your advisor on that project (or whether I know another professor whom you should talk to).

Many graduation projects would be done in the form of an internship with the research group at the university of Bonn. However, looking at the list below, you will see that graduation projects may also be done at companies and/or abroad. Projects in the list below are marked B, M, or E, indicating Bachelor's project, Master's project, or Elective, that is, a project that was done as a special elective module. Some suggested topics for future projects are marked BM, indicating that they could be researched in a Bachelor's project or, more thoroughly, in a Master's project. Master's theses should be written in English, Bachelor's theses either in German or (with special permission) in English.

An exterior shape and a floor plan of a building can be regarded as a solution to a challenging design problem to satisfy criteria of usability and style. Theoretically, one could try to define a mathematical function that summarizes the quality of each possible design as a single number. One might then use standard optimization algorithms to try to find the best possible design. However, in practice it is impossible to define such a function such that all matters of usability and style are taken into account and no ingenious designs are passed over for trivial reasons.

To circumvent this problem, we want to develop an algorithm that searches the space of possible designs while asking the user to evaluate designs to guide the search. The algorithm should find a balance between conflicting demands: it should make progress in the direction desired by the user as apparent from the user's feedback, but it should also propose designs the user would not have thought of. The algorithm should poll the user in such a way that the user can evaluate proposed designs fast; nevertheless the algorithm should extract a lot of information about the desired direction of the search from these responses. Moreover, to make progress on this problem, the first question we need to answer is this: how will we evaluate how effective our search algorithm is, given that, by definition, we cannot calculate the quality of an optimal solution?

For this project, we would collaborate with Sergio Figueiredo from the architecture department of the TU Eindhoven.

Similar to the project mentioned above, we would design an algorithm to search a space of possible designs guided by feedback solicited from the user. However, in this project, it is not building designs we want to explore, but schematizations of public transportation networks, in order to find a most appealing design for a schematized map.

A space-filling curve is essentially a continuous, surjective function *f* from the unit interval to some two- or higher-dimensional volume. As *t* goes from 0 to 1, the image *f*(*t*) traces out a curve that eventually visits each point of the higher-dimensional volume. One way to build a data structure for approximate nearest-neighbour queries among *d*-dimensional points is the following: put all data points in each of *k* sorted lists corresponding to *k* well-chosen space-filling curves. In each list, the points are sorted by the order in which they appear along the corresponding space-filling curve. To find a nearest neighbour to a point *q*, one first finds the predecessor and successor of *q* in each of the *k* lists. Now report, from these 2*k* points, the point that is closest to *q*. Liao et al. describe how, with *k*=*d*+1 well-chosen curves, one can guarantee that the Euclidean distance from *q* to the reported point is at most a factor *O*(*d*²) worse than the distance from *q* to the true closest point in the complete data set.

That guarantee is not as good as we would like it to be, and this may have something to do with the following fact: at many points of the space-filling curves used by Liao et al., 2^{d} regions meet, and most of these are very far from each other along the curve. However, it is possible to construct space-filling curves where, at any point, at most only *d*+1 regions meet that may be far away from each other along the curve. Could we use such curves to get a better approximation ratio on the distance to the nearest neighbour? Or to achieve the same approximation ratio with fewer curves?

A space-filling curve is essentially a continuous, surjective function *f* from the unit interval to some two- or higher-dimensional volume. As *t* goes from 0 to 1, the image *f*(*t*) traces out a curve that eventually visits each point of the higher-dimensional volume. Gary Teachout considers Gosper's plane-filling curve “one of the most convoluted, yet one of the smoothest fills”. It is probably for this reason that Auber et al. use this curve as the basis for visualizations that consist of plane-filling curves cut into pieces. Can you find an even smoother plane-filling curve? To answer this question, we should start with developing a metric for how smoothly a plane- or space-filling curve fills the plane (or space), and we should calculate the smoothness metric for known space-filling curves. For a deeper understanding of the problem, we may look at physicists' attempts at achieving the opposite: deliberately rough space-filling curves that may serve as a model for DNA molecules.

Human beings tend to be extremely good at picking up patterns in sounds. In some ways, the communication channel from computer to human via the speakers and our ears may even have higher bandwidth than the communication channel via the screen and our eyes. How can hearing be used to monitor the execution of algorithms and discover bottlenecks, anomalies, and/or differences between different algorithms? For this purpose, properties of running algorithms should be *sonified*, that is, converted to sound. In particular, we could sonify memory access patterns of various algorithms, or the state of incremental optimization algorithms.

Similar to the project mentioned above, we could try sonification of geographic data. For this project, we would collaborate with an expert on geographic data and an expert on sound.

Recently I have been co-supervising projects on environment perception for automated driving. These projects were executed in the research laboratories of Audi in Ingolstadt. If you are interested in this topic, feel free to contact me and we can see if we can set up a similar project for you. Note: this may take several months to set up, and includes an application procedure. So if you are interested, it is best to contact me in the beginning of the semester that precedes the semester in which you want to start your graduation project.

Schematic maps of transportation networks are designed to show the connections between different train or metro lines clearly. However, from such maps it may be hard to get an impression which routes are costly (in terms of travel time, or otherwise) and which routes are preferable. To remedy this shortcoming, one may add a background to the map that gives an impression of travel cost. I am exploring two ways to do this. The first approach is to divide the map into zones, such that the travel cost on each route is roughly proportional to the number of times one crosses a zone boundary. The second approach is to calculate a weight function on the plane, such that the travel cost on each route corresponds to the integral of the weight function over that route. By rendering low weights as bright background and high weights as dark background, low-cost routes will stand out due to high contrast with the background. This line of research leads to three possible topics for graduation projects:

The goal of this project is to decide which edges of the transportation network should cross zone boundaries, such that map users who choose their routes so as to minimize the number of zone boundary crossings, end up choosing the routes that indeed have the lowest total cost. The problem lends itself to theoretical research (is it NP-hard or not?) and/or practical research (what heuristic solution gives satisfactory results on realistic networks?).

Given which edges cross zone boundaries, and thus, which groups of stations must lie in the same zone, we need to draw the zones. To minimize clutter on the map, the zones should have simple shapes—ideally, they are disjoint disks. Again this problem lends itself to theoretical research: is it NP-hard to decide whether we can draw the zones as disjoint disks, given the locations of the stations and their grouping into zones? But we can also research it from a more practical perspective: what heuristics can we apply to get well-shaped zones when disks do not fit, or when a solution with disks cannot be computed efficiently?

We are given a network drawn in the plane, with a cost associated with each edge. The goal is to assign a weight to each point in the plane, such that the integral of the weight function over the set of points covered by each edge matches that edge's specified cost. A technically correct solution would be nearly trivial to obtain, but to obtain a readable map, we should satisfy additional criteria that make the problem considerably more challenging. For example, the weight function should be as smooth as possible, and it should not have steep gradients where they would be obscured by the network drawing. In a Master project, we could also try to investigate how the computed background affects users' perception of the network and how it affects what routes they would choose to travel.

Many algorithms have been designed for the automatic drawing of schematic maps of public transportation networks. Except in a few special cases, these algorithms first attempt to find an optimal drawing of the network, and then try to put the station labels in. This results, practically always, in problems with the labelling that are hard to solve. Alternatively, as suggested by the experienced map drawer Maxwell Roberts, we could first find an arrangement for the labels on the map and then try to find a good network drawing that follows the labels. Our first task in this approach is to formulate boundary conditions and optimization criteria for the arrangement of the labels, taking into account what is needed to ensure that a good network drawing is possible. Second, we need the solve the label arrangement problem according to the specified criteria.

Schematic maps of transportation networks in which train/metro lines are depicted as smooth curves can be very effective, as studies by the psychologist Maxwell Roberts have shown. Previously we experimented with a force-directed algorithm to produce such maps automatically; however, the results seem to lack the clear structure that can be seen in professional human-drawn maps. To bring more structure into the map, we may experiment with the following approach: first identify which metro lines, or which sections thereof, are most important; then adapt the force-directed method such that these lines are drawn straighter or smoother, while allowing less important lines to become more twisted.

Below you find a list of projects that I supervised or co-supervised when I worked at the Eindhoven University of Technology.

Wouter van der Stoel did his graduation project at the research laboratories of Audi in Ingolstadt. His task was to design an algorithm to determine relations between linear roadway features (such as lane markings and crash barriers): a highly-automated vehicle could use this to build a model of how many lanes there are and where they split or merge. We kept in touch about the progress of his project by regular conference calls with Wouter, me, and his advisors at Audi.

Jeroen Hoogers experimented with algorithms to design the exterior shape and the floor plan of a house. His solutions are essentially genetic algorithms that generate successive sets of designs from which the user can choose the best candidates. To not overburden the user with bad designs, Jeroen experimented with different selections of design parameters that are encoded in the genome, combined with algorithms that generate automatically fine-tuned designs for given values of these parameters. This project was co-supervised with Sergio Figueiredo from the TUE architecture department.

Various algorithms have been designed to draw schematized maps of metro networks. To facilitate further progress, it is desirable that we can assess the results of different algorithms and/or parameter settings automatically. The cognitive psychologist and metro map expert Maxwell Roberts from the University of Essex suggested five abstract criteria for good map design: simplicity, coherence, harmony, balance, and topographicity. Hein Beers set out to design algorithms that calculate indicators for coherence and balance automatically for any given metro map.

Scientists from the TU Münich showed how scientific calculations on three-dimensional meshes can be done very efficiently using only stack operations to store and retrieve intermediate results. However, their method requires a mesh that consists of cube-shaped cells. Gert van der Plas investigated how to make a similar approach work for meshes based on tetrahedral cells. In particular, Gert determined how many stacks are needed, in what order one should traverse the cells of the mesh, and which variables should be stored on which stacks, so that all variables are accessible can be retrieved from the top of one of the stacks when they are needed.

Scientists from the TU Münich showed how scientific calculations on two-dimensional triangular meshes can be done very efficiently using only stack operations to store and retrieve intermediate results. However, the method requires that the mesh is structured in a specific way—the method cannot be applied to just any given triangular mesh. Hugo Snel studied how to extend the method to arbitrary meshes of triangles or tetrahedra. He found that for three-dimensional, tetrahedral meshes, there is a potential to reduce cash misses by as much as 95% as compared to standard approaches.

Schematic maps of transportation networks are designed to show the connections between different train or metro lines clearly. Roel van Happen investigated an approach to construct zone boundaries around groups of stations on such maps, to give the user more information about which stations lie close to each other and which stations do not. His approach is based on an energy function, which, for each zone, gives high values to points that are close to a station within the zone, while reducing the values of points that are close to stations outside the zone. The zone is then defined as the set of points with high value.

Michel Cijsouw experimented with improving the usability of schematic metro maps. To this end, he wrote algorithms that emphasize railway sections depending on the number of destinations for which these sections are part of the shortest route. The required emphasis is created by drawing railway sections with thicker strokes and by creating more space for them on the map. For this purpose, Michel worked with an existing mixed-integer programming formulation that aims to describe the optimal map, and adapted this formulation to take requirements for additional space into account.

Most computer scientists know Dijkstra's algorithm to compute the shortest path from one point to another in a graph. Unfortunately, if the graph is too large to fit in main memory and nodes and edges have to be read from disk during the computation, Dijkstra's algorithm accesses the disk in a way that can be very inefficient. Known solutions that are theoretically efficient with respect to disk access, were known to be complicated or inefficient with respect to computation time. Wilco van Maanen investigated several simple alternatives and found that these solve the problem in practice, at least for graphs with a regular grid structure.

On maps of public transportation networks, metro lines are usually drawn with straight line segments. However, studies show that lines are often easier to follow when drawn as smooth curves. How can we draw clear and beautiful maps with smooth curves automatically? In his Master's project, Mathijs Miermans developed a new approach to do this with a force-directed method. The method starts from a drawing that is easy to obtain but is not so nice, and improves the map by iteratively applying forces between different parts of objects on the map that result in pushing and pulling the map into a better shape. Mathijs's results were presented in the international Schematic Mapping Workshop that took place in England in 2014.

A curve in the plane is a function *f* from the unit interval [0,1] to a set of points in the plane. With one of Pólya's triangle-filling curves, the image of *f* fills a right-angled triangle, and the curve is recursively composed of *k* smaller sections that each fill a similar triangle that is a factor *k* smaller than the original triangle. Moreover, the curve is *face-continuous*: this means that the boundary of the image of a contiguous part of the curve is always a simple shape, without self-intersections. These properties (recursive construction and face-continuity) are valued in certain applications in scientific computing. Marinus Gottschau and Kilian Matzke investigated whether a curve with these properties could also be constructed with *acute* triangles. We published the answer in one of the world's top journals on discrete geometry.

Given a point *q* in a terrain, we would like to be able to compute the watershed of *q*, that is, the set of points in the terrain from which water flows to *q*, assuming that water always follows the direction of steepest descent. We assume that our terrains are specified by a triangulated network of vertices. The longitude and latitude of each vertex are known exactly, but the elevation of each vertex is an estimate with a certain error margin. This leads to uncertainty about which points lie in the watershed of *q*: some points can never lie in the watershed of *q*, some points will definitely lie in the watershed of *q*, and some points may lie in the watershed of *q* depending on the unknown exact elevations of the terrain vertices. Marlous Theunissen investigated the theoretical properties and practical performance of algorithms to compute the minimum and maximum extents of watersheds in this challenging setting.

Nicky Gerritsen developed, implemented and tested algorithms to compute minimum spanning trees and shortest paths on large graphs that have a simple grid structure. His algorithms are optimized for efficient disk access and to exploit the parallel computing power of graphics processing units.

Marvin Raaijmakers did his graduation project in the research laboratories of Audi in Ingolstadt, while keeping in touch with me by weekly “meetings” using Skype. Marvin studied how a moving car can detect other cars that are overtaking or driving next to it. His results were published in the international Intelligent Vehicles Symposium. After his graduation project, Marvin continued to work for Audi, obtained his PhD there, and now supervises students himself.

In geographic information systems, digital terrain models are mostly based on a regular grid of squares. Ed Schouten investigated if, for certain computations, higher precision could be obtained by using a grid of hexagons, and developed techniques to store and traverse such grids. After his graduation project, Ed went to work for Google in München.

Space-filling curves are continuous surjective functions that map the unit interval [0,1] to a higher-dimensional region, such as the unit (hyper-)cube. Their inverse can be used to define a linear order on high-dimensional points; this order can then be used, for example, to decide how to store the points in a data structure in memory or on disk. Not all curves are equally effective at optimizing, for example, the disk access patterns of queries in such data structures. Simon Sasburg developed and implemented algorithms to compute quality measures of certain classes of curves that may help in assessing how effective they are. After his graduation project, Simon went to work for MagnaView.

Maps of metro networks are usually schematized, which means that the connections in the network are depicted more clearly at the expense of topographic accuracy. As a result, what sometimes seems to be the shortest route on the map is not the shortest route in reality. Tal Milea and Okke Schrijvers did an honours project on quantifying the distortion. They embedded this quantification as an optimization criterion into a schematic network drawing algorithm, so that it produces metro maps that are more consistent with reality regarding which routes are shortest. The results were presented on a poster at the International Symposium on Graph Drawing.

Robert Leeuwestein did his project at Prodrive in Eindhoven. Robert developed and tested algorithms to optimize the paths followed by two consecutive solder machines that fix large electronic components on a circuit board—the bottleneck in the process of manufacturing these boards.

Quadtrees are data structures that can be used to store, for example, planar subdivisions. However, the traditional top-down algorithms to construct quadtrees are rather inefficient, especially when the planar subdivision is too big to fit in memory during the construction process. Joery Mens studied algorithms to build such trees efficiently with a bottom-up process, constructed from a few simple and efficient sorting and scanning passes.

Suppose a set of villages is built on a newly created artificial island, and you get to build a road network of a certain total length. The budget is not sufficient to build a direct road between each pair of villages, so typically, the shortest route from one village to another has to take a detour. Where should you build the roads, such that the maximum detour (distance along the road divided by distance as the crow flies) between any pair of villages is as small as possible? Peter Kooijmans, and, in a follow-up project, Marco Verstege, studied this problem for certain settings with a small of number of villages, implementing clever search strategies to find the optimal solution among a huge number of combinatorially different candidate solutions. After his graduation project, Marco followed the postgraduate programme of the 3TU.School for Technological Design.

Freek van Walderveen did his graduation project at the university of Aarhus in Denmark, while keeping in touch with me by weekly “meetings” using Skype. Oil companies use sonar to get an accurate picture of the ocean floor, which they need for the maintenance of oil pipes. Freek developed, implemented and tested algorithms that can automatically remove noise from huge clouds of sonar measurements of the ocean floor—something which, until then, was a very time-consuming job mostly done by hand. Freek published and presented his work at ACMGIS, one of the world's top conferences on computer science for geographic information systems. After his graduation project, Freek stayed in Aarhus and got a PhD there, and then continued to work for Scalgo.

Sjoerd Cranen did his graduation project at the University of Sheffield in England, while keeping in touch with me by weekly “meetings” using Skype. Sjoerd redesigned an experimental software package for speech recognition, so that new, previously infeasible, techniques for speech recognition in noisy circumstances could be implemented and tested. After his graduation project, Sjoerd became a PhD student at the TU Eindhoven.

A four-dimensional Hilbert curves is a function *f* on the unit interval [0,1], whose image is a four-dimensional cube. An axis-parallel rectangle in the plane is described by four coordinates, and can therefore also be treated as a point in four-dimensional space. Thus, if we want to store rectangles in a data structure such as an R-tree, we may store them in the order in which the corresponding four-dimensional points appear along a four-dimensional Hilbert curve. There are many different four-dimensional Hilbert curves. The choice of curve, in combination with characteristics of the rectangles that are stored, may influence the query efficiency of the resulting R-tree. Freek van Walderveen investigated this influence. We published the results in some of the world's leading venues on experimental algorithmics.

Jeffrey Janssen developed, implemented and tested algorithms to compute the flow of water on digital elevation models which are given as huge grids of elevation samples. His algorithms turned out to be an order of magnitude faster than the state of the art at the time, reducing computation times of days to hours. Jeffrey's results were published at the international Workshop on Massive Data Algorithms, and have since been used in work by other scientists. After his graduation project, Jeffrey went to work for Realworld Systems.

Ummar Abbas implemented and tested the logarithmic priority-R-tree, a data structure that can store rectangles and points in the plane. Theoretically, the data structure was known to support efficient insertions, deletions and queries efficiently, but nobody had ever implemented it. Ummar filled in the gaps in the published description of the data structure and did experiments to find out what the theoretical performance guarantees were really worth.

graduation_projects.txt · Last modified: 2018/12/20 12:35 by administrator