First of all, I’ve written two versions of this post; this one is a little easier to understand (less jargon) and more fun! The other version can be found under “Research”.
Motivation
When scientists discover something, there are three main questions that we tend to ask ourselves.
- What are the building blocks of that thing?
- What is keeping the building blocks together in that particular structure?
- How did the thing get assembled?
One can think of it sort of like getting a new thing from IKEA. We discover a new object, say a galaxy, and it feels like we have discovered a new item in the IKEA catalogue that is our world.
But now we want to investigate it a bit more. We want to answer our three questions, and for that we just need to open the box and find the assembly manual, within which we find:
- What parts are included for building our galaxy, i.e. the building blocks. For galaxies this would be some stars, some dark matter, some gas and so on.
- What tools are included to build our galaxy. For our galaxy we mainly just need one screwdriver: gravity.
- The instructions for how to use our tools to put together our parts.
Question 1 is actually not as simple as it looks. When we go out and look at galaxies, we see that there seems to be a building block that affects the galaxy a lot, but that we cannot see. This is what we call dark matter. While dark matter is weird, it is also quite simple. As far as we can tell, it only interacts with other things through gravity, but it is still fundamental to how galaxies end up. This is because it collapses under its own gravity and makes gravitational wells where galaxies can form. These collapsed clumps are called halos.
Question 3 is by far the most complicated when it comes to galaxies, because it is really hard to observe. Galaxies evolve over v e r y long timescales. Luckily, we have simulations we can investigate this with. There are a couple of key problems with the simulations though:
- They are computationally expensive.
- They are hard to interpret/understand.
Making a model that emulates a simulation can solve problem 1 and give a lot of new insights into problem 2. So that is what we have done by building Mangrove. Mangrove is a Machine Learning-based emulator that takes into account all three core aspects of galaxy formation (building blocks, tools and assembly manual) to give us better galaxies faster.
Technical stuff
First of all, we don’t completely bypass the simulation step, since we still need to run a simulation, but only one with nothing but dark matter. Because dark matter is so simple, this can be done pretty quickly. Then, once we have that simulation, we can try to “paint” galaxies on top of the dark matter. This is by no means the first time this has been attempted, but earlier papers have only included features from the final time step of the simulation as the input to their models. This essentially means that these methods can learn answers to questions 1 and 2 (current building blocks/tools), but are agnostic about the assembly of the galaxies. Mangrove improves on this by using not just the final time step, but an encoding of the temporal evolution of the dark matter known as a merger tree.
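Before we get to merger trees, here is a rough sketch of what that final-time-step approach looks like in code. This is not anyone’s actual pipeline; the feature choices and network sizes are made up for illustration:

```python
import torch
from torch import nn

# A rough sketch of the "final time step only" approach: a plain
# neural network mapping properties of the final halo directly to a
# galaxy property, blind to the halo's history. In practice one would
# also normalize the features (e.g. take logs of the masses) first.
final_halo = torch.tensor([[1.8e12, 7.0, 250.0]])  # e.g. mass, concentration, max velocity

baseline = nn.Sequential(
    nn.Linear(3, 64),
    nn.ReLU(),
    nn.Linear(64, 1),  # one output, e.g. stellar mass
)
predicted_stellar_mass = baseline(final_halo)
```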
The merger tree is made by simply stopping the simulation at different time steps and identifying the halos where we know that galaxies live. We then see which halos merge with other halos from timestep to timestep, and build a tree-like structure encoding this evolution (see Fig. 1).
We can then encode the merger tree as a mathematical graph, which in this context simply means a collection of nodes and edges, where each node represents a halo and the edges encode how the halos have interacted with each other over time. Once we have the graphs, we can use a Graph Neural Network (GNN) to learn what the galaxies belonging to those merger trees should look like.
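To give a feel for what such a graph looks like in practice, here is a minimal sketch using PyTorch Geometric. The halos, features and numbers are all made up; the real Mangrove code is linked at the bottom of this post:

```python
import torch
from torch_geometric.data import Data

# Each node is a halo at some snapshot; its features could be e.g.
# halo mass, concentration and the redshift of that snapshot.
node_features = torch.tensor([
    [1.0e12, 8.0, 2.0],  # halo 0: an early progenitor
    [5.0e11, 9.0, 2.0],  # halo 1: another early progenitor
    [1.6e12, 7.5, 1.0],  # halo 2: halos 0 and 1 after they merge
    [1.8e12, 7.0, 0.0],  # halo 3: the final halo at redshift 0
], dtype=torch.float)

# Edges point from progenitor to descendant, encoding
# "who merged into whom" between snapshots.
edge_index = torch.tensor([
    [0, 1, 2],  # source halos
    [2, 2, 3],  # destination halos
], dtype=torch.long)

merger_tree = Data(x=node_features, edge_index=edge_index)
print(merger_tree)  # Data(x=[4, 3], edge_index=[2, 3])
```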
In order to build an emulator, we also have to find something to emulate. In this post I discuss results from emulating the outputs of a Semi-Analytic Model¹, but the method works just as well for other kinds of simulations, such as magneto-hydrodynamical ones like IllustrisTNG!
Mangrove is a Graph Neural Network (GNN). GNNs work by acting on each node together with its neighbourhood, meaning all the nodes that are connected to it by an edge. The way we learn from both nodes and neighbourhoods is through Message Passing. Message passing means that each node sends information about its current state along the edges of the graph; these messages are computed by a learnable² function f. The messages arriving at a node are then used to update the state of that node through another learnable² function g.
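In code, one message passing step could look something like the sketch below, again using PyTorch Geometric. This illustrates the idea; it is not the actual Mangrove layer:

```python
import torch
from torch import nn
from torch_geometric.nn import MessagePassing

class HaloMessagePassing(MessagePassing):
    """One message passing step: f builds a message from the sender's
    and receiver's states, g updates each node from its own state and
    the sum of its incoming messages."""

    def __init__(self, dim):
        super().__init__(aggr="add")  # sum the messages arriving at each node
        self.f = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())
        self.g = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())

    def forward(self, x, edge_index):
        return self.propagate(edge_index, x=x)

    def message(self, x_i, x_j):
        # x_j is the sending halo's state, x_i the receiving halo's state
        return self.f(torch.cat([x_i, x_j], dim=-1))

    def update(self, summed_messages, x):
        return self.g(torch.cat([x, summed_messages], dim=-1))
```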
After doing a couple of these message passing steps, we then sum over all halos in the merger tree and use yet another learnable function h, which decodes this sum and predicts the aspects of galaxies that we are interested in, such as the total mass of the stars in the galaxy, the mass of the black hole at its center and the amount of cold gas, as well as the uncertainties on these quantities.
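The readout step, summing over the halos and decoding with h, could be sketched like this (sizes and target choices are again just for illustration):

```python
import torch
from torch import nn
from torch_geometric.nn import global_add_pool

dim, n_targets = 64, 3  # e.g. stellar mass, black hole mass, cold gas mass

# h decodes the summed node states into predictions plus uncertainties,
# so there are two outputs per target.
h = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 2 * n_targets))

def readout(x, batch):
    # x: node states after message passing; batch: which tree each node is in
    tree_summary = global_add_pool(x, batch)  # sum over all halos per merger tree
    prediction, log_sigma = h(tree_summary).chunk(2, dim=-1)
    return prediction, log_sigma
```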
We like having uncertainties on our predictions, since we recognize that there might be a little area around the true value that would also be reasonable to predict. Furthermore, it gives the model the possibility of simply letting us know if it doesn’t know what’s going on, instead of making a foolhardy guess.
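One common way to train a model to produce such uncertainties is to minimize a Gaussian negative log-likelihood instead of a plain squared error. I won’t claim this is exactly the objective Mangrove uses, but it captures the idea:

```python
import torch

def gaussian_nll(prediction, log_sigma, target):
    # Being honestly unsure (large sigma) is cheap when the guess is
    # far off, but a confident wrong guess is penalized heavily.
    sigma = torch.exp(log_sigma)
    return (0.5 * ((target - prediction) / sigma) ** 2 + log_sigma).mean()
```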
Results
How much better does Mangrove do compared to other methods, you ask?
Well, let’s compare how well different methods do at assigning stellar masses to halos!
The first comparison is with the traditional method for assigning galaxy properties based on their dark matter halos: Abundance Matching. Abundance Matching is quite simple, assuming only that there exists a monotonic relationship between halo and galaxy properties. That means that if we were to assign stellar masses to the halos, the most massive halo would get the highest stellar mass, the least massive halo the lowest, and so on. If we compare the abundance matched relationship to the one we predict, we see a dramatic difference.
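In its simplest form, abundance matching is just rank-ordering, as in this toy sketch (real implementations also deal with scatter and number densities):

```python
import numpy as np

def abundance_match(halo_masses, stellar_masses):
    """Rank-order matching: the i-th most massive halo gets the
    i-th largest stellar mass. A toy version without scatter."""
    matched = np.empty_like(stellar_masses)
    matched[np.argsort(halo_masses)] = np.sort(stellar_masses)
    return matched

halos = np.array([3.0e11, 1.0e12, 5.0e13])
stars = np.array([1.0e9, 5.0e10, 2.0e10])
print(abundance_match(halos, stars))  # [1.e+09 2.e+10 5.e+10]
```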
But, that is perhaps an unfair comparison. We use machine learning after all, and abundance matching uses nothing but a single number per halo.
We should then maybe compare to other machine learning methods that use a lot of information but only from the final halo. That seems more fair.
But as is apparent, Mangrove still outperforms this by about a factor of 2 when measured by how much the two methods deviate from perfect predictions.
Mangrove can also predict a lot of other things, like the amount of cold gas in a galaxy (Mcold), how fast it is making stars right now (Star Formation Rate/SFR), how much metal there is in the gas (Zgas) and the mass of the black hole at its center (MBH).
Mangrove does better than all of the above-mentioned methods across the board.
Other fun things
Now that we have a well-working model, we can try to figure out where the improvement comes from. Is it in the first few timesteps, the initial conditions? Can we make galaxies at different points in time?
The answer to the last question turns out to be yes. If we take a single Mangrove model and train it at several different times (we encode each time as a redshift, usually written as z), we see that it does well not only at those times, but also in between!
In terms of removing things from the model, we tried to simply cut down the merger tree from above, leaving out the very earliest halos. We found that the closer you get to the present day, the more each halo matters. The effect is so strong that removing the first half of the merger tree doesn’t matter much, but removing the final 1% is measurable!
Conclusion
If there is one thing that one should remember from this paper, it is that the long-standing ‘nature versus nurture’ debate is not just for humans. It also matters for galaxies, and with our code, Mangrove, we can take both aspects into account, and make big improvements!
It matters how we build our galaxies!
Code
Anyone who wishes to do anything similar to this project is encouraged to read the paper and check out the GitHub. All the code used for the project is free to use, although I do not guarantee that it will be easy to find your way around in it!
¹ Semi-Analytic Models are a kind of simulation that is run after running a simulation of only the dark matter in the universe. A good early reference is https://arxiv.org/pdf/astro-ph/9802268.pdf

² Any time I mention a “learnable function”, think of it as a very basic neural network, what we call a Multi-Layer Perceptron (MLP). An illustration of an MLP can be found at https://en.wikipedia.org/wiki/Neural_network#/media/File:Neural_network_example.svg