LCTP Fellow in Astrophysics/Cosmology
I am currently an LCTP fellow working on cosmology and astrophysics with Bayesian machine learning methods. Prior to this, I was a PhD student at UC Riverside, advised by Simeon Bird, applying multi-fidelity machine learning techniques to cosmological galaxy formation simulations (see PRIYA-MF: arXiv:2309.03943). I was a Future Investigator supported by a NASA FINESST award (Inside UCR story). In addition, I collaborate with Roman Garnett (WUSTL CSE) on developing Bayesian machine learning tools to find damped Lyman-alpha absorbers (DLAs) (arXiv:2003.11036), currently used in the DESI Lyman-alpha analysis. I am also interested in using hierarchical population inference to analyze LIGO's gravitational-wave data, shedding light on the formation channels of binary black holes (with Will Dawson and Scott E. Perkins from LLNL). During the last year of my PhD, I was a Graduate Fellow at Carnegie Observatories, working with Fakhri S. Zahedy to build a Lyman-limit system finder.
I am currently working with Camille Avestruz (UMich) on propagating the uncertainty from blending effects (from the ML tool BLISS) into the galaxy cluster richness-mass relation. At UMich, I expect to spend the majority of my time on DESI (primarily Lyman-alpha) and LSST DESC (cluster cosmology). Continuing my previous Lyman-alpha emulator work, I am also using the PRIYA simulations to infer cosmological parameters from the P1D of KODIAQ and XQ100. Besides astrophysics, as a side project, I am working with Shang-Min Tsai (UCR) on developing physics-informed neural networks for exoplanetary atmospheres. I spend the rest of my time thinking about new project ideas for binary black holes.
I believe a video is worth a million words. So instead of writing lots of words, I made some videos. There are ~6 million words below.
(Background image: Gas particles in the Astrid simulation (120 cMpc/h box).)
(Demo: Multi-fidelity emulation on 1D toy data (see jibanCat/nargp_tensorflow).)
(Demo: Bayesian optimization with increasing observational noise.)
(Demo: Inferring quasar redshifts using a data-driven Gaussian process (arXiv:2006.07343).)
(Demo: Inferring damped Lyman-alpha absorbers (DLAs, typically neutral hydrogen gas around high-z dwarf galaxies) using Gaussian processes and Bayesian model selection (arXiv:2003.11036).)
We have extended the idea of multi-fidelity modeling and developed MF-Box (arXiv:2306.03144), a method for emulating N-body simulations that incorporates multiple particle loads and physical scales. Using two types of cost-effective simulations, L1 and L2, we construct a model that accurately predicts summary statistics derived from the high-fidelity simulation across a wide range of cosmological parameter space. L1 runs a low particle load in a large volume, capturing large scales, while L2 runs the same low particle load in a small volume, resolving smaller scales. By combining simulations from different volumes, we achieve precise interpolation in a high-dimensional parameter space across scales, significantly reducing computational costs compared to running an additional high-fidelity simulation. MF-Box is versatile and can be applied to various simulation suites, providing an effective way to interpolate simulations across multiple physical scales using affordable approximations.
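To make the scheme concrete, here is a minimal toy sketch: one cheap node captures large scales, another captures small scales, and a Gaussian process trained on only a handful of expensive runs learns how to merge them. The 1D functions, the scikit-learn usage, and all hyperparameters below are illustrative assumptions for the demo, not the actual MF-Box/PRIYA pipeline.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

# Invented stand-ins for the simulation nodes (1D parameter "theta"):
high_fi = lambda t: np.sin(2 * np.pi * t) + 0.3 * np.sin(16 * np.pi * t)  # expensive, all scales
l1_node = lambda t: np.sin(2 * np.pi * t)               # cheap: large box, large scales only
l2_node = lambda t: 0.3 * np.sin(16 * np.pi * t) + 0.1  # cheap: small box, small scales only

theta_cheap = np.linspace(0, 1, 100)[:, None]  # many cheap L1/L2 runs
theta_hf = np.linspace(0, 1, 8)[:, None]       # only a handful of expensive runs

# Emulate each cheap node from its own plentiful runs.
gp_l1 = GaussianProcessRegressor(C() * RBF(0.2), normalize_y=True).fit(
    theta_cheap, l1_node(theta_cheap).ravel())
gp_l2 = GaussianProcessRegressor(C() * RBF(0.05), normalize_y=True).fit(
    theta_cheap, l2_node(theta_cheap).ravel())

# The high-fidelity GP takes the parameter plus both cheap predictions as inputs,
# so the few expensive runs only need to teach it how to merge the two nodes.
def features(theta):
    return np.hstack([theta,
                      gp_l1.predict(theta)[:, None],
                      gp_l2.predict(theta)[:, None]])

gp_hf = GaussianProcessRegressor(C() * RBF([0.2, 1.0, 1.0]), normalize_y=True)
gp_hf.fit(features(theta_hf), high_fi(theta_hf).ravel())

theta_test = np.linspace(0, 1, 300)[:, None]
err = gp_hf.predict(features(theta_test)) - high_fi(theta_test).ravel()
print("max |error| across test parameters:", np.abs(err).max())
```

In the real method the inputs are cosmological parameters and the outputs are summary statistics such as the matter power spectrum, but the structure is the same: many cheap runs cover the parameter space, and a few expensive runs teach the emulator how to stitch them together.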
Based on the concept of multi-fidelity emulation, our research group has designed PRIYA (arXiv:2306.05471), a multi-resolution simulation suite that spans a 9-dimensional parameter space of cosmological and astrophysical/thermal parameters. We ran 48 low-resolution simulations with 1,536^3 particles in a 120 Mpc/h box, along with 3 high-fidelity simulations with 3,072^3 particles. These simulations use a comprehensive galaxy formation model based on Astrid, including supernova and AGN feedback, resulting in a realistic population of DLAs. By combining the 48 low-resolution simulations with the 3 high-resolution simulations, we have built a multi-fidelity emulator that interpolates with less than 1% error. This emulator provides a resource-efficient way to explore how summary statistics change with our galaxy formation model and cosmology. Moreover, it will be used for Bayesian inference in future analyses of Lyman-alpha forest data.
I am interested in using machine learning techniques to model quasar emission, the light produced by accreting supermassive black holes. I have made improvements to a Gaussian process (GP) based model originally developed by Roman Garnett (arXiv:1605.04460). My modifications in Ho-Bird-Garnett (2021) allow for the identification of damped Lyman-alpha absorbers (DLAs) in the intergalactic medium, account for sub-DLAs, and add a model for the quasar mean flux (arXiv:2003.11036, arXiv:2103.10964). Furthermore, we perform probabilistic inference of quasar redshifts using the same Gaussian process model, as detailed in Fauber et al. (2020) (arXiv:2006.07343).
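As a rough illustration of the Bayesian model selection behind the DLA finder, the toy sketch below compares the evidence of a "no absorber" Gaussian process model against an "absorber" model whose mean and covariance are modulated by an absorption profile, marginalizing over the absorber position. Everything in it is an assumption made for the demo: the Gaussian trough stands in for a Voigt profile, and the hand-written mean and kernel stand in for the trained quasar continuum model; it is not the pipeline of arXiv:2003.11036.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
wave = np.linspace(1170.0, 1250.0, 200)      # toy rest-frame wavelength grid [A]

# Toy GP prior over the quasar emission: a smooth mean with a fake Lya peak,
# plus a squared-exponential covariance and observational noise.
mu = 1.0 + 0.5 * np.exp(-0.5 * ((wave - 1216.0) / 10.0) ** 2)
K = 0.05 ** 2 * np.exp(-0.5 * ((wave[:, None] - wave[None, :]) / 5.0) ** 2)
noise = 0.02 ** 2 * np.eye(wave.size)

def absorption(center, width=2.0, depth=0.9):
    """Toy Gaussian absorption trough standing in for a Voigt profile."""
    return 1.0 - depth * np.exp(-0.5 * ((wave - center) / width) ** 2)

# Simulate an "observed" spectrum that really does contain an absorber at 1190 A.
flux = rng.multivariate_normal(mu, K + noise) * absorption(1190.0)

# Evidence of the null model (no DLA) vs. the DLA model, where the DLA model
# multiplies the GP mean and covariance by the absorption profile and
# marginalizes over the absorber position on a coarse grid.
log_ev_null = multivariate_normal(mu, K + noise).logpdf(flux)
centers = np.linspace(1175.0, 1210.0, 36)
log_ev_each = [
    multivariate_normal(
        mu * absorption(c),
        absorption(c)[:, None] * K * absorption(c)[None, :] + noise,
    ).logpdf(flux)
    for c in centers
]
log_ev_dla = np.logaddexp.reduce(log_ev_each) - np.log(len(centers))  # uniform prior

# Posterior probability of hosting a DLA, assuming equal prior odds.
p_dla = 1.0 / (1.0 + np.exp(log_ev_null - log_ev_dla))
print(f"P(DLA | data) = {p_dla:.3f}")
```

The real model also marginalizes over absorber redshift and column density with physically motivated priors and uses a covariance learned from data, but the core step, comparing GP marginal likelihoods of spectra with and without absorption, has the same structure.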
(Thanks to Yongda Zhu for sharing the awesome XQR-30 image made by Dr. Anna-Christina Eilers, which I used as a reference to make the above DLA figure.)
The concept of multi-fidelity modeling is straightforward: we use many inexpensive approximations to explore different parameter settings, and rely on only a handful of costly examples to correct those approximations. To illustrate, consider the hiring structure of a university. Numerous graduate student workers tackle the groundwork of exploring various research topics, while only a few professors are needed to provide guidance and refine the students' outputs. You can find more details in Ho-Bird-Shelton (2021), arXiv:2105.01081. Recently, we have applied the multi-fidelity technique to emulate the Lyman-alpha forest; see Fernandez-Ho-Bird (2022), arXiv:2207.06445. As part of my FINESST grant, I will continue using multi-fidelity methods to model halo mass functions and weak lensing statistics.
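Here is a minimal sketch of this "many cheap runs, few expensive runs" idea, assuming a simple additive two-fidelity scheme: emulate the cheap function from many runs, then learn the smooth discrepancy from only a handful of expensive runs. Our papers use more flexible GP constructions; the toy functions and hyperparameters below are purely illustrative.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

f_low = lambda x: np.sin(8 * np.pi * x)            # cheap: right wiggles, wrong trend
f_high = lambda x: np.sin(8 * np.pi * x) + x ** 2  # expensive "truth"

x_low = np.linspace(0, 1, 60)[:, None]                    # many cheap runs
x_high = np.array([[0.0], [0.25], [0.5], [0.75], [1.0]])  # few expensive runs

# 1) Emulate the cheap function from the plentiful low-fidelity runs.
gp_low = GaussianProcessRegressor(C() * RBF(0.1), normalize_y=True)
gp_low.fit(x_low, f_low(x_low).ravel())

# 2) Learn the discrepancy between fidelities from the few expensive runs.
gp_delta = GaussianProcessRegressor(C() * RBF(0.3), normalize_y=True)
gp_delta.fit(x_high, f_high(x_high).ravel() - gp_low.predict(x_high))

# 3) Predict the expensive function everywhere: cheap emulator + correction.
x_test = np.linspace(0, 1, 200)[:, None]
pred = gp_low.predict(x_test) + gp_delta.predict(x_test)
print("max |error|:", np.abs(pred - f_high(x_test).ravel()).max())
```

The cheap runs do the heavy lifting of mapping out the parameter dependence, while the GP correction trained on five expensive points absorbs the remaining bias.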
My research is driven by my passion to ease the workload in academia. Over time, I have noticed many people becoming discouraged with research because they were told that spending long hours on tedious tasks was the only way to produce results. However, I strongly believe that we can use automated tools and better models to replace repetitive work, allowing academia to focus more on creative thinking.
“To the Bayesian all things are Bayesian.”
- I. J. Good, Good Thinking: The Foundations of Probability and Its Applications.