Have you considered adding one or two other languages here? I've done some work comparing performance on N-body simulations, and even PyPy is several times slower than JavaScript, which is several times slower than Java and Go, which are in turn several times slower than C, C++, and Rust. In my experience, if you care at all about the CPU performance of your code, you shouldn't have chosen Python in the first place. Python is getting better, but it is still ~100x slower than C/C++/Rust on numerically intensive tasks. (I will note that we even tried using NumPy to speed up the Python version, and every attempt to do so on large N-body simulations produced slower code.)
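
For anyone unfamiliar with the workload, the hot loop in an N-body comparison of this kind typically looks something like the sketch below. This is not the benchmark code referred to above, just a minimal direct-summation kernel in plain Python; the function name `accelerations`, the softening parameter `eps`, and the unit choice `g=1.0` are illustrative assumptions.

```python
import math


def accelerations(pos, mass, g=1.0, eps=1e-3):
    """Direct-summation gravitational acceleration on each body, O(N^2).

    pos  : list of [x, y, z] positions
    mass : list of masses, same length as pos
    g    : gravitational constant (assumed 1.0 in simulation units)
    eps  : softening length that avoids division by zero at small separations
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        xi, yi, zi = pos[i]
        for j in range(i + 1, n):
            dx = pos[j][0] - xi
            dy = pos[j][1] - yi
            dz = pos[j][2] - zi
            r2 = dx * dx + dy * dy + dz * dz + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            # Each pair is visited once; apply equal and opposite forces.
            si = g * mass[j] * inv_r3
            sj = g * mass[i] * inv_r3
            acc[i][0] += si * dx
            acc[i][1] += si * dy
            acc[i][2] += si * dz
            acc[j][0] -= sj * dx
            acc[j][1] -= sj * dy
            acc[j][2] -= sj * dz
    return acc
```

The inner loop is almost entirely floating-point arithmetic on scalars, which is the case where interpreter overhead per operation dominates and compiled languages tend to pull far ahead.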