What have I done this week?

So unfortunately I found out the hard way that a MongoDB-Postgres migration didn’t work out as great as I thought it would. Preliminary benchmarks showed the Doctrine/Postgres build was performing up to 40% worse than the existing MongoDB-based software.

Trying to figure out why this is, I point to the overhead of Doctrine. It wasn’t a problem with the queries – with what was tested, it was only querying on primary keys. From what I found when profiling, the performance loss seems to stem from a number of issues:

  • Object hydration took just over 5% of response time. While it is possible to change the hydration strategy to demarshal data to a different type (scalar, array, etc), we were already hydrating Mongo documents into our own objects without a major performance penalty. We’re still needing JSON in some places where we prefer to be schemaless; I think the MongoDB PHP library has an advantage here as a) it’s built around documents and b) it can defer to the lower-level C++ PHP extension.
  • Abstraction in general, since Doctrine is its own domain and queries are written in DQL and not a driver-specific dialect. It also makes use of a lot of cache to work around to needing to deal with your database schema manifest, however you have defined this.
  • backpack.tf’s MongoDB layer is incredibly thin. We are using the userland library directly.

I attempted many optimisations to remove the low-hanging fruit, and I did my best to stick to best practices for both Doctrine and Postgres. There were some micro-optimisations I did not try, such as writing queries directly in the Postgres dialect, but that kind of ruins the point of using an ORM.

backpack.tf is a complex piece of software. If we were starting from nothing, I’d be more inclined to be using Postgres. But is way too late to switch the entire DB stack without introducing new, unexpected issues. Thus, we’ll keep it as it is.

It’s a shame, because I really like Doctrine. I like the object persistence model as a pattern – the last time I used this, it was in the Python SQLAlchemy library, which I also enjoyed using.

In lieu of this, back to bugfixing. I think we’ll stick with Mongo, because I think there’s a few tricks I can do to boost the performance.

What am I going to do next?

Don’t feel like taking on any big projects for a little bit, since I need to recover from being disappointed over the results of what was tried.

Perhaps it would be interesting to write a performance-oriented persistence layer for PHP MongoDB – there is a Doctrine extension for it, but again, overhead, and it is targeting the older PHP5 library. Thus, there is additional overhead for PHP7 users who need to use an adapter. If designed right, a standalone high-performance persistence library might come in useful…

I’m taking out the “What should I try to improve?” section from these posts because I’m always stuck writing meaningful content there. I think it’s more effective if this can be carried through subtext.