Quantcast
Channel: dbi Blog
Viewing all articles
Browse latest Browse all 2879

What DEVs need to hear from a DBA and why SQL and RDBMS still matters…

$
0
0

The Average case

You are a good developer and you know all the hot trends, you know how to build an application completely. From the frontend part to the RESTful API backend, to the ORM and the database you’ve done it all and seeing it all.
But now that you participated in numerous projects and build substantial code, you started to receive feedback from your Sysadmins and Users of applications you coded a few years ago. The application is starting to have performance issues…

-“Simple ! The number of users increased ! The database is now 600GB ! We should provide more resources to the PODs and VMs (my code is good and don’t need rewriting; refactoring was done properly…).”

Makes sense, but the sysadmins tripled the number of CPU and Memory without any benefits whatsoever.

-“Look the database server is too slow the queries are not fast enough !
A DBA should be able to fix that !”
-“We don’t have any, we should call a consultant to make a performance review and help us out of this mess. Customers are still complaining, it is time to invest…”

That’s where a DBA consultant (me) comes along and performs required maintenance and tries to apply standard best practices, tune some parameters here or there and exposes the most intensive queries that need tuning….
Then the DEV Team explain they are using an ORM and can’t “tune Queries” or touch a the SQL code because they don’t want to, it would have to many implications on business logic and architecture, and also they don’t know SQL all that much; it is an old language they used back in their early days as developer.

1. Why SQL and RDBMS (still)

As a developer don’t overlook SQL and RDBMS like PostgreSQL. It is still the best way to store and access data when relation between data is important and when that relation can be defined before hand and is stable (which is usually the case in businesses).

In the following example there are several benefits of using a RDBMS :

  • Data integrity: Enforced by foreign keys and other constraints the table design ensures that the data remains accurate, and consistent, preventing issues like orphaned records.
    In this case, an order cannot exist without a customer, and a line item cannot exist without an associated product and order.
  • Complex Queries: RDBMS are made for JOINs between tables. All the architecture of an RDBMS is helping providing facilities to retrieve and store data efficiently.
  • Transaction support: If your requirements are like in this example that an order with multiple steps in it (updating inventory, creating an order record…) must complete su
   SELECT o.order_id, c.name, p.name, od.quantity, p.price, (od.quantity * p.price) AS total_cost
   FROM Orders o
   JOIN Customers c ON o.customer_id = c.customer_id
   JOIN Order_Details od ON o.order_id = od.order_id
   JOIN Products p ON od.product_id = p.product_id
   WHERE c.name = 'Bob Kowalski' 
   ORDER BY o.order_date DESC
   LIMIT 100;

2. What about NoSQL databases ?

NoSQL DB like MongoDB or Cassandra are designed for scalability and flexibility in storing unstructured data, complex joins and transactions are not supported in the same way. They are more suitable if your data structure changes frequently and the application demands high write throughput and horizontal scalability.
In our example an RDBMS like MySQL, MariaDB or PostgreSQL is the best choice to store the “statefullness” of your application but you could use NoSQL DBMS like Redis to cache some data and help not putting to much pressure on the RDBMS by making less calls to it. No one needs to retrieve the same data 50000 times per minute… Use the cache Luke… use the cache…

It would be silly to tell you “don’t use NoSQL, RDBMS is king !”.
Use them both and understand their limitations.
DEVs love their NoSQL because having a schema-less architecture help them scale easily and achieve better integration with CI/CD processes, which is traditionally difficult with RDBMS, bonus point for not having to talk to a DBA (which I understand, I talked to myself already enough:)…
In this case and maybe in life in general, if want to bypass standardized models and rules that were made by previous people, do that only if you understand the cost of doing so and why they exist in the first place.

3. Is the ORM my problem ?

Yes and no. ORMs are one of the good thing that happened to Developers and DBAs. It help creating better code in most of the case and they become quite tunable nowadays.
So please keep your ORM you need it today since it creates a level of abstraction that is helpful for simple queries and scalable logic and gets you faster to deadlines.
The think is that you have to understand there limitations. And as a DBA I am fine with using 100% of ORMs SQL, up until you have a performance issue.
If for some reason your application gets some success and is used enough so that you pushed by the business to do better, ask the DBA to provide you the top 10 queries of the last weeks and understand how you can tune those ones and maybe not use the ORM in those cases.
When the ORM is producing suboptimal queries (queries not performing well enough for business), it might be for several reasons :

  • Abstraction: To produce queries, an ORM has to generalize them is such a way that it can cause performance issues. Because the ORM can’t think of all the cases and doesn’t know your data.
  • N+1 Problem: Commonly known, this issue is generating more roundtrip calls than it is advisable to the RDBMS and is well documented in most documentation since the early 2000s. In general just think about the data you need and try to understand if you can solve it by creating a query with appropriate JOINs and fetch the required data in one go.
    ORMs (Hibernate or Entity for example) allow specifying a batch size for fetching related entities. This means instead of one query per related entity, the ORM will group several entities in fewer queries. Some other ways on the RDBMS side can mitigate those aspects as well like proper indexing, views, materialized views,…
  • Complex joins: what an RDBMS can’t handle a lot of JOINs ?! It depends what you mean by a lot, but generally RDBMS like SQL Server are having a hard time with more than 6 or 7 JOINs, with PostgreSQL you could go a bit further and use GEQO algorithm at the cost of planning time of your execution plan, but overall, an optimizer can’t produce a proper query plan when the cardinality tends towards infinity… which is the case when your ORM queries is generating queries with 86 JOINs !

    Note : Understand that it is not just about the number of JOINs. Schema Design indexes and the optimizer capabilities are critical aspects of performance levels, most of the time people are hitting limitations in a RDBMS because they didn’t recognize their existence.

If you want to get more info and best practices about ORM I suggest reading this : Hibernate Best Practices (thorben-janssen.com)

4. Performance optimization paths for RDBMS:

In addition to what has been said already, you can also optimize your instance to work better.
Earlier I talked about limitation on the number of JOINs an optimizer can handle, it is important to understand that optimizer capabilities are influenced by schema design, indexes and your queries ! Like said often by Brent Ozar, you have 3 buttons that you can play with to get better performance : TABLE design, QUERY design and Resources.
Most people are often play with the third one because it is easy to request for more CPU and Memory… cloud provider make you pay for that, it less the case nowadays though.
So for me you can request to get additional training for your team, numerous companies offer performance training dedicated for Oracle, SQL Server, PostgreSQL, MariaDB, MySQL,…. and DBI services is one of them.
But you could also take leverage first of modern monitoring and tools like Query Store on SQL Server or PGANALYZE on PostgreSQL to understand better where your performance bottleneck is.
In most case, it is easy to query for the top 20 resources intensive queries, usually in those in will have 3 or 4 that are consuming more resources by 10x. Try and understand why that is and get specialized help if you can’t.

Key takeaways

  • It is still a matter of using the proper tool for the proper job. Building an architecture based on ORM is a good practice and even if you need to store JSON in the database, I am still up for that ( PostgreSQL supports it in the best way possible ).
  • Be prepared that if along the way you need to get more performance at scale, you’ll need to be prepared for a hybrid approach. Using ORM for simple CRUD and raw SQL for the relevant queries. ORM do support writing Native SQL Queries, don’t be afraid to use it.
  • In addition use cache capabilities when you can.
  • Consult with your Sysadmins and DBAs, they know stuff on your app you want to hear. Trust me on that, they want to help (most of the time:).
    Often different teams don’t have the same monitoring tools and don’t look at the same metrics. It is important to understand why.
  • Be sure to update your knowledge. Often enough I still see DEVs that still are having hard time understanding key concepts or evolution of the best practices… (stored procedures, anti or semi-joins, ….etc).

I do understand that most DEVs are not building a new app every morning from scratch most of them inherit code and logic from old applications build some time ago, architectural decisions are not so often in there hands. Even then, I think we are at a corner stone of IT evolution, and the next years will be filled with opportunities and new tech, but for the past years most of the projects I have seen failed performance wise, were due to miss communication and over simplification of complex systems. Platform engineering should solve that an put away the overhead of managing all systems without knowing them all…

L’article What DEVs need to hear from a DBA and why SQL and RDBMS still matters… est apparu en premier sur dbi Blog.


Viewing all articles
Browse latest Browse all 2879

Trending Articles