Stored Procedures

Stored procedures in RDBMSs are very commonly pooh-pooh’d as being “bad”. These are some interesting anecdotes and more recent development practices that might them appropriate in suitable circumstances.

Started with these articles by Sivers:

and continued with discussions here: Simplify: move code into database functions | Hacker News, and some links from that discussion:

2a. Your database nodes are network IO bound and this is due to applications requesting more data than it needs, and that logic can be performed in the database.

  1. You have a very strong DBA/Developer that can manage your data access layer that your application uses as well as code the database. This can abstract the “how” of data storage from the application developer, which can be very useful in a data-centric large application.

However, SPROCs are bad if:

  1. You have to switch DB providers due to scalability, or cost issues, etc. The switching cost can be huge if there is a ton of logic buried in your database.

  2. Your database nodes are CPU bound. In this case you’d want to do as little logic as possible in your DB. This is more rare nowadays, but used to happen.

  3. SQL (or whatever-sproc-language) is not a core language you want your team to have to become experts in.

  4. You don’t like adding more code. You’ll end up generating and writing yet another layer of code, this time for stuff that lives inside your database. This code has to be maintained, versioned, etc.

On microsoft stack:

Microsoft’s answer (Visual Studio SQLPROJ/Data Dude) exposes migrations for the horrible metahistory hack that they are. SQL projects are stored in source control as though they were C++/C#/what-have-you. Just like the rest of our code, the 347kloc SQL portion is handled on by CI and the migration (as well as the creation) scripts are generated for us - based on the schema differences (that it determines for us) between our last release and the current release.

So far as SQL Server goes, maintenance and versioning woes are an outright myth.

More pros:

  1. A policy of using SPROCs only is a good way to significantly reduce the risk of SQL-related vulnerabilities. If developers are required to justify why they are sending raw queries to SQL they will have a hard time introducing e.g. an injection vulnerability.

  2. A lot of the overhead of query execution (parsing, resolving objects, etc.) is done when you CREATE or ALTER the procedure, this can results in a performance benefit (especially if the application is chatty).

More cons:

  1. Query plans for SPROCs are aggressively cached, in extremely rare circumstances this can play havoc with performance (direct contradiction of my own pro 5).

  2. Triggers+family are invisible logic. They can cause confusion during debugging.

Sivers adds comment here:

How would somebody implement this for a team with versioning, source code view, and deployment?:

  1. One can create the stored procedure and store it in an .sql file which can be version controlled like any piece of code. These can then be deployed in a number of ways.
  2. We keep our schema in an XML file which is kept our source control repository alongside the application code. We have a tool to generate the XML and to take the XML and compare with a database and altering it to match the XML schema. Though we never delete data due to backwards compatibility, so no dropping of objects, reduction of column widths etc. When deploying a new version the database is processed first, if successful the new executable is made available. If not it calls home so we can take a look. To make changes in development, we commit a new XML to dev branch, we have a commit hook to upgrade the dev database.

  3. I was this guy years ago in a waterfall style shop. We were a SQLServer c# all MS vertical and we did just this. All stored procedures, table definitions, etc were checked in. There was a whole release process and QA to bundle the right versions of things into deployables which I did. It didn’t work flawlessly but it worked and when it failed it was straightforward enough that figure out why was easy.

  4. If you are using SQL Server, you can create a SQL Server project in Visual Studio and keep all of the source code for your database objects (tables, views, functions, procedures, etc) in git. The tools in Visual Studio will also help you create migration scripts and do schema/data diffs.

  5. sqitch is great! https://github.com/sqitchers/sqitch

Concerns around Stored Procedures

Version Control

Testing

Debugging

Logging

Triggers