Today I am motivated to write a couple of notes on SQL after spotting an article on Slashdot proclaiming SQL is the most sort after programming skill in January 2020.
SQL is a constant
SQL first appeared in 1974, nearly 50 years ago. No other computer language has remained as popular and consistently extended its reach for 50 years.
As programmers we constantly learn new languages, variations and concepts to keep up. Nothing stays still. Web developers in particular have to stay on top of changes in browsers, HTML, JavaScript, security issues etc.
SQL is a beautiful, refreshing exception to this mayhem. I learnt SQL once, 28 years ago, and have applied the knowledge most days when I am working.
Granted, there are other programming languages which have been around longer than SQL, COBOL springs to mind. Although COBOL is deployed in many critical systems, it has not extended into new areas such as embedded devices, web, or mobile.
SQL pronounced SEQUEL
SQL stands for Structured Query Language. As my university professor was keen on emphasizing, the name is a bit meaningless.
a) SQL is not particularly structured.
b) SQL is used for more than querying, SQL is also used for inserting, updating and deleting data.
c) SQL is not really a language, SQL is really a means of accessing data in a relational database rather than a complete programming language.
SQL syntax is standardized
Database manufactures have done a reasonable job of maintaining SQL as a standard syntax across relational database platforms. Most manufactures have added their own extensions over the years, possibly motivated by the slow pace of changes in SQL ANSI standards.
If programmers stick with core SQL syntax, avoiding manufactures extensions, it is usually possible to switch relational database systems with few changes.
That said, in all my programming jobs, I have never come across a company porting from one SQL database to another.
For my own business, I cannot see us migrating database platforms. We use some propitiatory SQL extensions, notably an extension in MySQL which lets you write INSERT statements with the same SET syntax as update statements. We find this easier to read than the traditional INSERT syntax.
SQL for managers
SQL was intended for managers to query company data in an easy and flexible language, bypassing the need for programmers. I have only come across a handful of non technical people using SQL. There are some business analyst types who are happy to get the hands dirty with SQL, but mainly SQL has remained in the domain of the programmer.
SQL is arguably the best technical skill for a not technical manager or analysis to learn. SQL is easy to get started with and the manager who understands SQL can have a more productive experience communicating with programmers.
SQL for techies
All programmers should spend the time to master SQL, then use it and embrace it, for three great reasons
1) As stated above, learning SQL is likely to be the only technical skill you can rely on to remain useful, current and portable for a long time to come.
2) SQL is powerful. As a mostly declarative syntax, what can be achieved in a 3 simple lines of SQL may take 20 - 30 lines of procedural language.
3) It is highly likely your next job will require SQL skills. I have been a professional programmer for 26 years and every job I have had required some SQL skills.
Don't Abstract the SQL
With the advent of object oriented programming comes the issue of mapping relational data to objects/classes, IE. Object Relational Mapping (ORM). There are many tools and frameworks that will do the mapping automatically and abstract the SQL from the developer.
In my opinion, these tools should be avoided, as they often introduce whole new languages, paradigms and systems, but provide no real benefit in either simple or complex mappings. In a simple one to one Object Relational mapping, the task is simple and no tool is required. In complex mappings, the ORM tool adds more complexity and manual intervention is often required, negating the usefulness of the automated ORM tool.
SQL code should be visible in your model objects. Programmers need to understand what is going on when objects are being retrieved or data representing objects is manipulated in the database.
SQL Gota'yas for programmers
1) Missing database indexes are often the cause of significant performance issues. Performing a select command on a table which does not have an appropriate index usually results in the whole table being loaded into memory and scanned, which is a lot of work. Creating indexes is one of very few performance boosting silver bullets in the developers toolkit. Creating an index is a single command, that can improve an applications performance 10 fold.
2) Multiple SQL selects when one will do. A classic example, imagine looping over an SQL query result and for every row performing another SQL select. This is substantially slower than performing a single select statement with appropriate table joins. Consider the following:
$result = runQueryGetResults("select customer_name,customer_id from customer");
loop $result as $row {
$count_of_orders = runQueryGetSingleValue("select count(*) from order where customer_id = " + sqlEscape($row->customer_id));
print $row->customer_name + ' ' + $count_of_orders;
}
Each call from your programing language to the SQL database has an overhead. The above example, given 1000 customers in the system, requires 1001 database calls, which represents a significant performance hit.
3) Queries become too complex, basically the opposite of issue 2 above. Relational databases have an optimizer which converts SQL into a procedural plan to retrieve the data. Complex queries can confuse the optimizer and lead to poor performance. As programmers get more confidence with SQL, they will create more complex queries. It becomes a judgement call when to split a query into two or more queries, and use regular procedural programming to massage the query outputs.
4) SQL injection, from not escaping SQL. Every variable included in an SQL statement needs to be escaped. Even if you are sure you know a variable is safe today, there is no guarantee the the next guy wont fiddle with the variable and make it unsafe. The variable needs to be escaped at the point it is put into an SQL statement.
In Conclusion
Programmers need to hone their SQL skills and consider doing more with SQL rather than hiding it away and treating it as a chore. SQL is universal, powerful and not likely to go away any time soon.