Friday, 13 July 2012

What one can and cannot do with <cfqueryparam>

G'day:
Looking at how other languages handle DB interaction, <cfquery> is one of the best things about CFML.  It's just so easy.  We all know how it works, but here's a pointless reminder of what you already know:

<cfquery name="q" datasource="dsn" result="stResult">
    any old string you like
</cfquery>

In its most basic usage, ColdFusion takes the string and sends it to the DB driver.  The DB driver passes this to the DB, the DB does something with it and sends back a response to the DB driver and then to ColdFusion, and ColdFusion exposes via the variable named in the NAME and/or RESULT attributes (I'm not going to repeat the docs, which are here: go read 'em).

Another thing that ColdFusion does really nicely is play with strings, inline in the body of your CFM file.  We might have this SQL to pass to the DB:

SELECT  someCol, someOtherCol
FROM    someTable
WHERE   someDifferentCol = [some dynamic value needs to go here]

Obviously we can just swap the dynamic bit out for a ColdFusion variable:

SELECT  someCol, someOtherCol
FROM    someTable
WHERE   someDifferentCol = #someVariable#

It's something we do every day, but it's quite cool. This is leverages the fact that the bit in between the <cfquery> tags is just a string. And strings can be composed via variables.

We can go bananas with this, and end up with this sort of thing:

SELECT  #listOfCols#
FROM    #theTable#
#expressionConnector#    #theColumnToFilterOn#     #theComparisonOperator#     #someDynamicValueHere#
#expressionConnector#    #anotherColumnToFilterOn# #anotherComparisonOperator# #someOtherDynamicValueHere#
#entireOrderByClause#


(Note: I do not advocate the writing of generic SQL code like this; it's just for the sake of an egregious example).

Aside: why would I want to use a variable for expressionConnector? Well if the SQL is being built dynamically from an unknown sequence of optional column/filter pairings, only the first will be a WHERE, the subsequent ones will be an AND, so this variable could be initialised as WHERE and then changed to AND after the first filter is output in a loop over the optional filters.  I would not do this, but that's why one might.

And provided all those string variables have appropriate values to make a syntactically correct SQL statement, you're away laughing.

Now… 350-odd words into this post about <cfqueryparam>, I’m finally getting around to mentioning it.

OK, so there's <cfqueryparam>. We've all read a hundred posts like this wherein it's stated that one must use <cfqueryparam> whenever using dynamic values to protect from SQL injection.  And far fewer posts that explain the actual reason for parameterising one's SQL is because it's simply bad form to hard-code dynamic values into an SQL string, and it sucks for the poor DB as non-parameterised SQL is going to be different every time the DB gets it, so each statement needs to be separately compiled... resource overhead... performance... etc. That <cfqueryparam> also happens to mitigate some SQL injection risks is a side effect of parameterising one's SQL, not its raison d'ĂȘtre.

So what's this parameterisation malarcky?

There's two bits to an SQL statement: the SQL, and the values the SQL is acting with (the parameters).

In our first example (not the one that's just a stupid use of CFML variables... we'll get back to that shortly...), this is the SQL bit:

SELECT  someCol, someOtherCol
FROM    someTable
WHERE   someDifferentCol = 

And this is the parameter bit:

[some dynamic value needs to go here]

JDBC works in that these two bits can be sent to the DB separately: this allows the DB to compile the SQL bit, and then pass the parameters to the compiled statement (hey, it might not work exactly that way, before someone corrects me, but for all intents and purposes - and for the point I'm making here - that's a suitable analogy).

Thinking about the compilation bit, our DB can compile the SQL part and then we can pass "red" or "blue", or any other value we like for the [some dynamic value needs to go here] bit, but the compiled SQL is the same all the time. So the DB can cache that, and reuse it. There is actually a detectable difference in running a <cfquery> that needs to be compiled, and reusing one that hasn't been compiled, and needs to be before it is run.

OK, and so do this with <cfqueryparam>, we have this:

SELECT  someCol, someOtherCol
FROM    someTable
WHERE   someDifferentCol = <cfqueryparam value="#someVariable#">

CF passes the SQL and the parameter to the DB separately.

So far, so good, and all a bit obvious.

What I've seen happening a coupla times this week on StackOverflow is demonstrated by revisiting a variation of our ludicrously dynamic query, which I'll calm down a bit:

SELECT  #listOfCols#
FROM    someTable
WHERE   someDifferentCol = #someDynamicValueHere#

The person knows all about <cfqueryparam> being used for dynamic values to protect from SQL injection, so they tried this:

SELECT  <cfqueryparam value="#listOfCols#">
FROM    someTable
WHERE   someDifferentCol = <cfqueryparam value="#someDynamicValueHere#">


From a CFML perspective, this seems to make sense: instead of hard-coding the string, we can use variables, and then instead of hard-coding the variables, we can pass them as parameters.

Nope.

<cfqueryparam> is for passing parameters, not simply chunks of the SQL string.

A ColdFusion analogy of what we're trying to do here would be this:

<cfset theTagToUse = "CFSET">
<cfset theValueToUse = "bar"
<#theTagToUse# foo = "#theValueToUse#">

Obviously one can't do that. (Actually I do see people trying to do that too... but I'll leave that one for another day...)

So the thing to remember here is that <cfqueryparam> is for values, not simply for any old string. I guess it's CFML's flexibility with how <cfquery> works that muddies the water here.  I also think the whole notion of "<cfqueryparam> is for protecting against SQL injection" muddies the water here too. People should disabuse themselves of that way of thinking about it, then it all becomes more clear.

Bottom line: if you take it from the perspective that the value in the <cfqueryparam> tag needs to be something that is data then you should be OK: it needs to be either going into a column in the DB, or being compared to a value in a column of the DB.

And last but not least, more docs to read: <cfqueryparam>.

--
Adam

2 comments:

  1. Would you believe I just learned that you can do typeless cfqueryparams from your post? I've literally never written one sans cfsqltype, but after verifying that the att is optional in the docs I have to admit I'm subtly excited about the prospect. I'd love to see a follow up post demystifying ( presuming there's a similar mist ) why you would or wouldn't include a type, an argument for query-level data validation or always offloading it to another layer, etc. P.S. congrats on the shiny new blog!

    ReplyDelete
  2. Well, to be honest I only omitted the rest of the attributes so as to focus on what i was specifically talking about. I wasn't specifically meaning to leverage the optionality of the CFQSLTYPE attribute.

    In general I'd always specify the type, because I'm a pedant. I've written code that doesn't use them, but only for non-production code.

    I suppose a real-world situation might be if there's already application-level (ie: in CF code) validation on all the params being passed, so also type-checking via the parameter is over the top. Say you're passing a social security number and an email address and a date of birth. You'd've already validated for specific string patterns for the first two, so you are already sure that the values conform to those patterns: there's no need to specify a type and length. And with a DOB you've validated it's a past date, etc, so you already know it's a date.

    I'm thinking off the top of my head here.

    ReplyDelete