I'm working on a project for the School of Pharmacy on campus. They wanted me to write a program that pulls numbers from a database on Marijuana into a nicely formatted report that they could print off and send to government agencies that are interested in the data. One of the reports requires me to pull hundreds of individually calculated items from a MySQL database. A single MySQL query isn't able to do the job. When this report is completed, there will probably 75 to 100 queries. To think that this quarterly report has been done manually for almost 2 decades overwhelms me as I try to write this report software. I'm trying to reduce several weeks of work to prepare this report to a few seconds.
The entire project is done in PHP and MySQL. PHP and MySQL go hand-in-hand, but every time I write a query, I can't help but wonder how this could be made easier. Pulling one value out of a query looks like spaghetti code. Pulling 300 values from 100 different queries is an explosion of crappy code. MySQL queries required for this report look like this:
SELECT THC FROM samples HAVING MAX(THC);
This is just one query that will find the one sample in the database with the highest THC composition and report what that composition value is. There are hundreds more queries in the report similar to this one. Typically, when you run a MySQL query from PHP, here are the steps you have to take:
- Format the query command.
- Execute the query command.
- Retrieve the first row of results from the successful execution (usually in the form of an associative array).
- Transfer those values from the array to variables that actually mean something to the nature of the program (like $max_thc_composition).
If I have 100 queries, and I have to perform the same 4 steps for each query, that's 400 lines of code. Obviously, as any freshmen computer science instructor will tell you (because I am one), this needs to be regulated to a function. But how do you specify a query (which can return more than one value) and the variables associated with those results in a concise manner? If the function doesn't know the variables names going into the function, it can't assign those values coming out of the function.
SELECT THC AS max_thc_composition FROM samples HAVING MAX(THC);
The "AS" expression in MySQL will rename a column into something else. It can also be used to uniquely label a query. This may not seem useful in a single call, but when you have a few 100 calls to make, it's a real time saver.
//runQueries: Runs multiple queries. // Pre: An array of MySQL single-result queries where // each result has a unique "AS" expression. // Post: An associative array populated with all of those values. function runQueries($sqls) { $report = array(); //Execute each SQL query and drop the result in the report array foreach ($sqls as $query) { $result = mysql_query($query) or die("Query Failed: $query Reason: ".mysql_error()); $row = mysql_fetch_assoc($result); // Essentially, for each "AS" expression in the query, // look for that result in row returned and pull that value out if it exist. // If it does not exist, just give it a value of 0. preg_match_all("/ AS (\w+)/", $query, $keys); foreach ($keys[1] as $key) if (isset($row[$key])) $report[$key] = $row[$key]; else $report[$key] = 0; } return $report; }
This function takes an array of MySQL single-result query strings, where each result has an "AS" label identifying what this result means. This method executes each query and drops the result into a associative array where the "AS" labels double as the array keys. And of course, it uses a regular expression.
I'm suppose to be teaching a class on Python programming in the fall. Have you noticed that I haven't featured a Python script in this blog series yet? That bothers me.
2 comments:
That seems insane. Sounds like the tables aren't really storing things in a way that they need to be viewed if you have to do that many queries to produce a meaninful report.
Yeah that's a good idea - a Table View could simplify things for you.
Not sure how the performance will be though - depends the makeup of the table. You want to be sure that the fields in you where clause are indexed in someway.
Might be worth trying it out on one of your more simplier cases and then getting your db engine to explain the query to see what performace hit / boost you are getting by using views.
I expect though it will give you a productivity boost though if it takes less code to generate reports against your view rather then the way things are currently done.
Post a Comment