Generate SQL from Excel Spreadsheets or CSV Files

Darren Devitt
7 min readFeb 10, 2021

Gryphon Sql is a Windows desktop (64 bit) application that takes a CSV or Tab delimited text file and builds a SQL script from its contents. Version 1 has basic functionality that allows you to filter which rows and columns in your data file are used to build the script.

Download here

YOU provide the template SQL that’s used to build the script. YOU choose the columns in your file that form part of the final script. And YOU apply conditions to select the rows to use when building the script.

The app’s functionality is explained below using a sample CSV file built on the Mockaroo website, containing a list of 1,000 names and email addresses, along with the gender of each person and a status field that contains values between 1 and 6.

Let’s assume for this example that we want to create a SQL script that populates a new table with only male entries that have a status of 6. The status field may be used to flag whether an account is active, expired, closed, on hold, etc. The purpose of the SQL script we’re generating might then be to update a table in another database with the details of closed or dormant accounts.

Step 1: Load your Data File

When you open Gryphon Sql you’ll see the “Load Data File” button along with a few simple file options on the start page. Here you can specify whether the first row in the data file is a header record, and you can specify the delimiter of the data file: comma, tab, or something else.

Once you select a file to work with that file is loaded into the UI, and you proceed automatically to Step 2. The app has been tested with a large crime dataset from Chicago containing over 250,000 records.

Step 2: Choose the Important Columns in your Data file

The main body of the screen for Step 2 shows you the contents of the data file you’ve selected in a common grid format. If the grid looks incorrect, if the columns are not as expected, then return to Step 1 using the Previous button and ensure you’ve selected the correct delimiter. If this doesn’t work, chances are your data file is not in the format you think it is.

To the left of the grid is a section where you define variables. A variable in the Gryphon Sql context is a column in your data file that you want to use to populate your script, or a column that you want to use when applying conditions to a row.

Based on the script we want to generate, we will need to create variables for first_name, last_name and email. We’re also going to be creating conditions based on the gender and status of each record, which means we’ll need to create variables for gender and status as well.

After clicking the Add button, select the data column to use when creating the variable and ensure it has a unique and easily understandable name (Column1, Column2, Column3 are not good names).

Finally, choose a variable type from the drop down. This last step is important, as the variable type dictates what kind of conditional you can build around it.

The screenshot below shows our five variables correctly defined. At this point we can move on to Step 3. If you haven’t defined any variables, you can’t proceed to Step 3 as you have nothing to work with.

Step 3: Define Conditionals to Apply to These Columns

On this page, we begin by creating any conditionals we want to apply when selecting the rows to use in building our SQL script. The section to the left is initially populated with a read only list of the variables you defined on the previous page. This is for easy reference only, so you know which values are available to work with.

A conditional is a piece of logic that is applied to each row in your data file. If the condition passes, that row and its data are used to populate the SQL script. If it doesn’t pass, the row is excluded from the script.

You may or may not choose to use conditionals. In our example we want to create conditions based around the Gender and Status variables, but you may have decided to build a script based on all rows in the data file, irrespective of gender and status, in which case you should leave the conditional section blank.

Click the Add button to add a new conditional. A panel will appear allowing you to choose a variable to apply an IF statement to. You then select the operand to use and where necessary enter a value next to it.

In our example, we select Status=6, which will give us each record with a status of 6, and exclude all records where the status is between 1 and 5.

Next, add a new conditional. The first conditional is an IF statement, subsequent conditionals add to that in the form of an AND. This second conditional ensures that Gender=Male, giving us only Male records with a status of 6.

Step 4: Write SQL to Apply to Each Row

Once you’ve defined your conditionals, it’s time to write your SQL template. This is the SQL that you want to build for each row that meets the conditions you’ve already defined. It may be a simple INSERT or UPDATE statement, or it may be a more complex 10 line script, duplicated for each record in the data file.

The template you write uses replacement strings to extract data from columns in your data file. The format of the replacement string is {VariableName}, with no spaces between the curly braces and the name.

A right click menu exists inside the editor to allow you to insert a specific variable replacement string at any point in the template script. You can use the same replacement string as many times as you wish in your script.

You are responsible for building the correctly structured SQL around these replacement strings. For example, you need to add the single quotes around a string field and exclude them from integer fields.

The screenshot below shows the INSERT statement we’re building for our example. I’ve added a comment line so that we can check that the conditionals are being correctly applied in our final generated SQL, and a blank comment to break up each record’s SQL for readability.

Once your template script is ready click “Generate SQL” to create your complete script.

Step 5: Generate SQL Script

The final page shows your complete script in an edit window. There’s no syntax highlighting on this page, but you have an option to “Save SQL” to a file and then open that file in your favourite SQL editor or client.

At this point you should check the contents of the generated SQL to ensure it’s correct. If you had typos in your SQL template or in your replacement variables, this is where you should spot them. Any errors can be fixed by using the Previous button and going back to make changes on previous pages.

A final word of caution: The generated SQL is not validated in any way. You are responsible for checking that it is syntactically correct, and that the data included in the script is the data you wish to use when updating your database. For this reason, I recommend always wrapping your script in a transaction that can be rolled back, and that you first run any script against a test database and verify the results before running it on a Production or QA database.

Version 1 of Gryphon Sql is light on features. I do have plans to add much more functionality over time, so check back on a regular basis. Proposed additions include:

  • Building a project file to save settings per project
  • Multiple conditional/SQL template combinations per project
  • Generate scripts based on multiple input files in the same format
  • Allow for custom pre- and post- SQL to wrap around the generated SQL script
  • Full Excel file support (.xlsx)
  • Cross platform solution

To hear more from me, you can follow me on Twitter or on LinkedIn.

Originally published at http://darrendevitt.com on May 30th, 2020.

--

--

Darren Devitt

Building tools for developers working with healthcare data.