In the last days, I received an unusual request from a friend that is working on something curious because of an assignment of the University. For this assignment, it's necessary to find the answer or data as response of a query. Instead of a database, we are going to query plain data from an excel spreadsheet (yeah, just as it sounds). For example, for this article, we are going to use the following Sheet in Excel Plus 2016:
The goal of this task is to write raw SQL Queries against the available data in the spreadsheet to find the answer of the following questions:
- Which users live in Boston.
- Which users are boys and live in Boston.
- Which users were born in 2012.
- Which users were born in 2010 and were ranked in place #1.
Of course, finding such information as a regular user is quite easy and simple using filters and so, however the assignment requires to do the queries using SQL and Visual Basic for the job. In this article, I will explain you from scratch how to use Microsoft Visual Basic for Applications to develop your own macros and run some SQL queries against plain data in your excel spreadsheets.
1. Launch Microsoft Visual Basic For Applications
In order to launch the window of Visual Basic to run some code on your spreadsheets, you will need to enable the Developer tab on the excel Ribbon. You can do this easily opening the Excel options (File > Options) and searching for the Customize Ribbon tab, in this Tab you need to check the Developer checkbox to enable it in your regular interface:
Click on Ok and now you should be able to find the Developer tab on your excel ribbon. In this tab, launch the Visual Basic window:
In this new interface you will be able to run your VB code.
2. Building connection
In the Visual Basic window, open the code window of your sheet and let's type some code! According to your needs you may create a custom macro and assign them to the action of buttons or other kind of stuff. In this example, we are going to work with plain code and will run them independently to test them. You need to understand how to connect to the workbook data source that will be handled with the following code:
Dim connection As Object '--- Connect to the current datasource of the Excel file Set connection = CreateObject("ADODB.Connection") With connection .Provider = "Microsoft.ACE.OLEDB.12.0" .ConnectionString = "Data Source=" & ThisWorkbook.Path & "\" & ThisWorkbook.Name & ";" & _ "Extended Properties=""Excel 12.0 Xml;HDR=NO"";" .Open End With
The connection properties are described as follows:
- Provider: we will use the Microsoft Access Database Engine 2010 (Microsoft.ACE.OLEDB.12.0)
- ConnectionString: we will use the current excel file as the database.
HDR=Yes;: indicates that the first row contains the column names, not data.
HDR=No;indicates the opposite.
You will use this connection to run the SQL.
3. Printing whole table data
The following example, will use the mentioned logic to connect to the current spreadsheet and will query the range
A1:E6 (selecting the whole table in the example excel) and will print every row in the immediate window:
Sub MyMethod() '--- Declare Variables to store the connection, the result and the SQL query Dim connection As Object, result As Object, sql As String, recordCount As Integer '--- Connect to the current datasource of the Excel file Set connection = CreateObject("ADODB.Connection") With connection .Provider = "Microsoft.ACE.OLEDB.12.0" .ConnectionString = "Data Source=" & ThisWorkbook.Path & "\" & ThisWorkbook.Name & ";" & _ "Extended Properties=""Excel 12.0 Xml;HDR=YES"";" .Open End With '--- Write the SQL Query. In this case, we are going to select manually the data range '--- To print the whole information of the table sql = "SELECT * FROM [Sheet1$A1:E6]" '--- Run the SQL query Set result = connection.Execute(sql) '--- Fetch information Do ' Print the information of every column of the result Debug.Print result(0); ";" & result(1) & ";" & result(2) & ";" & result(3) & ";" & result(4) result.MoveNext recordCount = recordCount + 1 Loop Until result.EOF '--- Print the amount of results Debug.Print vbNewLine & recordCount & " results found." End Sub
Note that we are using HDR so the query will use the first row of data as the column headers, so the result will be the following one:
4. Query by columns
Now that you are able to connect to the worksheet, you may now customize the SQL to fit your needs. It is necessary to explain you the most basic thing you need to know about querying some data in your excel file. The range needs to specify the Sheet Name and the regular excel range (e.g. A1:Z1) and the whole data should be selected, not individual columns. You may filter by individual columns using regular SQL statements as WHERE, AND, OR, etc.
Depending if you use HDR (first row contains the column names), the query syntax will change:
If you have HDR enabled (in the extended properties of the connection), you may query through the column name, considering that you selected the appropriate range:
SELECT * FROM [Sheet1$A1:E6] WHERE [city] = 'Boston'
If you don't use HDR, the nomenclature of the columns will follow the F1, F2, F3, ..., FN pattern:
The following query would work perfectly if you don't have HDR enabled (note that the range changes):
SELECT * FROM [Sheet1$A2:E6] WHERE [F5] = 'Boston'
In both cases, the output will be the same in the immediate window:
Jacob;1;boy;2010;Boston Ethan;2;boy;2010;Boston Michael;3;boy;2010;Boston 3 results found.
5. Answering questions
The SQL that should solve the initial questions will be the following ones (with HDR disabled):
Which users live in Boston.
SELECT * FROM [Sheet1$A2:E6] WHERE [F5] = 'Boston'
Which users are boys and live in Boston.
SELECT * FROM [Sheet1$A2:E6] WHERE [F5] = 'Boston' and [F3] = 'boy'
Which users were born in 2012.
SELECT * FROM [Sheet1$A2:E6] WHERE [F4] = 2012
Which users were born in 2010 and were ranked in place #1.
SELECT * FROM [Sheet1$A2:E6] WHERE [F2] = 1 AND [F4] = 2010
Happy coding ❤️!
Become a more social person