Using regular expressions to simplify a submittal log

I had a co-worker this week complain that Procore, a project management software, wasn’t working as he wished for his management of submittals. This resulted in him exporting Procore’s submittal log from their web application to an excel sheet as he felt that if the data was in excel he would be able to manipulate it more easily.

However, when he exported the data. It was formatted in a way that again wasn’t useful to him. Rather than the data just having the key points, it had cluttering descriptions for each item. 

In order to resolve this, he thought that he must manually clean the data in order for him to make use of it in an excel sheet. But fortunately for him, I was nearby to explain that through the use of regular expressions his anticipated hours of work could be done in a few seconds.

Background

What is a submittal?

Submittal is construction jargon for a required transmission of information. The simplest submittal type is product data. Below is an example of a requirement from a project’s specifications for marker boards and an example of submission from a general contractor to satisfy the same.  

(Example from a project's specifications)
(Example of product data as submitted from a general contractor to an architect)

What is a submittal log?

A submittal log is simply a table containing a list of all of the required submittals for a project. Project managers use submittal logs to track the percentage complete of a project’s submittals and their statuses; submitted, approved, rejected, approved with comment.

Note: it is critical that a product be approved in order for it to be purchased and installed in a project and thus, project managers care deeply about tracking the statuses of submittals early on in their projects. 

Below is an example of a manually generated submittal log.

(Example of a manually created submittal log)

What are regular expressions?

Regular expressions are patterns enacted as rules against a string of text that allow for the extraction of subsets of text that satisfies the rule prescribed.

Complicated explanation I know…

It is much easily explained visually. 

(Example of regular expression arguments against text)

So, how can we use regular expressions to save us from manually editing hundreds of rows of data?

First let’s look at the data we’re working with. Below is an excerpt of the data that was exported from Procore. 

You can see that there is a structured format, but each data point is too busy with information.

It would be so much easier to read if we reduced the amount of information displayed in each cell of data and kept only the pertinent points. Ideally, we would want it to look something like the below.

And lucky for us. We can do this quite easily with Google Sheets RegexReplace() and RegexExtract() function.

Step 1: Add columns to the table

Add a column to receive the output from an applied formula next to each existing column containing data.

Step 2: Define the patterns of text to target

Review each input for patterns, define patterns to extract or replace the noise within your input, and receive your output into each new adjacent column.

Input 1
(Column A)

Pattern to target
("\d+ \- ")

(Pattern description: a number one or more times consecutively, a space, a hyphen, and a space)

Output 1
(Column B)

Input 2
(Column D)

Pattern to target
("\d+ \-")

(Pattern description: a number one or more times consecutively, a space, a hypen)

Output 2.1
(Column E)

Input 2.1
(Column E)

Pattern to target
("[A-Z]+ ")

(Pattern description: a capital letter one or more times with a space at the end)

Output 2.2
(Column F)

Input 3
(Column G)

Pattern to target
("^[A-Z]\. [a-zA-Z ]+")

(Pattern description: a capital letter followed by a period, a space, any combination of lower case, upper case, or spaces one or more times consecutively)

Output 3
(Column H)

Step 3: Copy and paste your formulas down

In step 2, we were working in the first row of data in our table. We need to copy and paste our formulas down so that no row is left empty. The resulting table should now look like the below.

Step 4: Copy and paste all data as values only, delete the now unnecessary columns, change headings

Since our table contains formulas, the first thing we need to do if we are going to reduce its size is to copy and paste all the data within our table as values only so that when we delete our now unnecessary columns the cells containing our functions don’t break. Likewise, we will need to transfer our column headings to their desired names instead of OUTPUT 1, OUTPUT2, OUTPUT3.

Below is how the table should look after doing so.

Step 5: Format your table

Add alternating colors, center your heading text, change the header row text color to something contrasting, and change the text family to something readable (I personally prefer Montserrat).

Step 6: You're done!

Now go sneak out of the office and do something fun! I mean after all, you did just save 4 hours of labor through the use of a little computer science. You deserve to treat yourself to a round of golf or something.

And when the boss calls, you can just let him know you’re hard at work updating that dreaded submittal log Procore screwed up. Lol.

Cheers!