I was working on some HTML automation recently and needed to extract the MIME type from an HTML img src attribute value, specifically one containing a data URI.
I could have reached for Left, Right, Mid, InStr, Len, …, but I thought I’d stretch my VBA RegEx legs a little and create a simple reusable function to do the job.
In case it can help anyone else out there, here it is!
'---------------------------------------------------------------------------------------
' Procedure : RegEx_HTML_Img_GetSrcMIMEType
' Author : Daniel Pineault, CARDA Consultants Inc.
' Website : http://www.cardaconsultants.com
' Purpose : Extract the image mime type from the img src
' Copyright : The following is released as Attribution-ShareAlike 4.0 International
' (CC BY-SA 4.0) - https://creativecommons.org/licenses/by-sa/4.0/
' Req'd Refs: Early Binding -> Microsoft VBScript Regular Expressions X.X
' Late Binding -> None required
' References:
'
' Input Variables:
' ~~~~~~~~~~~~~~~~
' sImgSrc : The HTML img src attribute value
'
' Usage:
' ~~~~~~
' ? RegEx_HTML_Img_GetSrcMIMEType("data:image/png;base64,iVBORw0KGgo")
' Returns -> png
'
' Revision History:
' Rev Date(yyyy-mm-dd) Description
' **************************************************************************************
' 1 2023-02-25 Initial Release
'---------------------------------------------------------------------------------------
Function RegEx_HTML_Img_GetSrcMIMEType(ByVal sImgSrc As String) As String
On Error GoTo Error_Handler
#Const RegEx_EarlyBind = False 'True => Early Binding / False => Late Binding
#If RegEx_EarlyBind = True Then
Dim oRegEx As VBScript_RegExp_55.RegExp
Dim oMatches As VBScript_RegExp_55.MatchCollection
Set oRegEx = New VBScript_RegExp_55.RegExp
#Else
Dim oRegEx As Object
Dim oMatches As Object
Set oRegEx = CreateObject("VBScript.RegExp")
#End If
With oRegEx
.Pattern = "data:image\/(.*?);" 'Extract src -> src\s*=\s*"([^"]+)"
.Global = True
.IgnoreCase = True
.MultiLine = True
Set oMatches = .Execute(sImgSrc)
End With
If oMatches.Count <> 0 Then _
RegEx_HTML_Img_GetSrcMIMEType = oMatches(0).SubMatches(0)
Error_Handler_Exit:
On Error Resume Next
Set oMatches = Nothing
Set oRegEx = Nothing
Exit Function
Error_Handler:
MsgBox "The following error has occurred" & vbCrLf & vbCrLf & _
"Error Number: " & Err.Number & vbCrLf & _
"Error Source: RegEx_HTML_Img_GetSrcMIMEType" & vbCrLf & _
"Error Description: " & Err.Description & _
Switch(Erl = 0, "", Erl <> 0, vbCrLf & "Line No: " & Erl) _
, vbOKOnly + vbCritical, "An Error has Occurred!"
Resume Error_Handler_Exit
End Function
It is very straightforward to use: simply pass the src attribute value to the function and it returns the image type, that is, the part of the MIME type that follows "image/".
Usage Example
? RegEx_HTML_Img_GetSrcMIMEType("data:image/png;base64,iVBORw0KGgo...")
will return a value of:
png
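If you are starting from a complete img tag rather than the bare src value, the src extraction pattern noted in the function's comment can be used first. The following is only a minimal sketch: HTML_Img_GetSrc and Demo_GetMIMETypeFromTag are hypothetical names that are not part of the function above, and the pattern assumes the src value is wrapped in double quotes.

```vba
' Hypothetical helper (not part of the original function): pulls the src value
' out of an <img> tag so it can be passed to RegEx_HTML_Img_GetSrcMIMEType.
' Assumption: the src value is enclosed in double quotes.
Function HTML_Img_GetSrc(ByVal sImgTag As String) As String
    Dim oRegEx As Object
    Dim oMatches As Object

    Set oRegEx = CreateObject("VBScript.RegExp")   'Late binding, no reference required
    With oRegEx
        .Pattern = "src\s*=\s*""([^""]+)"""        'i.e. src\s*=\s*"([^"]+)"
        .Global = False                            'Only the first match is needed
        .IgnoreCase = True
        Set oMatches = .Execute(sImgTag)
    End With
    If oMatches.Count <> 0 Then HTML_Img_GetSrc = oMatches(0).SubMatches(0)

    Set oMatches = Nothing
    Set oRegEx = Nothing
End Function

' Hypothetical demo: extract the src from a tag, then its image type.
Sub Demo_GetMIMETypeFromTag()
    Dim sTag As String

    sTag = "<img alt=""logo"" src=""data:image/png;base64,iVBORw0KGgo"">"
    Debug.Print RegEx_HTML_Img_GetSrcMIMEType(HTML_Img_GetSrc(sTag))
End Sub
```

Running Demo_GetMIMETypeFromTag should print png to the Immediate window. Single-quoted or unquoted src values would need a different pattern.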
Using Plain Vanilla VBA
As I said earlier, we can obviously extract this information using standard string manipulation functions. Below is one way it can be accomplished:
'---------------------------------------------------------------------------------------
' Procedure : HTML_Img_GetSrcMIMEType
' Author : Daniel Pineault, CARDA Consultants Inc.
' Website : http://www.cardaconsultants.com
' Purpose : Extract the image mime type from the img src
' Copyright : The following is released as Attribution-ShareAlike 4.0 International
' (CC BY-SA 4.0) - https://creativecommons.org/licenses/by-sa/4.0/
' Req'd Refs: None required
'
' Input Variables:
' ~~~~~~~~~~~~~~~~
' sImgSrc : The HTML img src attribute value
'
' Usage:
' ~~~~~~
' ? HTML_Img_GetSrcMIMEType("data:image/png;base64,iVBORw0KGgo")
' Returns -> png
'
' Revision History:
' Rev Date(yyyy-mm-dd) Description
' **************************************************************************************
' 1 2023-02-25 Initial Release
'---------------------------------------------------------------------------------------
Function HTML_Img_GetSrcMIMEType(ByVal sImgSrc As String) As String
On Error GoTo Error_Handler
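Const sPrefix As String = "data:image/"
Dim lPos As Long
'Only process image data URIs (anything else returns an empty string)
If StrComp(Left$(sImgSrc, Len(sPrefix)), sPrefix, vbTextCompare) <> 0 Then GoTo Error_Handler_Exit
'The type ends at the first semicolon after the prefix; exit if there is none
lPos = InStr(Len(sPrefix) + 1, sImgSrc, ";")
If lPos = 0 Then GoTo Error_Handler_Exit
HTML_Img_GetSrcMIMEType = Mid$(sImgSrc, Len(sPrefix) + 1, lPos - Len(sPrefix) - 1)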
Error_Handler_Exit:
On Error Resume Next
Exit Function
Error_Handler:
MsgBox "The following error has occurred" & vbCrLf & vbCrLf & _
"Error Number: " & Err.Number & vbCrLf & _
"Error Source: HTML_Img_GetSrcMIMEType" & vbCrLf & _
"Error Description: " & Err.Description & _
Switch(Erl = 0, "", Erl <> 0, vbCrLf & "Line No: " & Erl) _
, vbOKOnly + vbCritical, "An Error has Occurred!"
Resume Error_Handler_Exit
End Function
Usage Example
? HTML_Img_GetSrcMIMEType("data:image/png;base64,iVBORw0KGgo...")
will also return a value of:
png
My goal here was to expand my knowledge of RegEx and have a little fun. That said, if we are strictly considering performance, the plain vanilla VBA approach does perform faster than using RegEx. Ultimately, the choice is yours!
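If you want to check the performance difference on your own machine, a rough timing loop along these lines can help. This is only a sketch: Compare_MIMEType_Approaches is a hypothetical name, the iteration count is arbitrary, and it assumes both functions above are in the same project. Timer has limited resolution and resets at midnight, so treat the numbers as a rough guide only.

```vba
' Hypothetical timing harness (not part of the original functions).
' Calls each function repeatedly and prints the elapsed time in seconds.
Sub Compare_MIMEType_Approaches()
    Const lIterations As Long = 10000   'Arbitrary loop count, adjust as needed
    Const sSrc As String = "data:image/png;base64,iVBORw0KGgo"
    Dim i As Long
    Dim sResult As String
    Dim dStart As Double

    'Time the RegEx version
    dStart = Timer
    For i = 1 To lIterations
        sResult = RegEx_HTML_Img_GetSrcMIMEType(sSrc)
    Next i
    Debug.Print "RegEx:     " & Format(Timer - dStart, "0.000") & " s"

    'Time the plain VBA version
    dStart = Timer
    For i = 1 To lIterations
        sResult = HTML_Img_GetSrcMIMEType(sSrc)
    Next i
    Debug.Print "Plain VBA: " & Format(Timer - dStart, "0.000") & " s"
End Sub
```

Note that the RegEx function creates a new VBScript.RegExp object on every call, so part of whatever difference you measure comes from object creation rather than from the pattern matching itself.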