Using VBA to Identify File Types Using Their File Signatures

This one is a bit of a niche need, but I was recently part of a discussion in which a user wanted to recover some files that he insisted were xlsx, but that Excel reported as invalid.

He supplied a sample file and he was correct: Excel wouldn’t open it. I decided to examine the file in more detail and discovered that, regardless of its extension, it was NOT an xlsx file. It was in fact an mp4 video file.

Now, I did the above manually, which made me wonder whether VBA could do such an analysis automatically.

Spoiler alert, the answer is yes. Keep reading.

The Basic Concept – File Signatures

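The basic idea is that ‘most’ files contain a ‘header’ section (normally at the start of the file) which holds various bits of information and normally includes a distinctive identifier, the file signature, allowing identification of the file type. Some signatures are shared by a whole family of formats (zip-based documents, for instance), so the identification can be broader than a single extension.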

So this is not a question of merely looking at the file’s extension. That could be incorrect. Instead, we read some of the file’s actual content, the header, and use that to properly ascertain the file type.
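To make the idea concrete, here is a minimal sketch that checks a file for one known signature. It is not part of the code further down: the name HasSignature, its parameters and the sample path are my own assumptions for illustration. It reads only as many bytes as the signature holds, into a Byte array, and compares them one by one.

```vba
' Hypothetical helper (illustration only): returns True when the bytes found
' at lOffset (0-based) match the space-separated hex signature in sHexSig.
Function HasSignature(ByVal sPath As String, ByVal sHexSig As String, _
                      Optional ByVal lOffset As Long = 0) As Boolean
    Dim vTokens     As Variant
    Dim abData()    As Byte
    Dim iFile       As Integer
    Dim i           As Long

    vTokens = Split(sHexSig, " ")    ' "89 50 4E 47" -> 4 tokens
    iFile = FreeFile
    Open sPath For Binary Access Read As #iFile
    ' Only read when the file is long enough to hold the whole signature
    If LOF(iFile) >= lOffset + UBound(vTokens) + 1 Then
        ReDim abData(0 To UBound(vTokens))
        Get #iFile, lOffset + 1, abData    ' Get uses 1-based byte positions
        HasSignature = True
        For i = 0 To UBound(vTokens)
            If abData(i) <> CByte("&H" & vTokens(i)) Then
                HasSignature = False
                Exit For
            End If
        Next i
    End If
    Close #iFile
End Function
```

For example, Debug.Print HasSignature("C:\temp\sample.xlsx", "66 74 79 70", 4) tests whether the (hypothetical) file carries the 'ftyp' marker at offset 4, which is what gave away the mp4 file in my case, whatever its extension claims.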
 

The Code

So I started to dabble with VBA to read this ‘header’ section and break down the various file signatures (also referred to as ‘magic numbers’ by some).

Below is my 1st run at things (probably 80-85% of the way there). I still want to perfect the Office document identification (I know it can be done, just need to have the time to do it), but I’m posting it ‘as is’ in the hope it might be useful to someone else. It’s still rough around the edges, needs proper error handling …, but does work.

Helper Functions

Per the usual, I have a couple of helper functions:

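'Helper function to read a file's signature (up to the first 500 bytes)
Function GetFilesSignature(filePath As String) As String
    Dim fileNum               As Integer
    Dim sSignature            As String
    Dim lBytes                As Long

    fileNum = FreeFile
    Open filePath For Binary Access Read As #fileNum
    '500 bytes covers signatures stored deeper in the file, like .tar (offset 257)
    lBytes = 500
    If LOF(fileNum) < lBytes Then lBytes = LOF(fileNum)    'File is shorter than the buffer
    If lBytes > 0 Then
        sSignature = Space$(lBytes)    'Get reads as many bytes as the string is long
        Get #fileNum, 1, sSignature
    End If
    Close #fileNum

    GetFilesSignature = sSignature
End Function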

' Helper function to convert binary string to hex for debugging
Function ConvertToHex(binaryStr As String) As String
    Dim i                     As Integer
    Dim hexStr                As String
    For i = 1 To Len(binaryStr)
        hexStr = hexStr & " " & Right("0" & Hex(Asc(Mid(binaryStr, i, 1))), 2)
    Next i
    ConvertToHex = Trim(hexStr)
End Function

Main Procedures

'Main function!
'Returns True when a known signature is found; the file type and extension(s)
'are passed back through sFileType and sFileExtension.
Function IdentifyFileType(ByVal filePath As String, _
                          ByRef sFileType As String, _
                          ByRef sFileExtension As String, _
                          ByRef sSignature As String, _
                          ByRef sHEXsSignature As String) As Boolean
    Static vTestArray         As Variant
    Dim sTestHEXSignature     As String
    Dim sTestSignature        As String
    Dim lArrayRow             As Long
    Dim sTestSignatureStart   As String
    Dim sTestSignatureEnd     As String
    Dim lTestSignatureEndOffset As Long

    'Clear the outputs so values from a previous call cannot linger
    sFileType = vbNullString
    sFileExtension = vbNullString

    sSignature = GetFilesSignature(filePath)
    sHEXsSignature = ConvertToHex(sSignature)
    'Debug.Print "File sSignature: '" & sSignature & "'", , sHEXsSignature

    If IsEmpty(vTestArray) Then vTestArray = BuildTestArray 'Build the table once, reuse it on later calls
    'vTestArray = BuildTestArray '*** For Development purposes
    For lArrayRow = LBound(vTestArray, 1) To UBound(vTestArray, 1)
        sTestSignature = vTestArray(lArrayRow, 0)
        'Take as many bytes as the test signature holds, starting at its offset (column 3)
        sTestHEXSignature = ConvertToHex(Mid(sSignature, CLng(vTestArray(lArrayRow, 3)) + 1, UBound(Split(sTestSignature)) + 1))
        If InStr(sTestSignature, "?") = 0 Then    'Most cases
            If sTestHEXSignature = sTestSignature Then
                sFileType = vTestArray(lArrayRow, 1)
                sFileExtension = vTestArray(lArrayRow, 2)
                IdentifyFileType = True
                Exit For
            End If
        Else    'Exceptions like webp, jpg, wav, avi, ... with patterns rather than a single signature
            'Split the pattern into the fixed parts before and after the ?? wildcards
            sTestSignatureStart = Left(sTestSignature, InStr(sTestSignature, "?") - 2)
            sTestSignatureEnd = Mid(sTestSignature, InStrRev(sTestSignature, "?") + 2)
            'Number of characters taken up by the wildcard section (including surrounding spaces)
            lTestSignatureEndOffset = Len(Replace(Replace(sTestSignature, sTestSignatureStart, ""), sTestSignatureEnd, ""))
            If Left(sTestHEXSignature, Len(sTestSignatureStart)) = sTestSignatureStart And Mid(sTestHEXSignature, Len(sTestSignatureStart) + lTestSignatureEndOffset + 1, Len(sTestSignatureEnd)) = sTestSignatureEnd Then
                sFileType = vTestArray(lArrayRow, 1)
                sFileExtension = vTestArray(lArrayRow, 2)
                IdentifyFileType = True
                Exit For
            End If
        End If
    Next lArrayRow

End Function

'Function to build an array of file signatures, this can further be expanded as I've just included 64 file types to date
'I used an array, but you could use a table, worksheet, ...
Function BuildTestArray() As Variant
    Dim vFileSignatures(63, 3) As String    ' 64 rows (0-63), 4 columns: signature, description, extension(s), offset
    Dim lCounter              As Long

    vFileSignatures(lCounter, 0) = "00 01 00 00 53 74 61 6E 64 61 72 64 20 4A 65 74 20 44 42"
    vFileSignatures(lCounter, 1) = "Microsoft Access Database"
    vFileSignatures(lCounter, 2) = "mdb"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "00 01 00 00 53 74 61 6E 64 61 72 64 20 41 43 45 20 44 42"
    vFileSignatures(lCounter, 1) = "Microsoft Access 2007+ Database"
    vFileSignatures(lCounter, 2) = "accdb"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "00 01 00 00 4D 53 49 53 41 4D 20 44 61 74 61 62 61 73 65"
    vFileSignatures(lCounter, 1) = "Microsoft Money file"
    vFileSignatures(lCounter, 2) = "mny"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00"
    vFileSignatures(lCounter, 1) = "SQLite Database"
    vFileSignatures(lCounter, 2) = "sqlitedb, sqlite, db"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "D0 CF 11 E0 A1 B1 1A E1"
    vFileSignatures(lCounter, 1) = "Compound File Binary Format, a container format defined by Microsoft COM"
    vFileSignatures(lCounter, 2) = "doc, xls, ppt, msi, msg"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "25 50 44 46 2D"
    vFileSignatures(lCounter, 1) = "PDF document"
    vFileSignatures(lCounter, 2) = "pdf"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "7B 5C 72 74 66 31"
    vFileSignatures(lCounter, 1) = "Rich Text Format"
    vFileSignatures(lCounter, 2) = "rtf"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "52 65 63 65 69 76 65 64 3A"
    vFileSignatures(lCounter, 1) = "Email Message var5"
    vFileSignatures(lCounter, 2) = "eml"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "21 42 44 4E"
    vFileSignatures(lCounter, 1) = "Microsoft Outlook Personal Storage Table file"
    vFileSignatures(lCounter, 2) = "pst"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "25 21 50 53 2D 41 64 6F 62 65 2D 33 2E 30 20 45 50 53 46 2D 33 2E 30"
    vFileSignatures(lCounter, 1) = "Encapsulated PostScript file version 3.0"
    vFileSignatures(lCounter, 2) = "eps, epsf"
    vFileSignatures(lCounter, 3) = 0
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "25 21 50 53 2D 41 64 6F 62 65 2D 33 2E 31 20 45 50 53 46 2D 33 2E 30"
    vFileSignatures(lCounter, 1) = "Encapsulated PostScript file version 3.1"
    vFileSignatures(lCounter, 2) = "eps, epsf"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "49 54 53 46 03 00 00 00 60 00 00 00"
    vFileSignatures(lCounter, 1) = "MS Windows HtmlHelp Data"
    vFileSignatures(lCounter, 2) = "chm"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "3F 5F"
    vFileSignatures(lCounter, 1) = "Windows 3.x/95/98 Help file"
    vFileSignatures(lCounter, 2) = "hlp"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "25 21 50 53"
    vFileSignatures(lCounter, 1) = "PostScript document"
    vFileSignatures(lCounter, 2) = "ps"
    vFileSignatures(lCounter, 3) = 0


    ' Images
    ' //////////////////////////////////////////////////////////////////////////////
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "42 4D"
    vFileSignatures(lCounter, 1) = "BMP file, a bitmap format used mostly in the Windows world"
    vFileSignatures(lCounter, 2) = "bmp, dib"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "47 49 46 38 37 61"
    vFileSignatures(lCounter, 1) = "Image file encoded in the Graphics Interchange Format (GIF87a)"
    vFileSignatures(lCounter, 2) = "gif"
    vFileSignatures(lCounter, 3) = 0
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "47 49 46 38 39 61"
    vFileSignatures(lCounter, 1) = "Image file encoded in the Graphics Interchange Format (GIF89a)"
    vFileSignatures(lCounter, 2) = "gif"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "66 74 79 70 68 65 69 63"
    vFileSignatures(lCounter, 1) = "High Efficiency Image Container (HEIC)"
    vFileSignatures(lCounter, 2) = "heic"
    vFileSignatures(lCounter, 3) = 4

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "00 00 01 00"
    vFileSignatures(lCounter, 1) = "Computer icon encoded in ICO file format"
    vFileSignatures(lCounter, 2) = "ico"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "FF D8 FF DB"
    vFileSignatures(lCounter, 1) = "JPEG raw or in the JFIF or Exif file format"
    vFileSignatures(lCounter, 2) = "jpg"
    vFileSignatures(lCounter, 3) = 0
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "FF D8 FF E0 00 10 4A 46 49 46 00 01"
    vFileSignatures(lCounter, 1) = "JPEG raw or in the JFIF or Exif file format"
    vFileSignatures(lCounter, 2) = "jpg"
    vFileSignatures(lCounter, 3) = 0
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "FF D8 FF EE"
    vFileSignatures(lCounter, 1) = "JPEG raw or in the JFIF or Exif file format"
    vFileSignatures(lCounter, 2) = "jpg"
    vFileSignatures(lCounter, 3) = 0
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "FF D8 FF E1 ?? ?? 45 78 69 66 00 00"
    vFileSignatures(lCounter, 1) = "JPEG raw or in the JFIF or Exif file format"
    vFileSignatures(lCounter, 2) = "jpg"
    vFileSignatures(lCounter, 3) = 0
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "FF D8 FF E0"
    vFileSignatures(lCounter, 1) = "JPEG raw or in the JFIF or Exif file format"
    vFileSignatures(lCounter, 2) = "jpg"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "00 00 00 0C 6A 50 20 20 0D 0A 87 0A"
    vFileSignatures(lCounter, 1) = "JPEG 2000 format"
    vFileSignatures(lCounter, 2) = "jp2, j2k, jpf, jpm, jpg2, j2c, jpc, jpx, mj2"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "4F 67 67 53"
    vFileSignatures(lCounter, 1) = "Ogg, an open source media container format"
    vFileSignatures(lCounter, 2) = "ogg, oga, ogv"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "89 50 4E 47 0D 0A 1A 0A"
    vFileSignatures(lCounter, 1) = "Image encoded in the Portable Network Graphics format"
    vFileSignatures(lCounter, 2) = "png"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "38 42 50 53"
    vFileSignatures(lCounter, 1) = "Photoshop Document file, Adobe Photoshop's native file format"
    vFileSignatures(lCounter, 2) = "psd"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "49 49 2A 00"
    vFileSignatures(lCounter, 1) = "Tagged Image File Format (TIFF - little-endian)"
    vFileSignatures(lCounter, 2) = "tif, tiff"
    vFileSignatures(lCounter, 3) = 0
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "4D 4D 00 2A"
    vFileSignatures(lCounter, 1) = "Tagged Image File Format (TIFF - big-endian)"
    vFileSignatures(lCounter, 2) = "tif, tiff"
    vFileSignatures(lCounter, 3) = 0
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "49 49 2B 00"
    vFileSignatures(lCounter, 1) = "BigTIFF (little-endian)"
    vFileSignatures(lCounter, 2) = "tif, tiff"
    vFileSignatures(lCounter, 3) = 0
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "4D 4D 00 2B"
    vFileSignatures(lCounter, 1) = "BigTIFF (big-endian)"
    vFileSignatures(lCounter, 2) = "tif, tiff"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "52 49 46 46 ?? ?? ?? ?? 57 45 42 50"
    vFileSignatures(lCounter, 1) = "Google WebP image file"
    vFileSignatures(lCounter, 2) = "webp"
    vFileSignatures(lCounter, 3) = 0


    ' Audio
    ' //////////////////////////////////////////////////////////////////////////////
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "66 4C 61 43"
    vFileSignatures(lCounter, 1) = "Free Lossless Audio Codec"
    vFileSignatures(lCounter, 2) = "flac"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "4D 54 68 64"
    vFileSignatures(lCounter, 1) = "MIDI sound file"
    vFileSignatures(lCounter, 2) = "mid, midi"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "FF FB"
    vFileSignatures(lCounter, 1) = "MPEG-1 Layer 3 file without an ID3 tag or with an ID3v1 tag (which is appended at the end of the file)"
    vFileSignatures(lCounter, 2) = "mp3"
    vFileSignatures(lCounter, 3) = 0
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "FF F3"
    vFileSignatures(lCounter, 1) = "MPEG-1 Layer 3 file without an ID3 tag or with an ID3v1 tag (which is appended at the end of the file)"
    vFileSignatures(lCounter, 2) = "mp3"
    vFileSignatures(lCounter, 3) = 0
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "FF F2"
    vFileSignatures(lCounter, 1) = "MPEG-1 Layer 3 file without an ID3 tag or with an ID3v1 tag (which is appended at the end of the file)"
    vFileSignatures(lCounter, 2) = "mp3"
    vFileSignatures(lCounter, 3) = 0
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "49 44 33"
    vFileSignatures(lCounter, 1) = "MP3 file with an ID3v2 container"
    vFileSignatures(lCounter, 2) = "mp3"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "52 49 46 46 ?? ?? ?? ?? 57 41 56 45"
    vFileSignatures(lCounter, 1) = "Waveform Audio File Format"
    vFileSignatures(lCounter, 2) = "wav"
    vFileSignatures(lCounter, 3) = 0



    ' Video
    ' //////////////////////////////////////////////////////////////////////////////
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "52 49 46 46 ?? ?? ?? ?? 41 56 49 20"
    vFileSignatures(lCounter, 1) = "Audio Video Interleave video format"
    vFileSignatures(lCounter, 2) = "avi"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "46 4C 56"
    vFileSignatures(lCounter, 1) = "Flash Video file"
    vFileSignatures(lCounter, 2) = "flv"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "66 74 79 70 69 73 6F 6D"
    vFileSignatures(lCounter, 1) = "ISO Base Media file (MPEG-4)"
    vFileSignatures(lCounter, 2) = "mp4"
    vFileSignatures(lCounter, 3) = 4
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "66 74 79 70 4D 53 4E 56"
    vFileSignatures(lCounter, 1) = "MPEG-4 video file"
    vFileSignatures(lCounter, 2) = "mp4"
    vFileSignatures(lCounter, 3) = 4
    lCounter = lCounter + 1    'generic entry until I figure out other isom values
    vFileSignatures(lCounter, 0) = "66 74 79 70"
    vFileSignatures(lCounter, 1) = "MPEG-4 video file"
    vFileSignatures(lCounter, 2) = "mp4"
    vFileSignatures(lCounter, 3) = 4


    ' Archives
    ' //////////////////////////////////////////////////////////////////////////////
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "37 7A BC AF 27 1C"
    vFileSignatures(lCounter, 1) = "7-Zip File Format"
    vFileSignatures(lCounter, 2) = "7z"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "42 5A 68"
    vFileSignatures(lCounter, 1) = "Compressed file using Bzip2 algorithm"
    vFileSignatures(lCounter, 2) = "bz2"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "1F 8B"
    vFileSignatures(lCounter, 1) = "GZIP compressed file"
    vFileSignatures(lCounter, 2) = "gz, tar.gz"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "4C 5A 49 50"
    vFileSignatures(lCounter, 1) = "lzip compressed file"
    vFileSignatures(lCounter, 2) = "lz"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "52 61 72 21 1A 07 00"
    vFileSignatures(lCounter, 1) = "Roshal ARchive compressed archive v1.50 onwards"
    vFileSignatures(lCounter, 2) = "rar"
    vFileSignatures(lCounter, 3) = 0
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "52 61 72 21 1A 07 01 00"
    vFileSignatures(lCounter, 1) = "Roshal ARchive compressed archive v5.00 onwards"
    vFileSignatures(lCounter, 2) = "rar"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "75 73 74 61 72 00 30 30"
    vFileSignatures(lCounter, 1) = "tar archive"
    vFileSignatures(lCounter, 2) = "tar"
    vFileSignatures(lCounter, 3) = 257
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "75 73 74 61 72 20 20 00"
    vFileSignatures(lCounter, 1) = "tar archive"
    vFileSignatures(lCounter, 2) = "tar"
    vFileSignatures(lCounter, 3) = 257

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "50 4B 03 04"
    vFileSignatures(lCounter, 1) = "zip file format and formats based on it, such as EPUB, JAR, ODF, OOXML"
    vFileSignatures(lCounter, 2) = "zip"
    vFileSignatures(lCounter, 3) = 0
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "50 4B 05 06"
    vFileSignatures(lCounter, 1) = "zip file format and formats based on it, such as EPUB, JAR, ODF, OOXML (empty archive)"
    vFileSignatures(lCounter, 2) = "zip"
    vFileSignatures(lCounter, 3) = 0
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "50 4B 07 08"
    vFileSignatures(lCounter, 1) = "zip file format and formats based on it, such as EPUB, JAR, ODF, OOXML (spanned archive)"
    vFileSignatures(lCounter, 2) = "zip"
    vFileSignatures(lCounter, 3) = 0




    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "4D 53 43 46"
    vFileSignatures(lCounter, 1) = "Microsoft Cabinet file"
    vFileSignatures(lCounter, 2) = "cab"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "49 53 63 28"
    vFileSignatures(lCounter, 1) = "InstallShield CAB Archive File"
    vFileSignatures(lCounter, 2) = "cab"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "3C 3C 3C 20 4F 72 61 63 6C 65 20 56 4D 20 56 69 72 74 75 61 6C 42 6F 78 20 44 69 73 6B 20 49 6D 61 67 65 20 3E 3E 3E"
    vFileSignatures(lCounter, 1) = "VirtualBox Virtual Hard Disk file format"
    vFileSignatures(lCounter, 2) = "vdi"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "63 6F 6E 65 63 74 69 78"
    vFileSignatures(lCounter, 1) = "Windows Virtual PC Virtual Hard Disk file format"
    vFileSignatures(lCounter, 2) = "vhd"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "4B 44 4D"
    vFileSignatures(lCounter, 1) = "VMDK files"
    vFileSignatures(lCounter, 2) = "vmdk"
    vFileSignatures(lCounter, 3) = 0
    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "23 20 44 69 73 6B 20 44 65 73 63 72 69 70 74 6F"
    vFileSignatures(lCounter, 1) = "VMware 4 Virtual Disk description file (split disk)"
    vFileSignatures(lCounter, 2) = "vmdk"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "00 01 00 00 00"
    vFileSignatures(lCounter, 1) = "TrueType font"
    vFileSignatures(lCounter, 2) = "ttf, tte, dfont"
    vFileSignatures(lCounter, 3) = 0

    lCounter = lCounter + 1
    vFileSignatures(lCounter, 0) = "4F 54 54 4F"
    vFileSignatures(lCounter, 1) = "OpenType font"
    vFileSignatures(lCounter, 2) = "otf"
    vFileSignatures(lCounter, 3) = 0

    BuildTestArray = vFileSignatures
End Function
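A note on the ?? entries in the table above: each ?? stands for one byte whose value doesn't matter (a size field, for example). As an alternative to splitting the pattern into a start and an end, here is a hedged sketch that compares the two hex strings token by token. MatchesHexPattern is a hypothetical name of my own and is not called by the code above.

```vba
' Hypothetical helper (illustration only): compares two space-separated hex
' strings byte by byte, treating "??" in sPattern as "any byte".
Function MatchesHexPattern(ByVal sHexData As String, _
                           ByVal sPattern As String) As Boolean
    Dim vData       As Variant
    Dim vPattern    As Variant
    Dim i           As Long

    vData = Split(sHexData, " ")
    vPattern = Split(sPattern, " ")
    ' Not enough data to cover the whole pattern -> no match
    If UBound(vData) < UBound(vPattern) Then Exit Function

    For i = 0 To UBound(vPattern)
        If vPattern(i) <> "??" Then
            ' Case-insensitive so "ff" and "FF" are treated alike
            If StrComp(vData(i), vPattern(i), vbTextCompare) <> 0 Then Exit Function
        End If
    Next i
    MatchesHexPattern = True
End Function
```

With the data "52 49 46 46 24 08 00 00 57 41 56 45", the wav pattern "52 49 46 46 ?? ?? ?? ?? 57 41 56 45" returns True, while the webp pattern "52 49 46 46 ?? ?? ?? ?? 57 45 42 50" returns False. Working per token also copes with patterns that have wildcards in more than one place, at the cost of a loop per test.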

Usage Example

Below is a simple example of how the above code can be used.

Sub Test()
    Dim sFileType             As String
    Dim sFileExtension        As String
    Dim sSignature            As String
    Dim sHEXsSignature        As String
    Const sFile = "C:\temp\img01.jpg"

    Call IdentifyFileType(sFile, sFileType, sFileExtension, sSignature, sHEXsSignature)
    Debug.Print sFile, sFileType, sFileExtension
    Debug.Print
    Debug.Print sSignature
    Debug.Print
    Debug.Print sHEXsSignature
End Sub
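Coming back to the problem that started all this (an 'xlsx' that was really an mp4), a natural next step is to compare the file's extension with the extension list returned through sFileExtension. The helper below is only a sketch under my own assumptions: ExtensionMatches is a hypothetical name, and it simply checks whether the extension found in the path appears in a comma-separated list such as "tif, tiff".

```vba
' Hypothetical helper (illustration only): True when the extension in sPath
' appears in the comma-separated list sExtList (e.g. "tif, tiff").
Function ExtensionMatches(ByVal sPath As String, _
                          ByVal sExtList As String) As Boolean
    Dim sExt        As String
    Dim vExt        As Variant
    Dim lDot        As Long
    Dim i           As Long

    lDot = InStrRev(sPath, ".")
    ' No dot at all, or the last dot belongs to a folder name -> no extension
    If lDot = 0 Or lDot < InStrRev(sPath, "\") Then Exit Function
    sExt = LCase$(Mid$(sPath, lDot + 1))

    vExt = Split(sExtList, ",")
    For i = 0 To UBound(vExt)
        If LCase$(Trim$(vExt(i))) = sExt Then
            ExtensionMatches = True
            Exit Function
        End If
    Next i
End Function
```

After a call to IdentifyFileType, something like If Not ExtensionMatches(sFile, sFileExtension) Then Debug.Print "Extension does not match content" would flag the suspicious files. Treat a mismatch as a prompt to look closer rather than proof: the table lists container formats under their generic extension, so a perfectly valid xlsx is reported as "zip", and a .jpeg file would not match the "jpg" entry.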

 

File Signatures

So where do all those File Signatures come from? Initially, just by playing around myself and comparing files, then I did some searching and found some websites that had some of them.

Here are a few of the sites I used, but there were others as well :
