Decrypting the ‘AVG’ Photo Vault

Screenshot of AVG Menu

I have been holding onto this post for a while now. It covers one of the first applications I attempted to decrypt, and it took a very long time. ‘AVG AntiVirus – Mobile Security & Privacy’ is, as the name suggests, an antivirus solution, the Android equivalent of its well known computer counterpart. The mobile application contains additional features such as the ability to restrict access to applications and, more relevant to this post, the ability to place ‘private photos in a password-protected Vault to prevent snooping’. ‘Photo Vault’ has been implemented in AVG on Android for a number of years.

‘Photo Vault’ requires the user to log in to a Google Account to gain access to this feature. Once this has been completed they are free to use the vault; however, if they are using the free version they will be limited to 10 encrypted files.

If the device supports it the user will be able to set their fingerprint to unlock the vault. After initial setup a user can change the unlock type to a pattern instead of a PIN, but there is no way to disable the original PIN. This becomes important when looking at how the application handles encryption / decryption.

Changing from PIN to Pattern

Examination of the application identified the following key locations:

Data: /data/data/com.antivirus
Media Files: /sdcard/Vault
Key Artefact Locations

The PIN / Pattern

‘PinSettingsImpl.xml‘ is stored under ‘/data/data/com.antivirus/shared_prefs‘ and contains entries for both the PIN and pattern set by the user:

‘PinSettingsImpl.xml’ Screenshot

Light work can be made of both entries. ‘encrypted_pin‘ is the user created PIN code stored as a SHA1 hash value. In order to identify the number the hash value relates to one can apply some ‘Google-Fu‘ and search for the value online:

Google Result for ‘7110eda4d09e062aa5e4a390b0a572ac0d2c0220’

A PIN code has been identified in the first search result and in this case corresponds to the correct PIN code ‘1234’. ‘encrypted_pattern‘ is also a SHA1 hash value and corresponds to the numbers selected while creating the pattern lock. Patterns for Android devices and their applications are generally interpreted as a keypad system starting at 00 and running through to 08, from left to right. As an example, the above ‘encrypted_pattern‘ corresponds to ‘0>3>6>7>8’. The pattern could be interpreted as starting at ‘1’, but the way in which it is programmed means that in the background it starts at ‘0’.

Pattern Example Layout

Rather than hashing the grid values as text, each value is first treated as a hex byte (00 through 08) and the resulting bytes are run through the hashing algorithm. If this were a password rather than a PIN or pattern lock there could be a huge number of potential combinations, taking into account variable lengths, uppercase and lowercase characters, special characters, language etc. Dealing with a PIN or pattern is simpler: there is a finite number of combinations, which means that rainbow tables can be created as a lookup for the values or they can be more easily bruteforced. This all applies to plain SHA1 hash values; it does not apply to more complex functions that deliberately put mechanisms in place to slow down bruteforce attempts.
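As a rough illustration of the pattern hashing described above, the sketch below hashes the example pattern in Python. It assumes that each selected node is stored as a single raw byte (0x00 to 0x08) before hashing; the function name ‘pattern_sha1‘ is made up for this example and is not taken from the application.

```python
from hashlib import sha1

def pattern_sha1(nodes):
    """Hash a pattern given as grid indices 0-8 (left to right, top to bottom)."""
    # Assumption: each node becomes one raw byte (0x00-0x08), not an ASCII digit.
    return sha1(bytes(nodes)).hexdigest()

# The example pattern from this post: 0 > 3 > 6 > 7 > 8
candidate = pattern_sha1([0, 3, 6, 7, 8])
print(candidate)
```

If the assumption holds, the printed value can be compared directly against ‘encrypted_pattern‘ from the preferences file.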

Although there are many programs out there that already have the rainbow tables incorporated or have the ability to bruteforce these values, the code for identifying the PIN in python has been included below to give a better understanding of how it could work:

import sys
from hashlib import sha1

## Take the PIN hash as the first argument (lower case to match hexdigest output)
AVGPIN = sys.argv[1].strip().lower()

## For each passcode it will need to run the hash function and compare the hashed result to the hash from the file
## If it is wrong it will keep trying until it gets to '9999'
pinFound = False

for i in range(0, 10000):
    ## Zero pad the candidate to 4 digits, e.g. 7 becomes '0007'
    currentPIN = '{0:04}'.format(i).encode('utf-8')
    ## Compare the current PIN SHA1 and the provided AVG PIN
    if sha1(currentPIN).hexdigest() == AVGPIN:
        print(f'FOUND PIN:\t\t\t\t\t****{currentPIN.decode("utf-8")}****')
        pinFound = True
        break

## If the PIN is not found after trying every candidate then exit
if not pinFound:
    print('****PIN not found, program will exit****')
    sys.exit()

The above code is written for Python 3 and takes one argument, the PIN hash. Realistically it would not be the quickest way of identifying the PIN: there is no optimisation and, because of the finite number of PINs available, a rainbow table would be quicker. However, the code works as a free solution that can be applied to other applications as well, provided that they create their hashes in the same way. Note: the code is limited to 4 digit codes, as in the AVG application.
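For comparison, a precomputed lookup could be sketched as below. Strictly speaking this is a plain hash-to-PIN dictionary rather than a true rainbow table, and the names ‘build_pin_table‘ and ‘PIN_TABLE‘ are invented for this example.

```python
from hashlib import sha1

def build_pin_table(digits=4):
    """Map the SHA1 hex digest of every zero padded PIN to the PIN itself."""
    table = {}
    for i in range(10 ** digits):
        pin = f'{i:0{digits}d}'
        table[sha1(pin.encode('utf-8')).hexdigest()] = pin
    return table

PIN_TABLE = build_pin_table()

# Look up the 'encrypted_pin' value used in this post
print(PIN_TABLE.get('7110eda4d09e062aa5e4a390b0a572ac0d2c0220'))
```

The table only needs to be built once, after which each lookup is a single dictionary access rather than up to 10,000 hash operations.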

Media Files & Decryption

Encrypted Data Structure

Media files are stored within subfolders of ‘/sdcard/Vault’. All of these folders contain files of interest, however, they are all encrypted. For each encrypted media file there is a corresponding ‘.thumbnail‘ and ‘.mid_picture‘. ‘.thumbnail‘, as the name suggests is a thumbnail representation of the image and ‘.mid_picture‘ an even smaller version.

Files are linked together by their filenames, corresponding to the time the file was encrypted as a Unix timestamp. The files are differentiated by file location only, thumbnails residing in the ‘.thumbnail‘ directory etc. In addition to the encrypted media files there is the ‘.metadata_store‘ folder, containing another encrypted file. When this file is decrypted it reveals the details of all files encrypted using the application including details such as original file name and size:

Decrypted Data from ‘.metadata_store’ File

Each encrypted media file has a standard header, in which the IV and the size of the encrypted data can be obtained:

File Header Example

Encrypted media files start with the IV length, followed by the IV itself, the length of the encrypted data represented as a 4 byte value and finally the encrypted data equalling that length. In the example above the values are:

IV Length: 0x10 = 16 bytes
IV: 8FB2681CCA1A1A3639534B9E8E05AC2E
Encrypted Data Length: 0x00033D40 = 212288 bytes
Encrypted Data: 212288 bytes of encrypted data
File Header Details
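The header layout above could be parsed along the following lines. This is a minimal sketch under two assumptions that are not confirmed in this post: the IV length is stored in a single byte, and the 4 byte data length is big-endian. The function name ‘parse_vault_header‘ is hypothetical.

```python
import struct

def parse_vault_header(blob, iv_len_size=1):
    """Split an encrypted vault file into (iv, ciphertext).

    Assumptions: the IV length occupies iv_len_size bytes and the
    encrypted data length is a 4 byte big-endian integer.
    """
    iv_len = int.from_bytes(blob[:iv_len_size], 'big')
    pos = iv_len_size
    iv = blob[pos:pos + iv_len]
    pos += iv_len
    (data_len,) = struct.unpack('>I', blob[pos:pos + 4])
    pos += 4
    ciphertext = blob[pos:pos + data_len]
    # Reject files that are shorter than their own header claims
    if len(iv) != iv_len or len(ciphertext) != data_len:
        raise ValueError('file is shorter than its header claims')
    return iv, ciphertext
```

Reading a file in binary mode and passing its contents to the function would return the IV and the encrypted data ready for the decryption stage described later.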

In some previous applications the IV and Key have been the same across all files. In this application the IV is randomly generated at the time of encryption and stored in the encrypted file as seen above. The key however is generated in a very specific, programmatic way.

The Key

Encryption within the application is part of the reason it took so long for me to complete it. During analysis I identified the type of encryption as AES_CBC and could identify the key and IV required to perform the decryption. Tracing back the origins of the IV has been demonstrated above but the Key was more difficult. As a result of the analysis it was determined that several steps are required to get the right encryption / decryption key. The explanation may seem long winded but it does all work out in the end. It will be broken down into sections and then summarised at the end.

The first part requires several inputs; some eagle eyed readers may have seen references to one of them when the encrypted files were covered. It’s technically the second part of the process but it is best covered here. The file containing some of the required values is stored within the ‘.key_store‘ folder, in this example named ‘AVG-Encrypted-1645550081160‘. It is essentially a key file and can be interpreted as the following:

Length: The length of data in the file. 0x50 = 80
Global IV: First IV stored in file, 16 bytes (variable)
First Encrypted Value: 32 byte encrypted value (variable)
Second Encrypted Value: 32 byte encrypted value (variable)
Key Values Explained

Now that these values have been identified, let’s have a look at the first stage of the key generation process, the PBKDF2 key derivation function. In order to generate / derive a PBKDF2 key several elements are required: salt, password, iterations, key length and hash mode. Similarly to the ‘Secret Calculator Photo Vault’ application, some of the values are hard coded in the application and are as follows:

Iterations: 100
Key Length: 128 bits (16 bytes, matching the 32 hex character derived value shown later)
Hash Mode: SHA1
Hard Coded PBKDF2 Values

The salt for the key generation is the ‘Global IV‘ identified above. The password is slightly more tricky.

Stage one of getting the password is taking the user’s PIN and running it through the SHA1 hash algorithm. This provides the same value as identified as the ‘encrypted_pin‘.

Next, the SHA1 hash is run through the SHA256 hash algorithm, in this case resulting in the value ‘a0626c0a5edb4b637d7e25d2290792a0a3094da0b5d4413641a130bbcf46ef85‘. So, take the PIN and run through SHA1, the result is then run through SHA256.

That is how the Java code reads, but the value generated by carrying out this process did not produce the correct key. It was at this point that things got a bit more difficult. During the dynamic code analysis I could see references to the password value and it seemed to resemble the SHA256 as above but it wasn’t 100% the same. After some research around the anomaly it was identified that something in the code was likely causing the issue. Below are the input parameters for the PBKDF2 key derivation from the application code:

Code Snippet Breakdown

A very specific anomaly occurs within Java on Android when using the ‘toCharArray’ function. Several articles on the internet outline this issue and in short it is to do with unrepresentable data during the conversion of certain bytes to characters. During the conversion those values that cannot be represented are replaced with the Unicode replacement character (U+FFFD), which appears as the bytes ‘efbfbd’ when encoded as UTF-8. This concept becomes important when looking to generate the correct password to use for the PBKDF2 key derivation. Rather than using the SHA256 value that would normally be generated, the ‘toCharArray’ function alters the ‘password‘ input significantly. The example value above changes to the below value:

efbfbd626c0a5eefbfbd4b637d7e25efbfbd2907efbfbdefbfbdefbfbd094defbfbdefbfbdefbfbd413641efbfbd30efbfbdefbfbd46efbfbdefbfbd

The anomaly, in this case, results in a significantly longer value, which is not always the case. In order to deal with this anomaly a custom Android application was created, replicating the code from the application in order to generate a list of all possible PIN numbers. However, unlike the python code provided earlier, the application was written to reproduce the ‘toCharArray’ variation of the ‘password‘ creating a rainbow table to be used for the key derivation function. Note: the custom code has not been included in this case as the results will be provided instead.
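The replacement behaviour described above can be illustrated in Python with a lossy UTF-8 round trip. This is only an approximation of what happens on Android, not the application’s code, and the function name ‘lossy_utf8_roundtrip‘ is made up for this example.

```python
def lossy_utf8_roundtrip(raw):
    """Decode bytes as UTF-8, substituting U+FFFD for anything invalid,
    then encode the resulting text back to UTF-8 bytes."""
    text = raw.decode('utf-8', errors='replace')
    return text.encode('utf-8')

# First five bytes of the example SHA256 value from this post
sample = bytes.fromhex('a0626c0a5e')
print(lossy_utf8_roundtrip(sample).hex())  # efbfbd626c0a5e
```

The leading ‘a0‘ byte is not valid on its own in UTF-8, so it is swapped for ‘efbfbd‘ while the remaining bytes pass through unchanged. One caveat: decoders do not all agree on how many replacement characters to emit for a truncated multi-byte sequence. The example value in this post ends with two ‘efbfbd‘ sequences for its final two bytes, whereas Python may collapse such a pair into one, so output from this sketch should be validated against values generated on Android before being relied upon.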

Now that the correct string has been generated it provides the last piece for the PBKDF2 derivation, producing the value required for the next stage. In this example it produces: ‘d025fb14e02802e9c98fe0dfed87b546‘. PBKDF2 derivation can be accomplished using CyberChef.

To summarise the process so far the diagram below attempts to provide a collation of the steps:

Key Derivation Function Diagram 1
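The PBKDF2 stage summarised above could also be reproduced with Python’s standard library. The sketch below assumes a 16 byte output (inferred from the 32 hex character derived value) and that the altered password is supplied as its UTF-8 bytes; the function name ‘derive_stage_one_key‘ is hypothetical.

```python
from hashlib import pbkdf2_hmac

def derive_stage_one_key(password, salt, iterations=100, key_len=16):
    """PBKDF2-HMAC-SHA1 using the hard coded values described in this post.

    Assumption: key_len=16 bytes, inferred from the derived value's length.
    """
    return pbkdf2_hmac('sha1', password, salt, iterations, dklen=key_len)

# Altered 'password' value from this post, as raw bytes
password = bytes.fromhex(
    'efbfbd626c0a5eefbfbd4b637d7e25efbfbd2907efbfbdefbfbdefbfbd094d'
    'efbfbdefbfbdefbfbd413641efbfbd30efbfbdefbfbd46efbfbdefbfbd'
)
# 'Global IV' from the example '.key_store' file, used as the salt
salt = bytes.fromhex('A0E9D9526A6B8A4D7DF31EA5E85EDE05')

print(derive_stage_one_key(password, salt).hex())
```

If the assumptions are right, the printed value should correspond to the derived key quoted earlier, which makes for a quick sanity check before moving on to the next stage.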

Unfortunately the process doesn’t stop there as the new value is used as part of the next cryptographic function. During the analysis it was determined that in order to obtain the key a further decryption process was required, in the form of AES_CBC. As seen in previous posts, two values are required to facilitate AES_CBC encryption / decryption: the Key and the IV. The IV in this case is the ‘Global IV‘ as detailed above and the Key is the newly generated value from the PBKDF2 key derivation.

The final stage to obtain the correct encryption key requires the decryption of the last two values from the ‘.key_store‘ file, named in this post as ‘First Encrypted Value‘ and ‘Second Encrypted Value‘ (note: these are not the names referenced in the application’s code).

Code Snippet for Key Decryption

In Java there are two methods generally used for encryption, ‘doFinal’ and ‘update’. ‘doFinal’ is used for the data in its entirety whereas ‘update’ can be used for blocks of data, for example taking larger pieces and feeding them through in chunks. The code snippet above from the application demonstrates that the values are being used with the ‘update’ method, meaning it is dealing with the values separately. The reason why will be demonstrated below.

Using Python it is possible to accomplish the task and generate the correct key. Below is a snippet of code which will take the contents of the ‘.key_store‘ file, starting at the ‘Global IV‘, and replicate the steps from the application in order to decrypt the values individually in a way similar to ‘update‘:

### Import required modules
from Crypto.Cipher import AES
from hashlib import sha256

### Static values
pbkdf2Key = "d025fb14e02802e9c98fe0dfed87b546"
fullFile = "A0 E9 D9 52 6A 6B 8A 4D 7D F3 1E A5 E8 5E DE 05 36 B7 DB FB 6F 3E 4A 06 A9 20 B7 80 DD A2 19 E8 4A DC 18 E5 FC 0D 3E 60 5F 4B F6 9B 2E D0 16 60 23 0D 33 B0 C8 FF A3 65 D3 B7 C2 1E A5 62 66 1A 31 31 F5 70 8A 6D 2C BD 1C D9 61 78 B5 0C 4E BC"

### Convert the hex strings to bytes once
key = bytes.fromhex(pbkdf2Key)
fileBytes = bytes.fromhex(fullFile)

### Create new instance of AES Cipher, the 'Global IV' is the first 16 bytes
cipher = AES.new(key, AES.MODE_CBC, fileBytes[:AES.block_size])

### data = bytes 16:48 (as per java code)
data = fileBytes[16:48]

### data1 = bytes 48 to end of file as per java code
data1 = fileBytes[48:]

### Decrypt 'first encrypted value'
decrypted = cipher.decrypt(data)

### Print decrypted value
print(f'First Encrypted Value\t{decrypted.hex()}')

### Decrypt 'second encrypted value'
### The same cipher object is reused so the CBC chain continues, similar to 'update'
decrypted1 = cipher.decrypt(data1)

### Run the decrypted data through sha256 hash
hashStr = sha256(decrypted1)

### Print decrypted value
print(f'Second Encrypted Value\t{decrypted1.hex()}')

### Print the hash value as hex
print(f'Calculated Hash\t\t{hashStr.hexdigest()}')

The splitting of the values serves a specific purpose and answers the question: how does the application know whether the decryption key generated is correct? This is accomplished by carrying out another function on top of the decryption and comparing the values. The application takes the result of the first ‘update‘ function and compares this value to the SHA256 of the result of the second ‘update‘ function. If the decrypted ‘First Encrypted Value‘ and the SHA256 of the decrypted ‘Second Encrypted Value‘ match then the supplied password is correct.

Key Verification Flow
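The comparison in the verification flow above boils down to a single check, sketched below. The function name ‘is_master_key_valid‘ is invented for this example, and both inputs are assumed to be the already decrypted 32 byte values.

```python
from hashlib import sha256
from hmac import compare_digest

def is_master_key_valid(first_plain, second_plain):
    """Return True when the decrypted 'First Encrypted Value' equals the
    SHA256 digest of the decrypted 'Second Encrypted Value'."""
    return compare_digest(first_plain, sha256(second_plain).digest())
```

A wrong PIN produces a wrong PBKDF2 key, which in turn produces two unrelated decrypted values, so the check fails and the next candidate can be tried.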

Using the values from this example and those cited within the Python code above, three values are printed. The first is the decryption of the ‘First Encrypted Value‘ and the second value, ‘279657…‘, is the ‘Master Encryption Key‘, generated after the ‘Second Encrypted Value‘ has been decrypted. The third is the SHA256 of the ‘Master Encryption Key‘ for comparison to the first value:

Python Output

Being overzealous in trying to make sure it’s understood, the below diagram shows the whole process, hopefully aiding in making sense of the above:

Master Encryption Key Diagram

No access to ‘/data/data/’

Realistically the encryption in this application has been designed so that in order to decrypt the data the PIN needs to be known. Whether that is by getting the hash value from the preferences file or by knowing the PIN, either way this would be required to generate the overarching ‘Master Encryption Key‘. Strictly speaking we have used the preferences file in order to bruteforce the passcode, and could have just used the SHA1 hash directly from the file. But what happens when there is no access to ‘/data/data’, as is the case when the type of extraction is restricted in a way which does not result in the acquisition of the application data? We would still have the encrypted files from the SD card / emulated SD card but we wouldn’t be able to validate the user’s PIN code using the preferences file.

However, it is possible to brute force and identify the correct PIN code and in turn identify the correct decryption key required. This can be achieved in two ways: the first is using the file from within the ‘.metadata_store‘ folder and the second is by utilising the same function that the application uses to determine if the ‘Master Encryption Key‘ is correct.

The ‘.metadata_store‘ folder contains an encrypted file which when decrypted contains the following:

Decrypted Data from ‘.metadata_store’ File

The very first part of the decrypted data, ‘{“version’, is consistent in every install I have seen and the file is always present within the encrypted files. In order to identify the correct PIN / encryption key, carry out the process as detailed above for each possible passcode and attempt to decrypt that file. If the first part of the decrypted data is ‘{“version’, then the PIN / ‘Master Encryption Key‘ is correct and the media files can be decrypted. If ‘{“version’ is not at the start of the decrypted data then it needs to try the next PIN and so on.
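The known plaintext check described above could be structured as in the sketch below. To keep it free of any particular AES library, the decryption step is passed in as a callable; ‘find_pin‘, ‘candidates‘ and ‘decrypt_first_block‘ are all hypothetical names, and the marker is assumed to be the plain ASCII bytes {"version.

```python
def find_pin(candidates, decrypt_first_block, marker=b'{"version'):
    """Return the first (pin, key) pair whose decrypted output starts with marker.

    candidates: iterable of (pin, master_key) pairs, one per possible PIN.
    decrypt_first_block: callable taking a key and returning the decrypted
        start of the '.metadata_store' file (hypothetical helper).
    """
    for pin, key in candidates:
        if decrypt_first_block(key).startswith(marker):
            return pin, key
    # No candidate produced the expected plaintext
    return None
```

Because the marker is shorter than one AES block and the first CBC block depends only on the key, the IV and the first block of encrypted data, only the first 16 bytes of the file need to be decrypted for each candidate.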

Another option is to use the same process that the application uses. Similar to the above approach, each variant of PIN would need to go through the process of PBKDF2 derivation to generate the derived key. However, instead of using this to decrypt the file and checking for ‘{“version’, the derived key would be used to decrypt the ‘First Encrypted Value‘ and ‘Second Encrypted Value‘. The comparison of the SHA256 hash of the generated key to the decrypted ‘First Encrypted Value‘ would take place. If these match then the PIN and key are correct.

The code to carry out that process is longer than the theory but the process provides an examiner the chance to decrypt and view the files even if the files are rogue on an SD Card or the extraction type has not retrieved the relevant ‘/data/data’ folder. Let’s not forget that not all applications can be reverse engineered, or there simply isn’t the time to do it in all cases. However, if we can identify the PIN code for this application it may be the same for another one on the same handset.

Conclusion

The process used to generate the key in this case is very long on paper but programmatically can be achieved fairly easily with the exception of the abnormality seen in the Java ‘toCharArray’ function. This has been circumvented by generating all possible occurrences by replicating the original application code in a small Android app, which hopefully saves other people from needing to do the same.

It is possible to identify both the user’s PIN and pattern lock for the application if there is access to ‘/data/data/’. Furthermore it is possible to identify the user’s PIN in cases where access to ‘/data/data/’ is not possible by leveraging a bruteforce attack on known data or by using the same validation method implemented by the application.

Although snippets of code have been included for the purposes of explaining some concepts, a full script has been created and made available on my GitHub. The script is designed to take the relevant ‘/data/data/’ folder, derive the user’s PIN / pattern and decrypt any encrypted data provided. If provided with encrypted data only, the script can also bruteforce and identify the PIN, decrypting the data at the same time. During the post there is mention of the mobile application used to generate the rainbow table of values required for the PBKDF2 key derivation. The code for the mobile application is not included, however the file containing the results is available on the GitHub page.

As always, any feedback, queries or questions are welcome on here or on Twitter @4n6chewtoy.

Enjoy!
