The following document has some useful notes on how resource strings are stored within the resources section in a PE file. Quoting directly from the document:
4.8 String Table Resources
These tables are constructed in blocks of 16 strings. The
organization of these blocks of 16 is determined by the IDs given to
the various strings. The lowest four bits of the ID determine a
string s position in the block. The upper twelve bits determine
which block the string is in. Each block of 16 strings is stored as
one resource entry. Each string or error table resource block is
stored as follows:
[Normal resource header (type = 6 for strings)]
[Block of 16 strings. The strings are Pascal style with a WORD
length preceding the string. 16 strings are always written, even
if not all slots are full. Any slots in the block with no string
have a zero WORD for the length.]
It is important to note that the various blocks need not be written
out in numerical order in the resource file. Each block is assigned
an ordinal ID. This ID is the high 12 bits of the string IDs in the
block plus one (ordinal IDs can t be zero). The blocks are written
to the .RES file in the order the blocks are encountered in the .RC
file, while the CVTRES utility will cause them to become ordered in
the COFF object, and hence the image file.
Hence, to extract those string from the file, we can write a small Python script using pefile.
First we need to read the directory entry for the resources and see if there's an entry of type RT_STRING (value 6).
print [entry.id for entry in pe.DIRECTORY_ENTRY_RESOURCE.entries]
Would produce something like (3, 4, 5, 6, 9, 14, 24)
. Therefore we know that in this specific PE file the RT_STRING directory entry is at index 3.
A dump of the corresponding resource directory entry looks like:
Id: [0x6] (RT_STRING)
[IMAGE_RESOURCE_DIRECTORY_ENTRY]
Name: 0x6
OffsetToData: 0x80000108
[IMAGE_RESOURCE_DIRECTORY]
Characteristics: 0x0
TimeDateStamp: 0x0 [Thu Jan 1 00:00:00 1970 UTC]
MajorVersion: 0x0
MinorVersion: 0x0
NumberOfNamedEntries: 0x0
NumberOfIdEntries: 0x1
Id: [0x7]
[IMAGE_RESOURCE_DIRECTORY_ENTRY]
Name: 0x7
OffsetToData: 0x80000320
[IMAGE_RESOURCE_DIRECTORY]
Characteristics: 0x0
TimeDateStamp: 0x0 [Thu Jan 1 00:00:00 1970 UTC]
MajorVersion: 0x0
MinorVersion: 0x0
NumberOfNamedEntries: 0x0
NumberOfIdEntries: 0x1
[IMAGE_RESOURCE_DIRECTORY_ENTRY]
Name: 0x409
OffsetToData: 0x4B8
[IMAGE_RESOURCE_DATA_ENTRY]
OffsetToData: 0x251F0
Size: 0x48
CodePage: 0x0
Reserved: 0x0
We need to iterate through all the directory entries under the RT_STRING directory and read the data entries in order to be able to reach the actual string data.
We can iterate through the entries with the following code:
(the strings will be saved in the strings list)
# The List will contain all the extracted Unicode strings
#
strings = list()
# Fetch the index of the resource directory entry containing the strings
#
rt_string_idx = [
entry.id for entry in
pe.DIRECTORY_ENTRY_RESOURCE.entries].index(pefile.RESOURCE_TYPE['RT_STRING'])
# Get the directory entry
#
rt_string_directory = pe.DIRECTORY_ENTRY_RESOURCE.entries[rt_string_idx]
# For each of the entries (which will each contain a block of 16 strings)
#
for entry in rt_string_directory.directory.entries:
# Get the RVA of the string data and
# size of the string data
#
data_rva = entry.directory.entries[0].data.struct.OffsetToData
size = entry.directory.entries[0].data.struct.Size
print 'Directory entry at RVA', hex(data_rva), 'of size', hex(size)
# Retrieve the actual data and start processing the strings
#
data = pe.get_memory_mapped_image()[data_rva:data_rva+size]
offset = 0
while True:
# Exit once there's no more data to read
if offset>=size:
break
# Fetch the length of the unicode string
#
ustr_length = pe.get_word_from_data(data[offset:offset+2], 0)
offset += 2
# If the string is empty, skip it
if ustr_length==0:
continue
# Get the Unicode string
#
ustr = pe.get_string_u_at_rva(data_rva+offset, max_length=ustr_length)
offset += ustr_length*2
strings.append(ustr)
print 'String of length', ustr_length, 'at offset', offset