cgrr holds utility functions used by other modules for parsing game resource files.
At present, cgrr.py provides three things:
- verify, a simple function to verify that certain files exist in a path
- File, a namedtuple to be used with verify
- FileReader, a class used for reading files into dictionaries
Pass verify a list of files (instances of the File namedtuple) and a path, and
it will check that those files exist in that path. It is intended to be used to
confirm that a certain program resides in the given path, e.g. by checking that
the program's main executable is in the expected place.
identifying_files = [
    File("ARCHERY.EXE", 31616, "d8fae202edcc48d51a72026cbfbe7fa8"),
]
path = "path/to/archery"
verify(identifying_files, path)
The call to verify above will return True iff a file
path/to/archery/ARCHERY.EXE exists, is 31,616 bytes, and has md5 hash
d8fae202edcc48d51a72026cbfbe7fa8. If identifying_files contains multiple File
namedtuples, all of the files described in the list must be present.
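The check described above can be sketched with the standard library alone. This is a minimal illustration and not necessarily how cgrr.py implements it; the name verify_sketch and the chunked hashing are assumptions made for the example.

```python
import hashlib
import os
from collections import namedtuple

# Same three fields as the File namedtuple described in this README.
File = namedtuple("File", ["path", "size", "md5"])

def verify_sketch(identifying_files, base_path):
    """Hypothetical stand-in for verify: True iff every listed file matches."""
    for f in identifying_files:
        full_path = os.path.join(base_path, f.path)
        if not os.path.isfile(full_path):
            return False
        if os.path.getsize(full_path) != f.size:
            return False
        digest = hashlib.md5()
        with open(full_path, "rb") as handle:
            # Hash in chunks so large files need not fit in memory.
            for chunk in iter(lambda: handle.read(65536), b""):
                digest.update(chunk)
        if digest.hexdigest() != f.md5:
            return False
    return True
```

Checking existence and size before hashing lets a mismatch be rejected without reading the whole file.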
File is simply a namedtuple representing a file. The fields of the namedtuple
are path, size, and md5.
To create a new File:
example = File("path/to/example.tle", 12345, "0123456789abcdef0123456789abcdef")
The path should be given relative to some base path (e.g. the main path to the
program to be identified by that file) which will be passed to verify
separately.
size is the file size in bytes.
md5 is the md5 hash of the file.
FileReader is a factory that produces readers for specific file formats. A
reader provides two methods, unpack and pack, used respectively for parsing
data from files and unparsing it back into bytes. Under the hood, it uses the
struct module.
Construct a file reader with FileReader(format), where format is a
string describing the file format, such as:
score_reader = FileReader("""
<
Uint32 score # Score at index 0x00, before name
string[16] name
options[6] game_options # A six byte field with a custom data format
""")
The format of each line is
TYPE VARIABLE_NAME
or
TYPE[COUNT] VARIABLE_NAME
If COUNT is not specified, it defaults to 1.
Optionally, a line may contain a single character describing the endianness of the numbers in the file, in the style of struct. By default, little-endian ('<') integers are assumed.
Characters following a pound sign ('#') are treated as comments and ignored.
If TYPE is one of the builtin types supported by the struct module (e.g. Uint16), it will be processed by struct. For builtin types, COUNT is treated as the repeat count for struct: Uint32[4] means four 32-bit unsigned integers (16 bytes), and string[4] means a 4 byte string.
Otherwise, TYPE is treated as a user-defined type. Then COUNT is the number of bytes occupied by the variable, and the FileReader will look for a function named parse_TYPE (e.g. parse_options) when unpacking the data. If found, the function will be called with the bytestring as an argument and the return value assigned as the value of the variable. Similarly, the FileReader will pass the variable to a function named unparse_TYPE (e.g. unparse_options) which should return a bytestring of length COUNT when packing the data. If those functions are not defined, the bytes will be returned as-is.
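To make the line grammar above concrete, here is a minimal sketch of how a single format line could be split into its parts. The helper parse_format_line and its regular expression are hypothetical and written only for illustration; the parser in cgrr.py may differ.

```python
import re

# TYPE or TYPE[COUNT], followed by whitespace and a variable name.
LINE_RE = re.compile(r"^(?P<type>\w+)(?:\[(?P<count>\d+)\])?\s+(?P<name>\w+)$")

def parse_format_line(line):
    """Hypothetical helper: return (type, count, name), an endianness
    character, or None for a blank or comment-only line."""
    line = line.split("#", 1)[0].strip()  # drop everything after '#'
    if not line:
        return None
    if line in ("<", ">", "=", "!", "@"):  # struct byte-order characters
        return line
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError("unrecognized format line: %r" % line)
    count = int(match.group("count") or 1)  # COUNT defaults to 1
    return (match.group("type"), count, match.group("name"))
```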
The Struct used by this module can be accessed directly as
score_reader.struct, if desired.
The reader specified above will extract three variables from a 26-byte
file: score, a (little-endian) 32-bit unsigned integer; name, a
16-byte string; and game_options, a 6-byte field in a custom format.
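For comparison, the same 26-byte layout can be written directly with the struct module. The mapping used here (Uint32 to I, string[16] to 16s, the six custom bytes to 6s) is an assumption about how the format would translate, not something taken from cgrr.py.

```python
import struct

# Assumed struct equivalent: little-endian, one unsigned 32-bit integer,
# a 16-byte string, and 6 raw bytes for the custom field.
score_struct = struct.Struct("<I16s6s")

raw = score_struct.pack(1234, b"SomeName", b"\x00" * 6)
score, name, game_options = score_struct.unpack(raw)

# struct pads the short name with null bytes up to 16 bytes; strip them.
name = name.rstrip(b"\x00")
```

Working with struct directly yields a positional tuple, whereas a reader returns a dictionary keyed by variable name.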
Given a file in the required format, the file can be parsed with:
data = scorefile.read(26)
scores = score_reader.unpack(data)
which will produce scores, a dictionary with three entries:
scores = {"name" : "SomeName", "score" : 1234, "game_options" : b'......'}
Given a dictionary with these entries, pack can be used to generate a
scorefile in the original format:
data = score_reader.pack({"name": "Cheater",
                          "score": 9999,
                          "game_options": b'......'})
scorefile.write(data)
Since we didn't define parse_options and unparse_options functions,
the six bytes devoted to that variable are simply assigned directly. It
might be more useful to parse the options, however:
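def parse_options(b):
    # Map each of the six bytes to a named entry: option0 .. option5.
    return {'option' + str(i): b[i] for i in range(6)}

def unparse_options(o):
    # Rebuild the six-byte field from the dictionary.
    return bytes([o['option' + str(i)] for i in range(6)])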
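The sketch below shows one way the lookup of parse_TYPE and unparse_TYPE functions with a pass-through fallback could work. The convert helper and the explicit handlers dictionary are hypothetical; where FileReader actually searches for these functions is not specified here.

```python
def parse_options(b):
    return {'option' + str(i): b[i] for i in range(6)}

def unparse_options(o):
    return bytes([o['option' + str(i)] for i in range(6)])

def convert(prefix, type_name, value, namespace):
    """Hypothetical dispatch: call prefix_TYPE if defined, else pass through."""
    func = namespace.get(prefix + "_" + type_name)
    return func(value) if func is not None else value

handlers = {"parse_options": parse_options, "unparse_options": unparse_options}
raw = bytes([1, 0, 1, 1, 0, 2])

parsed = convert("parse", "options", raw, handlers)        # dict of six entries
restored = convert("unparse", "options", parsed, handlers)  # back to six bytes
untouched = convert("parse", "options", raw, {})            # no handler: as-is
```

The pair is a round trip: unparsing the parsed dictionary gives back the original six bytes.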
If you know the offsets of data in a file, but not necessarily the format of the
whole file, the from_offsets constructor may be more useful.
Construct a file reader with FileReader.from_offsets(format_def), where
format_def is a string describing the file format, such as:
FileReader.from_offsets('''
<
0x00 Uint32 score # Score at index 0x00, before name
0x04 string[16] name
0x14 options[6] options # A six byte field with a custom data format
0x1a EOF
''')
The format of each line is
OFFSET TYPE VARIABLE_NAME
or
OFFSET TYPE[COUNT] VARIABLE_NAME
The final line of format_def may be:
FILE_LENGTH EOF
OFFSET and FILE_LENGTH must be specified in hexadecimal. The number must begin with '0x' and may use either uppercase or lowercase digits, e.g. 0x1a and 0x1A are equivalent.
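As a small illustration of the offset rule, the following sketch accepts only values that begin with '0x' and treats the hexadecimal digits case-insensitively. The function parse_offset is hypothetical and is not part of cgrr.py.

```python
def parse_offset(text):
    """Hypothetical helper: convert an OFFSET such as '0x1a' to an integer."""
    if not text.startswith("0x"):
        raise ValueError("offset must begin with '0x': %r" % text)
    # int() with base 16 accepts the '0x' prefix and digits in either case.
    return int(text, 16)
```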
It is not required to specify offsets in any particular order.
Optionally, a line may contain a single character describing the endianness of the numbers in the file, in the style of struct. By default, little-endian ('<') integers are assumed.
For an explanation of the remaining segment of each line, see the
documentation for FileReader.
This function is useful if a file format contains unknown segments,
because from_offsets will automatically fill in the unknown segments
with dummy variables. So:
FileReader.from_offsets('''
<
0x00 Uint32 score # Score at index 0x00, before name
0x04 string[16] name
0x24 options[6] options # A six byte field with a custom data format
0x50 EOF
''')
is equivalent to:
FileReader('''
<
Uint32 score # 0x00-0x03: Score at index 0x00, before name
string[16] name # 0x04-0x13
unknown[16] unk1 # 0x14-0x23
options[6] options # 0x24-0x29: A six byte field with a custom data format
unknown[38] unk2 # 0x2a-0x4f
''')
The EOF statement is not required. If it is omitted, the end of the variable with the highest offset is presumed to be the end of the file.
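The gap filling described above can be sketched as follows. This is a hedged illustration, not the code in cgrr.py: fill_gaps is a hypothetical name, and for simplicity each field carries its size in bytes rather than a struct repeat count.

```python
def fill_gaps(fields, eof=None):
    """Hypothetical sketch of gap filling.

    fields: list of (offset, type_name, size_in_bytes, name), in any order.
    Returns (type_name, size_in_bytes, name) entries in file order, with
    'unknown' entries covering any bytes that no field describes.
    """
    result = []
    position = 0       # first byte not yet covered
    unknown_count = 0
    for offset, type_name, size, name in sorted(fields):
        if offset < position:
            raise ValueError("field %r overlaps the previous field" % name)
        if offset > position:
            unknown_count += 1
            result.append(("unknown", offset - position, "unk%d" % unknown_count))
        result.append((type_name, size, name))
        position = offset + size
    if eof is not None:
        if eof < position:
            raise ValueError("EOF lies before the end of the last field")
        if eof > position:
            unknown_count += 1
            result.append(("unknown", eof - position, "unk%d" % unknown_count))
    return result
```

Applied to the example above (fields at 0x00, 0x04, and 0x24, with EOF at 0x50), this produces a 16-byte unk1 and a 38-byte unk2. Without an EOF value, the layout simply ends after the last field.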
cgrr.py is used by other modules in the CGRR project. For example:
- cgrr-gameboy, which reads and edits Game Boy ROM headers
- cgrr-gamecube, which reads and edits GameCube GCI files
- cgrr-mariospicross, which reads and edits puzzles for the Game Boy game Mario's Picross
- cgrr-pokemon, which reads and edits save files for Pokemon games
CGRR is available under the GPL v3 or later. See the file COPYING for details.