MIDI File Format Specifications
Standard MIDI files provide a common file format used by most musical software and hardware devices to store song information including the title, track names, and most importantly what instruments to use and the sequence of musical events, such as notes and instrument control information needed to play back the song. This standardization allows one software package to create and save files that can later be loaded and edited by another completely different program, even on a different type of computer. Almost every software music sequencer is capable of loading and saving standard MIDI files.
All data values are stored in Big-Endian (most significant byte first) format. Also, many values are stored in a variable-length format which may use one or more bytes per value. Variable-length values use the lower 7 bits of a byte for data and the top bit to signal a following data byte. If the top bit is set to 1, then another value byte follows. Below is a table of examples to help demonstrate how variable-length values are used.
Value (Hex) | Value (Bin) | Variable-Length (Hex) | Variable-Length (Bin) |
00 | 00000000 | 00 | 00000000 |
C8 | 11001000 | 8148 | 10000001 01001000 |
100000 | 00010000 00000000 00000000 | C08000 | 11000000 10000000 00000000 |
A variable-length value may use a maximum of 4 bytes. This means the maximum value that can be represented is 0x0FFFFFFF (represented as 0xFF, 0xFF, 0xFF, 0x7F).
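The encoding rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function names `encode_vlq` and `decode_vlq` are made up for this example.

```python
def encode_vlq(value):
    """Encode a non-negative integer as a MIDI variable-length value."""
    if not 0 <= value <= 0x0FFFFFFF:
        raise ValueError("value must fit in 28 bits (4 encoded bytes)")
    out = [value & 0x7F]                    # last byte: top bit clear
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)   # earlier bytes: top bit set
        value >>= 7
    return bytes(reversed(out))             # most significant group first


def decode_vlq(data, pos=0):
    """Return (value, next_pos) for the variable-length value starting at pos."""
    value = 0
    for i in range(4):                      # a value uses at most 4 bytes
        byte = data[pos + i]
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:                 # top bit clear: this was the last byte
            return value, pos + i + 1
    raise ValueError("variable-length value longer than 4 bytes")
```

With these helpers, `encode_vlq(0xC8)` produces the bytes 0x81 0x48 and `encode_vlq(0x100000)` produces 0xC0 0x80 0x00, matching the table rows above.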
MIDI files are organized into data chunks. Each chunk is prefixed with an 8-byte header: a 4-byte ID string that identifies the type of chunk, followed by a 4-byte size that gives the number of bytes following this chunk's header.
Header Chunk
The header chunk contains information about the entire song, including the MIDI format type, the number of tracks and the time division. There is only one header chunk per standard MIDI file and it always comes first. Before describing each element of the header chunk, here is a chart to help give an overview of the chunk's organization.
Offset | Length | Type | Description | Value |
0x00 | 4 | char[4] | chunk ID | "MThd" (0x4D546864) |
0x04 | 4 | dword | chunk size | 6 (0x00000006) |
0x08 | 2 | word | format type | 0 - 2 |
0x0A | 2 | word | number of tracks | 1 - 65,535 |
0x0C | 2 | word | time division | see following text |
Chunk ID and Size
The chunk ID is always "MThd" (0x4D546864) and the size is always 6 because the header chunk always contains the same 3 word values.
Format Type
The first word describes the MIDI format type. It can be a value of 0, 1 or 2 and describes how the following track information is to be interpreted.
A type 0 MIDI file has one track that contains all of the MIDI events for the entire song, including the song title, time signature, tempo and music events.
A type 1 MIDI file should have two or more tracks. The first, by convention, contains song information such as the title, time signature, tempo, etc. (more detail in the Track Chunk section). The second and following tracks contain a title, musical event data, etc. specific to that track. This closely matches the organization of modern multi-track MIDI sequencers.
A type 2 MIDI file is a combination of the other two types. It contains multiple tracks, but each track represents a different sequence which may not necessarily be played simultaneously. This is meant to be used to save drum patterns or other multi-pattern music sequences.
Number of Tracks
The second word simply defines the number of track chunks that follow this header chunk. A type 0 MIDI file may only contain a value of 1, because it can only contain one track. Type 1 and 2 MIDI files may contain up to 65,535 (0xFFFF) tracks.
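As a rough sketch of how the header layout described above might be read, the following uses Python's `struct` module with a big-endian format string. The function name `parse_header_chunk` and the returned dictionary keys are assumptions made for this example.

```python
import struct


def parse_header_chunk(data):
    """Parse the 14-byte MThd chunk found at the start of a MIDI file."""
    if len(data) < 14:
        raise ValueError("header chunk needs at least 14 bytes")
    # ">" = big-endian: 4-byte ID, 4-byte size, then three 2-byte words.
    chunk_id, size, fmt, ntracks, division = struct.unpack(">4sIHHH", data[:14])
    if chunk_id != b"MThd":
        raise ValueError("missing MThd chunk ID")
    if size != 6:
        raise ValueError("header chunk size must be 6")
    if fmt not in (0, 1, 2):
        raise ValueError("format type must be 0, 1 or 2")
    if fmt == 0 and ntracks != 1:
        raise ValueError("a type 0 file must contain exactly one track")
    return {"format": fmt, "tracks": ntracks, "division": division}
```

The time division word is returned undecoded here because its interpretation depends on its top bit.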
Time Division
The third and final word in the MIDI header chunk is a bit more complicated than the first two. It contains the time division used to decode the track event delta times into "real" time. This value represents either ticks per beat or frames per second. If the top bit of the word (bit mask 0x8000) is 0, the remaining 15 bits (bit mask 0x7FFF) describe the time division in ticks per beat. Otherwise, the word describes the time division in frames per second.
Ticks per beat translate to the number of clock ticks or track delta positions (described in the Track Chunk section) in every quarter note of music. Common values range from 48 to 960, although newer sequencers go far beyond this range to ease working with MIDI and digital audio together.
Frames per second is defined by breaking the word into its two bytes. In the Standard MIDI File specification, the top byte (bit mask 0xFF00, which includes the flag bit) holds the SMPTE frame rate as a negative two's-complement number: -24, -25, -29 (for 29.97 fps) or -30, stored as 0xE8, 0xE7, 0xE3 or 0xE2. The bottom byte (bit mask 0x00FF) defines how many clock ticks or track delta positions there are per frame. So a time division of 0xE778 can be broken down as follows: the top bit is 1, so it is in SMPTE frames per second format; the top byte 0xE7 is -25 in two's complement, giving 25 frames per second; and the bottom byte has a value of 120 (0x78). This means the example plays at 25 frames per second SMPTE time and has 120 ticks per frame.
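A small decoder makes the two interpretations of the time division word easier to compare. This sketch assumes the SMPTE frame rate in the high byte is a negative two's-complement number, as the Standard MIDI File 1.0 specification defines it; the function name and dictionary keys are hypothetical.

```python
def decode_time_division(division):
    """Split the 16-bit time division word into its meaning."""
    if division & 0x8000 == 0:
        # Top bit clear: the low 15 bits are ticks per beat (quarter note).
        return {"mode": "ticks_per_beat", "ticks_per_beat": division & 0x7FFF}
    high_byte = (division >> 8) & 0xFF
    # Assumption: the high byte is a negative two's-complement frame rate,
    # so 0xE7 (-25) means 25 frames per second.
    frames_per_second = 256 - high_byte
    ticks_per_frame = division & 0x00FF
    return {
        "mode": "smpte",
        "frames_per_second": frames_per_second,
        "ticks_per_frame": ticks_per_frame,
    }
```

A stored rate of 29 stands for 29.97 fps drop-frame, so a caller that needs real time would have to map that value itself.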
Track Chunk
Offset | Length | Type | Description | Value |
0x00 | 4 | char[4] | chunk ID | "MTrk" (0x4D54726B) |
0x04 | 4 | dword | chunk size | see following text |
0x08 | track event data (see following text) |
Chunk ID and Size
The chunk ID is always "MTrk" (0x4D54726B) and the size varies depending on the number of bytes used for all of the events contained in the track.
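Because every chunk starts with the same 8-byte prefix, a file can be walked chunk by chunk without understanding the contents. The generator below is an illustrative sketch; `iter_chunks` is a name invented for this example.

```python
import struct


def iter_chunks(data):
    """Yield (chunk_id, body) for each chunk in a MIDI file's bytes."""
    pos = 0
    while pos + 8 <= len(data):
        # 4-byte ID followed by a big-endian 4-byte size.
        chunk_id, size = struct.unpack_from(">4sI", data, pos)
        body_start = pos + 8
        body_end = body_start + size
        if body_end > len(data):
            raise ValueError("chunk extends past the end of the data")
        yield chunk_id, data[body_start:body_end]
        pos = body_end      # the next chunk starts right after this body
```

A reader would typically expect the first yielded ID to be `b"MThd"` and the following ones to be `b"MTrk"`, skipping any chunk ID it does not recognize.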
Track Event Data
The track event data contains a stream of MIDI events that define information about the sequence and how it is played. The next section describes the different types of events.
Delta-Times
The event delta time is defined by a variable-length value. It determines when an event should be played relative to the track's last event. A delta time of 0 means that it should play simultaneously with the last event. A track's first event delta time defines the amount of time to wait before playing this first event. Events unaffected by time are still preceded by a delta time, but should always use a value of 0 and come first in the stream of track events. Examples of this type of event include track titles and copyright information.
The most important thing to remember about delta times is that they are relative values, not absolute times. The actual time they represent is determined by two factors: the time division (defined in the MIDI header chunk) and the tempo (defined with a track event). If no tempo is defined, 120 beats per minute is assumed.
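To make the relationship between delta times, time division and tempo concrete, here is a minimal sketch for the ticks-per-beat case. It assumes a constant tempo for the whole span being converted, and both function names are hypothetical.

```python
def absolute_ticks(deltas):
    """Turn a list of relative delta times into absolute tick positions."""
    total = 0
    positions = []
    for delta in deltas:
        total += delta          # each delta is relative to the previous event
        positions.append(total)
    return positions


def ticks_to_seconds(ticks, ticks_per_beat, bpm=120.0):
    """Convert ticks to seconds; 120 bpm is used when no tempo is given."""
    seconds_per_beat = 60.0 / bpm
    return ticks * seconds_per_beat / ticks_per_beat
```

For example, at a time division of 480 ticks per beat and the default tempo, a delta of 480 ticks lasts half a second. A file that changes tempo partway through would need the conversion applied piecewise, one tempo segment at a time.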
Types of Events
There are three types of events: MIDI Channel Events, System Exclusive Events and Meta Events.
MIDI Channel Events
Musical control information, such as playing a note or adjusting a MIDI channel's modulation value, is defined by MIDI Channel Events. Each MIDI Channel Event consists of a variable-length delta time (like all track events) and a two- or three-byte description which determines the MIDI channel it corresponds to, the type of event it is and one or two event-type-specific values. Below is a table illustrating how MIDI Channel Events are formatted.
Delta Time | Event Type Value | MIDI Channel | Parameter 1 | Parameter 2 |
variable-length | 4 bits | 4 bits | 1 byte | 1 byte |
MIDI Channel Events are the most common type of track event and usually make up the bulk of a MIDI file. The following table gives an overview of the seven MIDI Channel Events, listing their numeric value and parameters.
Event Type | Value | Parameter 1 | Parameter 2 |
Note Off | 0x8 | note number | velocity |
Note On | 0x9 | note number | velocity |
Note Aftertouch | 0xA | note number | aftertouch value |
Controller | 0xB | controller number | controller value |
Program Change | 0xC | program number | not used |
Channel Aftertouch | 0xD | aftertouch value | not used |
Pitch Bend | 0xE | pitch value (LSB) | pitch value (MSB) |
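The table above maps directly onto a small lookup. The sketch below decodes one channel event that starts at the status byte, after the delta time has already been read. It assumes every event carries its own status byte, and the names `CHANNEL_EVENTS`, `decode_channel_event` and `pitch_bend_value` are invented for this example.

```python
# Event type value -> (name, number of parameter bytes actually used)
CHANNEL_EVENTS = {
    0x8: ("Note Off", 2),
    0x9: ("Note On", 2),
    0xA: ("Note Aftertouch", 2),
    0xB: ("Controller", 2),
    0xC: ("Program Change", 1),
    0xD: ("Channel Aftertouch", 1),
    0xE: ("Pitch Bend", 2),
}


def decode_channel_event(data, pos=0):
    """Return (event_dict, next_pos) for the channel event at data[pos]."""
    status = data[pos]
    event_type = status >> 4        # upper 4 bits: event type value
    channel = status & 0x0F         # lower 4 bits: MIDI channel
    if event_type not in CHANNEL_EVENTS:
        raise ValueError("not a MIDI channel event")
    name, nparams = CHANNEL_EVENTS[event_type]
    params = tuple(data[pos + 1 : pos + 1 + nparams])
    if len(params) != nparams:
        raise ValueError("event is truncated")
    event = {"name": name, "channel": channel, "params": params}
    return event, pos + 1 + nparams


def pitch_bend_value(lsb, msb):
    """Combine the two 7-bit pitch bend parameters into one 14-bit value."""
    return (msb << 7) | lsb
```

Decoding the bytes 0x93, 60, 100 this way gives a Note On on channel 3 with note number 60 and velocity 100.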
Although all of the MIDI Channel Events follow the same basic format, each one requires a bit of explanation. Below is a detailed description of each and how it is used.
Note Off Event
Note On Event
Note Aftertouch Event
Controller Event
Below is a list of the defined MIDI controller types.
Program Change Event
Channel Aftertouch Event
Pitch Bend Event
Meta Events
Events that are not to be sent or received over a MIDI port are called Meta Events. These events are defined by an event type value of 0xFF and have a variable size of parameter data which is defined after the event type.
Meta Event | Type | Length | Data |
255 (0xFF) | 0-255 | variable-length | type specific |
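The layout in the table above (0xFF marker, type byte, variable-length length, then data) can be read with a short routine. This is a sketch under the assumption that the delta time has already been consumed; `read_vlq` and `parse_meta_event` are hypothetical helper names.

```python
def read_vlq(data, pos):
    """Read a variable-length value; return (value, next_pos)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:         # top bit clear marks the final byte
            return value, pos


def parse_meta_event(data, pos=0):
    """Return (meta_type, payload, next_pos) for the meta event at data[pos]."""
    if data[pos] != 0xFF:
        raise ValueError("meta events start with 0xFF")
    meta_type = data[pos + 1]
    length, pos = read_vlq(data, pos + 2)
    payload = data[pos : pos + length]
    if len(payload) != length:
        raise ValueError("meta event data is truncated")
    return meta_type, payload, pos + length
```

Because the length is stored explicitly, a reader can skip any meta event type it does not understand and continue with the next track event.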
There are currently fifteen defined Meta Events. Each one is described in detail below.
Sequence Number
Text Event
Copyright Notice
Sequence/Track Name
Instrument Name
Lyrics
Marker
Cue Point
MIDI Channel Prefix
End Of Track
Set Tempo
SMPTE Offset
Time Signature
Key Signature
Sequencer Specific
System Exclusive Events
Also known as SysEx Events, these MIDI events are used to control MIDI hardware or software that requires special data bytes that follow the manufacturer's specifications. Every SysEx event includes an ID that specifies which manufacturer's product is the intended receiver. All other products will ignore the event. There are three types of SysEx messages, which are used to send data in a single event, send data across multiple events, or authorize the transmission of specific MIDI messages.
Normal SysEx Events
Divided SysEx Events
Authorization SysEx Events