TASVideos

GM2

This is a proposal for a new movie format for recording movies with the Gens Sega Genesis emulator. There is no code that implements this proposal, and the implemented version may be different. This proposal is provided as a set of guidelines to let the eventual implementer know which issues need to be addressed.

There is an ongoing discussion about this proposal on the forum.

The source code of ucon64 may prove helpful for extracting information directly from ROMs.

Issues to be Addressed

Extensibility

The GMV format cannot easily be modified or improved. The GM2 format should allow additional information to be inserted in a forwards-compatible manner. GM2 does not need to be backwards-compatible with GMV.

Configuration Options

The GMV format does not store certain configuration options which can affect synchronization. These include the following.
  • sound rate [1]
  • SRM information (important for SegaCD recording, when stable)
  • perfect synchro (important for SegaCD recording, when stable)
  • country
  • up/down left/right (important for recording)
  • Video resolution. [7]

Game Information

GMV does not contain any information on which ROM was used to record a movie. The GM2 format should contain both the internal ROM name and a checksum of the ROM.

Hexedit-friendly

The GMV format is relatively hexedit-friendly. However, GM2 may still be considered hexedit-friendly as long as there is a portable editor or GM2<->text converter available. The ability to directly load and save an ASCII version of GM2 fulfills this requirement.

Compression

Button-press recordings compress very well, but GMV does not allow for internal compression. GM2 should allow movies to be optionally compressed as zip files for publication. This prevents the need for separate compression/decompression steps when submitting movies or viewing submitted movies.

In other words, a zip archive containing a valid GM2 file is also a valid GM2 file. (Not recursively, that would be silly.)
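
A reader could tell the two cases apart by signature. The following Python sketch is only an illustration: the function name `read_gm2_bytes` is hypothetical, and it assumes that any archive member starting with the GM2 magic number is the movie, which the proposal does not specify.

```python
import io
import zipfile

GM2_MAGIC = b"GM2\x00"  # magic number from the proposed header


def read_gm2_bytes(raw: bytes) -> bytes:
    """Return raw GM2 bytes from a bare GM2 file or from a zip holding one."""
    if raw[:4] == GM2_MAGIC:
        return raw
    buf = io.BytesIO(raw)
    if zipfile.is_zipfile(buf):
        with zipfile.ZipFile(buf) as zf:
            for name in zf.namelist():
                data = zf.read(name)
                # Only one level of unwrapping: nested zips are not followed.
                if data[:4] == GM2_MAGIC:
                    return data
    raise ValueError("not a GM2 file or a zip archive containing one")
```

Because the check is not recursive, a zip inside a zip is rejected, which matches the remark above.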

Number of Players

GMV is hard-coded to allow 2 players with 6-button controllers. Version 9D contained a hack to allow 3 players with 3-button controllers, but this is not an ideal solution. GM2 should allow at least 3 players with 6-button controllers. Ideally, the number and configuration of controllers would be variable so that an arbitrary configuration could be stored with no wasted space.

Integrity

A GM2 file should contain an internal checksum to ensure file integrity. This checksum need only alert a user of errors resulting from corruption in storage and transit.

Current Proposal

The current proposal is a chunk-based format, superficially similar to PNG. This format is a stripped-down version of Jyzero's proposal. Integers are unsigned and little-endian; they are 4 bytes wide unless a field gives another size. All strings are UTF-8 encoded and null-terminated unless otherwise specified. Lengths given include terminal nulls.

Header

  0x000 - 4-byte magic number 'GM2\0'
  0x004 - 4-byte CRC of file (not including the first 8 bytes), or 0
  0x008 - 1-byte number of chunks in file. *
  0x009 - 3-byte 0 padding. +
  0x00C - 4-byte movie "uid" - recording time in epoch seconds, useful for
                               identifying movies and states with each other
  0x010 - start of chunk data
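
As a rough illustration, the 16-byte header could be read as follows in Python. This is a sketch, not an implementation from Gens: `parse_header` is a hypothetical name, and the use of zlib's CRC-32 is an assumption, since the proposal does not say which CRC variant is meant.

```python
import struct
import zlib

# magic, CRC, chunk count, 3 padding bytes, uid (little-endian, no alignment)
HEADER = struct.Struct("<4sIB3xI")


def parse_header(raw: bytes) -> dict:
    if len(raw) < HEADER.size:
        raise ValueError("file too short for a GM2 header")
    magic, crc, chunk_count, uid = HEADER.unpack_from(raw, 0)
    if magic != b"GM2\x00":
        raise ValueError("bad magic number")
    # A stored CRC of 0 means "no checksum"; otherwise it covers byte 8 onward.
    # ASSUMPTION: CRC-32 as computed by zlib.
    crc_ok = crc == 0 or crc == (zlib.crc32(raw[8:]) & 0xFFFFFFFF)
    return {
        "chunk_count": chunk_count,
        "uid": uid,
        "crc_ok": crc_ok,
        "data_offset": HEADER.size,  # 0x010, start of chunk data
    }
```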

General Chunk Format

Values are offsets from the beginning of a chunk. No chunk may appear more than once. Required chunks must appear exactly once.

  0x000 - 8-byte chunk ID (ASCII encoded, fixed 8-byte size, not null-terminated) [2]
  0x008 - 4-byte chunk data length
  0x00C - chunk data
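
The layout above lends itself to a simple walk over the file. The Python sketch below is one possible reading, with a hypothetical helper name (`iter_chunks`); it yields every chunk, known or not, so a reader can skip IDs it does not understand, which is what makes the format extensible.

```python
import struct


def iter_chunks(raw: bytes, offset: int, count: int):
    """Yield (chunk_id, data) pairs; offset is where chunk data starts (0x010)."""
    seen = set()
    for _ in range(count):
        if offset + 12 > len(raw):
            raise ValueError("truncated chunk header")
        # Chunk IDs are fixed-width ASCII, padded with trailing spaces.
        chunk_id = raw[offset:offset + 8].decode("ascii")
        (length,) = struct.unpack_from("<I", raw, offset + 8)
        data = raw[offset + 12:offset + 12 + length]
        if len(data) != length:
            raise ValueError("truncated chunk data")
        if chunk_id in seen:
            raise ValueError(f"duplicate chunk {chunk_id!r}")
        seen.add(chunk_id)
        yield chunk_id, data
        offset += 12 + length
```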

Required Chunks

These chunks must always be present in a GM2 file.

Start Information

  ID: 'start   '
This chunk tells Gens whether the movie begins from a saved state, SRAM, or power on.

Note: If a movie starts from a saved state, the frame data is interpreted as starting from that state. Frame data from before that state will not be stored.

  1-byte flag (0 = power on, 1 = SRAM, 2 = state)
if starting from SRAM:
  4-byte length of embedded SRAM image
  SRAM data (uncompressed)
if starting from a state:
  4-byte number of first frame
  4-byte length of embedded state
  state data (uncompressed)
Note: The state contains the SRAM data, so loading both is redundant.

This chunk must occur before the frame chunk. [3]

Recording information

  ID: 'record  '
data:
  4-byte Gens version string length
  Gens version string
  4-byte author name length
  author name
  4-byte author comment length
  author comment
  4-byte re-record count
  4-byte number of recorded frames (including all recorded-over frames)
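
A minimal Python sketch of reading this chunk, assuming the string rules given under "Current Proposal" (UTF-8, null-terminated, length includes the null). The names `read_string` and `parse_record_chunk` are hypothetical.

```python
import struct


def read_string(data: bytes, pos: int):
    """Read one length-prefixed string; return (text, position after it)."""
    (length,) = struct.unpack_from("<I", data, pos)
    pos += 4
    raw = data[pos:pos + length]
    if len(raw) != length:
        raise ValueError("string runs past the end of the chunk")
    # The stored length includes the terminating null, which is dropped here.
    if raw.endswith(b"\x00"):
        raw = raw[:-1]
    return raw.decode("utf-8"), pos + length


def parse_record_chunk(data: bytes) -> dict:
    version, pos = read_string(data, 0)
    author, pos = read_string(data, pos)
    comment, pos = read_string(data, pos)
    rerecords, recorded_frames = struct.unpack_from("<II", data, pos)
    return {
        "gens_version": version,
        "author": author,
        "comment": comment,
        "rerecords": rerecords,
        "recorded_frames": recorded_frames,
    }
```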

Game Requirements

  ID: 'gamereqs'
data:
  1-byte country identifier
    0x00 = Japan (NTSC)
    0x01 = USA   (NTSC)
    0x02 = Europe(PAL)
    0x03 = Japan (PAL)
  1-byte system identifier:[6]
    0x01 - Genesis game
    0x02 - 32x game
    0x04 - SegaCD game
    0x06 - SegaCD + 32x game
    0x08 - Master System game (for Meka, if applicable)
  4-byte config flags:
   0x00000001 - Z80 enabled [4]
   0x00000002 - YM2612 enabled
   0x00000004 - PSG enabled
   0x00000008 - DAC enabled
   0x00000010 - PCM enabled
   0x00000020 - PWM enabled
   0x00000040 - CDDA enabled
   0x00000080 - YM2612 improvement enabled
   0x00000100 - DAC improvement enabled
   0x00000200 - PSG improvement enabled
   0x00000400 - Sample-rate multiplier, low bit +
   0x00000800 - Sample-rate multiplier, high bit (the two bits form a number n; rate is 11025 Hz times 2 to the power n) +
   0x0000?000 - reserved for future chipset options *
   0x00010000 - enable up/down and left/right while recording[5]
  4-byte calculated ROM CRC
  4-byte internal ROM name length
  internal ROM name (initially 48 bytes, converted to null-terminated UTF-8)
  4-byte calculated Genesis BIOS CRC, or 0 --
if system is 32x: --
  4-byte 68K BIOS calculated CRC, or 0
  4-byte Master SH2 BIOS calculated CRC, or 0 --
  4-byte Slave SH2 BIOS calculated CRC, or 0 --
if system is SegaCD: --
  4-byte SegaCD BIOS calculated CRC, or 0
--
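
To illustrate how the config flags might be decoded, here is a short Python sketch. The function and key names are hypothetical; the bit values are taken from the list above, and the sample rate is computed as 11025 Hz times two raised to the 2-bit multiplier.

```python
# Bit values from the 'gamereqs' config flags; names are shortened labels.
CONFIG_FLAGS = {
    0x001: "Z80",
    0x002: "YM2612",
    0x004: "PSG",
    0x008: "DAC",
    0x010: "PCM",
    0x020: "PWM",
    0x040: "CDDA",
    0x080: "YM2612 improvement",
    0x100: "DAC improvement",
    0x200: "PSG improvement",
}
UP_DOWN_LEFT_RIGHT = 0x00010000


def decode_config(flags: int) -> dict:
    # Bits 0x400 (low) and 0x800 (high) form a 2-bit multiplier exponent.
    exponent = (flags >> 10) & 0b11
    return {
        "enabled": [name for bit, name in CONFIG_FLAGS.items() if flags & bit],
        "sample_rate_hz": 11025 << exponent,
        "up_down_left_right": bool(flags & UP_DOWN_LEFT_RIGHT),
    }
```

For example, a flags value of 0x00010803 decodes to Z80 and YM2612 enabled, a 44100 Hz sample rate, and up/down left/right allowed.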

Frame Data

  ID: 'frames  '
data:
  4-byte frame count (number of frames in the movie, not the number of the last frame)
  1-byte controller recorded flags: *
   0x01 - Controller 1 recorded *
   0x02 - Controller 2 recorded *
   0x04 - Controller 1B recorded +
   0x08 - Controller 1C recorded +
   0x10 - Controller 1D recorded +
   0x20 - Controller 2B recorded +
   0x40 - Controller 2C recorded +
   0x80 - Controller 2D recorded +
  1-byte controller 1 type (ignored when 1 recorded flag is 0) *
  1-byte controller 2 type (ignored when 2 recorded flag is 0) *
  1-byte controller 1B type (ignored when 1B recorded flag is 0) *
  1-byte controller 1C type (ignored when 1C recorded flag is 0) *
  1-byte controller 1D type (ignored when 1D recorded flag is 0) *
  1-byte controller 2B type (ignored when 2B recorded flag is 0) *
  1-byte controller 2C type (ignored when 2C recorded flag is 0) *
  1-byte controller 2D type (ignored when 2D recorded flag is 0) *
    where
    0x00 = 3-button -*
    0x01 = 6-button *
    0x02 = Mega Mouse *
    0x03 = Sega Menacer *
    0x04 = TeeVGolf *
    0x05 = Batter Up! *
    0x06 = Sega Master System controller *
Frame segment order: +
  console buttons+
  Player 1 (if recorded)+
  Player 2 (if recorded)+
  Player 1B (if recorded)+
  Player 1C (if recorded)+
  Player 1D (if recorded)+
  Player 2B (if recorded)+
  Player 2C (if recorded)+
  Player 2D (if recorded)+
frame data:
  1 means a button is not pressed
  0 means a button is pressed
Console data: +
  1 byte: power cycle, reset button, SMS Pause button
3-button data:
  1 byte: a,b,c,start,up,down,left,right
6-button data:
  1 byte: a,b,c,start,up,down,left,right
  1 byte: x,y,z,mode
Sega Master System controller data: +
  1 byte: 1,2,up,down,left,right+
Format and number of bytes for other peripherals to be decided at a later date. See http://www.vidgame.net/SEGA/peripherals.htm for a list of peripherals.

Note: Only bytes for active controllers are stored. If a movie has two 6-button controllers (2 bytes each) and one 3-button controller (1 byte), each frame will hold 5 bytes of controller data, in addition to the 1-byte console data.
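
The per-frame size follows directly from the recorded flags and controller types. The Python sketch below is a hypothetical helper (`frame_size`) that covers only the controller types whose byte counts are defined so far (3-button, 6-button, SMS); whether to count the 1-byte console data listed in the frame segment order is left as a parameter.

```python
# Bytes per frame for the controller types defined so far.
BYTES_PER_TYPE = {0x00: 1, 0x01: 2, 0x06: 1}  # 3-button, 6-button, SMS

# Flag bits in the order the controllers are stored within a frame.
SLOT_BITS = [
    ("1", 0x01), ("2", 0x02), ("1B", 0x04), ("1C", 0x08),
    ("1D", 0x10), ("2B", 0x20), ("2C", 0x40), ("2D", 0x80),
]


def frame_size(recorded_flags, types, console_byte=True):
    """Return the number of bytes one frame occupies.

    types holds the eight controller type bytes in the order
    1, 2, 1B, 1C, 1D, 2B, 2C, 2D; entries for unrecorded slots are ignored.
    """
    size = 1 if console_byte else 0
    for (_name, bit), ctype in zip(SLOT_BITS, types):
        if recorded_flags & bit:
            size += BYTES_PER_TYPE[ctype]
    return size
```

With controllers 1 and 2 as 6-button pads and 1B as a 3-button pad (flags 0x07), this gives 5 bytes of controller data, or 6 bytes when the console byte is counted.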

Optional Chunks

These chunks are optional for a GM2 file.

Console Events

  ID: 'consevnt'
It makes more sense for these to go in frame data. An extra 1 byte per frame is not a big deal, especially if these are being compressed. +

Saved States

  ID: 'states  '
This chunk contains one or more unordered saved states. This allows viewers to skip to another level or to watch a level over again with minimal effort. *
Chunk format: +
  4-byte number of states+
State format:
  4-byte state name length
  state name (decided by recorder)
  4-byte state length
  state data (uncompressed)

Subtitles

  ID: 'subtitle'
This chunk contains one or more strings and associated data to be displayed during the movie. These do not need to be sorted. Each string is stored as follows:
  4-byte size of string (in bytes)
  string data
  2-byte x-coordinate (upper left corner of first letter)
  2-byte y-coordinate
  4-byte text color (red,green,blue,transparency, stored as 0xRRGGBBTT)
  4-byte border color (rgbt)
  4-byte first frame to display text
  4-byte last frame to display text
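
A hedged Python sketch of reading these entries follows. It rests on several assumptions the proposal leaves open: entries are packed back to back until the chunk data ends (no count is stored), and each color is a little-endian integer whose value reads 0xRRGGBBTT. The name `parse_subtitles` is hypothetical.

```python
import struct

# x, y, text color, border color, first frame, last frame
TAIL = struct.Struct("<HHIIII")


def split_rgbt(color: int):
    """Split an integer of the form 0xRRGGBBTT into its four components."""
    return ((color >> 24) & 0xFF, (color >> 16) & 0xFF,
            (color >> 8) & 0xFF, color & 0xFF)


def parse_subtitles(data: bytes) -> list:
    subs, pos = [], 0
    # ASSUMPTION: entries repeat until the end of the chunk data.
    while pos < len(data):
        (size,) = struct.unpack_from("<I", data, pos)
        pos += 4
        # Drop the terminating null that the stored size includes.
        text = data[pos:pos + size].split(b"\x00", 1)[0].decode("utf-8")
        pos += size
        x, y, color, border, first, last = TAIL.unpack_from(data, pos)
        pos += TAIL.size
        subs.append({
            "text": text, "x": x, "y": y,
            "text_rgbt": split_rgbt(color),
            "border_rgbt": split_rgbt(border),
            "first_frame": first, "last_frame": last,
        })
    return subs
```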

Text-only Format

This format exists to simplify hexediting, since manually hexediting the binary GM2 format is non-trivial. This format will consist of only ASCII characters and will be very easy to edit with any text-manipulation tool. Gens will be able to read from and write to this format natively. Because it is intended for editing, this format will not contain any corruption detection measures such as a CRC checksum. Line separators may be either unix-style ('\n') or DOS-style ('\r\n').

Header

The header will consist of the following 7 characters in the first line:
  GM2TEXT

Generic Text-only Chunk

Each chunk will occupy one line and will consist of the following fields, with no separation:
  8-character chunk name
  base-64 encoded chunk data, not including chunk header
  1-character ASCII exclamation mark ('!')
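
One way to produce and read such a line is sketched below in Python. The helper names are hypothetical, and the standard base-64 alphabet is assumed; since that alphabet does not contain '!', the terminator is unambiguous.

```python
import base64


def encode_text_chunk(chunk_id: str, data: bytes) -> str:
    """Build one text-format line: 8-char name, base-64 data, then '!'."""
    if len(chunk_id) != 8:
        raise ValueError("chunk name must be exactly 8 characters")
    return chunk_id + base64.b64encode(data).decode("ascii") + "!"


def decode_text_chunk(line: str):
    """Return (chunk_id, data) from one text-format chunk line."""
    line = line.rstrip("\r\n")  # accept unix-style or DOS-style line endings
    if len(line) < 9 or not line.endswith("!"):
        raise ValueError("malformed chunk line")
    return line[:8], base64.b64decode(line[8:-1])
```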

Text-only Frame Chunk

Because this chunk is intended for editing, all data will be encoded into human-editable text. The format will appear as follows, one line for each field:
  8-character chunk name ('frames  ')
  recorded controllers ('controllers_recorded:1,1,1,0,0,0,0,0') *
  controller types ('controller_types:2,1,2,0,0,0,0,0')
Note: Because editing is likely to change the number of frames, the frame count is explicitly excluded from this data. It must be derived from the number of frame lines in the chunk.
Note: The values on the 'controller_types' line represent the following controllers: 1, 2, 1b, 1c, 1d, 2b, 2c, 2d.

After this, the chunk will contain frame data conforming to the following BNF description:

  lines             := lines line |
                       line |
                       "!" "\n"
  line              := frame-comment ":" button-lists "\n"
  frame-comment      := non-colon-char frame-comment |
                       ""
  non-colon-char    := a-zA-Z0-9,./<>?;'"{}[]\|~!@#$%^&*()_+=-
  button-lists      := button-lists "," button-list |
                       button-list |
                       ""
  button-list       := controller-number button-chars
  controller-number := 1-8
  button-chars      := button-chars button-char |
                       ""
  button-char       := UuDdLlRrAaBbCcSsXxYyZzMm
It is recommended that Gens store the number of each frame before the colon to make editing simpler. However, Gens must ignore this information when reading the file, because editing may cause it to become inconsistent.

The end of this chunk will consist of an ASCII exclamation mark at the end of a line.

Button presses are stored in comma-separated strings consisting of 1 character for the controller number and 1 or more characters for the buttons pressed. The controller number must be an ASCII character 1-8. Valid characters for buttons are u,d,l,r,a,c,b,s,x,y,z,m (case insensitive). If a frame lists the same controller number in multiple segments, only the last segment for that controller will be used. All of the following are valid frames:

  45:1abud
  painintheassframe:1ab,2d,1alb,1alaU
  foobar:1u
  :
  ?!?!;:1a,2cu,3l,6r
  xyz:
  foo:1bar,2cab
  this_is_the_last_frame:1ab!
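
The rules above can be sketched as a small Python parser. This is an illustration with a hypothetical function name (`parse_frame_line`); it treats everything before the first colon as a comment to ignore, accepts a trailing '!' as the chunk terminator, and lets a later segment for a controller replace an earlier one. It does not try to interpret a line consisting of '!' alone.

```python
VALID_BUTTONS = set("udlracbsxyzm")


def parse_frame_line(line: str):
    """Return ({controller: set of buttons}, is_last_frame) for one frame line."""
    # The comment before the first ':' (often a frame number) is ignored.
    _comment, sep, lists = line.rstrip("\r\n").partition(":")
    if not sep:
        raise ValueError("frame line needs a ':'")
    # Buttons never contain '!', so a trailing '!' must be the terminator.
    is_last = lists.endswith("!")
    if is_last:
        lists = lists[:-1]
    pressed = {}
    for segment in filter(None, lists.split(",")):
        controller, buttons = segment[0], segment[1:].lower()
        if controller not in "12345678":
            raise ValueError(f"bad controller number {controller!r}")
        if not set(buttons) <= VALID_BUTTONS:
            raise ValueError(f"bad button in {segment!r}")
        # A later segment for the same controller replaces the earlier one.
        pressed[int(controller)] = set(buttons)
    return pressed, is_last
```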

[1]: Sound rate should not affect synchronization. --Bisqwit

    • In a utopian world maybe, but "should not" and "does not" aren't the same thing. Both Gens and Snes9x have problems where sound output affects emulation but shouldn't. 44100 should always be preselected in Gens, but that is an emulator implementation matter and not a format one. -- Truncated

    • In my opinion, sound output should NEVER affect emulation. Instead of adding that header, fix the "sound output that affects emulation" problem. Hax like that won't resolve the problem. -- Phil

[2]: I think the chunk ID should be restricted into ASCII bytes. This avoids inconvenient byte/character length confusions that may apply in fixed-width fields. Also you should specify the padding used (spaces?). And direction of padding? --Bisqwit

    • Changed chunk ID to ASCII (I think it was that previously). Padding is specified as seen in every chunk (spaces at the end). -- Truncated

[3]: Why must it occur before the input stream? Even though Gens will probably always save the chunks in the same order, why must this be a requirement? I thought the idea of chunks was that new ones could be added, left out, or ignored, regardless of order. --Truncated

    • Maybe not necessarily. It would just perhaps make the playback / other analysis based on the movie easier. At least it wouldn't need backtracking in the file. --Bisqwit

    • It's a very small change in the Gens code that would make external processing significantly simpler. I think it's a good trade-off. --ideamagnate

    • Because the input stream is constantly changing size, it should always be the last chunk. Otherwise, all data after it will also have to be rewritten constantly, which increases the chance of corruption, as well as general wear on the disk. --upthorn

[4]: nitsuja: Can these "x improvement" options possibly cause desync? If not, they are a nuisance to store because viewers who prefer a certain option will want to change it whenever the recorder didn't happen to use it.

    • This should be looked into. I've included all options to be safe. The problem is that it's easy to show that an option can cause a desync for a certain movie, but hard to show that it never will. --ideamagnate

[5]: Bisqwit: I think, for clarity, the up/down option should be moved to 00010000 or something, so that if more chipset-related options are added later, they will still be in a contiguous block.

    • I'm not sure if it will prove necessary, but there are plenty of bits to spare. --ideamagnate

      • It's not a question of bits to spare. It's a question of readability of the format description. Don't you agree that a list grouped by purpose is easier to read than a randomly shuffled list? --Bisqwit

[6]: upthorn: Why are these bitwise? There will never be a situation where Master System is combined with anything else, and if we give each system its own number, there's more space for future expansion. Even if it means that the Genesis and its add-ons get 4, that's 4 out of 256, rather than 4 out of 8.

[7]: Phil: It's not that it affects emulation, but some games, such as Battletoads, use a special resolution. Unfortunately, Gens doesn't auto-detect those games to use the correct resolution. So either we put something in the header or make Gens detect those games and use the correct resolution. What I think is that when you start recording you should have the option to use another resolution, so that when people play the movie, Gens auto-resizes to the desired resolution. The default is 320*240.
P.S. AVI dumping doesn't let Windows users use a resolution other than 320*240.

  • This seems to be a poor solution to a larger problem. But more importantly, it isn't something which can affect movie sync, so the movie file isn't a good place to change how it's handled. --upthorn

upthorn: I'm starting to implement this format, and I'm going to change some parts of the specification that don't make sense to me, or I see a better way to do. Summary of changes:

  • Got rid of stored filenames -- CRC is far more meaningful than filename.
    • The CRC is more useful unless you're a human trying to figure out which ROM the GM2 goes with. I'd keep both pieces of information. --Ideamagnate
      • Well, the game's internal ROM name is still stored, which should be enough for most humans to work with, and the filenames for various bios files are probably unnecessary anyway -- there are only 2 versions of each country's SegaCD bios, two of the 32X bios images don't even seem to be used (not that there are multiple versions, anyway), and I think there was only one or two versions of the Genesis BIOS. Unless someone records a movie using a homebrew version of one of those, it's not likely to be a problem. --upthorn
  • Replaced stored filesize in header with a stored number of chunks.
  • Added sound rate storage to the config flags.
  • Modified frame chunk format slightly -- multitap is now derived from which controllers are recorded, and there is no need to reserve a controller type for "inactive".
  • Added a spec for SMS controller data format.
  • Merged console events chunk into framedata -- slightly less efficient file-size wise, but seems much simpler to handle this way.


GM2 last edited by upthorn on 2007-03-02 04:20:14