Last Updated by Randomno on 7/6/2023 7:29 PM

This is a proposal for a new movie format for recording movies 
with the Gens Sega Genesis emulator.  There is no code that implements 
this proposal, and the implemented version may be different.  This 
proposal is provided as a set of guidelines to let the eventual 
implementer know which issues need to be addressed.

There is an [Forum/Topics/3995|ongoing discussion] about this proposal on the forum.

The source code of [http://ucon64.sourceforge.net/|ucon64] may prove helpful
for extracting information directly from ROMs.

%%TOC%%

!!! Issues to be Addressed

!! Extensibility
The GMV format cannot easily be modified or improved.  The GM2 format 
should allow additional information to be inserted in a [http://en.wikipedia.org/wiki/Forward_compatibility|forwards-compatible] manner.  
GM2 does not need to be backwards-compatible with GMV.

!! Configuration Options
The GMV format does not store certain configuration options which can 
affect synchronization.  These include the following.
* sound rate [#1]
* SRM information (important for SegaCD recording, when stable)
* perfect synchro (important for SegaCD recording, when stable)
* country
* up/down left/right (important for recording)
* ---Video resolution.--- [#7]

!! Game Information
GMV does not contain any information on which ROM was used to record a movie.  
The GM2 format should contain both the internal ROM name and a checksum of the ROM.

!! Hexedit-friendly
The GMV format is relatively hexedit-friendly.  GM2 need not be directly 
hexeditable: it may be considered hexedit-friendly as long as a portable editor
or a GM2<->text converter is available.
The ability to directly load and save an ASCII version of 
GM2 fulfills this requirement.

!! Compression
Button-press recordings compress very well, but GMV does not allow for 
internal compression.  GM2 should allow movies to be optionally compressed as zip
files for publication.  This prevents the need for separate 
compression/decompression steps when submitting movies or viewing submitted movies.

In other words, a zip archive containing a valid GM2 file is also a valid GM2 file. (Not recursively, that would be silly.)
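
The rule above can be sketched as a loader that accepts either form. This is only an illustration, not part of the proposal: the function name {{load_gm2_bytes}} is hypothetical, and it assumes the archive member is recognized by the binary magic number 'GM2\0' rather than by its file name.

```python
import io
import zipfile

GM2_MAGIC = b"GM2\x00"

def load_gm2_bytes(raw: bytes) -> bytes:
    """Return raw GM2 data, unwrapping at most one level of zip."""
    if raw[:4] == GM2_MAGIC:
        return raw
    buf = io.BytesIO(raw)
    if zipfile.is_zipfile(buf):
        with zipfile.ZipFile(buf) as archive:
            for name in archive.namelist():
                data = archive.read(name)
                # Assumption: the first member starting with the magic is the movie.
                # A zip nested inside the zip is deliberately not unwrapped.
                if data[:4] == GM2_MAGIC:
                    return data
    raise ValueError("not a GM2 file or a zip archive containing one")
```

Checking the magic number instead of the extension means a renamed archive still loads, which seems to match the intent that the zip itself counts as a valid GM2 file.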

!! Number of Players
GMV is hard-coded to allow 2 players with 6-button controllers.  Version 9D 
contained a hack to allow 3 players with 3-button controllers, but this is 
not an ideal solution.  GM2 should allow at least 3 players with 6-button controllers.  Ideally, the number and configuration of controllers would be 
variable so that an arbitrary configuration could be stored with no wasted space.

!! Integrity
A GM2 file should contain an internal checksum to ensure file integrity.  
This checksum need only alert a user of errors resulting from corruption 
in storage and transit.


!!! Current Proposal
The current proposal is a chunk-based format, superficially similar to 
[http://www.libpng.org/pub/png/pngintro.html|PNG].  This format is a 
stripped-down version of Jyzero's proposal.
All integers are unsigned and little-endian, 4 bytes wide unless a field states another size.  All strings are UTF-8 encoded and 
null-terminated unless otherwise specified.  Lengths given include terminal nulls.

!! Header
 0x000 - 4-byte magic number 'GM2\0'
 0x004 - 4-byte CRC of file (not including the first 8 bytes), or 0
 0x008 - 1-byte number of chunks in file. *
 0x009 - 3-byte 0 padding. +
 0x00C - 4-byte movie "uid" - recording time in epoch seconds, useful for
                              identifying movies and states with each other
 0x010 - start of chunk data
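
A minimal sketch of reading this header, for illustration only. The proposal does not say which CRC algorithm is meant; the code assumes the common CRC-32 as computed by zlib, and treats a stored 0 as "no checksum". The name {{parse_header}} is hypothetical.

```python
import struct
import zlib

# magic, CRC, chunk count, 3 padding bytes, uid (all little-endian)
HEADER = struct.Struct("<4sIB3xI")

def parse_header(data: bytes) -> dict:
    magic, crc, chunk_count, uid = HEADER.unpack_from(data, 0)
    if magic != b"GM2\x00":
        raise ValueError("bad magic number")
    # Assumption: CRC-32 (zlib) over everything after the first 8 bytes.
    if crc != 0 and crc != zlib.crc32(data[8:]):
        raise ValueError("CRC mismatch")
    return {"crc": crc, "chunk_count": chunk_count, "uid": uid}
```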

!! General Chunk Format
Values are offsets from the beginning of a chunk.  No chunk may appear more
than once.  Required chunks must appear exactly once.

 0x000 - 8-byte chunk ID (ASCII encoded, fixed 8-byte size, not null-terminated) [#2]
 0x008 - 4-byte chunk data length
 0x00C - chunk data
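
The chunk layout above could be walked as follows. This is a sketch under the assumption that chunks are packed back to back starting at offset 0x010; {{iter_chunks}} is a hypothetical name, and a real reader would probably also compare the result against the chunk count in the header.

```python
import struct

def iter_chunks(data: bytes, offset: int = 0x10):
    """Yield (chunk_id, payload) pairs, rejecting duplicate chunk IDs."""
    seen = set()
    while offset < len(data):
        chunk_id = data[offset:offset + 8].decode("ascii")
        (length,) = struct.unpack_from("<I", data, offset + 8)
        payload = data[offset + 12:offset + 12 + length]
        if len(payload) != length:
            raise ValueError("truncated chunk")
        if chunk_id in seen:
            raise ValueError(f"duplicate chunk {chunk_id!r}")
        seen.add(chunk_id)
        yield chunk_id, payload
        offset += 12 + length
```

Because every chunk carries its own length, a reader can skip IDs it does not recognize, which is what makes the format forwards-compatible.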

!! Required Chunks

These chunks must always be present in a GM2 file.

! Start Information
 ID: 'start   '
This chunk tells Gens whether the movie begins from a saved state, SRAM, or power on.

Note: If a movie starts from a saved state, the frame data is interpreted
as starting from that state.  Frame data from before that state will not
be stored.

 1-byte flag (0 = power on, 1 = SRAM, 2 = state)
if starting from SRAM:
 4-byte length of embedded SRAM image
 SRAM data (uncompressed)
if starting from a state:
 4-byte number of first frame
 4-byte length of embedded state
 state data (uncompressed)
Note: The state contains the SRAM data, so loading both is redundant.

This chunk must occur before the frame chunk. [#3]
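
As an illustration, the three layouts of this chunk might be decoded like this. The function name and the dictionary keys are hypothetical.

```python
import struct

def parse_start(payload: bytes) -> dict:
    flag = payload[0]
    if flag == 0:
        return {"start": "power on"}
    if flag == 1:
        (length,) = struct.unpack_from("<I", payload, 1)
        return {"start": "SRAM", "sram": payload[5:5 + length]}
    if flag == 2:
        # first frame number, then the length of the embedded state
        first_frame, length = struct.unpack_from("<II", payload, 1)
        return {"start": "state", "first_frame": first_frame,
                "state": payload[9:9 + length]}
    raise ValueError(f"unknown start flag {flag}")
```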

! Recording information
 ID: 'record  '
data:
 4-byte Gens version string length
 Gens version string
 4-byte author name length
 author name
 4-byte author comment length
 author comment
 4-byte re-record count 
 4-byte number of recorded frames (including all recorded-over frames)
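
A sketch of reading the length-prefixed strings used here, assuming (as stated under Current Proposal) that each length includes the terminating null. {{read_string}} and {{parse_record}} are hypothetical names.

```python
import struct

def read_string(buf: bytes, pos: int):
    """Read a 4-byte length followed by a null-terminated UTF-8 string."""
    (length,) = struct.unpack_from("<I", buf, pos)
    raw = buf[pos + 4:pos + 4 + length]
    return raw.rstrip(b"\x00").decode("utf-8"), pos + 4 + length

def parse_record(payload: bytes) -> dict:
    pos = 0
    fields = {}
    for key in ("gens_version", "author", "comment"):
        fields[key], pos = read_string(payload, pos)
    fields["rerecords"], fields["recorded_frames"] = struct.unpack_from(
        "<II", payload, pos)
    return fields
```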

! Game Requirements
 ID: 'gamereqs'
data:
 1-byte country identifier
   0x00 = Japan (NTSC)
   0x01 = USA   (NTSC)
   0x02 = Europe(PAL)
   0x03 = Japan (PAL)
 1-byte system identifier:[#6]
   0x01 - Genesis game
   0x02 - 32x game 
   0x04 - SegaCD game 
   0x06 - SegaCD + 32x game 
   0x08 - Master System game (for Meka, if applicable)
 4-byte config flags:
  0x00000001 - Z80 enabled [#4]
  0x00000002 - YM2612 enabled
  0x00000004 - PSG enabled
  0x00000008 - DAC enabled
  0x00000010 - PCM enabled
  0x00000020 - PWM enabled
  0x00000040 - CDDA enabled
  0x00000080 - YM2612 improvement enabled
  0x00000100 - DAC improvement enabled
  0x00000200 - PSG improvement enabled
  0x00000400 - Sample-rate multiplier, low bit (rate is 11025 Hz times two raised to the 2-bit number formed by these two bits) +
  0x00000800 - Sample-rate multiplier, high bit (see the low bit above) +
  0x0000?000 - reserved for future chipset options *
  0x00010000 - enable up/down and left/right while recording[#5]
 4-byte calculated ROM CRC
 4-byte internal ROM name length
 internal ROM name (initially 48 bytes, converted to null-terminated UTF-8)
 4-byte calculated Genesis BIOS CRC, or 0 --
if system is 32x: --
 4-byte 68K BIOS calculated CRC, or 0
 4-byte Master SH2 BIOS calculated CRC, or 0 --
 4-byte Slave SH2 BIOS calculated CRC, or 0 --
if system is SegaCD: --
 4-byte SegaCD BIOS calculated CRC, or 0
--
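
The config flags can be decoded with plain bit tests. The sketch below is illustrative; the names are hypothetical, and it reads the two sample-rate bits as one 2-bit exponent, so the possible rates are 11025, 22050, 44100 and 88200 Hz.

```python
CONFIG_FLAGS = {
    0x001: "Z80", 0x002: "YM2612", 0x004: "PSG", 0x008: "DAC",
    0x010: "PCM", 0x020: "PWM", 0x040: "CDDA",
    0x080: "YM2612 improvement", 0x100: "DAC improvement",
    0x200: "PSG improvement",
}

def decode_config(flags: int) -> dict:
    enabled = [name for bit, name in CONFIG_FLAGS.items() if flags & bit]
    exponent = (flags >> 10) & 0b11  # bits 0x400 (low) and 0x800 (high)
    return {
        "enabled": enabled,
        "sample_rate_hz": 11025 * 2 ** exponent,
        "opposing_directions": bool(flags & 0x00010000),
    }
```

For example, {{decode_config(0x807)}} reports Z80, YM2612 and PSG enabled with a 44100 Hz sample rate.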
! Frame Data
 ID: 'frames  '
data:
 4-byte frame count (number of frames in the movie, not the number of the last frame)
 1-byte controller recorded flags: *
  0x01 - Controller 1 recorded *
  0x02 - Controller 2 recorded *
  0x04 - Controller 1B recorded +
  0x08 - Controller 1C recorded +
  0x10 - Controller 1D recorded +
  0x20 - Controller 2B recorded +
  0x40 - Controller 2C recorded +
  0x80 - Controller 2D recorded +
 1-byte controller 1 type (ignored when 1 recorded flag is 0) *
 1-byte controller 2 type (ignored when 2 recorded flag is 0) *
 1-byte controller 1B type (ignored when 1B recorded flag is 0) *
 1-byte controller 1C type (ignored when 1C recorded flag is 0) *
 1-byte controller 1D type (ignored when 1D recorded flag is 0) *
 1-byte controller 2B type (ignored when 2B recorded flag is 0) *
 1-byte controller 2C type (ignored when 2C recorded flag is 0) *
 1-byte controller 2D type (ignored when 2D recorded flag is 0) *
   where
   0x00 = 3-button -*
   0x01 = 6-button *
   0x02 = Mega Mouse *
   0x03 = Sega Menacer *
   0x04 = TeeVGolf *
   0x05 = Batter Up! *
   0x06 = Sega Master System controller *
Frame segment order: +
 console buttons+
 Player 1 (if recorded)+
 Player 2 (if recorded)+
 Player 1B (if recorded)+
 Player 1C (if recorded)+
 Player 1D (if recorded)+
 Player 2B (if recorded)+
 Player 2C (if recorded)+
 Player 2D (if recorded)+
frame data:
 1 means a button is __not pressed__ 
 0 means a button __is pressed__
Console data: +
 1 byte: power cycle, reset button, SMS Pause button
3-button data:
 1 byte: a,b,c,start,up,down,left,right
6-button data:
 1 byte: a,b,c,start,up,down,left,right
 1 byte: x,y,z,mode
Sega Master System controller data: +
 1 byte: 1,2,up,down,left,right+
Format and number of bytes for other peripherals to be decided at a later date. See http://www.vidgame.net/SEGA/peripherals.htm for a list of peripherals.

Note: Only bytes for active controllers are stored.  If a movie has two 6-button
controllers (2 bytes each) and one 3-button controller (1 byte each), each frame will occupy 5 bytes.
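
The per-frame size in the note can be computed from the recorded flags and the controller types. This is a sketch: only the three controller types whose data format is specified above are handled, the names are hypothetical, and whether the console byte counts toward the total is left as a parameter because the note does not include it.

```python
# Bytes per frame for the controller types with a specified format.
BYTES_PER_TYPE = {0x00: 1, 0x01: 2, 0x06: 1}  # 3-button, 6-button, SMS

def bytes_per_frame(recorded_flags: int, types: list,
                    include_console: bool = False) -> int:
    """types is ordered 1, 2, 1B, 1C, 1D, 2B, 2C, 2D, like the flag bits."""
    total = 1 if include_console else 0
    for slot in range(8):
        if recorded_flags & (1 << slot):
            size = BYTES_PER_TYPE.get(types[slot])
            if size is None:
                raise ValueError(f"no data format defined for type {types[slot]}")
            total += size
    return total
```

With controllers 1 and 2 as 6-button pads and controller 1B as a 3-button pad, {{bytes_per_frame(0b111, [1, 1, 0, 0, 0, 0, 0, 0])}} gives the 5 bytes mentioned in the note.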

!! Optional Chunks

These chunks are optional for a GM2 file.

! Console Events 
 ID: 'consevnt' 
It makes more sense for these to go in frame data. An extra 1 byte per frame is not a big deal, especially if these are being compressed. +

! Saved States
 ID: 'states  '
This chunk contains one or more unordered saved states.  
This allows viewers to skip to another level or to watch a level 
over again with minimal effort. *%%%
Chunk format: +
 4-byte number of states+
State format:
 4-byte state name length
 state name (decided by recorder)
 4-byte state length
 state data (uncompressed)

! Subtitles
 ID: 'subtitle'
This chunk contains one or more strings and associated data to be displayed
during the movie.  These do not need to be sorted.  Each string is stored 
as follows:
 4-byte size of string (in bytes)
 string data
 2-byte x-coordinate (upper left corner of first letter)
 2-byte y-coordinate
 4-byte text color (red,green,blue,transparency, stored as 0xRRGGBBTT)
 4-byte border color (rgbt)
 4-byte first frame to display text
 4-byte last frame to display text
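
A sketch of decoding this chunk, assuming entries simply repeat until the end of the chunk data (no count field is listed) and that colors are read as ordinary little-endian integers. The names are hypothetical.

```python
import struct

# x, y, text color, border color, first frame, last frame
TAIL = struct.Struct("<HHIIII")

def parse_subtitles(payload: bytes) -> list:
    subs = []
    pos = 0
    while pos < len(payload):
        (size,) = struct.unpack_from("<I", payload, pos)
        text = payload[pos + 4:pos + 4 + size].rstrip(b"\x00").decode("utf-8")
        pos += 4 + size
        x, y, color, border, first, last = TAIL.unpack_from(payload, pos)
        pos += TAIL.size
        subs.append({"text": text, "x": x, "y": y, "color": color,
                     "border": border, "first_frame": first, "last_frame": last})
    return subs
```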

!! Text-only Format
This format exists to simplify hexediting, since manually hexediting the binary
GM2 format is non-trivial.  This format will consist of only ASCII characters
and will be very easy to edit with any text-manipulation tool.  Gens will
be able to read from and write to this format natively.
Because it is intended for editing, this format will not contain any
corruption detection measures such as a CRC checksum.
Line separators may be either unix-style ('\n') or DOS-style ('\r\n').

! Header
The header will consist of the following 7 characters in the first line:
 GM2TEXT

! Generic Text-only Chunk
Each chunk will occupy one line and will consist of the following fields, with no separation:
 8-character chunk name
 base-64 encoded chunk data, not including chunk header
 1-character ASCII exclamation mark ('!')
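
One line of this form could be produced and read back as follows. This is only a sketch with hypothetical function names; it uses standard base-64 with padding, which the proposal does not spell out.

```python
import base64

def encode_text_chunk(chunk_id: str, payload: bytes) -> str:
    if len(chunk_id) > 8:
        raise ValueError("chunk ID longer than 8 characters")
    # Pad the ID with trailing spaces to the fixed 8-character width.
    return chunk_id.ljust(8) + base64.b64encode(payload).decode("ascii") + "!"

def decode_text_chunk(line: str):
    line = line.rstrip("\r\n")
    if len(line) < 9 or not line.endswith("!"):
        raise ValueError("malformed chunk line")
    return line[:8], base64.b64decode(line[8:-1])
```

The terminating '!' is unambiguous because it is not part of the base-64 alphabet.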

! Text-only Frame Chunk
Because this chunk is intended for editing, all data will be encoded into
human-editable text.  The format will appear as follows, one line for each
field:
 8-character chunk name ('frames  ')
 recorded controllers ('controllers_recorded:1,1,1,0,0,0,0,0') *
 controller types ('controller_types:2,1,2,0,0,0,0,0')

Note: Because editing is likely to change the number of frames, the frame
count is explicitly excluded from this data.  It must be derived
from the number of frame lines that follow.
Note: The values on the 'controller_types' line represent the following
controllers: 1, 2, 1b, 1c, 1d, 2b, 2c, 2d.

After this, the chunk will contain frame data conforming to the following [http://en.wikipedia.org/wiki/Backus-Naur_form|BNF] description:

 lines             := lines line |
                      line |
                      "!" "\n"
 line              := frame-comment ":" button-lists "\n"
 frame-comment      := non-colon-char frame-comment |
                      "" 
 non-colon-char    := a-zA-Z0-9,./<>?;'"{}[]\|~!@#$%^&*()_+=-
 button-lists      := button-lists "," button-list |
                      button-list |
                      ""
 button-list       := controller-number button-chars
 controller-number := 1-8
 button-chars      := button-chars button-char |
                      ""
 button-char       := UuDdLlRrAaBbCcSsXxYyZzMm


It is recommended that Gens store the number of each frame before the colon
to make editing simpler.  However, Gens must ignore this information when reading
the file, because editing may cause it to become inconsistent.

The end of this chunk will consist of an ASCII exclamation mark at the end of a line.

Button presses are stored in comma-separated strings consisting of 1 character for the controller number and 1 or more characters for the buttons pressed.  The controller number must be 
an ASCII character 1-8. Valid button characters are u,d,l,r,a,c,b,s,x,y,z,m (case insensitive). If a frame lists the same controller number in multiple segments, only the __last__ segment for each controller will be used.
All of the following are valid frames:
 45:1abud
 painintheassframe:1ab,2d,1alb,1alaU
 foobar:1u
 :
 ?!?!;:1a,2cu,3l,6r
 xyz:
 foo:1bar,2cab
 this_is_the_last_frame:1ab!
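
The rules above can be sketched as a small line parser. It is an illustration with a hypothetical function name; it follows the last example in treating a trailing '!' after the button lists as the end-of-chunk marker.

```python
VALID_BUTTONS = set("udlracbsxyzm")

def parse_frame_line(line: str):
    """Return ({controller: set of buttons}, is_last_frame)."""
    line = line.rstrip("\r\n")
    _comment, sep, lists = line.partition(":")  # the comment is ignored
    if not sep:
        raise ValueError("frame line has no ':'")
    is_last = lists.endswith("!")
    if is_last:
        lists = lists[:-1]
    pressed = {}
    for segment in filter(None, lists.split(",")):
        number, buttons = segment[0], segment[1:].lower()
        if number not in "12345678":
            raise ValueError(f"bad controller number {number!r}")
        if not set(buttons) <= VALID_BUTTONS:
            raise ValueError(f"bad button in {segment!r}")
        # A later segment for the same controller replaces the earlier one.
        pressed[int(number)] = set(buttons)
    return pressed, is_last
```

For the line {{x:1ab,2d,1alb,1alaU}}, controller 1 ends up with a, l and u pressed, because only its last segment counts.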

----
[1]: Sound rate should not affect synchronization. --[user:Bisqwit]

** In a utopian world maybe, but "should not" and "does not" aren't the same thing. Both Gens and Snes9x have problems where sound output affects emulation but shouldn't. 44100 should always be the preselected rate in Gens, but that is an emulator implementation matter, not a format one. -- [user:Truncated]

** In my opinion, sound output should NEVER affect emulation. Instead of adding that header, fix the "sound output that affects emulation" problem. Hacks like that won't resolve the problem. -- [user:Phil]

[2]: I think the chunk ID should be restricted to ASCII bytes. This avoids the byte/character length confusion that can arise in fixed-width fields.
You should also specify the padding used (spaces?) and the direction of padding. --[user:Bisqwit]

** Changed chunk ID to ASCII (I think it was that previously). Padding is specified as seen in every chunk (spaces at the end). -- [user:Truncated]

[3]: Why must it occur before the input stream? Even though Gens will probably always save the chunks in the same order, why must this be a requirement? I thought the idea of chunks was that new ones could be added, left out, or ignored, regardless of order. --[user:Truncated]

** Maybe not necessarily. It would just perhaps make the playback / other analysis based on the movie easier. At least it wouldn't need backtracking in the file. --[user:Bisqwit]

** It's a very small change in the Gens code that would make external processing significantly simpler.  I think it's a good trade-off.  --[user:ideamagnate]

** Because the input stream is constantly changing size, it should always be the last chunk. Otherwise, all data after it will also have to be rewritten constantly, which increases the chance of corruption, as well as general wear on the disk. --[user:upthorn]

[4]: [user:nitsuja]: Can these "x improvement" options possibly cause desync?
If not, they are a nuisance to store because viewers who prefer a certain option
will want to change it whenever the recorder didn't happen to use it.

** This should be looked into.  I've included all options to be safe.  The problem is  that it's easy to show that an option can cause a desync for a certain movie, but hard to show that it never will. --[user:ideamagnate]

[5]: [user:Bisqwit]: I think, for clarity, the up/down option should be moved to 00010000 or something, so that if more chipset-related options are added later, they will still be in a contiguous block.

** I'm not sure if it will prove necessary, but there are plenty of bits to spare. --[user:ideamagnate]

*** It's not a question of bits to spare. It's a question of readability of the format description. Don't you agree that a list grouped by purpose is easier to read than a randomly shuffled list? --[user:Bisqwit]

[6]: [user:upthorn]: Why are these bitwise? There will never be a situation where Master System is combined with anything else, and if we give each system its own number, there's more space for future expansion: even if the Genesis and its add-ons get 4 values, that's 4 out of 256, rather than 4 out of 8.

[7]: [user:Phil]: It's not that it affects emulation, but some games, such as Battletoads, use a special resolution. Unfortunately, Gens doesn't auto-detect those games to use the correct resolution. So either we put something in the header, or we make Gens detect those games and use the correct resolution.
What I have in mind is that when you start recording you have the option to use another resolution, so that when people play the movie, Gens automatically resizes to the desired resolution. The default is 320*240. %%% 
__P.S.__ AVI dumping doesn't let Windows users use any resolution other than 320*240.
* This seems to be a poor solution to a larger problem. But more importantly, it isn't something which can affect movie sync, so the movie file isn't a good place to change how it's handled. --[user:upthorn]

[user:upthorn]: I'm starting to implement this format, and I'm going to change some parts of the specification that don't make sense to me or that I see a better way to do.
Summary of changes: 
* Got rid of stored filenames -- CRC is far more meaningful than filename.
** The CRC is more useful unless you're a human trying to figure out which ROM the GM2 goes with.  I'd keep both pieces of information.  --[user:ideamagnate] 
*** Well, the game's internal ROM name is still stored, which should be enough for most humans to work with, and the filenames for various bios files are probably unnecessary anyway -- there are only 2 versions of each country's SegaCD bios, two of the 32X bios images don't even seem to be used (not that there are multiple versions, anyway), and I think there was only one or two versions of the Genesis BIOS. Unless someone records a movie using a homebrew version of one of those, it's not likely to be a problem. --[user:upthorn]
* Replaced stored filesize in header with a stored number of chunks. 
* Added sound rate storage to the config flags.
* Modified frame chunk format slightly -- multitap is now derived from which controllers are recorded, and there is no need to reserve a controller type for "inactive".
* Added a spec for SMS controller data format.
* Merged console events chunk into framedata -- slightly less efficient file-size wise, but seems much simpler to handle this way.