Title: A tutorial on decoding tables for PFM
Author: Genreless



What follows is a tutorial on decoding table schemas using PFM. Because of its length, the tutorial is broken down into several sub-sections.

An introduction to the PFM Decode Tool’s interface

  • On the left of the window is the hex display. As with all hex editors (though I don’t know if you can edit the file through it) each 2-character group is one byte in hexadecimal notation. The red section is the header. I don’t remember ever seeing it the wrong length so you can probably ignore this portion. The blue section is the bytes in the current row of the current schema. The green section is the bytes of the column currently selected in the column value display. The black section is the bytes after the currently decoded portion of the current row.
  • At the top of the area to the right of the hex display is a display of the names and data types of the columns of the currently edited schema.
  • To the right of the name display is a display of the values of each column in the current row of the loaded schema. It should be noted this display and the name display scroll and highlight independently of each other.
  • Below these displays are four buttons: a double left arrow, a left arrow, a right arrow, and a button labeled “Problem.”
    1. The double left arrow changes the displayed row to the first parsed row of the table.
    2. The left arrow changes the displayed row to the previous row.
    3. The right arrow changes the displayed row to the next row if the current row is valid.
    4. The problem button changes the displayed row to the first row that can’t be decoded with the current schema.

  • Below the arrow buttons are six rows labeled: string, int, bool, float, optstring, and byte. Each row contains the following: the data type label, a display of the next value as it would read in that data type, and a button labeled “use.” The use button adds a column of that data type to the end of the current schema, or before the currently selected column if a column’s name is selected.
  • Below these rows is a row of buttons labeled: “Delete,” “Name,” “Show,” and “Set.”
    1. The “Delete” button removes the last column from the schema or the selected column if a column’s name is selected.
    2. The “Name” button brings up a prompt that lets you type a new name for a selected column.
    3. The “Show” button brings up a prompt with the text that gets saved to the master schema and then parsed when PFM loads. This text can be copied to your clipboard. Also, the schema displayed here will always have the same version number as the table being decoded.
    4. The “Set” button adds the current schema to the list of schemas PFM currently has available to parse tables with and saves it to your “schema_current.xml” file. This data will not be loaded the next time PFM runs and will be overwritten if you save something else to the file in a different PFM session.

  • Below this row is a row with the name of the table, the table version the pack file claims it is using, and the version of the schema that is currently being used to parse the table.
  • Below the table name is a row containing the number of currently parse-able table rows out of the number of rows the table claims to have and the number of parsed bytes out of the number of bytes of data in the table. When these are equal the row's text turns green.
  • The bottom row of the window is a text field labeled header length and a button named “Set.” The set button sets the header length of the table that is being used by the parser to the number of bytes displayed in the text field.
  • In the top bar of the window are several dropdown menus. The following options in these menus are of note:
    1. Definitions->available: This sub-menu allows you to select the schema version being edited. If PFM can’t decode a table it frequently defaults to the lowest version schema available in the master schema. If the decode tool has selected a low version, it is frequently worth checking here to see if there is a schema version that is closer to the correct decoding than the version selected by default.
    2. More->Add all again: This duplicates all the columns in the current schema. It does not matter what columns you have selected.
    3. More->Toggle encoding: This toggles whether the “string” and “optstring” buttons add “string”s and “optstring”s to the table or “string_ascii”s and “optstring_ascii”s by default.
    4. More->Parse from here/Parse from Start: Sets the start of the current list of rows to the current row/first row. Note that this changes the row number of the base row but not the row number the decode tool is looking at, so you will see a different row after hitting one of these buttons. After "Parse from here," the row count will also no longer turn green even when every row in the table decodes. These buttons are useful if two potentially equal-length data types have been confused with each other but the error doesn’t become apparent until a row in the middle of the table.
    5. More->Transform: The options in this submenu change the data type of a selected column from one data type to another.
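The row and byte counters and the “Problem” button described above can be pictured as one loop: decode rows back to back until one fails or the claimed row count is reached. The following is only a conceptual sketch in Python, not PFM's actual code. The names walk_rows and int_bool_row, the three-byte header, and the little-endian byte order are all assumptions made for illustration.

```python
import struct


def walk_rows(data, header_length, claimed_rows, decode_row):
    """Decode rows back to back and report how far the schema gets.

    decode_row(data, pos) is a hypothetical callback: it returns the offset
    just past the row that starts at pos, or raises if the bytes do not fit.
    """
    pos = header_length
    parsed = 0
    while parsed < claimed_rows:
        try:
            pos = decode_row(data, pos)
        except (struct.error, ValueError, IndexError):
            break  # this is the row the "Problem" button would jump to
        parsed += 1
    return {
        "rows_parsed": parsed,
        "rows_claimed": claimed_rows,
        "bytes_parsed": pos - header_length,
        "bytes_total": len(data) - header_length,
        # zero-based index of the first row that fails, if any
        "problem_row": None if parsed == claimed_rows else parsed,
        # both counters match: the state in which the text turns green
        "complete": parsed == claimed_rows and pos == len(data),
    }


def int_bool_row(data, pos):
    """Example schema: one int followed by one bool (5 bytes per row)."""
    struct.unpack_from("<i", data, pos)  # raises struct.error if truncated
    flag = data[pos + 4]                 # raises IndexError if truncated
    if flag not in (0, 1):
        raise ValueError("byte is not a valid boolean")
    return pos + 5


# A made-up table: 3 header bytes, then three rows; the third has a bad bool.
table = b"\x00\x00\x00" + bytes.fromhex("0100000001 0200000000 0300000007")
report = walk_rows(table, header_length=3, claimed_rows=3, decode_row=int_bool_row)
print(report)
```

When a fixed-length type has been chosen where a variable-length one belongs, the failure usually shows up several rows later than the actual mistake, which is why the tool offers both the “Problem” button and “Parse from here.”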

PFM Data Types
  • Int: A four-byte little-endian integer (the least significant byte comes first in the hex display, so 1 appears as 01 00 00 00). If you see this as a number with more than five digits it is probably incorrect. If you see an autonumber in DAVE and it appears to be an integer followed by another zero integer in the pack file, standard practice is currently to label the second integer as “-” until an eight-byte integer is implemented in the PFM parser (DAVE outputs autonumbers as several things, including a string, an int, or a long, which is an eight-byte integer).
  • Float: A four-byte floating point number. Since this has a different encoding than integers, it is typically easy to differentiate between the two (they only look the same when they are both zero). If this contains an exponent it is probably incorrect.
  • Bool: A one-byte boolean.
  • String_ascii: A two-byte integer that contains the length of the string followed by an ASCII-encoded string of the indicated length. ASCII characters have a length of one byte each. This is the default string in Rome 2 and later games.
  • String: A two-byte integer that contains the length of the string followed by a Unicode-encoded (UTF-16) string of the indicated length. These characters have a length of two bytes each. This is the default string in games up to Shogun 2.
  • Optstring_ascii: A boolean indicating whether a string is present, followed by a string_ascii if the boolean is true.
  • Optstring: A boolean indicating whether a string is present, followed by a string if the boolean is true.
  • List: You probably won’t see these. A list is a group of columns that repeats a number of times given by a preceding integer. Lists are created by selecting an integer and the columns that repeat and then selecting More->Transform->Fold list. They cause several inconsistencies in the decode tool, particularly in the prediction of the value of the next column.
  • Byte: You won’t see these. They appear nowhere in the schema. The decode tool does not add them properly. Trying to use these frequently causes the decode tool to start throwing errors and lose your work. Formatted in the schema as “byte#” where # is the number of bytes in the byte block.
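The byte layouts above can be sketched in Python with the standard struct module. This is a minimal illustration, not PFM's implementation. The function names are hypothetical, the byte order is an assumption held in a single constant (little-endian here), and the sketch assumes the string length prefix counts characters rather than bytes.

```python
import struct

BYTE_ORDER = "<"  # assumption: little-endian; change to ">" if your data differs


def read_int(data, pos):
    """Four-byte signed integer. Returns (value, next_pos)."""
    (value,) = struct.unpack_from(BYTE_ORDER + "i", data, pos)
    return value, pos + 4


def read_float(data, pos):
    """Four-byte floating point number. Returns (value, next_pos)."""
    (value,) = struct.unpack_from(BYTE_ORDER + "f", data, pos)
    return value, pos + 4


def read_bool(data, pos):
    """One-byte boolean; anything other than 0 or 1 is treated as an error."""
    byte = data[pos]
    if byte not in (0, 1):
        raise ValueError(f"byte {byte:#04x} at offset {pos} is not a boolean")
    return byte == 1, pos + 1


def read_string(data, pos, ascii_encoded):
    """Two-byte length, then that many characters (1 or 2 bytes each)."""
    (length,) = struct.unpack_from(BYTE_ORDER + "H", data, pos)
    pos += 2
    width = 1 if ascii_encoded else 2
    end = pos + length * width  # assumption: length counts characters
    if end > len(data):
        raise ValueError("string runs past the end of the data")
    if ascii_encoded:
        codec = "ascii"
    else:
        codec = "utf-16-le" if BYTE_ORDER == "<" else "utf-16-be"
    return data[pos:end].decode(codec), end


def read_optstring(data, pos, ascii_encoded):
    """Boolean flag, followed by a string only when the flag is true."""
    present, pos = read_bool(data, pos)
    if not present:
        return None, pos
    return read_string(data, pos, ascii_encoded)


# A made-up row: int, bool, string_ascii, then an absent optstring_ascii.
row = bytes.fromhex("2a000000 01 0300 616263 00")
number, pos = read_int(row, 0)
flag, pos = read_bool(row, pos)
text, pos = read_string(row, pos, ascii_encoded=True)
optional, pos = read_optstring(row, pos, ascii_encoded=True)
print(number, flag, text, optional, pos)
```

Reading the same four bytes with both read_int and read_float is a quick way to see why the two types rarely look plausible at the same time.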

Reading the table list in PFM:
You may have noticed that the list of tables in PFM is color coded. The colors mean the following:
  • Black tables have a schema and are not empty. The schema may or may not work. This color has the lowest priority for the folder color.
  • Red tables don’t have a schema and are not empty. Alternatively, tables turn red when they are marked as modified even if they have the original value.
  • Green tables have been renamed but not saved since the rename. This color has a higher priority than modification, but does not apply to the folder.
  • Blue tables have no data rows. This is currently the highest priority color for folders regardless of whether the folder also contains tables with other colors.
  • Table names with a yellow background highlight have a lower version number than the maxVersions file for the selected game. The schema version may still match the table’s version. The highlight won’t appear if the table version is greater than the game’s indicated maxVersion for the table.

Decoding a table in PFM:
  1. Open a pack file in PFM that contains the table you want to decode.
  2. Right-click on the table you want to decode, select "Open -> Open DecodeTool..."
  3. Edit the schema using this window until the schema is correct or you think the values of the columns you are adding belong to the next row.
  4. If you think you see the next row, delete the extra columns and use the problem button to proceed to the first row that doesn't successfully parse so you can check if an incorrect data type was chosen for a column. This happens most frequently when a fixed length element has been inserted in place of a variable length element or vice-versa.
  5. When the schema is correct add the table to the list of schemas PFM can currently use with the "Set" button so it is easy to verify that the table has been parsed correctly.
  6. Reload the table using your new schema by selecting a different table and then selecting the table with the new schema again.
  7. Now that you can read your table, check the table in DAVE, then name your columns either by reopening the decode tool or by pasting the schema text into a text editor.
  8. If you haven't already, hit the "Show" button to bring up the schema text. When you are done this text can be copied and pasted into your master schema (above the "</schema>" entry or between a "</table>" and a "<table>" entry) to retain your results and pasted into PFM's schema update thread to provide it for the rest of the community. For now though, this text will need more editing to be fully complete since some options are missing from the decode tool.
  9. The first thing to be done to the new schema is to label which columns are keys. To find out which columns are keys, open the table's schema in DAVE. The "Show Schema" button is located in the upper right corner of each table in DAVE unless you are looking at Shogun 2's DAVE. If you can't open the schema in DAVE for any reason, it is also stored as one XML file per table in the assembly kit's "raw_data/db" folder, in the files whose names start with "TWaD_". Any column with "Key" marked as true in DAVE's schema should also have "pk='true' " inserted into its "<field" entry before the "/>".
  10. It is also beneficial to mark which columns pull their values from another table. This information is also in DAVE's schema. The columns of interest are "Table relationship" and "Field relationship," though "Lookup fields" can matter when DAVE is behaving oddly. "Table relationship" contains the name of the referenced table, minus the "_tables" ending; "Field relationship" contains the referenced column's name; and "Lookup fields" contains the other columns that are referenced along with the "Field relationship" column. To record the reference, insert "fkey='TableName.ColumnName' " between "<field" and "name" in the schema text for the column. The "Lookup fields" value is not included, since PFM can't currently do anything with it; it matters only as a sign of potentially odd behavior. If "Lookup fields" has multiple values separated by colons, the value displayed in DAVE's table isn't the one saved to the pack file; it is the value or values from another table that TW games look up using the value in the pack file. To see the pack file's value, change the display dropdowns above the table data in DAVE from "Lookup Values" to "Underlying." Also, if "Lookup fields" has a value in DAVE's schema but "Table relationship" and "Field relationship" don't, DAVE has hit an odd error, whose cause I don't know, and is not displaying those two values even though they exist. When this happens the values typically follow a pattern based on the table and column names, but it is easiest to look them up in the "TWaD_*" file from the last step.
  11. With the fkey values and pk values added to the schema and all the columns named it is now complete and ready to be added to the master schema and uploaded to the PFM schema update thread. It may still be wise to save your additions to the master schema in another file though just in case a schema update comes along before you have a chance to post your changes and overwrites your master schema file.
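Steps 9 and 10 can also be done with a short script instead of by hand. The sketch below uses Python's xml.etree.ElementTree on a made-up schema fragment. The table_name, table_version and type attributes, and the table and column names, are placeholders for illustration only; compare them with what the "Show" button gives you and with existing entries in your master schema. The helper mark_field is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema text; paste the output of the "Show" button here instead.
FRAGMENT = """
<table table_name='example_tables' table_version='1'>
  <field name='key' type='string_ascii' />
  <field name='unit' type='string_ascii' />
  <field name='value' type='int' />
</table>
"""


def mark_field(table, name, pk=False, fkey=None):
    """Add pk and/or fkey to the field called name, keeping any existing marks.

    The attributes are re-inserted so fkey comes before name and pk comes
    last, matching the positions described in steps 9 and 10.
    """
    for field in table.iter("field"):
        if field.get("name") != name:
            continue
        fkey = fkey if fkey is not None else field.get("fkey")
        pk = pk or field.get("pk") == "true"
        rest = {k: v for k, v in field.attrib.items() if k not in ("fkey", "pk")}
        field.attrib.clear()
        if fkey is not None:
            field.set("fkey", fkey)
        field.attrib.update(rest)
        if pk:
            field.set("pk", "true")
        return field
    raise KeyError(f"no field named {name!r}")


table = ET.fromstring(FRAGMENT)
mark_field(table, "key", pk=True)               # "Key" is true in DAVE's schema
mark_field(table, "unit", fkey="units.key")     # placeholder TableName.ColumnName
# ElementTree writes double-quoted attributes; the values are unchanged.
print(ET.tostring(table, encoding="unicode"))
```

Attribute order is preserved on output only in Python 3.8 and later.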



In case the default tables can't be reliably parsed and an alternative pack file is necessary, here is a guide on editing and outputting pack files from the assembly kit.
First, a disclaimer for those using Shogun 2's DAVE: Shogun 2's DAVE does not work properly if it is installed into a folder whose path includes spaces. The other games don't have this issue.
Creating a sample table in DAVE:
  1. Open Tweak
  2. Select DAVE
  3. Open the Table Launcher
  4. Select a table
  5. If the table has similar columns or a column of only blank strings, you will have to enter unique values for the data in a row to make it easier to identify which column is which. Avoid using zeroes, ones, pure false values, and blank strings for this as they can all look similar in PFM's decode tool.
  6. Apply your changes. (Attila and later give you the choice whether you want to close the table when you apply. Rome 2 and Shogun 2 automatically close the table.) (If you are using Rome 2's DAVE and it asks you to set states for edited text when you try applying your changes, select the cells with an icon in them and hit the button with a square with an x in it to the left of the "Show Schema" button in the table window to mark them as placeholder values. This will make DAVE stop complaining and apply your changes.)
  7. Export the table. There are three ways to do this; I am familiar with two of them, which are explained in the next step.
  8. If the only tables you have edited are the ones you are checking, you can select "Export Changes to Binary" to export all of your edited tables. Otherwise, you can use "Export Single Table(s)" to open the "Table Export" window, select the tables to be exported, and hit "Export." Either way, this creates folders in the assembly kit's "working_data" folder that contain the data from the tables.
  9. Now that your tables have been exported, open BOB.
  10. Select the data checkbox in the Retail Data column and hit start. This takes the data from the assembly kit's "working_data" folder and places it in a pack file called "mod.pack" in the assembly kit's "retail/data" folder.

To reset DAVE's contents:
  1. Right-click on the tool in Steam
  2. Open the Properties Window
  3. Go to the Local Files tab
  4. Click Verify Integrity of Tool Files

Alternatively:
  1. Before editing files in DAVE, create a copy of the raw_data folder.
  2. When you want to reset, paste your copy over your raw_data folder.
  3. Delete the contents of the db folder in the working_data folder.
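The copy-and-restore approach can be scripted. This is a minimal sketch using Python's shutil and pathlib (Python 3.8 or later). The function names are hypothetical, and kit_root stands for wherever your assembly kit is installed. Like pasting over a folder by hand, the restore overwrites changed files but does not remove files that were added after the backup was made.

```python
import shutil
from pathlib import Path


def backup_raw_data(kit_root, backup_dir):
    """Step 1: copy raw_data somewhere safe (backup_dir must not exist yet)."""
    shutil.copytree(Path(kit_root) / "raw_data", backup_dir)


def reset_dave(kit_root, backup_dir):
    """Steps 2 and 3: paste the copy back, then empty working_data/db."""
    kit_root = Path(kit_root)
    # Copy over the existing folder, overwriting files that were edited.
    shutil.copytree(backup_dir, kit_root / "raw_data", dirs_exist_ok=True)
    db_dir = kit_root / "working_data" / "db"
    if db_dir.is_dir():
        for entry in db_dir.iterdir():
            if entry.is_dir():
                shutil.rmtree(entry)  # exported table folders
            else:
                entry.unlink()        # any loose files
```

Call backup_raw_data once before your first edit, then reset_dave with the same two paths whenever you want a clean start.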