As the developer of a tool to edit CUF fonts, I was wondering if a CA developer might be able to answer a few questions. CUF files are the fonts used by everything from M2TW to even S2TW, so a good understanding of how they work will help fix all sorts of text rendering issues that crop up once you start modding them.
On the CUF file format
Quite a few of the CUF “properties” in the first table are poorly understood by me and by the one other person I've encountered on the Internet who I know has been looking into this format. He/she goes by the handle of just on the TWC. That discussion is over at the TWC, starting with this post to be precise. Below is a summary of what just and I have found out so far (it's a few forum posts collated into one, so “I” or “me” might refer to either myself or to just). Could you please clarify what these properties refer to, or correct our misunderstanding of the format where appropriate?
Summary of what we do (not) know so far:
[spoil]
• 4 bytes magic 'CUF0'
• short 0: unknown
• short 1: unknown, seems to always match the height of a capital, but is not used for rendering
• short 2: line height, seems to be two pixels lower than the actual line height.
I think this is used for calculating/allocating a proper height for the entire rendered image (block of text) so that items which extend (far) below the baseline are not cut off.
• short 3: unknown
• short 4: unknown
• short 5: seems to have something to do with the baseline, but there is a maximum and a minimum. In one font I tested, changing the default 7 to 0xF moved the text down by two pixels, while changing 7 to 1 moved it up by five pixels.
I think it is the baseline, or at least an equivalent value. Whether signed or unsigned I do not know.
• short 6: also seems related to baseline, changing this matches the changed baseline exactly.
I call it LayoutYOffset. It is a signed short (rather than the unsigned ones that most of the others are). The absolute value is an offset as described earlier, where 0<= offset <= 0x7ff. If the bit corresponding to 0x8000 is set then it is a negative offset, otherwise positive.
• short 7: how wide a space is for justification and text wrapping calculations. The width used for rendering a space is the one specified in the glyph properties and/or kerning.
• short 8: how much narrower a line (or maybe the dynamically sized container holding the line) is than the sum of the glyph sizes. If you use a value that is larger than the width of the first word and the following space, there is an extra line wrap after the first word and space. In this specific case the space is rendered for justification reasons; normally the space that turns into a line wrap is not rendered/calculated.
I thought it was the horizontal equivalent of #5. At any rate it appears to be a signed short, too; but the above interpretation explains some of the behaviour I saw better.
• short 9: the largest width of any glyph in the font. Anything to the right of this amount of pixels is not rendered.
• short 10: the largest height of any glyph in the font. Anything below this amount of pixels (calculated from the top of that glyph) is not rendered.
• short 11: the number of glyphs
• int 12: total size of the glyph bitmaps
• Array of 65536 shorts/[w]chars. Position in the array indicates UTF-16LE/UCS2 code point, 0xFFFF indicates “not supported”. I call it “chartable” since that's basically what it is. Values correspond to offsets in the following tables.
• for each glyph
o byte 0: starting height of the glyph in pixels above the baseline, 0x80 indicates the glyph doesn't have a height.
o byte 1: Width that should be allocated to the glyph: this is used to advance to the position at which the next glyph should be drawn.
o byte 2: width of the bitmap data of the glyph
o byte 3: height of the bitmap data of the glyph
• for each glyph
o int: offset that indicates where in the bitmap data the data for that glyph is, relative to the start of the bitmap data.
• bitmap data
[/spoil]
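For anyone who wants to poke at the layout summarised above, here is a minimal Python sketch of a reader. It is not taken from any CA code. It assumes every multi-byte field is little-endian (the summary only states this for the chartable) and that the sections follow each other without padding. The names `HEADER`, `parse_cuf` and `decode_sign_magnitude` are my own, and the 0x7FFF mask simply clears the sign bit (the summary gives 0x7ff as the upper bound of the offset).

```python
import struct

# Magic, shorts 0-11, then int 12 (total bitmap size); "<" = little-endian, an assumption.
HEADER = struct.Struct("<4s12HI")
CHARTABLE_ENTRIES = 65536

def decode_sign_magnitude(raw):
    """Decode short 6 as described: bit 0x8000 is the sign, the rest the offset."""
    magnitude = raw & 0x7FFF  # assumption: everything below the sign bit is magnitude
    return -magnitude if raw & 0x8000 else magnitude

def parse_cuf(data):
    """Split a CUF blob into the tables listed in the summary."""
    magic, *fields = HEADER.unpack_from(data, 0)
    if magic != b"CUF0":
        raise ValueError("missing CUF0 magic")
    props, bitmap_size = fields[:12], fields[12]
    glyph_count = props[11]
    pos = HEADER.size

    # One short per UCS-2 code point; 0xFFFF means "not supported".
    chartable = struct.unpack_from("<%dH" % CHARTABLE_ENTRIES, data, pos)
    pos += 2 * CHARTABLE_ENTRIES

    # Four raw bytes per glyph: byte 0, advance width, bitmap width, bitmap height.
    glyphs = [struct.unpack_from("<4B", data, pos + 4 * i) for i in range(glyph_count)]
    pos += 4 * glyph_count

    # One int per glyph: offset into the bitmap data.
    offsets = struct.unpack_from("<%dI" % glyph_count, data, pos)
    pos += 4 * glyph_count

    return {
        "props": props,
        "layout_y_offset": decode_sign_magnitude(props[6]),
        "glyph_count": glyph_count,
        "chartable": chartable,
        "glyphs": glyphs,
        "offsets": offsets,
        "bitmaps": data[pos:pos + bitmap_size],
        "end": pos + bitmap_size,  # where an ETW kerning block would start, if present
    }
```

Byte 0 of each glyph is left as a raw unsigned value on purpose, since it is not clear to us whether it is signed; 0x80 shows up as 128.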
ETW expanded on the CUF file format seen in M2TW by introducing kerning tables at the end of the file format.
Empire extension (kerning), as discovered by just on the TWC:[ex]
• short: number of glyphs for which kerning data is available
• short: the number of glyphs to skip. The first nn glyphs don't have kerning data; some of the last might not either, if the number of glyphs with kerning plus the skipped glyphs is less than the total number of glyphs
• for each glyph with kerning
o for each glyph with kerning
byte: width of the glyph in the outer loop when followed by the glyph in the inner loop; the width is calculated from the left of the glyph.
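To make the nested loop above concrete, this is roughly how I would read the kerning block in Python. It is only a sketch under assumptions: the two shorts are taken to be little-endian, the bytes are treated as unsigned, and the table is assumed to be stored row by row with the outer-loop (left) glyph as the row. `parse_kerning` and `kerned_width` are hypothetical helper names, not anything from the game.

```python
import struct

def parse_kerning(data, pos):
    """Read the kerning block that starts at byte offset `pos`."""
    count, skip = struct.unpack_from("<2H", data, pos)  # assumed little-endian
    pos += 4
    table = data[pos:pos + count * count]  # count x count bytes, one per glyph pair
    if len(table) != count * count:
        raise ValueError("kerning table is truncated")
    return count, skip, table

def kerned_width(kerning, left, right):
    """Width of glyph `left` when followed by glyph `right`, or None if no data."""
    count, skip, table = kerning
    i, j = left - skip, right - skip  # the first `skip` glyphs have no kerning
    if 0 <= i < count and 0 <= j < count:
        return table[i * count + j]  # assumed row-major: outer loop is the left glyph
    return None
```

When `kerned_width` returns None, a caller would presumably fall back to the advance width from the glyph properties (byte 1).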
Tooltip text width
A related question (this part of the engines is simply not well researched) is about the height-for-width swap algorithm used by the engine to render tooltips. That is, when a tooltip appears to be too wide it is made narrower by increasing the number of lines of text in the tooltip and wrapping the long lines. There does not appear to be a clear maximum width limit, so descriptions cannot be made to force a single width for rendering purposes (e.g. to support indentation for a quote in a trait description).
The question then is: do you know whether there are certain specific aspect ratios being enforced, or more generally how the height-for-width swap algorithm works? Is it similar to what e.g. GTK 3 is doing?
Being able to know what width a piece of text will require is very useful for more advanced modding. It allows us to fix text layout in advance, which in turn enables supporting proper indentation based around tabs (not in terms of number of spaces but in terms of pixels, creating an effect similar to that of word processors), as well as preventing or circumventing issues with how the ETW/NTW/Medieval II/Kingdoms text layout algorithms do not wrap lines properly for characters from locales that they do not support. (E.g. line wrapping appears to be broken for CJK characters in M2TW Kingdoms at least, which means that if we need to use CJK slots we have to do the line wrapping ourselves or pay the penalty when the engine simply does not render characters which are outside the bounding box as a result.)
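To illustrate the kind of pre-wrapping I mean, here is a small Python sketch. It is not the engine's algorithm (that is exactly what I am asking about); it is a plain greedy wrap that breaks between any two characters, which is the sort of thing we would do by hand for CJK text. `advances` is a hypothetical mapping from character to its advance width in pixels, e.g. built from byte 1 of the glyph properties, and `max_width` is a pixel limit we would have to guess.

```python
def text_width(text, advances):
    """Total pixel width of `text`, ignoring kerning."""
    return sum(advances[ch] for ch in text)

def wrap_by_pixels(text, advances, max_width):
    """Greedily split `text` into lines that fit within `max_width` pixels."""
    lines, current, width = [], "", 0
    for ch in text:
        w = advances[ch]
        # Start a new line when the next character would overflow the limit.
        if current and width + w > max_width:
            lines.append(current)
            current, width = "", 0
        current += ch
        width += w
    if current:
        lines.append(current)
    return lines

# Example with made-up widths: each line holds at most 12 pixels of text.
example = wrap_by_pixels("abab", {"a": 5, "b": 7}, 12)
```

A character missing from `advances` raises a KeyError here, which mirrors the 0xFFFF “not supported” case in the chartable. A single character wider than `max_width` still gets a line of its own rather than being dropped.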