@tornnight
I don't know if it would help. It's sort of a
"to a man who has a hammer, every problem is a nail"
situation. It's written in MATLAB, so unless you've got
access to MATLAB it won't run. It's quite short, so somebody
could redo it in Python or Perl.
Code:
function read_modeldb
% Open the input file, read its contents as a single line (fgetl stops at the first newline), then close it.
fidin = fopen( 'battle_models.modeldb' ) ;
remainder = fgetl( fidin ) ;
fclose( fidin ) ;
% Indices to the regular expression matches.
% Primary indices are those that match: # (space) character string with / . and numbers followed by (space).
indices1 = regexp( remainder, '[0-9]+\s[a-zA-Z][a-zA-Z_/.0-9]*\s' ) ;
fprintf( 'Number of primary matches: # (space) general string, is %d\n', length(indices1) ) ;
% Secondary indices are those that match: 0 (space)(space) #. Add 3 to these indices.
indices2 = regexp( remainder, '0\s\s[0-9]*\s' ) ;
indices2 = indices2 + 3 ;
fprintf( 'Number of secondary matches: 0 (space)(space) #, is %d\n', length(indices2) ) ;
% Secondary indices are those that match: 16 -0.090000004 0 0 -0.34999999 0.80000001 0.60000002
indices3 = regexp( remainder, '\s[0-9]+\s-0\.090000004' ) ;
indices3 = indices3 + 1 ;
fprintf( 'Number of secondary matches: # (space) -0.090000004, is %d\n', length(indices3) ) ;
% Secondary indices are those that match: # # lower case string.
indices4 = regexp( remainder, '\s[1-9]+\s[1-9]+\s[a-z]*\s' ) ;
indices4 = indices4 + 1 ;
fprintf( 'Number of secondary matches: # (space) # (space) lower case string, is %d\n', length(indices4) ) ;
% Secondary indices are those that match: single digit# single digit# general number# general string
indices5 = regexp( remainder, '\s[1-9]\s[1-9]\s[1-9]+\s[a-zA-Z][a-zA-Z_/.0-9]*\s' ) ;
indices5 = indices5 + 1 ;
fprintf( 'Number of secondary matches: single digit# (space) single digit# (space) general number# (space) general string, is %d\n', length(indices5) ) ;
% Merge the indices.
indices = [ indices1 indices2 indices3 indices4 indices5 ] ;
fprintf( 'Total number of indices found: %d\n', length(indices) ) ;
% Sort and remove repeated indices that matched in more than one regular expression
% (unique also returns its result sorted).
indices = unique( indices ) ;
fprintf( 'Total number of indices after merging: %d\n', length(indices) ) ;
% Open output file.
fidout = fopen( 'reformattedCRLF_battle_models.modeldb', 'w' ) ;
% Print out line-by-line, spaces will be important.
fprintf( fidout, '%s\r\n', remainder(1:indices(1)-1) ) ;
for ii = 2 : length(indices)
fprintf( fidout, '%s\r\n', remainder(indices(ii-1):indices(ii)-1) ) ;
end
fprintf( fidout, '%s\r\n', remainder(indices(end):end) ) ;
fclose( fidout ) ;
In essence I look for five pattern matches using regular expressions. The first is
'[0-9]+\s[a-zA-Z][a-zA-Z_/.0-9]*\s'
This is the main one. It says look for any run of digits, [0-9]+, followed by
a space, \s, followed by a string that starts with a letter, [a-zA-Z], and continues
with any mix of letters, digits, _, /, or ., with the whole thing followed by a space, \s.
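For anyone without MATLAB, here is a rough Python sketch of just that first pattern. The sample string is made up for illustration and is not taken from a real .modeldb file. Note that Python positions are 0-based while MATLAB's are 1-based.

```python
import re

# Same primary pattern as in the MATLAB code, written as a Python raw string.
PRIMARY = re.compile(r"[0-9]+\s[a-zA-Z][a-zA-Z_/.0-9]*\s")

# Hypothetical sample text; real file contents will look different.
sample = "14 roman_legionary 1 4 62 unit_models/_units/en_lorica.mesh 6500 "

# Start position and matched text of every non-overlapping match.
primary_starts = [m.start() for m in PRIMARY.finditer(sample)]
primary_matches = [m.group() for m in PRIMARY.finditer(sample)]

print(primary_starts)
print(primary_matches)
```

The bare numbers ("1 4" and "6500") are skipped because none of them is followed by a space and then a letter.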
The second match is to break after any of those 0(space)(space) occurrences.
This is
'0\s\s[0-9]*\s'
meaning 0, two spaces, \s\s, and any run of digits (possibly empty), [0-9]*, followed by a space, \s.
I then add 3 to the returned indices because I wanted the break after
the two spaces, like kleemann said.
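The effect of adding 3 can be seen in a small Python sketch. The sample string is again invented, so treat it as an illustration only.

```python
import re

# Same secondary pattern as in the MATLAB code.
SECONDARY = re.compile(r"0\s\s[0-9]*\s")

# Hypothetical sample text containing two "0(space)(space)" occurrences.
sample = "1 0  5 4 0  12 7 "

# Each match starts at the "0"; adding 3 skips the "0" and the two spaces,
# so the break lands on the first character after them.
break_points = [m.start() + 3 for m in SECONDARY.finditer(sample)]

# Text that would begin each new line after such a break.
tails = [sample[p:] for p in break_points]

print(break_points)
print(tails)
```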
And so on. This might be more detail than what you were asking for.
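If somebody does want to redo it in Python, the overall idea (collect the break positions from all five patterns, de-duplicate, sort, then cut the text) might look roughly like the sketch below. It is untested against a real .modeldb file. The function names and the sample string are my own inventions, the dot in the third pattern is escaped so it matches a literal period, and unlike the MATLAB version it drops a break at the very start of the text so no empty first line is produced.

```python
import re

# (pattern, offset) pairs mirroring the five MATLAB regexps and the
# amount added to each set of indices.
PATTERNS = [
    (r"[0-9]+\s[a-zA-Z][a-zA-Z_/.0-9]*\s", 0),
    (r"0\s\s[0-9]*\s", 3),
    (r"\s[0-9]+\s-0\.090000004", 1),
    (r"\s[1-9]+\s[1-9]+\s[a-z]*\s", 1),
    (r"\s[1-9]\s[1-9]\s[1-9]+\s[a-zA-Z][a-zA-Z_/.0-9]*\s", 1),
]


def split_points(text):
    """Return the sorted, de-duplicated 0-based positions where a new line starts."""
    points = set()
    for pattern, offset in PATTERNS:
        for m in re.finditer(pattern, text):
            points.add(m.start() + offset)
    points.discard(0)  # a break at the very start would only give an empty line
    return sorted(points)


def split_text(text):
    """Cut text at every split point; the pieces join back to the original."""
    bounds = [0] + split_points(text) + [len(text)]
    return [text[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical sample; both the first and second patterns hit position 14.
sample = "14 alpha 1 0  7 beta 3 "
pieces = split_text(sample)
print(pieces)
```

To get the same kind of output file as the MATLAB code, the pieces could be written out with "\r\n" after each one, opening the file with newline="" so Python does not translate the line endings.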
@ahiga
I hate to have to ask, but being (figuratively speaking) new in town,
what does the term "+ rep" mean?