While we did the conversion almost exactly the same way as the XenForo importer, a few differences led to this issue:
- To improve accuracy when the site contains multiple character sets, the character set detection algorithm is used (XenForo only uses the character set of the default language, which may be incorrect if content was posted using a different forum language).
- Due to a bug in PHP, it is not possible to detect character sets like Windows-1252 or CP-1252; however, they are similar to ISO-8859-1, and are detected as such (something VaultWiki's importer was already aware of).
- Due to a bug in PHP, converting from ISO-8859-1 does not include high-byte characters; these are only converted when converting from Windows-1252 (something XenForo's importer was aware of, but not VaultWiki's).
This is fixed in the next release by treating a detected character set of ISO-8859-1 as Windows-1252, since Windows-1252 cannot be detected and is effectively a superset of ISO-8859-1 anyway.
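
To illustrate the fix described above, here is a minimal sketch in Python (not the importer's actual PHP code). The helper names effective_charset and to_utf8 are hypothetical, and the sample assumes the affected characters are the bytes 0x80 to 0x9F, which Windows-1252 maps to printable characters such as curly quotes while ISO-8859-1 treats them as control codes.

```python
# Hypothetical helpers for illustration only; not VaultWiki's or XenForo's code.

def effective_charset(detected: str) -> str:
    """Return the charset to decode with, given the detected one.

    A detected ISO-8859-1 is treated as Windows-1252 (cp1252 in Python).
    """
    normalized = detected.strip().lower().replace("_", "-")
    if normalized in ("iso-8859-1", "latin1", "latin-1"):
        return "cp1252"
    return detected


def to_utf8(raw: bytes, detected: str) -> bytes:
    """Decode raw bytes with the effective charset and re-encode as UTF-8."""
    return raw.decode(effective_charset(detected)).encode("utf-8")


# 0x93 and 0x94 are curly double quotes in Windows-1252,
# but C1 control codes in ISO-8859-1. 0xE9 is "e acute" in both.
sample = b"\x93caf\xe9\x94"

# Decoding as ISO-8859-1 keeps the control codes, so the quotes are lost.
naive = sample.decode("iso-8859-1")

# Decoding via the remapped charset recovers the curly quotes.
fixed = to_utf8(sample, "ISO-8859-1").decode("utf-8")
```

Note that Python's cp1252 codec rejects the five byte values Windows-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D), so a real converter would need to decide how to handle those; this sketch does not.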
Additionally, the automatic correction of the database character set from blank to latin1 had not been working for a few versions, so the charset had to be entered manually in the import config file. That setting becomes obsolete in the next release anyway, since the release introduces form-based import configs with validation for settings like these.
I suspect this change will also fix your other similar reports, in which case they will be closed as Duplicates, but I will check each case individually.