    Issue: Incorrect string value:

    1. issueid=4899 February 1, 2017 5:21 PM
      Alfa1
      Distinguished Member
      Incorrect string value:

      Code:
      Importing edits to wiki pages
      An exception occurred: Mysqli statement execute error : Incorrect string value: '\xB1-meth...' for column 'section' at row 1 in [path]/library/Zend/Db/Statement/Mysqli.php on line 214
      #0: Zend_Db_Statement_Mysqli->_execute() in [path]/library/Zend/Db/Statement.php at line 297
      #1: Zend_Db_Statement->execute() in [path]/library/Zend/Db/Adapter/Abstract.php at line 479
      #2: Zend_Db_Adapter_Abstract->query() in [path]/library/Zend/Db/Adapter/Abstract.php at line 574
      #3: Zend_Db_Adapter_Abstract->insert() in [path]/library/XenForo/DataWriter.php at line 1638
      #4: XenForo_DataWriter->_insert() in [path]/library/XenForo/DataWriter.php at line 1627
      #5: XenForo_DataWriter->_save() in [path]/library/vw/XenForo/DataWriter.php at line 352
      #6: vw_XenForo_DataWriter->_save() in [path]/library/XenForo/DataWriter.php at line 1419
      #7: XenForo_DataWriter->save() in [path]/vault/core/controller/dm/xf.php at line 415
      #8: vw_DM_Controller_XF->save() in [path]/vault/core/controller/import/handle/vw3/revision/vw.php at line 343
      #9: vw_Import_Handle_VW3_Revision_Controller->add_edit() in [path]/vault/core/controller/import/handle/vw3/revision/vw.php at line 272
      #10: vw_Import_Handle_VW3_Revision_Controller->do_edit() in [path]/vault/core/controller/import/handle/vw3/revision/vw.php at line 152
      #11: vw_Import_Handle_VW3_Revision_Controller->do_edits() in [path]/vault/core/controller/import/steps/vw3/vw.php at line 466
      #12: vw_Import_Steps_VW3_Controller->{closure}() in [path]/vault/core/controller/progress/steps/vw.php at line 83
      #13: vw_Progress_Steps_Controller->call() in [path]/vault/core/controller/progress/steps/vw.php at line 53
      #14: vw_Progress_Steps_Controller->execute() in [path]/vault/core/controller/progress/vw.php at line 92
      #15: vw_Progress_Controller->exec_script() in [path]/vault/core/controller/progress/vw.php at line 74
      #16: vw_Progress_Controller->execute() in [path]/vault/core/controller/cp/progress/vw.php at line 35
      #17: vw_CP_Progress_Controller->process() in [path]/vault/core/controller/cp/impex/vw.php at line 108
      #18: vw_CP_ImpEx_Controller->import() in [path]/vault/core/controller/cp/impex/vw.php at line 33
      #19: vw_CP_ImpEx_Controller->execute() in [path]/library/vw/XenForo/ControllerAdmin/Wiki.php at line 118
      #20: vw_XenForo_ControllerAdmin_Wiki->actionIndex() in [path]/library/XenForo/FrontController.php at line 351
      #21: XenForo_FrontController->dispatch() in [path]/library/XenForo/FrontController.php at line 134
      #22: XenForo_FrontController->run() in [path]/admin.php at line 13
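
For context on the error above: the byte `\xB1` falls in the range 0x80-0xBF, which UTF-8 reserves for continuation bytes, so it cannot stand at the start of a character. A minimal sketch of how that can be checked, assuming PHP's mbstring extension is available. The helper name `is_valid_utf8` and both sample strings are illustrative and not part of VaultWiki.

```php
<?php
// Hypothetical helper: true when $text is well-formed UTF-8.
function is_valid_utf8($text)
{
    return mb_check_encoding($text, 'UTF-8');
}

// The bytes quoted in the MySQL error: a continuation byte with no lead byte.
$from_error = "\xB1-meth";

// For contrast only: the same byte preceded by the lead byte 0xCE forms
// the two-byte sequence for U+03B1 (Greek small letter alpha).
$with_lead_byte = "\xCE\xB1-meth";

var_dump(is_valid_utf8($from_error));      // bool(false)
var_dump(is_valid_utf8($with_lead_byte));  // bool(true)
```

The second string is there only for comparison; the thread does not say what the source text actually contained.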
    Issue Details
      Issue Number: 4899
      Issue Type: Bug
      Project: VaultWiki 4.x Series
      Category: Importing
      Status: Fixed
      Priority: 2 - Fatal / Database Errors
      Affected Version: 4.0.16
      Fixed Version: (none)
      Milestone: (none)
      Software Dependency: XenForo 1.x
      License Type: Paid
      Users able to reproduce bug: 0
      Users unable to reproduce bug: 0
      Attachments: 0
      Assigned Users: (none)
      Tags: (none)


    1. February 2, 2017 10:10 AM
      pegasus
      VaultWiki Team
      I see an invalid UTF-8 byte right there. This might be a little touch and go. In vault/core/model/string/vw.php, find:
      Code:
      	public function mb_clean($input)
      	{
      		$utf8 = '#([\x09\x0A\x0D\x20-\x7E]' .		# ASCII
      			'|[\xC2-\xDF][\x80-\xBF]' .				# non-overlong 2-byte
      			'|\xE0[\xA0-\xBF][\x80-\xBF]' .			# excluding overlongs
      			'|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}' .	# straight 3-byte
      			'|\xED[\x80-\x9F][\x80-\xBF]' .			# excluding surrogates
      			'|\xF0[\x90-\xBF][\x80-\xBF]{2}' .		# planes 1-3
      			'|[\xF1-\xF3][\x80-\xBF]{3}' .			# planes 4-15
      			'|\xF4[\x80-\x8F][\x80-\xBF]{2})#S';	# plane 16
      
      		$string = '';
      		$matches = array();
      
      		$pcre_limit = $this->pcre_limit;
      
      		if ($pcre_limit AND strlen($input) > $pcre_limit)
      		{
      			$chunks = str_split($input, $pcre_limit);
      		}
      		else
      		{
      			$chunks = array($input);
      		}
      
      		foreach ($chunks AS $chunk)
      		{
      			while (preg_match($utf8, $chunk, $matches))
      			{
      				$string .= $matches[0];
      				$chunk = substr($chunk, strlen($matches[0]));
      			}
      		}
      
      		return $string;
      	}
      Replace with:
      Code:
      	public function mb_clean($input, $replace = '')
      	{
      		$utf8 = '#([\x09\x0A\x0D\x20-\x7E]' .		# ASCII
      			'|[\xC2-\xDF][\x80-\xBF]' .				# non-overlong 2-byte
      			'|\xE0[\xA0-\xBF][\x80-\xBF]' .			# excluding overlongs
      			'|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}' .	# straight 3-byte
      			'|\xED[\x80-\x9F][\x80-\xBF]' .			# excluding surrogates
      			'|\xF0[\x90-\xBF][\x80-\xBF]{2}' .		# planes 1-3
      			'|[\xF1-\xF3][\x80-\xBF]{3}' .			# planes 4-15
      			'|\xF4[\x80-\x8F][\x80-\xBF]{2})#S';	# plane 16
      
      		$string = '';
      		$matches = array();
      
      		$pcre_limit = $this->pcre_limit;
      
      		if ($pcre_limit AND strlen($input) > $pcre_limit)
      		{
      			$chunks = str_split($input, $pcre_limit);
      		}
      		else
      		{
      			$chunks = array($input);
      		}
      
      		foreach ($chunks AS $chunk)
      		{
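      			// Note: "$chunk AND" is false when $chunk is the string "0"; a follow-up post in this thread changes the test to $chunk !== ''.
      			while ($chunk AND preg_match($utf8, $chunk, $matches, PREG_OFFSET_CAPTURE))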
      			{
      				if ($replace AND $matches[0][1])
      				{
      					$string .= str_repeat($replace, $matches[0][1]);
      				}
      
      				$string .= $matches[0][0];
      				$chunk = substr($chunk, strlen($matches[0][0]) + $matches[0][1]);
      			}
      
      			if ($replace AND $chunk)
      			{
      				$string .= str_repeat($replace, strlen($chunk));
      			}
      		}
      
      		return $string;
      	}
      In vault/core/controller/dm/base/vw.php, find:
      Code:
      	public function verify_utf8(&$text)
      	{
      		$regex = '/[\x00-\x08\x10\x0B\x0C\x0E-\x19\x7F]';
      		$regex .= '|[\x00-\x7F][\x80-\xBF]+';
      		$regex .= '|([\xC0\xC1]|[\xF0-\xFF])[\x80-\xBF]*';
      		$regex .= '|[\xC2-\xDF]((?![\x80-\xBF])|[\x80-\xBF]{2,})';
      		$regex .= '|[\xE0-\xEF](([\x80-\xBF](?![\x80-\xBF]))|(?![\x80-\xBF]{2})|[\x80-\xBF]{3,})/';
      
      		$text = preg_replace($regex, '?', $text);
      
      		$regex = '/\xE0[\x80-\x9F][\x80-\xBF]';
      		$regex .= '|\xED[\xA0-\xBF][\x80-\xBF]/S';
      
      		$text = preg_replace($regex, '?', $text);
      	}
      Replace with:
      Code:
      	public function verify_utf8(&$text)
      	{
      		$text = vw_Hard_Core::model('String')->mb_clean($text, '?');
      	}
      Messing with the byte sequences has the potential to import non-ASCII characters incorrectly. Would have been good to have seen this on your test import. This must be from a new edit on your source wiki in the past month or two.
    2. February 2, 2017 1:48 PM
      Alfa1
      Distinguished Member
      That makes me very nervous. I don't feel confident doing it because I am afraid it could mess up my website data.
    3. February 5, 2017 8:13 PM
      Alfa1
      Distinguished Member
      Quote Originally Posted by pegasus
      This might be a little touch and go. Messing with the byte sequences has the potential to import non-ASCII characters incorrectly.
      Could you please explain this further? I think my site contains a lot of non-ASCII characters. As you describe it, your fix may result in data corruption or incorrect import of non-ASCII characters. The last thing I want is a messed-up database. This is why I am afraid of your fix.

      I have now migrated my site from vBulletin to XenForo without VaultWiki.
    4. February 6, 2017 8:39 AM
      pegasus
      VaultWiki Team
      This change alters the regular expressions that allow UTF-8 bytes into your content. While the old regex was not corrupting characters, according to the error you posted, it was allowing characters that were not UTF-8.

      The old regex had years of usage behind it without corruption. I just gave you a new regex; it simply does not have that level of use behind it. This regex should function similarly to the old one: the old one searched for non-UTF-8 characters and replaced them with a ?. The new one is slower, searches for valid UTF-8 characters, and any other characters are replaced with a ?.

      But the worst that will happen is that you will have to reinstall VaultWiki and try the import again with a different regex. This is no different than any of the previous imports you've done. It will not affect the original data, just the imported copy.
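
The keep-valid, replace-the-rest behavior described above can be sketched in isolation. This is not VaultWiki's `mb_clean()`: the function name `clean_utf8_sketch` is made up for illustration, it uses `preg_replace_callback` instead of the chunked loop, and it reuses only the byte ranges from the pattern posted earlier in this thread.

```php
<?php
// Illustrative only: keep each valid UTF-8 character, replace every other byte.
function clean_utf8_sketch($input, $replace = '?')
{
    $valid = '[\x09\x0A\x0D\x20-\x7E]'          // printable ASCII plus tab, LF, CR
        . '|[\xC2-\xDF][\x80-\xBF]'             // 2-byte sequences
        . '|\xE0[\xA0-\xBF][\x80-\xBF]'         // 3-byte, excluding overlongs
        . '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'  // straight 3-byte
        . '|\xED[\x80-\x9F][\x80-\xBF]'         // 3-byte, excluding surrogates
        . '|\xF0[\x90-\xBF][\x80-\xBF]{2}'      // 4-byte, lowest supplementary planes
        . '|[\xF1-\xF3][\x80-\xBF]{3}'          // 4-byte, middle planes
        . '|\xF4[\x80-\x8F][\x80-\xBF]{2}';     // 4-byte, plane 16

    // Group 1 captures one valid character. The bare "." (s flag, no u flag)
    // consumes exactly one byte that did not start a valid character.
    return preg_replace_callback(
        '#(' . $valid . ')|.#s',
        function ($m) use ($replace) {
            return isset($m[1]) && $m[1] !== '' ? $m[1] : $replace;
        },
        $input
    );
}

echo clean_utf8_sketch("\xB1-meth"), "\n";          // ?-meth
echo clean_utf8_sketch("\xC3\xA4\xC3\x9F"), "\n";   // unchanged: the UTF-8 bytes for "äß"
```

Each invalid byte becomes one `?`, so a stray multi-byte fragment shows up as several question marks rather than one.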
    5. February 6, 2017 3:11 PM
      Alfa1
      Distinguished Member
      What I am worried about is that the imported copy will have bad data without me noticing it for some time. Can this be avoided or checked somehow?
    6. February 6, 2017 3:33 PM
      pegasus
      VaultWiki Team
      Pick 3 - 5 articles that you know contain a large number / variety of special characters and review them after the import. Also, retain the final backup of your VW3 database for a long time, in case you notice something later. It would not be a major issue for us to repair 1 or 2 articles from the backup. I like keeping backups indefinitely.
    7. February 16, 2017 10:39 AM
      pegasus
      VaultWiki Team
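      I have noticed that the above edit to vault/core/model/string/vw.php removes a trailing "0" from content, for example when a title or a post ends in a number like "1,000" with no markup after it. PHP treats the string "0" as false, so the loop stops as soon as the remaining chunk is exactly "0".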

      To fix it, change:
      Code:
      while ($chunk AND preg_match($utf8, $chunk, $matches, PREG_OFFSET_CAPTURE))
      To:
      Code:
      while ($chunk !== '' AND preg_match($utf8, $chunk, $matches, PREG_OFFSET_CAPTURE))
      I have not noticed any other issues yet.
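
The truthiness pitfall behind this fix can be reproduced without the regex. A minimal sketch, with made-up function names, that copies a string one byte at a time under each loop condition:

```php
<?php
// Loop condition as in the first version of the edit: stops on "" and on "0".
function copy_with_truthy_test($input)
{
    $out = '';
    $chunk = $input;
    while ($chunk) {
        $out .= $chunk[0];
        // The cast keeps the sketch working on PHP versions where substr()
        // returns false at the end of the string.
        $chunk = (string) substr($chunk, 1);
    }
    return $out;
}

// Loop condition as in the fix: stops only on the empty string.
function copy_with_strict_test($input)
{
    $out = '';
    $chunk = $input;
    while ($chunk !== '') {
        $out .= $chunk[0];
        $chunk = (string) substr($chunk, 1);
    }
    return $out;
}

var_dump(copy_with_truthy_test('1,000'));  // string(4) "1,00"
var_dump(copy_with_strict_test('1,000'));  // string(5) "1,000"
```

Only the final "0" is lost, because "00" and "000" are still truthy strings in PHP.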
    8. February 17, 2017 7:42 PM
      Alfa1
      Distinguished Member
      Quote Originally Posted by pegasus
      I have not noticed any other issues yet.
      Errr! I am going to pass on importing until I know that I can import without corrupting a thousand-plus wiki articles.
    9. February 17, 2017 7:46 PM
      Alfa1
      Distinguished Member
      Quote Originally Posted by pegasus
      But the worst that will happen is that you will have to reinstall VaultWiki and try the import again with a different regex. This is no different than any of the previous imports you've done. It will not affect the original data, just the imported copy.
      The difference is that we are live now. Once it is imported our members will start adding content.
      Basically you are saying that this needs to be tested fully on a development install and import to the live environment is not a good idea at this time.
    10. February 18, 2017 12:18 PM
      pegasus
      VaultWiki Team
      That would have been ideal. But obviously the problem wiki data was not a part of the source wiki in your last test of the import. You can either test again with this new source data, or try it live.

      Personally, the only problem I noticed with the edits above involved trailing zeros, and I gave the fix for that. I have tested with a few articles that contain non-ASCII characters, and I did not see any problems; the characters that were considered valid were left untouched. But I cannot test and review that it works for all 1,112,064 code points that UTF-8 can encode. That is almost beyond the testing anyone can do, even you if you had another test import to try. I do know that it works for the invalid string cited above, "\xB1-meth...", correctly changing it to the valid string "?-meth...", and that the same regex used by this cleaner is used by other applications without reported issues.
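
Reviewing imported articles by eye does not scale, but the pattern on its own can be checked mechanically, since the set of code points is small enough to loop over. The sketch below is an assumption-laden illustration rather than part of VaultWiki: `encode_utf8` and `find_rejected_code_points` are hypothetical helpers, and the pattern is the one posted earlier in this thread with `\A` and `\z` anchors added so that it must match a whole character. It says nothing about the rest of the import path, such as the database or rendering.

```php
<?php
// The "valid character" byte ranges from mb_clean(), anchored to the whole string.
$pattern = '#\A(?:[\x09\x0A\x0D\x20-\x7E]'
    . '|[\xC2-\xDF][\x80-\xBF]'
    . '|\xE0[\xA0-\xBF][\x80-\xBF]'
    . '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'
    . '|\xED[\x80-\x9F][\x80-\xBF]'
    . '|\xF0[\x90-\xBF][\x80-\xBF]{2}'
    . '|[\xF1-\xF3][\x80-\xBF]{3}'
    . '|\xF4[\x80-\x8F][\x80-\xBF]{2})\z#';

// Hypothetical helper: encode one code point as UTF-8 bytes.
function encode_utf8($cp)
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12)) . chr(0x80 | (($cp >> 6) & 0x3F))
            . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18)) . chr(0x80 | (($cp >> 12) & 0x3F))
        . chr(0x80 | (($cp >> 6) & 0x3F)) . chr(0x80 | ($cp & 0x3F));
}

// Hypothetical helper: list every encodable code point the pattern rejects.
function find_rejected_code_points($pattern)
{
    $rejected = array();
    for ($cp = 0; $cp <= 0x10FFFF; $cp++) {
        if ($cp >= 0xD800 && $cp <= 0xDFFF) {
            continue; // surrogates have no UTF-8 encoding
        }
        if (!preg_match($pattern, encode_utf8($cp))) {
            $rejected[] = $cp;
        }
    }
    return $rejected;
}

$rejected = find_rejected_code_points($pattern);
echo count($rejected), " code points rejected\n";
```

The only code points the pattern rejects are ASCII control characters other than tab, line feed, and carriage return, which the cleaner would turn into `?`.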
    11. February 21, 2017 5:39 PM
      Alfa1
      Distinguished Member
      Do you have any idea how to best test the imported articles for the existence of non-ASCII characters?
      I mean: how do I know if any non-ASCII characters are in the source text?
      I do find these German Characters being correctly imported: http://german.about.com/od/writingge...a-Keyboard.htm
      For example: äß
      Does this mean I am good to go or are there other things that would be wise to test out?
    12. February 22, 2017 8:24 AM
      pegasus
      VaultWiki Team
      Generally I just check articles for characters that I know are being used. Since you use German characters, and the articles you checked appear to be fine, then you should be mostly fine.

      If you want to check other articles, the following will test if the article contains only ASCII characters.
      Code:
      $ascii_only = mb_check_encoding($source_text, 'ASCII');
      
      if (!$ascii_only)
      {
      // contains non-ASCII characters
      }
    13. February 22, 2017 10:46 AM
      Alfa1
      Distinguished Member
      How and where do I apply that code?
    14. February 22, 2017 11:03 AM
      pegasus
      VaultWiki Team
      I thought you were writing a script to test for non-ASCII characters. That code is out-of-context; I was providing you with the PHP function that does what you were asking about.
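
One way to put that function in context is a small standalone script that reports which non-ASCII characters a text contains, so they can be compared between the source wiki and the imported copy. This is only a sketch: `list_non_ascii_chars` and the sample `$articles` array are hypothetical, the mbstring extension is assumed, and loading article text from the database is left out because the thread does not describe the schema.

```php
<?php
// Hypothetical helper: returns the distinct non-ASCII characters in $text,
// an empty array for pure ASCII, or null when $text is not valid UTF-8.
function list_non_ascii_chars($text)
{
    if (mb_check_encoding($text, 'ASCII')) {
        return array();
    }
    if (!mb_check_encoding($text, 'UTF-8')) {
        return null;
    }
    // With the u flag, the class matches whole characters above U+007F.
    preg_match_all('/[^\x00-\x7F]/u', $text, $matches);
    return array_values(array_unique($matches[0]));
}

// Placeholder data standing in for article text read from the wiki.
$articles = array(
    'Plain'  => 'Only ASCII here.',
    'German' => "Stra\xC3\x9Fe und K\xC3\xA4se",  // UTF-8 bytes for "Straße und Käse"
    'Broken' => "\xB1-meth",
);

foreach ($articles as $title => $text) {
    $chars = list_non_ascii_chars($text);
    if ($chars === null) {
        echo $title, ": invalid UTF-8\n";
    } elseif ($chars) {
        echo $title, ': ', implode(' ', $chars), "\n";
    } else {
        echo $title, ": ASCII only\n";
    }
}
```

Running the same check on an article before and after the import and comparing the two lists is a quick way to spot characters that were replaced.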
    15. February 22, 2017 1:52 PM
      Alfa1
      Distinguished Member
      No, I am just looking through posts, and I can do database searches.
    All times are GMT -4.