
    Issue: Fetch Article Uses Too Much Memory

    1. issueid=1607, May 30, 2010 1:21 AM
      pegasus (VaultWiki Team)

      Currently, vB_WikiArticle::fetch retrieves all articles in the requested namespace. This is done to avoid running fetch more than once on a single page, but it also means a lot of unnecessary preprocessing and memory usage since most of the other pages will not be used. This would be especially problematic for wikis with millions of articles. There must be some way to obtain a list of needed article names ahead of time without fetching them all at once.

      One option is a preg_match pass that cycles through all posts and looks for explicit links (this wouldn't catch autolinks). Another option is to use the link cache, which may contain autolinks, but which may be out of date, disabled, or slower than performing a preg_match (testing will tell). As for book and chapter links, a list of IDs can easily be compiled to fetch only the items that are shown. If the fetch method becomes more generic than its current state, so that we can fetch by thread ID, then we can also reduce the number of fetches that are performed (provided we already know the thread ID).
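
      As a minimal sketch of the preg_match idea, assuming explicit links use a [wiki]Title[/wiki] BB-code and posts store their text in a pagetext field (both are assumptions; the real syntax and schema may differ), the scan could look like this:

      <?php
      // Collect the titles of all explicit wiki links in a batch of posts,
      // so the matching articles can be fetched in a single query.
      // Autolinks are not caught by this scan.
      function collect_wiki_link_titles(array $posts)
      {
          $titles = array();
          foreach ($posts as $post)
          {
              if (preg_match_all('#\[wiki\](.+?)\[/wiki\]#i', $post['pagetext'], $matches))
              {
                  foreach ($matches[1] as $title)
                  {
                      $titles[$title] = true; // de-duplicates repeated links
                  }
              }
          }
          return array_keys($titles);
      }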

      So a new class should be made, with these methods (a rough sketch follows the list):
      preFetch - registers thread IDs or titles to be added to memory
      fetch - current behavior, except it uses only the existing cache; if the item is not cached, it performs a slave query
      commit - retrieves the registered info and adds it to memory
      update - performs a DB and cache update
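
      A minimal sketch of how that class might look; the class name and the query helpers are placeholders for illustration, not the actual VaultWiki API:

      <?php
      // Hypothetical prefetch queue for wiki articles: titles/thread IDs are
      // registered up front, resolved in one bulk query by commit(), and then
      // served from the in-memory cache by fetch().
      class vw_Article_Prefetch
      {
          protected $queued = array();   // key => true, awaiting commit()
          protected $cache  = array();   // key => article record

          // Register a thread ID or title to be added to memory.
          public function preFetch($key)
          {
              if (!isset($this->cache[$key]))
              {
                  $this->queued[$key] = true;
              }
          }

          // Retrieve all registered info in one query and add it to memory.
          public function commit()
          {
              if ($this->queued)
              {
                  // fetch_articles_by_keys() stands in for a single
                  // "WHERE title IN (...)" slave query.
                  foreach (fetch_articles_by_keys(array_keys($this->queued)) as $key => $article)
                  {
                      $this->cache[$key] = $article;
                  }
                  $this->queued = array();
              }
          }

          // Current behavior, but using only the existing cache; fall back
          // to a single-item slave query if the item was never registered.
          public function fetch($key)
          {
              if (!isset($this->cache[$key]))
              {
                  $this->cache[$key] = fetch_article_from_slave($key);
              }
              return $this->cache[$key];
          }

          // Perform a DB and cache update for one record.
          public function update($key, array $changes)
          {
              update_article_in_db($key, $changes);
              $this->cache[$key] = array_merge($this->fetch($key), $changes);
          }
      }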
    Issue Details
    Issue Number: 1607
    Issue Type: Task
    Project: VaultWiki 4.x Series
    Category: BB-Code Parsing
    Status: In Progress
    Priority: 6 - Dev-Related Tasks
    Target Version: 3.x
    Resolved Version: (none)
    Milestone: VaultWiki 4.2
    Software Dependency: Any
    License Type: Paid
    Votes to perform: 1
    Votes not to perform: 0
    Attachments: 0
    Assigned Users: (none)
    Tags: (none)




    1. October 6, 2010 3:42 PM
      pegasus (VaultWiki Team)
      Cache size has been reduced by half in VaultWiki 4. Closing this for now, as I don't think the changes proposed in the OP would be practical in certain common situations.
    2. August 14, 2011 9:02 PM
      pegasus (VaultWiki Team)
      Reopening this for consideration in 4.0.0 Beta 1, as I now think this can be done if we register not only IDs and titles, but also handlers that are supposed to update links in their various contexts. This is a technique used throughout VaultWiki 4, and it could allow us to make a generic, extensible system.

      We can have a class like vw_LinkQueue_Model or something (see the sketch below). If the size of the full article list exceeds a certain amount or is unknown, the queue can be used; otherwise, the old load-everything-up-front method can be used.
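
      As a rough sketch, assuming a hypothetical handler interface (none of these names are the real VaultWiki 4 API):

      <?php
      // Hypothetical link queue: callers register IDs/titles together with a
      // handler that knows how to update links in its own context once the
      // records have been resolved in bulk.
      class vw_LinkQueue_Model
      {
          protected $queue = array();    // key => list of handlers waiting on it

          // Register an ID or title plus the handler to notify on resolution.
          public function register($key, $handler)
          {
              $this->queue[$key][] = $handler;
          }

          // Resolve everything in one bulk query, then let each handler
          // rewrite the links in its own context.
          public function process()
          {
              // fetch_articles_by_keys() stands in for a single bulk query.
              $records = fetch_articles_by_keys(array_keys($this->queue));
              foreach ($this->queue as $key => $handlers)
              {
                  $record = isset($records[$key]) ? $records[$key] : null;
                  foreach ($handlers as $handler)
                  {
                      $handler->update_links($key, $record); // null => broken link
                  }
              }
              $this->queue = array();
          }
      }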
    3. October 15, 2013 11:28 AM
      pegasus (VaultWiki Team)
      Originally reported for 3.0.0 RC 3.
    4. February 6, 2014 1:42 PM
      pegasus (VaultWiki Team)
      We can actually reduce the size of the page list by applying some tricks to how it is stored in memory. By using fixed-length arrays for the actual page records, we can save almost 20% for each list.

      I am working on this now with a client who has a memory limit of 32MB and a page list that is 40000+ records long. After implementing a fixed-length array, each record uses 460 bytes versus 605 bytes; for one test set at 8.3 MB initially, this reduces it to 6.4 MB. Further, we don't need to store this information as an array all the time, since it is accessed infrequently. Instead, we can serialize each fixed-length array and unserialize it only when we need the actual record info (often it is only an if-exists check). This further reduces each record to 161 bytes, for a total of 2.2 MB used in our test set.

      That's roughly 25% of the original memory footprint, with very little impact on the speed of lookups in the list.
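
      A minimal sketch of the storage trick, assuming PHP's SplFixedArray is the fixed-length structure in question (the field layout is only illustrative):

      <?php
      // Store each page record as a serialized fixed-length array instead of a
      // live associative array: numeric slots drop per-key overhead, and keeping
      // the record as a string until it is actually read shrinks it further.
      define('VW_PAGE_ID',    0);
      define('VW_PAGE_TITLE', 1);
      define('VW_PAGE_FLAGS', 2);

      function vw_pack_record($id, $title, $flags)
      {
          $record = new SplFixedArray(3);
          $record[VW_PAGE_ID]    = $id;
          $record[VW_PAGE_TITLE] = $title;
          $record[VW_PAGE_FLAGS] = $flags;
          return serialize($record->toArray()); // stored as one compact string
      }

      function vw_unpack_record($packed)
      {
          return SplFixedArray::fromArray(unserialize($packed));
      }

      // The page list stays keyed by title, so the common existence check
      // never needs to unpack the record itself.
      $pages = array();
      $pages['Main_Page'] = vw_pack_record(1, 'Main_Page', 0);

      if (isset($pages['Main_Page']))
      {
          $record = vw_unpack_record($pages['Main_Page']);
          echo $record[VW_PAGE_TITLE]; // unpack only when the data is needed
      }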