When you deploy a large number of documents, a large number of users (more than a few hundred), or both on Google Drive, performance issues may occur, such as longer response times in the Google Drive user interface, incomplete search results, or intermittent error messages.
This article provides the best practices that will help you and your team avoid these issues. In addition to the information provided below, we strongly recommend that you contact the AODocs technical support team before importing large amounts of content into Google Drive and AODocs. Our team will help you find the best strategy to import your documents without running into performance problems.
The most important guidelines are listed below:
- When using global enterprise accounts to centralize the ownership of documents, performance bottlenecks can happen in scenarios where a large number of documents are shared with a large number of users, and when these users are very active (i.e. they make frequent changes to the documents). While there is no hard limitation on the number of documents you can have in a single AODocs library, we strongly recommend that you contact our technical support team when you plan to have a large number of documents (tens of thousands) managed by AODocs. Our team will provide you with the best practices and personalized guidelines that will help you configure your AODocs domain to avoid any performance issue. The sections below provide more details about these best practices.
- Updating sharing permissions is a slow operation in Google Drive, especially when a large number of files or a large number of sharing permissions are involved. The typical example is adding or removing multiple sharing permissions on a folder containing a large number of items, when these permission updates need to be propagated to all the items contained in the folder. To mitigate this issue, it is very important to minimize the number of sharing permissions applied to your files and folders, and the best way to achieve this is to assign permissions to Google Groups instead of individual users whenever possible. The sections below provide more details about these limitations.
- Adding a shared folder to the “My Drive” folder of many end-users can lead to performance issues. If a folder structure needs to be shared with a large number of users (thousands), you should avoid pushing it to the “My Drive” folder of the users and consider alternative sharing methods, such as an AODocs library accessed via the AODocs user interface, or a Google Site. The sections below provide more details about these limitations.
Although Google Drive is a very scalable platform designed for millions of users, the Google Drive infrastructure has limited processing capabilities for each individual account.
As indicated above, a combination of a large number of files, a large number of sharing permissions, and a high level of activity on the files can “overload” the Google Drive account that owns the files and cause a visible performance impact for end-users. To work around this potential issue, we recommend splitting the content stored in AODocs into multiple libraries and allocating a sufficient number of Google Drive service accounts, so that the Google Drive workload is distributed over these accounts. Our support team can help you determine the number of storage accounts that is appropriate for your specific parameters.
Ownership transfer and memory of the “first” ownership
Google Drive makes it possible to transfer the ownership of a file to a different account on the same G Suite domain. A file can only have a single owner, so when transferring a file’s ownership from account A to account B, account B becomes the new owner of the file and gets the associated privileges:
- The file can only be deleted (i.e. moved to the Trash) by user B
- The file size is counted against user B’s storage quota
- Only user B can transfer the file ownership to another account
- Only user B can decide whether other editors can update the file’s sharing permissions or not
However, as far as performance is concerned, the file is still linked to its original owner (user A), and the per-account limitations described above continue to apply to the file as if it were still owned by user A. As a result, when defining the storage strategy for a large number of Google Drive files, one should consider the number of “original owners” of the files (i.e. the accounts which will be used to first create or upload the files) as well as the number of “final owners” of the files.
For example, let’s imagine we have an on-premises file server with 300,000 files to be migrated to Google Drive. Following our team’s recommendations, we will allocate 5 Google Drive accounts and spread the files evenly across these 5 accounts, so that each account will end up owning 60,000 files (which leaves some room for the growth of the library). We can consider the following scenarios for uploading the content:
- Using automated migration tools (such as the file server migration tool provided by AODocs), we upload the files directly to their assigned Google Drive account: in this case, there is no ownership transfer and the “original owner” of each file is also the “final owner”. The performance limitations described above apply normally to the final storage accounts.
- “Crowdsourced” upload: we ask a “large number” of users (for example, 100 users) to manually upload files to their My Drive folder, and then to transfer the ownership of the files to the final storage accounts (for example by using the folder import function in AODocs). In this case, the 300,000 files in AODocs have 100 different “original owners”, and none of these “original owners” has too many files, so there are no performance issues to be expected here.
- Single upload: we upload all 300,000 files to the same Google Drive account, and then we transfer the ownership to the final storage accounts. In this case, even though no final owner owns more than 60,000 files, there is a single “original owner” for all 300,000 files, and this account will become a performance bottleneck as described above.
We recommend using scenario 1 (direct upload with migration tools), since it has the advantage of simplicity: the “final owners” are the same as the “original owners”, so there are no surprises or side effects to expect.
Scenario 2 (crowdsourced upload) is acceptable as long as the largest number of files uploaded by any single user remains reasonably small (thousands). Scenario 3 (single upload) should be strictly avoided since it exceeds the limitations described above.
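The difference between scenarios 2 and 3 can be checked before a migration by counting files per “original owner”. The sketch below is illustrative only: the account names, the round-robin upload plan, and the 5,000-file threshold are assumptions (the article only says “thousands” per uploader is acceptable), and it does not call any Google Drive or AODocs API.

```python
from collections import Counter

# Hypothetical threshold: the article gives no exact figure, only "thousands".
MAX_FILES_PER_ORIGINAL_OWNER = 5_000


def overloaded_original_owners(original_owner_by_file,
                               limit=MAX_FILES_PER_ORIGINAL_OWNER):
    """Return {account: file_count} for original owners above the limit."""
    counts = Counter(original_owner_by_file.values())
    return {owner: n for owner, n in counts.items() if n > limit}


def upload_plan(total_files, uploaders):
    """Assign files to uploader accounts round-robin (illustrative plan)."""
    return {f"file-{i}": uploaders[i % len(uploaders)]
            for i in range(total_files)}


# Scenario 2: 100 users each upload a share of the 300,000 files.
crowd = [f"user{i}@example.com" for i in range(100)]
scenario_2 = upload_plan(300_000, crowd)

# Scenario 3: a single account uploads everything before the transfer.
scenario_3 = upload_plan(300_000, ["migration@example.com"])

print(overloaded_original_owners(scenario_2))  # {}
print(overloaded_original_owners(scenario_3))  # {'migration@example.com': 300000}
```

In scenario 2 each uploader is the original owner of 3,000 files, which stays under the assumed threshold; in scenario 3 the single upload account is flagged regardless of how the files are later distributed.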
Sharing permission performance
The biggest source of “server workload” in Google Drive is the propagation of sharing permissions. Each time a single permission is added or removed on a file or a folder, the Google Drive servers have to update their internal database. The amount of work that the Google Drive servers need to do is therefore closely related to the average number of sharing permissions on each file and folder.
Permission changes can happen not only when explicitly sharing / unsharing items in Google Drive, but also when moving items to different parent folders, because the sharing permissions of the new parent will be propagated to the content being added.
For example, let’s imagine a folder named “Accounting documents” which contains 100 files and is shared with 2 users, and a folder named “Finance” which is shared with 10 users (including the 2 having access to “Accounting documents”). If someone moves “Accounting documents” into “Finance”, then Google Drive will have to add 8 new users to each file contained inside the “Accounting documents” folder, which will require 808 permission changes (8 for the “Accounting documents” folder itself, plus 8 permission changes on each file).
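The arithmetic in this example can be written as a small function. This is a simplified model that assumes a flat folder (no subfolders) and counts one permission change per newly added user on the folder and on each file; the function name and user identifiers are hypothetical.

```python
def permission_changes_on_move(folder_users, new_parent_users, file_count):
    """Estimate permission additions when a folder moves under a new parent.

    Simplified model: flat folder, one change per new user per item.
    """
    added = set(new_parent_users) - set(folder_users)
    # One change per new user on the folder itself, plus one per file.
    return len(added) * (1 + file_count)


accounting_users = {"u1", "u2"}
finance_users = {f"u{i}" for i in range(1, 11)}  # u1..u10, includes u1 and u2

print(permission_changes_on_move(accounting_users, finance_users, 100))  # 808
```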
Therefore, the overall performance of a set of shared documents is roughly proportional to both the total number of files and the number of sharing permissions added or deleted per file. To mitigate this impact, we strongly recommend setting sharing permissions through a small number of Google Groups rather than a long list of individual accounts.
Replacing 10 individual users with a Google Group (containing the same 10 people) can result in a significant performance improvement. Adding an 11th member to this group will not require any permission change (the files are already shared with the group), while adding an 11th person to the shared folder will result in as many permission changes as there are files in the folder. Note that both Google Drive and AODocs fully support “nested groups” (groups that are members of other groups), so it is also possible to replace a set of sharing permissions on multiple Google Groups with a single Google Group containing them.
Based on measurements made on some large Google Drive deployments, updating a few permissions on a folder tree containing 10,000 documents can take more than 4 hours.
When a folder or a file is added to the “My Drive” of a user, the user is considered to have “subscribed” to this item and the user’s Google Drive account gets access to the file or folder (for example, the file or folder will show up in the search results when doing a keyword search). This connection impacts some of the internal workings of Google Drive, regarding the full text search, audit logs, activity panels and other back-office activities. The net result of this connection is that every modification happening on the files or folders that are in the user’s “My Drive” (or in a subfolder of the user’s “My Drive”) will send a notification to this user, thus impacting the overall performance. Each individual notification sent does not cost much, but when a file or folder has thousands of “subscribers” (i.e. if thousands of users have it in their “My Drive”), then the performance overhead becomes significant.
The Google Drive infrastructure was not designed to support such scenarios, so we strongly recommend not exceeding 300 “subscriptions” per file or folder, i.e. never adding a file or folder to the “My Drive” of more than 300 users. Unlike the sharing performance explained above, the limitation on “subscriptions” does not depend on the way the file or folder is shared. Sharing the item via a single Google Group, multiple Google Groups, or an explicit list of users will result in the same number of “subscriptions” being added to the item when using the AODocs “Push to My Drive” feature, since in all scenarios AODocs will end up adding the folder to each individual user’s My Drive folder.
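Because the number of “subscriptions” depends on the individual users behind the groups, it can help to estimate it before pushing a folder. The sketch below is a minimal illustration that assumes group membership is already available as an in-memory dictionary (the `groups` mapping, addresses, and function names are hypothetical, and no Google or AODocs API is called). It expands nested groups and counts distinct users against the 300 limit recommended above.

```python
SUBSCRIPTION_LIMIT = 300  # recommended maximum from the article


def expand_members(principal, groups, seen=None):
    """Return the individual users reachable from a user or (nested) group."""
    seen = set() if seen is None else seen
    if principal not in groups:
        return {principal}  # not a known group: treat as an individual user
    if principal in seen:
        return set()  # already expanded (also guards against cycles)
    seen.add(principal)
    users = set()
    for member in groups[principal]:
        users |= expand_members(member, groups, seen)
    return users


def subscriptions_if_pushed(shared_with, groups):
    """Count distinct users who would get the item in their My Drive."""
    users = set()
    for principal in shared_with:
        users |= expand_members(principal, groups)
    return len(users)


# Hypothetical membership data: two teams with 20 people in both.
groups = {
    "all-staff@example.com": ["sales@example.com", "eng@example.com"],
    "sales@example.com": [f"user{i}@example.com" for i in range(0, 200)],
    "eng@example.com": [f"user{i}@example.com" for i in range(180, 330)],
}

count = subscriptions_if_pushed(["all-staff@example.com"], groups)
print(count, count > SUBSCRIPTION_LIMIT)  # 330 True
```

Sharing with the single nested group or with the two team groups separately gives the same count, which matches the point above that the sharing method does not change the number of subscriptions.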