MV3: Move translation from sandboxed content iframe to offscreen
Offscreen pages don't support eval, so we embed a sandbox iframe in the
offscreen page. The sandbox does support eval, which lets us run the
translate architecture there.

Message passing is technically complicated:
	Content Scripts <-chrome.runtime.sendMessage->
	Background Service Worker (background-worker.js) <-MessagePort->
	Offscreen Sandbox Iframe (offscreenSandbox.html)

The MessagePort pair is created by the Offscreen Page (offscreen.html), and
the two ports are passed to the Offscreen Sandbox Iframe and the Background
Service Worker respectively. After that, offscreen.html only embeds the
Offscreen Sandbox and reinitializes the MessagePort in case the
Background Service Worker gets restarted by the browser.
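
The port handoff described above could look roughly like the sketch below. It is an illustration under assumptions, not the code from this commit: `handOffPorts` is a hypothetical helper, and the `'*'` target origin and the message sent to the sandbox iframe are guesses. Only the `'offscreen-port'` message to the service worker mirrors the listener in offscreenManager.js.

```javascript
// Hypothetical helper run by offscreen.html: creates one MessageChannel and
// gives one end to the sandbox iframe and the other to the service worker.
function handOffPorts(sandboxWindow, serviceWorker) {
	const { port1, port2 } = new MessageChannel();
	// Window.postMessage(message, targetOrigin, transfer). The '*' origin is an
	// assumption made for this sketch.
	sandboxWindow.postMessage('offscreen-port', '*', [port1]);
	// ServiceWorker.postMessage(message, transfer). The worker can then read
	// the transferred port from e.ports[0] in its onmessage handler.
	serviceWorker.postMessage('offscreen-port', [port2]);
}
```

In a real offscreen page the two arguments would be the iframe's `contentWindow` and a `ServiceWorker` object obtained through `navigator.serviceWorker`, and the helper would presumably run again whenever the page is told that the service worker restarted.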

We are moving translation to the offscreen sandbox because the sandbox
iframes on content pages cause issues on some websites: their pages
refuse to load when their loading code finds unexpected elements.

Also:
Added a wrapper around Zotero.Translate.Web which makes translate only be
concerned with scraping the webpage (i.e. the stuff we need to be able
to run eval for). Saving scraped items to Zotero and UI notifications
are now handled independently.

This was done because
1. The Zotero Translate architecture has no business in doing that stuff
   anyway
2. If ItemSaver runs in the Offscreen Translate page it then needs to
   be able to independently send progress callback messages to the tab
   where translation occurs, but we have no way to pass the tab
   information with which to communicate to ItemSaver if it is managed
   by Zotero.Translate, without changing the translate repo. Passing
   around that information via Zotero.Translate is dumb and creates even
   more insane and difficult to track interdependencies in translate
   architecture.

As part of this, some refactoring of Zotero.Inject also occurred, as that
namespace has troubled me for a while: more and more unrelated,
hard-to-place functions got crammed in there. All page-saving
functionality has been moved to pageSaving.js.
adomasven committed Jan 7, 2025
1 parent de45382 commit 315239f
Showing 25 changed files with 1,605 additions and 1,161 deletions.
7 changes: 4 additions & 3 deletions gulpfile.js
@@ -65,9 +65,9 @@ var injectInclude = [
 	'translate/rdf/identity.js',
 	'translate/rdf/rdfparser.js',
 	'translate/translation/translate.js',
-	'translate/translation/translate_item.js',
 	'translate/translator.js',
 	'translate/utilities_translate.js',
+	'translate_item.js',
 	'inject/http.js',
 	'inject/sandboxManager.js',
 	'integration/connectorIntegration.js',
'inject/http.js',
'inject/sandboxManager.js',
'integration/connectorIntegration.js',
@@ -101,7 +101,7 @@ var injectIncludeBrowserExt = ['browser-polyfill.js'].concat(
 var injectIncludeManifestV3 = ['browser-polyfill.js'].concat(
 	injectInclude,
 	['api.js'],
-	['translateSandbox/translateSandboxFunctionOverrides.js', 'translateSandbox/translateSandboxManager.js'],
+	['inject/virtualOffscreenTranslate.js'],
 	injectIncludeLast);

 var backgroundInclude = [
@@ -149,7 +149,8 @@ var backgroundIncludeBrowserExt = ['browser-polyfill.js'].concat(backgroundInclu
 	'webRequestIntercept.js',
 	'contentTypeHandler.js',
 	'saveWithoutProgressWindow.js',
-	'translateSandbox/translateBlocklistManager.js'
+	'messagingGeneric.js',
+	'offscreen/offscreenFunctionOverrides.js', 'background/offscreenManager.js',
 ]);

 function reloadChromeExtensionsTab(cb) {
3 changes: 2 additions & 1 deletion src/browserExt/background.js
@@ -52,12 +52,13 @@ Zotero.Connector_Browser = new function() {

 	this.init = async function() {
 		if (Zotero.isManifestV3) {
-			if (!Zotero.isFirefox) {
+			if (Zotero.isChromium) {
 				// Chrome recently stopped displaying context menus on button right-click
 				// with 'browser_action' as context. It's supposed to work, so maybe a bug
 				// in Chrome, but let's fix it on our side. Firefox, meanwhile, throws if 'action'
 				// is included in the context list.
 				buttonContext.push('action');
+				await Zotero.OffscreenManager.init();
 			}
 			this._tabInfo = _tabInfo = await Zotero.Utilities.Connector.createMV3PersistentObject('tabInfo');
 			setInterval(async () => {
139 changes: 139 additions & 0 deletions src/browserExt/background/offscreenManager.js
@@ -0,0 +1,139 @@
/*
***** BEGIN LICENSE BLOCK *****
Copyright © 2024 Corporation for Digital Scholarship
Vienna, Virginia, USA
http://zotero.org
This file is part of Zotero.
Zotero is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
Zotero is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with Zotero. If not, see <http://www.gnu.org/licenses/>.
***** END LICENSE BLOCK *****
*/

/**
 * Part of the background page. Manages the offscreen page.
 */
Zotero.OffscreenManager = {
	initPromise: null,
	offscreenPageInitialized: false,
	messagingDeferred: Zotero.Promise.defer(),
	offscreenUrl: 'offscreen/offscreen.html',

	async init() {
		const offscreenPage = await this.getOffscreenPage();
		if (!offscreenPage) {
			// Make sure we're waiting for a new deferred
			this.messagingDeferred = Zotero.Promise.defer();
			// Create offscreen document
			await browser.offscreen.createDocument({
				url: this.offscreenUrl,
				reasons: ['DOM_SCRAPING', 'DOM_PARSER'],
				justification: 'Scraping the document with Zotero Translators',
			});
		}
		else {
			// Per the Chrome docs, the service worker can restart without the offscreen
			// page being unloaded, although it is not clear whether this happens in practice.
			offscreenPage.postMessage('service-worker-restarted');
		}
		await this.messagingDeferred.promise;

		// The listeners below only need to be set up once
		if (this.offscreenPageInitialized) return;
		this.offscreenPageInitialized = true;

		// Watch for the browserext tab-close event and inform translate in the offscreen page
		browser.tabs.onRemoved.addListener((tabId, removeInfo) => {
			this.sendMessage('tabClosed', tabId);
		});

		// Run cleanup every 15 minutes
		setInterval(() => this.cleanup(), 15*60e3);
		Zotero.debug('OffscreenManager: offscreen page initialized');
	},

	async sendMessage(message, payload, tab, frameId) {
		const offscreenPage = await this.getOffscreenPage();
		if (!offscreenPage) {
			await this.init();
		}
		if (tab) {
			payload.push(tab.id, frameId);
		}
		return await this._messaging.sendMessage(message, payload);
	},

	async addMessageListener(...args) {
		const offscreenPage = await this.getOffscreenPage();
		if (!offscreenPage) {
			await this.init();
		}
		return this._messaging.addMessageListener(...args);
	},

	/**
	 * The onTabRemoved handler should make sure the offscreen page doesn't hold on to
	 * dead translate instances, and the browser should kill the offscreen page every
	 * now and then anyway, but we want to be extra sure we're not leaking memory
	 */
	async cleanup() {
		const offscreenPage = await this.getOffscreenPage();
		if (!offscreenPage) return false;
		let tabs = await browser.tabs.query({status: "complete", windowType: "normal"});
		let cleanedUpTabIds = await this.sendMessage('translateCleanup', tabs.map(tab => tab.id));
		if (cleanedUpTabIds.length > 0) {
			Zotero.logError(new Error(`OffscreenManager: manually cleaned up translates that were kept `
				+ `alive after onTabRemoved ${JSON.stringify(cleanedUpTabIds)}`));
		}
	},

	async getOffscreenPage() {
		const matchedClients = await self.clients.matchAll();
		return matchedClients.find(client => client.url.includes(this.offscreenUrl));

	}
}

// Listener needs to be added at worker script initialization
self.onmessage = async (e) => {
	if (e.data === 'offscreen-port') {
		Zotero.debug('OffscreenManager: received the offscreen page port')
		// Resolve _initMessaging() in offscreenSandbox.js
		let messagingOptions = {
			handlerFunctionOverrides: OFFSCREEN_BACKGROUND_OVERRIDES,
		}
		messagingOptions.sendMessage = (...args) => {
			e.ports[0].postMessage(args)
		};
		messagingOptions.addMessageListener = (fn) => {
			e.ports[0].onmessage = (e) => fn(e.data);
		};
		// If the offscreen document got killed by the browser and we restarted it,
		// we only need to swap in the new port functions. Creating a new messaging
		// instance would discard previously added message listeners.
		if (Zotero.OffscreenManager._messaging) {
			Zotero.OffscreenManager._messaging.reinit(messagingOptions);
		}
		else {
			Zotero.OffscreenManager._messaging = new Zotero.MessagingGeneric(messagingOptions);
		}
		Zotero.debug('OffscreenManager: messaging initialized')
		e.ports[0].postMessage(null);
		await new Promise(resolve => Zotero.OffscreenManager._messaging.addMessageListener('offscreen-sandbox-initialized', resolve));
		Zotero.debug('OffscreenManager: offscreen sandbox initialized message received')
		Zotero.OffscreenManager.messagingDeferred.resolve();
	}
}
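
The handshake in offscreenManager.js relies on `messagingDeferred`: `init()` swaps in a fresh deferred before creating the offscreen document, and the `onmessage` handler resolves it once the sandbox reports in. A minimal sketch of that pattern follows, assuming `Zotero.Promise.defer()` behaves like the classic deferred shown here. `defer`, `manager`, `beginHandshake` and `completeHandshake` are hypothetical names used only for illustration.

```javascript
// Classic deferred: a promise plus the functions that settle it.
function defer() {
	let resolve, reject;
	const promise = new Promise((res, rej) => {
		resolve = res;
		reject = rej;
	});
	return { promise, resolve, reject };
}

// Hypothetical stand-in for the handshake state kept by OffscreenManager.
const manager = {
	messagingDeferred: defer(),

	// Called before (re)creating the offscreen document, so that callers
	// wait on a fresh promise instead of one from an earlier handshake.
	beginHandshake() {
		this.messagingDeferred = defer();
		return this.messagingDeferred.promise;
	},

	// Called when the sandbox reports that it has finished initializing.
	completeHandshake() {
		this.messagingDeferred.resolve();
	},
};
```

Replacing the deferred rather than reusing it matters because a promise can only settle once: a deferred left over from a previous offscreen document would already be resolved and would let callers through before the new sandbox is ready.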
123 changes: 123 additions & 0 deletions src/browserExt/inject/virtualOffscreenTranslate.js
@@ -0,0 +1,123 @@
/*
***** BEGIN LICENSE BLOCK *****
Copyright © 2024 Corporation for Digital Scholarship
Vienna, Virginia, USA
http://zotero.org
This file is part of Zotero.
Zotero is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
Zotero is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with Zotero. If not, see <http://www.gnu.org/licenses/>.
***** END LICENSE BLOCK *****
*/

// A virtual translate that offloads translation to the offscreen page
Zotero.VirtualOffscreenTranslate = class {
	translateDoc = null;

	/**
	 * @returns {Promise<Zotero.VirtualOffscreenTranslate>}
	 */
	static async create() {
		let translate = new Zotero.VirtualOffscreenTranslate();
		await translate.sendMessage('Translate.new');
		// Forward any Zotero.Translate.Web method that is not implemented
		// on this class to the offscreen page
		return new Proxy(translate, {
			get: (target, property, ...args) => {
				if (!target[property] && (property in Zotero.Translate.Web.prototype)) {
					return (...args) => {
						return target.sendMessage(`Translate.${property}`, args);
					}
				}
				return Reflect.get(target, property, ...args);
			}
		});
	}

	constructor() {
		// Handling for translate.monitorDOMChanges
		let mutationObserver;
		this.addMessageListener('MutationObserver.observe', ([selector, config]) => {
			// We allow at most one observer, or we'll have to keep track of them. Websites
			// that need this will only have one translator applying an observer anyway.
			if (mutationObserver) mutationObserver.disconnect();
			mutationObserver = new MutationObserver(() => {
				// We disconnect immediately because that's what monitorDOMChanges does. If we
				// didn't, more mutations could occur during the async messaging round trip
				// and result in pageModified being called multiple times.
				mutationObserver.disconnect();
				return this.sendMessage('MutationObserver.trigger');
			});
			const node = this.translateDoc.querySelector(selector);
			mutationObserver.observe(node, config);
		});
	}

	getProxy() {
		return this.sendMessage('Translate.getProxy');
	}

	async setHandler(name, callback) {
		let id = Zotero.Utilities.randomString(10);
		await this.sendMessage('Translate.setHandler', [name, id]);
		this.addMessageListener(`Translate.onHandler.${name}`, ([remoteId, args]) => {
			if (name == 'select') {
				args[2] = (...args) => {
					this.sendMessage('Translate.selectCallback', [id, args]);
				}
			}
			if (remoteId == id) {
				callback(...args);
			}
		});
	}

	setDocument(doc, updateLiveElements=false) {
		this.translateDoc = doc;
		if (updateLiveElements) {
			// outerHTML serializes attributes, not live state, so copy each
			// checkbox's current checked state into its attribute first
			for (const checkbox of doc.querySelectorAll('input[type=checkbox]')) {
				if (checkbox.checked) {
					checkbox.setAttribute('checked', '');
				}
				else {
					checkbox.removeAttribute('checked');
				}
			}
		}
		return this.sendMessage('Translate.setDocument', [doc.documentElement.outerHTML, doc.location.href, doc.cookie]);
	}

	async setTranslator(translators) {
		if (!Array.isArray(translators)) {
			translators = [translators];
		}
		translators = translators.map(t => t.serialize(Zotero.Translator.TRANSLATOR_PASSING_PROPERTIES));
		return this.sendMessage('Translate.setTranslator', [translators])
	}

	async getTranslators(...args) {
		let translators = await this.sendMessage('Translate.getTranslators', args);
		return translators.map(translator => new Zotero.Translator(translator));
	}

	sendMessage(message, payload=[]) {
		return Zotero.OffscreenManager.sendMessage(message, payload)
	}

	addMessageListener(...args) {
		// Listen via the background page messaging, which is how OffscreenManager sends
		// us messages, since it cannot send messages directly to tabs itself
		return Zotero.Messaging.addMessageListener(...args)
	}
}
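
The `Proxy` returned by `create()` forwards every method that exists on `Zotero.Translate.Web.prototype` but is not implemented on the virtual translate itself. The sketch below isolates that idea under assumptions: `RemoteApi`, `LocalStub` and `createForwardingProxy` are hypothetical stand-ins, and `send` takes the place of `sendMessage`.

```javascript
// Hypothetical remote interface; stands in for Zotero.Translate.Web.prototype.
class RemoteApi {
	translate() {}
	detect() {}
}

// Hypothetical local object; stands in for Zotero.VirtualOffscreenTranslate.
class LocalStub {
	constructor(send) {
		this.send = send;
	}

	// Implemented locally, so the proxy must not forward it.
	detect() {
		return 'local detect';
	}
}

function createForwardingProxy(stub) {
	return new Proxy(stub, {
		get(target, property, receiver) {
			// Forward only what the stub lacks but the remote interface declares.
			if (!target[property] && property in RemoteApi.prototype) {
				return (...args) => target.send(`Translate.${property}`, args);
			}
			return Reflect.get(target, property, receiver);
		}
	});
}
```

Checking the remote prototype, instead of forwarding every unknown property, keeps lookups such as `then` returning `undefined`. That matters when the proxy is returned from an async function, because the promise machinery probes the returned object for a `then` method.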
10 changes: 6 additions & 4 deletions src/browserExt/manifest-v3.json
@@ -13,7 +13,7 @@
 		"default_title": "Save to Zotero"
 	},
 	"host_permissions": ["http://*/*", "https://*/*"],
-	"permissions": ["tabs", "contextMenus", "cookies", "scripting",
+	"permissions": ["tabs", "contextMenus", "cookies", "scripting", "offscreen",
 		"webRequest", "declarativeNetRequest", "webNavigation", "storage"],
 	"declarative_net_request": {
 		"rule_resources": [{
@@ -51,16 +51,18 @@
 		}
 	],
 	"sandbox": {
-		"pages": ["translateSandbox/translateSandbox.html"]
+		"pages": ["offscreen/offscreenSandbox.html"]
 	},
 	"web_accessible_resources": [{
 		"resources": [
 			"images/*",
 			"progressWindow/progressWindow.html",
 			"modalPrompt/modalPrompt.html",
-			"translateSandbox/translateSandbox.html",
 			"test/data/journalArticle-single.html",
-			"lib/SingleFile/single-file-hooks-frames.js"
+			"lib/SingleFile/single-file-hooks-frames.js",
+			"inject/pageSaving.js",
+			"translateWeb.js",
+			"itemSaver.js"
 		],
 		"matches": ["http://*/*", "https://*/*"]
 	}],
5 changes: 4 additions & 1 deletion src/browserExt/manifest.json
@@ -50,7 +50,10 @@
 		"progressWindow/progressWindow.html",
 		"modalPrompt/modalPrompt.html",
 		"test/data/journalArticle-single.html",
-		"lib/SingleFile/single-file-hooks-frames.js"
+		"lib/SingleFile/single-file-hooks-frames.js",
+		"inject/pageSaving.js",
+		"translateWeb.js",
+		"itemSaver.js"
 	],
 	"content_security_policy": "script-src 'self' 'unsafe-eval'; object-src 'self'",
 	"homepage_url": "https://www.zotero.org/",
13 changes: 12 additions & 1 deletion src/browserExt/messagingGeneric.js
@@ -129,6 +129,17 @@ Zotero.MessagingGeneric = class {
 		this._initMessageListener();
 	}

+	// Reinit messaging without resetting existing message listeners. Needed if the existing connection
+	// gets severed for some reason.
+	reinit(options) {
+		if (!options.sendMessage || !options.addMessageListener) {
+			throw new Error('Zotero.MessagingGeneric: mandatory reinit() options missing');
+		}
+		this._sendMessage = options.sendMessage;
+		this._addMessageListener = options.addMessageListener;
+		this._initMessageListener();
+	}
+
 	// Initialize message handler
 	_initMessageListener() {
 		this._addMessageListener(async (args) => {
@@ -149,7 +160,7 @@ Zotero.MessagingGeneric = class {
 				if (this._options.supportsResponse) {
 					return result;
 				}
-				else if (result !== undefined) {
+				else {
 					this._sendMessage(`response`, result, messageId);
 				}
 			}
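
The point of `reinit()` is that the registered listeners live on the messaging object while the transport functions can be replaced underneath them. The sketch below shows that separation in a heavily reduced form. `TinyMessaging` and `fakeTransport` are hypothetical and do not reproduce the real `Zotero.MessagingGeneric` internals such as message IDs or responses.

```javascript
// Hypothetical, heavily reduced stand-in for Zotero.MessagingGeneric.
class TinyMessaging {
	constructor(transport) {
		this.listeners = new Map();
		this.attach(transport);
	}

	// Swap the transport while keeping this.listeners intact.
	reinit(transport) {
		this.attach(transport);
	}

	attach({ sendMessage, addMessageListener }) {
		if (!sendMessage || !addMessageListener) {
			throw new Error('transport functions missing');
		}
		this.send = sendMessage;
		// Dispatch every incoming [name, payload] pair to the registered handler.
		addMessageListener(([name, payload]) => {
			const handler = this.listeners.get(name);
			if (handler) handler(payload);
		});
	}

	addMessageListener(name, handler) {
		this.listeners.set(name, handler);
	}
}

// Hypothetical in-memory transport, used only to exercise the class.
function fakeTransport() {
	const transport = {
		sent: [],
		sendMessage: (...args) => transport.sent.push(args),
		addMessageListener: (fn) => { transport.deliver = fn; },
	};
	return transport;
}
```

A handler registered before `reinit()` keeps firing for messages that arrive over the new transport, which is the property the service worker depends on when the offscreen document is recreated and hands over a new port.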