Add Translator for TinRead OPACs #3124

Closed
adam3smith opened this issue Aug 30, 2023 · 7 comments
Labels
Difficulty: Easy New Translator Pull requests for new translators

Comments

@adam3smith adam3smith added New Translator Pull requests for new translators Difficulty: Easy labels Aug 30, 2023
@brendan-oconnell brendan-oconnell self-assigned this Sep 26, 2023
@brendan-oconnell
Contributor

@adam3smith I've taken a look at this, and the MARC, while easily accessible, is tag-structured in HTML in a way that makes it difficult to write a querySelectorAll call. Most catalogs with MARC put it in tables that can be read as a tree structure, but here's an example "row" from TinRead:

` 100
0

      <b>$6</b>&nbsp;
      137697
       
      <b>$a</b>&nbsp;
      Carstea, Gheorghe
       
      <b>$u</b>&nbsp;
      Academia de Studii Economice din Bucuresti. Facultatea de Management, Departamentul de Management
      
      <br/> `

Since it just uses line breaks and the data isn't structured in rows, I can't figure out how to easily select it, without resorting to writing loops.

On the other hand, the "Etichetat" view (first tab on the catalog page) is nicely structured in a table, but it isn't MARC, so it would require writing more code and couldn't take advantage of the MARC translator. Which approach do you recommend?

@AbeJellinek
Member

Maybe something like:

let root = doc.querySelector('#marc li');
// Walk every child node, text nodes included, in document order
for (let child of root.childNodes) {
	if (child.nodeType === Node.ELEMENT_NODE && child.tagName === 'B') {
		if (child.textContent.startsWith('$')) {
			let subtag = child.textContent;
			// do something with the subtag
		}
		else {
			let tag = child.textContent;
			// do something with the tag
		}
	}
	else {
		let content = child.textContent;
		// this is the content of the last subtag - do something with it
	}
}
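
To make the placeholder comments in the loop above concrete, here is a rough sketch of how the flat run of nodes could be grouped into fields and subfields. It is not what the final translator does. The function name `parseFlatMarc` and the output shape are made up for illustration, and it assumes (as the loop above does) that field tags and `$x` markers sit in `<b>` elements, with any text before the first subfield being the indicators. It takes any iterable of node-like objects, so it can be tried outside a browser.

```javascript
// Sketch only: `parseFlatMarc` and the returned object shape are hypothetical.
const ELEMENT_NODE = 1; // same numeric value as Node.ELEMENT_NODE in the DOM

function parseFlatMarc(nodes) {
	let fields = [];
	let field = null; // field currently being filled
	let subfield = null; // subfield currently being filled
	for (let node of nodes) {
		// trim() also strips the non-breaking spaces left by &nbsp;
		let text = node.textContent.trim();
		if (node.nodeType === ELEMENT_NODE && node.tagName === 'B') {
			if (text.startsWith('$')) {
				// Subfield marker such as "$a"; ignore it if no field is open
				if (!field) continue;
				subfield = { code: text.slice(1), value: '' };
				field.subfields.push(subfield);
			}
			else {
				// Field tag such as "100" starts a new field
				field = { tag: text, indicators: '', subfields: [] };
				fields.push(field);
				subfield = null;
			}
		}
		else if (text) {
			if (subfield) {
				// Content belonging to the most recent subfield
				subfield.value += (subfield.value ? ' ' : '') + text;
			}
			else if (field) {
				// Assumption: text between the tag and the first subfield is the indicators
				field.indicators += text;
			}
		}
	}
	return fields;
}
```

In a translator this would be called as `parseFlatMarc(root.childNodes)`. The result would still have to be turned into a MARC record before it could be handed to the MARC translator.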

@franklindyer
Contributor

franklindyer commented Jan 10, 2024

It looks like the MARC can actually be exported in the MARCXML format by clicking the Exportă button (which has id exportBibs). This brings up a form allowing you to select the XML format and download the info as a .xml file. I've tested out one of these XML files with MARCXML.js and it seems to parse just fine.

Maybe I'll try writing a translator to grab the XML from the export button and then defer to the MARCXML translator?

@franklindyer
Contributor

Sure enough, the following works for me:

function doWeb(doc, url) {
	doc.getElementById("DirectLink").click();
	let exportButton = doc.getElementById("exportBibs");
	let marcUrl = exportButton.href;

	ZU.doGet(marcUrl, function(result) {
		var translator = Zotero.loadTranslator("import");
		translator.setTranslator("edd87d07-9194-42f8-b2ad-997c4c7deefd");
		translator.setString(result);
		translator.setHandler("itemDone", function (obj, item) {
			finalize(doc, item);
			item.complete();
		});
		translator.translate();
	});
}

I'll see if I can get detectWeb written as well, and open a pull request.

@adam3smith
Collaborator Author

Have a look at the template functions in Scaffold/Translator Editor. New translators should use async functions and the async requestText commands to load the text. This is actually easier to read and code once you've seen the syntax, because you no longer have to keep track of the callback. Beyond that, yes, a PR for this would be great.
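
For illustration, the callback version earlier in this thread might look roughly like this in the async style. This is a sketch, not the code from the eventual PR. It assumes `requestText` and `Zotero.loadTranslator` are supplied by the translator sandbox, and `getMarcXmlUrl` is a hypothetical helper that just reads the `href` of the `exportBibs` button as the earlier snippet did. The `finalize` step from that snippet is left out because its body was never shown.

```javascript
// Sketch only. `requestText` and `Zotero` come from the Zotero translator
// sandbox; `getMarcXmlUrl` is a hypothetical helper, not part of any API.
function getMarcXmlUrl(doc) {
	return doc.getElementById('exportBibs').href;
}

async function scrape(doc, url) {
	let marcUrl = getMarcXmlUrl(doc);
	// Await the export text instead of nesting a callback
	let marcXml = await requestText(marcUrl);

	let translator = Zotero.loadTranslator('import');
	// Same translator ID as in the callback version above
	translator.setTranslator('edd87d07-9194-42f8-b2ad-997c4c7deefd');
	translator.setString(marcXml);
	translator.setHandler('itemDone', (_obj, item) => {
		item.complete();
	});
	await translator.translate();
}
```

With this shape, `doWeb` can itself be an async function that simply awaits `scrape(doc, url)`.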

@franklindyer
Contributor

franklindyer commented Jan 10, 2024

Got it! Went ahead and opened a PR. Right now all of my test cases pass (though they can be kind of hit-or-miss due to a race condition), and linter tests also pass. Let me know if there's anything else I need to fix here!

The fact that these functions are async now is handy; it actually solved a problem I was having (needing to wait for a dialog box to open before getting the MARCXML link). I still wonder whether there's a better way of getting this link, though.

@brendan-oconnell brendan-oconnell removed their assignment Jan 10, 2024
@AbeJellinek
Member

Fixed by #3223
