Woofmagic/JSON_Chemical_Formula_Glossary

JSON_Chemical_Formulae_Glossary

A JSON file that lists, for each chemical compound, its formula in "symbolic notation", its English name, and its CAS number. A handy reference for anyone who needs this data in machine-readable form.

Wikipedia hosts a list of "common chemical compounds with chemical formulae and CAS numbers, indexed by formula"[1]. A cursory online search for the same information in JSON format turned up nothing, even though data like this ought to be readily available as JSON.

To avoid reinventing the wheel, a search was conducted for existing tools that convert Wikipedia table content to a more data-science-friendly file type such as CSV or JSON. (Although a JSON file was the ultimate goal, a tool that output CSV would have sufficed, since converting CSV to JSON is trivial.) Thankfully, there was an extremely useful tool that promised to convert a Wikipedia table to a JSON file. That resource is located here, online, and here, on GitHub.

However, even after employing JSON Formatter and Validator to prettify the JSON, the output file was undesirable. The "caption" key seemed to be empty, and the "data" key was an array of arrays. Here is an example of the output file:

{
   "caption":"",
   "data":[
      [
         "Ac2O3",
         "actinium(III) oxide",
         "12002-61-8"
      ],
      [
         "AgBF4",
         "Silver tetrafluoroborate",
         "14104-20-2"
      ],
      ...
   ]
}

The desired final file structure must have consistent key-value pairs and completely eliminate any redundancies. For example:

[
  {
     "chemical":"Ac2O3",
     "symbol":"actinium(III) oxide",
     "CASnumber":"12002-61-8"
  },
  {
     "chemical":"AgBF4",
     "symbol":"Silver tetrafluoroborate",
     "CASnumber":"14104-20-2"
  },
  ...
]
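The "consistent key-value pairs" requirement can be expressed as a small check. The following is only a sketch: the helper name `hasConsistentKeys` and the constant `EXPECTED_KEYS` are hypothetical and not part of this repository, and the key names are simply taken from the desired structure shown above.

```javascript
// Keys taken from the desired structure shown above.
const EXPECTED_KEYS = ["chemical", "symbol", "CASnumber"];

// Hypothetical helper: returns true when every record has exactly the
// expected keys, each mapped to a non-empty string.
function hasConsistentKeys(records) {
  return records.every(record => {
    const keys = Object.keys(record);
    return (
      keys.length === EXPECTED_KEYS.length &&
      EXPECTED_KEYS.every(
        key => typeof record[key] === "string" && record[key].length > 0
      )
    );
  });
}

const goodRecords = [
  { chemical: "Ac2O3", symbol: "actinium(III) oxide", CASnumber: "12002-61-8" },
  { chemical: "AgBF4", symbol: "Silver tetrafluoroborate", CASnumber: "14104-20-2" },
];
// Missing the CASnumber key, so the check should fail.
const badRecords = [{ chemical: "Ac2O3", symbol: "actinium(III) oxide" }];

console.log(hasConsistentKeys(goodRecords)); // true
console.log(hasConsistentKeys(badRecords)); // false
```

Running a check like this against the final file would catch records with missing or extra keys before the file is published.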

A simple iteration procedure extracted the necessary strings from the undesirable file and reconstructed them in the desired format. Although more efficient ways to perform this task certainly exist, a doubly nested forEach loop pushing to an initially empty array in the Google Chrome Dev Tools console produced the desired output JSON.

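// returnedAPICall holds the parsed converter output: an array of table
// objects, each with a "data" array of [formula, name, CAS number] rows.
const undesirableFileStructure = returnedAPICall;
const newStuff = [];
undesirableFileStructure.forEach(itemInOriginalFile => {
  itemInOriginalFile.data.forEach(chemicalRow => {
    // Key names match the desired structure shown earlier.
    newStuff.push({
      "chemical": chemicalRow[0],
      "symbol": chemicalRow[1],
      "CASnumber": chemicalRow[2]
    });
  });
});
// copy() is a Chrome Dev Tools console utility that places the string on the clipboard.
copy(JSON.stringify(newStuff));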
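As one of the "more efficient ways" alluded to above, the same transformation can be sketched as a pure function that runs outside the browser console, for instance in Node.js. This is an illustration only, not the code that produced the file: the function name `flattenTables` and the sample input are hypothetical, and the input is assumed to be an array of table objects of the form `{ "caption": "", "data": [[...], ...] }`.

```javascript
// Sketch: flatten an array of scraped tables into a flat list of records.
// Assumption: each table looks like
//   { caption: "", data: [[formula, name, casNumber], ...] }
function flattenTables(tables) {
  return tables.flatMap(table =>
    // Destructure each three-element row into the keys of the desired format.
    table.data.map(([chemical, symbol, CASnumber]) => ({
      chemical,
      symbol,
      CASnumber,
    }))
  );
}

// Sample input built from the two rows shown earlier in this README.
const sampleTables = [
  {
    caption: "",
    data: [
      ["Ac2O3", "actinium(III) oxide", "12002-61-8"],
      ["AgBF4", "Silver tetrafluoroborate", "14104-20-2"],
    ],
  },
];

const records = flattenTables(sampleTables);
console.log(JSON.stringify(records, null, 2));
```

Using `flatMap` with array destructuring avoids the mutable accumulator and makes the positional meaning of each row element explicit.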

The resulting JSON file is now available.

References

  • [1] 'Glossary of chemical formulae', Wikipedia, Wikimedia Foundation, 22 May 2021, https://en.wikipedia.org/wiki/Glossary_of_chemical_formulae. Retrieved June 15, 2021.
  • Parenthetical: for how to copy a global variable from Chrome Dev Tools to the clipboard, see https://stackoverflow.com/a/30027011
