Data licence addition #257

Merged 30 commits on Apr 18, 2024
Changes from 26 commits
Commits
47e27a8
Add optional licence argument to add_additional_resource
ItIsJordan Jan 9, 2024
14ccd11
Add licence function to Table class
ItIsJordan Jan 9, 2024
de8661a
Add test for Table data licence function
ItIsJordan Jan 9, 2024
04464bf
Fix incorrect type AdditionalResource docs
ItIsJordan Jan 10, 2024
ea1335f
Update add_additional_resource test
ItIsJordan Jan 10, 2024
1383a3c
Complete test_copy_files
ItIsJordan Jan 10, 2024
b83359c
Add validation check to additional resource licence add
ItIsJordan Jan 12, 2024
63f820d
Update error in TestTable
ItIsJordan Jan 12, 2024
efedcaf
Add additional resource licence check test
ItIsJordan Jan 12, 2024
d880784
Update usage documentation to include licences
ItIsJordan Jan 24, 2024
919a5ca
Some fixes and cleaning up for pylint
GraemeWatt Feb 14, 2024
2cf9046
Remove get_license() from Submission class
GraemeWatt Feb 14, 2024
7211f55
examples: show how to add license information
GraemeWatt Feb 15, 2024
3918d70
tests: fix isinstance, add test, suppress keys
GraemeWatt Feb 15, 2024
f440480
Merge branch 'main' into data-licence-update
GraemeWatt Feb 15, 2024
eb176e3
Merge branch 'main' into data-license-update
GraemeWatt Feb 16, 2024
d75d2c0
Merge branch 'main' into data-license-update
ItIsJordan Feb 27, 2024
d84bef6
Update test data
ItIsJordan Mar 25, 2024
5881aef
Merge branch 'main' into data-license-update
ItIsJordan Mar 25, 2024
4d06096
Merge branch 'main' into data-license-update
ItIsJordan Apr 16, 2024
14f65e3
Update test for coverage
ItIsJordan Apr 16, 2024
edcf916
Pylint fixes
ItIsJordan Apr 16, 2024
c579a96
Update usage.rst
ItIsJordan Apr 16, 2024
60bcbe0
Pylint fix
ItIsJordan Apr 17, 2024
1968670
Update usage documentation
ItIsJordan Apr 17, 2024
e4d7e7e
Remove duplicate ref in usage.rst
ItIsJordan Apr 17, 2024
3742535
Remove license addition from Getting_started.ipynb
GraemeWatt Apr 18, 2024
ec1359d
Merge branch 'main' into data-license-update
GraemeWatt Apr 18, 2024
d87017d
Revert "Remove get_license() from Submission class"
GraemeWatt Apr 18, 2024
2ce0035
Change default license from "CC BY 4.0" to "CC0"
GraemeWatt Apr 18, 2024
22 changes: 22 additions & 0 deletions docs/usage.rst
@@ -123,13 +123,21 @@ Additional resources, hosted either externally or locally, can be linked with th

sub.add_additional_resource("Web page with auxiliary material", "https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PAPERS/STDM-2012-02/")
sub.add_additional_resource("Some file", "root_file.root", copy_file=True)
sub.add_additional_resource("Some file", "root_file.root", copy_file=True, resource_license={"name": "CC BY 4.0", "url": "https://creativecommons.org/licenses/by/4.0/", "description": "This license enables reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator."})
sub.add_additional_resource("Archive of full likelihoods in the HistFactory JSON format", "Likelihoods.tar.gz", copy_file=True, file_type="HistFactory")

The first argument is a ``description`` and the second is the ``location`` of the external link or local resource file.
The optional argument ``copy_file=True`` (default value of ``False``) will copy a local file into the output directory.
The optional argument ``resource_license`` can be used to define a data license for an additional resource.
The ``resource_license`` is in the form of a dictionary with mandatory string ``name`` and ``url`` values, and an optional ``description``.
The optional argument ``file_type="HistFactory"`` (default value of ``None``) can be used to identify statistical models provided in the HistFactory JSON
format rather than relying on certain trigger words in the ``description`` (see `pyhf section of submission documentation`_).

**Please note:** The default license applied to all data uploaded to HEPData is `CC0`_. You do not
need to specify a license for a resource file unless it differs from `CC0`_.

.. _`CC0`: https://creativecommons.org/public-domain/cc0/
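
As an illustration of the dictionary shape described above (mandatory string ``name`` and ``url``, optional ``description``), a minimal standalone sketch follows. The helper ``check_license`` is hypothetical and is not part of hepdata_lib; the library's own validation may differ in which errors it raises and in what else it checks.

```python
def check_license(resource_license):
    """Hypothetical helper, NOT hepdata_lib's own validation.

    Checks the shape described in the usage docs: a dict with mandatory
    string 'name' and 'url' values and an optional string 'description'.
    """
    if not isinstance(resource_license, dict):
        raise TypeError("resource_license must be a dictionary")

    # 'name' and 'url' are mandatory and must be strings.
    for key in ("name", "url"):
        if not isinstance(resource_license.get(key), str):
            raise ValueError(f"resource_license needs a string '{key}' value")

    # 'description' may be omitted, but must be a string when present.
    if "description" in resource_license and not isinstance(
        resource_license["description"], str
    ):
        raise ValueError("'description' must be a string when given")

    return resource_license


cc_by = {
    "name": "CC BY 4.0",
    "url": "https://creativecommons.org/licenses/by/4.0/",
}
check_license(cc_by)  # passes: both mandatory keys are present
```

A dictionary that passes this check has the shape that the ``resource_license`` argument is documented to expect.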

The ``add_link`` function can alternatively be used to add a link to an external resource:

::
@@ -320,6 +328,20 @@ The documentation for this feature can be found here: `Linking tables`_.

.. _`Linking tables`: https://hepdata-submission.readthedocs.io/en/latest/bidirectional.html#linking-tables

Adding a data license
^^^^^^^^^^^^^^^^^^^^^

You can add data license information to a table using the ``add_data_license`` function of the Table class.
This function takes mandatory ``name`` and ``url`` string arguments, as well as an optional ``description``.

**Please note:** The default license applied to all data uploaded to HEPData is `CC0`_. You do not
need to specify a license for a data table unless it differs from `CC0`_.

::

table.add_data_license("CC BY 4.0", "https://creativecommons.org/licenses/by/4.0/")
table.add_data_license("CC BY 4.0", "https://creativecommons.org/licenses/by/4.0/", "This license enables reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator.")
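
To show how these arguments relate to the ``data_license`` entry in ``submission.yaml``, here is a rough standalone sketch. The function ``build_data_license`` is a hypothetical stand-in, not the hepdata_lib implementation. It assumes that the optional ``description`` is simply left out of the entry when it is not supplied.

```python
def build_data_license(name, url, description=None):
    """Hypothetical stand-in for the entry that add_data_license records.

    Assumption: 'description' is omitted from the entry when not supplied.
    """
    entry = {"name": name, "url": url}
    if description is not None:
        entry["description"] = description
    return entry


# Only the two mandatory arguments: the entry has no 'description' key.
print(build_data_license("CC BY 4.0", "https://creativecommons.org/licenses/by/4.0/"))
```

Called with only ``name`` and ``url``, the sketch yields an entry with just those two keys.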

Uncertainties
+++++++++++++

189 changes: 182 additions & 7 deletions examples/Getting_started.ipynb
@@ -265,25 +265,50 @@
"If you want, the original data file can be attached to the table as an additional resource file."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The default license for HEPData records is [CC0](https://creativecommons.org/publicdomain/zero/1.0/legalcode) (see [Terms of Use](https://www.hepdata.net/terms)), but it is possible to specify a different license for both additional resource files and data tables."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"table.add_additional_resource(\"Original data file\", \"example_inputs/effacc_signal.txt\", copy_file=True)"
"license = {\"name\": \"CC BY 4.0\", \"url\": \"https://creativecommons.org/licenses/by/4.0/\", \"description\": \"This license enables reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator.\"}"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"table.add_additional_resource(\"Original data file\", \"example_inputs/effacc_signal.txt\", copy_file=True, resource_license=license)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"table.add_data_license(license[\"name\"], license[\"url\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is all that's needed for the table/figure. We still need it to the submission:"
"This is all that's needed for the table/figure. We still need to add it to the submission:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
@@ -299,7 +324,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
@@ -316,7 +341,7 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
@@ -333,7 +358,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 17,
"metadata": {},
"outputs": [
{
@@ -357,7 +382,7 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 18,
"metadata": {},
"outputs": [
{
@@ -372,6 +397,156 @@
"source": [
"!ls example_output"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"---\n",
"additional_resources:\n",
"- description: Created with hepdata_lib 0.14.0\n",
" location: https://doi.org/10.5281/zenodo.1217998\n",
"- description: Webpage with all figures and tables\n",
" location: https://cms-results.web.cern.ch/cms-results/public-results/publications/B2G-16-029/\n",
"- description: arXiv\n",
" location: http://arxiv.org/abs/arXiv:1802.09407\n",
"- description: Original abstract file\n",
" location: abstract.txt\n",
"comment: A search for a new heavy particle decaying to a pair of vector bosons (WW\n",
" or WZ) is presented using data from the CMS detector corresponding to an integrated\n",
" luminosity of $35.9~\\mathrm{fb}^{-1}$ collected in proton-proton collisions at a\n",
" centre-of-mass energy of 13~TeV in 2016. One of the bosons is required to be a W\n",
" boson decaying to $e\\nu$ or $mu\\nu$, while the other boson is required to be reconstructed\n",
" as a single massive jet with substructure compatible with that of a highly-energetic\n",
" quark pair from a W or Z boson decay. The search is performed in the resonance mass\n",
" range between 1.0 and 4.5~TeV. The largest deviation from the background-only hypothesis\n",
" is observed for a mass near 1.4~TeV and corresponds to a local significance of 2.5\n",
" standard deviations. The result is interpreted as an upper bound on the resonance\n",
" production cross section. Comparing the excluded cross section values and the expectations\n",
" from theoretical calculations in the bulk graviton and heavy vector triplet models,\n",
" spin-2 WW resonances with mass smaller than 1.07~TeV and spin-1 WZ resonances lighter\n",
" than 3.05~TeV, respectively, are excluded at 95\\% confidence level.\n",
"record_ids:\n",
"- id: 1657397\n",
" type: inspire\n",
"---\n",
"additional_resources:\n",
"- description: Original data file\n",
" license:\n",
" description: This license enables reusers to distribute, remix, adapt, and build\n",
" upon the material in any medium or format, so long as attribution is given to\n",
" the creator.\n",
" name: CC BY 4.0\n",
" url: https://creativecommons.org/licenses/by/4.0/\n",
" location: effacc_signal.txt\n",
"- description: Image file\n",
" location: signalEffVsMass.png\n",
"- description: Thumbnail image file\n",
" location: thumb_signalEffVsMass.png\n",
"data_file: additional_figure_1.yaml\n",
"data_license:\n",
" name: CC BY 4.0\n",
" url: https://creativecommons.org/licenses/by/4.0/\n",
"description: Signal selection efficiency times acceptance as a function of resonance\n",
" mass for a spin-2 bulk graviton decaying to WW and a spin-1 W' decaying to WZ.\n",
"keywords:\n",
"- name: observables\n",
" values:\n",
" - ACC\n",
" - EFF\n",
"- name: reactions\n",
" values:\n",
" - P P --> GRAVITON --> W+ W-\n",
" - P P --> WPRIME --> W+/W- Z0\n",
"- name: cmenergies\n",
" values:\n",
" - 13000\n",
"location: Data from additional Figure 1\n",
"name: Additional Figure 1\n"
]
}
],
"source": [
"!cat example_output/submission.yaml"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"dependent_variables:\n",
"- header:\n",
" name: Efficiency times acceptance\n",
" qualifiers:\n",
" - name: Efficiency times acceptance\n",
" value: Bulk graviton --> WW\n",
" - name: SQRT(S)\n",
" units: TeV\n",
" value: 13\n",
" values:\n",
" - value: 0.4651\n",
" - value: 0.50336\n",
" - value: 0.5126\n",
" - value: 0.52474\n",
" - value: 0.531\n",
" - value: 0.5391\n",
" - value: 0.54943\n",
" - value: 0.55378\n",
" - value: 0.56216\n",
" - value: 0.56454\n",
" - value: 0.56682\n",
"- header:\n",
" name: Efficiency times acceptance\n",
" qualifiers:\n",
" - name: Efficiency times acceptance\n",
" value: Wprime --> WZ\n",
" - name: SQRT(S)\n",
" units: TeV\n",
" value: 13\n",
" values:\n",
" - value: 0.45136\n",
" - value: 0.5109\n",
" - value: 0.54016\n",
" - value: 0.5513\n",
" - value: 0.56724\n",
" - value: 0.5728\n",
" - value: 0.5856\n",
" - value: 0.58952\n",
" - value: 0.60324\n",
" - value: .nan\n",
" - value: 0.59978\n",
"independent_variables:\n",
"- header:\n",
" name: Resonance mass\n",
" units: GeV\n",
" values:\n",
" - value: 1000.0\n",
" - value: 1200.0\n",
" - value: 1400.0\n",
" - value: 1600.0\n",
" - value: 1800.0\n",
" - value: 2000.0\n",
" - value: 2500.0\n",
" - value: 3000.0\n",
" - value: 3500.0\n",
" - value: 4000.0\n",
" - value: 4500.0\n"
]
}
],
"source": [
"!cat example_output/additional_figure_1.yaml"
]
}
],
"metadata": {