data_dictionary.yml
# Meta Metadata
org: {}
product: {}
dataset:
columns:
name:
summary: Column Name
extra_description: Name of the column exactly as it appears in the dataset.
description:
summary: Column Description
extra_description: A brief, plain-language explanation of what the data in the column means.
values:
summary: "Expected/Allowed Values"
extra_description: >
Specifies whether there is an expected range and/or format of possible values. For example, if the data type is Date & Time,
this field notes whether the timestamp is MM/DD/YYYY or MM/YYYY. If the Column Name is ice cream, this field might
note that values can be Chocolate, Vanilla, or Strawberry.
If relevant, this field also specifies the unit of measurement of the data field, e.g. thousands, millions, $ value, miles, feet, or year.
limitations:
summary: Field Limitations
extra_description: >
Describes any unique characteristics or potential analytical limitations presented by this field, including:
- the reasoning for any null, zero, or empty values in the data;
- whether the data in the column was integrated from another dataset or organization;
- whether the data covers a different time period;
- the source of the column and how the data in the column was generated.
For example, information on how the data in this column was generated can include whether the data was self-reported directly by a person,
system-generated by a database or agency system, derived through analytical manipulation of other fields or records,
or obtained from a different agency.
notes:
summary: Additional Notes
extra_description: >
Provides any additional relevant information about the data in the column, including:
- definitions of acronyms, special terms or codes, or jargon that appear in the field values;
- the meaning of confusing or non-intuitive values in the data;
- how the information in this column relates to information in other columns;
- other unique details about this column.
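# Illustrative sketch only, not part of this dictionary: one way a single column
# might be documented using the five fields described above (name, description,
# values, limitations, notes). The column name and every value below are
# hypothetical, loosely reusing the ice cream example from the `values` text;
# the exact layout a consuming tool expects is an assumption.
#
# - name: ice_cream_flavor
#   description: Flavor of ice cream recorded for the order.
#   values: Chocolate, Vanilla, or Strawberry
#   limitations: Self-reported directly by a person; null when no flavor was recorded.
#   notes: No acronyms or special codes appear in this column.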
revisions:
date:
summary: "Date"
summary:
summary: "Change Highlights"
notes:
summary: "Comments"
attributes:
# Dataset Information
display_name:
summary: "Dataset Name"
extra_description: ""
agency:
summary: "Data Provided by"
extra_description: The name of the NYC agency providing this data to the public.
each_row_is_a:
summary: "Each row is a..."
extra_description: The unit of analysis or level of aggregation of the dataset.
publishing_frequency:
summary: "Publishing Frequency"
extra_description: |
How often changed data is published to this dataset. For an automatically updated dataset, this is the frequency of that automation.
publishing_frequency_details:
summary: "Frequency Details"
extra_description: Additional details about the publishing or data change frequency, if needed.
data_change_frequency:
summary: Data Change Frequency
extra_description: How often the data underlying this dataset is changed.
description:
summary: Dataset Description
extra_description: |
Overview of the information this dataset contains, including overall context and definitions of key terms. This field may include links to supporting datasets, agency websites, or external resources for additional context.
publishing_purpose:
summary: Why is this data collected?
extra_description: |
Purpose behind the collection of this data, including any legal or policy requirements for this data by NYC Executive Order, Local Law, or other policy directive.
data_collection_method:
summary: How is this data collected?
extra_description: |
The methods used to create and update this dataset, including what cleaning or processing was involved prior to dataset publication.
If data collection includes interpreting physical information, this field includes technical details.
If data collection includes fielding applications, requests, or complaints, this field includes details about the forms, applications, and processes used.
potential_uses:
summary: How can this data be used?
extra_description: |
Examples of and/or links to projects or agency operations that have used this dataset.
custom:
oti_extra_notes: |
Where relevant, includes links to online projects, agency websites, visualizations, maps, or dashboards.
What are some questions one might answer using this dataset?
disclaimer:
summary: What are the unique characteristics or limitations of this dataset?
extra_description: |
Unique characteristics of this dataset to be aware of, specifically, constraints or limitations to the use of the data.
projection:
summary: Additional geospatial information
extra_description: |
For any datasets with geospatial data, specify the coordinate reference system or projection used and other relevant details.
# Primer / Internal Page
tags:
summary: Dataset Tags
extra_description: |
A list of comma-separated terms, based on the topic of the dataset, that will link to other datasets with that same tag.
custom:
oti_extra_notes: Tags are used along with the dataset name and description to search NYC Open Data.
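# Illustrative sketch only: the `tags` description above calls for a list of
# comma-separated terms based on the dataset's topic. A filled-in value might
# look like the line below; the terms themselves are hypothetical.
#
# tags: ice cream, desserts, food vendors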
rows_removed:
summary: Are rows removed from this dataset when the data is updated?
extra_description: |
Relates to another Local Law (106 of 2015) on data retention. If the dataset is updated as append or upsert, the answer is Yes.
can_be_automated:
summary: Can this dataset be feasibly automated?
extra_description: ""
attribution_link:
summary: Is this data also present on a website maintained by or on behalf of the agency?
extra_description: ""
custom:
oti_extra_notes: If so, please provide the website URL.
agency_website_data_updated_automatically:
summary: Is the data on the agency's website updated automatically?
extra_description: ""
custom:
oti_extra_notes: Only applicable if the data is also present on a website maintained by or on behalf of the agency