This repository has been archived by the owner on Nov 23, 2023. It is now read-only.

Allow more characters in dataset titles #1974

Open
1 of 35 tasks
billgeo opened this issue Aug 30, 2022 · 0 comments
Labels
user story Something valuable for the user

Comments

billgeo commented Aug 30, 2022

User Story

So that I can create meaningful dataset titles and S3 URLs, as a data maintainer, I want to be able to use other characters (e.g. the full stop (.), the apostrophe (', U+0027) and macronated characters) in the dataset title and therefore in the S3 prefix.

We should consider downstream issues with some characters in Windows, Linux, JSON, etc. (this has been documented in #1975). Update the acceptance criteria when this is done.

Acceptance Criteria

  • Given a dataset title with a full stop (.) or apostrophe (') in it, when a new dataset is created with this dataset title, then the dataset title is accepted, the dataset is created, and files are created with the full stop or apostrophe in the S3 prefix. See Spike: Downstream effects of adding characters to dataset titles #1975 for more information.
  • Given a dataset title with a macronated character in it (e.g. ā, Ō), when a new dataset is created with this dataset title, then the dataset title is accepted, the dataset is created, and files are created with the macronated character in the S3 prefix.
  • Given a dataset with any valid dataset title character, when the user copies the data to their local filesystem (Mac, Linux, Windows), then they can access the files with a file browser.
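
As an illustration of the first two criteria, here is a minimal sketch of a title check. The base character set (ASCII letters, digits, hyphen, underscore) is an assumption, since this issue does not state the current rule, and the names `TITLE_PATTERN` and `is_valid_title` are hypothetical rather than taken from the codebase.

```python
import re
import unicodedata

# Macronated vowels used in te reo Māori, lower and upper case.
MACRONS = "āēīōūĀĒĪŌŪ"

# Assumed base set (letters, digits, "_" and "-") plus the characters
# requested in this story: full stop, apostrophe (U+0027) and macrons.
TITLE_PATTERN = re.compile(rf"[A-Za-z0-9_\-.'{MACRONS}]+")


def is_valid_title(title: str) -> bool:
    """Return True if every character in the title is in the allowed set."""
    # Normalise to NFC so "a" + combining macron (U+0304) is treated the
    # same as the precomposed character "ā" (U+0101).
    normalized = unicodedata.normalize("NFC", title)
    return TITLE_PATTERN.fullmatch(normalized) is not None
```

Normalising before matching is a design choice in this sketch: without it, a title typed with combining macrons would be rejected even though it looks identical to an accepted one.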

Additional context

Discussion with data managers about how they want to name, organise and access their data (particularly in the aerial imagery area) highlights that we should consider adding slashes (/) and other useful characters (such as .) to the allowed characters for a dataset title and therefore its S3 URL/path.

S3 limitations of characters in S3 keys https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html
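
To make the effect on S3 paths concrete, the sketch below shows how a title would look as a UTF-8 key prefix and as a percent-encoded URL path. It assumes the prefix is simply the title followed by a slash, which this issue does not confirm; `describe_prefix` is a hypothetical helper. The 1,024-byte limit on the UTF-8 encoded key comes from the S3 documentation linked above.

```python
from urllib.parse import quote

# S3 object keys may be at most 1,024 bytes long when UTF-8 encoded.
S3_MAX_KEY_BYTES = 1024


def describe_prefix(title: str) -> dict:
    """Summarise how a dataset title behaves as an S3 prefix (assumed "<title>/")."""
    prefix = f"{title}/"
    encoded = prefix.encode("utf-8")
    return {
        "prefix": prefix,
        # Macronated characters take two bytes each in UTF-8.
        "utf8_bytes": len(encoded),
        "fits_key_limit": len(encoded) <= S3_MAX_KEY_BYTES,
        # Percent-encoded form as it would appear in a URL path.
        "url_path": quote(prefix, safe="/"),
    }
```

For example, `describe_prefix("Tāmaki")["url_path"]` gives `T%C4%81maki/`, and an apostrophe is encoded as `%27`, which is worth keeping in mind when users copy URLs by hand.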

See Confluence page here and Slack discussion here.

Tasks

Definition of Ready

  • This story is ready to work on
    • Independent (story is independent of all other tasks)
    • Negotiable (team can decide how to design and implement)
    • Valuable (from a user perspective)
    • Estimate value applied (agreed by team)
    • Small (so as to fit within an iteration)
    • Testable (in principle, even if there isn't a test for it yet)
    • Environments are ready to meet definition of done
    • Resources required to implement will be ready
    • Everyone understands and agrees with the tasks to complete the story
    • Release value (e.g. Iteration 3) applied
    • Sprint value (e.g. Aug 1 - Aug 15) applied

Definition of Done

  • This story is done:
    • Acceptance criteria completed
    • Automated tests are passing
    • Code is peer reviewed and pushed to master
    • Deployed successfully to test environment
    • Checked against
      CODING guidelines
    • Relevant new tasks are added to backlog and communicated to the team
    • Important decisions recorded in the issue ticket
    • Readme/Changelog/Diagrams are updated
    • Product Owner has approved acceptance criteria as complete
    • Meets non-functional requirements:
      • Scalability (data): Can scale to 300TB of data and 100,000,000 files and ability to
        increase 10% every year
      • Scalability (users): Can scale to 100 concurrent users
      • Cost: Data can be stored at < 0.5 NZD per GB per year
      • Performance: A large dataset (500 GB and 50,000 files - e.g. Akl aerial imagery) can be
        validated, imported and stored within 24 hours
      • Accessibility: Can be used from LINZ networks and the public internet
      • Availability: System available 24 hours a day and 7 days a week, this does not include
        maintenance windows < 4 hours and does not include operational support
      • Recoverability: RPO of fully imported datasets < 4 hours, RTO of a single 3 TB dataset
        < 12 hours
@billgeo billgeo added user story Something valuable for the user needs refinement Needs to be discussed by the team labels Aug 30, 2022
@billgeo billgeo removed the needs refinement Needs to be discussed by the team label Sep 5, 2022
@mfwightman mfwightman moved this from 📋 Backlog to 🔖 Ready in Data Infrastructure Squad Oct 4, 2022
@mfwightman mfwightman moved this from 🔖 Ready to 📋 Backlog in Data Infrastructure Squad Oct 4, 2022