From 63057fd4b63185cab83f7a7e09516e718f94c984 Mon Sep 17 00:00:00 2001
From: vsoch
Date: Thu, 21 Mar 2024 22:09:47 -0600
Subject: [PATCH] add citation cff file

Signed-off-by: vsoch
---
 CITATION.cff  | 19 +++++++++++++++++++
 README.md     | 35 +++++++++++++----------------------
 docs/index.md |  9 ++++-----
 3 files changed, 36 insertions(+), 27 deletions(-)
 create mode 100644 CITATION.cff

diff --git a/CITATION.cff b/CITATION.cff
new file mode 100644
index 00000000..5c7f6edb
--- /dev/null
+++ b/CITATION.cff
@@ -0,0 +1,19 @@
+cff-version: 1.2.0
+message: "If you use this software, please cite it as below."
+authors:
+  - family-names: Sochat
+    given-names: Vanessa
+    orcid: https://orcid.org/0000-0002-4387-3819
+  - family-names: Culquicondor
+    given-names: Aldo
+  - family-names: Ojea
+    given-names: Antonio
+  - family-names: Milroy
+    given-names: Daniel
+    orcid: https://orcid.org/0000-0001-6500-3227
+title: "The Flux Operator"
+version: 0.2.0
+identifiers:
+  - type: doi
+    value: 10.12688/f1000research.147989.1
+date-released: 2024-03-21
diff --git a/README.md b/README.md
index 353dea69..c7caca7f 100644
--- a/README.md
+++ b/README.md
@@ -3,22 +3,12 @@
 ![docs/development/the-operator.jpg](docs/development/the-operator.jpg)
 
 [![DOI](https://zenodo.org/badge/528650707.svg)](https://zenodo.org/badge/latestdoi/528650707)
-
 The Flux Operator is a Kubernetes Cluster [Operator](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/)
-that you can install to your cluster to create and control [Flux Framework](https://flux-framework.org/) "Mini Clusters"
+that you can install to your cluster to create and control a [Flux Framework](https://flux-framework.org/) "MiniCluster"
 to launch jobs to.
 
 Read more, including user and developer guides, and project background in our 💛 [Documentation](https://flux-framework.org/flux-operator) 💛
 
-🚧️ Under Construction! 🚧️
-
-**Important!** We recently removed a one-off container that ran before the MiniCluster creation to generate a certificate.
-We have found [through testing](https://github.com/kubernetes-sigs/jobset/issues/104) that this somehow served as a warmup
-for networking, and this means if you use the latest operator here, you may see slow times in creating the initial
-broker setup. More details are available in [this post](https://github.com/converged-computing/operator-experiments/tree/main/google/service-timing).
-We have fixed the zeromq timeout bug, and will hopefully be able to reproduce the issue outside of the operator
-soon to report upstream.
-
 ## Presentations
 
 - [Kubecon 2023](https://t.co/vjRydPx1rb)
@@ -27,33 +17,34 @@ soon to report upstream.
 ## Organization
 
 The basic idea is that we present the idea of a **MiniCluster** that is a custom resource definition (CRD)
-that defines a job container (that must have Flux) that (when submit) will create a set of config maps,
-secrets (e.g., tls), and the final Batch job that has the pod containers running with flux. Since
+that defines a job container (that does not need to have Flux) that (when submit) will create a set of config maps,
+secrets, and the final Indexed Job that has the pod containers running with Flux. Since
 this is a batchv1.Job, it will have states that we can track.
 
 And you can find the following here:
 
  - [Flux Controllers](controllers/flux) are under `controllers/flux` for the `MiniCluster`
- - [API Spec](api/v1alpha1/) are under `api/v1alpha1/` also for `MiniCluster`
+ - [API Spec](api/v1alpha1/) are under `api/v1alpha2/` also for `MiniCluster`
  - [Packages](pkg) include supporting packages for job conditions (state), if we eventually want that.
  - [Config](config) includes mostly automatically generated yaml configuration files needed by Kubernetes
 
 And the following external resources might be useful:
 
- - [Flux Cloud](https://github.com/converged-computing/flux-cloud): automation of experiments using the Flux Operator
+ - [Flux Framework](https://flux-framework.org)
  - [Flux RESTful API](https://github.com/flux-framework/flux-restful-api): provides the interface for submitting jobs, if no command provided to the operator.
  - [Python SDK](sdk/python): for deploying MiniClusters and port forwarding.
  - [Flux HPC Examples](https://github.com/rse-ops/flux-hpc) containers and CRD for the operator to run Flux with HPC workloads (under development)
+ - [Flux Cloud](https://github.com/converged-computing/flux-cloud): automation of experiments using the Flux Operator
+
+**Note** we welcome contributions to code or to suggest features or identify bugs!
+
+## Citation
 
-**Note** this project is actively under development, and you can expect change and improvements!
-We apologize for bugs you run into, and hope you tell us soon so we can work on resolving them.
+You can follow the CITATION.cff (right sidebar in GitHub) to cite, or [view the paper directly here](https://doi.org/10.12688/f1000research.147989.1)
+A direct (copy paste) citation is the following:
 
-## TODO
+> Sochat V, Culquicondor A, Ojea A and Milroy D. The Flux Operator (version 1). F1000Research 2024, 13:203 (https://doi.org/10.12688/f1000research.147989.1)
 
- - add [jar](wget https://repo1.maven.org/maven2/org/openapitools/openapi-generator-cli/6.6.0/openapi-generator-cli-6.6.0.jar -O openapi-generator-cli.jar) to makefile
- - Ensure that curve.cert is a variable (path) in case we create a custom one.
- - Convert ML (e.g., mnist and pytorch) to use base containers instead of Singularity
- - When JobSet is available we need it to say job is successful when main application container is done.
 
 ## License
 
diff --git a/docs/index.md b/docs/index.md
index d72df462..6947ff17 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -5,17 +5,16 @@
 Welcome to the Flux Operator Documentation!
 
 The Flux Operator is a Kubernetes Cluster [Operator](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/)
-that you can install to your cluster to create and control [Flux Framework](https://flux-framework.org/) `MiniCluster`s
+that you can install to your cluster to create and control a [Flux Framework](https://flux-framework.org/) `MiniCluster`
 to launch jobs to. With the Flux Operator you can:
 
 1. Create an ephemeral MiniCluster to run one job
 2. Create a persistent MiniCluster to submit jobs to via user interface, command line, or language SDK.
 3. View logs and job information via the same interfaces!
 
-The Flux Operator is currently 🚧️ Under Construction! 🚧️
-We are working on core functionality along with providing a library of
-examples from HPC. This is a *converged computing* project that aims
-to unite the worlds and technologies typical of cloud computing and
+The Flux Operator is recently [published](https://f1000research.com/articles/13-203/v1)
+and we are actively using it to run HPC workloads in cloud environments.
+This is a *converged computing* project that aims to unite the worlds and technologies typical of cloud computing and
 high performance computing.
 
 To get started, check out the links below!