add support to specify scheduler policy and release (#197)
Signed-off-by: vsoch <[email protected]>
Co-authored-by: vsoch <[email protected]>
vsoch and vsoch authored Jul 27, 2023
1 parent ff14117 commit c2d0d6e
Showing 28 changed files with 516 additions and 112 deletions.
142 changes: 142 additions & 0 deletions .github/workflows/release.yaml
@@ -0,0 +1,142 @@
name: flux operator tag and release

on:
  workflow_dispatch:
    inputs:
      release_tag:
        description: Custom release tag
        type: string
        required: true

jobs:
  build-arm:
    runs-on: ubuntu-latest
    name: make and build arm
    steps:
    - name: Checkout Repository
      uses: actions/checkout@v3
    - name: Set tag
      run: |
        echo "Tag for release is ${{ inputs.release_tag }}"
        echo "tag=${{ inputs.release_tag }}" >> ${GITHUB_ENV}
    - uses: actions/setup-go@v3
      with:
        go-version: ^1.18.1
    - name: GHCR Login
      uses: docker/login-action@v2
      with:
        registry: ghcr.io
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}

    - name: Add custom buildx ARM builder
      run: |
        docker buildx create --name armbuilder
        docker buildx use armbuilder
        docker buildx inspect --bootstrap
    - name: Deploy Container
      env:
        tag: ${{ env.tag }}
      run: make arm-deploy ARMIMG=ghcr.io/flux-framework/flux-operator:${tag}-arm

  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        command: [bundle, catalog, docker]
    name: make and build ${{ matrix.command }}
    steps:
    - name: Checkout Repository
      uses: actions/checkout@v3
    - uses: actions/setup-go@v3
      with:
        go-version: ^1.18.1
    - name: Set tag
      run: |
        echo "Tag for release is ${{ inputs.release_tag }}"
        echo "tag=${{ inputs.release_tag }}" >> ${GITHUB_ENV}
    - name: GHCR Login
      uses: docker/login-action@v2
      with:
        registry: ghcr.io
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}

    - name: Build Container
      env:
        tag: ${{ env.tag }}
      run: |
        image=ghcr.io/flux-framework/flux-operator-${{ matrix.command }}:v${tag}
        img=ghcr.io/flux-framework/flux-operator:v${tag}
        make ${{ matrix.command }}-build BUNDLE_IMG=${image} IMG=${img} CATALOG_IMG=${image}
    - name: Deploy Container
      env:
        tag: ${{ env.tag }}
      run: |
        image=ghcr.io/flux-framework/flux-operator-${{ matrix.command }}:v${tag}
        img=ghcr.io/flux-framework/flux-operator:v${tag}
        make ${{ matrix.command }}-push BUNDLE_IMG=${image} IMG=${img} CATALOG_IMG=${image}
  release:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout
      uses: actions/checkout@v3
    - uses: actions/setup-go@v3
      with:
        go-version: ^1.18.1
    - name: Set tag
      run: |
        echo "Tag for release is ${{ inputs.release_tag }}"
        echo "tag=${{ inputs.release_tag }}" >> ${GITHUB_ENV}
    - name: Install
      run: conda create --quiet --name fo twine
    - name: Install dependencies
      run: |
        export PATH="/usr/share/miniconda/bin:$PATH"
        source activate fo
        pip install -e .
        pip install setuptools wheel twine
    - name: Build and publish
      env:
        TWINE_USERNAME: ${{ secrets.PYPI_USER }}
        TWINE_PASSWORD: ${{ secrets.PYPI_PASS }}
        tag: ${{ env.tag }}
      run: |
        export PATH="/usr/share/miniconda/bin:$PATH"
        source activate fo
        cd sdk/python/v1alpha1/
        python setup.py sdist bdist_wheel
        cd dist
        # Reduce "fluxoperator-<version>-py3-none-any.whl" to "<version>"
        wheelfile=$(ls fluxoperator-*.whl)
        wheelfile=$(echo "$wheelfile" | sed "s/fluxoperator-//")
        wheelfile=$(echo "$wheelfile" | sed "s/-py3-none-any.whl//")
        echo "Release for Python is ${wheelfile}"
        echo "Release for flux operator is ${tag}"
        if [[ "${wheelfile}" == "${tag}" ]]; then
          echo "Versions are correct, publishing."
          # We are already inside dist/ (see the cd above), so upload from here
          twine upload ./*
        else
          echo "Versions are not correct, please fix and upload locally."
        fi
    - name: Build release manifests
      env:
        tag: ${{ env.tag }}
      run: |
        make build-config-arm ARMIMG=ghcr.io/flux-framework/flux-operator:${tag}-arm
        make build-config IMG=ghcr.io/flux-framework/flux-operator:v${tag}
    - name: Release Flux Operator
      uses: softprops/action-gh-release@v1
      with:
        name: Flux Operator Release v${{ env.tag }}
        tag_name: ${{ env.tag }}
        body_path: CHANGELOG.md
        body: "flux operator release ${{ env.tag }}"
        files: |
          examples/dist/flux-operator-arm.yaml
          examples/dist/flux-operator.yaml
      env:
        GITHUB_REPOSITORY: flux-framework/flux-operator
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -14,6 +14,7 @@ and **Merged pull requests**. Critical items to know are:
The versions coincide with releases on pip. Only major versions will be released as tags on Github.

## [0.0.x](https://github.com/flux-framework/flux-operator/tree/main) (0.0.x)
- First release supporting experimental bursting / scaling and customization (0.1.0)
- Support for automated testing of examples/tests (0.0.x)
- Early support for a basic multi-user mode.
- Addition of local volumes for workflows that share data.
1 change: 0 additions & 1 deletion Makefile
@@ -115,7 +115,6 @@ api: generate api
rm -rf ./sdk/python/${API_VERSION}/fluxoperator/model/*
rm -rf ./sdk/python/${API_VERSION}/fluxoperator/test/test_*.py
java -jar ${SWAGGER_JAR} generate -i ${SWAGGER_API_JSON} -g python-legacy -o ./sdk/python/${API_VERSION} -c ./hack/python-sdk/swagger_config.json --git-repo-id flux-operator --git-user-id flux-framework
cp ./hack/python-sdk/template/* ./sdk/python/${API_VERSION}/

# These were needed for the python (not python-legacy)
# cp ./hack/python-sdk/fluxoperator/* ./sdk/python/${API_VERSION}/fluxoperator/model/
12 changes: 12 additions & 0 deletions api/v1alpha1/minicluster_types.go
@@ -385,6 +385,10 @@ type FluxSpec struct {
	//+optional
	CurveCertSecret string `json:"curveCertSecret"`

	// Custom attributes for the fluxion scheduler
	//+optional
	Scheduler FluxScheduler `json:"scheduler"`

	// Expect a secret (named according to this string)
	// for a munge key. This is intended for bursting.
	// Assumed to be at /etc/munge/munge.key
@@ -404,6 +408,14 @@ type FluxSpec struct {
	BrokerConfig string `json:"brokerConfig"`
}

// FluxScheduler attributes
type FluxScheduler struct {

	// Scheduler queue policy, defaults to "fcfs" can also be "easy"
	// +optional
	QueuePolicy string `json:"queuePolicy"`
}

// Bursting Config
// For simplicity, we internally handle the name of the job (hostnames)
type Bursting struct {
16 changes: 16 additions & 0 deletions api/v1alpha1/swagger.json
@@ -187,6 +187,17 @@
        }
      }
    },
    "FluxScheduler": {
      "description": "FluxScheduler attributes",
      "type": "object",
      "properties": {
        "queuePolicy": {
          "description": "Scheduler queue policy, defaults to \"fcfs\" can also be \"easy\"",
          "type": "string",
          "default": ""
        }
      }
    },
    "FluxSpec": {
      "type": "object",
      "properties": {
@@ -241,6 +252,11 @@
          "type": "string",
          "default": ""
        },
        "scheduler": {
          "description": "Custom attributes for the fluxion scheduler",
          "default": {},
          "$ref": "#/definitions/FluxScheduler"
        },
        "wrap": {
          "description": "Commands for flux start --wrap",
          "type": "string"
16 changes: 16 additions & 0 deletions api/v1alpha1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

31 changes: 30 additions & 1 deletion api/v1alpha1/zz_generated.openapi.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 8 additions & 0 deletions chart/templates/minicluster-crd.yaml
@@ -371,6 +371,14 @@ spec:
    be set in the user interface to override here. This is only valid
    for a FluxRunner "runFlux" true
  type: string
scheduler:
  description: Custom attributes for the fluxion scheduler
  properties:
    queuePolicy:
      description: Scheduler queue policy, defaults to "fcfs" can
        also be "easy"
      type: string
  type: object
wrap:
  description: Commands for flux start --wrap
  type: string
8 changes: 8 additions & 0 deletions config/crd/bases/flux-framework.org_miniclusters.yaml
@@ -374,6 +374,14 @@ spec:
    be set in the user interface to override here. This is only
    valid for a FluxRunner "runFlux" true
  type: string
scheduler:
  description: Custom attributes for the fluxion scheduler
  properties:
    queuePolicy:
      description: Scheduler queue policy, defaults to "fcfs" can
        also be "easy"
      type: string
  type: object
wrap:
  description: Commands for flux start --wrap
  type: string
8 changes: 7 additions & 1 deletion controllers/flux/templates/broker.toml
@@ -24,4 +24,10 @@ hosts = [{host="{{ .Spec.Flux.Bursting.LeadBroker.Address }}", bind="tcp://eth0:
[archive]
dbpath = "/var/lib/flux/job-archive.sqlite"
period = "1m"
busytimeout = "50s"

# Configure the flux-sched (fluxion) scheduler queue policy.
# Falls back to "fcfs" when flux.scheduler.queuePolicy is unset in the MiniCluster spec.
[sched-fluxion-qmanager]
queue-policy = "{{ if .Spec.Flux.Scheduler.QueuePolicy }}{{ .Spec.Flux.Scheduler.QueuePolicy }}{{ else }}fcfs{{ end }}"
28 changes: 28 additions & 0 deletions docs/getting_started/custom-resource-definition.md
@@ -358,6 +358,34 @@ flux:

In the above, we would add `--wrap=strace,-e,network,-tt` to flux start commands.

#### scheduler

Under flux->scheduler you can define attributes for the scheduler. Currently, the only supported
attribute is a custom queue policy. If unset, it defaults to first come, first served (`fcfs`), which is equivalent to:

```yaml
flux:
  scheduler:
    queuePolicy: fcfs
```

To use a policy that supports backfilling, set this to "easy":

```yaml
flux:
  scheduler:
    queuePolicy: easy
```

The generated broker.toml config will then contain:

```toml
[sched-fluxion-qmanager]
queue-policy = "easy"
```

You can learn more about queues [here](https://flux-framework.readthedocs.io/en/latest/guides/admin-guide.html?h=system#adding-queues). Please [open an issue](https://github.com/flux-framework/flux-operator/issues) if you want support for something that you don't see. Also note that you can set an entire [broker config](#broker-config) for more detailed customization.
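
As a rough illustration of the fallback described above, the Go sketch below renders the same `queue-policy` template expression used in `broker.toml` with the standard library's `text/template`. The struct types are trimmed-down, hypothetical stand-ins for the operator's API types, and `renderQueuePolicy` is an illustrative helper, not a function from the operator itself.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// Hypothetical, minimal stand-ins for the MiniCluster API types.
// Only the fields the template touches are modeled here.
type FluxScheduler struct {
	QueuePolicy string
}

type FluxSpec struct {
	Scheduler FluxScheduler
}

type MiniClusterSpec struct {
	Flux FluxSpec
}

type MiniCluster struct {
	Spec MiniClusterSpec
}

// The qmanager section as it appears in the broker.toml template.
const brokerSnippet = `[sched-fluxion-qmanager]
queue-policy = "{{ if .Spec.Flux.Scheduler.QueuePolicy }}{{ .Spec.Flux.Scheduler.QueuePolicy }}{{ else }}fcfs{{ end }}"
`

// renderQueuePolicy (illustrative helper) executes the snippet against a
// cluster. An empty QueuePolicy is falsy in the template, so "fcfs" is used.
func renderQueuePolicy(cluster MiniCluster) (string, error) {
	tmpl, err := template.New("broker").Parse(brokerSnippet)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, cluster); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Unset policy: the template falls back to "fcfs".
	unset, err := renderQueuePolicy(MiniCluster{})
	if err != nil {
		panic(err)
	}
	fmt.Print(unset)

	// Explicit policy: the value from the spec is written through.
	var cluster MiniCluster
	cluster.Spec.Flux.Scheduler.QueuePolicy = "easy"
	easy, err := renderQueuePolicy(cluster)
	if err != nil {
		panic(err)
	}
	fmt.Print(easy)
}
```

Note that the template writes any non-empty string through unchanged, so in this sketch a value other than "fcfs" or "easy" would reach the broker config as-is.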

#### minimalService

By default, the Flux MiniCluster will be created with a headless service across the cluster,
8 changes: 8 additions & 0 deletions examples/dist/flux-operator-arm.yaml
@@ -380,6 +380,14 @@ spec:
    be set in the user interface to override here. This is only
    valid for a FluxRunner "runFlux" true
  type: string
scheduler:
  description: Custom attributes for the fluxion scheduler
  properties:
    queuePolicy:
      description: Scheduler queue policy, defaults to "fcfs" can
        also be "easy"
      type: string
  type: object
wrap:
  description: Commands for flux start --wrap
  type: string