
Auto scaling: Work in Progress #36

Open · wants to merge 12 commits into master

Conversation

hjaensch7 (Contributor)

No description provided.

hjaensch7 requested a review from Others on April 15, 2020
hjaensch7 self-assigned this on April 15, 2020
hjaensch7 added the enhancement label on April 15, 2020
hjaensch7 requested a review from pcodes on April 15, 2020
deployment/actionManger.go (outdated, resolved)
for _, w := range workers {
status, err := w.Status()
if err != nil {
log.Warning.Println("error getting worker status:", err)
Contributor:

better error message plox

Contributor (Author):

How much more info would you like in this error message? The blank status? Or the worker, perhaps?
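
For the thread, one possibility (a sketch of the quoted hunk only; the worker index is not captured in the original loop, so treat it as an assumption) would be to name the failing worker in the log line:

	for i, w := range workers {
		status, err := w.Status()
		if err != nil {
			// Say which worker failed instead of only echoing the error.
			log.Warning.Println("error getting status for worker", i, ":", err)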

// Collect status of each comp on each worker
compMap := getCurrentInstanceState(scaler.workers)

log.Info.Println("----------------------------")
Contributor:

???

Contributor (Author):

It looks pretty for testing

deployment/actionManger.go (outdated, resolved)
@@ -136,7 +142,7 @@ func (mgr *ActionManager) HandleDirtyState() error {
correctCompID := worker.ComponentID{
User: activeComp.User,
Repo: activeComp.Repo,
Hash: correctHash,
Hash: correctHash.hash,
Contributor:

weird naming here

deployment/actionManger.go (outdated, resolved)
return errors.New("Something weird happened.")
}

func (mgr *ActionManager) findWorkerToDeployTo(compID worker.ComponentID) (*worker.V9Worker, error) {
Contributor:

Doesn't this function already exist in this file?

Contributor (Author):

This function ensures that the worker it returns doesn't have the component running on it.
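
For readers of the thread, a rough sketch of the behavior being described, picking a worker that does not already run the given component. The function name pickWorkerWithout and the status.ActiveComponents field are assumptions for illustration, not the PR's actual code; errors is already imported in this file.

	func pickWorkerWithout(workers []*worker.V9Worker, compID worker.ComponentID) (*worker.V9Worker, error) {
		for _, w := range workers {
			status, err := w.Status()
			if err != nil {
				continue // skip workers we cannot query
			}
			alreadyRunning := false
			for _, active := range status.ActiveComponents { // assumed field name
				if active == compID {
					alreadyRunning = true
					break
				}
			}
			if !alreadyRunning {
				return w, nil
			}
		}
		return nil, errors.New("no worker available that is not already running this component")
	}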

deployment/actionManger.go (resolved)
@Others (Contributor) left a comment:

Much closer! I think the only big problem remaining is the ensureNWorkersAreRunning function

Comment on lines +28 to +37
workerIDs := make([]string, len(scaler.workers))
for i := range scaler.workers {
name := fmt.Sprintf("worker_%d", i)
id, err := scaler.driver.FindWorkerID(name)
if err != nil {
log.Error.Println("error getting worker id:", err)
continue
}
workerIDs[i] = id
}
Contributor:

We now calculate this "worker name" info in a bunch of places. We should pull it out to a helper for sure
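
Something along these lines would do it (a sketch; only the format string comes from the quoted code, the helper name workerName does not appear in the PR):

	// workerName centralizes the "worker_<index>" naming used when looking up
	// worker IDs, so the format string lives in exactly one place.
	func workerName(i int) string {
		return fmt.Sprintf("worker_%d", i)
	}

The loop above would then call scaler.driver.FindWorkerID(workerName(i)).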

Comment on lines +82 to +83
compMap[cID].averageStats.Hits += componentStats.Hits
compMap[cID].averageStats.Hits /= float64(compMap[cID].instanceCount)
Contributor:

Won't this divide the number every time you increment it?

Contributor (Author):

If the component is already in the map we need to update the average. So this is like a rolling average.
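
For context on the math: the usual way to keep a running mean without re-dividing already-averaged values is the incremental form new = old + (x - old) / n. A minimal sketch, with illustrative names that are not from the PR:

	// updateMean folds one new sample x into an existing mean over n samples
	// (n already includes x). Equivalent to recomputing sum/n, but incremental.
	func updateMean(oldMean, x float64, n int) float64 {
		return oldMean + (x-oldMean)/float64(n)
	}

Applied to the quoted lines, that would look something like compMap[cID].averageStats.Hits = updateMean(compMap[cID].averageStats.Hits, componentStats.Hits, compMap[cID].instanceCount), assuming instanceCount has already been incremented for this sample.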

deployment/actionManger.go (resolved)

dirtyStateNotifier: dirtyStateNotifier,
}

//Thread for handling hash changes
Contributor:

space before the beginning of the comment text

Contributor (Author):

Wait, do we want a space or not?


mgr.NotifyComponentStateChanged()
}
}()
Contributor:

add comment: should we be batching hash updates in one lock?

Contributor (Author):

I was thinking this as well. The question is whether we want a thread for each channel: one would be updated periodically by the autoscaler, and the other would change the hash as needed.
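
A minimal sketch of the single-goroutine alternative being discussed; all names here (batchedNotifier, scaleTicks, hashUpdates) are hypothetical and not from this PR. One goroutine selects over both channels, so every update is applied under the same lock and hash updates are naturally serialized with the periodic autoscaler signal.

	package deployment

	import "sync"

	// batchedNotifier sketches handling the periodic autoscaler signal and
	// hash-change events in one goroutine, under one lock.
	type batchedNotifier struct {
		mu         sync.Mutex
		targetHash string
	}

	func (n *batchedNotifier) run(scaleTicks <-chan struct{}, hashUpdates <-chan string, notify func()) {
		for {
			select {
			case <-scaleTicks: // periodic signal from the autoscaler
				n.mu.Lock()
				notify()
				n.mu.Unlock()
			case h := <-hashUpdates: // a component's target hash changed
				n.mu.Lock()
				n.targetHash = h
				notify()
				n.mu.Unlock()
			}
		}
	}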

deployment/auto_scaler.go (resolved)
deployment/actionManger.go (resolved)
hjaensch7 requested a review from Others on May 4, 2020