Sea states clustering GM vs KM #254
Comments
@ssolson, @ryancoe, @cmichelenstrofer, @jtgrasb just tagging you because we have probably all talked about this at some point.
In my mind, what you want to do is study how your quantities of interest (e.g., AEP) change when you vary the number of clusters and change the algorithm. @ssolson - Didn't you write a paper where you did this? I can't seem to find it.
@ryancoe I stopped the study I wrote up at the sea states. I did forward you some preliminary results I found using WecOptTool, although I don't think they will be of much help. The useful part of the paper I wrote is captured in https://github.com/MHKiT-Software/MHKiT-Python/blob/master/examples/PacWave_resource_characterization_example.ipynb
@dtgaebe that is a cool idea to use an idealized spectrum and then compare the clusters to the original sea state. Based on what you show here, it looks like it makes sense to, at a minimum, extend the PacWave example to include your comparison using the ratio, and potentially to wrap a couple of steps into a function for MHKiT. I'm interested to hear your thoughts on next steps for you and perhaps where MHKiT could help with the data processing.
Note also that @cmichelenstrofer is working on something that intersects this, where he's considering expanding the parameterization of sea states based on machine learning.
@ssolson I think that cool idea came from you, or whoever wrote the PacWave resource assessment example, because the ratio is already in there. Personally, I find it very useful to sort and visualize the clusters by weight, independent of the algorithm, so this might be a nice addition to MHKiT. I would think that developers would use sea state clusters for gain scheduling. So, as Ryan suggested, it would be interesting to see how the different clusters impact actual and predicted performance, and how sensitive the results are to the number of clusters and the algorithm. We did a study with a sensitivity analysis to bulk spectral parameters here: OCEAN2021. We could potentially do something similar here.
I was working with this example: PacWave_resource_characterization_example, but tried using `sklearn.cluster.KMeans` instead of `sklearn.mixture.GaussianMixture` to cluster the data. `KMeans` does require normalization prior to the clustering.

I found that `KMeans` requires a smaller number of clusters to converge to the total amount of energy in the data compared to `GM`. Here it is illustrated as the ratio between the representative sea states and the total energy (as in the example):

[Figure: GM]
[Figure: KM]
I also sorted the clusters by weight and visualized the weight with the color scale in the plots.
The two clustering algorithms give different results, but I don't know whether we should be considering other criteria besides representing the total amount of energy.
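For anyone who wants to try the comparison quickly, here is a minimal sketch. It is not the code from this issue or from the MHKiT notebook. It assumes synthetic (Te, Hm0) data in place of the PacWave buoy record, and it uses the deep-water energy flux formula as a stand-in for the spectrum-based calculation in the example. The helper names `energy_flux`, `cluster_sea_states`, and `energy_ratio` are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

RHO, G = 1025.0, 9.81  # seawater density [kg/m^3], gravity [m/s^2]


def energy_flux(te, hm0):
    """Deep-water energy flux [W/m]; an assumed stand-in for the
    spectrum-based flux computed in the PacWave example."""
    return RHO * G**2 / (64 * np.pi) * hm0**2 * te


def cluster_sea_states(X, n_clusters, method, seed=0):
    """Return (representative sea states, weights), sorted by descending weight."""
    if method == "GM":
        gm = GaussianMixture(n_components=n_clusters, random_state=seed).fit(X)
        centers, weights = gm.means_, gm.weights_
    elif method == "KM":
        # KMeans is distance based, so normalize the features first
        scaler = StandardScaler()
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
        km.fit(scaler.fit_transform(X))
        # Map the centers back to physical units (s, m)
        centers = scaler.inverse_transform(km.cluster_centers_)
        # Weight = fraction of samples assigned to each cluster
        weights = np.bincount(km.labels_, minlength=n_clusters) / len(X)
    else:
        raise ValueError(f"unknown method: {method}")
    order = np.argsort(weights)[::-1]
    return centers[order], weights[order]


def energy_ratio(X, centers, weights):
    """Weighted flux of the representative states over the mean flux of the data."""
    total = energy_flux(X[:, 0], X[:, 1]).mean()
    clustered = np.sum(weights * energy_flux(centers[:, 0], centers[:, 1]))
    return clustered / total


# Synthetic (Te, Hm0) samples; purely illustrative, not PacWave data
rng = np.random.default_rng(0)
hm0 = rng.lognormal(mean=0.7, sigma=0.4, size=2000)
te = np.clip(6.0 + 1.2 * hm0 + rng.normal(0.0, 1.0, size=2000), 3.0, None)
X = np.column_stack([te, hm0])

ratios = {}
for method in ("GM", "KM"):
    ratios[method] = [
        energy_ratio(X, *cluster_sea_states(X, n, method)) for n in (2, 4, 8, 16)
    ]
    print(method, np.round(ratios[method], 3))
```

The weights come straight from `weights_` for `GaussianMixture`, while for `KMeans` they are taken here as the fraction of samples per cluster. That is one reasonable choice, not necessarily the one used to produce the figures above.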
The quick-and-dirty code to extend the PacWave_resource_characterization_example and reproduce the figure and table is: