2024.05.07 #41

Closed
10 of 25 tasks
seanmcilroy29 opened this issue May 7, 2024 · 7 comments
@seanmcilroy29

seanmcilroy29 commented May 7, 2024


2024.05.07 Agenda/Minutes


Time 0800 (PT) / 1600 (BST) - See the time in your timezone

  • Chair – Adrian Cockcroft
  • Chair – Pindy Bhullar (UBS)
  • Convener – Sean Mcilroy (Linux Foundation)

Antitrust Policy

Joint Development Foundation meetings may involve participation by industry competitors, and the Joint Development Foundation intends to conduct all of its activities in accordance with applicable antitrust and competition laws. It is, therefore, extremely important that attendees adhere to meeting agendas and be aware of and not participate in any activities that are prohibited under applicable US state, federal or foreign antitrust and competition laws.

If you have questions about these matters, please contact your company counsel or counsel to the Joint Development Foundation, DLA Piper.

Recordings

WG agreed to record all Meetings. This meeting recording will be available until the next scheduled meeting.

Roll Call

Please add 'Attended' to this issue during the meeting to denote attendance.

Any untracked attendees will be added by the GSF team below:

  • Full Name, Affiliation, (optional) GitHub username

Agenda

Tool Discussion - Invited guest

Renewable Energy Percentage

Discussion Board

Open Issues Project Board

Open Issues for review

AOB

  • Topics added during the meeting

Action Items

  • Update the cloud metadata plugin readme to include information on the regional metadata dataset.
  • Provide a pull request updating the GSF data CSV file with the current regional metadata dataset.
  • Contact Hubblo to discuss collaborating on energy monitoring and potentially joining a future meeting.
  • Investigate available power/energy metrics for AWS training instances and report findings.
  • Check in with the Cloud Carbon Footprint initiative to explore using regional metadata datasets to improve their data.
@seanmcilroy29

Attended

@adrianco

adrianco commented May 7, 2024

attended

@Henry-WattTime

Attended

@coopere

coopere commented May 7, 2024

attended

2 similar comments
@eilamt

eilamt commented May 7, 2024

attended

@rootfs

rootfs commented May 7, 2024

attended

@seanmcilroy29

MoM

Adrian opens the meeting at 0800 (PT) / 1600 (BST)
Invited guest - Benoit Petit (Hubblo)

Meeting Summary

Adrian and Chris discussed the challenges of managing cloud data, particularly the lack of standardisation and the difficulty of mapping geolocation data to region names; Tamar expressed interest in standardising data management. The group then discussed how hard it is to measure the carbon emissions of cloud computing accurately, given how little data cloud providers publish. Adrian and Tim emphasised the need for a more sophisticated approach to allocating carbon emissions, while Tamar highlighted the importance of understanding how to format and transform energy data. Adrian and Cooper discussed the challenges of obtaining accurate and timely data on cloud computing’s carbon footprint. Tamar mentioned the Boavizta API, which can be used to evaluate the impact of AWS cloud instances across their entire lifecycle.

Meeting Notes

Power consumption data in cloud instances, including provenance and API standards.
Adrian and Henry discussed data on water usage efficiency. Adrian suggested adding a new column to the dataset, while Henry expressed concerns about the data's provenance. When Chris asked where the water data would come from, Adrian said that Microsoft, AWS, and Google might provide it, and that a column of reference URLs could be added to record each source. Later, Adrian and Ben discussed the progress of the agent bandwidth project and potential solutions for measuring GPU energy consumption. An AI researcher requested a standardised API for power consumption data across cloud providers. Tamar suggested an OpenTelemetry project and stressed the importance of effective communication with cloud providers.

Improving power consumption modelling in cloud computing.
Tim proposed developing a model that estimates power consumption from resource counters, and suggested sharing such models in a repository so they can be compared and improved. Adrian suggested involving the cloud provider benchmarking team to run the Kepler model and share results. However, the modelling approach works well only when an Intel chip is available and breaks down on Graviton and other custom chips. Adrian and Tim discussed the need for better data and validation methods for AI models in cloud computing. Adrian suggested improving data quality gradually by building a relationship with cloud providers.
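For illustration only (this was not presented in the meeting): a minimal sketch of the kind of counter-based model Tim describes, assuming a single counter (CPU utilisation) and a linear relationship to measured watts. The function names and the sample data are hypothetical, and this is not how Kepler itself models power.

```python
from statistics import mean


def fit_linear_power_model(utilisation, watts):
    """Least-squares fit of watts ~= idle_watts + watts_per_unit * utilisation."""
    if len(utilisation) != len(watts) or len(utilisation) < 2:
        raise ValueError("need at least two paired samples")
    u_bar, w_bar = mean(utilisation), mean(watts)
    var_u = sum((u - u_bar) ** 2 for u in utilisation)
    if var_u == 0:
        raise ValueError("utilisation samples must not all be equal")
    cov = sum((u - u_bar) * (w - w_bar) for u, w in zip(utilisation, watts))
    slope = cov / var_u
    idle = w_bar - slope * u_bar
    return idle, slope


def estimate_watts(model, utilisation):
    """Apply a fitted (idle_watts, watts_per_unit) model to a new sample."""
    idle, slope = model
    return idle + slope * utilisation


# Made-up calibration samples: utilisation as a fraction, power in watts.
model = fit_linear_power_model([0.0, 0.5, 1.0], [100.0, 150.0, 200.0])
estimate = estimate_watts(model, 0.25)
```

Models fitted this way on one chip family would need separate validation on others, which is the limitation noted above for Graviton and other custom chips.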

Power and energy data accuracy in cloud computing.
Ben noted that cloud providers have worst-case kilowatt-hour data but also use telemetry for efficiency monitoring. He described the challenges of modelling power consumption for complex systems, such as those with GPUs and shared-memory clusters. Adrian noted that even simple systems are broken into VMs, nodes, containers, and pods, which makes allocation and attribution more complex, and discussed the difficulty of obtaining accurate carbon footprint data for cloud computing workloads.

Data privacy and security in the cloud industry.
Hyperscalers possess power and energy data but keep it closely held, and there have been only limited attempts to share emissions data internally. Adrian suggested that data with reduced accuracy but delivered frequently could be valuable, and that minute-by-minute data could be obscured to conceal technical details while still preserving the long-term average. Adrian discussed methods to obscure data, focusing on the TDP curve and power usage. He shared the team's accumulated knowledge on power generation, carbon intensity, and data centre context. Adrian presented a diagram of tools for estimating carbon footprint, which includes Kepler, Prometheus, and Carbonara. The diagram is kept in a public repository where all the collected information is stored; anyone can request access and add new tools.
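For illustration only: one possible reading of "reduced accuracy but a correct long-term average" is to perturb each minute-level sample and then shift the series so its mean is unchanged. The function name, the noise model, and the sample values below are assumptions, not a method described in the meeting.

```python
import random


def obscure_series(values, noise_fraction=0.2, seed=None):
    """Perturb each sample by up to +/- noise_fraction, then re-centre the
    series so that its average matches the original average."""
    if not values:
        return []
    rng = random.Random(seed)
    noisy = [v * (1 + rng.uniform(-noise_fraction, noise_fraction)) for v in values]
    # Spread the accumulated error evenly so the long-term mean is preserved.
    correction = (sum(values) - sum(noisy)) / len(values)
    return [n + correction for n in noisy]


# Made-up minute-by-minute power readings in watts.
readings = [210.0, 205.0, 340.0, 335.0, 215.0, 208.0]
published = obscure_series(readings, noise_fraction=0.2, seed=42)
```

Individual published samples no longer match the measured ones, while totals over the whole window still do.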

Calculating the carbon footprint of cloud services, including object storage and allocation of energy usage.
Tamar discussed three initiatives: the Boavizta API, normalising data, and a methodology for cloud providers' impacts. She suggested using data from cloud providers' data centres to estimate energy impact. Tim explained how energy consumption for object storage services is calculated, taking into account factors such as stored size and I/O. Adrian noted that service providers must allocate energy consumption to customers under regulations in the EU and other jurisdictions. He added that cloud providers do not supply enough carbon data, so a team is working on a workaround that focuses on the missing links between cloud providers and carbon data.
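For illustration only: a minimal sketch of allocating a storage service's total energy to customers in proportion to stored size and I/O, the two factors Tim mentions. The weighting between size and I/O, the function name, and the figures are hypothetical; the minutes do not record the actual formula.

```python
def allocate_energy(total_kwh, usage, size_weight=0.5):
    """Split total_kwh across customers.

    usage maps customer -> (stored_gb, io_operations).
    size_weight is the assumed fraction of energy driven by stored size;
    the remainder is attributed to I/O.
    """
    total_gb = sum(gb for gb, _ in usage.values())
    total_io = sum(io for _, io in usage.values())
    if total_gb <= 0 or total_io <= 0:
        raise ValueError("total stored size and total I/O must be positive")
    allocation = {}
    for customer, (gb, io) in usage.items():
        share = size_weight * gb / total_gb + (1 - size_weight) * io / total_io
        allocation[customer] = total_kwh * share
    return allocation


# Made-up example: 100 kWh shared by two customers.
allocation = allocate_energy(
    100.0,
    {"customer_a": (300.0, 1000), "customer_b": (100.0, 3000)},
    size_weight=0.8,
)
```

Because the shares sum to one, the per-customer figures always add up to the service total, which matters if the allocation is used for regulatory reporting.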

Carbon measurement and data analysis for cloud providers.
Tim and Adrian discussed the issue of data accuracy in carbon measurement. Tim suggested a middle ground between financial and detailed models to address the challenges. Adrian agreed and noted that climate modelling often follows a similar progression from financial to process to detailed models. Adrian shared that he was struggling to map geolocation data to region names for electricity consumption data, and that he needs to engage with Microsoft directly to resolve confusion about how Azure data centres map to cloud regions. Adrian encouraged attendees to review open issues and add comments. Sean acknowledged the help of Tamar and Ben in organising the meeting. The group discussed how to standardise solutions across organisations working on similar projects. Tamar suggested an asynchronous discussion through GitHub to achieve this goal.
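For illustration only: a minimal sketch of the region-name mapping problem Adrian describes, assuming provider region names are first normalised to a common key and then looked up in a table. The table below is a hypothetical three-row example, not the GSF regional metadata dataset, and the function names are assumptions.

```python
# Hypothetical lookup table; entries are examples only.
REGION_LOCATIONS = {
    ("azure", "westeurope"): "Netherlands",
    ("azure", "northeurope"): "Ireland",
    ("aws", "euwest1"): "Ireland",
}


def normalise_region(provider, name):
    """Lower-case the provider and strip spaces and punctuation from the
    region name, so 'West Europe' and 'west-europe' give the same key."""
    key = "".join(ch for ch in name.lower() if ch.isalnum())
    return provider.strip().lower(), key


def lookup_location(provider, name):
    """Return the mapped location, or None when the region is unknown."""
    return REGION_LOCATIONS.get(normalise_region(provider, name))


location = lookup_location("Azure", "West Europe")
```

Returning None for unknown regions, rather than guessing, keeps unmapped data centres visible so they can be raised with the provider.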

Action Items

  • Contribute additional relevant projects, tools, and data sources to the mapping diagram.
  • Add links from the discussion to the mapping diagram on the real-time cloud GitHub mirror.
  • Follow up with Microsoft to clarify the mapping of data centres to cloud regions in the impact framework dataset.
  • Engage Ben more in the long term to stay updated on related projects.
  • Members - Review issues on GitHub and provide comments/input.

7 participants