Update overview.mdx #50

Open · wants to merge 2 commits into main
2 changes: 1 addition & 1 deletion core-concepts/architecture/overview.mdx
@@ -38,7 +38,7 @@ The execution layer is responsible for large-scale data processing and execution
This module enhances the network security of the consensus layer and execution layer in the dual-chain architecture through a double staked mechanism, including:

- Shared Staked Pool:
- Native Token Stake: $CNB native token captures network value
- Native Token Stake: $CBT native token captures network value
- Eigenlayer Restake: Introduces Ethereum tokens with low volatility and deep liquidity
- Double Staked Model: By aggregating all validators' stake and the power mapping of each asset, the total cryptoeconomic security is calculated.
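
As a rough illustration of the Double Staked Model described in this hunk, the sketch below sums every validator's stakes weighted by an assumed per-asset power mapping. The asset symbols, weights, and amounts are hypothetical placeholders, not Chainbase parameters.

```python
from dataclasses import dataclass

@dataclass
class Stake:
    asset: str      # e.g. the native token or restaked ETH via EigenLayer
    amount: float   # units of that asset staked by one validator

# Assumed power mapping: security weight carried by one unit of each asset.
POWER_MAPPING = {"CBT": 1.0, "restakedETH": 2_500.0}

def total_cryptoeconomic_security(validators: dict[str, list[Stake]]) -> float:
    """Aggregate every validator's stakes, weighted per asset, into one figure."""
    return sum(
        POWER_MAPPING[stake.asset] * stake.amount
        for stakes in validators.values()
        for stake in stakes
    )

# Example shared staked pool with two validators holding both asset types.
pool = {
    "validator-1": [Stake("CBT", 1_000_000), Stake("restakedETH", 40)],
    "validator-2": [Stake("restakedETH", 120)],
}
print(total_cryptoeconomic_security(pool))
```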

4 changes: 2 additions & 2 deletions theia/Resources/Roadmap.mdx
@@ -6,13 +6,13 @@ Theia is the next-generation crypto world model that provides foundation knowledge

## Phase 1: Theia Demo Online

📅 June 20, 2024, 12:25:00 PM +UTC
📅 Q3 2024

To let every crypto participant experience the crypto intelligence of Theia, we are first opening an interactive demo for Theia. It connects users with crypto knowledge in an exciting way: talk to Theia and get what you want.

## Phase 2: Theia Agent Ecosystem

📅 August 1, 2024, 12:25:00 PM +UTC
📅 Q4 2024

**1. Knowledge Construction and Theia Expert Model (Agent)**

17 changes: 16 additions & 1 deletion theia/TheiaChat/overview.mdx
@@ -17,4 +17,19 @@ Every user interaction with TheiaChat and every task model built are enhancement

These are a powerful complement to the Chainbase Network. It is also the first time that users can contribute data to enhance the Chainbase Network; before this, only developers could enhance the network by building Manuscripts. **This is the historic transformation of the Chainbase Network from targeting 300,000 developers to targeting 30 million to 300 million users.**

In addition, every Changelog is a transaction on the Chainbase Network: it is permanently recorded, public, censorship-resistant, and cannot be manipulated. Its traceability guarantees the data income rights of every data producer.

### Enhance the Chainbase Network through RLHF

After asking a question, users can click on the right side of TheiaChat to make Theia smarter. The principle behind this is Reinforcement Learning from Human Feedback (RLHF), a reinforcement learning technique that incorporates human feedback into training so that the model's behavior aligns better with human expectations.

The implementation of RLHF typically involves the following steps:

1. Pre-training the model: First, the model is pre-trained on a large amount of pre-collected data. This step can use traditional supervised learning methods.

2. Human feedback collection: Humans evaluate the model's outputs, providing feedback on its behavior. This feedback is usually expressed as rewards (or penalties) indicating whether a given behavior is good or bad.

3. Reinforcement learning training: The collected human feedback is used as a reward signal, and the model's policy is further optimized with reinforcement learning algorithms (such as policy gradient methods). Concretely, after the model generates an output, the reward derived from human feedback is used to update the policy so that it becomes more likely to produce results humans prefer.

4. Iterative training: Repeat the feedback collection and reinforcement learning steps, gradually improving the model's performance.
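
To make these steps concrete, here is a minimal, self-contained sketch of an RLHF-style loop: a toy softmax policy over candidate answers is updated with a REINFORCE-style policy-gradient step, using thumbs-up/down feedback (+1/-1) as the reward. The candidate answers and the simulated preference are assumptions for illustration, not TheiaChat's actual training pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
candidates = ["answer_a", "answer_b", "answer_c"]  # hypothetical model outputs for one prompt
logits = np.zeros(len(candidates))                 # step 1: start from a "pre-trained" (here uniform) policy

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def human_feedback(choice: int) -> float:
    # Step 2: stand-in for a user clicking thumbs-up / thumbs-down in a chat UI.
    # We pretend humans consistently prefer "answer_b".
    return 1.0 if candidates[choice] == "answer_b" else -1.0

learning_rate = 0.5
for _ in range(200):                               # step 4: iterate feedback collection and RL updates
    probs = softmax(logits)
    choice = rng.choice(len(candidates), p=probs)  # sample an output from the current policy
    reward = human_feedback(choice)                # collect human feedback as the reward signal
    # Step 3: REINFORCE-style update; grad of log pi(choice) w.r.t. logits is one_hot(choice) - probs
    grad_log_pi = -probs
    grad_log_pi[choice] += 1.0
    logits += learning_rate * reward * grad_log_pi

print({c: round(float(p), 3) for c, p in zip(candidates, softmax(logits))})
# After training, most of the probability mass sits on the human-preferred answer.
```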