This repository has been archived by the owner on Sep 23, 2024. It is now read-only.
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
feat: Refactor and add new providers, improve configuration, and update dependencies

- Added a new `google-tts.go` file with the implementation of various methods for the Google Text-to-Speech provider.
- Made changes to the configuration variables and added new structs in `internal/config/config.go`.
- Made changes in the `main.go` file, including renaming functions, adding authentication middleware, and adding new routes.
- Updated the default values and added new default values in `internal/config/parse.go`.
- Made changes to import statements, renamed functions and fields, and added support for the Google Text-to-Speech provider in `internal/talker.go`.
- Updated the `.gitignore` file to ignore specific files.
- Added functionality, error handling, and logging statements in `pkg/providers/whisper.go`.
- Added a new sample configuration file with additional sections and configurations in `configs/talk.sample.yaml`.
- Updated the versions of various dependencies in the `go.mod` file.
- Deleted the `configs/config.sample.yaml` file.
- Added a new method and made changes to existing methods in `pkg/providers/chatgpt.go`.
- Made changes to interface names, function signatures, and added comments in `pkg/providers/provider.go`.
- Made changes to field types, added a new method, and modified existing methods in `pkg/providers/elevenlabs.go`.
- Made changes to method names, added a new struct, modified method calls, and updated return messages in `internal/handler.go`.
1 parent c10dd1a, commit 2e073c1
Showing 15 changed files with 483 additions and 153 deletions.
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
server: | ||
# Optional. Use port 8000 if not specified. | ||
port: 8000 | ||
# Optional. Use true if not specified. Send a startup request to each speech-to-text, | ||
# text-to-speech, and llm provider when the server initializes. | ||
# Shut down the server on errors, such as an invalid API key or a connection error. | ||
# Log a warning on issues that indicate a potential problem, such as quota exhaustion or incorrect transcriptions. | ||
# These requests consume little or no quota. | ||
providers-must-function: true | ||
# Optional. Enable basic auth only when there is at least one pair of username and password. | ||
basic-auth: | ||
- username1: password1 | ||
- username2: password2 | ||
- username3: # match only empty password | ||
- username4: "*" # match any password, including empty password | ||
| ||
speech-to-text: | ||
open-ai-whisper: | ||
api-key: sk-abc123abc123abc123abc123abc123abc123 | ||
| ||
text-to-speech: | ||
elevenlabs: | ||
api-key: abc123abc123abc1 | ||
# Optional. Use this voice when it is available; otherwise, select a voice at random from the voice list. | ||
voice-id: P9sd8KYc82I23b9dUJm | ||
# Optional. Range: 0.0~1.0. Use 0.5 if not specified. Increasing `stability` can make speech more stable and less expressive. | ||
stability: 0.5 | ||
# Optional. Range: 0.0~1.0. Use 0.5 (50%) if not specified. Increasing `clarity` brings more clarity and more background artifacts. | ||
clarity: 0.5 | ||
google-text-to-speech: | ||
# Download a key file from Google Cloud. See https://cloud.google.com/iam/docs/keys-create-delete#iam-service-account-keys-create-console | ||
path-to-keyfile: ./google_credentials.json | ||
# Optional. Use en-US if not specified. Choose a BCP 47 language code; see https://www.rfc-editor.org/rfc/bcp/bcp47.txt | ||
language-code: en-US | ||
# Optional. Use this voice when it is available; otherwise, select a voice at random from the voice list. | ||
voice-id: en-US-Standard-C | ||
# Optional. Use female if not specified. | ||
# When `voice-id` is not set, `gender` will be used to choose a voice. | ||
# | ||
# The preferred gender of the voice. If not set, the service will | ||
# choose a voice based on the other parameters such as language_code and | ||
# name. Note that this is only a preference, not requirement; if a | ||
# voice of the appropriate gender is not available, the synthesizer should | ||
# substitute a voice with a different gender rather than failing the request. | ||
# | ||
# Options: [male, female, neutral] | ||
gender: female | ||
# Optional. Range [0.25, 4.0]. Use 1.0 if not specified. Speaking speed | ||
speaking-rate: 1.0 | ||
# Optional. Range [-20.0, 20.0]. Use 0 if not specified. Unit: semitones (12 semitones = 1 octave). | ||
pitch: 0 | ||
# Optional. Range [-96.0, 16.0]. Use 0 if not specified. Larger values produce a louder voice. | ||
volume-gain-db: 0 | ||
| ||
llm: | ||
open-ai-chat-gpt: | ||
# Typically, this is the same API key as speech-to-text.open-ai-whisper. | ||
api-key: sk-abc123abc123abc123abc123abc123abc123 | ||
# Optional. Use gpt-3.5-turbo if not specified. For the model list, see https://platform.openai.com/docs/models/gpt-4 | ||
model: gpt-3.5-turbo | ||
# Optional. Use 2000 if not specified | ||
max-generation-token: 2000 |
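
The `basic-auth` comments in the sample config describe three matching rules: an exact password, an empty password that matches only an empty input, and `"*"` that matches any password. The sketch below shows one way those rules could be checked in Go. It is not taken from the commit: the `credentials` type and the `allow` method are hypothetical names, and the project's actual middleware may be structured differently.

```go
package main

import "crypto/subtle"

// credentials maps a username to its configured password.
// Hypothetical type for illustration; the project's real config structs may differ.
type credentials map[string]string

// allow reports whether a username/password pair passes the rules
// described in the sample config comments.
func (c credentials) allow(username, password string) bool {
	if len(c) == 0 {
		return true // no pairs configured: basic auth is disabled
	}
	want, ok := c[username]
	if !ok {
		return false // unknown username
	}
	if want == "*" {
		return true // wildcard: any password, including an empty one
	}
	// Exact match. An empty configured password therefore matches
	// only an empty supplied password.
	return subtle.ConstantTimeCompare([]byte(want), []byte(password)) == 1
}
```

With `credentials{"username3": "", "username4": "*"}`, a call such as `allow("username3", "")` succeeds, `allow("username3", "x")` fails, and `allow("username4", "anything")` succeeds. The constant-time comparison is a design choice of this sketch, made to avoid leaking password contents through timing.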
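
The numeric `google-text-to-speech` options in the sample config each document a default and an allowed range (for example, `speaking-rate` defaults to 1.0 within [0.25, 4.0]). As a rough illustration of how such options might be resolved, the sketch below applies a default when a value is absent and rejects values outside the range. The `resolve` helper is a hypothetical name and is not part of the commit; whether the project rejects or clamps out-of-range values is not stated in the source.

```go
package main

import "fmt"

// resolve returns the effective value of an optional numeric setting.
// A nil pointer means "not specified" and yields the default; a value
// outside [lo, hi] is reported as an error.
// Hypothetical helper for illustration only.
func resolve(name string, v *float64, def, lo, hi float64) (float64, error) {
	if v == nil {
		return def, nil
	}
	if *v < lo || *v > hi {
		return 0, fmt.Errorf("%s: %v is outside [%v, %v]", name, *v, lo, hi)
	}
	return *v, nil
}
```

A pointer is used here so that an unset `pitch` can be told apart from an explicit `pitch: 0`. For instance, `resolve("speaking-rate", nil, 1.0, 0.25, 4.0)` returns the default 1.0, while passing a pointer to 5.0 returns an error.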