diff --git a/docs/advanced/overview.md b/docs/advanced/overview.md index 6488bd4..88dc195 100644 --- a/docs/advanced/overview.md +++ b/docs/advanced/overview.md @@ -7,29 +7,29 @@ This section provides in-depth guides on leveraging specific features of ClientA Different AI providers offer unique parameters and features. Understanding these can help you fine-tune your AI interactions for optimal results. 1. **Ollama Specific Guide:** Learn about Ollama's unique parameters, including context handling, streaming options, and custom templates. - - [Ollama Specific Guide](ollama_specific.md) + - [Ollama Specific Guide](ollama_specific.md) 2. **OpenAI Specific Guide:** Explore OpenAI's advanced features, such as logit bias and model-specific parameters. - - [OpenAI Specific Guide](openai_specific.md) + - [OpenAI Specific Guide](openai_specific.md) 3. **Replicate Specific Guide**: Discover Replicate's distinctive offerings, including model versioning and custom deployment options. - - [Replicate Specific Guide](replicate_specific.md) + - [Replicate Specific Guide](replicate_specific.md) ## Advanced Usage Topics 4. **Optimizing Performance:** Tips and tricks for improving response time, reducing token usage, and enhancing overall efficiency. - - Soon + - Soon 5. **Handling Long Conversations:** Strategies for managing context in extended dialogues and multi-turn interactions. - - Soon + - Soon 6. **Custom Prompting Techniques:** Advanced prompting methods to extract more accurate and relevant responses from AI models. - - Soon + - Soon 7. **Error Handling and Retry Strategies:** Best practices for gracefully managing API errors and implementing effective retry mechanisms. - - [Error Handling and Retry Strategies](error_handling.md) + - [Error Handling and Retry Strategies](error_handling.md) 8. **Security and Privacy Considerations:** Guidelines for ensuring data security and maintaining user privacy when working with AI APIs. - - Soon + - Soon Each guide in this section is designed to provide you with a deeper understanding of ClientAI's capabilities and how to leverage them effectively in your projects. \ No newline at end of file diff --git a/docs/extending.md b/docs/extending.md index 42fa9e4..9afae74 100644 --- a/docs/extending.md +++ b/docs/extending.md @@ -6,12 +6,15 @@ This guide will walk you through the process of adding support for a new AI prov To add a new provider, you'll need to: -1. Create a new directory for the provider -2. Implement the provider-specific types -3. Implement the provider class -4. Update the main ClientAI class -5. Update the package constants -6. Add tests for the new provider +1. [Create a new directory for the provider](#step-1-create-a-new-directory) +2. [Implement the provider-specific types](#step-2-implement-provider-specific-types) +3. [Implement the provider class](#step-3-implement-the-provider-class) +4. [Implement Unified Error Handling](#step-4-implement-unified-error-handling) +5. [Update the main ClientAI class](#step-5-update-the-main-clientai-class) +6. [Update the package constants](#step-6-update-package-constants-and-dependencies) +7. [Add tests for the new provider](#step-7-add-tests) +8. [Test Error Handling](#step-8-test-error-handling) +9. [Update Documentation](#step-9-update-documentation) Let's go through each step in detail. @@ -106,82 +109,82 @@ Before implementing the provider class, set up error handling for your provider. 1. 
First, import the necessary error types: -```python -# clientai/newai/provider.py - -from ..exceptions import ( - APIError, - AuthenticationError, - ClientAIError, - InvalidRequestError, - ModelError, - RateLimitError, - TimeoutError, -) -``` + ```python + # clientai/newai/provider.py + + from ..exceptions import ( + APIError, + AuthenticationError, + ClientAIError, + InvalidRequestError, + ModelError, + RateLimitError, + TimeoutError, + ) + ``` 2. Implement the error mapping method in your provider class: -```python -class Provider(AIProvider): - ... - def _map_exception_to_clientai_error( - self, - e: Exception, - status_code: Optional[int] = None - ) -> ClientAIError: - """ - Maps NewAI-specific exceptions to ClientAI exceptions. - - Args: - e: The caught exception - status_code: Optional HTTP status code - - Returns: - ClientAIError: The appropriate ClientAI exception - """ - error_message = str(e) - status_code = status_code or getattr(e, "status_code", None) - - # Map NewAI-specific exceptions to ClientAI exceptions - if isinstance(e, NewAIAuthError): - return AuthenticationError( - error_message, - status_code=401, - original_error=e - ) - elif isinstance(e, NewAIRateLimitError): - return RateLimitError( - error_message, - status_code=429, - original_error=e - ) - elif "model not found" in error_message.lower(): - return ModelError( - error_message, - status_code=404, - original_error=e - ) - elif isinstance(e, NewAIInvalidRequestError): - return InvalidRequestError( - error_message, - status_code=400, - original_error=e - ) - elif isinstance(e, NewAITimeoutError): - return TimeoutError( + ```python + class Provider(AIProvider): + ... + def _map_exception_to_clientai_error( + self, + e: Exception, + status_code: Optional[int] = None + ) -> ClientAIError: + """ + Maps NewAI-specific exceptions to ClientAI exceptions. + + Args: + e: The caught exception + status_code: Optional HTTP status code + + Returns: + ClientAIError: The appropriate ClientAI exception + """ + error_message = str(e) + status_code = status_code or getattr(e, "status_code", None) + + # Map NewAI-specific exceptions to ClientAI exceptions + if isinstance(e, NewAIAuthError): + return AuthenticationError( + error_message, + status_code=401, + original_error=e + ) + elif isinstance(e, NewAIRateLimitError): + return RateLimitError( + error_message, + status_code=429, + original_error=e + ) + elif "model not found" in error_message.lower(): + return ModelError( + error_message, + status_code=404, + original_error=e + ) + elif isinstance(e, NewAIInvalidRequestError): + return InvalidRequestError( + error_message, + status_code=400, + original_error=e + ) + elif isinstance(e, NewAITimeoutError): + return TimeoutError( + error_message, + status_code=408, + original_error=e + ) + + # Default to APIError for unknown errors + return APIError( error_message, - status_code=408, + status_code, original_error=e ) - - # Default to APIError for unknown errors - return APIError( - error_message, - status_code, - original_error=e - ) -``` + ``` ## Step 5: Update the Main ClientAI Class @@ -236,54 +239,54 @@ Update the `clientai/client_ai.py` file to include support for your new provider 1. In the `clientai/_constants.py` file, add a constant for your new provider: -```python -NEWAI_INSTALLED = find_spec("newai") is not None -``` + ```python + NEWAI_INSTALLED = find_spec("newai") is not None + ``` 2. 
Update the `clientai/__init__.py` file to export the new constant: -```python -from ._constants import NEWAI_INSTALLED -__all__ = [ - # ... existing exports ... - "NEWAI_INSTALLED", -] -``` + ```python + from ._constants import NEWAI_INSTALLED + __all__ = [ + # ... existing exports ... + "NEWAI_INSTALLED", + ] + ``` 3. Update the `pyproject.toml` file to include the new provider as an optional dependency: -```toml -[tool.poetry.dependencies] -python = "^3.9" -pydantic = "^2.9.2" -openai = {version = "^1.50.2", optional = true} -replicate = {version = "^0.34.1", optional = true} -ollama = {version = "^0.3.3", optional = true} -newai-package = {version = "^1.0.0", optional = true} # Add this line -``` + ```toml + [tool.poetry.dependencies] + python = "^3.9" + pydantic = "^2.9.2" + openai = {version = "^1.50.2", optional = true} + replicate = {version = "^0.34.1", optional = true} + ollama = {version = "^0.3.3", optional = true} + newai-package = {version = "^1.0.0", optional = true} # Add this line + ``` 4. Define an optional group for the new provider: -```toml -[tool.poetry.group.newai] -optional = true + ```toml + [tool.poetry.group.newai] + optional = true -[tool.poetry.group.newai.dependencies] -newai-package = "^1.0.0" -``` + [tool.poetry.group.newai.dependencies] + newai-package = "^1.0.0" + ``` 5. Include the new provider in the development dependencies: -```toml -[tool.poetry.group.dev.dependencies] -ruff = "^0.6.8" -pytest = "^8.3.3" -mypy = "1.9.0" -openai = "^1.50.2" -replicate = "^0.34.1" -ollama = "^0.3.3" -newai-package = "^1.0.0" # Add this line -``` + ```toml + [tool.poetry.group.dev.dependencies] + ruff = "^0.6.8" + pytest = "^8.3.3" + mypy = "1.9.0" + openai = "^1.50.2" + replicate = "^0.34.1" + ollama = "^0.3.3" + newai-package = "^1.0.0" # Add this line + ``` 6. Run `poetry update` to update the `poetry.lock` file with the new dependencies. diff --git a/docs/quick-start.md b/docs/quick-start.md index e9fb56b..428da08 100644 --- a/docs/quick-start.md +++ b/docs/quick-start.md @@ -105,6 +105,37 @@ response = client.chat( print(response) ``` +### Ollama Server Management + +If you're running Ollama locally, ClientAI provides a convenient way to manage the Ollama server: + +```python title="ollama_manager.py" +from clientai.ollama import OllamaManager + +# Start and automatically stop the server using a context manager +with OllamaManager() as manager: + # Server is now running + client = ClientAI('ollama') + response = client.generate_text("Hello, world!", model="llama2") + print(response) +``` + +You can also configure basic server settings: + +```python +from clientai.ollama import OllamaManager, OllamaServerConfig + +config = OllamaServerConfig( + host="127.0.0.1", + port=11434, + gpu_layers=35 # Optional: Number of layers to run on GPU +) + +with OllamaManager(config) as manager: + # Your code here + pass +``` + ## Next Steps Now that you've seen the basics of ClientAI, you can: diff --git a/docs/usage/ollama_manager.md b/docs/usage/ollama_manager.md index 2fa7215..7cf2162 100644 --- a/docs/usage/ollama_manager.md +++ b/docs/usage/ollama_manager.md @@ -5,6 +5,7 @@ Ollama Manager provides a streamlined way to prototype and develop applications using Ollama's AI models. Instead of manually managing the Ollama server process, installing it as a service, or running it in a separate terminal, Ollama Manager handles the entire lifecycle programmatically. 
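For example, beyond the context-manager pattern shown in the quick start, the same lifecycle can be driven explicitly. The following is a minimal sketch under the assumption that `OllamaManager` exposes `start()` and `stop()` methods; prefer the `with` form where possible, since it guarantees shutdown even when a request fails.

```python
from clientai import ClientAI
from clientai.ollama import OllamaManager

# Explicit lifecycle management (sketch): start the server, use it, and
# make sure it is stopped again even if an exception is raised.
manager = OllamaManager()
manager.start()
try:
    client = ClientAI("ollama")
    print(client.generate_text("Hello, world!", model="llama2"))
finally:
    manager.stop()
```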
**Key Benefits for Prototyping:** + - Start/stop Ollama server automatically within your Python code - Configure resources dynamically based on your needs - Handle multiple server instances for testing @@ -234,57 +235,68 @@ config = OllamaServerConfig( Each setting explained: **Server Settings:** + - `host`: IP address to bind the server to - - Default: "127.0.0.1" (localhost) - - Use "0.0.0.0" to allow external connections + - Default: "127.0.0.1" (localhost) + - Use "0.0.0.0" to allow external connections + - `port`: Port number for the server - - Default: 11434 - - Change if default port is in use + - Default: 11434 + - Change if default port is in use + - `timeout`: Maximum time to wait for server startup - - Unit: seconds - - Increase for slower systems + - Unit: seconds + - Increase for slower systems + - `check_interval`: Time between server health checks - - Unit: seconds - - Adjust based on system responsiveness + - Unit: seconds + - Adjust based on system responsiveness **GPU Settings:** + - `gpu_layers`: Number of model layers to offload to GPU - - Higher = more GPU utilization - - Lower = more CPU utilization - - Model-dependent (typically 24-40) + - Higher = more GPU utilization + - Lower = more CPU utilization + - Model-dependent (typically 24-40) + - `gpu_memory_fraction`: Portion of GPU memory to use - - Range: 0.0 to 1.0 - - Higher values may improve performance - - Lower values leave room for other applications + - Range: 0.0 to 1.0 + - Higher values may improve performance + - Lower values leave room for other applications + - `gpu_devices`: Specific GPU devices to use - - Single GPU: `gpu_devices=0` - - Multiple GPUs: `gpu_devices=[0, 1]` - - None: `gpu_devices=None` + - Single GPU: `gpu_devices=0` + - Multiple GPUs: `gpu_devices=[0, 1]` + - None: `gpu_devices=None` **CPU Settings:** + - `cpu_threads`: Number of CPU threads to use - - Default: System dependent - - Recommended: Leave some threads for system - - Example: `os.cpu_count() - 2` + - Default: System dependent + - Recommended: Leave some threads for system + - Example: `os.cpu_count() - 2` + - `memory_limit`: Maximum RAM allocation - - Must use `GiB` or `MiB` units - - Examples: "8GiB", "16384MiB" - - Should not exceed available system RAM + - Must use `GiB` or `MiB` units + - Examples: "8GiB", "16384MiB" + - Should not exceed available system RAM **Compute Settings:** + - `compute_unit`: Preferred compute device - - "auto": Let Ollama decide (recommended) - - "cpu": Force CPU-only operation - - "gpu": Force GPU operation if available + - "auto": Let Ollama decide (recommended) + - "cpu": Force CPU-only operation + - "gpu": Force GPU operation if available **Additional Settings:** + - `env_vars`: Additional environment variables - - Used for platform-specific settings - - Example: `{"CUDA_VISIBLE_DEVICES": "0,1"}` + - Used for platform-specific settings + - Example: `{"CUDA_VISIBLE_DEVICES": "0,1"}` + - `extra_args`: Additional CLI arguments - - Passed directly to Ollama server - - Example: `["--verbose", "--debug"]` -``` + - Passed directly to Ollama server + - Example: `["--verbose", "--debug"]` ## Common Use Cases
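Before diving into specific scenarios, the sketch below pulls several of the settings described above into one development-oriented configuration. The parameter names follow the list above; the concrete values (port, layer count, memory fraction, thread count, and the `llama2` model) are illustrative and should be tuned to your hardware.

```python
import os

from clientai import ClientAI
from clientai.ollama import OllamaManager, OllamaServerConfig

# Illustrative development profile combining the settings explained above.
config = OllamaServerConfig(
    host="127.0.0.1",                        # local-only access
    port=11434,                              # default; change if already in use
    gpu_layers=32,                           # partial GPU offload (model-dependent)
    gpu_memory_fraction=0.8,                 # leave some VRAM for other applications
    cpu_threads=(os.cpu_count() or 4) - 2,   # keep a couple of threads for the system
    memory_limit="8GiB",                     # must use GiB or MiB units
    compute_unit="auto",                     # let Ollama pick the device
    env_vars={"CUDA_VISIBLE_DEVICES": "0"},  # platform-specific override
)

with OllamaManager(config):
    client = ClientAI("ollama")
    print(client.generate_text("Hello, world!", model="llama2"))
```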