
Commit d18c5f3
Merge branch 'develop' into 'fb-leap-1141'
Gondragos committed Jul 1, 2024
2 parents 5007e0d + 6fe6c5f
Showing 31 changed files with 404 additions and 34 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/docker-build-ontop.yml
@@ -91,7 +91,7 @@ jobs:
password: ${{ secrets.DOCKERHUB_TOKEN }}

- name: Push Docker image
-uses: docker/build-push-action@v6.1.0
+uses: docker/build-push-action@v6.2.0
id: docker_build_and_push
with:
context: .
2 changes: 1 addition & 1 deletion .github/workflows/docker-build-ubi.yml
@@ -150,7 +150,7 @@ jobs:
core.setOutput("ubi-tags", ubiTags.join(','));
- name: Build and push ubi
-uses: docker/build-push-action@v6.1.0
+uses: docker/build-push-action@v6.2.0
id: docker_build_and_push_ubi
with:
context: .
2 changes: 1 addition & 1 deletion .github/workflows/docker-build.yml
@@ -112,7 +112,7 @@ jobs:
fi
- name: Push Docker image
-uses: docker/build-push-action@v6.1.0
+uses: docker/build-push-action@v6.2.0
id: docker_build_and_push
with:
context: .
2 changes: 1 addition & 1 deletion .github/workflows/docker-release-promote.yml
@@ -208,7 +208,7 @@ jobs:
EOF
- name: Build and Push Release Ubuntu Docker image
-uses: docker/build-push-action@v6.1.0
+uses: docker/build-push-action@v6.2.0
id: docker_build
with:
context: ${{ steps.release_dockerfile.outputs.release_dir }}
20 changes: 20 additions & 0 deletions docs/source/guide/label_studio_compare.md
@@ -138,6 +138,25 @@ Label Studio is available as open source software as well as an [Enterprise clo
<td style="text-align:center">✔️</td>
</tr>

+<tr>
+<td colspan="3"><b>Prompts (Beta)</b></td>
+</tr>
+<tr>
+<td><a href="https://docs.humansignal.com/guide/prompts_overview">Fully automated data labeling using GenAI.</a></td>
+<td style="text-align:center">❌</td>
+<td style="text-align:center">✔️</td>
+</tr>
+<tr>
+<td><a href="https://docs.humansignal.com/guide/prompts_draft">Evaluate and fine-tune LLM prompts against a ground truth dataset.</a></td>
+<td style="text-align:center">❌</td>
+<td style="text-align:center">✔️</td>
+</tr>
+<tr>
+<td><a href="https://docs.humansignal.com/guide/prompts_predictions">Bootstrap your labeling project using auto-generated predictions.</a></td>
+<td style="text-align:center">❌</td>
+<td style="text-align:center">✔️</td>
+</tr>
+
<tr>
<td colspan="4"><b>User Management</b></td>
</tr>
@@ -244,3 +263,4 @@ Label Studio is available as open source software as well as an [Enterprise clo
<td style="text-align:center">✔️</td>
</tr>
</table>

8 changes: 6 additions & 2 deletions docs/source/guide/project_settings_lse.md
@@ -157,7 +157,7 @@ Configure additional settings for annotators.

<dd>

-If you have an ML backend or model connected, you can use this setting to determine whether tasks should be pre-labeled using predictions from the model. For more information, see [Integrate Label Studio into your machine learning pipeline](ml).
+If you have an ML backend or model connected, or if you're using [Prompts](prompts_overview) to generate predictions, you can use this setting to determine whether tasks should be pre-labeled using predictions. For more information, see [Integrate Label Studio into your machine learning pipeline](ml) and [Generate predictions from a prompt](prompts_predictions).

Use the drop-down menu to select the predictions source. For example, you can select a [connected model](#Model) or a set of [predictions](#Predictions).

@@ -479,7 +479,11 @@ And the following actions are available from the overflow menu next to a connect

## Predictions

-From here you can view predictions that have been imported or generated when executing the **Batch Predictions** action from the Data Manager. For more information on using predictions, see [Import pre-annotated data into Label Studio](predictions).
+From here you can view predictions that have been imported, generated with [Prompts](prompts_predictions), or generated when executing the **Batch Predictions** action from the Data Manager. For more information on using predictions, see [Import pre-annotated data into Label Studio](predictions).
+
+To remove predictions from the project, click the overflow menu next to the predictions set and select **Delete**.
+
+To determine which predictions are shown to annotators, use the [**Annotation > Live Predictions** section](#Annotation).

## Cloud storage

70 changes: 70 additions & 0 deletions docs/source/guide/prompts_create.md
@@ -0,0 +1,70 @@
---
title: Create a Prompt
short: Create a Prompt
tier: enterprise
type: guide
order: 0
order_enterprise: 228
meta_title: Create a Prompt
meta_description: How to create a Prompt
section: Prompts
date: 2024-06-11 16:53:16
---


## Prerequisites

* An OpenAI API key.
* A project that meets the following criteria:
    * Text-based data set (meaning you are annotating text and not image or video files).
    * The labeling configuration for the project must be set up to use single-class classification (`choice="single"`); a minimal example configuration is shown below.
    * (Optional) At least one task with a [ground truth annotation](quality#Define-ground-truth-annotations-for-a-project). This is required if you want to evaluate the accuracy of your prompt; see [use cases](prompts_overview#Use-cases).
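
For reference, a minimal labeling configuration that meets these criteria might look like the following sketch. The `review` and `sentiment` names and the choice values are illustrative, not required:

```xml
<View>
  <!-- The text to classify; "review" is an illustrative field name -->
  <Text name="review" value="$review"/>
  <!-- choice="single" restricts each task to a single class -->
  <Choices name="sentiment" toName="review" choice="single">
    <Choice value="positive"/>
    <Choice value="negative"/>
    <Choice value="neutral"/>
  </Choices>
</View>
```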

## API key

You can only specify one OpenAI API key per organization, and it only needs to be added once.

Once added, it is automatically used for all new Prompts.

To remove the key, click **API Keys** in the upper right of the Prompts page. You'll have the option to remove the key and add a new one.

## Create a Prompt

From the Prompts page, click **Create Prompt** in the upper right and then complete the following fields:

<div class="noheader rowheader">

| | |
| --- | --- |
| Name | Enter a name for the Prompt. |
| Description | Enter a description for the Prompt. |
| Type | Select the Prompt model type. At this time, we only support [text classification](#Text-classification). |
| Target Project | Select the project you want to use. If you don't have any eligible projects, you will see an error message. <br><br>See the note below. |
| Classes | This list is automatically generated from the labeling configuration of the target project. |

</div>

!!! note Eligible projects
    Target projects must meet the following criteria:
    * The labeling configuration for the project must use single-class classification (`choice="single"`).
    * The project must include text data (it cannot contain only unsupported data types such as image, audio, or video).
    * You must have access to the project. If you are in the Manager role, you need to be added to the project to have access.

![Screenshot of the create model page](/images/prompts/model_create.png)

## Types

### Text classification

At present, Prompts only supports single-label text classification tasks.

Text classification is the process of assigning predefined categories or labels to segments of text based on their content. This involves analyzing the text and determining which category or label best describes its subject, sentiment, or purpose. The goal is to organize and categorize textual data in a way that makes it easier to analyze, search, and utilize.

Text classification labeling tasks are fundamental in many applications, enabling efficient data organization, improving searchability, and providing valuable insights through data analysis. Some examples include:

* **Spam Detection**: Classifying emails as "spam" or "ham" (not spam).
* **Sentiment Analysis**: Categorizing user reviews as "positive," "negative," or "neutral."
* **Topic Categorization**: Assigning articles to categories like "politics," "sports," "technology," etc.
* **Support Ticket Classification**: Labeling customer support tickets based on the issue type, such as "billing," "technical support," or "account management."
* **Content Moderation**: Identifying and labeling inappropriate content on social media platforms, such as "offensive language," "hate speech," or "harassment."
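
For reference, when a text classification prediction is created (whether imported or generated by Prompts), it follows Label Studio's standard JSON prediction format. A rough sketch, reusing the illustrative `sentiment` and `review` names from the example configuration above:

```json
{
  "model_version": "prompt-v1",
  "result": [
    {
      "from_name": "sentiment",
      "to_name": "review",
      "type": "choices",
      "value": { "choices": ["positive"] }
    }
  ]
}
```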

68 changes: 68 additions & 0 deletions docs/source/guide/prompts_draft.md
@@ -0,0 +1,68 @@
---
title: Draft and run prompts
short: Draft and run prompts
tier: enterprise
type: guide
order: 0
order_enterprise: 231
meta_title: Draft your Prompt
meta_description: Create and evaluate an LLM prompt
section: Prompts
date: 2024-06-12 14:09:09
---

With your [Prompt created](prompts_create), you can begin drafting your prompt content to run against baseline tasks.

## Draft a prompt and generate predictions


1. Select your base model. For a description of all OpenAI models, see [OpenAI's models overview](https://platform.openai.com/docs/models/models-overview).
2. In the **Prompt** field, enter your prompt. Keep in mind the following:
    * You must include the text class. (In the demo below, this is the `review` class.) Click the text class name to insert it into the prompt.
    * Although not strictly required, you should provide definitions for each class to ensure prediction accuracy and to help [add context](#Add-context). (A full example prompt is shown after the demo video below.)
3. Select your baseline:
    * **All Project Tasks** - Generate predictions for all tasks in the project. Depending on the size of your project, this might take some time to process. This does not generate an accuracy score for the prompt.

      See the [Bootstrapping projects with prompts](prompts_overview#Bootstrapping-projects-with-Prompts) use case.
    * **Sample Tasks** - Generate predictions for the first 20 tasks in the project. This does not generate an accuracy score for the prompt.

      See the [Bootstrapping projects with prompts](prompts_overview#Bootstrapping-projects-with-Prompts) use case.
    * **Ground Truths** - Generate predictions and a prompt accuracy score for all tasks with ground truth annotations. This option is only available if your project has ground truth annotations.

      See the [Auto-labeling with Prompts](prompts_overview#Auto-labeling-with-Prompts) and [Prompt evaluation and fine-tuning](prompts_overview#Prompt-evaluation-and-fine-tuning) use cases.
4. Click **Save**.
5. Click **Evaluate** (if running against a ground truth baseline) or **Run**.

!!! warning
    When you click **Evaluate** or **Run**, you will create predictions for each task in the baseline you selected and overwrite any previous predictions you generated with this prompt.

    Evaluating your Prompts can result in multiple predictions on your tasks: if you have multiple Prompts for one Project, or if you click both **Evaluate**/**Run** and **Get Predictions for All Tasks from a Prompt**, you will see multiple predictions for tasks in the Data Manager.

<br><br>
<video src="../images/prompts/prompts.mp4" controls="controls" style="max-width: 800px;" class="gif-border" />
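
For example, a draft prompt for the review-classification demo above might look like the following sketch. The wording and class definitions are illustrative, and `review` stands for the text-class placeholder you insert by clicking its name:

```
Given the following product review: review

Classify this review as one of the following categories:
- "positive": the customer expresses satisfaction
- "negative": the customer expresses dissatisfaction
- "neutral": the review is neither clearly positive nor negative

Respond with the category name only.
```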

## Drafting effective prompts

For a comprehensive guide to drafting prompts, see [The Prompt Report: A Systematic Survey of Prompting Techniques](https://arxiv.org/abs/2406.06608) or OpenAI's guide to [Prompt Engineering](https://platform.openai.com/docs/guides/prompt-engineering).

### Text placement

When you place your text class in the prompt (`review` in the demo above), this placeholder will be replaced by the actual text.

Depending on the length and complexity of your text, inserting it into the middle of another sentence or thought could potentially confuse the LLM.

For example, instead of "*Classify `text` as one of the following:*", try to structure it as something like, "*Given the following text: `text`. Classify this text as one of the following:*."

### Define your objective

The first step to composing an effective prompt is to clearly define the task you want to accomplish. Your prompt should explicitly state that the goal is to classify the given text into predefined categories. This sets clear expectations for the model. For instance, instead of a vague request like "Analyze this text," you should say, "Classify the following text into categories such as 'spam' or 'not spam'." Clarity helps the model understand the exact task and reduces ambiguity in the responses.

### Add context

Context is crucial in guiding the model towards accurate classification. Providing background information or examples can significantly enhance the effectiveness of the prompt. For example, if you are classifying customer reviews, include a brief description of what constitutes a positive, negative, or neutral review. You could frame it as, "Classify the following customer review as 'positive,' 'negative,' or 'neutral.' A positive review indicates customer satisfaction, a negative review indicates dissatisfaction, and a neutral review is neither overly positive nor negative." This additional context helps the model align its responses with your specific requirements.

### Specificity

Specificity in your prompt enhances the precision of the model's output. This includes specifying the format you want for the response, any particular keywords or phrases that are important, and any other relevant details. For instance, "Please classify the following text and provide the category in a single word: 'positive,' 'negative,' or 'neutral.'" By being specific, you help ensure that the model's output is consistent and aligned with your expectations.

