diff --git a/docs/tutorials/tfx/template.ipynb b/docs/tutorials/tfx/template.ipynb
index 8ac66d4c7d..8c21af67f1 100644
--- a/docs/tutorials/tfx/template.ipynb
+++ b/docs/tutorials/tfx/template.ipynb
@@ -48,15 +48,15 @@
  "Note: We recommend running this tutorial on Google Cloud Vertex AI Workbench. [Launch this notebook on Vertex AI Workbench](https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?q=download_url%3Dhttps%253A%252F%252Fraw.githubusercontent.com%252Ftensorflow%252Ftfx%252Fmaster%252Fdocs%252Ftutorials%252Ftfx%252Ftemplate.ipynb).\n",
  "\n",
  "\n",
- "\u003cdiv class=\"devsite-table-wrapper\"\u003e\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n",
- "\u003ctd\u003e\u003ca target=\"_blank\" href=\"https://www.tensorflow.org/tfx/tutorials/tfx/template\"\u003e\n",
- "\u003cimg src=\"https://www.tensorflow.org/images/tf_logo_32px.png\"/\u003eView on TensorFlow.org\u003c/a\u003e\u003c/td\u003e\n",
- "\u003ctd\u003e\u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/tfx/blob/master/docs/tutorials/tfx/template.ipynb\"\u003e\n",
- "\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\"\u003eRun in Google Colab\u003c/a\u003e\u003c/td\u003e\n",
- "\u003ctd\u003e\u003ca target=\"_blank\" href=\"https://github.com/tensorflow/tfx/tree/master/docs/tutorials/tfx/template.ipynb\"\u003e\n",
- "\u003cimg width=32px src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\"\u003eView source on GitHub\u003c/a\u003e\u003c/td\u003e\n",
- "\u003ctd\u003e\u003ca href=\"https://storage.googleapis.com/tensorflow_docs/tfx/docs/tutorials/tfx/template.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/download_logo_32px.png\" /\u003eDownload notebook\u003c/a\u003e\u003c/td\u003e\n",
- "\u003c/table\u003e\u003c/div\u003e"
+ "<div class=\"devsite-table-wrapper\"><table class=\"tfo-notebook-buttons\" align=\"left\">\n",
+ "<td><a target=\"_blank\" href=\"https://www.tensorflow.org/tfx/tutorials/tfx/template\">\n",
+ "<img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\"/>View on TensorFlow.org</a></td>\n",
+ "<td><a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/tfx/blob/master/docs/tutorials/tfx/template.ipynb\">\n",
+ "<img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\">Run in Google Colab</a></td>\n",
+ "<td><a target=\"_blank\" href=\"https://github.com/tensorflow/tfx/tree/master/docs/tutorials/tfx/template.ipynb\">\n",
+ "<img width=32px src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\">View source on GitHub</a></td>\n",
+ "<td><a href=\"https://storage.googleapis.com/tensorflow_docs/tfx/docs/tutorials/tfx/template.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a></td>\n",
+ "</table></div>"
" ] }, { @@ -156,7 +156,7 @@ "outputs": [], "source": [ "# Read GCP project id from env.\n", - "shell_output=!gcloud config list --format 'value(core.project)' 2\u003e/dev/null\n", + "shell_output=!gcloud config list --format 'value(core.project)' 2>/dev/null\n", "GOOGLE_CLOUD_PROJECT=shell_output[0]\n", "%env GOOGLE_CLOUD_PROJECT={GOOGLE_CLOUD_PROJECT}\n", "print(\"GCP project ID:\" + GOOGLE_CLOUD_PROJECT)" @@ -168,9 +168,9 @@ "id": "A_6r4uzE0oky" }, "source": [ - "We also need to access your KFP cluster. You can access it in your Google Cloud Console under \"AI Platform \u003e Pipeline\" menu. The \"endpoint\" of the KFP cluster can be found from the URL of the Pipelines dashboard, or you can get it from the URL of the Getting Started page where you launched this notebook. Let's create an `ENDPOINT` environment variable and set it to the KFP cluster endpoint. **ENDPOINT should contain only the hostname part of the URL.** For example, if the URL of the KFP dashboard is `https://1e9deb537390ca22-dot-asia-east1.pipelines.googleusercontent.com/#/start`, ENDPOINT value becomes `1e9deb537390ca22-dot-asia-east1.pipelines.googleusercontent.com`.\n", + "We also need to access your KFP cluster. You can access it in your Google Cloud Console under \"AI Platform > Pipeline\" menu. The \"endpoint\" of the KFP cluster can be found from the URL of the Pipelines dashboard, or you can get it from the URL of the Getting Started page where you launched this notebook. Let's create an `ENDPOINT` environment variable and set it to the KFP cluster endpoint. **ENDPOINT should contain only the hostname part of the URL.** For example, if the URL of the KFP dashboard is `https://1e9deb537390ca22-dot-asia-east1.pipelines.googleusercontent.com/#/start`, ENDPOINT value becomes `1e9deb537390ca22-dot-asia-east1.pipelines.googleusercontent.com`.\n", "\n", - "\u003e**NOTE: You MUST set your ENDPOINT value below.**" + ">**NOTE: You MUST set your ENDPOINT value below.**" ] }, { @@ -295,7 +295,7 @@ "id": "1tEYUQxH0olO" }, "source": [ - "\u003eNOTE: Don't forget to change directory in `File Browser` on the left by clicking into the project directory once it is created." + ">NOTE: Don't forget to change directory in `File Browser` on the left by clicking into the project directory once it is created." ] }, { @@ -344,7 +344,7 @@ "outputs": [], "source": [ "!{sys.executable} -m models.features_test\n", - "!{sys.executable} -m models.keras.model_test\n" + "!{sys.executable} -m models.keras_model.model_test\n" ] }, { @@ -355,7 +355,7 @@ "source": [ "## Step 4. Run your first TFX pipeline\n", "\n", - "Components in the TFX pipeline will generate outputs for each run as [ML Metadata Artifacts](https://www.tensorflow.org/tfx/guide/mlmd), and they need to be stored somewhere. You can use any storage which the KFP cluster can access, and for this example we will use Google Cloud Storage (GCS). A default GCS bucket should have been created automatically. Its name will be `\u003cyour-project-id\u003e-kubeflowpipelines-default`.\n" + "Components in the TFX pipeline will generate outputs for each run as [ML Metadata Artifacts](https://www.tensorflow.org/tfx/guide/mlmd), and they need to be stored somewhere. You can use any storage which the KFP cluster can access, and for this example we will use Google Cloud Storage (GCS). A default GCS bucket should have been created automatically. 
@@ -295,7 +295,7 @@ "id": "1tEYUQxH0olO" }, "source": [
- "\u003eNOTE: Don't forget to change directory in `File Browser` on the left by clicking into the project directory once it is created."
+ ">NOTE: Don't forget to change directory in `File Browser` on the left by clicking into the project directory once it is created."
  ] }, {
@@ -344,7 +344,7 @@ "outputs": [], "source": [
  "!{sys.executable} -m models.features_test\n",
- "!{sys.executable} -m models.keras.model_test\n"
+ "!{sys.executable} -m models.keras_model.model_test\n"
  ] }, {
@@ -355,7 +355,7 @@ "source": [
  "## Step 4. Run your first TFX pipeline\n",
  "\n",
- "Components in the TFX pipeline will generate outputs for each run as [ML Metadata Artifacts](https://www.tensorflow.org/tfx/guide/mlmd), and they need to be stored somewhere. You can use any storage which the KFP cluster can access, and for this example we will use Google Cloud Storage (GCS). A default GCS bucket should have been created automatically. Its name will be `\u003cyour-project-id\u003e-kubeflowpipelines-default`.\n"
+ "Components in the TFX pipeline will generate outputs for each run as [ML Metadata Artifacts](https://www.tensorflow.org/tfx/guide/mlmd), and they need to be stored somewhere. You can use any storage which the KFP cluster can access, and for this example we will use Google Cloud Storage (GCS). A default GCS bucket should have been created automatically. Its name will be `<your-project-id>-kubeflowpipelines-default`.\n"
  ] }, {
@@ -386,7 +386,7 @@ "source": [
  "Let's create a TFX pipeline using the `tfx pipeline create` command.\n",
  "\n",
- "\u003eNote: When creating a pipeline for KFP, we need a container image which will be used to run our pipeline. And `skaffold` will build the image for us. Because skaffold pulls base images from the docker hub, it will take 5~10 minutes when we build the image for the first time, but it will take much less time from the second build." 
+ ">Note: When creating a pipeline for KFP, we need a container image which will be used to run our pipeline, and `skaffold` will build that image for us. Because skaffold pulls base images from Docker Hub, the first build will take 5-10 minutes, but subsequent builds will be much faster."
  ] }, {
@@ -443,7 +443,7 @@
  "However, we recommend visiting the KFP Dashboard. You can access the KFP Dashboard from the Cloud AI Platform Pipelines menu in Google Cloud Console. Once you visit the dashboard, you will be able to find the pipeline, and access a wealth of information about the pipeline.\n",
  "For example, you can find your runs under the *Experiments* menu, and when you open your execution run under Experiments you can find all your artifacts from the pipeline under *Artifacts* menu.\n",
  "\n",
- "\u003eNote: If your pipeline run fails, you can see detailed logs for each TFX component in the Experiments tab in the KFP Dashboard.\n",
+ ">Note: If your pipeline run fails, you can see detailed logs for each TFX component in the Experiments tab in the KFP Dashboard.\n",
  " \n",
  "One of the major sources of failure is permission related problems. Please make sure your KFP cluster has permissions to access Google Cloud APIs. This can be configured [when you create a KFP cluster in GCP](https://cloud.google.com/ai-platform/pipelines/docs/setting-up), or see [Troubleshooting document in GCP](https://cloud.google.com/ai-platform/pipelines/docs/troubleshooting)."
  ] }, {
@@ -458,7 +458,7 @@
  "\n",
  "In this step, you will add components for data validation including `StatisticsGen`, `SchemaGen`, and `ExampleValidator`. If you are interested in data validation, please see [Get started with Tensorflow Data Validation](https://www.tensorflow.org/tfx/data_validation/get_started).\n",
  "\n",
- "\u003e**Double-click to change directory to `pipeline` and double-click again to open `pipeline.py`**. Find and uncomment the 3 lines which add `StatisticsGen`, `SchemaGen`, and `ExampleValidator` to the pipeline. (Tip: search for comments containing `TODO(step 5):`). Make sure to save `pipeline.py` after you edit it.\n",
+ ">**Double-click to change directory to `pipeline` and double-click again to open `pipeline.py`**. Find and uncomment the 3 lines which add `StatisticsGen`, `SchemaGen`, and `ExampleValidator` to the pipeline. (Tip: search for comments containing `TODO(step 5):`). Make sure to save `pipeline.py` after you edit it.\n",
  "\n",
  "You now need to update the existing pipeline with modified pipeline definition. Use the `tfx pipeline update` command to update your pipeline, followed by the `tfx run create` command to create a new execution run of your updated pipeline.\n"
  ] }, {
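As a rough sketch of the update-then-run cycle this step describes (the `tfx` CLI subcommands exist as shown, but flag spellings have varied across TFX releases, so verify against `tfx pipeline update --help`):

```python
import subprocess

# Placeholders -- use your own KFP endpoint and the pipeline name from the template.
ENDPOINT = "1e9deb537390ca22-dot-asia-east1.pipelines.googleusercontent.com"
PIPELINE_NAME = "my_pipeline"

# Re-upload the modified pipeline definition to the KFP cluster...
subprocess.run(
    ["tfx", "pipeline", "update",
     "--pipeline-path=kubeflow_runner.py",
     "--endpoint=" + ENDPOINT],
    check=True,
)
# ...then start a new execution run of the updated pipeline.
subprocess.run(
    ["tfx", "run", "create",
     "--pipeline-name=" + PIPELINE_NAME,
     "--endpoint=" + ENDPOINT],
    check=True,
)
```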
@@ -500,7 +500,7 @@
  "\n",
  "In this step, you will add components for training and model validation including `Transform`, `Trainer`, `Resolver`, `Evaluator`, and `Pusher`.\n",
  "\n",
- "\u003e**Double-click to open `pipeline.py`**. Find and uncomment the 5 lines which add `Transform`, `Trainer`, `Resolver`, `Evaluator` and `Pusher` to the pipeline. (Tip: search for `TODO(step 6):`)\n",
+ ">**Double-click to open `pipeline.py`**. Find and uncomment the 5 lines which add `Transform`, `Trainer`, `Resolver`, `Evaluator` and `Pusher` to the pipeline. (Tip: search for `TODO(step 6):`)\n",
  "\n",
  "As you did before, you now need to update the existing pipeline with the modified pipeline definition. The instructions are the same as Step 5. Update the pipeline using `tfx pipeline update`, and create an execution run using `tfx run create`.\n"
  ] }, {
@@ -545,17 +545,17 @@
  "\n",
  "[BigQuery](https://cloud.google.com/bigquery) is a serverless, highly scalable, and cost-effective cloud data warehouse. BigQuery can be used as a source for training examples in TFX. In this step, we will add `BigQueryExampleGen` to the pipeline.\n",
  "\n",
- "\u003e**Double-click to open `pipeline.py`**. Comment out `CsvExampleGen` and uncomment the line which creates an instance of `BigQueryExampleGen`. You also need to uncomment the `query` argument of the `create_pipeline` function.\n",
+ ">**Double-click to open `pipeline.py`**. Comment out `CsvExampleGen` and uncomment the line which creates an instance of `BigQueryExampleGen`. You also need to uncomment the `query` argument of the `create_pipeline` function.\n",
  "\n",
  "We need to specify which GCP project to use for BigQuery, and this is done by setting `--project` in `beam_pipeline_args` when creating a pipeline.\n",
  "\n",
- "\u003e**Double-click to open `configs.py`**. Uncomment the definition of `GOOGLE_CLOUD_REGION`, `BIG_QUERY_WITH_DIRECT_RUNNER_BEAM_PIPELINE_ARGS` and `BIG_QUERY_QUERY`. You should replace the region value in this file with the correct values for your GCP project.\n",
+ ">**Double-click to open `configs.py`**. Uncomment the definition of `GOOGLE_CLOUD_REGION`, `BIG_QUERY_WITH_DIRECT_RUNNER_BEAM_PIPELINE_ARGS` and `BIG_QUERY_QUERY`. You should replace the region value in this file with the correct value for your GCP project.\n",
  "\n",
- "\u003e**Note: You MUST set your GCP region in the `configs.py` file before proceeding.**\n",
+ ">**Note: You MUST set your GCP region in the `configs.py` file before proceeding.**\n",
  "\n",
- "\u003e**Change directory one level up.** Click the name of the directory above the file list. The name of the directory is the name of the pipeline which is `my_pipeline` if you didn't change.\n",
+ ">**Change directory one level up.** Click the name of the directory above the file list. The name of the directory is the name of the pipeline, which is `my_pipeline` if you didn't change it.\n",
  "\n",
- "\u003e**Double-click to open `kubeflow_runner.py`**. Uncomment two arguments, `query` and `beam_pipeline_args`, for the `create_pipeline` function.\n",
+ ">**Double-click to open `kubeflow_runner.py`**. Uncomment two arguments, `query` and `beam_pipeline_args`, for the `create_pipeline` function.\n",
  "\n",
  "Now the pipeline is ready to use BigQuery as an example source. Update the pipeline as before and create a new execution run as we did in step 5 and 6."
  ] }, {
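For orientation, a hedged sketch of roughly what the uncommented values in `configs.py` end up looking like (the constant names come from the template; the project, bucket, and region values below are placeholders you must replace):

```python
import os

# Placeholders -- substitute your own project, bucket, and region.
GOOGLE_CLOUD_PROJECT = "my-gcp-project"
GCS_BUCKET_NAME = GOOGLE_CLOUD_PROJECT + "-kubeflowpipelines-default"
GOOGLE_CLOUD_REGION = "us-central1"

# Beam flags that route the BigQuery read through your project, with a GCS
# location for temporary files.
BIG_QUERY_WITH_DIRECT_RUNNER_BEAM_PIPELINE_ARGS = [
    "--project=" + GOOGLE_CLOUD_PROJECT,
    "--temp_location=" + os.path.join("gs://", GCS_BUCKET_NAME, "tmp"),
]
```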
@@ -584,11 +584,11 @@
  "\n",
  "Several [TFX Components uses Apache Beam](https://www.tensorflow.org/tfx/guide/beam) to implement data-parallel pipelines, and it means that you can distribute data processing workloads using [Google Cloud Dataflow](https://cloud.google.com/dataflow/). In this step, we will set the Kubeflow orchestrator to use dataflow as the data processing back-end for Apache Beam.\n",
  "\n",
- "\u003e**Double-click `pipeline` to change directory, and double-click to open `configs.py`**. Uncomment the definition of `GOOGLE_CLOUD_REGION`, and `DATAFLOW_BEAM_PIPELINE_ARGS`.\n",
+ ">**Double-click `pipeline` to change directory, and double-click to open `configs.py`**. Uncomment the definition of `GOOGLE_CLOUD_REGION` and `DATAFLOW_BEAM_PIPELINE_ARGS`.\n",
  "\n",
- "\u003e**Change directory one level up.** Click the name of the directory above the file list. The name of the directory is the name of the pipeline which is `my_pipeline` if you didn't change.\n",
+ ">**Change directory one level up.** Click the name of the directory above the file list. The name of the directory is the name of the pipeline, which is `my_pipeline` if you didn't change it.\n",
  "\n",
- "\u003e**Double-click to open `kubeflow_runner.py`**. Uncomment `beam_pipeline_args`. (Also make sure to comment out current `beam_pipeline_args` that you added in Step 7.)\n",
+ ">**Double-click to open `kubeflow_runner.py`**. Uncomment `beam_pipeline_args`. (Also make sure to comment out the current `beam_pipeline_args` that you added in Step 7.)\n",
  "\n",
  "Now the pipeline is ready to use Dataflow. Update the pipeline and create an execution run as we did in step 5 and 6."
  ] }, {
@@ -626,11 +626,11 @@
  "\n",
  "TFX interoperates with several managed GCP services, such as [Cloud AI Platform for Training and Prediction](https://cloud.google.com/ai-platform/). You can set your `Trainer` component to use Cloud AI Platform Training, a managed service for training ML models. Moreover, when your model is built and ready to be served, you can *push* your model to Cloud AI Platform Prediction for serving. In this step, we will set our `Trainer` and `Pusher` component to use Cloud AI Platform services.\n",
  "\n",
- "\u003eBefore editing files, you might first have to enable *AI Platform Training \u0026 Prediction API*.\n",
+ ">Before editing files, you might first have to enable the *AI Platform Training & Prediction API*.\n",
  "\n",
- "\u003e**Double-click `pipeline` to change directory, and double-click to open `configs.py`**. Uncomment the definition of `GOOGLE_CLOUD_REGION`, `GCP_AI_PLATFORM_TRAINING_ARGS` and `GCP_AI_PLATFORM_SERVING_ARGS`. We will use our custom built container image to train a model in Cloud AI Platform Training, so we should set `masterConfig.imageUri` in `GCP_AI_PLATFORM_TRAINING_ARGS` to the same value as `CUSTOM_TFX_IMAGE` above.\n",
+ ">**Double-click `pipeline` to change directory, and double-click to open `configs.py`**. Uncomment the definition of `GOOGLE_CLOUD_REGION`, `GCP_AI_PLATFORM_TRAINING_ARGS` and `GCP_AI_PLATFORM_SERVING_ARGS`. We will use our custom-built container image to train a model in Cloud AI Platform Training, so we should set `masterConfig.imageUri` in `GCP_AI_PLATFORM_TRAINING_ARGS` to the same value as `CUSTOM_TFX_IMAGE` above.\n",
  "\n",
- "\u003e**Change directory one level up, and double-click to open `kubeflow_runner.py`**. Uncomment `ai_platform_training_args` and `ai_platform_serving_args`.\n",
+ ">**Change directory one level up, and double-click to open `kubeflow_runner.py`**. Uncomment `ai_platform_training_args` and `ai_platform_serving_args`.\n",
  "\n",
  "Update the pipeline and create an execution run as we did in step 5 and 6."
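Similarly, a sketch of the Dataflow and Cloud AI Platform settings these two steps uncomment in `configs.py` (names follow the template; every value shown is a placeholder assumption to adapt):

```python
# Placeholders -- substitute your own project, region, and image name.
GOOGLE_CLOUD_PROJECT = "my-gcp-project"
GOOGLE_CLOUD_REGION = "us-central1"
GCS_BUCKET_NAME = GOOGLE_CLOUD_PROJECT + "-kubeflowpipelines-default"
PIPELINE_NAME = "my_pipeline"
CUSTOM_TFX_IMAGE = "gcr.io/" + GOOGLE_CLOUD_PROJECT + "/tfx-pipeline"

# Step 8: run Beam-based components on Dataflow instead of a local runner.
DATAFLOW_BEAM_PIPELINE_ARGS = [
    "--project=" + GOOGLE_CLOUD_PROJECT,
    "--runner=DataflowRunner",
    "--region=" + GOOGLE_CLOUD_REGION,
    "--temp_location=gs://" + GCS_BUCKET_NAME + "/tmp",
]

# Step 9: train with the same custom image, and serve the pushed model from
# Cloud AI Platform Prediction.
GCP_AI_PLATFORM_TRAINING_ARGS = {
    "project": GOOGLE_CLOUD_PROJECT,
    "region": GOOGLE_CLOUD_REGION,
    "masterConfig": {"imageUri": CUSTOM_TFX_IMAGE},
}
GCP_AI_PLATFORM_SERVING_ARGS = {
    "model_name": PIPELINE_NAME,
    "project_id": GOOGLE_CLOUD_PROJECT,
    "regions": [GOOGLE_CLOUD_REGION],
}
```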
  ]
diff --git a/docs/tutorials/tfx/template_local.ipynb b/docs/tutorials/tfx/template_local.ipynb
index 4e57f82020..309f045cc0 100644
--- a/docs/tutorials/tfx/template_local.ipynb
+++ b/docs/tutorials/tfx/template_local.ipynb
@@ -45,15 +45,15 @@ "id": "XdSXv1DrxdLL" }, "source": [
- "\u003cdiv class=\"devsite-table-wrapper\"\u003e\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n",
- "\u003ctd\u003e\u003ca target=\"_blank\" href=\"https://www.tensorflow.org/tfx/tutorials/tfx/template_local\"\u003e\n",
- "\u003cimg src=\"https://www.tensorflow.org/images/tf_logo_32px.png\"/\u003eView on TensorFlow.org\u003c/a\u003e\u003c/td\u003e\n",
- "\u003ctd\u003e\u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/tfx/blob/master/docs/tutorials/tfx/template_local.ipynb\"\u003e\n",
- "\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\"\u003eRun in Google Colab\u003c/a\u003e\u003c/td\u003e\n",
- "\u003ctd\u003e\u003ca target=\"_blank\" href=\"https://github.com/tensorflow/tfx/tree/master/docs/tutorials/tfx/template_local.ipynb\"\u003e\n",
- "\u003cimg width=32px src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\"\u003eView source on GitHub\u003c/a\u003e\u003c/td\u003e\n",
- "\u003ctd\u003e\u003ca href=\"https://storage.googleapis.com/tensorflow_docs/tfx/docs/tutorials/tfx/template_local.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/download_logo_32px.png\" /\u003eDownload notebook\u003c/a\u003e\u003c/td\u003e\n",
- "\u003c/table\u003e\u003c/div\u003e"
+ "<div class=\"devsite-table-wrapper\"><table class=\"tfo-notebook-buttons\" align=\"left\">\n",
+ "<td><a target=\"_blank\" href=\"https://www.tensorflow.org/tfx/tutorials/tfx/template_local\">\n",
+ "<img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\"/>View on TensorFlow.org</a></td>\n",
+ "<td><a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/tfx/blob/master/docs/tutorials/tfx/template_local.ipynb\">\n",
+ "<img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\">Run in Google Colab</a></td>\n",
+ "<td><a target=\"_blank\" href=\"https://github.com/tensorflow/tfx/tree/master/docs/tutorials/tfx/template_local.ipynb\">\n",
+ "<img width=32px src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\">View source on GitHub</a></td>\n",
+ "<td><a href=\"https://storage.googleapis.com/tensorflow_docs/tfx/docs/tutorials/tfx/template_local.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a></td>\n",
+ "</table></div>"
" ] }, { @@ -79,7 +79,7 @@ "## Prerequisites\n", "\n", "* Linux / MacOS\n", - "* Python \u003e= 3.5.3\n", + "* Python >= 3.5.3\n", "\n", "You can get all prerequisites easily by [running this notebook on Google Colab](https://colab.sandbox.google.com/github/tensorflow/tfx/blob/master/docs/tutorials/tfx/template_local.ipynb).\n" ] @@ -128,7 +128,7 @@ "source": [ "NOTE: There might be some errors during package installation. For example,\n", "\n", - "\u003eERROR: some-package 0.some_version.1 has requirement other-package!=2.0.,\u0026lt;3,\u0026gt;=1.15, but you'll have other-package 2.0.0 which is incompatible.\n", + ">ERROR: some-package 0.some_version.1 has requirement other-package!=2.0.,<3,>=1.15, but you'll have other-package 2.0.0 which is incompatible.\n", "\n", "Please ignore these errors at this moment." ] @@ -326,7 +326,7 @@ "outputs": [], "source": [ "!{sys.executable} -m models.features_test\n", - "!{sys.executable} -m models.keras.model_test\n" + "!{sys.executable} -m models.keras_model.model_test\n" ] }, { @@ -398,13 +398,13 @@ "\n", "We will modify copied pipeline definition in `pipeline/pipeline.py`. If you are working on your local environment, use your favorite editor to edit the file. If you are working on Google Colab, \n", "\n", - "\u003e**Click folder icon on the left to open `Files` view**.\n", + ">**Click folder icon on the left to open `Files` view**.\n", "\n", - "\u003e**Click `my_pipeline` to open the directory and click `pipeline` directory to open and double-click `pipeline.py` to open the file**.\n", + ">**Click `my_pipeline` to open the directory and click `pipeline` directory to open and double-click `pipeline.py` to open the file**.\n", "\n", - "\u003eFind and uncomment the 3 lines which add `StatisticsGen`, `SchemaGen`, and `ExampleValidator` to the pipeline. (Tip: find comments containing `TODO(step 5):`).\n", + ">Find and uncomment the 3 lines which add `StatisticsGen`, `SchemaGen`, and `ExampleValidator` to the pipeline. (Tip: find comments containing `TODO(step 5):`).\n", "\n", - "\u003e Your change will be saved automatically in a few seconds. Make sure that the `*` mark in front of the `pipeline.py` disappeared in the tab title. **There is no save button or shortcut for the file editor in Colab. Python files in file editor can be saved to the runtime environment even in `playground` mode.**\n", + "> Your change will be saved automatically in a few seconds. Make sure that the `*` mark in front of the `pipeline.py` disappeared in the tab title. **There is no save button or shortcut for the file editor in Colab. Python files in file editor can be saved to the runtime environment even in `playground` mode.**\n", "\n", "You now need to update the existing pipeline with modified pipeline definition. Use the `tfx pipeline update` command to update your pipeline, followed by the `tfx run create` command to create a new execution run of your updated pipeline.\n", "\n", @@ -449,7 +449,7 @@ "\n", "In this step, you will add components for training and model validation including `Transform`, `Trainer`, `Resolver`, `Evaluator`, and `Pusher`.\n", "\n", - "\u003e **Open `pipeline/pipeline.py`**. Find and uncomment 5 lines which add `Transform`, `Trainer`, `Resolver`, `Evaluator` and `Pusher` to the pipeline. (Tip: find `TODO(step 6):`)\n", + "> **Open `pipeline/pipeline.py`**. Find and uncomment 5 lines which add `Transform`, `Trainer`, `Resolver`, `Evaluator` and `Pusher` to the pipeline. 
@@ -449,7 +449,7 @@
  "\n",
  "In this step, you will add components for training and model validation including `Transform`, `Trainer`, `Resolver`, `Evaluator`, and `Pusher`.\n",
  "\n",
- "\u003e **Open `pipeline/pipeline.py`**. Find and uncomment 5 lines which add `Transform`, `Trainer`, `Resolver`, `Evaluator` and `Pusher` to the pipeline. (Tip: find `TODO(step 6):`)\n",
+ "> **Open `pipeline/pipeline.py`**. Find and uncomment the 5 lines which add `Transform`, `Trainer`, `Resolver`, `Evaluator` and `Pusher` to the pipeline. (Tip: find `TODO(step 6):`)\n",
  "\n",
  "As you did before, you now need to update the existing pipeline with the modified pipeline definition. The instructions are the same as Step 5. Update the pipeline using `tfx pipeline update`, and create an execution run using `tfx run create`.\n",
  "\n",
@@ -548,13 +548,13 @@ "id": "MhClPWEuuOaP" }, "source": [
- "\u003e **Open `pipeline/pipeline.py`**. Comment out `CsvExampleGen` and uncomment the line which create an instance of `BigQueryExampleGen`. You also need to uncomment `query` argument of the `create_pipeline` function.\n",
+ "> **Open `pipeline/pipeline.py`**. Comment out `CsvExampleGen` and uncomment the line which creates an instance of `BigQueryExampleGen`. You also need to uncomment the `query` argument of the `create_pipeline` function.\n",
  "\n",
  "We need to specify which GCP project to use for BigQuery again, and this is done by setting `--project` in `beam_pipeline_args` when creating a pipeline.\n",
  "\n",
- "\u003e **Open `pipeline/configs.py`**. Uncomment the definition of `BIG_QUERY__WITH_DIRECT_RUNNER_BEAM_PIPELINE_ARGS` and `BIG_QUERY_QUERY`. You should replace the project id and the region value in this file with the correct values for your GCP project.\n",
+ "> **Open `pipeline/configs.py`**. Uncomment the definition of `BIG_QUERY_WITH_DIRECT_RUNNER_BEAM_PIPELINE_ARGS` and `BIG_QUERY_QUERY`. You should replace the project id and the region value in this file with the correct values for your GCP project.\n",
  "\n",
- "\u003e **Open `local_runner.py`**. Uncomment two arguments, `query` and `beam_pipeline_args`, for create_pipeline() method.\n",
+ "> **Open `local_runner.py`**. Uncomment two arguments, `query` and `beam_pipeline_args`, for the `create_pipeline()` method.\n",
  "\n",
  "Now the pipeline is ready to use BigQuery as an example source. Update the pipeline and create a run as we did in step 5 and 6."
  ]
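In code, the swap this step describes amounts to something like the following sketch inside `pipeline/pipeline.py` (the import path matches recent TFX releases; the query string is a placeholder for the template's `BIG_QUERY_QUERY`):

```python
from tfx.extensions.google_cloud_big_query.example_gen.component import (
    BigQueryExampleGen,
)

# Placeholder -- the template's BIG_QUERY_QUERY selects the training columns
# from the public Chicago taxi trips dataset.
query = "SELECT * FROM `bigquery-public-data.chicago_taxi_trips.taxi_trips` LIMIT 1000"

# Before: example_gen = CsvExampleGen(input_base=data_path)
# After: read training examples straight from BigQuery.
example_gen = BigQueryExampleGen(query=query)
```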