How to use the Cloud Speech-to-Text transform

This document explains how to use the Google Cloud Speech-to-Text transform plugin to convert audio files to text.

Instructions

Ensure that you have enabled the Speech-to-Text API.
This article uses a Google Cloud Storage source to pull data into the pipeline, so upload a speech file to a GCS bucket. Below is a sample file named hello.wav that you can use.
Use the left navigation bar to enter the Studio view.
From the list of plugins available on the left side, select Google Cloud Storage from the Source section, Google Cloud Speech-to-Text from the Transform section, and Google Cloud Storage from the Sink section. Note that you can use another source and sink, depending on where your audio data comes from and where you want to send the transcribed text.
Connect the three plugins on the canvas, from source to transform to sink.
For the Google Cloud Storage source, configure the GCS Path and make sure that the Format is ‘blob’.
For the transform, I’ve specified a sampling rate of 16000 and set the ‘parts’ and ‘text’ fields. Click “Get Schema” and then “Apply” to automatically apply the output schema.
Configure the sink with the path of where you want the output data to go.
Name the pipeline and click Deploy.
Click Run to run the pipeline. It will take a few minutes to complete.
Once the pipeline succeeds, you can view your transcribed text data in GCS, or in whichever sink you configured!
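
The transform step above sets a sampling rate of 16000, so it can be useful to check that your own audio file was recorded at that rate before uploading it. The following is a minimal sketch using Python's standard wave module. It assumes the file is an uncompressed PCM WAV file, and the helper names describe_wav and matches_expected_rate are hypothetical; they are not part of the plugin or of any Google Cloud library.

```python
import wave

# Sampling rate entered on the transform in this walkthrough.
EXPECTED_RATE_HZ = 16000


def describe_wav(path):
    """Read the WAV header and return the file's basic audio parameters."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "duration_s": wav.getnframes() / rate,
        }


def matches_expected_rate(path, expected_rate_hz=EXPECTED_RATE_HZ):
    """Return True when the file's sample rate equals the expected rate."""
    return describe_wav(path)["sample_rate_hz"] == expected_rate_hz


# Hypothetical usage, assuming hello.wav is in the current directory:
#   print(describe_wav("hello.wav"))
#   print(matches_expected_rate("hello.wav"))
```

If the check returns False, one option is to enter the file's actual rate in the transform instead of 16000; whether a mismatch causes a failure or only poorer transcription depends on the service, so treat this as a sanity check rather than a requirement.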