Convert articles to videos with ChatGPT

Repurposing content is a smart way to get more value from your existing content and reach new audiences. For example, think of the time you could save by automating the conversion of text to video.

Videos are often more engaging than text, and they can make it easier for viewers to understand and retain information. Plus, they have a higher chance of being shared on social media platforms, which helps broaden your reach.

But converting articles by hand is tedious and time-consuming. The good news is that Artificial Intelligence (AI) tools can do most of the work for you. In this article, you'll learn how to convert text into videos using the Shotstack and OpenAI APIs without spending too much time or resources.

About Shotstack

Shotstack is an API-based media automation platform that helps developers build scalable video applications via simple API calls.

The Shotstack API caters to different use cases. For example, the Create API exposes the text-to-speech and text-to-image capabilities we will use in this tutorial, and the Edit API makes it easy to edit and process videos with code.

Prerequisites

To follow along with this guide, you will need:

A Shotstack API key: Sign up for a free account and you will get one.
An OpenAI API key: Sign up and you will get $5 of credits for free.
Node.js 18+ (this article uses the Fetch API)
Basic programming knowledge, preferably in JavaScript/Node.js, though not mandatory.

Convert text to video using Node.js

By the end of this tutorial, you will create a Node.js script to automate the process of converting an article into a video.

To make it easy for you to follow along, we'll break the script down into six parts and explain each part in detail in the next sections. Let's get started!

Create a new .js file. You can call it anything you want but for this tutorial, let's call it script.js.

1. Define constants and a helper function

Copy and paste the code below in the script.js file.

const OPENAI_KEY = "OPENAI_API_KEY"; // Replace with your OpenAI API Key
const SHOTSTACK_KEY = "SHOTSTACK_API_KEY"; // Replace with your Shotstack API Key
const SHOTSTACK_STAGE = "stage";
const audioAssets = [];
const imageAssets = [];

function sleep(ms) {
  return new Promise((resolve) => {
    setTimeout(() => {
      resolve();
    }, ms);
  });
}

Now, let's explain what this code does.

Defines three constants: `OPENAI_KEY`, `SHOTSTACK_KEY`, and `SHOTSTACK_STAGE`.

The first two constants store the API keys. Replace the placeholder values with your actual OpenAI and Shotstack API keys. The third constant is a string naming the Shotstack environment the script talks to, here the `stage` development environment.

Initializes two arrays: `audioAssets` and `imageAssets`.

We'll use these arrays to store the URLs of the audio and image assets the script will generate.

Creates the `sleep` helper function.

The `sleep` function introduces delays between the API requests. These delays give the asynchronous tasks (generating the assets) time to finish before the script moves on.

The processes take some time to complete, and we don't want the script to continue until the assets are ready.
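A fixed delay works when generation finishes within the pause, but it either waits longer than needed or gives up too early. As a rough alternative, the sketch below polls until a result is available. The `pollUntilReady` name, its options, and the `check` callback are hypothetical and not part of the Shotstack or OpenAI APIs; `sleep` is repeated here only so the snippet runs on its own.

```javascript
// Same helper as in the script, repeated so this sketch is self-contained.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: call `check` repeatedly until it returns a value
// other than null/undefined, waiting `intervalMs` between attempts.
async function pollUntilReady(check, { intervalMs = 5000, maxAttempts = 12 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await check();
    if (result !== null && result !== undefined) {
      return result;
    }
    // Only wait if another attempt will follow.
    if (attempt < maxAttempts) {
      await sleep(intervalMs);
    }
  }
  throw new Error(`Not ready after ${maxAttempts} attempts`);
}
```

In the script, `check` could be a function that requests an asset's status and returns its URL once the status is "done", or `null` while it is still processing.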

2. Summarize the article and generate a JSON response

The next step is to make a POST request to the OpenAI API. We'll provide the text together with a prompt. The prompt instructs the AI to summarize the content for a 3-slide, YouTube Shorts-style video and to return the response in a specified format.

In our first example, we'll convert the introduction of the Wikipedia article on Albert Einstein into a video. Feel free to use any text of your choice; you can still follow along just fine.

Add the following code to the script.js file.


const sendOpenAIRequest = async () => {
  const prompt = 'Using the text from an article that I will provide, please summarise the content for a 30-second YouTube shorts style video. The video should have 3 slides. Each slide should focus on one fact about the content of the supplied text. Each slide should also have the following information - a title of fewer than 7 words, a voice-over script of less than 30 words, and a relevant prompt to generate an image using an AI text-to-image service. The information should be returned as a JSON in the following format [{"title": title_text, "script": voice_over_script, "image": image_prompt }]. Here is the article content: ';
  const articleContent = `Albert Einstein (/ˈaɪnstaɪn/ EYEN-styne;[4] German: [ˈalbɛɐt ˈʔaɪnʃtaɪn] ⓘ; 14 March 1879 – 18 April 1955) was a German-born theoretical physicist who is widely held to be one of the greatest and most influential scientists of all time. Best known for developing the theory of relativity, Einstein also made important contributions to quantum mechanics, and was thus a central figure in the revolutionary reshaping of the scientific understanding of nature that modern physics accomplished in the first decades of the twentieth century.[1][5] His mass–energy equivalence formula E = mc2, which arises from relativity theory, has been called "the world's most famous equation".[6] He received the 1921 Nobel Prize in Physics "for his services to theoretical physics, and especially for his discovery of the law of the photoelectric effect",[7] a pivotal step in the development of quantum theory. His work is also known for its influence on the philosophy of science.[8][9] In a 1999 poll of 130 leading physicists worldwide by the British journal Physics World, Einstein was ranked the greatest physicist of all time.[10] His intellectual achievements and originality have made the word Einstein broadly synonymous with genius.[11]`;

  console.log("Sending request to OpenAI API...");

  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${OPENAI_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-4",
      messages: [
        {
          role: "user",
          content: prompt + articleContent,
        },
      ],
    }),
  });

  if (response.status !== 200) {
    console.error(
      "Request to OpenAI failed with status code: ",
      response.status
    );
    process.exit();
  }

  const responseJson = await response.json();

  return responseJson.choices[0].message.content;
};

In the code above, we have defined a sendOpenAIRequest function, which does the following:

Defines a prompt variable with instructions on how we want the AI to summarize the article and the format of the response we expect.
Defines another variable, articleContent, to hold the text of the article we want to summarize.
Uses the Fetch API to send a POST request to the OpenAI API endpoint for chat completions. The request body contains parameters such as model, which indicates which model to use. Note also that the value of content is a concatenation of the prompt and articleContent variables.

If the POST request is successful, the function parses the response and returns the value of responseJson.choices[0].message.content. Below is an example of what that will look like.

[
  {
    "title": "Einstein: A Groundbreaking Physicist",
    "script": "Albert Einstein, a German-born theoretical physicist, is widely recognized as one of the greatest and most influential scientists of all time.",
    "image": "a portrait of Albert Einstein"
  },
  {
    "title": "Einstein's Revolutionary Theories",
    "script": "Einstein developed the theory of relativity and made significant contributions to quantum mechanics. His mass–energy equivalence formula E = mc2 reshaped modern physics.",
    "image": "Albert Einstein writing his famous equation on a chalkboard"
  },
  {
    "title": "Einstein's Global Influence",
    "script": "Awarded the 1921 Nobel Prize in Physics, Einstein greatly influenced scientific philosophy. His originality made the word 'Einstein' synonymous with genius.",
    "image": "Albert Einstein with the Nobel Prize"
  }
]
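The later steps call JSON.parse directly on this string, which assumes the model returns nothing but the JSON array. Models sometimes wrap the array in extra text, so a more defensive parse may be worth considering. The sketch below is one possible approach; the `parseSlides` function is hypothetical and not part of the original script.

```javascript
// Hypothetical helper: extract and validate the slide array from the model's reply.
function parseSlides(content) {
  // Keep only the text between the first "[" and the last "]" so that any
  // surrounding commentary or formatting is ignored.
  const start = content.indexOf("[");
  const end = content.lastIndexOf("]");
  if (start === -1 || end <= start) {
    throw new Error("No JSON array found in the response");
  }

  const slides = JSON.parse(content.slice(start, end + 1));
  if (!Array.isArray(slides) || slides.length === 0) {
    throw new Error("Expected a non-empty array of slides");
  }

  // Every slide needs the three properties the rest of the script relies on.
  for (const slide of slides) {
    for (const key of ["title", "script", "image"]) {
      if (typeof slide[key] !== "string" || slide[key].length === 0) {
        throw new Error(`Slide is missing a "${key}" string`);
      }
    }
  }
  return slides;
}
```

Failing early here gives a clearer error than a later request that is sent with an undefined script or image prompt.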

3. Generate the audio assets

The code in this section handles creating the audio assets based on the return value of the sendOpenAIRequest function.

Add the code below to the script.js file.


const generateVoiceAssets = async (openAIResponse) => {
  const assetIds = [];
  const apiResponseArray = JSON.parse(openAIResponse);

  console.log("Sending requests to Shotstack text to speech API...");

  // Use for...of so that each request is awaited in turn. forEach does not
  // wait for async callbacks, so asset IDs could arrive out of slide order.
  for (const item of apiResponseArray) {
    const response = await fetch(
      `https://api.shotstack.io/create/${SHOTSTACK_STAGE}/assets/`,
      {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "x-api-key": SHOTSTACK_KEY,
        },
        body: JSON.stringify({
          provider: "shotstack",
          options: {
            type: "text-to-speech",
            text: item.script,
            voice: "Matthew",
            language: "en-US",
          },
        }),
      }
    );

    const responseJson = await response.json();

    if (response.status === 201) {
      const assetId = responseJson.data.id;
      assetIds.push(assetId);
    } else {
      console.error(
        "Request to Shotstack text-to-speech failed with status code: ",
        response.status
      );
      process.exit();
    }
  }

  console.log(
    "Pausing execution for 60 seconds to allow audio assets to be generated."
  );
  await sleep(60000);

  // Await each status request so that audioAssets is fully populated, in
  // slide order, before this function returns.
  for (const assetId of assetIds) {
    const response = await fetch(
      `https://api.shotstack.io/create/${SHOTSTACK_STAGE}/assets/${assetId}`,
      {
        method: "GET",
        headers: {
          "Content-Type": "application/json",
          "x-api-key": SHOTSTACK_KEY,
        },
      }
    );

    const responseJson = await response.json();

    if (
      response.status === 200 &&
      responseJson.data.attributes.status === "done"
    ) {
      audioAssets.push(responseJson.data.attributes.url);
    } else {
      console.log(`Audio file for asset ID ${assetId} is still not ready`);
      process.exit();
    }
  }
};

This part of the script defines a function named generateVoiceAssets. It takes the parameter openAIResponse (the value returned by the sendOpenAIRequest function from the previous step).

The function performs the following tasks:

Initializes an empty array, assetIds, to store the IDs of the assets we will generate.
Parses the openAIResponse input parameter and stores the value in the apiResponseArray variable.
For each item in apiResponseArray, sends a POST request to the Shotstack text-to-speech API endpoint. This creates an audio file from the value of the item's script property.
Extracts the asset ID from each JSON response and adds it to the assetIds array.
After looping through all the items, pauses the script for 60 seconds with the sleep function defined in part 1. This allows time for the audio assets to be generated.
After the pause, sends a GET request to the Shotstack API for each asset ID to get the URL of the audio asset, and adds these URLs to the audioAssets array. If an asset is not ready yet, the script logs a message and exits.
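The order of the URLs in audioAssets matters, because the video step matches them to slides by index. If the requests are run in parallel, one way to keep that order is to let Promise.all collect the results. The sketch below illustrates the idea with a stand-in for the network call; `collectInOrder` and `fakeCreateAsset` are hypothetical names used only for this example.

```javascript
// Hypothetical helper: run `worker` for every item in parallel and return the
// results in the same order as `items`.
async function collectInOrder(items, worker) {
  // Promise.all resolves to an array ordered like its input promises,
  // no matter which promise settles first.
  return Promise.all(items.map((item, index) => worker(item, index)));
}

// Stand-in for a network call: later slides finish sooner than earlier ones.
async function fakeCreateAsset(slide, index) {
  await new Promise((resolve) => setTimeout(resolve, 30 - index * 10));
  return `asset-for-${slide.title}`;
}

async function demo() {
  const slides = [{ title: "one" }, { title: "two" }, { title: "three" }];
  return collectInOrder(slides, fakeCreateAsset);
}
```

Pushing into a shared array from each callback would instead record results in completion order, which can differ from slide order.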

4. Generate the image assets

The next step is to create the image assets. Paste the following code snippet into your script.js file.


const generateImageAssets = async (openAIResponse) => {
  const assetIds = [];

  const apiResponseArray = JSON.parse(openAIResponse);

  console.log("Sending requests to Shotstack text to image API...");

  // Use for...of so that each request is awaited in turn. forEach does not
  // wait for async callbacks, so asset IDs could arrive out of slide order.
  for (const item of apiResponseArray) {
    const response = await fetch(
      `https://api.shotstack.io/create/${SHOTSTACK_STAGE}/assets/`,
      {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "x-api-key": SHOTSTACK_KEY,
        },
        body: JSON.stringify({
          provider: "shotstack",
          options: {
            type: "text-to-image",
            prompt: item.image,
            width: 1024,
            height: 1024,
          },
        }),
      }
    );

    const responseJson = await response.json();

    if (response.status === 201) {
      const assetId = responseJson.data.id;
      assetIds.push(assetId);
    } else {
      console.error(
        "Request to Shotstack text-to-image failed with status code: ",
        response.status
      );
      process.exit();
    }
  }

  console.log(
    "Pausing execution for 20 seconds to allow image assets to be generated."
  );
  await sleep(20000);

  await Promise.all(
    assetIds.map(async (assetId) => {
      const response = await fetch(
        `https://api.shotstack.io/create/${SHOTSTACK_STAGE}/assets/${assetId}`,
        {
          method: "GET",
          headers: {
            "Content-Type": "application/json",
            "x-api-key": SHOTSTACK_KEY,
          },
        }
      );

      const responseJson = await response.json();

      if (
        response.status === 200 &&
        responseJson.data.attributes.status === "done"
      ) {
        imageAssets.push(responseJson.data.attributes.url);
      } else {
        console.log(`Image file for asset ID ${assetId} is still not ready`);
        process.exit();
      }
    })
  );
};

This part defines a function named generateImageAssets that takes the response of the sendOpenAIRequest function as a parameter. We use this function to generate images using the Create API.

Here is a breakdown of what the function does:

Initializes an empty array, assetIds, to store the IDs of the assets we'll generate.
Parses the openAIResponse input parameter and stores the value in the apiResponseArray variable.
For each item in apiResponseArray, sends a POST request to the Shotstack text-to-image API endpoint. The prompt for the request is the value of the item's image property.
Extracts the asset ID from each JSON response and adds it to the assetIds array.
After completing the requests for all items, pauses the script for 20 seconds. This allows time to generate the image assets.
After the pause, sends GET requests to the API to get the URLs of the images and adds the URLs to the imageAssets array. If an image is not ready yet, the script logs a message and exits.
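Both asset functions apply the same readiness test to the GET response: an HTTP status of 200 and a data.attributes.status of "done". As an illustration, that check could be pulled into a small function. The `readAssetUrl` name is hypothetical; the field paths are the ones the script already reads.

```javascript
// Hypothetical helper: return the asset URL when the asset is ready, else null.
// `statusCode` is the HTTP status and `body` is the parsed JSON response.
function readAssetUrl(statusCode, body) {
  // Optional chaining guards against a response without data or attributes.
  const attributes = body?.data?.attributes;
  if (statusCode === 200 && attributes?.status === "done") {
    return attributes.url ?? null;
  }
  return null;
}
```

Returning `null` rather than exiting leaves the caller free to decide whether to retry or stop.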

5. Generate the video

At this point, you have the URLs for both the audio and image assets. What is left is to use these assets to create the video. We can do that with the Edit API and a template designed in Shotstack Studio, a browser-based video editor for creating dynamic video templates.

The Shotstack Studio interface

We've already created a JSON template, which we will add as the payload when making the request to the API.

Add the code below to script.js, and let's explain what it does.


const generateVideo = async (openAIResponse) => {
  const apiResponseArray = JSON.parse(openAIResponse);

  console.log(
    "Sending requests to Shotstack Edit API to generate the video..."
  );

  const payload = {
    merge: [],
    timeline: {
      background: "#000000",
      tracks: [
        {
          clips: [
            {
              asset: {
                type: "audio",
                src: "https://templates.shotstack.io/basic/asset/audio/music/unminus/dentreprise-en-feu.mp3",
                volume: 0.5,
                effect: "fadeInFadeOut",
              },
              start: 0,
              length: 30,
            },
          ],
        },
        {
          clips: [
            {
              asset: {
                type: "audio",
                src: "{{ VOICE_1 }}",
              },
              start: 0,
              length: 10,
            },
            {
              asset: {
                type: "audio",
                src: "{{ VOICE_2 }}",
              },
              start: 10,
              length: 10,
            },
            {
              asset: {
                type: "audio",
                src: "{{ VOICE_3 }}",
              },
              start: 20,
              length: 10,
            },
          ],
        },
        {
          clips: [
            {
              asset: {
                type: "html",
                width: 654,
                height: 225,
                position: "bottom",
                html: '<p data-html-type="text">{{ TITLE_1 }}</p>',
                css: "p { color: #1b1b1b; font-size: 56px; font-family: 'Didact Gothic'; text-align: center; }",
              },
              start: 0,
              length: 10,
              fit: "none",
              scale: 1,
              offset: {
                x: 0.013,
                y: -0.3,
              },
              position: "center",
              transition: {
                in: "fade",
                out: "fade",
              },
            },
            {
              asset: {
                type: "html",
                width: 654,
                height: 225,
                position: "bottom",
                html: '<p data-html-type="text">{{ TITLE_2 }}</p>',
                css: "p { color: #1b1b1b; font-size: 56px; font-family: 'Didact Gothic'; text-align: center; }",
              },
              start: 10,
              length: 10,
              fit: "none",
              scale: 1,
              offset: {
                x: 0.013,
                y: -0.3,
              },
              position: "center",
              transition: {
                in: "fade",
                out: "fade",
              },
            },
            {
              asset: {
                type: "html",
                width: 654,
                height: 225,
                position: "bottom",
                html: '<p data-html-type="text">{{ TITLE_3 }}</p>',
                css: "p { color: #1b1b1b; font-size: 56px; font-family: 'Didact Gothic'; text-align: center; }",
              },
              start: 20,
              length: 10,
              fit: "none",
              scale: 1,
              offset: {
                x: 0.013,
                y: -0.3,
              },
              position: "center",
              transition: {
                in: "fade",
                out: "fade",
              },
            },
          ],
        },
        {
          clips: [
            {
              asset: {
                type: "image",
                src: "{{ IMAGE_1 }}",
              },
              start: 0,
              length: 10,
              effect: "zoomIn",
            },
            {
              asset: {
                type: "image",
                src: "{{ IMAGE_2 }}",
              },
              start: 10,
              length: 10,
              effect: "zoomOut",
            },
            {
              asset: {
                type: "image",
                src: "{{ IMAGE_3 }}",
              },
              start: 20,
              length: 10,
              effect: "zoomIn",
            },
          ],
        },
      ],
    },
    output: {
      format: "mp4",
      fps: 25,
      size: {
        width: 720,
        height: 720,
      },
      destinations: [],
    },
  };

  apiResponseArray.forEach((item, index) => {
    payload.merge.push({
      find: `IMAGE_${index + 1}`,
      replace: imageAssets[index],
    });

    payload.merge.push({
      find: `TITLE_${index + 1}`,
      replace: item.title,
    });

    payload.merge.push({
      find: `VOICE_${index + 1}`,
      replace: audioAssets[index],
    });
  });

  const postResponse = await fetch(
    `https://api.shotstack.io/${SHOTSTACK_STAGE}/render`,
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "x-api-key": SHOTSTACK_KEY,
      },
      body: JSON.stringify(payload),
    }
  );

  const postResponseJson = await postResponse.json();
  let assetId;

  if (postResponse.status === 201) {
    assetId = postResponseJson.response.id;
  } else {
    console.error(
      "Request to Shotstack Edit API failed with status code: ",
      postResponse.status
    );
    process.exit();
  }

  console.log(
    "Pausing execution for 30 seconds to allow the video to be generated."
  );
  await sleep(30000);

  const getResponse = await fetch(
    `https://api.shotstack.io/${SHOTSTACK_STAGE}/render/${assetId}`,
    {
      method: "GET",
      headers: {
        "Content-Type": "application/json",
        "x-api-key": SHOTSTACK_KEY,
      },
    }
  );

  const getResponseJson = await getResponse.json();

  if (
    getResponse.status === 200 &&
    getResponseJson.response.status === "done"
  ) {
    console.log(
      `You can view the video file at: ${getResponseJson.response.url}`
    );
  } else {
    console.log(`Video file for asset ID ${assetId} is still not ready`);
    process.exit();
  }
};

The generateVideo function also takes the value returned by the sendOpenAIRequest function as a parameter. It does the following:

Parses the openAIResponse input parameter and stores the value in the apiResponseArray variable.
Defines a payload object based on a JSON template we created using Shotstack Studio. It contains the specifications for the video, including its timeline, tracks, clips, and output format. Note that the payload object initially has an empty merge array, which the next step fills.
Using the forEach method, populates the payload.merge array with the image URLs, titles, and audio URLs. Each entry pairs a placeholder name such as IMAGE_1 with the value that replaces the matching {{ IMAGE_1 }} placeholder in the template.
Sends a POST request to the Edit API's render endpoint, using payload as the body of the request.
Parses the JSON response and gets the render ID of the video.
If the request succeeds, pauses the script for 30 seconds to give the video time to finish rendering.
After the pause, sends a GET request to get the URL of the rendered video and logs it to the console.
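The merge step is the part most likely to fail quietly, because a missing URL simply becomes an undefined replacement. The sketch below shows one way to build the same find/replace entries while checking that every slide has an image and a voice-over. The `buildMergeFields` function is hypothetical and not part of the original script.

```javascript
// Hypothetical helper: build the merge entries for the render payload.
// `slides` is the parsed OpenAI response; `imageUrls` and `audioUrls` are the
// asset URLs in slide order.
function buildMergeFields(slides, imageUrls, audioUrls) {
  if (imageUrls.length < slides.length || audioUrls.length < slides.length) {
    throw new Error("Each slide needs one image URL and one audio URL");
  }

  // Placeholders are numbered from 1, matching {{ IMAGE_1 }}, {{ TITLE_1 }},
  // and {{ VOICE_1 }} in the template.
  return slides.flatMap((slide, index) => [
    { find: `IMAGE_${index + 1}`, replace: imageUrls[index] },
    { find: `TITLE_${index + 1}`, replace: slide.title },
    { find: `VOICE_${index + 1}`, replace: audioUrls[index] },
  ]);
}
```

With this in place, the payload could be given `merge: buildMergeFields(apiResponseArray, imageAssets, audioAssets)` instead of being filled in a loop.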

6. The driver function

Finally, let's write the code to execute the functions in the right order. Add the following to your script.js file.


async function main() {
  const openAIResponse = await sendOpenAIRequest();

  await generateVoiceAssets(openAIResponse);
  await generateImageAssets(openAIResponse);
  await generateVideo(openAIResponse);
}

main();

Here, we have defined a main function that calls the sendOpenAIRequest, generateVoiceAssets, generateImageAssets, and generateVideo functions, one after the other.

Final Script

Here is the final code for the script.js file. Your file should look something like this.


const OPENAI_KEY = "OPENAI_API_KEY"; // Replace with your OpenAI API Key
const SHOTSTACK_KEY = "SHOTSTACK_API_KEY"; // Replace with your Shotstack API Key
const SHOTSTACK_STAGE = "stage";

const audioAssets = [];
const imageAssets = [];

function sleep(ms) {
  return new Promise((resolve) => {
    setTimeout(() => {
      resolve();
    }, ms);
  });
}


const sendOpenAIRequest = async () => {
  const prompt =`Using the text from an article that I will provide, please summarise the content for a 30 second YouTube shorts style video. The video should have 3 slides. Each slide should focus on one fact about the content of the supplied text. Each slide should have the following information - a title less than 7 words, a voice over script less than 30 words, a relevant prompt to generate an image using an AI text to image service. The information should be returned as a JSON in the following format [{"title": title_text, "script": voice_over_script, "image": image_prompt }]. Here is the article content: `;

  const articleContent = `Albert Einstein (/ˈaɪnstaɪn/ EYEN-styne;[4] German: [ˈalbɛɐt ˈʔaɪnʃtaɪn] ⓘ; 14 March 1879 – 18 April 1955) was a German-born theoretical physicist who is widely held to be one of the greatest and most influential scientists of all time. Best known for developing the theory of relativity, Einstein also made important contributions to quantum mechanics, and was thus a central figure in the revolutionary reshaping of the scientific understanding of nature that modern physics accomplished in the first decades of the twentieth century.[1][5] His mass–energy equivalence formula E = mc2, which arises from relativity theory, has been called "the world's most famous equation".[6] He received the 1921 Nobel Prize in Physics "for his services to theoretical physics, and especially for his discovery of the law of the photoelectric effect",[7] a pivotal step in the development of quantum theory. His work is also known for its influence on the philosophy of science.[8][9] In a 1999 poll of 130 leading physicists worldwide by the British journal Physics World, Einstein was ranked the greatest physicist of all time.[10] His intellectual achievements and originality have made the word Einstein broadly synonymous with genius.[11]`;

  console.log("Sending request to OpenAI API...");

  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${OPENAI_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-4",
      messages: [
        {
          role: "user",
          content: prompt + articleContent,
        },
      ],
    }),
  });

  if (response.status !== 200) {
    console.error(
      "Request to OpenAI failed with status code: ",
      response.status
    );
    process.exit();
  }

  const responseJson = await response.json();

  return responseJson.choices[0].message.content;
};


const generateVoiceAssets = async (openAIResponse) => {
  const assetIds = [];
  const apiResponseArray = JSON.parse(openAIResponse);

  console.log("Sending requests to Shotstack text to speech API...");

  // Use for...of so that each request is awaited in turn. forEach does not
  // wait for async callbacks, so asset IDs could arrive out of slide order.
  for (const item of apiResponseArray) {
    const response = await fetch(
      `https://api.shotstack.io/create/${SHOTSTACK_STAGE}/assets/`,
      {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "x-api-key": SHOTSTACK_KEY,
        },
        body: JSON.stringify({
          provider: "shotstack",
          options: {
            type: "text-to-speech",
            text: item.script,
            voice: "Matthew",
            language: "en-US",
          },
        }),
      }
    );

    const responseJson = await response.json();

    if (response.status === 201) {
      const assetId = responseJson.data.id;
      assetIds.push(assetId);
    } else {
      console.error(
        "Request to Shotstack text-to-speech failed with status code: ",
        response.status
      );
      process.exit();
    }
  }

  console.log(
    "Pausing execution for 60 seconds to allow audio assets to be generated."
  );
  await sleep(60000);

  // Await each status request so that audioAssets is fully populated, in
  // slide order, before this function returns.
  for (const assetId of assetIds) {
    const response = await fetch(
      `https://api.shotstack.io/create/${SHOTSTACK_STAGE}/assets/${assetId}`,
      {
        method: "GET",
        headers: {
          "Content-Type": "application/json",
          "x-api-key": SHOTSTACK_KEY,
        },
      }
    );

    const responseJson = await response.json();

    if (
      response.status === 200 &&
      responseJson.data.attributes.status === "done"
    ) {
      audioAssets.push(responseJson.data.attributes.url);
    } else {
      console.log(`Audio file for asset ID ${assetId} is still not ready`);
      process.exit();
    }
  }
};


const generateImageAssets = async (openAIResponse) => {
  const assetIds = [];

  const apiResponseArray = JSON.parse(openAIResponse);

  console.log("Sending requests to Shotstack text to image API...");

  // Use for...of so that each request is awaited in turn. forEach does not
  // wait for async callbacks, so asset IDs could arrive out of slide order.
  for (const item of apiResponseArray) {
    const response = await fetch(
      `https://api.shotstack.io/create/${SHOTSTACK_STAGE}/assets/`,
      {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "x-api-key": SHOTSTACK_KEY,
        },
        body: JSON.stringify({
          provider: "shotstack",
          options: {
            type: "text-to-image",
            prompt: item.image,
            width: 1024,
            height: 1024,
          },
        }),
      }
    );

    const responseJson = await response.json();

    if (response.status === 201) {
      const assetId = responseJson.data.id;
      assetIds.push(assetId);
    } else {
      console.error(
        "Request to Shotstack text-to-image failed with status code: ",
        response.status
      );
      process.exit();
    }
  }

  console.log(
    "Pausing execution for 20 seconds to allow image assets to be generated."
  );
  await sleep(20000);

  await Promise.all(
    assetIds.map(async (assetId) => {
      const response = await fetch(
        `https://api.shotstack.io/create/${SHOTSTACK_STAGE}/assets/${assetId}`,
        {
          method: "GET",
          headers: {
            "Content-Type": "application/json",
            "x-api-key": SHOTSTACK_KEY,
          },
        }
      );

      const responseJson = await response.json();

      if (
        response.status === 200 &&
        responseJson.data.attributes.status === "done"
      ) {
        imageAssets.push(responseJson.data.attributes.url);
      } else {
        console.log(`Image file for asset ID ${assetId} is still not ready`);
        process.exit();
      }
    })
  );
};


const generateVideo = async (openAIResponse) => {
  const apiResponseArray = JSON.parse(openAIResponse);

  console.log(
    "Sending requests to Shotstack Edit API to generate the video..."
  );

  const payload = {
    merge: [],
    timeline: {
      background: "#000000",
      tracks: [
        {
          clips: [
            {
              asset: {
                type: "audio",
                src: "https://templates.shotstack.io/basic/asset/audio/music/unminus/dentreprise-en-feu.mp3",
                volume: 0.5,
                effect: "fadeInFadeOut",
              },
              start: 0,
              length: 30,
            },
          ],
        },
        {
          clips: [
            {
              asset: {
                type: "audio",
                src: "{{ VOICE_1 }}",
              },
              start: 0,
              length: 10,
            },
            {
              asset: {
                type: "audio",
                src: "{{ VOICE_2 }}",
              },
              start: 10,
              length: 10,
            },
            {
              asset: {
                type: "audio",
                src: "{{ VOICE_3 }}",
              },
              start: 20,
              length: 10,
            },
          ],
        },
        {
          clips: [
            {
              asset: {
                type: "html",
                width: 654,
                height: 225,
                position: "bottom",
                html: '<p data-html-type="text">{{ TITLE_1 }}</p>',
                css: "p { color: #1b1b1b; font-size: 56px; font-family: 'Didact Gothic'; text-align: center; }",
              },
              start: 0,
              length: 10,
              fit: "none",
              scale: 1,
              offset: {
                x: 0.013,
                y: -0.3,
              },
              position: "center",
              transition: {
                in: "fade",
                out: "fade",
              },
            },
            {
              asset: {
                type: "html",
                width: 654,
                height: 225,
                position: "bottom",
                html: '<p data-html-type="text">{{ TITLE_2 }}</p>',
                css: "p { color: #1b1b1b; font-size: 56px; font-family: 'Didact Gothic'; text-align: center; }",
              },
              start: 10,
              length: 10,
              fit: "none",
              scale: 1,
              offset: {
                x: 0.013,
                y: -0.3,
              },
              position: "center",
              transition: {
                in: "fade",
                out: "fade",
              },
            },
            {
              asset: {
                type: "html",
                width: 654,
                height: 225,
                position: "bottom",
                html: '<p data-html-type="text">{{ TITLE_3 }}</p>',
                css: "p { color: #1b1b1b; font-size: 56px; font-family: 'Didact Gothic'; text-align: center; }",
              },
              start: 20,
              length: 10,
              fit: "none",
              scale: 1,
              offset: {
                x: 0.013,
                y: -0.3,
              },
              position: "center",
              transition: {
                in: "fade",
                out: "fade",
              },
            },
          ],
        },
        {
          clips: [
            {
              asset: {
                type: "image",
                src: "{{ IMAGE_1 }}",
              },
              start: 0,
              length: 10,
              effect: "zoomIn",
            },
            {
              asset: {
                type: "image",
                src: "{{ IMAGE_2 }}",
              },
              start: 10,
              length: 10,
              effect: "zoomOut",
            },
            {
              asset: {
                type: "image",
                src: "{{ IMAGE_3 }}",
              },
              start: 20,
              length: 10,
              effect: "zoomIn",
            },
          ],
        },
      ],
    },
    output: {
      format: "mp4",
      fps: 25,
      size: {
        width: 720,
        height: 720,
      },
      destinations: [],
    },
  };

  apiResponseArray.forEach((item, index) => {
    payload.merge.push({
      find: `IMAGE_${index + 1}`,
      replace: imageAssets[index],
    });

    payload.merge.push({
      find: `TITLE_${index + 1}`,
      replace: item.title,
    });

    payload.merge.push({
      find: `VOICE_${index + 1}`,
      replace: audioAssets[index],
    });
  });

  const postResponse = await fetch(
    `https://api.shotstack.io/${SHOTSTACK_STAGE}/render`,
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "x-api-key": SHOTSTACK_KEY,
      },
      body: JSON.stringify(payload),
    }
  );

  const postResponseJson = await postResponse.json();
  let assetId;

  if (postResponse.status === 201) {
    assetId = postResponseJson.response.id;
  } else {
    console.error(
      "Request to Shotstack Edit API failed with status code:",
      postResponse.status
    );
    // Exit with a non-zero code so the shell sees the failure.
    process.exit(1);
  }

  console.log(
    "Pausing execution for 30 seconds to allow the video to be generated."
  );
  await sleep(30000);

  const getResponse = await fetch(
    `https://api.shotstack.io/${SHOTSTACK_STAGE}/render/${assetId}`,
    {
      method: "GET",
      headers: {
        "Content-Type": "application/json",
        "x-api-key": SHOTSTACK_KEY,
      },
    }
  );

  const getResponseJson = await getResponse.json();

  if (
    getResponse.status === 200 &&
    getResponseJson.response.status === "done"
  ) {
    console.log(
      `You can view the video file at: ${getResponseJson.response.url}`
    );
  } else {
    console.log(`Video file for asset ID ${assetId} is still not ready`);
    process.exit();
  }
};


async function main() {
  const openAIResponse = await sendOpenAIRequest();

  await generateVoiceAssets(openAIResponse);
  await generateImageAssets(openAIResponse);
  await generateVideo(openAIResponse);
}

main();
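
The script waits a fixed 30 seconds and then checks the render status once, so a slower render ends with the "still not ready" message. As a minimal sketch of an alternative, the status could be checked repeatedly until the render is done. The waitForRender helper and its intervalMs and maxAttempts options are hypothetical names, not part of the original script, and the "failed" status is an assumption about what the API reports when a render fails.

```javascript
// Hypothetical helper, not part of the original script.
// checkStatus: async function that resolves to an object with { status, url }.
async function waitForRender(checkStatus, options = {}) {
  const { intervalMs = 5000, maxAttempts = 24 } = options;
  const pause = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const { status, url } = await checkStatus();
    if (status === "done") return url;
    // Assumption: a failed render is reported with the status "failed".
    if (status === "failed") throw new Error("Render failed");
    // Wait before the next check, except after the final attempt.
    if (attempt < maxAttempts) await pause(intervalMs);
  }
  throw new Error(`Render not done after ${maxAttempts} attempts`);
}
```

Inside generateVideo, checkStatus could wrap the existing GET request and return getResponseJson.response, so the sleep(30000) call and the single status check would no longer be needed.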

Running the script

The entire script is ready now. You can run the code in your terminal using the command below. Also, you can replace the value of the articleContent variable with any text of your choice.

node script.js

On successful execution of the script, you should see the following logs in your terminal window.

Sending request to OpenAI API...
Sending requests to Shotstack text to speech API...
Pausing execution for 60 seconds to allow audio assets to be generated.
Sending requests to Shotstack text to image API...
Pausing execution for 20 seconds to allow image assets to be generated.
Sending requests to Shotstack Edit API to generate the video...
Pausing execution for 30 seconds to allow the video to be generated.

You can view the video file at: {{ OUTPUT_VIDEO_URL_WILL_BE_POSTED_HERE }}

To access the generated video, visit the URL in the last log line to download or view it in your browser. It should look like this:

https://shotstack-api-stage-output.s3-ap-southeast-2.amazonaws.com/t2siieowih/3a1e0505-29b0-45a6-9032-ebaaf63b43bb.mp4

Note: The video URL will expire after 24 hours. You can download it or transfer it to your own storage or hosting platform.
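
Because the URL expires, you may want the script to save the file for you. The following is a minimal sketch, assuming Node.js 18+ for the built-in fetch. The downloadVideo and fileNameFromUrl helpers are hypothetical names that are not part of the original script.

```javascript
const { writeFile } = require("node:fs/promises");
const path = require("node:path");

// Hypothetical helper: uses the last segment of the URL path as the file name.
function fileNameFromUrl(videoUrl) {
  return path.basename(new URL(videoUrl).pathname);
}

// Hypothetical helper: downloads the video and returns the local file path.
async function downloadVideo(videoUrl, outputDir = ".") {
  const response = await fetch(videoUrl);
  if (!response.ok) {
    throw new Error(`Download failed with status code: ${response.status}`);
  }
  const filePath = path.join(outputDir, fileNameFromUrl(videoUrl));
  // Buffers the whole file in memory, which is fine for short clips.
  await writeFile(filePath, Buffer.from(await response.arrayBuffer()));
  return filePath;
}
```

You could call downloadVideo(getResponseJson.response.url) right after the script logs the video URL.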

Second example

Let's try another example with a different text. This time, we'll use the introduction to a Wikipedia article on the planet Mars.

Create a new file new.js (name it whatever you want). Copy and paste the code from script.js to the new.js file.

The only change will be the value of the articleContent variable. Replace it with the new text (in this case the text from the Wikipedia article on Mars).

Run the new script with the command below to convert the text into a video.

node new.js

Below is the video generated from the article on Mars.
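
Instead of copying the script for every new article, you could read the text from a file passed on the command line. This is a minimal sketch of that idea. The loadArticle helper and the mars.txt file name are hypothetical and are not part of the original script.

```javascript
const { readFileSync } = require("node:fs");

// Hypothetical helper: reads the article text from a plain-text file.
function loadArticle(filePath) {
  const text = readFileSync(filePath, "utf8").trim();
  if (!text) {
    throw new Error(`No article text found in ${filePath}`);
  }
  return text;
}

// Possible usage in script.js, run as: node script.js mars.txt
// const articleContent = loadArticle(process.argv[2]);
```

With this change, a single script.js can handle any article, and only the file name in the command changes.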

Final word

Thumbs up if you've come this far. You now know how to automate the process of converting text to video using the Shotstack API, OpenAI API, and Node.js. For this tutorial, we converted two Wikipedia articles into YouTube Shorts-style videos.

But as we demonstrated, you can use any text you want. You can convert your blogs, news articles, newsletters, training materials, or other long-form text content into compelling videos using the process outlined in this tutorial.

This process can help you get more out of your content while saving you time and effort. You can also use Shotstack for other media automation tasks. Check out our developer guides for similar tutorials.
