
In this tutorial, we are going to show you how you can use n8n to web scrape blog articles from Google SERP (Search Engine Results Page) for a particular keyword without having to pay over $100/month on keyword research tools such as Ahrefs, SemRush, Moz.
Table of Contents
- Setting Up OpenAI
- Setting Up Airtable
- Step 1: Trigger – When Clicking “Execute Workflow”
- Step 2: Set Airtable in n8n
- Step 3: Fetch Website HTML via HTTP Request
- Step 4: Split and Process Multiple Pages
- Step 5: Optional – Fetch Additional HTML Data
- Step 6: Use OpenAI to Structure and Analyze Data
- Step 7: Format Output Data for Airtable
We will be using the following tools in this example:
- Airtable
- n8n
- OpenAI
There will be additional steps to step-up Airtable and OpenAI for which we will go in-depth.
Setting Up OpenAI API for n8n
For this workflow to work, we will be using OpenAI API, this will give us access to the various models from OpenAI.
This is different from using ChatGPT.

Once you create a profile or log in, you will need to generate an API Key which we will use in n8n.
Generating Your OpenAI API Key

- In the top right corner, click the circular icon, then click 'Your profile'.
- Under the settings column on the left side, you will click 'API keys'.

- On the API Keys page, click '+ Create new secret key' on the top right corner.

- You will have to enter details such as the 'Name' and 'Project' for the API Key. Once that is complete you will get the only chance to say your API Key so store it in a safe location as you will not get another chance to.

For now keep this aside but this will be used later when working on the n8n workflow.
Setting Up Airtable for Your n8n Workflow

When you sign up the first time you will be greeted by an AI to help speed up things. We do not require that and can just go ahead and cancel the on-boarding.
Once done, you will be greeted by a page that looks like this.

To learn more about Airtable, and go from zero to one. Please check out this YouTube video below.
Within Airtable, there are two main things we should know. That is an Airtable Base and an Airtable Table.
In the below image, you will see the Web Scraping SEO Research and Output data, URLs to Scrape and Output SEO Structure. So, there can be only one base and under that base there can be multiple tables.

For us to get n8n connected to Airtable we will need to know the Base ID and the Table ID.
You can find this from the URL, in the example given below, the numbers highlighted in yellow is the Base ID and the one in purple is the Table ID.
Again, keep a record of this as we will have to use it in n8n.


Step 1: Trigger the n8n Workflow
This is the start of the workflow, it can almost be described as a 'Start' button for our workflow.

Step 2: Configure Airtable in n8n
If you remember in the sections before from the Airtable URL, we identified the Airtable base ID and Airtable table ID, we are going to set them as variables in this step.
The reason being, in the instance we have to change the Airtable IDs in the future then we just go to the variable and update it, instead of changing all instances of that Airtable ID throughout the workflow.


Once you set the two variables, you can then click the execute button and this will finalize our two values.
The next thing we can do is go to Airtable and create three columns within the URLs to Scrape table.
The columns will be as follows:
- URL
- Primary Keyword
- Record ID
Specifically for the Record ID, we will use the following as the settings:


Once you have the overall structure set, you can go ahead and input URLs for a specific keyword.
A simple way of doing this, is opening Google Chrome and typing in your desired primary keyword, and from the SERPs take the URLs and copy in the Airtable column.
Step 3: Fetch Website HTML via HTTP Request
In this step, we are connecting n8n to Airtable via the API. You can use this link to learn more how to connect Airtable.
You can use the curl option, too.
curl https://api.airtable.com/v0/YOUR_BASE_ID/YOUR_TABLE_ID_OR_NAME -H \ "Authorization: Bearer YOUR_TOKEN"The good thing with n8n is you can drag and drop variables into the parameters fields.

If you did the setup correctly, there should be one output element with the number of rows you entered into the airtable table, like the image below.
This proves that you connect your Airtable correctly.

Step 4: Split and Process Multiple Pages

The Split Out function is used when there are nested elements, this allows us to iterate through each individual element within the object.
You can clearly see in the Input, we have only one item and then in the Output you will see 5 items.
Step 5: Fetch Additional HTML Data
After splitting, another HTTP Request is sent. In this step, we will get the website HTML structure through a HTTP Request. And as you can see there are certain parts to the HTML Structure that are not of value to us.
We want to make sure we only use that part that will help us with our final decision.

Step 6: Use OpenAI to Structure and Analyze Data
Once we have the website HTML, you will realize that the raw HTML is messy and hard to work with.
To fix this, we will need to pass it through an AI Agent and ask the AI Agent to extract useful information from it.
- For example, you can extract:
title
meta description
headings (H1–H4)
internal links
external links
But before we get to the fun part, we will have to connect our AI Agent with a LLM model, this is what gives the AI Agent the brains to think, in this case we will be using OpenAI's LLMs. If you remember in the previous steps, we generated a secret API Key, we will have to use that key to connect from n8n to OpenAI.
This YouTube video will help you get a deeper understanding.

Then, we will have to give the AI Agent specific instructions of what we want to AI Agent to do.
Here is the System Prompt:
"You are an expert web scraping assistant, your task is to take the input and categorize it into different section such as meta title, meta description, title, date published, author, description, heading(h1,h2,h3,h4, h5), internal links, external links"
Here is the User Prompt:
"You task is to convert the html data into easy to read structured data."
With these prompts, you can go ahead and modify it to see what results you can get, so feel free to play around here.
Once you execute this step, you will get this beautifully formatted output. You can see in the JSON format, how it is divided into smaller objects with key, value pairs.

Step 7: Format Output Data for Airtable
We do not want the output data to be in the n8n environment only, we would like to store it in a table for usability in the future.
We will have to set the output into our desired variables in n8n.

This will be inline with how we set the table in Airtable.

Our table in Airtable will be set with the following column headings:
- Meta Title
- Meta Description
- Date Published
- H1
- H2
- H3
- H4
- Internal Links
- External Links
Step 8: Create a Record and Save to Airtable
The last thing we will have to do is insert the various values for each column into our specific output table in Airtable. Each scraped page becomes a new row in your Airtable base.
With a single click of a button, you are now able to get insights of the SERPs for a specific keyword that will aid in your content analysis without having to pay hundred of dollars per month for an SEO tool. This lets you build a live database of structured web content.

Frequently Asked Questions
1. What is n8n and how does it help with web scraping?
n8n is a workflow automation tool that allows you to automate tasks like web scraping, data processing, and content analysis without extensive coding.
2. How can I use OpenAI with n8n for content analysis?
OpenAI can analyze and structure scraped data, helping you extract meaningful insights from website content automatically within your n8n workflow.
3. Do I need coding skills to scrape websites using n8n and OpenAI?
No, n8n is a no-code automation tool, and OpenAI handles content analysis, so you can build workflows without programming experience.
4. Can I scrape multiple pages automatically using n8n?
Yes, n8n can handle pagination and split workflows to scrape multiple pages efficiently, making large-scale scraping easier.
5. Is using n8n and OpenAI cheaper than tools like Ahrefs or Semrush?
Yes, automating web scraping and content analysis with n8n and OpenAI can cost a fraction of traditional SEO platforms while giving similar insights.
6. What types of websites can I scrape with n8n?
You can scrape most websites accessible via HTTP requests, including blogs, e-commerce sites, and content-heavy pages, as long as scraping complies with their terms of service.
7. How do I get started with the OpenAI API for n8n?
You need to sign up for OpenAI, generate an API key, and configure it in your n8n workflow to analyze scraped website content.
8. Can I store scraped data in Airtable using n8n?
Yes, n8n can connect to Airtable to automatically store and format scraped content for easy analysis and reporting.
9. How do I ensure my scraping workflow is ethical and legal?
Always review a website’s terms of service, respect robots.txt rules, and avoid overloading servers with excessive requests.
10. What are some common use cases for n8n and OpenAI web scraping?
Use cases include SEO content analysis, competitive research, market analysis, automated lead generation, and mapping website content structures.