Creating a website aggregator with ChatGPT, React, and Node.js
A website aggregator is a website that collects data from other websites across the internet and puts the information in one place where visitors can access it.

There are many versions of website aggregators; some are search engines such as Google and DuckDuckGo, and others follow more of a Product Hunt structure, where each entry shows a picture and a short text.
Usually, you scrape the website, read its meta tags and h1-h6 headings, scan its sitemap.xml, and use some pattern to sort the information.
Today I am going to use a different solution: I will take the entire website content, send it to ChatGPT, and ask it to give me the information I need.
It's kinda crazy to see ChatGPT parse the website content.
So lettttsss do it!

In this article, you'll learn how to build a website aggregator which scrapes content from a website and determines the website's title and description using ChatGPT.
What is ChatGPT?
ChatGPT is an AI language model trained by OpenAI to generate text and interact with users in a human-like conversational manner. At the time of writing, ChatGPT is free and open to public use.
Users can submit requests and get information or answers to questions from a wide range of topics such as history, science, mathematics, and current events in just a few seconds.
ChatGPT performs other tasks, such as proofreading, paraphrasing, and translation. It can also help with writing, debugging, and explaining code snippets. Its wide range of capabilities is the reason why ChatGPT has been trending.
ChatGPT is not available as an API yet. In order to use it, we will have to scrape our way in.
Novu: the first open-source notification infrastructure
Just a quick background about us. Novu is the first open-source notification infrastructure. We basically help to manage all the product notifications. It can be In-App (the bell icon like you have in Facebook - WebSockets), emails, SMS messages, and so on.

I would be super happy if you could give us a star! And let me also know in the comments: https://github.com/novuhq/novu
Limitation with ChatGPT
As previously mentioned, ChatGPT is not accessible through a public API. Instead, we can use web scraping techniques to access it. This involves automating the process of logging in to the OpenAI website, solving the captcha (you can use 2captcha for this), and sending an API request with the OpenAI cookies. Fortunately, there is a public library that can handle these tasks for us. Keep in mind that this is not a formal API, so you may encounter limitations if you attempt to make a large number of requests. Additionally, it is not suitable for real-time requests. If you want to use it, consider implementing a queue system for background processing.
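The queue idea mentioned above can be sketched as follows. This is a minimal in-memory example and not part of the original project: `JobQueue` and its `add` method are hypothetical names, and the sketch simply runs one async job at a time so that requests to ChatGPT never overlap. A real deployment would more likely use a persistent queue.

```javascript
// Minimal in-memory queue (hypothetical helper): runs async jobs one at a time.
class JobQueue {
  constructor() {
    // Promise that settles once the most recently queued job has finished
    this.tail = Promise.resolve();
  }

  // Queues `job` (a function returning a promise) and resolves with its result.
  add(job) {
    const result = this.tail.then(() => job());
    // Swallow errors on the internal chain so a failed job does not block the next one
    this.tail = result.catch(() => {});
    return result;
  }
}

// Example: two jobs queued back to back still run strictly in order.
const queue = new JobQueue();
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

queue.add(async () => {
  await wait(50); // stands in for a slow ChatGPT request
  console.log("first job done");
});
queue.add(async () => {
  console.log("second job done");
});
```

The caller still receives each job's own promise from `add`, so errors can be handled per request while the queue keeps moving.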
Project Set up
Here, I'll guide you through creating the project environment for the web application. We'll use React.js for the front end and Node.js for the backend server.
Create the project folder for the web application by running the code below:
Setting up the Node.js server
Navigate into the server folder and create a package.json file.
Install Express, Nodemon, and the CORS library.
ExpressJS is a fast, minimalist framework that provides several features for building web applications in Node.js; CORS is a Node.js package that allows communication between different domains; and Nodemon is a Node.js tool that automatically restarts the server after detecting file changes.
Create an index.js file, the entry point to the web server.
Set up a Node.js server using ExpressJS. The code snippet below returns a JSON object when you visit http://localhost:4000/api in your browser.
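A minimal sketch of such a server might look like the following. The names `apiHandler` and `createApp` and the "Hello world" payload are assumptions made for illustration. Express and CORS are passed into `createApp` instead of being imported at the top, purely so the sketch can be read and run without the packages installed; in a real `index.js` you would import them directly.

```javascript
const PORT = 4000;

// Handler for GET /api: replies with a small JSON payload (payload text is an assumption).
function apiHandler(req, res) {
  res.json({ message: "Hello world" });
}

// Builds the Express app from the `express` and `cors` modules passed in.
function createApp(express, cors) {
  const app = express();
  app.use(express.urlencoded({ extended: true })); // parse form-encoded bodies
  app.use(express.json()); // parse JSON bodies
  app.use(cors()); // allow requests from the React dev server
  app.get("/api", apiHandler);
  return app;
}

// Wiring in server/index.js, assuming `express` and `cors` are installed:
//   const app = createApp(express, cors);
//   app.listen(PORT, () => console.log(`Server listening on ${PORT}`));
```

Once the server is listening, opening http://localhost:4000/api in the browser should show the JSON object returned by `apiHandler`.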
Install the unofficial ChatGPT API library and Puppeteer. The ChatGPT API uses Puppeteer as an optional peer dependency to automate bypassing the Cloudflare protections.
To use the ChatGPT API within the server/index.js file, you need to configure the file to use both the require and import keywords for importing libraries.
Therefore, update the server/package.json file to contain the type key, set to "module" so that Node.js treats the file as an ES module.
Add the code snippet below at the top of the server/index.js file.
Once you have completed the last two steps, you can now use ChatGPT within the index.js file.
Configure Nodemon by adding the start command to the list of scripts in the package.json file. The code snippet below starts the server using Nodemon.
Congratulations! You can now start the server by using the command below.
Setting up the React application
Navigate into the client folder via your terminal and create a new React.js project.
Delete the redundant files, such as the logo and the test files from the React app, and update the App.js file to display "Hello World" as below.
Navigate into the src/index.css file and copy the code below. It contains all the CSS required for styling this project.
Update the App.js file to display an input field that allows you to provide the website's URL.
Congratulations! You've successfully created the application's user interface. In the following sections, I'll walk you through scraping data from websites using Puppeteer and getting a website's description and title via ChatGPT.
How to scrape data using Puppeteer in Node.js
Puppeteer is a Node.js library that automates several browser actions such as form submission, crawling single-page applications, UI testing, and in particular, web scraping and generating screenshots of web pages.
Here, I'll guide you through scraping the website's content via Puppeteer in Node.js. We'll send the website URL provided by the user to the Node.js server and scrape the website's content via its URL.
Create an endpoint on the server that accepts the website's URL from the React app.
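As a rough sketch, the endpoint's handler could look like this. The names `isHttpUrl` and `handleUrlRequest`, the response messages, and the URL check itself are assumptions added for illustration; the handler only relies on `req.body`, `res.status()`, and `res.json()` as provided by Express with the JSON body parser enabled.

```javascript
// Returns true only for well-formed http(s) URLs.
function isHttpUrl(value) {
  try {
    const parsed = new URL(value);
    return parsed.protocol === "http:" || parsed.protocol === "https:";
  } catch {
    return false; // new URL() throws on malformed input
  }
}

// Handler for POST /api/url: reads the URL sent by the React app.
function handleUrlRequest(req, res) {
  const url = req.body && req.body.url;
  if (typeof url !== "string" || !isHttpUrl(url)) {
    return res.status(400).json({ message: "Please provide a valid http(s) URL" });
  }
  console.log("URL received:", url);
  return res.json({ message: "URL received", url });
}

// Wiring, assuming an Express `app` with express.json() enabled:
//   app.post("/api/url", handleUrlRequest);
```

Rejecting malformed input early means the browser automation in the next step is only ever started for addresses it can actually open.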
Import the Puppeteer library and scrape the website's content as done below:
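The scraping step can be sketched as a small helper. `scrapeWebsite` is a hypothetical name, and the sketch assumes it receives a browser instance already created with `puppeteer.launch()`; it uses only the standard Puppeteer calls `newPage`, `goto`, `evaluate`, and `close`.

```javascript
// Opens `url` in a new tab and returns the page's visible text.
// `browser` is assumed to come from `await puppeteer.launch()`.
async function scrapeWebsite(browser, url) {
  const page = await browser.newPage();
  try {
    // Wait until network activity has mostly settled so client-rendered content is present
    await page.goto(url, { waitUntil: "networkidle2" });
    // This callback runs inside the browser page, where `document` exists
    return await page.evaluate(() => document.documentElement.innerText.trim());
  } finally {
    await page.close(); // always release the tab, even if navigation fails
  }
}

// Usage, assuming Puppeteer is installed and imported:
//   const browser = await puppeteer.launch();
//   const websiteContent = await scrapeWebsite(browser, "https://example.com");
//   await browser.close();
```

Returning only the visible text rather than the full markup keeps the payload that is later sent to ChatGPT considerably smaller.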
Add a function within the React app that sends the URL to the api/url/ endpoint.
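One possible shape for that function is sketched here. `sendURL` and its parameters are assumptions: `setLoading` stands for the setter of a React `useState(false)` loading flag, and `fetchFn` defaults to the browser's `fetch` but can be swapped out for testing.

```javascript
// Sends the URL to the server and toggles a loading flag around the request.
async function sendURL(url, setLoading, fetchFn = fetch) {
  setLoading(true); // request is pending
  try {
    const response = await fetchFn("http://localhost:4000/api/url", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ url }),
    });
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }
    return await response.json();
  } finally {
    setLoading(false); // request finished, successfully or not
  }
}

// Usage inside a form submit handler in App.js (names are illustrative):
//   const data = await sendURL(websiteURL, setLoading);
```

Resetting the flag in `finally` keeps the interface from getting stuck in the loading state when the request fails.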
From the code snippet above, we added a loading state that describes the state of the API request.
Create a Loading component that is shown to the users when the request is pending.
Display the Loading component whenever the content is yet to be available.
Congratulations! You've learnt how to scrape content from websites using Puppeteer. In the upcoming section, you'll learn how to communicate with ChatGPT in Node.js by generating websites' descriptions and brand names.
How to communicate with ChatGPT in Node.js
ChatGPT is not yet available as a public API. Therefore, to use it, we have to scrape our way in, meaning we'll perform a full browser automation that logs in to the OpenAI website, solves the captcha, and sends an API request with the OpenAI cookies.
Fortunately, a public library that does this is available and has been installed as part of the project requirements.
Import the ChatGPT API library and create a function that sends a request to ChatGPT.
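A sketch of such a function is shown here under explicit assumptions: `api` is an already initialised client from the unofficial library, and its `sendMessage(prompt)` method resolves to an object with a `response` string. The exact return shape differs between versions of that library, so check the version you installed. The prompt wording is illustrative only.

```javascript
// Asks ChatGPT for a brand name and a description based on the scraped content.
// Assumption: `api.sendMessage(prompt)` resolves to `{ response: string }`.
async function chatgptFunction(api, content) {
  const brandName = await api.sendMessage(
    `I have the raw text of a website. What is the brand name, in a single word? ${content}`
  );
  const brandDescription = await api.sendMessage(
    `I have the raw text of a website. Extract a description of the website from it and reply with the description only. ${content}`
  );
  return {
    brandName: brandName.response.trim(),
    brandDescription: brandDescription.response.trim(),
  };
}
```

The two questions are sent one after the other rather than in parallel, which keeps the request rate low for an interface that was never meant to be used as an API.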
ChatGPT is super intelligent, and it will answer any question we ask it. So basically, we will ask it to write us the brand name and the description based on the complete website HTML. The brand name can usually be found in the "og:site_name" meta tag, but to show you how cool it is, we will let ChatGPT extract it. As for the description, it's pretty crazy: it will tell us what the site is about and summarize everything!
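For comparison, the conventional lookup of a meta tag such as og:site_name can be sketched without ChatGPT. `getMetaContent` is a hypothetical helper that works on an HTML string (for example, the result of Puppeteer's `page.content()`); it relies on regular expressions, so treat it as a rough illustration and not a robust HTML parser.

```javascript
// Returns the `content` of the first <meta> tag whose `property` or `name`
// attribute equals `key`, or null if no such tag exists.
function getMetaContent(html, key) {
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of metaTags) {
    // Attribute order varies between sites, so each attribute is matched separately
    const keyMatch = /\b(?:property|name)\s*=\s*(["'])(.*?)\1/i.exec(tag);
    const contentMatch = /\bcontent\s*=\s*(["'])(.*?)\1/i.exec(tag);
    if (keyMatch && contentMatch && keyMatch[2].toLowerCase() === key.toLowerCase()) {
      return contentMatch[2];
    }
  }
  return null;
}

// Example with a hand-written snippet of HTML
const exampleHtml = '<head><meta property="og:site_name" content="Novu"></head>';
console.log(getMetaContent(exampleHtml, "og:site_name")); // prints "Novu"
```

This only works when a site actually declares the tag, which is the gap the ChatGPT approach is meant to cover.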
Next, update the api/url route as done below:
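The updated route can be sketched as a handler that chains the two steps. All names here are assumptions: `scrape(url)` stands for whatever function returns the page text via Puppeteer, and `askChatGPT(content)` for the function that returns `{ brandName, brandDescription }`. Both are passed in so the sketch stays self-contained.

```javascript
// Scrapes the page, then asks ChatGPT to describe it.
async function describeWebsite(url, { scrape, askChatGPT }) {
  const content = await scrape(url);
  const { brandName, brandDescription } = await askChatGPT(content);
  return { brandName, brandDescription };
}

// Builds the POST /api/url handler from the two helper functions.
function createUrlHandler(deps) {
  return async function urlHandler(req, res) {
    try {
      const result = await describeWebsite(req.body.url, deps);
      res.json({ message: "Request successful!", result });
    } catch (error) {
      // Scraping and ChatGPT calls can both fail; report the failure and keep the server running
      res.status(500).json({ message: "Unable to process the website", error: error.message });
    }
  };
}

// Wiring, assuming an Express `app` and your own scraping and ChatGPT helpers:
//   app.post("/api/url", createUrlHandler({ scrape, askChatGPT }));
```

Since both steps are slow, the request can take a while to complete, which is why the client shows a loading state in the meantime.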
To display the response within the React application, create a state that holds the server's response.
Lastly, update the App.js layout to display the server's response to the user.
Congratulations! You've completed the project for this tutorial.
Here is a sample of the result from the application:
Conclusion
So far, we have covered:
- what ChatGPT is,
- how to scrape website content using Puppeteer, and
- how to communicate with ChatGPT in a Node.js application.
This tutorial walks you through an example of an application you can build using Puppeteer and ChatGPT. ChatGPT can be seen as the ultimate personal assistant, very useful in various fields to enable us to work smarter and better.
The source code for this tutorial is available here:
https://github.com/novuhq/blog/tree/main/website-aggregator-with-chatgpt-react
Thank you for reading!
Help me out!
If you feel like this article helped you, I would be super happy if you could give us a star! And let me also know in the comments: https://github.com/novuhq/novu

