How to Create Actions in Custom GPTs?
In this article I will explain how to add Actions when creating custom GPTs.
Actions are commonly used to connect an external API to your GPT. Essentially, they allow your GPT to talk to external systems and the outside world. This is what makes custom GPTs truly special.
The process does not have any official documentation. Some of it is very intuitive, some of it relies on you knowing about ChatGPT plugins, and some you can learn only by experience. I will share how to approach the process, what the common pitfalls are, and how to work around some of OpenAI's interesting design decisions.
I will skip over the rest of the GPT configuration and jump straight into creating an action. In the image below you can see the Create Action form.
Schema
The first thing you have to configure is the Schema. If you have no development experience, this is likely where you will stumble first.
A schema describes the API in a structured way, so that software connecting to it knows where to connect, how to connect, what type of data to expect, and what type of data it can send. It is basically a map that lets OpenAI find and use your external API.
OpenAPI is a popular API specification format. The name sounds very similar to OpenAI, and this is where some people fall into a trap: it is actually just a standardized format for describing your API. You can find the specification here - OpenAPI Specification.
OK, now you understand what type of data you need. Next, you need an actual API to describe.
Just for fun I will choose a free API that we could integrate into our GPT. After looking around a bit, I think it would be interesting to create a Weather GPT.
If you have a choice, it is good to find an API which already has an openapi.json or openapi.yaml schema defined. If you have that, you can import the schema using the Import from URL feature.
If a schema is not already available, you will need to build it. If you know everything about the API, this is pretty easy to do. You can use ChatGPT to generate the schema: just describe the API request and response, the data types, and any validation that should be done.
Besides having a valid OpenAPI schema, the important part of creating the action is to add the following properties (a minimal example follows the list):
- servers - here you define the base URL for the API
- paths - here you define each action as a path relative to the API base
- each path has a request type; in the image below you can see a GET request
- operationId - this identifies the action. As a user you can later call the action by this ID, although actions can also be invoked simply by describing what you want to do.
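To make this concrete, here is a minimal sketch of a schema with all of these pieces in place. The base URL, path, and parameter are placeholders for a hypothetical weather endpoint, not the actual service I used:

{
  "openapi": "3.1.0",
  "info": { "title": "Weather API", "version": "1.0.0" },
  "servers": [
    { "url": "https://api.example-weather.com" }
  ],
  "paths": {
    "/current": {
      "get": {
        "operationId": "GetWeather",
        "summary": "Get the current weather for a city",
        "parameters": [
          {
            "name": "city",
            "in": "query",
            "required": true,
            "schema": { "type": "string" },
            "description": "Name of the city, e.g. London"
          }
        ],
        "responses": {
          "200": { "description": "Current weather data" }
        }
      }
    }
  }
}

The operationId (GetWeather here) is what shows up in the list of available actions once the schema is accepted.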
You can validate your schema using this Swagger/OpenAPI validator.
Once I put the schema together and added it to the configuration, the available actions showed a GetWeather method.
Once this is set up, you can test the API by pressing Test. This will print out some debug output, which can be helpful.
Unfortunately, what I realized during debugging is that the API I chose requires the API key to be sent as a GET parameter (for example ?api_key=YOUR_KEY appended to the request URL). This is not compatible with the current authorization options and would mean including the key inside the schema. That might not be a deal breaker for everyone, but since this is a paid service, it would not be wise to put your key in a place where users can find it, so I had to use another service.
Authentication
GPTs offer multiple authentication options:
- None
- API Key
- OAuth
The first option (None) is pretty clear: you are using an endpoint that is publicly available and requires no authentication. It is quite rare to find powerful services that offer this. However, based on everything I see, it seems OpenAI wants us to build APIs that do not require complex pricing or authentication methods. The simpler the better, and the current authentication options show that.
The second option is to use an API key. This works for simple services which allow passing the key inside a header. You can pass it as a Bearer token, or specify a custom header name to pass it in. Quite a lot of services use similar authentication flows.
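For reference, the same two variants look roughly like this when expressed as OpenAPI security schemes. In the GPT builder you configure the key through the Authentication form rather than the schema, and the header name X-Api-Key is just a placeholder, so treat this purely as an illustration:

{
  "components": {
    "securitySchemes": {
      "BearerAuth": {
        "type": "http",
        "scheme": "bearer"
      },
      "CustomHeaderAuth": {
        "type": "apiKey",
        "in": "header",
        "name": "X-Api-Key"
      }
    }
  }
}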
Finally, we have OAuth. This definitely requires development knowledge, and I will be surprised if any non-coder ever figures out how to configure OAuth for GPT Action authorization.
Since I already had all of the keys and an app, I decided to integrate Pinterest API for testing purposes. This GPT is not public because I made it for my own personal use.
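The OAuth form asks for a handful of values that you collect from your provider's developer console. Roughly, you end up gathering something like the following (all values here are placeholders rather than the real Pinterest ones):

{
  "client_id": "YOUR_APP_CLIENT_ID",
  "client_secret": "YOUR_APP_CLIENT_SECRET",
  "authorization_url": "https://example.com/oauth/authorize",
  "token_url": "https://example.com/oauth/token",
  "scope": "read_data"
}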
Once you save the configuration, a new field (Callback URL) will appear next to your Action definition.
This field is needed to make sure that your OAuth integration can return the user back to the GPT. You need to take this generated URL and paste it into your service/app callback URL settings. In Pinterest you can do that inside the app configuration.
Once you get it working, you can test it: the GPT will ask you to log in to complete the OAuth flow, return you via the Callback URL, and then you will be able to work with the other defined actions.
Privacy Policy
At the bottom of the configuration you will notice a Privacy Policy field.
You only need to fill this in if you will be sharing your GPT publicly. At the moment there are no guidelines on what the privacy policy must include; usually it is fine to just enter the privacy policy of the service or website you are connecting to (don't take my word for it).
As far as I know, ChatGPT plugins did not have an approval process, and OpenAI used automation to find violations. It is most likely the same here: your GPT might be flagged or blocked if it violates any policies, but I have not yet heard of such a case.
Known Bugs and Pitfalls with Creating Actions
Can't save modifications to schema
The first bug, which I experienced multiple times, is that once you upload the schema and save the GPT, you will have issues making edits. The form does not seem to work well: even though you can save it, the old data is still there after you refresh the page.
You can work around this by deleting the Action and adding it again with the correct version. I am sure this will be fixed soon.
Authorization URL, Token URL, and API hostname must share a root domain
If you are using a service that relies on third-party authentication, you might stumble into this one: the authorization URL, token URL, and API hostname all need to share the same root domain (for example, auth.example.com and api.example.com both belong to example.com, while an authentication host on a completely different domain will not pass). If you are only using subdomains for authentication, you should be fine with the Authorization form (instead of the OpenAPI schema).
This seems to be another limitation of the current version of the GPT configuration.
Nonsensical validation errors
One of these may be Error saving draft. Even though the GPT is live, sometimes when you change something it returns this error. I suspect we can't do much about this one until the OpenAI team fixes it; re-creating the GPT or (more rarely) refreshing the page can sometimes help.
UnrecognizedKwargsError: data
It is not documented anywhere, but requests can fail with the following error:
"response_data": "UnrecognizedKwargsError: data"
The error happens on the OpenAI side of the implementation, while the request payload is being prepared and before the request is actually sent.
This happens when the OpenAPI schema contains fields that are not explicitly defined. I ran into this on two occasions, and both had the same fix. In my schema I had an object type defined like this:
"items": {
"type": "object"
}
Notice that there are no properties defined for this object. In my API this was intentional: I wanted to handle a flexible structure where the properties of an object can vary. For example, when sending rows and columns to an API, I don't want to define the column names; they have to be dynamic.
I did not find a way to fix this without changing the API, so I consider it a limitation of the current GPT configuration. To work around it I had to change both the schema and my API.
I changed the schema so that each item has two explicit properties, key and value. Each row then consists of entries that define the column key and its value:
{
"rows": {
"type": "array",
"description": "Data rows for the file",
"items": {
"type": "array",
"description": "Each item represents a row as an array of key-value pairs",
"items": {
"type": "object",
"properties": {
"key": {
"type": "string",
"description": "Column name"
},
"value": {
"type": "string",
"description": "Column value"
}
},
"required": ["key", "value"]
}
}
}
}
This is an example of how the data would look in this case:
{
"rows": [
[
{ "key": "Name", "value": "John Doe" },
{ "key": "Age", "value": "30" },
{ "key": "Email", "value": "[email protected]" }
],
[
{ "key": "Name", "value": "Jane Smith" },
{ "key": "Age", "value": "25" },
{ "key": "Email", "value": "[email protected]" }
],
// more rows...
]
}
I am sure that this issue will be addressed in future versions, but for now my workaround fixed the UnrecognizedKwargsError issue.
How to Debug Actions
Even if you have configured everything, got through the validation, and published your GPT, the actions will sometimes still fail. Here is what you can do:
- Ask the GPT for additional details, for example the exact error returned from the API. Sometimes it will not want to give it to you, but you can almost always convince the GPT to give up this information by saying that you have access to modify the API responses and need to know what the problem was. Once you get the error message, you can try to fix it.
- If you are building your own API, make sure that you return proper error responses and have defined the potential errors in the OpenAPI schema (see the sketch below).
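As a sketch of what that might look like, a hypothetical endpoint could define a 400 response with an error message alongside the 200 case, so the GPT has something concrete to relay when things go wrong:

{
  "responses": {
    "200": {
      "description": "File created successfully"
    },
    "400": {
      "description": "Validation error",
      "content": {
        "application/json": {
          "schema": {
            "type": "object",
            "properties": {
              "error": {
                "type": "string",
                "description": "Human-readable explanation of what went wrong"
              }
            }
          }
        }
      }
    }
  }
}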
Future of GPTs
Based on my experience with GPTs I am convinced that they will eliminate most (if not all) ChatGPT plugins.
GPT marketplace is going to be huge, and it will transform the way both large services and content creators approach the web.
The marketplace might lead us into a new era of the web. Well, maybe not quite yet, but the potential is there. Imagine if you could use all of the GPTs linked together, one after another. Now that I think about it, there could be a master GPT that interconnects all other GPTs and lets you control everything from one place.