Retool has launched a free LLM Playground for comparing LLM results side by side. It lists several popular large language models that you can select and use. Each selected model generates a response to the same prompt, and you can then analyze the results. Some models produce short answers while others produce detailed ones, so the platform lets you compare AI-generated responses from different models for a single prompt.
If you want to test how different LLMs behave on the same prompt in one place, this website is a good option. For now, it offers 7 LLMs, of which you can select only 3 at a time to compare. Here is the list of LLMs that Retool currently offers.
- GPT 3.5
- GPT 4
- Command XL Nightly
- Anthropic’s Claude 1.2
- Flan-T5 XXL
- Blenderbot 3B
- DialoGPT Large
From these models, you can choose any 3 to compare at the same time. The LLM Playground has a simple and intuitive interface, and you can control the token count as well as the temperature.
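If the temperature setting is unfamiliar, the short Python sketch below may help. It shows the usual idea behind temperature in text generation: the model's raw scores for candidate tokens are divided by the temperature before being turned into probabilities. The scores and the function name `softmax_with_temperature` are made up for illustration, and this is not a description of how Retool or any listed model implements it.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate tokens (illustrative only).
logits = [2.0, 1.0, 0.5]

low = softmax_with_temperature(logits, 0.2)   # sharper, more predictable
high = softmax_with_temperature(logits, 2.0)  # flatter, more varied

print("low temperature: ", [round(p, 3) for p in low])
print("high temperature:", [round(p, 3) for p in high])
```

A low temperature concentrates probability on the top-scoring token, so responses tend to be more repeatable. A high temperature spreads probability more evenly, so responses tend to vary more between runs.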
Free LLM Playground by Retool to Compare LLM Results Side by Side
You don’t need an account to try Retool’s LLM Playground. Just open the playground page and start using it. Three models are preselected for you, but you can swap them for any 3 others from the list.
To get started, enter a prompt and then specify the token count and temperature.
Click submit and wait for all the models to generate their responses. Once the responses are ready, you can compare them. You can also click on a response to copy it.
In this way, you can use this simple online tool to quickly compare the output of popular LLMs that generate text responses. One small downside is that not all the models load reliably; only some of them work some of the time. I hope this gets fixed in coming updates.
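The workflow above (one prompt, up to 3 models, shared token count and temperature) can be sketched in a few lines of Python. This is only an illustration of the idea, not Retool's code or API: `compare`, `terse_model`, and `verbose_model` are hypothetical names, the two models are stand-in functions rather than real LLMs, and words are treated as tokens for simplicity.

```python
from typing import Callable, Dict

MAX_MODELS = 3  # the playground compares at most 3 models at once

# Hypothetical model signature: (prompt, max_tokens, temperature) -> text
ModelFn = Callable[[str, int, float], str]


def compare(prompt: str, models: Dict[str, ModelFn],
            max_tokens: int = 64, temperature: float = 0.7) -> Dict[str, str]:
    """Send the same prompt and settings to every model and collect replies."""
    if not 1 <= len(models) <= MAX_MODELS:
        raise ValueError(f"choose between 1 and {MAX_MODELS} models")
    return {name: fn(prompt, max_tokens, temperature)
            for name, fn in models.items()}


# Stand-in "models" so the sketch runs without any network access.
def terse_model(prompt: str, max_tokens: int, temperature: float) -> str:
    return prompt.split()[0]  # replies with a single word


def verbose_model(prompt: str, max_tokens: int, temperature: float) -> str:
    words = prompt.split() * 3  # repeats the prompt, capped at max_tokens words
    return " ".join(words[:max_tokens])


results = compare("hello world",
                  {"terse": terse_model, "verbose": verbose_model},
                  max_tokens=4)
for name, text in results.items():
    print(f"{name}: {text}")
```

Replacing the stand-in functions with real API calls would give a scripted version of the same side-by-side comparison.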
Closing thoughts:
If you want to see how different LLMs behave on the same prompt, this web app by Retool will help. For now, there are only 7 models, but I hope more are added in coming updates. So, give this tool a try and let me know if you have any questions.