For the last 4 weeks, I’ve literally done nothing that’s not related to ChatGPT, but I assume that’s true for most of us. A couple of days ago I released a new version of Aista Magic Cloud, allowing users to scrape their own website(s), or any other website for that matter, to generate training data from the site. This allows me to create a “custom ChatGPT model”, where the model should, at least in theory, be able to answer intelligently, according to how you would want it to answer. Basically …
This turns your existing website into a custom ChatGPT brain
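To give a rough idea of what “scraping a website into training data” can look like, here is a minimal sketch in Python. It is not the actual scraper in Magic Cloud. It assumes that each heading on a page can serve as a prompt and each paragraph below it as a completion, and the names `SnippetParser`, `html_to_snippets`, `prompt` and `completion` are made up for illustration.

```python
from html.parser import HTMLParser


class SnippetParser(HTMLParser):
    """Pairs each paragraph with the most recent heading above it.

    Assumption: heading = prompt, paragraph = completion. A real scraper
    would likely be smarter about lists, code, images and navigation.
    """

    HEADINGS = {"h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.snippets = []     # finished {"prompt", "completion"} dicts
        self._tag = None       # the heading/paragraph tag currently open
        self._heading = None   # text of the most recent heading
        self._buffer = []      # text chunks of the element being read

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS or tag == "p":
            self._tag = tag
            self._buffer = []

    def handle_data(self, data):
        # Text inside nested tags (e.g. <b>) is collected as well.
        if self._tag is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return
        # Collapse runs of whitespace into single spaces.
        text = " ".join("".join(self._buffer).split())
        if tag in self.HEADINGS:
            self._heading = text
        elif text and self._heading:
            self.snippets.append({"prompt": self._heading, "completion": text})
        self._tag = None


def html_to_snippets(html):
    """Returns a list of training snippets found in one HTML document."""
    parser = SnippetParser()
    parser.feed(html)
    parser.close()
    return parser.snippets
```

Feeding it the HTML of every page on a site and concatenating the results would give a first, rough set of training snippets. Paragraphs that appear before any heading are skipped in this sketch.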
Examples of use cases I can see myself are:
- Support chat bots for companies
- Legal advice chat bots on law firms’ websites
- Investment advice on investment websites
- Medical advice on websites for medical companies
But really, I suspect the above is just the tip of the iceberg. In the following video, I demonstrate how I was able to use it to scrape Aista.com and docs.aista.com to generate a chat prompt that would answer questions roughly 50% correctly. 50% is of course far from good enough, but the system also allows you to continuously improve your models through human supervision, generating new training data, and thus iterating over and over again until you reach close to 100% accuracy.
After a simple 5-minute scraping session, the system is already capable of answering questions that ChatGPT on its primary website had no idea how to answer. These are not spectacularly good numbers, but I suspect that larger websites, producing more training data, would result in a higher success ratio. In addition, the “reinforcement training parts” allow human beings to monitor and “sanity check” the model’s questions and answers, and thus rapidly and incrementally improve its answers over time.
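The supervision loop described above could be sketched roughly as follows. This is only an illustration of the idea, not how Magic Cloud stores or processes reviews. The `Review` type and the functions `accuracy` and `new_training_snippets` are hypothetical, and the sketch assumes that a human either approves each answer or writes a corrected one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Review:
    """One question/answer pair after a human has looked at it."""
    question: str
    model_answer: str
    approved: bool                          # did the human accept the answer?
    corrected_answer: Optional[str] = None  # human-written fix, if rejected


def accuracy(reviews):
    """Fraction of reviewed answers the human approved (0.0 if none)."""
    if not reviews:
        return 0.0
    return sum(r.approved for r in reviews) / len(reviews)


def new_training_snippets(reviews):
    """Turns reviews into training data for the next iteration.

    Approved answers are kept as they are, rejected answers are replaced
    by the human correction, and rejected answers without a correction
    are dropped.
    """
    snippets = []
    for r in reviews:
        if r.approved:
            snippets.append({"prompt": r.question, "completion": r.model_answer})
        elif r.corrected_answer:
            snippets.append({"prompt": r.question, "completion": r.corrected_answer})
    return snippets
```

Each round, the new snippets are added to the training data, the model is retrained, and `accuracy` is measured again on fresh questions to see whether it is moving from 50% towards 100%.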
Does anyone here have a great suggestion for a (small) website they want me to scrape? I suspect it would need at least some 3,000 to 5,000 high quality training snippets before it’s really good at answering. Comment with a link to a nice website with good semantic data, preferably not too large, tops some 500 pages in total, and I’ll scrape it for you and create your own ChatGPT chat bot … 😉