Mochi Human Text
This project is mostly an experiment to see if I can:
- Scrape my website as training data.
- Use that training data to fine-tune a model to generate human-sounding text that reads like my own writing.
- Extend the first two steps to anyone else's website.
Here's the website.
How it works
Some prompting tricks can fool AI content detectors, though many of them are short-lived: as soon as a tactic is exposed, detectors can be trained to treat it as a sign of AI-generated text. For example, you can fool AI detectors by:
- Introducing grammatical errors.
- Misspelling words.
- Never using commas.
However, it is easy to generate a large corpus of AI-generated text that uses those tactics and retrain detectors to recognize them.
To sidestep this arms race, you can train an AI to write just like you by fine-tuning it on a large corpus of your own written text. This is what Mochi Human Text does.
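As a rough illustration of the corpus-building step, here is a minimal sketch that pulls paragraph text out of already-downloaded HTML and turns it into JSONL training records. It is not the project's actual pipeline: the names `ParagraphExtractor`, `html_to_paragraphs`, and `to_training_records`, the `min_words` filter, and the `{"text": ...}` record schema are all assumptions made for the example, and the record format your fine-tuning tool expects may differ.

```python
import json
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collects the text inside <p> tags, ignoring <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = []
        self._in_p = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "p":
            self._in_p = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag == "p" and self._in_p:
            # Collapse runs of whitespace left over from the HTML layout.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._in_p = False

    def handle_data(self, data):
        if self._in_p and self._skip_depth == 0:
            self._buffer.append(data)


def html_to_paragraphs(html):
    """Return the list of paragraph strings found in an HTML document."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


def to_training_records(paragraphs, min_words=5):
    """Serialize paragraphs as JSONL lines, dropping very short ones.

    The {"text": ...} schema is a hypothetical choice for this sketch.
    """
    return [
        json.dumps({"text": p})
        for p in paragraphs
        if len(p.split()) >= min_words
    ]
```

In use, you would fetch each page of your site with whatever HTTP client you prefer, pass the HTML to `html_to_paragraphs`, and write the lines returned by `to_training_records` to a `.jsonl` file. Filtering out short paragraphs is a design choice meant to keep navigation labels and captions out of the corpus.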
If an AI writes exactly like you and its output is still flagged as AI content, you could argue that the detector has produced a false positive: had you written the same text yourself, an overfitted detector would have flagged your own writing as AI content too.
That said, most AI content detectors will likely err on the side of false negatives rather than false positives, since false positives carry consequences in academic and certain business settings.
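To make the false-positive versus false-negative trade-off concrete, here is a small sketch that assumes a detector outputs a score between 0 and 1 and flags text as AI when the score reaches a threshold. The function `error_rates` and the scores below are made up for illustration and do not come from any real detector.

```python
def error_rates(scores, is_ai, threshold):
    """Return (false_positive_rate, false_negative_rate) at a threshold.

    A text is flagged as AI when its score is >= threshold.
    A false positive is human text flagged as AI;
    a false negative is AI text that is not flagged.
    """
    pairs = list(zip(scores, is_ai))
    fp = sum(1 for s, ai in pairs if not ai and s >= threshold)
    fn = sum(1 for s, ai in pairs if ai and s < threshold)
    humans = sum(1 for _, ai in pairs if not ai)
    ais = sum(1 for _, ai in pairs if ai)
    fpr = fp / humans if humans else 0.0
    fnr = fn / ais if ais else 0.0
    return fpr, fnr


# Toy, invented scores: three human-written texts, then three AI-written texts.
scores = [0.1, 0.3, 0.6, 0.4, 0.7, 0.9]
is_ai = [False, False, False, True, True, True]

lenient = error_rates(scores, is_ai, threshold=0.5)
strict = error_rates(scores, is_ai, threshold=0.8)
```

On this toy data, raising the threshold from 0.5 to 0.8 removes the one false positive but lets an additional AI-written text through. A detector that wants to avoid wrongly accusing human writers would be expected to make that kind of trade.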