How a 19-year-old asked the internet for $35,000 to "finish training" an AI that was already finished — by other people, at other companies, for free.
Subject: ArcleIntelligence | u/That-Bookkeeper-8316
Before we get to the code, let's appreciate the craftsmanship of the ask itself. u/Divine_Snafu spotted it immediately. Every sentence does a job.
My name is Abhinav Anand.
19 years old. [AGE] Class 12. Bihar. [ORIGIN]
My father is a government officer. My mother is a housewife. This is a middle class family in Bihar. [SYMPATHY SIGNAL]
₹9,64,000 on GPU compute is not a small number for us.
The west has OpenAI. The east has DeepSeek. India deserves its own [NATIONALISM SIGNAL] — built by Indians, for everyone, with no strings attached.
Current status: Training ongoing. Raising $35,000 [MONEY ASK] to complete the full pipeline.
Support if you want to:
🇮🇳 India (UPI): rzp.io/rzp/ArcleIntelligence-crowdfunding
🌍 International: paypal.me/AbhinavAnand848
As u/Divine_Snafu (57 upvotes) observed: every element of this pitch is a calculated emotional trigger.
One look at download.sh in the GitHub repo tells the whole story.
Here is what was said versus what the code actually does.
The script pulls tiiuae/Falcon-H1-3B-Base from HuggingFace. That's the backbone. Nothing was trained from scratch.
It also pulls zai-org/GLM-OCR, which publicly scores 94+ on the same benchmark. The score is inherited, not earned.
When u/IssueUpstairs6935 opened download.sh, the whole thing collapsed in 73 upvotes.
# ArcleIntelligence — Download Models + Datasets
# Run once before training: bash download.sh

models = [
    ("tiiuae/Falcon-H1-3B-Base", f"{MODELS}/falcon_h1_3b_base", ...),
    ↑ THE BACKBONE. A 3B model from TII, downloaded wholesale.
    ↑ Not 5.82B. Not trained. Not original.
    ("zai-org/GLM-OCR", f"{MODELS}/glm_ocr", ...),
    ↑ GLM-OCR ALREADY SCORES 94+ ON OMNIDOCBENCH PUBLICLY.
    ↑ The "93.45" score is just... GLM-OCR doing its thing.
    ("google/siglip2-large-patch16-384", ...),  # image encoder
    ("openai/whisper-medium", ...),             # audio encoder
    ("stabilityai/sd-vae-ft-mse", ...),         # image generation
    ("hexgrad/Kokoro-82M", ...),                # text-to-speech
]
# Total: 6 models, 0 of them trained by ArcleIntelligence.
# Total downloads: ~13 GB of other people's work.
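The five-minute check described here can be approximated with a few lines of Python. This is a minimal sketch, not the commenter's actual method: `find_repo_ids` and the `REPO_ID` pattern are hypothetical names, and the pattern simply assumes that a quoted `org/name` string in a download script is a Hugging Face repo ID. The sample text is copied from the excerpt above, elisions included.

```python
import re

# Hypothetical pattern: a quoted string shaped like "org/name".
# Assumption: in a download script, such strings are Hugging Face repo IDs.
REPO_ID = re.compile(r'["\']([A-Za-z0-9][\w.-]*/[A-Za-z0-9][\w.-]*)["\']')


def find_repo_ids(script_text: str) -> list[str]:
    """Return repo-ID-looking strings, de-duplicated, in order of appearance."""
    found: list[str] = []
    for match in REPO_ID.finditer(script_text):
        repo_id = match.group(1)
        if repo_id not in found:
            found.append(repo_id)
    return found


# Lines taken from the download.sh excerpt quoted in the article.
sample = '''
models = [
    ("tiiuae/Falcon-H1-3B-Base", f"{MODELS}/falcon_h1_3b_base", ...),
    ("zai-org/GLM-OCR", f"{MODELS}/glm_ocr", ...),
    ("google/siglip2-large-patch16-384", ...),
    ("openai/whisper-medium", ...),
    ("stabilityai/sd-vae-ft-mse", ...),
    ("hexgrad/Kokoro-82M", ...),
]
'''

external_models = find_repo_ids(sample)
for repo_id in external_models:
    print(repo_id)
print(f"{len(external_models)} external models referenced")
```

Local paths such as `{MODELS}/falcon_h1_3b_base` are skipped because they do not start with a letter or digit after the opening quote. A list like this does not prove anything on its own, but it shows a reader which upstream projects to compare the claims against.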
# ══════════════════════════════════════════
# IDENTITY SYSTEM — built to hide the truth
# ══════════════════════════════════════════
_TRIGGERS = [
    "what model", "which model", "what llm", "what language model",
    "are you falcon", "are you gpt", "are you claude", "are you gemini",
    "what are you based on", "underlying model", "base model", "foundation model",
    "how were you trained", "who made you", "how many parameters",
]
_RESPONSES = [
    "I'm ArcleIntelligence... I don't share details about my internal architecture.",
    "My name is ArcleIntelligence... I'm not able to share technical details.",
]

def strip_leakage(text: str) -> str:
    # Strips these words from any model responses:
    for word in ["falcon", "glm-ocr", "siglip", "whisper", "kokoro",
                 "stable diffusion", "huggingface"]:
        text = p.sub(r'\1 ArcleIntelligence', text)
    return text
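To make the mechanism concrete, here is a minimal sketch of how an identity filter of this kind could work. It is not the repo's code: the excerpt never shows how `p` is defined, so the whole-word, case-insensitive pattern below is an assumption, as are the helper names `is_identity_question` and `answer`. The trigger and word lists are subsets of the ones quoted above.

```python
import random
import re

# Subsets of the lists quoted in the excerpt.
TRIGGERS = ["what model", "are you falcon", "base model", "how many parameters"]
RESPONSES = [
    "I'm ArcleIntelligence... I don't share details about my internal architecture.",
]
LEAK_WORDS = ["falcon", "glm-ocr", "siglip", "whisper", "kokoro",
              "stable diffusion", "huggingface"]


def is_identity_question(prompt: str) -> bool:
    """True if the prompt contains any trigger phrase (case-insensitive)."""
    lowered = prompt.lower()
    return any(trigger in lowered for trigger in TRIGGERS)


def strip_leakage(text: str) -> str:
    """Replace upstream model names with the wrapper's own name."""
    for word in LEAK_WORDS:
        # Assumption: whole-word, case-insensitive matching. The original
        # pattern is not shown in the excerpt, so this is a guess.
        pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
        text = pattern.sub("ArcleIntelligence", text)
    return text


def answer(prompt: str, model_reply: str) -> str:
    """Hypothetical wrapper: canned reply for identity questions, scrubbed reply otherwise."""
    if is_identity_question(prompt):
        return random.choice(RESPONSES)
    return strip_leakage(model_reply)


print(answer("Are you Falcon?", "Yes, I am Falcon-H1."))
print(answer("Who wrote this?", "I am built on Falcon and Whisper."))
```

Two layers are visible even in this reduced form: questions about provenance never reach the model, and any upstream name that slips into an ordinary reply is rewritten before the user sees it.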
Why would an original model need a strip_leakage() function to remove those exact names from responses?
Let's do the math that the r/indianstartups community did, and the 1,068 upvoting readers apparently did not.
The pipeline wraps zai-org/GLM-OCR, which already achieves 94+ on OmniDocBench V1.5. Wrapping it and running the benchmark produces... approximately 93-94. Extraordinary coincidence.
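The parameter side of that math can be sketched the same way. The figures below are not measurements: 5.82B is the headline claim, 3B is the nominal size implied by the name Falcon-H1-3B-Base, and 150M is one commenter's estimate of the newly written layers, quoted further down in this piece. The variable and function names are illustrative only.

```python
# All three figures are taken from the thread, not measured.
CLAIMED_PARAMS = 5.82e9   # headline claim for the "ArcleIntelligence" model
BACKBONE_PARAMS = 3.0e9   # nominal size implied by the name Falcon-H1-3B-Base
NEW_PARAMS = 150e6        # a commenter's estimate of the newly written MLP layers


def share(part: float, whole: float) -> float:
    """Fraction of `whole` accounted for by `part`."""
    return part / whole


backbone_share = share(BACKBONE_PARAMS, CLAIMED_PARAMS)
new_share = share(NEW_PARAMS, CLAIMED_PARAMS)

print(f"Backbone share of claimed size: {backbone_share:.1%}")
print(f"Newly written share of claimed size: {new_share:.1%}")
```

If the commenter's estimate is roughly right, only a few percent of the claimed parameter count would be original work, with the rest made up of downloaded components.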
Post 1 got 1,068 upvotes. Then people started reading the code. Post 2 got a 48% upvote ratio — the internet's polite way of saying no.
It's a total disaster. If you just look at the download.sh in the repo. OP claims a 5.82B model but the script downloads a 3B Falcon base. They're also just downloading GLM-OCR (which already has that 94+ score) and calling it their own work. It's a wrapper project with an $11,500 budget hidden behind it.
I won't talk about the model but the pitch. 19yo - young talent signalled. Bihar, middle class - poor n sympathy signalled. No cs degree - hard worker signalled. India vs the west OpenAI - also done.
Why would people pay just to release it open source? For the amount you are asking, a small multimodal model is not going to be production use on any task.
It reminds me of the COVID period, when LinkedIn was full of students from tier 2 and 3 colleges claiming they had "cracked COVID" using AI models and were asking for funding from people like Bill Gates. Please don't tag India anywhere.
Just checked the source code, it's purely vibecoded, nothing new. If you want to raise funds you'll need more credibility. One look at this guy's GitHub makes it clear.
Even the headline was written by ChatGPT *emdash* 😭
Reuses pretrained GLM-OCR, SigLIP, Whisper, SD-VAE, SD-UNet, Kokoro. Using these high-performing modules explains the benchmark scores. This appears to be a wrapper built on public pretrained models, not evidence of a fully trained proprietary 5.82B model.
Why most of (at least 4) your repositories are copied from a git account called SuperShary? Not even forks but downloaded as zip then readme.md changed → push to your own git. Why do you only have 27 git contributions?
You copy-pasted Falcon-3B, slapped on frozen SigLIP, Whisper, GLM-OCR, SD 1.5 + LCM-LoRA, and Kokoro, then wrote 150M parameters worth of baby MLPs and called it 'ArcleIntelligence.' That's not building a model, that's playing with Lego blocks and telling everyone you invented skyscrapers.
"I get around 6100 usd of run pod credits from there but using some illegal stuff like making multiple email ID and apply"
— u/That-Bookkeeper-8316, admitting in comments to fraudulently obtaining compute credits by creating fake accounts. The "personal savings" story gets more interesting.
A timeline of a grift: from viral post to ratio, with a pit stop at deleting the evidence.
u/IssueUpstairs6935's debunk comment was sitting at the top of the Reddit thread the whole time. Eight outlets published the story anyway. None clicked the GitHub link. None ran the code. None asked for a working demo.
Time from Reddit post → India Today publish: <24 hours | Time to check download.sh: ~5 minutes
download.sh → see 6 external models being fetched
"My debunk comment was sitting at the top of the original thread the whole time. They didn't even look at it. They just saw the Reddit engagement, swallowed the '19-year-old Bihar prodigy' hook and hit publish."
— u/IssueUpstairs6935
There is no 5.82B model being trained. There is no $11,560 shortfall. There is a wrapper around six open-source models, a benchmark score borrowed from GLM-OCR, a GitHub account with 27 contributions, and a $35,000 ask.
The payment links in this article are listed as a warning, not a recommendation. Do not use them. If you have already donated, contact your bank or PayPal for a chargeback.
If you want to support real Indian AI research: look for projects that publish model weights, share training runs publicly, and don't ask for money before showing working code.