Whisper AI · 35+ languages

Convert Audio & Video to Text with AI

Paste a video link or drop an audio file — Whisper-powered AI turns speech into accurate text

Try for Free
30 free minutes. No credit card required.

35+ languages supported

1000+ sites supported

30 free minutes

Everything you need for audio & video

Speech Recognition

Accurate transcription in 35+ languages with automatic speaker detection and timestamped output

Any Source

Copy a link from YouTube, Instagram, VK, Vimeo, Google Drive, and 1,000+ other platforms

Smart Summary

AI extracts key points, important facts, and conclusions — a concise overview in an adaptive format

Flexible Export

Download results as PDF, Word, TXT, Markdown, CSV, or subtitles (SRT/VTT) — all with speaker labels

Three steps to your transcript

1. Add your recording

Paste a video or audio URL from any site, or drag and drop a file right into the browser

2. AI processes your audio

Whisper detects the language, splits speech by speaker, and adds timestamps automatically

3. Download or share

Read the transcript and its AI summary online, export in your preferred format, or send a link to colleagues

About

DictAI is an AI-powered transcription service that converts audio and video into accurate text. Whether you're a marketer, product manager, content creator, podcaster, journalist, teacher, lawyer, researcher, student, or team — we make it easy to get searchable, shareable text from any media: interviews, lectures, calls, podcasts, webinars, and meetings.

Powered by Whisper

DictAI runs on Whisper, one of the most accurate speech recognition models, with support for 35+ languages and speaker detection.

AI Summaries

Every transcription comes with an AI-generated summary highlighting key points, important facts, and author conclusions.

1000+ Sources

Extract audio from YouTube, Instagram, Vimeo, Google Drive, and hundreds of other platforms automatically.

Secure & Private

Your data is encrypted and processed securely. Delete anytime — we respect your privacy.

Pricing

Simple, transparent pricing. Start free, upgrade as you grow.

Free
Try it out
$0
  • 30 minutes / month
  • Files up to 200MB
  • Up to 30 min per file
  • 1 file at a time
  • Export TXT and Markdown
  • AI summary not included (paid plans only)
Starter
For beginners and small tasks
$11/mo
  • 500 minutes / month
  • Files up to 500MB
  • Up to 3h per file
  • Up to 3 files at once
  • All export formats
  • AI summary & key highlights
  • Custom summary prompt
  • Share links
Pro (Popular)
For regular use
$20/mo
  • 1000 minutes / month
  • Files up to 1GB
  • Up to 3h per file
  • Up to 5 files at once
  • All export formats
  • AI summary & key highlights
  • Custom summary prompt
  • Share links
  • Priority processing
Business
For teams and heavy workloads
$53/mo
  • 3000 minutes / month
  • Files up to 5GB
  • Up to 3h per file
  • Up to 10 files at once
  • All export formats
  • AI summary & key highlights
  • Custom summary prompt
  • Share links
  • Priority processing

FAQ

Frequently asked questions about DictAI

We support MP3, MP4, WAV, WEBM, M4A, AVI, MOV, OGG, and FLAC formats. Maximum file size depends on your plan: from 200 MB (Free) to 5 GB (Business).
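As a rough illustration of how the format list and per-plan limits combine, the sketch below picks the smallest plan that accepts a given file. It is not DictAI's own code: the `Plan` class, the `smallest_plan_for` helper, and the choice of decimal units (1 GB = 1000 MB) are assumptions made for this example. The limits themselves are copied from the pricing table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    name: str
    max_file_mb: int       # largest accepted file, in MB
    max_file_minutes: int  # longest accepted recording, in minutes


# Limits taken from the pricing table; 1 GB is assumed to mean 1000 MB.
PLANS = [
    Plan("Free", 200, 30),
    Plan("Starter", 500, 180),
    Plan("Pro", 1000, 180),
    Plan("Business", 5000, 180),
]

# Formats listed in the FAQ.
SUPPORTED = {"mp3", "mp4", "wav", "webm", "m4a", "avi", "mov", "ogg", "flac"}


def smallest_plan_for(filename: str, size_mb: float, duration_min: float) -> Optional[str]:
    """Return the cheapest plan that accepts the file, or None if none does."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext not in SUPPORTED:
        return None
    # PLANS is ordered from cheapest to most expensive.
    for plan in PLANS:
        if size_mb <= plan.max_file_mb and duration_min <= plan.max_file_minutes:
            return plan.name
    return None
```

For example, `smallest_plan_for("lecture.mp4", 800, 90)` returns `"Pro"`, because the file is too large for Starter and too long for Free.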

We support 1000+ sites including YouTube, Instagram, Vimeo, SoundCloud, Twitter/X, Reddit, VK, Google Drive, and many more. Simply paste the URL and we handle the rest.

We use Whisper, one of the most accurate speech-to-text engines available. It supports 35+ languages with speaker diarization and smart formatting.

The AI summary provides a concise overview of the transcript, key topics discussed, important facts, and author conclusions — all generated automatically. You can also set a custom prompt for summaries in project settings.

Yes! You can generate a share link for any transcription. The recipient can view the transcript and summary without needing an account. You can revoke the link at any time.

1 credit = 1 minute of audio. Credits are deducted based on the actual duration of the audio, rounded up to the nearest minute.
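The rounding rule can be sketched in a few lines. This is a minimal illustration, not DictAI's billing code: the function name `credits_for` is hypothetical, and it assumes that rounding is applied per recording and that an empty recording costs nothing.

```python
import math


def credits_for(duration_seconds: float) -> int:
    """Credits for one recording: 1 credit per minute, rounded up."""
    if duration_seconds <= 0:
        return 0  # assumption: an empty recording is not charged
    return math.ceil(duration_seconds / 60)
```

Under these assumptions a recording of exactly 60 seconds costs 1 credit, while one of 61 seconds costs 2.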

You can export transcriptions as PDF, DOCX (Word), TXT, Markdown (MD), CSV, or subtitles (SRT, VTT). All formats include speaker labels and timestamps.

Yes. Files are stored in encrypted storage, transcriptions are processed securely, and we never share your data. You can delete your account and all data at any time.

Ready to turn speech into text?

Try free — 30 minutes of transcription, no credit card required.

Start for Free