
LuxASR

Automatic Speech Recognition for Luxembourgish

Use LuxASR to transcribe Luxembourgish speech to text. Either upload an audio/video file, use the microphone to record your speech, or paste a YouTube link and hit ‘Transcribe!’. LuxASR is fast: it can transcribe up to 170 words per second. Try the examples. LuxASR is now also available as a smartphone app. You can find it in the App Store and on Google Play.

With this interface, we are giving access to our best-performing tool for automatic speech recognition of Luxembourgish (speech-to-text). It has been trained on 150+ hours of carefully controlled pairs of audio and transcription snippets and achieves a word error rate below 10%, i.e. fewer than 10 errors per 100 running words (punctuation and capitalization included). We provide this tool to facilitate the transcription of Luxembourgish audio recordings into written text for research purposes, but also for general public use.
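To make the word error rate figure concrete, here is a minimal sketch of how such a score is commonly computed: the word-level edit distance between a reference transcript and the recognized text, divided by the number of reference words. Splitting on whitespace, so that punctuation and capitalization differences count as errors, is an assumption made for illustration; the exact scoring procedure used for LuxASR is not described here, and the function name is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Case and punctuation are part of each token, so two of the six words differ.
score = word_error_rate("Moien, wéi geet et dir haut?", "moien, wéi geet et dir haut")
print(f"WER: {score:.1%}")
```

A score below 0.10 on a test set corresponds to the "below 10%" figure quoted above.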

Available options

Several audio input languages are available (default: Luxembourgish). If the recording contains more than one speaker, setting diarization to ‘On’ separates the text of each speaker and adds time codes for their turns. Note that diarization adds some extra time to the recognition process.

In the web interface, six output formats are available: enriched text (colored_text, downloadable as TXT/DOCX), SubRip subtitles (srt), MAXQDA transcript (maxqda), Praat TextGrid (textgrid), fully aligned Praat TextGrid (textgrid_aligned, based on MMS forced alignment and MFA resources for Luxembourgish), and JSON (json). The API also accepts plain text output via outfmt=text. The output files can be downloaded through the link below the transcription.

Recognition takes up to 5% of the audio file’s duration (e.g., 3 minutes for 60 minutes of audio). Once the recognition process has started, an estimated time and a timer are displayed so you can keep track of the progress.
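The 5% rule of thumb can be turned into a quick upper-bound estimate. The sketch below only restates that arithmetic; the function name is hypothetical, and the extra time added by diarization or by waiting in the queue is not modeled.

```python
MAX_RECOGNITION_FRACTION = 0.05  # "up to 5%" of the audio duration, as stated above


def max_recognition_seconds(audio_seconds: float) -> float:
    """Upper-bound estimate of the recognition time for a recording."""
    if audio_seconds < 0:
        raise ValueError("audio duration cannot be negative")
    return audio_seconds * MAX_RECOGNITION_FRACTION


# 60 minutes of audio: at most about 3 minutes of recognition time.
print(max_recognition_seconds(60 * 60) / 60, "minutes")
```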

As an experimental feature, translation into other languages has been added: the recognized text can be output in English, German, Portuguese, Spanish or French. Note that translations take more time to run and are only available for short recordings (max. 3 minutes). The quality of these translations may vary. You can also try our stand-alone translation system LuxMT.

The maximum upload size is 500 MB. Audio files should be in WAV, MP3 or M4A format, video files in MP4 format.
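If you prepare files in a script, a local check before uploading can save a failed attempt. The following is a minimal sketch of such a check, not part of LuxASR itself: the function name is hypothetical, and interpreting 500 MB as 500 × 1024 × 1024 bytes is an assumption, since the exact byte limit is not stated.

```python
from pathlib import Path

MAX_UPLOAD_BYTES = 500 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a"}
VIDEO_EXTENSIONS = {".mp4"}


def upload_problems(filename: str, size_bytes: int) -> list[str]:
    """Return a list of reasons why a file would likely be rejected."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in AUDIO_EXTENSIONS | VIDEO_EXTENSIONS:
        problems.append(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file is larger than 500 MB")
    return problems


# An empty list means the file passes both checks.
print(upload_problems("interview.MP3", 10_000_000))
```

For a file on disk, the size can be read with Path(path).stat().st_size.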

FAQ

Timecodes for SRT have been improved and should be largely correct. It is now also possible to select the length of the subtitle segments (default: 42 characters). Line breaks follow syntactic boundaries for better readability.
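To illustrate what a segment length limit means in practice, the sketch below packs whole words greedily into segments of at most 42 characters. This is deliberately simpler than what LuxASR does: it ignores syntactic boundaries and timecodes, and the function name is hypothetical.

```python
import textwrap

DEFAULT_SEGMENT_LENGTH = 42  # default subtitle segment length mentioned above


def split_into_segments(text: str, max_chars: int = DEFAULT_SEGMENT_LENGTH) -> list[str]:
    """Greedily pack whole words into segments of at most max_chars characters.

    A single word longer than max_chars is kept intact on its own segment.
    """
    if max_chars < 1:
        raise ValueError("max_chars must be positive")
    return textwrap.wrap(
        text,
        width=max_chars,
        break_long_words=False,
        break_on_hyphens=False,
    )


sentence = "LuxASR can transcribe different languages in the same audio to a limited extent."
for segment in split_into_segments(sentence):
    print(len(segment), segment)
```

A purely length-based split like this can separate words that belong together, which is why breaking at syntactic boundaries tends to read better.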

LuxASR can process long recordings (up to a file size of 500 MB). Because some users reported problems with long files, we have implemented a new queue system in which you can monitor the status of your transcription job. Please make sure that your browser connection is not interrupted.

Sometimes short turns by speakers or many overlapping speaker turns are not segmented correctly. This is a limitation of the diarization method we are using; we are trying to improve this.

LuxASR can transcribe different languages in the same audio to a limited extent. The best results are achieved for Luxembourgish, as this is our main focus. Due to its closeness to Luxembourgish, German sections in an audio may sometimes contain errors or a mixture of Luxembourgish and German. As a workaround, to transcribe German-only audios, set the input language to German.

Translation into other languages is experimental and works only for short audios. For translation only, please use our dedicated system LuxMT.

A copy button has been added to conveniently copy the entire transcript to another application. On mobile devices you can also use the share button.

Use output format maxqda to generate transcript text for MAXQDA import. Each speaker paragraph ends with a timestamp in MAXQDA-compatible form [hh:mm:ss.x].
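As an illustration of the timestamp shape described here, the sketch below formats a time offset in seconds as [hh:mm:ss.x]. Treating x as tenths of a second is an assumption, and the function name is hypothetical; LuxASR produces these timestamps itself, so this is only useful for post-processing or for checking your own files.

```python
def format_maxqda_timestamp(seconds: float) -> str:
    """Format an offset in seconds as [hh:mm:ss.x].

    Assumption: x is tenths of a second.
    """
    if seconds < 0:
        raise ValueError("timestamp cannot be negative")
    total_tenths = round(seconds * 10)
    whole_seconds, tenths = divmod(total_tenths, 10)
    minutes, secs = divmod(whole_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"[{hours:02d}:{minutes:02d}:{secs:02d}.{tenths}]"


print(format_maxqda_timestamp(65.3))  # [00:01:05.3]
```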

Issues with the misplacement of the apostrophe in Luxembourgish ‘d’’ (e.g. in ‘d’Land’) have been fixed.

We provide some of our fine-tuned ASR models with an Open Weights license. The smaller models (tiny, small, base, medium) are thus available on Hugging Face to be integrated into your apps. Our most performant model (large-v3-turbo), which is used on this website, in our API and Apps, is not yet freely available. Contact us if you are interested in acquiring a license.

If you encounter errors while using LuxASR, please report them to peter.gilles@uni.lu with the exact time at which they occurred and the approximate size of the audio file.

API Access

We are now opening the API for limited access. We reserve the right to modify or suspend access to the API at any time. If you plan to integrate our service into another application, contact us first for permission and conditions. The LuxASR API can be reached via:

curl -X POST "https://luxasr.uni.lu/v2/asr?diarization=Enabled&outfmt=colored_text" \
  -H "accept: application/json" \
  -F "audio_file=@PATH/TO/AUDIO FILE;type=audio/wav"

The API returns the transcription in the specified output format.

Query Parameters

  • diarization: Can be set to Enabled (default) or Disabled to include or exclude speaker diarization.
  • outfmt: Specifies the output format. Supported values are:
    • colored_text – enriched text with interactive features and confidence highlighting
    • text – plain text transcript (default)
    • json – JSON output
    • srt – SubRip subtitle format
    • maxqda – transcript text with MAXQDA-compatible #hh:mm:ss.x# timestamps
    • textgrid – Praat TextGrid format
    • textgrid_aligned – aligned TextGrid format using MMS and MFA Luxembourgish resources

Accepted audio formats are .wav, .mp3, and .m4a.
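The same request can be sent from Python. The following is a sketch using only the standard library, not an official client. The endpoint, the query parameters and the audio_file form field are taken from the curl example above; the helper names, the hand-written multipart encoding, the 10-minute timeout and decoding the response as UTF-8 are assumptions made for illustration.

```python
import mimetypes
import urllib.parse
import urllib.request
import uuid
from pathlib import Path

API_URL = "https://luxasr.uni.lu/v2/asr"  # endpoint from the curl example
OUTPUT_FORMATS = {
    "colored_text", "text", "json", "srt",
    "maxqda", "textgrid", "textgrid_aligned",
}


def build_asr_url(diarization: bool = True, outfmt: str = "text") -> str:
    """Build the request URL with the two documented query parameters."""
    if outfmt not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {outfmt}")
    query = urllib.parse.urlencode({
        "diarization": "Enabled" if diarization else "Disabled",
        "outfmt": outfmt,
    })
    return f"{API_URL}?{query}"


def encode_multipart(field: str, filename: str, payload: bytes,
                     content_type: str) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, Content-Type).

    Minimal encoder: the filename must not contain double quotes.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, diarization: bool = True, outfmt: str = "text",
               timeout: float = 600.0) -> str:
    """Upload an audio file and return the response body as text."""
    audio = Path(path)
    content_type = mimetypes.guess_type(audio.name)[0] or "application/octet-stream"
    body, header = encode_multipart("audio_file", audio.name,
                                    audio.read_bytes(), content_type)
    request = urllib.request.Request(
        build_asr_url(diarization, outfmt),
        data=body,
        headers={"accept": "application/json", "Content-Type": header},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")  # assumption: UTF-8 response


# Example call (performs a real upload, so it is left commented out):
# print(transcribe("interview.wav", diarization=True, outfmt="srt"))
```

Building the URL separately from sending the request makes it easy to check the parameters without uploading anything.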

Data protection

Note that the transcription and the translation are run on a dedicated server at the University of Luxembourg. All data thus stays within Luxembourg and the University’s network. Nobody has access to the uploaded audio or the text output. The audio data is streamed to this server and no files are stored on this server or in the network. No data is used to further train the model and no data is transferred to third parties.

Contact

LuxASR is under constant development by Peter Gilles, Léopold Hillah, and Nina Hosseini-Kivanani at the University of Luxembourg and is supported by the Chambre des Députés du Grand-Duché de Luxembourg.