With this interface, we provide access to our best-performing tool for automatic speech recognition (speech-to-text) of Luxembourgish. It has been trained on 150+ hours of carefully controlled pairs of audio and transcription snippets and achieves a word error rate below 10%, i.e. fewer than 10 errors per 100 running words (punctuation and case included 😛). We provide this tool to facilitate the transcription of Luxembourgish audio recordings into written text for research purposes, but also for general public use. The resulting text follows the current spelling rules of 2019.
Available options 
Several audio input languages are available (default: Luxembourgish). If the recording contains more than one speaker, setting diarization to "On" will separate the text of each speaker in the recording and add time codes for their turns. Note that diarization adds some extra time to the recognition process. Five output formats are available: enriched text (with interactive features and confidence highlighting), plain text (txt), SubRip subtitles (srt), JSON (with or without time codes for words), and Praat TextGrid. These files can be downloaded through the link below the transcription. Recognition takes up to 5% of the audio file's duration. Once the recognition process has started, an estimated time and a timer are displayed to keep track of the progress.
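To make the 5% figure concrete, here is a small sketch that turns an audio duration into an upper bound for the recognition time. The function name `max_recognition_seconds` is hypothetical, and the source does not say whether the figure also covers diarization or translation, so treat the result as a rough guide only.

```python
def max_recognition_seconds(audio_seconds: float, max_percent: float = 5.0) -> float:
    """Upper bound on recognition time, given the 'up to 5%' figure above."""
    if audio_seconds < 0:
        raise ValueError("audio duration cannot be negative")
    return audio_seconds * max_percent / 100


# A one-hour recording (3600 s) should take at most about three minutes.
one_hour_bound = max_recognition_seconds(3600)
```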
As an experimental feature, text translation into other languages has been added; it outputs the recognized text in English, German, Portuguese, Spanish, or French. Note that translations take more time to run and only run for short audio files. The quality of these translations may vary.
The maximum upload size is 500 MB. The preferred audio file format is "wav" with a sampling frequency of 16,000 Hz.
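Before uploading, a local file can be checked against these two recommendations. The sketch below uses only the Python standard library; the names `check_wav`, `MAX_UPLOAD_BYTES` and `PREFERRED_RATE_HZ` are hypothetical, and reading "500 MB" as 500,000,000 bytes is an assumption (the service may count differently).

```python
import os
import wave

MAX_UPLOAD_BYTES = 500 * 1000 * 1000  # assumption: "500 MB" read as decimal megabytes
PREFERRED_RATE_HZ = 16_000            # preferred sampling frequency stated above


def check_wav(path: str) -> list[str]:
    """Return human-readable warnings; an empty list means nothing was flagged."""
    warnings = []

    size = os.path.getsize(path)
    if size > MAX_UPLOAD_BYTES:
        warnings.append(
            f"file is {size} bytes, above the assumed limit of {MAX_UPLOAD_BYTES} bytes"
        )

    # wave.open raises wave.Error if the file is not a readable WAV file.
    with wave.open(path, "rb") as wav_file:
        rate = wav_file.getframerate()
    if rate != PREFERRED_RATE_HZ:
        warnings.append(
            f"sampling rate is {rate} Hz, preferred is {PREFERRED_RATE_HZ} Hz"
        )
    return warnings
```

A file with a different sampling rate is only flagged here, not rejected, since 16,000 Hz is described as preferred rather than required.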
For the new BAS TextGrid option, the audio and text are sent to the BAS services. Please review their privacy statement for more information.
API Access 
API access is now open. The LuxASR API can be reached via:
curl -X POST "https://luxasr.uni.lu/v2/asr?diarization=Enabled&outfmt=text" \
  -H "accept: application/json" \
  -F "audio_file=@PATH/TO/AUDIO FILE;type=audio/wav"
 
The API returns the transcription in the specified output format.
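The same request can be sketched in Python with the standard library alone. The endpoint, the query parameters, the `accept` header and the form field name `audio_file` are taken from the curl command above; the function names `build_request` and `transcribe` are hypothetical, and treating the response body as UTF-8 text is an assumption.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path
from urllib.parse import urlencode

ENDPOINT = "https://luxasr.uni.lu/v2/asr"  # from the curl example


def build_request(audio_path, diarization="Enabled", outfmt="text"):
    """Build a multipart/form-data POST request carrying the audio file."""
    path = Path(audio_path)
    boundary = uuid.uuid4().hex
    # Fall back to a generic type if the extension is unknown.
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (
            'Content-Disposition: form-data; name="audio_file"; '
            f'filename="{path.name}"\r\n'
        ).encode(),
        f"Content-Type: {content_type}\r\n\r\n".encode(),
        path.read_bytes(),
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    query = urlencode({"diarization": diarization, "outfmt": outfmt})
    return urllib.request.Request(
        f"{ENDPOINT}?{query}",
        data=body,
        method="POST",
        headers={
            "accept": "application/json",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def transcribe(audio_path, diarization="Enabled", outfmt="text"):
    """Send the request and return the response body as text (assumed UTF-8)."""
    request = build_request(audio_path, diarization, outfmt)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `transcribe("path/to/your_audio.wav")` would upload the file and return whatever the service sends back. Note that `build_request` reads the whole file into memory, which may not be ideal for uploads near the size limit.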
Query Parameters 
  diarization: can be set to Enabled (default) or Disabled to include or exclude speaker diarization.
  outfmt: specifies the output format. Supported values are:
      colored_text – enriched text with interactive features and confidence highlighting
      text – plain text transcript (default)
      json – detailed JSON output
      srt – SubRip subtitle format
      textgrid – Praat TextGrid format
Accepted audio formats are .wav, .mp3, and .m4a.
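The parameter values and file types listed above can be checked on the client side before a request is sent. The following sketch is illustrative only: the names `validate_options` and `is_accepted_audio` are hypothetical, and comparing file extensions case-insensitively is an assumption about how the service behaves.

```python
from pathlib import Path

DIARIZATION_VALUES = ("Enabled", "Disabled")
OUTPUT_FORMATS = ("colored_text", "text", "json", "srt", "textgrid")
AUDIO_SUFFIXES = (".wav", ".mp3", ".m4a")


def validate_options(diarization: str = "Enabled", outfmt: str = "text") -> dict:
    """Return the query parameters as a dict, or raise ValueError on bad values."""
    if diarization not in DIARIZATION_VALUES:
        raise ValueError(f"diarization must be one of {DIARIZATION_VALUES}")
    if outfmt not in OUTPUT_FORMATS:
        raise ValueError(f"outfmt must be one of {OUTPUT_FORMATS}")
    return {"diarization": diarization, "outfmt": outfmt}


def is_accepted_audio(path: str) -> bool:
    """True if the file extension is among the accepted audio formats."""
    # Assumption: extensions are matched case-insensitively.
    return Path(path).suffix.lower() in AUDIO_SUFFIXES
```

Catching an unsupported value locally avoids uploading a large file only to have the request rejected.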
Python Script 
The basic Python script linked below replicates the functionality of the curl command with added flexibility: you can specify the audio file and optionally choose whether to enable diarization and which output format to use.
Download the Python script to use the LuxASR API.
API Documentation 
View the API Documentation for detailed information about the LuxASR API.
Usage 
python luxasr_transcribe.py path/to/your_audio.wav --diarization Enabled --outfmt json
 
Replace path/to/your_audio.wav with the path to your actual audio file. The --diarization and --outfmt options are optional and default to Enabled and text, respectively.
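For illustration, a command-line interface with these options and defaults could be declared with `argparse` as sketched below. This is not the content of the downloadable script, whose internals are not shown here; the function name `build_parser` and the help texts are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare one positional argument and the two optional flags described above."""
    parser = argparse.ArgumentParser(
        description="Send an audio file to the LuxASR API (illustrative sketch)."
    )
    parser.add_argument("audio_file", help="path to the audio file to transcribe")
    parser.add_argument(
        "--diarization",
        choices=("Enabled", "Disabled"),
        default="Enabled",
        help="include or exclude speaker diarization",
    )
    parser.add_argument(
        "--outfmt",
        choices=("colored_text", "text", "json", "srt", "textgrid"),
        default="text",
        help="output format of the transcription",
    )
    return parser


# Parsing an explicit argument list mirrors the usage line above.
args = build_parser().parse_args(
    ["path/to/your_audio.wav", "--diarization", "Enabled", "--outfmt", "json"]
)
```

Using `choices` lets `argparse` reject unsupported values with a clear message before any upload is attempted.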
Disclaimer 
Note that the transcription and the translation are run on a dedicated server at the University of Luxembourg. All data thus stays within Luxembourg and the University’s network. Nobody has access to the uploaded audio or the text output. The audio data is streamed to this server and no files are stored on this server or in the network. No data is used to further train the model and no data is transferred to third parties.
Contact 
Learn more about LuxASR. LuxASR is under constant development by Peter Gilles, Léopold Hillah, and Nina Hosseini-Kivanani at the University of Luxembourg and is supported by the Chambre des Députés du Grand-Duché de Luxembourg.