How to find duration of .amr files in Java
Author: Saurabh Bhardwaj
A little background first …
AMR, which stands for Adaptive Multi-Rate, is an audio format that is widely used in audio recording applications. Unlike a constant-bit-rate MP3, where the duration follows directly from the file size and the bit rate, an AMR file can switch between bit rates ranging from 4.75 to 12.2 kbit/s from one frame to the next. Therefore, computing the audio duration isn’t as simple as dividing the file size by the bit rate.
A typical use case for applications is listing the AMR files present on a device, along with their durations.
There must be some online utility / package that can do this …
Apache Tika is a content analysis toolkit that detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). Tika is a project of the Apache Software Foundation.
At the time of writing, although Tika can detect AMR files easily, it doesn’t provide a way to get their duration.
Another tool, MediaInfo, does all of this and can be installed on a server. It is known to be commonly used in AWS workflows. Apart from the fact that it has no documentation (since there is no API as such, usage is via curl & shell calls), it is a strong contender for our use case.
MediaInfo can accurately give the duration of AMR files, with the small trade-off of having to set it up on a server. At the time of writing, I could not find a way to integrate it directly into application code as a library.
Can’t I just process the file myself in Java ?
Sure you can, it’s a bit complicated though. Here’s how I did it… in Java. But first, we must understand the anatomy of an amr file.
Checking out the specifications
Developers at Nokia published a wiki page on the AMR format a long time ago. It is not available anymore, but one can find it in the web archives here. One thing to note from here on is that AMR-NB, i.e. AMR Narrowband, is the more common variant, and the approach below works only with AMR-NB files.
Playing with bytes
To detect whether a file is an AMR file and to determine its duration, we need to parse the file byte by byte. Let’s say we have a file named “out.amr” which is a valid AMR file.
File file = new File("out.amr");
Since we have to work our way through bytes of this file, we can read the bytes into a byte array which we will later loop over.
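// Files.readAllBytes (java.nio.file.Files) reads the whole file and closes it;
// a single InputStream.read(byte[]) call is not guaranteed to fill the array
byte[] bytesArr = Files.readAllBytes(file.toPath());
int numberOfBytes = bytesArr.length;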
Each AMR file begins with a 6-byte header that identifies the file as AMR audio. This header is always set to the following byte values: 0x23, 0x21, 0x41, 0x4D, 0x52, 0x0A (the ASCII text “#!AMR” followed by a line feed). Below is one simple way to identify whether a file is actually an AMR file.
// compare the first 6 bytes with the AMR header values
// (every value is below 0x80, so comparing against signed bytes is safe)
if (numberOfBytes < 6 ||
    bytesArr[0] != 0x23 ||
    bytesArr[1] != 0x21 ||
    bytesArr[2] != 0x41 ||
    bytesArr[3] != 0x4D ||
    bytesArr[4] != 0x52 ||
    bytesArr[5] != 0x0A)
{
    System.out.println(" Not a valid AMR file. Bye bye.. ");
    return false;
}
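The same check can also be written against a single constant. The following is only a minimal sketch: the class name AmrMagicCheck and the method hasAmrNbHeader are hypothetical names chosen for illustration, and it relies on the fact that the six header bytes spell “#!AMR” plus a line feed in ASCII.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, not part of the original code
class AmrMagicCheck {
    // 0x23 0x21 0x41 0x4D 0x52 0x0A is the ASCII text "#!AMR\n"
    static final byte[] AMR_NB_MAGIC = "#!AMR\n".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the data starts with the six AMR-NB header bytes. */
    static boolean hasAmrNbHeader(byte[] data) {
        if (data == null || data.length < AMR_NB_MAGIC.length) {
            return false; // too short to contain the header
        }
        // compare only the first six bytes with the constant
        return Arrays.equals(Arrays.copyOf(data, AMR_NB_MAGIC.length), AMR_NB_MAGIC);
    }

    public static void main(String[] args) {
        byte[] sample = {0x23, 0x21, 0x41, 0x4D, 0x52, 0x0A, 0x3C};
        System.out.println(hasAmrNbHeader(sample)); // prints "true"
    }
}
```

Keeping the header bytes in one constant also guards against files shorter than six bytes, which an index-by-index comparison would not handle on its own.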
Starting with the 7th byte, the file is arranged as a sequence of audio frames. Each frame holds 20 ms of audio. So, to get the duration, all we need to do is find the number of frames and multiply it by 20 ms. But since there is no marker separating one frame from the next in the byte sequence, we need to figure out the size of each frame in bytes so that we can count them.
The secret of every frame’s size in bytes lies in the way it is encoded. Each frame can be encoded using one of 8 different levels of compression. Each of these 8 levels is mapped to a mode (called CMR here, short for Codec Mode Request; RFC 4867 names the corresponding header field FT, for frame type).
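Each CMR corresponds to a different bit rate (hence the name “adaptive multi-rate”) as well as a frame size in bytes. The table below lists the valid values of CMR with the corresponding bit rate and frame size (header byte included):
CMR    Bit rate (kbit/s)    Frame size (bytes)
0      4.75                 13
1      5.15                 14
2      5.90                 16
3      6.70                 18
4      7.40                 20
5      7.95                 21
6      10.2                 27
7      12.2                 32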
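The frame sizes are not arbitrary: each one is the number of bits produced in 20 ms at that bit rate, rounded up to whole bytes, plus the 1-byte frame header. Here is a small sketch of that arithmetic. The class name AmrFrameSizes is hypothetical, and the bit rates are the standard AMR-NB rates rather than values taken from the code in this article.

```java
// Hypothetical helper class illustrating where the frame sizes come from
class AmrFrameSizes {
    // Standard AMR-NB bit rates in bit/s for modes 0..7
    static final int[] BIT_RATES = {4750, 5150, 5900, 6700, 7400, 7950, 10200, 12200};

    /** Frame size in bytes for a mode (0..7), header byte included. */
    static int frameSizeInBytes(int mode) {
        int payloadBits = BIT_RATES[mode] * 20 / 1000; // bits produced in one 20 ms frame
        int payloadBytes = (payloadBits + 7) / 8;       // round up to whole bytes
        return 1 + payloadBytes;                        // add the 1-byte frame header
    }

    public static void main(String[] args) {
        for (int mode = 0; mode < BIT_RATES.length; mode++) {
            System.out.println("mode " + mode + " -> " + frameSizeInBytes(mode) + " bytes");
        }
    }
}
```

For example, mode 7 produces 12200 * 0.02 = 244 bits per frame, which rounds up to 31 bytes, giving 32 bytes with the header.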

How to get this CMR then … ?
Each frame consists of a 1-byte header, and the rest of the frame is audio data. Imagine that below are the 8 bits of this header, with the least significant bit on the right. If we number these bits starting from 0 as the least significant, CMR is made up of the bits numbered 6, 5, 4 and 3.
Most significant -> [X C C C C X X X] <- Least significant
                     7 6 5 4 3 2 1 0
A valid CMR value ranges from 0 to 7, which fits in the lower 3 of these 4 bits. The 4th bit, numbered 6 in the above representation, is set only for other frame types such as SID frames, which are used for producing comfort noise. We ignore that bit here, which assumes the file contains only speech frames. So, we need to process the bits numbered 5, 4 and 3 to get our CMR in binary. Below is the code to achieve this.
int numberOfFrames=0; // we need to count this
// ... starting from 7th byte
for(int i=6; i < numberOfBytes; i++)
{
// the next byte will be a header for next frame so we process that to get frame size
int frameSizeInBytes = processFrameHeaderByte(bytesArr[i]);
// increment number of frames
numberOfFrames++;
// reduce by 1 because header byte is part of the frame too
int frameBytesToSkip = frameSizeInBytes-1;
//skip that many bytes to reach next frame
i=i+frameBytesToSkip;
}
Below is the function to process frame header byte:
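/**
* Returns the size in bytes of the frame that starts with the given header byte.
* @param headerByte first byte of the frame
* @return frame size in bytes, header byte included (0 if the mode is not in the map)
*/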
private int processFrameHeaderByte(byte headerByte)
{
// prepare a string of binary values, so that we convert it to int later
StringBuilder binStr = new StringBuilder();
// get the values at 5th, 4th, 3rd position in the byte & append to the string (most significant first)
binStr.append(getBit((int) headerByte, 5));
binStr.append(getBit((int) headerByte, 4));
binStr.append(getBit((int) headerByte, 3));
// convert to codec mode / CMR value, integer (0 - 7)
int codecmode = Integer.parseInt(binStr.toString(),2);
// use the map of codec mode & frame size to get size of this header byte's frame
Integer frameSize = codecModeFrameSizeMap.get(codecmode);
// return the size
if(frameSize!=null) {
return frameSize;
}else{
return 0;
}
}

Getting the bit at a given position can be done as simply as:
public byte getBit(int byteValue, int position)
{
return (byte) ((byteValue >> position) & 1);
}
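The three getBit calls and the binary string can also be collapsed into a single shift and mask. The sketch below is an alternative way to get the same 0 to 7 value, not the code used in this article; the names AmrHeaderBits, codecMode and fourBitField are hypothetical.

```java
// Hypothetical helper class showing shift-and-mask extraction
class AmrHeaderBits {
    /** Returns bits 5, 4 and 3 of the header byte as a value from 0 to 7. */
    static int codecMode(byte headerByte) {
        // shift bit 3 down to position 0, then keep the lowest three bits
        return (headerByte >> 3) & 0x07;
    }

    /** Returns bits 6, 5, 4 and 3 of the header byte as a value from 0 to 15. */
    static int fourBitField(byte headerByte) {
        return (headerByte >> 3) & 0x0F;
    }

    public static void main(String[] args) {
        System.out.println(codecMode((byte) 0x3C));    // prints "7"
        System.out.println(fourBitField((byte) 0x44)); // prints "8"
    }
}
```

The mask also takes care of negative byte values: the sign bits that the shift copies in from the left are discarded by the AND.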
The codec mode to frame size map in our code is :
public Map<Integer, Integer> getCodecModeFrameSizeMap()
{
Map<Integer, Integer> codecModeFrameSizeMap = new HashMap<Integer, Integer>();
codecModeFrameSizeMap.put(0,13);
codecModeFrameSizeMap.put(1,14);
codecModeFrameSizeMap.put(2,16);
codecModeFrameSizeMap.put(3,18);
codecModeFrameSizeMap.put(4,20);
codecModeFrameSizeMap.put(5,21);
codecModeFrameSizeMap.put(6,27);
codecModeFrameSizeMap.put(7,32);
return codecModeFrameSizeMap;
}
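Putting the pieces together, the whole computation can be sketched in one small class. This is a sketch under stated assumptions, not the exact code from this article: the class name AmrDurationSketch and its method are hypothetical, it reads all four bits 6 to 3 instead of three, and the sizes for frame type 8 (SID, 6 bytes) and frame type 15 (no data, 1 byte) are taken from RFC 4867 rather than from the map above. It also stops at a truncated final frame instead of counting it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical end-to-end sketch for AMR-NB files only
class AmrDurationSketch {
    static final int HEADER_LENGTH = 6;
    static final int FRAME_MILLIS = 20;
    // index = 4-bit value of header bits 6..3; value = frame size in bytes, header included.
    // Entries 8 (SID) and 15 (no data) are assumptions from RFC 4867; -1 marks types not handled.
    static final int[] FRAME_SIZES = {13, 14, 16, 18, 20, 21, 27, 32, 6, -1, -1, -1, -1, -1, -1, 1};

    /** Returns the duration in milliseconds of AMR-NB data held in memory. */
    static long durationMillis(byte[] data) {
        if (data.length < HEADER_LENGTH
                || data[0] != 0x23 || data[1] != 0x21 || data[2] != 0x41
                || data[3] != 0x4D || data[4] != 0x52 || data[5] != 0x0A) {
            throw new IllegalArgumentException("not an AMR-NB file");
        }
        long frames = 0;
        int i = HEADER_LENGTH;
        while (i < data.length) {
            int size = FRAME_SIZES[(data[i] >> 3) & 0x0F];
            if (size < 0 || i + size > data.length) {
                break; // unhandled frame type or truncated last frame
            }
            frames++;
            i += size; // jump straight to the next frame header
        }
        return frames * FRAME_MILLIS;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {
            byte[] data = Files.readAllBytes(Paths.get(args[0]));
            System.out.println("time in ms=" + durationMillis(data));
        }
    }
}
```

An array indexed by the header value avoids the boxing and null check that a Map<Integer, Integer> lookup needs, which matters little for one file but keeps the loop simple.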
Finally, after getting the number of frames, we can get the duration in ms as follows:
System.out.println("time in ms="+(numberOfFrames*20));
// outputs "time in ms=33900"
Or format the duration more elaborately as follows:
// integer division already truncates, so no casts are needed
int totalMillis = numberOfFrames * 20;
System.out.println("duration = " +
(totalMillis / 60000) + " minutes, " +
((totalMillis / 1000) % 60) + " seconds, " +
(totalMillis % 1000) + " millis");
//outputs "duration = 0 minutes, 33 seconds, 900 millis"
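If the formatted duration is needed in more than one place, the arithmetic can be wrapped in a small helper. This is just a sketch; the class name AmrDurationFormat and the method format are hypothetical names, not part of the original code.

```java
// Hypothetical helper class for formatting a duration given in milliseconds
class AmrDurationFormat {
    /** Formats a millisecond count as "M minutes, S seconds, N millis". */
    static String format(long totalMillis) {
        long minutes = totalMillis / 60_000;      // whole minutes
        long seconds = (totalMillis / 1000) % 60; // remaining whole seconds
        long millis = totalMillis % 1000;         // remaining milliseconds
        return String.format("%d minutes, %d seconds, %d millis", minutes, seconds, millis);
    }

    public static void main(String[] args) {
        int numberOfFrames = 1695; // 1695 frames * 20 ms = 33900 ms
        System.out.println("duration = " + format(numberOfFrames * 20L));
        // prints "duration = 0 minutes, 33 seconds, 900 millis"
    }
}
```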
How fast is it though ?
Even though we are processing the bytes of the file, we skip n-1 bytes per frame, where n is the frame size in bytes, so only one byte per frame is actually inspected. With the file available locally (ignoring the time taken to fetch a file remotely), I ran this code on a 4 MB AMR file. It took 41 ms to process the file and return the duration.
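To repeat this kind of measurement, a simple wall-clock timer around the call is enough for a rough number. The sketch below uses hypothetical names (AmrTimingSketch, elapsedMillis, countStrides) and a zero-filled 4 MB buffer rather than a real AMR file, so it only illustrates how the timing could be taken, not the figure quoted above.

```java
// Hypothetical timing helper; results vary by machine and JVM warm-up
class AmrTimingSketch {
    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long elapsedMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    /** Counts how many positions are visited when stepping through the array one frame at a time. */
    static int countStrides(byte[] data, int frameSize) {
        int count = 0;
        for (int i = 0; i < data.length; i += frameSize) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] fake = new byte[4 * 1024 * 1024]; // stand-in buffer, not real audio
        long ms = elapsedMillis(() -> countStrides(fake, 32));
        System.out.println("elapsed ms=" + ms);
    }
}
```

A single run like this includes JIT warm-up, so treat the result as an order-of-magnitude figure.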