Per-room credentials for real-time media
date: May 18, 2022
Ravel was a VR collaboration platform. Users joined shared 3D spaces where they could see each other, talk, present, and interact with content. Each room needed two real-time connections: Agora for audio/video and Photon for multiplayer state. Both services authenticate clients using tokens. The question was where and how to generate those tokens so they stay scoped, short-lived, and tied to the correct room.
Hardcoding API keys on the client was not an option. A leaked key would give anyone access to create or join arbitrary channels on our Agora account. With enterprise clients running sessions with 50 to 100 concurrent users, the blast radius of a credential leak was real.
Token generation on the backend
The backend generated tokens on demand. When a user joined a room, the frontend requested a token from our Spring Boot API, passing the channel name (derived from the room ID) and the user's ID. The backend signed the token server-side and returned it with an explicit expiry.
The core of this was HMAC-SHA1 signing. Agora's token format packs a set of privilege claims (join channel, publish audio, publish video, publish data) into a binary message, signs it with the app certificate, and base64-encodes the result.
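import java.security.InvalidKeyException;
import java.security.NoSuchAlgorithmException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Signs the packed message, using the app certificate bytes as the HMAC key.
static byte[] encodeHMAC(byte[] key, byte[] message)
        throws NoSuchAlgorithmException, InvalidKeyException {
    SecretKeySpec keySpec = new SecretKeySpec(key, "HmacSHA1");
    Mac mac = Mac.getInstance("HmacSHA1");
    mac.init(keySpec);
    return mac.doFinal(message);
}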
The app certificate never leaves the backend. The frontend receives only the derived token, and a token alone cannot be used to forge new ones; that requires the certificate.
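As a usage sketch of the sign-then-encode step, the snippet below signs a placeholder message and base64-encodes the signature. The class name `HmacSketch`, the helper names, the key, and the message are all hypothetical; Agora's real token layout packs more fields than a single string, so this only illustrates the HMAC step.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.InvalidKeyException;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demo class; only encodeHMAC mirrors the helper from the post.
final class HmacSketch {

    static byte[] encodeHMAC(byte[] key, byte[] message)
            throws NoSuchAlgorithmException, InvalidKeyException {
        SecretKeySpec keySpec = new SecretKeySpec(key, "HmacSHA1");
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(keySpec);
        return mac.doFinal(message);
    }

    // Convenience wrapper: UTF-8 encodes both inputs and hides checked exceptions.
    static byte[] sign(String secret, String message) {
        try {
            return encodeHMAC(
                    secret.getBytes(StandardCharsets.UTF_8),
                    message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA1 unavailable or key invalid", e);
        }
    }

    // Signature as base64 text, the form a client would receive.
    static String signToBase64(String secret, String message) {
        return Base64.getEncoder().encodeToString(sign(secret, message));
    }

    // Signature as lowercase hex, handy for comparing against published test vectors.
    static String signToHex(String secret, String message) {
        StringBuilder hex = new StringBuilder();
        for (byte b : sign(secret, message)) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // Placeholder values, not real credentials.
        System.out.println(signToBase64("example-app-certificate", "room-42|user-7"));
    }
}
```

The client only ever sees the output of `signToBase64`. Producing a valid signature for a different message requires the secret, which stays on the server.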
Scoping tokens to rooms and roles
Each token was bound to three things: a specific channel name, a specific user account, and a role. We defined two roles: ROLE_PUBLISHER and ROLE_SUBSCRIBER. Publishers could send audio and video. Subscribers could only receive.
public enum AgoraRole {
    ROLE_PUBLISHER,
    ROLE_SUBSCRIBER
}
The controller accepted the role as a query parameter, with the channel and user ID carried in the JSON request body:
@PostMapping(value = "/create", consumes = "application/json")
public ResponseEntity<AgoraTokenGetDto> createTokenForStreaming(
        @RequestParam(name = "agoraRole") AgoraRole agoraRole,
        @RequestBody AgoraTokenPostDto agoraTokenPostDto
) {
    return ResponseEntity.ok()
            .body(agoraTokenGenerator.generateAgoraStreamingToken(
                    agoraTokenPostDto, agoraRole));
}
The generator then built the token with the matching privilege set. A subscriber token could not be used to broadcast. A token for room A could not join room B. The AccessToken class packed each privilege with the same expiry timestamp:
builder.addPrivilege(AccessToken.Privileges.kJoinChannel, privilegeTs);
if (role == Role.Role_Publisher) {
    builder.addPrivilege(AccessToken.Privileges.kPublishAudioStream, privilegeTs);
    builder.addPrivilege(AccessToken.Privileges.kPublishVideoStream, privilegeTs);
    builder.addPrivilege(AccessToken.Privileges.kPublishDataStream, privilegeTs);
}
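The role-to-privilege mapping can be sketched in isolation, without the SDK. The enums `TokenRole` and `Privilege` and the method `privilegesFor` below are hypothetical stand-ins, not Agora's types; the sketch only shows the shape of the rule: everyone may join, only publishers may send.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-ins for the SDK's role and privilege types.
final class PrivilegeSketch {

    enum TokenRole { PUBLISHER, SUBSCRIBER }

    enum Privilege { JOIN_CHANNEL, PUBLISH_AUDIO, PUBLISH_VIDEO, PUBLISH_DATA }

    // Maps each granted privilege to its expiry, expressed in epoch seconds.
    static Map<Privilege, Long> privilegesFor(TokenRole role, long expiresAtEpochSeconds) {
        Map<Privilege, Long> granted = new EnumMap<>(Privilege.class);
        // Every role may join the channel.
        granted.put(Privilege.JOIN_CHANNEL, expiresAtEpochSeconds);
        // Only publishers may send audio, video, or data.
        if (role == TokenRole.PUBLISHER) {
            granted.put(Privilege.PUBLISH_AUDIO, expiresAtEpochSeconds);
            granted.put(Privilege.PUBLISH_VIDEO, expiresAtEpochSeconds);
            granted.put(Privilege.PUBLISH_DATA, expiresAtEpochSeconds);
        }
        return granted;
    }

    public static void main(String[] args) {
        long expiry = 1_700_000_000L; // arbitrary example timestamp
        System.out.println(privilegesFor(TokenRole.SUBSCRIBER, expiry).keySet());
        System.out.println(privilegesFor(TokenRole.PUBLISHER, expiry).keySet());
    }
}
```

Because the privilege set is part of the signed message, a client cannot upgrade a subscriber token to a publisher token without invalidating the signature.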
The token also included a random salt and an internal timestamp, both packed into the signed message. This made every token unique: two tokens requested for the same user, channel, and role at different times would produce different signatures. A captured token still worked until it expired, so the expiry, not the salt, bounded the damage from a leak.
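To illustrate how a salt and timestamp affect the signature, here is a minimal sketch. The message layout (two 4-byte integers followed by length-prefixed strings) and every name in `SaltedSignatureSketch` are assumptions made for the example; they are not Agora's wire format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical layout: salt | issuedAt | len(channel) | channel | len(user) | user.
final class SaltedSignatureSketch {

    static String sign(byte[] key, String channel, String userId,
                       int salt, int issuedAtEpochSeconds) {
        byte[] channelBytes = channel.getBytes(StandardCharsets.UTF_8);
        byte[] userBytes = userId.getBytes(StandardCharsets.UTF_8);

        // Length prefixes keep ("ab", "c") and ("a", "bc") from packing identically.
        ByteBuffer message = ByteBuffer.allocate(
                4 + 4 + 2 + channelBytes.length + 2 + userBytes.length);
        message.putInt(salt);
        message.putInt(issuedAtEpochSeconds);
        message.putShort((short) channelBytes.length);
        message.put(channelBytes);
        message.putShort((short) userBytes.length);
        message.put(userBytes);

        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(key, "HmacSHA1"));
            return Base64.getEncoder().encodeToString(mac.doFinal(message.array()));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA1 unavailable or key invalid", e);
        }
    }

    public static void main(String[] args) {
        byte[] key = "example-app-certificate".getBytes(StandardCharsets.UTF_8); // placeholder
        SecureRandom random = new SecureRandom();
        int now = (int) (System.currentTimeMillis() / 1000L);

        // Same channel and user, fresh salt each time: the signatures differ.
        System.out.println(sign(key, "room-42", "user-7", random.nextInt(), now));
        System.out.println(sign(key, "room-42", "user-7", random.nextInt(), now));
    }
}
```

With a fixed salt and timestamp the output is reproducible, which is what lets any backend instance holding the certificate issue tokens independently.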
Expiry
Tokens expired after 24 hours. The response DTO included the expiry explicitly so the frontend could track it:
agoraTokenGetDto.setExpiresAt(OffsetDateTime.now().plusHours(24));
In practice, sessions rarely lasted that long. But the 24-hour window covered edge cases: users leaving a tab open overnight, reconnecting after a network drop, or rejoining the same room across multiple short sessions without re-authenticating. Short enough to limit damage from a leaked token, long enough to avoid disrupting active sessions.
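One way to keep the expiry reported to the frontend and the expiry signed into the token in agreement is to derive both from a single instant. The sketch below assumes the token's privilege timestamp is expressed in epoch seconds; `TokenExpirySketch`, `isExpired`, and the injected `Clock` are hypothetical names chosen for illustration and testability.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Hypothetical helper; the 24-hour lifetime comes from the post, the rest is illustrative.
final class TokenExpirySketch {

    static final Duration TOKEN_LIFETIME = Duration.ofHours(24);

    // Expiry reported to the frontend in the response DTO.
    static OffsetDateTime expiresAt(Clock clock) {
        return OffsetDateTime.now(clock).plus(TOKEN_LIFETIME);
    }

    // Same instant as epoch seconds, for the privilege timestamp inside the token (assumed unit).
    static long privilegeTimestamp(OffsetDateTime expiresAt) {
        return expiresAt.toEpochSecond();
    }

    // True once the current time has reached or passed the expiry.
    static boolean isExpired(OffsetDateTime expiresAt, Clock clock) {
        return !OffsetDateTime.now(clock).isBefore(expiresAt);
    }

    public static void main(String[] args) {
        Clock clock = Clock.systemUTC();
        OffsetDateTime expiry = expiresAt(clock);
        System.out.println("expiresAt   = " + expiry);
        System.out.println("privilegeTs = " + privilegeTimestamp(expiry));
        System.out.println("expired now = " + isExpired(expiry, clock));
    }
}
```

Passing a `Clock` instead of calling `OffsetDateTime.now()` directly makes the 24-hour boundary testable with a fixed clock.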
Frontend consumption
On the client side, the Call component received the Agora token as a prop. It joined the channel on mount, passing the app ID, room ID (used as the Agora channel name), the server-generated token, and the user's session ID:
useEffect(() => {
  join(appid, photonRoomId, agoraToken, sessionUserId);
  return function () {
    leave();
  };
}, []);
The useAgora hook wrapped the Agora SDK. It managed local audio/video tracks, remote user subscriptions, mute state, and device selection. The hook never generated or modified tokens. It received one and used it to authenticate with Agora's servers. Clean separation: the backend owned credential generation, the frontend owned media state.
Capacity constraints
The frontend enforced soft limits on concurrent media streams. No more than 12 participants could enable audio simultaneously. No more than 10 could enable video. These limits came from Agora's performance characteristics at the time, not from token restrictions.
if (remoteUsers.filter((user) => user.hasAudio).length < 12) {
  joinAudio();
} else {
  warningNotification("No more than 12 participants can enable audio");
}
Tokens granted permission. Client-side logic managed resource budgets. Two separate concerns, handled in two separate places.
Host moderation
The moderation system operated independently of tokens. A host could mute everyone, mute individual users, or disable all webcams. These actions flowed through WebSocket state, not token revocation:
if (moderationRoomState.muteEveryone) {
  if (!moderationUserState.overrideMuteEveryone) {
    forceAudioMute();
  }
}
Revoking a token mid-session would disconnect the user entirely. Moderation needed to be granular: mute someone's mic without kicking them from the room.
What held up
This pattern scaled cleanly from small demos to 100-person sessions. The backend generated tokens in single-digit milliseconds. Tokens were stateless (no database lookup required, no session table). Signing needed only the request inputs and the app certificate, so it could run on any backend instance without coordination.
The key design choice was keeping the token lifecycle simple: generate on join, expire after 24 hours, never revoke. Moderation and capacity management used separate, faster mechanisms. Each layer did one thing.
I was Technical Director and co-founder at Ravel from 2021 to October 2022.