Towards a generic orchestrator: The VRTogether project experience
Part 3/3: API Detail
In the third and last of our series of articles on the VRTogether Orchestrator (you can read Part 1 and Part 2 if you missed them) we cover the details of its current API.
In the following paragraphs, we go into the details of each category of services:
Authentication and logging
The first step a user takes to access the VRT platform is to authenticate with their credentials. This authentication phase is deliberately simple for now, as it is assumed to run in a safe environment. The orchestrator currently provides two main functions:
- Login(userName, userPassword)
- Logout()
To manage platform sessions, the orchestrator exposes a set of API functions that handle the multiple aspects of a collaborative and scripted experience. The core orchestrator integrates an internal data model based on four main objects:
- User: a person who wants to share an immersive social experience with other people.
- Session: a session gathers users who want to share an immersive social experience together, based on a Scenario instance.
- Scenario: a scenario refers to a virtual world composed of different locations called rooms. The scenario can include the description of the underlying logic.
- Room: a room is a virtual location which is part of the scenario instance.
When a user creates a session, a scenario must be attached to it. The scenario, chosen among the available ones, is instantiated and bound to the session. This attached scenario is called a ScenarioInstance. Several sessions can create a ScenarioInstance from the same Scenario. Each ScenarioInstance runs its own logic independently (scene events, gathering users into rooms, etc.).
Each session can have at most one “master” user (who may differ from the user who created the session). The master user is responsible for gathering and resolving the various interaction events emitted by all the users within the session. In practice, the master user runs the “server game loop” that ensures the scheduling and the consistency of the scenario logic during the session.
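The relationships described above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the class and field names are hypothetical, users are represented by plain IDs, and the rule that the first eligible user becomes master is our own simplification, not the orchestrator's documented behavior.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Room:
    room_id: str
    user_ids: List[str] = field(default_factory=list)


@dataclass
class Scenario:
    scenario_id: str
    room_ids: List[str]


@dataclass
class ScenarioInstance:
    """Per-session copy of a Scenario, with its own rooms and state."""
    scenario_id: str
    rooms: Dict[str, Room]


@dataclass
class Session:
    session_id: str
    scenario_instance: ScenarioInstance
    user_ids: List[str] = field(default_factory=list)
    master_id: Optional[str] = None  # at most one master per session

    def join(self, user_id: str, can_be_master: bool = False) -> None:
        self.user_ids.append(user_id)
        # Assumption: the first eligible user to join becomes the master.
        if can_be_master and self.master_id is None:
            self.master_id = user_id


def instantiate(scenario: Scenario) -> ScenarioInstance:
    """Build a fresh, independent instance of a scenario."""
    return ScenarioInstance(
        scenario_id=scenario.scenario_id,
        rooms={rid: Room(rid) for rid in scenario.room_ids},
    )


# Two sessions instantiate the same scenario independently.
scenario = Scenario("museum", ["lobby", "gallery"])
session_a = Session("a", instantiate(scenario))
session_b = Session("b", instantiate(scenario))

session_a.join("alice")                      # creator, not eligible as master
session_a.join("bob", can_be_master=True)    # becomes the master
session_a.join("carol", can_be_master=True)  # master slot already taken
session_a.scenario_instance.rooms["lobby"].user_ids.append("alice")
```

Because each session holds its own ScenarioInstance, moving a user into a room of one session leaves the rooms of the other session untouched.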
The internal data model basic representation is based on the following schema:
For session management, the orchestrator first provides a set of functions to get information on scenarios and instantiated scenarios.
- GetScenarios()
- GetScenarioInfo(scenarioId)
- GetScenarioInstanceInfo(scenarioId)
Then there are several functions to create, join, leave, delete, or get information on sessions and rooms:
- AddSession(sessionName, sessionDescription, scenarioId, [canBeMaster])
- DeleteSession(sessionId)
- GetSessions()
- GetSessionInfo()
- JoinSession(sessionId, [canBeMaster])
- LeaveSession()
- GetRooms()
- GetRoomInfo()
- JoinRoom(roomId)
- LeaveRoom()
Finally, there are functions to start and stop a scenario: a scenario has an internal clock and specific logic that launches events and actions at specific times.
- StartScenario()
- RestartScenario()
- StopScenario()
The state diagram below shows how a user connects to the orchestrator, creates a session with a scenario, joins this session, and then joins a room of the scenario.
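The flow can also be read as a small state machine keyed by API function names. The sketch below is an assumption-laden illustration: the state names are hypothetical, and we assume that creating a session does not automatically join it, which matches the order given in the text but may differ from the real orchestrator.

```python
from enum import Enum, auto


class State(Enum):
    LOGGED_OUT = auto()
    LOGGED_IN = auto()
    IN_SESSION = auto()
    IN_ROOM = auto()


# (current state, API function name) -> next state.
# Assumption: AddSession only creates the session; JoinSession enters it.
TRANSITIONS = {
    (State.LOGGED_OUT, "Login"): State.LOGGED_IN,
    (State.LOGGED_IN, "AddSession"): State.LOGGED_IN,
    (State.LOGGED_IN, "JoinSession"): State.IN_SESSION,
    (State.IN_SESSION, "JoinRoom"): State.IN_ROOM,
    (State.IN_ROOM, "LeaveRoom"): State.IN_SESSION,
    (State.IN_SESSION, "LeaveSession"): State.LOGGED_IN,
    (State.LOGGED_IN, "Logout"): State.LOGGED_OUT,
}


class UserFlow:
    """Tracks where a single user stands in the connection flow."""

    def __init__(self) -> None:
        self.state = State.LOGGED_OUT

    def call(self, function_name: str) -> State:
        key = (self.state, function_name)
        if key not in TRANSITIONS:
            raise ValueError(
                f"{function_name} is not allowed in state {self.state.name}"
            )
        self.state = TRANSITIONS[key]
        return self.state


flow = UserFlow()
for name in ["Login", "AddSession", "JoinSession", "JoinRoom"]:
    flow.call(name)
```

A client can use such a table to reject calls that make no sense in the current state, for example JoinRoom before JoinSession, without a round trip to the server.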
User data and message communication
This category includes simple but very powerful functions to handle user-specific data and the ways users can communicate.
First, there are functions to get information on connected users or users in a session.
The orchestrator then provides a set of functions to manage user data. User data holds a set of properties bound to each user (e.g. the URLs to access the user's audio / video / point cloud streams).
This data is stored and can be retrieved by the user after logging in. It can also be updated.
- GetUserData([userId])
- UpdateUserData(userDataKey, userDataValue)
- UpdateUserDataArray(userDataArray)
- UpdateUserDataJson(userDataJson)
- ClearUserData()
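As a rough illustration of what these functions manage, here is a minimal in-memory key/value store per user. The payload formats are assumptions: we treat the array variant as a list of (key, value) pairs and the JSON variant as a flat JSON object, and the keys and URLs shown are made up for the example.

```python
import json
from typing import Any, Dict, List, Tuple


class UserDataStore:
    """Hypothetical per-user property store mirroring the user data API."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, Any]] = {}

    def update(self, user_id: str, key: str, value: Any) -> None:
        self._data.setdefault(user_id, {})[key] = value

    def update_array(self, user_id: str, pairs: List[Tuple[str, Any]]) -> None:
        # Assumption: the array is a list of (key, value) pairs.
        for key, value in pairs:
            self.update(user_id, key, value)

    def update_json(self, user_id: str, payload: str) -> None:
        # Assumption: the JSON payload is a flat object of properties.
        values = json.loads(payload)
        if not isinstance(values, dict):
            raise ValueError("expected a JSON object")
        for key, value in values.items():
            self.update(user_id, key, value)

    def get(self, user_id: str) -> Dict[str, Any]:
        # Return a copy so callers cannot mutate the stored data.
        return dict(self._data.get(user_id, {}))

    def clear(self, user_id: str) -> None:
        self._data.pop(user_id, None)


store = UserDataStore()
store.update("u1", "audioUrl", "https://example.invalid/audio")
store.update_array("u1", [
    ("videoUrl", "https://example.invalid/video"),
    ("pointCloudUrl", "https://example.invalid/pc"),
])
store.update_json("u1", '{"audioUrl": "https://example.invalid/audio2"}')
snapshot = store.get("u1")
```

Later updates overwrite earlier values for the same key, so other users who read the data always see the most recent stream URLs.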
The person behind a user account can communicate with other human users by sending text messages to one specific user or to all users (“chat” functionality).
- SendMessageToAll(message)
- SendMessage(userId, message)
The orchestrator also includes several functions to dispatch scene events between users. The orchestrator has no direct knowledge of the scene event commands, or even of the event format itself. Its role is just to dispatch them, with a logic similar to what can be found in some game engines: within a session, one user can be declared as the master. The master is the one that makes decisions regarding a session. Users can send events to the master, which processes them and then dispatches the processed events to one user or to all users.
- SendSceneEventToMaster(sceneEventData)
- SendSceneEventToUser(userId, sceneEventData)
- SendSceneEventToAllUsers(sceneEventData)
- SendSceneEventToUserDirect(userId, sceneEventData)
- SendSceneEventToAllUsersDirect(sceneEventData)
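The master-based dispatch can be sketched as a tiny event bus that treats the payload as opaque. Several points here are assumptions made for the example: only the master may dispatch to users, a broadcast skips its sender, and the “Direct” variants are left out because their exact semantics are not described in this article.

```python
from typing import Any, Callable, Dict

Handler = Callable[[str, Any], None]  # (sender_id, event_data)


class SceneEventBus:
    """Hypothetical dispatcher; it never inspects the event payload."""

    def __init__(self, master_id: str) -> None:
        self.master_id = master_id
        self.handlers: Dict[str, Handler] = {}

    def connect(self, user_id: str, handler: Handler) -> None:
        self.handlers[user_id] = handler

    def send_to_master(self, sender_id: str, data: Any) -> None:
        self.handlers[self.master_id](sender_id, data)

    def _require_master(self, sender_id: str) -> None:
        # Assumption: only the master dispatches processed events.
        if sender_id != self.master_id:
            raise PermissionError("only the master can dispatch to users")

    def send_to_user(self, sender_id: str, user_id: str, data: Any) -> None:
        self._require_master(sender_id)
        self.handlers[user_id](sender_id, data)

    def send_to_all_users(self, sender_id: str, data: Any) -> None:
        self._require_master(sender_id)
        for user_id, handler in list(self.handlers.items()):
            if user_id != sender_id:  # assumption: the sender is skipped
                handler(sender_id, data)


received = {"alice": [], "bob": [], "carol": []}
bus = SceneEventBus(master_id="alice")


def master_handler(sender_id: str, data: Any) -> None:
    received["alice"].append((sender_id, data))
    # The master decides what to do; here it simply rebroadcasts the event.
    bus.send_to_all_users("alice", data)


bus.connect("alice", master_handler)
for name in ("bob", "carol"):
    bus.connect(name, lambda s, d, name=name: received[name].append((s, d)))

bus.send_to_master("bob", {"action": "open_door"})
```

Routing every event through the master gives a single place to resolve conflicts, at the cost of one extra hop compared with sending events straight to the other users.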
Pilot and monitor delivery components
VRTogether features an SFU (Stream Forwarding Unit) that duplicates streams to all the users of the media session:
The VRTogether experience also includes an external user (a fake user called “Live Presenter”) that is handled separately from the other users:
For now, the configuration of the SFU “pool” (the set of available SFU units) is static: ports and other information are declared in advance. The instantiation of SFU instances, however, is performed dynamically, following either a simple algorithm (such as one SFU per session) or a more complex one (such as round-robin) to match resources, in terms of VMs or servers, to the number of sessions and users. The module that pilots other components opens the way to exploring new challenges with regard to scalability.
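The two allocation strategies mentioned above can be compared with a short sketch. It is not the orchestrator's implementation: the slot structure, host names, ports and class names are all hypothetical, and real allocation would also have to start and stop the SFU processes.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Dict, List


@dataclass(frozen=True)
class SfuSlot:
    host: str
    port: int


class OnePerSessionAllocator:
    """Reserve a dedicated slot per session; fail when the pool is empty."""

    def __init__(self, pool: List[SfuSlot]) -> None:
        self._free = list(pool)
        self._assigned: Dict[str, SfuSlot] = {}

    def allocate(self, session_id: str) -> SfuSlot:
        if session_id not in self._assigned:
            if not self._free:
                raise RuntimeError("SFU pool exhausted")
            self._assigned[session_id] = self._free.pop(0)
        return self._assigned[session_id]

    def release(self, session_id: str) -> None:
        slot = self._assigned.pop(session_id, None)
        if slot is not None:
            self._free.append(slot)


class RoundRobinAllocator:
    """Share slots between sessions by walking the pool in a cycle."""

    def __init__(self, pool: List[SfuSlot]) -> None:
        if not pool:
            raise ValueError("the SFU pool must not be empty")
        self._slots = cycle(pool)
        self._assigned: Dict[str, SfuSlot] = {}

    def allocate(self, session_id: str) -> SfuSlot:
        if session_id not in self._assigned:
            self._assigned[session_id] = next(self._slots)
        return self._assigned[session_id]


# Static pool declaration (hypothetical hosts and ports).
pool = [SfuSlot("sfu.example.invalid", 9000), SfuSlot("sfu.example.invalid", 9001)]

round_robin = RoundRobinAllocator(pool)
rr_slots = [round_robin.allocate(s) for s in ("s1", "s2", "s3")]

dedicated = OnePerSessionAllocator(pool)
dedicated_slots = [dedicated.allocate(s) for s in ("s1", "s2")]
```

With a pool of two slots, round-robin places a third session on the first slot again, whereas the one-per-session strategy refuses it until a slot is released.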
Common time provider
The time API is rather simple because the role of the orchestrator is to forward messages to different components. The orchestrator has no responsibility for synchronizing objects. However, it provides this clock API for convenience.
Please note that the NTP clock was also evaluated against other clocks (PTP…) and integrated clock mechanisms (DVB-CSS). It was decided to keep the orchestration clock layer as thin as possible.
Logs and Analytics framework
Some logging functions have been added to collect logs from other modules such as the SFU or the Live Presenter. Log retrieval is not provided by a socket.io function but by a dedicated web server on another port, through a request such as http://<server_url>:8081
Access to these logs is particularly relevant for client developers, who can retrieve live logs from the media distribution modules to validate or debug their own client implementation.
Other specific logging functions are going to be added to provide analytics on specific parameters (synchronization, bandwidth, number of streams, etc.). Those functions will allow developers to improve and tune their own client implementation.
Media transmission backup layer
The orchestrator also integrates some specific backup functions to transmit any kind of stream from one user to other users. These functions do not include any synchronization signaling (or transcoding capabilities), since the layer forwards raw packets.
This layer allows a user to declare any number of typed streams (the type is a simple textual descriptor that should help the receiver decode the stream) that can be transmitted through the Orchestrator API.
Then, any other user within the same session can register to these streams and be notified of incoming data from them.
Typed streams declaration is made by the following API methods:
- DeclareDataStream(dataStreamKind, dataStreamDescription)
- RemoveDataStream(dataStreamKind)
- RemoveAllDataStreams()
Then, data can be pushed to the Orchestrator with the following API method:
Other users can be informed of available data streams with the following API method:
Then, they can manage registration to data streams by the following API methods:
- RegisterForDataStream(dataStreamUserId, dataStreamKind)
- UnregisterFromDataStream(dataStreamUserId, dataStreamKind)
- UnregisterFromAllDataStreams()
- GetRegisteredDataStreams()
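The declare / register / notify cycle can be sketched with a small in-memory hub. This is an illustration only: the class and method names are hypothetical, and in particular `forward` is a placeholder for the push call, whose actual API name is not given here. The payload is passed through as raw bytes, in line with the raw-packet forwarding described above.

```python
from typing import Callable, Dict, Tuple

StreamKey = Tuple[str, str]  # (owner user id, stream kind)
Callback = Callable[[str, str, bytes], None]


class DataStreamHub:
    """Hypothetical relay: forwards raw bytes to registered users."""

    def __init__(self) -> None:
        self._declared: Dict[StreamKey, str] = {}  # key -> description
        self._subscribers: Dict[StreamKey, Dict[str, Callback]] = {}

    def declare(self, owner_id: str, kind: str, description: str) -> None:
        key = (owner_id, kind)
        self._declared[key] = description
        self._subscribers.setdefault(key, {})

    def remove(self, owner_id: str, kind: str) -> None:
        key = (owner_id, kind)
        self._declared.pop(key, None)
        self._subscribers.pop(key, None)

    def register(self, user_id: str, owner_id: str, kind: str,
                 callback: Callback) -> None:
        key = (owner_id, kind)
        if key not in self._declared:
            raise KeyError(f"stream {key} has not been declared")
        self._subscribers[key][user_id] = callback

    def unregister(self, user_id: str, owner_id: str, kind: str) -> None:
        self._subscribers.get((owner_id, kind), {}).pop(user_id, None)

    def forward(self, owner_id: str, kind: str, payload: bytes) -> int:
        """Placeholder for the push call; returns the number of receivers."""
        callbacks = list(self._subscribers.get((owner_id, kind), {}).values())
        for callback in callbacks:
            callback(owner_id, kind, payload)
        return len(callbacks)


inbox = []
hub = DataStreamHub()
hub.declare("alice", "pointcloud", "hypothetical codec descriptor")
hub.register("bob", "alice", "pointcloud",
             lambda owner, kind, data: inbox.append((owner, kind, data)))

delivered = hub.forward("alice", "pointcloud", b"\x00\x01")
hub.unregister("bob", "alice", "pointcloud")
delivered_after = hub.forward("alice", "pointcloud", b"\x02")
```

The hub never decodes the payload: the stream kind and description are the only hints the receiver gets about how to interpret the bytes.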
For audio management, a simpler and more direct approach can also be used, with the following method:
When a user uses this method for pushing audio, any other user within the same session is notified with the audio data, with no registration needed.
Towards a generic orchestrator and conclusion
This series of articles was our in-depth overview of VRTogether orchestration challenges and how we overcame them. Orchestration is neither a new problem nor a solved one.
At the beginning of the project, we hoped to find an existing project that would provide us with a generic orchestration layer, or to be able to reuse one of the partners’ existing components. However, this proved to be a dead end for us. Thinking deeply about what our core business was allowed us to draft a quite extensive but really simple API.
Stay safe and looking forward to seeing you soon!
Author: Motion Spell
Come and follow us in this VR journey with i2CAT, CWI, TNO, CERTH, Artanim, Viaccess-Orca, TheMo and Motion Spell.
This project has been funded by the European Commission as part of the H2020 program, under the grant agreement 762111.