# IPC Design and Implementation {#ipc-design}

<!--
Copyright 2021-2022, Collabora, Ltd. and the Monado contributors
SPDX-License-Identifier: BSL-1.0
-->

[TOC]

- Last updated: 12-September-2022

When the service starts, an `xrt_instance` is created and selected, a native
compositor is initialized by a system compositor, a shared memory segment for
device data is initialized, and other internal state is set up. (See
`ipc_server_process.c`.)

There are three main communication needs:

- The client shared library needs to be able to **locate** a running service, if
  any, to start communication. (Auto-starting, where available, is handled by
  platform-specific mechanisms: the client currently has no code to explicitly
  start up the service.) This location mechanism must be able to establish or
  share the RPC channel and shared memory access, often by passing a socket,
  handle, or file descriptor.
- The client and service must share a dedicated channel for IPC calls (also
  known as **RPC** - remote procedure call), typically a socket. Importantly,
  the channel must be able to carry both data messages and native graphics
  buffer/sync handles (file descriptors, HANDLEs, AHardwareBuffers).
- The service must share device data, which updates at various rates, with all
  clients. This is typically done with a form of **shared memory**.

Each platform's implementation has a way of meeting each of these needs. The
specific way each need is met is highlighted below.

## Linux Platform Details

In a typical Linux environment, the Monado service can be launched in one of two
ways: manually, or by socket activation (e.g. from systemd). In either case,
there is a Unix domain socket with a well-known name (known at compile time, and
built into both the service executable and the client shared library) used by
clients to connect to the service: this provides the **locating** function.
This socket is polled in the service mainloop, using epoll, to detect any new
client connections.

Upon a client connection to this "locating" socket, the service will [accept][]
the connection, returning a file descriptor (FD), which is passed to
`start_client_listener_thread()` to start a thread specific to that client. The
FD produced this way is now also used for the IPC calls - the **RPC** function -
since it is specific to that client-server communication channel. One of the
first calls made transports a duplicate of the **shared memory** segment file
descriptor to the client, so it has (read) access to this data.

[accept]: https://man7.org/linux/man-pages/man2/accept.2.html

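The locate-then-accept flow on Linux can be sketched roughly as follows. This
is a minimal illustration, not the code in `ipc_server_process.c`: the helper
names (`listen_on`, `watch_with_epoll`, `accept_if_pending`, `connect_to`) and
the use of `SOCK_STREAM` are assumptions made for this example, and the socket
path is supplied by the caller rather than being Monado's real compiled-in name.

```c
#include <assert.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill in a Unix domain socket address for a filesystem path. */
static void
fill_addr(struct sockaddr_un *addr, const char *path)
{
    assert(strlen(path) < sizeof(addr->sun_path));
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);
}

/* Service: create the "locating" socket. Returns its FD, or -1. */
static int
listen_on(const char *path)
{
    struct sockaddr_un addr;
    fill_addr(&addr, path);
    unlink(path); /* Remove a stale socket file left by an earlier run. */
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0 || listen(fd, 8) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Service: register the listening socket with a new epoll instance. */
static int
watch_with_epoll(int listen_fd)
{
    int epoll_fd = epoll_create1(0);
    if (epoll_fd < 0) {
        return -1;
    }
    struct epoll_event ev;
    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLIN;
    ev.data.fd = listen_fd;
    if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, listen_fd, &ev) != 0) {
        close(epoll_fd);
        return -1;
    }
    return epoll_fd;
}

/* Service mainloop: non-blocking check. Returns a per-client FD, or -1. */
static int
accept_if_pending(int epoll_fd, int listen_fd)
{
    struct epoll_event ev;
    if (epoll_wait(epoll_fd, &ev, 1, 0) != 1 || ev.data.fd != listen_fd) {
        return -1;
    }
    return accept(listen_fd, NULL, NULL);
}

/* Client: connect to the well-known path. Returns the RPC FD, or -1. */
static int
connect_to(const char *path)
{
    struct sockaddr_un addr;
    fill_addr(&addr, path);
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

In this sketch the FD returned by `accept_if_pending()` plays the role of the
per-client RPC channel, while the FD returned by `connect_to()` is the client's
end of the same channel.
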
## Android Platform Details

On Android, to pass platform objects, allow for service activation, and
fit better within the idioms of the platform, Monado provides a Binder/AIDL
service instead of a named socket. (The named sockets we typically use are not
permitted by the platform; "abstract" named sockets are currently available,
but are not idiomatic for the platform and lack other useful capabilities.)
Specifically, we provide a [foreground and started][foreground] (to be able to
display), [bound][bound_service] [service][android_service] with an interface
defined using [AIDL][]. (See also
[this third-party guide about such AIDL services][AidlServices].) This is not
like the system services which provide hardware data or system framework data
from native code: this has a Java (JVM/Dalvik/ART) component provided by code in
an APK, exposed by properties in the package manifest.

[NdkBinder][] is not used because it is mainly suitable for the system type of
binder services. An APK-based service would still require some JVM code to
expose it, and since the AIDL service is used for so little, mixing languages
did not make sense.

The service we expose provides an implementation of our AIDL-described
interface, `org.freedesktop.monado.ipc.IMonado`. This can be modified freely, as
both the client and server are built at the same time and packaged in the same
APK, even though they get loaded in different processes.

[foreground]: https://developer.android.com/guide/components/foreground-services
[bound_service]: https://developer.android.com/guide/components/bound-services
[android_service]: https://developer.android.com/guide/components/services
[aidl]: https://developer.android.com/guide/components/aidl
[AidlServices]: https://devarea.com/android-services-and-aidl/
[NdkBinder]: https://developer.android.com/ndk/reference/group/ndk-binder

The first main purpose of this service is for automatic startup and the
**locating** function: helping establish communication between the client and
the service. The Android framework takes care of launching the service process
when the client requests to bind our service by name and package. The framework
also provides us with method calls when we're bound. In this way, the "entry point"
of the Monado service on Android is the
`org.freedesktop.monado.ipc.MonadoService` class, which exposes the
implementation of our AIDL interface, `org.freedesktop.monado.ipc.MonadoImpl`.

From there, the native-code mainloop starts when this service receives a valid
`Surface`. By default, the JVM code will signal the mainloop to shut down a short
time after the last client disconnects, to work best within the platform.

At startup, as on Linux, the shared memory segment is created. The [ashmem][]
API is used to create/destroy an anonymous **shared memory** segment on Android,
instead of standard POSIX shared memory, but the segment is otherwise treated
and used exactly the same as on standard Linux: file descriptors are duplicated
and passed through IPC calls, etc.

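The "create once, duplicate the FD, map on both sides" pattern can be sketched
as below. This is only an illustration under stated assumptions: an unlinked
temporary file stands in for the ashmem or POSIX shared memory segment (the
duplication and mapping steps are the same for any mappable FD), and
`shared_state_sketch`, `create_segment`, and `map_segment` are hypothetical
names, not Monado's.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical contents of the shared segment. */
struct shared_state_sketch
{
    uint32_t device_count;
    uint32_t generation;
};

/* Service: create a nameless, fixed-size segment. Returns its FD, or -1. */
static int
create_segment(size_t size)
{
    assert(size > 0);
    char path[] = "/tmp/shm_sketch_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }
    unlink(path); /* From here on, only FDs refer to the segment. */
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Either side: map the segment, read-only unless writable is non-zero. */
static void *
map_segment(int fd, size_t size, int writable)
{
    int prot = PROT_READ | (writable ? PROT_WRITE : 0);
    void *ptr = mmap(NULL, size, prot, MAP_SHARED, fd, 0);
    return ptr == MAP_FAILED ? NULL : ptr;
}
```
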
When the client side starts up, it creates an __anonymous socket pair__ to use
for IPC calls (the **RPC** function) later. It then passes one of the two file
descriptors into the AIDL method we defined named "connect". This transports the
FD to the service process, which uses it as the unique communication channel for
that client in its own thread. This replaces the socket pair produced by
connecting/accepting the named socket as used in standard Linux.

[ashmem]: https://developer.android.com/ndk/reference/group/memory

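A rough sketch of the anonymous socket pair idea follows. It runs both ends in
one process for illustration only: the actual hand-off of the second FD to the
service goes through the AIDL "connect" method and is not shown. The function
names, the one-byte request/reply exchange, and the choice of `SOCK_STREAM` are
assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Client: create the anonymous pair. fds[0] stays with the client,
 * fds[1] is the end that would be handed to the service.
 */
static int
make_rpc_pair(int fds[2])
{
    assert(fds != NULL);
    return socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
}

/* Client: send a one-byte request on its end of the pair. */
static int
send_request(int client_fd, unsigned char request)
{
    return write(client_fd, &request, 1) == 1 ? 0 : -1;
}

/* Service stand-in: read one request and reply with request + 1. */
static int
serve_one(int service_fd)
{
    unsigned char request;
    if (read(service_fd, &request, 1) != 1) {
        return -1;
    }
    unsigned char reply = (unsigned char)(request + 1);
    return write(service_fd, &reply, 1) == 1 ? 0 : -1;
}

/* Client: read the one-byte reply. Returns the value, or -1 on error. */
static int
read_reply(int client_fd)
{
    unsigned char reply;
    return read(client_fd, &reply, 1) == 1 ? (int)reply : -1;
}
```
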
The AIDL interface is also used for transporting some platform objects. At this
time, the only one transported in this way is the [Surface][] injected into the
client activity which is used for displaying rendered output. The Surface only
comes from the client when [Display over other apps][] is disabled.

The owner of the surface affects the service shutdown behavior. When the
surface comes from the injected window, it becomes invalid when the client
activity is destroyed. Therefore the runtime service must be shut down when the
client exits, because all the graphics resources are associated with that
surface. On the other hand, when the owner of the surface is the runtime
service, it can support multiple clients and client transitions without
shutting down.

[Surface]: https://developer.android.com/reference/android/view/Surface
[Display over other apps]: https://developer.android.com/reference/android/Manifest.permission#SYSTEM_ALERT_WINDOW

### Synchronization

Synchronization of new client connections is a special challenge on the Android
platform, since new clients arrive using calls into JVM code while the mainloop is
C/C++ code. Unlike Linux, we cannot simply use epoll to check if there are new
connections to our locating socket.

We have the following design goals/constraints:

- All we need to communicate is an integer (file descriptor) within a process.
- Make it fast in the server mainloop in the most common case that there are no
  new clients.
  - This suggests that we should be able to check if there may be a waiting
    client in purely native code, without JNI.
- Make it relatively fast in the server mainloop even when there is a client,
  since it's the compositor thread.
  - This might mean we want to do it all without JNI on the main thread.
- The client should know (and be unblocked) when the server has accepted its
  connection.
  - This suggests that the method called in `MonadoImpl` should block until the
    server consumes/accepts the connection.
  - Not 100% sure this is required, but maybe.
- Resources (file descriptors, etc) should not be leaked.
  - Each should have a well-known owner at each point in time.
- It is OK if only one new client is accepted per mainloop iteration.
  - The mainloop is high rate (compositor rate) and new client connections are
    relatively infrequent.

The IPC service creates a pipe as well as some state variables, two mutexes, and a
condition variable.

When the JVM Service code has a new client, it calls
`ipc_server_mainloop_add_fd()` to pass the FD in. It takes two mutexes, in
order: `ipc_server_mainloop::client_push_mutex` and
`ipc_server_mainloop::accept_mutex`. The purpose of
`ipc_server_mainloop::client_push_mutex` is to allow only one client into the
client-acceptance handshake at a time, so that no acknowledgement of client
accept is lost. Once those two mutexes are locked,
`ipc_server_mainloop_add_fd()` writes the FD number to the pipe. Then, it waits
on the condition variable (releasing `accept_mutex`) to see either that FD
number or the special "shutting down" sentinel value in the `last_accepted_fd`
variable. If it sees the FD number, that indicates that the other side of the
communication (the mainloop) has taken ownership of the FD and will handle
closing it. If it sees the sentinel value, or has an error at some point, it
assumes that ownership is retained and it should close the FD itself.

The other side of the communication works as follows: epoll is used to check if
there is new data waiting on the pipe. If so, the
`ipc_server_mainloop::accept_mutex` lock is taken, and an FD number is read from
the pipe. A client thread is launched for that FD, then the `last_accepted_fd`
variable is updated and the `ipc_server_mainloop::accept_cond` condition
variable signalled.

The initial plan required that the server also wait on
`ipc_server_mainloop::accept_cond` for the `last_accepted_fd` to be reset back
to `0` by the acknowledged client, thus preventing losing acknowledgements.
However, it is undesirable for the clients to be able to block the
compositor/server, so this wait was considered not acceptable. Instead, the
`ipc_server_mainloop::client_push_mutex` is used so that at most one
un-acknowledged client may have written to the pipe at any given time.

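The handshake described above can be sketched with plain pthreads as follows.
This is a simplified illustration and not the Monado implementation: the
`mainloop_sketch` struct, the `sketch_*` functions, the `SHUTTING_DOWN` value,
and the reset of `last_accepted_fd` to `0` by the pushing side are assumptions
for this example, and `poll()` is swapped in for epoll to keep it short.

```c
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

#define SHUTTING_DOWN (-666) /* Hypothetical sentinel value. */

struct mainloop_sketch
{
    int pipe_read;
    int pipe_write;
    pthread_mutex_t client_push_mutex;
    pthread_mutex_t accept_mutex;
    pthread_cond_t accept_cond;
    int last_accepted_fd; /* Guarded by accept_mutex. */
};

static int
sketch_init(struct mainloop_sketch *ml)
{
    int fds[2];
    if (pipe(fds) != 0) {
        return -1;
    }
    ml->pipe_read = fds[0];
    ml->pipe_write = fds[1];
    pthread_mutex_init(&ml->client_push_mutex, NULL);
    pthread_mutex_init(&ml->accept_mutex, NULL);
    pthread_cond_init(&ml->accept_cond, NULL);
    ml->last_accepted_fd = 0;
    return 0;
}

/* Pushing (JVM-called) side. Returns 0 if the mainloop now owns fd. */
static int
sketch_add_fd(struct mainloop_sketch *ml, int fd)
{
    assert(fd > 0);
    int ret = -1; /* -1: the caller still owns fd and must close it. */
    pthread_mutex_lock(&ml->client_push_mutex);
    pthread_mutex_lock(&ml->accept_mutex);
    if (write(ml->pipe_write, &fd, sizeof(fd)) == (ssize_t)sizeof(fd)) {
        /* Waiting releases accept_mutex so the mainloop can proceed. */
        while (ml->last_accepted_fd != fd && ml->last_accepted_fd != SHUTTING_DOWN) {
            pthread_cond_wait(&ml->accept_cond, &ml->accept_mutex);
        }
        if (ml->last_accepted_fd == fd) {
            ml->last_accepted_fd = 0; /* Acknowledge (an assumption here). */
            ret = 0;
        }
    }
    pthread_mutex_unlock(&ml->accept_mutex);
    pthread_mutex_unlock(&ml->client_push_mutex);
    return ret;
}

/* Mainloop side, once per iteration. Returns the accepted FD, or -1. */
static int
sketch_poll_once(struct mainloop_sketch *ml)
{
    struct pollfd pfd = {.fd = ml->pipe_read, .events = POLLIN, .revents = 0};
    if (poll(&pfd, 1, 0) != 1 || (pfd.revents & POLLIN) == 0) {
        return -1; /* Common case: nobody is waiting. */
    }
    int fd = -1;
    pthread_mutex_lock(&ml->accept_mutex);
    if (read(ml->pipe_read, &fd, sizeof(fd)) == (ssize_t)sizeof(fd)) {
        /* A real service would start the per-client thread here. */
        ml->last_accepted_fd = fd;
        pthread_cond_signal(&ml->accept_cond);
    } else {
        fd = -1;
    }
    pthread_mutex_unlock(&ml->accept_mutex);
    return fd;
}

/* Stand-in for the JVM thread that pushes a new client FD. */
struct push_args
{
    struct mainloop_sketch *ml;
    int fd;
    int result;
};

static void *
push_thread(void *ptr)
{
    struct push_args *args = ptr;
    args->result = sketch_add_fd(args->ml, args->fd);
    return NULL;
}
```

Because the pushing side holds `accept_mutex` from before the pipe write until
it enters `pthread_cond_wait()`, the mainloop cannot signal before the pusher
is waiting, and the mainloop itself never waits on the condition variable.
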
## A Note on Graphics IPC

The IPC mechanisms described previously are used solely for small data. Graphics
data communication between application/client and server is done through sharing
of buffers and synchronization primitives, without any copying or serialization
of buffers within a frame loop.

We use the system and graphics API provided mechanisms of sharing graphics
buffers and sync primitives, which all result in some cross-API-usable handle
type (generically processed as the types @ref xrt_graphics_buffer_handle_t and
@ref xrt_graphics_sync_handle_t). On all supported platforms, there exist ways
to share these handle types both within and between processes:

- Linux and Android can send these handles, uniformly represented as file
  descriptors, through a domain socket with a [SCM_RIGHTS][] message.
- It is anticipated that Windows will use DuplicateHandle and send handle
  numbers to achieve an equivalent result. ([reference][win32handles]) While
  recent versions of Windows have added `AF_UNIX` domain socket support,
  [`SCM_RIGHTS` is not supported][WinSCM_RIGHTS].

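For the Linux/Android case, passing a file descriptor over a Unix domain socket
with `SCM_RIGHTS` looks roughly like the sketch below. The `send_fd` and
`recv_fd` helpers are hypothetical names for this example, which sends a single
FD alongside one dummy data byte. It is not taken from the Monado sources.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized for exactly one FD. */
union fd_control
{
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align; /* Forces suitable alignment for buf. */
};

/* Send fd_to_send over sock. Returns 0 on success, -1 on failure. */
static int
send_fd(int sock, int fd_to_send)
{
    assert(sock >= 0 && fd_to_send >= 0);
    char byte = 0; /* At least one data byte must accompany the FD. */
    struct iovec iov = {.iov_base = &byte, .iov_len = 1};
    union fd_control ctrl;
    memset(&ctrl, 0, sizeof(ctrl));

    struct msghdr msg;
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one FD from sock. Returns the new local FD, or -1. */
static int
recv_fd(int sock)
{
    assert(sock >= 0);
    char byte = 0;
    struct iovec iov = {.iov_base = &byte, .iov_len = 1};
    union fd_control ctrl;
    memset(&ctrl, 0, sizeof(ctrl));

    struct msghdr msg;
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }
    int fd = -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```
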
The @ref xrt_compositor_native and @ref xrt_swapchain_native interfaces conceal
the compositor's own graphics API choice, interacting with a client compositor
solely through these generic handles. As such, even in single-process mode,
buffers and sync primitives are generally exported to handles and imported back
into another graphics API. (There is a small exception to this general statement
to allow in-process execution on a software Vulkan implementation for CI
purposes.)

Generally, when possible, we allocate buffers on the server side in Vulkan, and
import into the client compositor and API. On Android, to support application
quotas and limits on allocation, etc, the client side allocates the buffer using
a @ref xrt_image_native_allocator (aka XINA) and shares it to the server. When
using D3D11 or D3D12 on Windows, buffers are allocated by the client compositor
and imported into the native compositor, because Vulkan can import buffers from
D3D, but D3D cannot import buffers allocated by Vulkan. See @ref swapchains-ipc
for details.

[SCM_RIGHTS]: https://man7.org/linux/man-pages/man3/cmsg.3.html
[win32handles]: https://lackingrhoticity.blogspot.com/2015/05/passing-fds-handles-between-processes.html
[WinSCM_RIGHTS]: https://devblogs.microsoft.com/commandline/af_unix-comes-to-windows/#unsupportedunavailable

## 32-bit client support on 64-bit server

On a 64-bit system, the server process will typically be a 64-bit executable,
while clients may be 32-bit or 64-bit. Because IPC, whether through the socket
or shared memory, shares C structs directly, we must pay attention to memory
layout.

All data types must be fixed size, and alignments must be based on the 64-bit
targets. On Linux this mostly means `alignas(8)` must be used for `(u)int64_t`
(for example, the 32-bit x86 ABI otherwise aligns these to 4 bytes inside
structs).
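
As a small illustration of pinning a layout so that 32-bit and 64-bit builds
agree, the sketch below uses a hypothetical `example_msg` struct (not a real
Monado message) with C11 `alignas` and compile-time checks. The expected
offsets assume 4-byte `uint32_t` and 8-byte `uint64_t`, which the fixed-width
types guarantee.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message shared between a 32-bit client and a 64-bit server. */
struct example_msg
{
    uint32_t id;
    alignas(8) uint64_t timestamp_ns; /* Forced to 8-byte alignment everywhere. */
    uint32_t flags;
};

/* These checks fail the build if any target lays the struct out differently. */
static_assert(offsetof(struct example_msg, timestamp_ns) == 8, "timestamp_ns must sit at offset 8");
static_assert(offsetof(struct example_msg, flags) == 16, "flags must sit at offset 16");
static_assert(alignof(struct example_msg) == 8, "struct must be 8-byte aligned");
static_assert(sizeof(struct example_msg) == 24, "struct must be 24 bytes including tail padding");
```

Without the `alignas(8)`, a 32-bit x86 build would place `timestamp_ns` at
offset 4, so the same bytes would be interpreted differently by the two sides.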