
Experiment tracking

One purpose of the XAI Demonstrator is to serve as a platform for academic research and user testing. Thus, it includes the facilities to record data for later analysis.

As this is the setting the XAI Demonstrator was developed for, we assume that it is embedded into a web-based experimentation platform such as oTree. You will not necessarily have to spin up your own XAI Demonstrator instance, but you will need to deploy additional services along with your experiment platform to record and collect the data.

How do I record data?

To track the requests to and responses of any XAI Demonstrator use case, you can route the requests through an instance of the experiment-proxy service. This has the benefit that you do not need to make any changes to the use case or its configuration. See here for instructions on how to set up and configure an experiment-proxy instance. For many experiment settings, this is all you need.
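To illustrate why no changes to the use case are needed: from the experiment client's point of view, only the base URL changes. The sketch below assumes a JSON endpoint at `/predict`; all URLs and names are made up, so substitute your own deployment's addresses.

```python
import json
import urllib.request

# Hypothetical addresses -- substitute your own deployment's URLs.
USE_CASE_URL = "https://my-use-case.example.com"        # direct, untracked
PROXY_URL = "https://my-experiment-proxy.example.com"   # routed through experiment-proxy

def build_request(base_url: str, path: str, payload: dict) -> urllib.request.Request:
    """Build an identical request either way; only the base URL differs."""
    return urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# To record an interaction, send the unchanged request to the proxy
# instead of directly to the use case:
request = build_request(PROXY_URL, "/predict", {"input": "example"})
```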

If you would like to record additional data that is not part of the request or the response, you can use the record_data() function provided by the xaidemo utilities package that is a default dependency of all XAI Demonstrator use cases:

from xaidemo.tracking.record import record_data

def some_function_that_is_called_within_the_use_case(important_value):
    record_data(key="internal_state", value={"value": important_value})

You will find this data within the record under data[key] along with some metadata. Please see here for more detailed information.
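As a rough illustration (the layout and metadata fields shown below are placeholders, not the data-collector's actual schema), a call like the one above makes the recorded value retrievable under the chosen key:

```python
# Illustrative only -- the metadata shown is a placeholder, not the
# data-collector's actual schema.
record = {
    "data": {
        "internal_state": {          # the key passed to record_data()
            "value": {"value": 42},  # the value passed to record_data()
            # ...plus some metadata recorded alongside it...
        },
    },
}

# Retrieve the recorded value under data[key]:
recorded = record["data"]["internal_state"]["value"]
```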

Note that you must set EXPERIMENT=1 on the use case instance that you use for your experiment and instrument the FastAPI app to actually record any data:

from fastapi import FastAPI
from xaidemo import tracking

app = FastAPI(...)
tracking.instrument_app(app)

You also need to configure the COLLECTOR_URL environment variable. For further information, including a more detailed example, see here.
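For local experimentation, the two settings might be provided like this. The collector address is a placeholder; point it at wherever your own data-collector instance runs, and make sure both variables are set before the application starts.

```python
import os

# Placeholder values -- replace the address with your own
# data-collector instance.
os.environ["EXPERIMENT"] = "1"
os.environ["COLLECTOR_URL"] = "http://localhost:8000"
```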

How do I collect and access the recorded data?

The data-collector service receives the data and stores it as records in a database. It is not deployed along with the regular XAI Demonstrator deployments, so you will have to set up your own instance. For more information, see here.

All data that stems from the same original request will be stored as a single record with a unique ID. Usually, XAI Demonstrator backends return some kind of ID in their responses (e.g., a prediction_id or explanation_id) that can be used to later identify the associated record. That is probably the easiest option in most circumstances. Note that the IDs generated by the use cases are not the record ID itself. Instead, search for the use-case ID in the data collected by the experiment-proxy, which is stored under record["data"]["tracked"]. For more information, see here.
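A simple lookup might look like the following sketch. It assumes the records have been retrieved from the data collector as a list of dicts, each with an "id" field and the experiment-proxy data under record["data"]["tracked"]; adapt the lookup to your deployment's actual record layout.

```python
import json

def find_record_id(records: list, use_case_id: str):
    """Return the ID of the record whose proxied traffic mentions the
    given use-case ID (e.g., a prediction_id), or None if no record
    matches.

    Assumes each record is a dict with an "id" field and the
    experiment-proxy data under record["data"]["tracked"].
    """
    for record in records:
        tracked = record.get("data", {}).get("tracked", {})
        # Serialize the tracked data and search it for the use-case ID.
        if use_case_id in json.dumps(tracked):
            return record["id"]
    return None
```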

Alternative for advanced users

If you're familiar with OpenTelemetry, you can instrument your experiment code as well as the HTTP client you use to make requests to the XAI Demonstrator to pass along an OpenTelemetry context. The record ID is derived from the context information, and you can obtain it as follows from within the context:

from xaidemo.tracking.record import get_record_id

record_id = get_record_id()

This allows you to immediately know the record ID, even before making any calls. You can find an example in experiment-tracker/tests/integration/test_end_to_end.py.

Please note that it is not sufficient to start a new span: you need to initiate a new context for each request. If that sounds confusing, we suggest you use the method described above to find the record ID retrospectively, to avoid issues during data collection.

See it in action by running the integration tests

Spin up the local version with a dummy use case:

cd experiment-tracker
docker-compose --env-file .env.test up

Then, you can run the integration tests:

cd tests
./run.sh

experiment-tracker/docker-compose.yml shows how to set the environment variables for each of the involved services. The tests in experiment-tracker/tests/integration document common scenarios and show how data is recorded, transmitted, stored, and retrieved.

Is this reliable? Will it affect the performance of the use case?

In principle, recording the data is reliable. We have taken care to surface problems (such as attempts to store data that is not JSON-serializable) as early as possible, so that most issues will be discovered while setting up an experiment.

However, since the data is transferred to the collector only after the response has been returned and plenty of things can go wrong when transmitting data across a network, there is no guarantee that all data sent out is actually recorded.

For the same reason, recording data should have only a minimal performance impact: both the experiment-proxy and instrumented use cases send out data, which is by far the most time-consuming step, only after the response has been returned to the user. The overhead while servicing a request is kept to a minimum and should be negligible compared to the computations performed by XAI Demonstrator use cases.
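The trade-off described above, cheap recording during the request and fallible transmission afterwards, can be sketched in plain Python (this is an illustration of the pattern, not xaidemo's actual implementation):

```python
import queue
import threading

# Sketch of the "send after responding" pattern -- not the actual
# xaidemo implementation. Recording is a cheap in-memory put; the slow
# (and fallible) network transmission happens in the background.
outbox: queue.Queue = queue.Queue()
delivered = []

def record(data: dict) -> None:
    """Called while servicing a request: just enqueue, never block."""
    outbox.put(data)

def transmit(data: dict) -> None:
    """Stand-in for the HTTP POST to the data collector."""
    delivered.append(data)

def worker() -> None:
    while True:
        item = outbox.get()
        if item is None:  # sentinel: shut down the worker
            break
        try:
            transmit(item)
        except Exception:
            pass  # a failed transmission is dropped -- no delivery guarantee

thread = threading.Thread(target=worker, daemon=True)
thread.start()
record({"key": "internal_state"})  # cheap, happens during the request
outbox.put(None)                   # transmission happens afterwards
thread.join()
```

Because the response never waits on `transmit()`, a network failure there cannot slow down or break the use case, but it does mean the corresponding data is silently lost.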

How does it work?

The experiment tracking capabilities are provided by the xaidemo.tracking package in combination with the two services experiment-proxy and data-collector described above.

Both the proxy and the data collector are built on FastAPI (similar to the XAI Demonstrator use cases but with much less code) and are best understood by having a look at the source code.

xaidemo.tracking's instrument_app() adds a middleware to a FastAPI application that makes sure that all the data recorded during a request is sent to the data collector after the response has been sent.

Under the hood, the XAI Demonstrator's experiment tracking is built on OpenTelemetry.