For every game we expose a number of tables for both the DEV and the LIVE environments. These tables can be queried from within the Data Mining tool in the platform or via Direct Access using Vertica SQL. The following tables are available for querying:
When using a database client connected to Direct Access the tables will have either the _dev or _live suffix to indicate the environment. In Data Mining, the data you are querying will depend on the environment you are in.
The events table contains all the data received on an environment – collected and enriched. The events data is the source for almost all other data in the platform. This is the most granular level of data we have containing all data points with the respective timestamps and parameters as a single row.
There are some predefined events that contain array-like lists of arbitrary length that can not be represented as a single row. Therefore these are normalised into rows linked together by the eventID, mainEventID, parentEventID and eventLevel.
Retention: The events table only contains the data for the last 30 days unless a larger data retention window has been agreed upon. This data is also exported into the S3 archive. This will make sure that even after the retention window of your game you will still be able to restore the data at a later point in time. This is something we can do for you in our warehouse or you can restore this data somewhere else.
Update interval: As events come in they are enriched in batches and inserted into the events table every few hours.
Columns: There are some predefined columns that contain enriched data such as the last known value for acquisitionChannel, userCountry and start date in the respective gaAcquistionChannel, gaUserCountry and gaUserStartDate columns.
The eventTimestamp is either the eventTimestamp sent along as part of the event or, if omitted, the server time at the moment the event was received in UTC. The time we have received the event will be recorded in the collectInsertedTimestamp column. Data can come in up to 31 days after the event was recorded due to the event being stored before uploading on the device. This means that if you were to look today at data for yesterday you might see slightly different numbers for that date when looking at it tomorrow.
We use the convertedProductAmount column to store the revenue received in the minor currency unit of your default currency so it is comparable between countries. This is based on the exchange rate at the time of processing the event. When using receipt validation we will store the result in the revenueValidated column, 0 indicates we did not perform any validation, 1 indicates validation succeeded and the receipt was considered valid, 2 means validation completed but the receipt was considered invalid, 3 indicates we were unable to validate the receipt due to technical reasons. Generally, status 3 indicates an incorrect configuration.
When a new parameter is received we will add this as a column to the events table if it does not already exist.
The tables with a fact_ prefix are aggregated tables based on the events data designed to show the measure charts with minimal loading times. Since they contain useful aggregates for custom dashboards and custom queries as well we expose them for your custom queries as well.
Retention: The data in the fact_ tables is there to stay. It will only be deleted when the whole game is deleted.
Update interval: The fact_ tables are updated at the same interval as the events table.
Columns: The columns are fixed for all games in the platform and will always contain the general filter values that applied at the time the row was written (event_date, age_group, gender, user_country, acquisition_channel, player_start_date, platform and user_id)
This table takes the last known values for the platform, user_country, acquisition_channel at a given moment in time and counts up per user_id and event_name how many events have been received.
This table takes a calendar day and joins all the users that have been active in the last 30 days onto this day. This means that doing a count(distinct userId) will give you the MAU with the month as a moving window of 30 days.
This counts how often a user has started, completed, failed and abandoned. This is split out by mission name. The count comes from the number of missionStarted, missionCompleted, missionFailed and missionAbandoned events. These events are available as templates for all games.
This table contains the number of products bought split out by the standard filter parameters. The data comes from the productsReceived object in the transaction event.
This is by far the most useful fact_ table. Here we have the revenue in the smallest unit of the default currency, the number of events, missions activity and the transactions split out per session. Keep in mind that we will assume receiving events in chronological order, if we have received events in a different order you might see a new session created with the same sessionID as a sessionID if it is different from the most recent one. The revenue column represents all IAP revenue excluding the invalidated revenue using receipt validation. The ad_revenue column is the sum of the impressionValue from the adImpression event as filled in by AdVantage.
This shows exactly the same as the fact_mau_users with a 7-day window instead of a 30-day window.
The user_metrics table contains one row per user_id and contains a combination of pre-defined and custom user level aggregates.
The parameters aggregates that can be configured in the platform will be stored in this table. This means that if we receive an event, then we will process this and enrich the user record for that user. We will keep event counters such as the eventGameStartedCount and we will keep metrics for parameters such as fieldUserLevelMax.
Retention: The user_metrics do not get cleared at all and will only ever be deleted when the complete game is deleted. Or when there is a data deletion request issued for a user.
Update interval: The user_metrics are used in several places. We maintain a near real-time database where we can make decisions on metric segments using decision point campaigns. This database unfortunately can not be queried across multiple users at the same time so we maintain a copy in the analytics database as well. We make sure to sync the analytics database multiple times a day based on the real-time database.
This table contains the current state of the event segments. Metric segments are based on simple operators on the user level metrics whereas event segments are based on a possibly complicated query that runs against any of the tables mentioned previously.
Retention: The data is based on what users are currently in the events segments.
Update interval: The event segments are updated at a daily cadence, if the definition is changed we will update the users within the next 5 minutes.
Columns: We expose the ID and name of a target list along with the user_id that is selected during the last update of that event segment.
|Table||Update Cadence||Configurable (?)||Data Level||Retention|
|events||2-3 hours||Yes||Event-level||Subject to contract|
|user_metrics||Multiple times per day||Yes||User-level||Unlimited|
|user_target_lists||Daily or on update of target list||No||All userId’s currently in segment||Current state only|