You can connect directly to your data and run SQL queries on your events and users stored in the deltaDNA data warehouse. You should be able to connect using any Postgres ODBC compliant tools.
- Some tools will come with their own built in Postgres ODBC driver, others may require you to download a driver and set-up a connection (DSN) before you can use them. The applications below marked with a (*) will require you to do this. Check out our Postgres ODBC Connection Guide for details on how to do this.
- Connect to your data using your favoured database manager. The following links show how to connect to your data using the following tools.
- DBeaver : A Free Universal Database Manager
- OpenOffice: A Free suite of office tools.
- Microsoft Excel (*) : Microsoft’s flagship spreadsheet software.
- Tableau (*) : Data visualisation tools
- R : Statistical computing and graphics programming language.
- ChartIO : Cloud based data exploration and visualization tools.
There are three types of data that can be accessed, in each case you have separate tables for your DEV and LIVE environments.
- Raw Event Data. This is raw event data containing all the events and parameters for your most recent events as determined by the data retention policy on your game, typically 31 days.
Tables : events_dev and events_live
- User Metrics Data. The user metrics table contains a row of data for every player in your game, regardless of how long ago they last played. Numerous metrics that are automatically tracked and added to this table for each user but you can also define which of your event parameters are tracked as metrics in the Game Parameters Manager tool. Engage Campaigns and A/B Tests access a real-time version of this data when running player segmentation queries, the user metrics that you access through direct access are a daily snapshot of the last known state for every player, updated overnight every 24 hours.
Tables: user_metrics_dev and user_metrics_live
- Aggregated Measure Chart Data : The remaining tables, prefixed with fact contain the aggregated data that is used to drive many of the Measure Dashboard Charts. So, you can run queries on aggregated data covering the entire history of your game.
Tables: Six tables for each environment, prefixed with fact_ covering events, mau, wau, sessions, missions and products
You will get much more from your data if you are familiar with the events and parameters that you are sending and how it is stored.
- Your events can be viewed and edited in the Event Management Tool. It is recommended that you familiarise yourself with the events that your game is sending. It can also be useful to use the Event Browser QA Tool to watch events as they arrive.
- The event table structure is very simple, your events are stored in a wide table with a column for each parameter that you have configured in the Game Parameters tool and at least one row for each event. Complex events can span multiple rows, more on this later.. When you use the Data Mining Tool or download any of your daily event archive files you will be accessing data in this wide table format.
There are a few things worth mentioning about your events data.
- date ranges : Only the last 30 days of data, your Hot Data, is kept live in the data warehouse, unless you have an extended data retention policy. This can be extended on a per-game basis, please contact support if you wish to extend this for your game.
- empty columns : Because there is a column for every parameter that has been configured on your game, there will be many columns with no data on any given row as the parameter wasn’t relevant to the particular event.
- additional columns : You will spot some additional columns that the platform has automatically added, these will be pretty obvious and self explanatory, many are prefixed with ga
- eventLevel : The example below shows 9 rows, but there are actually only 5 events here. The transaction events that you see are actually just two events, each with two nested child objects that were sent in an array. Use the eventLevel and mainEventID parameters to understand the parent / child relationships on nested events.
- convertedProductAmount : When you add a game to the platform you nominate a currency that you want your dashboards to display your revenue in. The productAmount and productType columns will always contain the local currency that was spent by the player and the convertedProductAmount column will contain the value converted to your nominated currency at the time of purchase.
- revenueValidated : The revenueValidated flag shows the status of revenue validation requests with the App Store if you are using this deltaDNA feature. The values you will see are:
- 0 : not validation undertaken
- 1 : revenue valid
- 2 : revenue invalid
- 3 : error during validation
You can download the example events above if you want to take a look at some of the other columns / parmaters from this example game.
The SQL Cookbook section will grow to contain a wealth of useful queries, feel free to share your own creations with us.
In App Purchase Revenue in last 7 days, by Currency.
SELECT productType "Currency",
SUM(productAmount) AS "Revenue (Local cents)" ,
ROUND(SUM(convertedProductAmount) / 100 , 2) AS "Revenue (USD)"
WHERE eventName = 'transaction'
AND productCategory like 'REAL_CURRENCY'
AND eventDate BETWEEN CURRENT_DATE - 7 AND CURRENT_DATE -1
GROUP BY productType
ORDER BY SUM(productAmount) DESC ;
We have created an online repository with some queries that can be used to look at your data, these queries are focussed on the use in data mining rather than via direct access. This means the environment specification _dev and _live is missing from the tables used in data mining. The repository can be found here: https://github.com/deltadna/sql-cookbook
There are a few things you should take in to consideration when running queries against your data.
Be careful with SELECT *
- It is always far more efficient to specify which columns you are interested in a column store like Vertica rather than using SELECT * . If you specify the column names your queries will run much more quickly.
- Conversely, if you are COUNTING columns, you should always use a COUNT(*) as this will let Vertica choose the best column to do the count
- Your events table stores all events from the last 30 days by default, this can be extended, please contact [email protected] for more info.
- The events table contains a column for each parameter in your game parameters list as well as some automatically populated columns
- Events that contain arrays or nested objects will appear as multiple rows in you events table. The eventLevel (0 or 1) and mainEventID columns are used to link parent and child events together. An example of this would be a transaction event where the player spends real currency to buy virtual currency, this would result in three rows in the database, one with the parent transaction details and one for each of the currencies involved in the transaction.
- The users table contains a one row for each user in your game. The effective_date column holds a record of when the user metric was last updated. Therefore an entry with the daysPlayedInLast7Days column populated with the value 3 is telling us the number of days played in the 7 days preceding from the effective date.
- You cannot perform potentially destructive SQL commands on your data.
- The deltaDNA PostgreSQL Driver should be used for running ad hoc analytics queries and reports. It shouldn’t be used to download data extracts, your daily data archives stored on Amazon S3 are better suited to this. A limit of 10MB has been placed on SQL data transfers to prevent the SQL driver being mis-used. Your query will fail and return the following error message if it exceeds the transfer limit, “ResultSet has exceeded maximum query data transfer.”.
- Any SQL queries running for longer than 30 minutes will be automatically terminated. If you have a particular query that is running for too long, let us know, as we may be able to advise on optimisations to your SQL that will make it run more quickly. We do monitor the analytics database and scale the cluster to accommodate the data volumes and usage.
- Your data is stored in a Vertica Analytics database, you will find a reference manual for the SQL functions it supports in the Vertica Online Reference Manual
- You cannot use any of the following reserved words in your SQL queries.
all_tables, columns, comments, constraint_columns, databases, dual, elastic_cluster, epochs, fault_groups, foreign_keys, grants, license_audits, nodes, odbc_columns, passwords, primary_keys, profile_parameters, profiles, projection_checkpoint_epochs, projection_columns, projection_delete_concerns, projections, resource_pool_defaults, resource_pools, roles, schemata, sequences, storage_locations, system_columns, system_tables, table_constraints, tables, types, user_audits, user_functions, user_procedures, user_transforms, users, view_columns, views, active_events, column_storage, configuration_changes, configuration_parameters, cpu_usage, critical_hosts, critical_nodes, current_session, data_collector, database_backups, database_connections, database_snapshots, delete_vectors, deploy_status, design_status, disk_resource_rejections, disk_storage, error_messages, event_configurations, execution_engine_profiles, host_resources, io_usage, load_streams, lock_usage, locks, login_failures, memory_usage, monitoring_events, network_interfaces, network_usage, node_resources, node_states, partition_reorganize_errors, partition_status, partitions, process_signals, projection_recoveries, projection_refreshes, projection_storage, projection_usage, query_events, query_metrics, query_plan_profiles, query_profiles, query_requests, rebalance_projection_status, rebalance_table_status, recovery_status, resource_acquisitions, resource_pool_status, resource_queues, resource_rejection_details, resource_rejections, resource_usage, session_profiles, sessions, storage_containers, storage_policies, storage_tiers, storage_usage, strata, strata_structures, system, system_resource_usage, system_services, system_sessions, transactions, tuning_recommendations, tuple_mover_operations, udx_fenced_processes, user_libraries, user_library_manifest, user_sessions, wos_container_storage, current_database, current_schema, current_user, dbname, has_table_privilege, session_user