Q: What is Premium Data Access?
A: Premium Data Access is a scalable and flexible data management solution. PDA lets you access your data in deltaDNA’s managed Snowflake warehouse. With PDA, you’ll be able to query your data across your entire game portfolio, create custom views, and have more power over your data.
Q: How much does PDA cost?
A: Premium Data Access uses a consumption-based pricing model comprising 3 elements.
- Data Warehouse Size: The number of virtual warehouses you use and their performance level. There are 8 sizes to choose from, ranging from X-Small to 4X-Large.
- Cloud Service Usage: The length of time the warehouse is running queries.
- Data Storage Volume: The volume of data stored, which is the least significant of the 3 elements by a wide margin.
Your client partner will help you estimate and optimise your costs based on your usage requirements. You can modify your cluster configuration to adjust the cost / performance profile at any time.
Q: How do I sign up for PDA?
A: Please speak to your Unity Client Partner, who will be happy to help.
Q: Can PDA be applied to just one of my games?
A: Premium Data Access is an account-level feature, so it applies to all of your games. However, you have account administration rights to control the visibility of, and access to, individual aspects of the data for different users and roles.
Q: What information is required to sign up for PDA?
A: Your Unity client partner will need the following information to create your Snowflake Premium Data Access account.
- The name and email address of the account owner.
This is the individual that will be responsible for managing your Snowflake Premium Data account, including user access management. This individual may also be a deltaDNA account owner, but that is not essential.
- The name of the deltaDNA account containing the games and data.
Q: How long does it take to get set up for PDA?
A: It will take a few days to process and provision a new PDA account from the point at which we have received your details and finalised commercials.
Q: Where can I access my data using Premium Data Access?
A: You can access your data through the Snowflake Web Interface; a URL will be provided to you. Alternatively, you can connect to Snowflake from a variety of tools that support Snowflake data connectivity (Tableau, Looker, Redash, etc.).
Q: Can I create user accounts?
A: Yes, in fact, you will need to create and manage the accounts of your users. A PDA account will have a single Account Administrator assigned at the point of creation. This account is responsible for creating and managing the accounts of your users.
Snowflake Guide: “User Management”
Q: Can I set individual analyst roles and permissions?
A: Yes, account administrators can create and manage Snowflake users, roles and their permissions through SQL or the web interface.
Snowflake Guide: “User Management”
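As a rough illustration of the two answers above, an account administrator might set up a read-only analyst role and a user in SQL along the following lines. This is a sketch only: the role, user, database and schema names (ANALYST_READONLY, JSMITH, MY_DATABASE, MY_SCHEMA) are hypothetical placeholders, WH_DEFAULT is assumed to be a warehouse on your account, and the grants may need adjusting depending on how the deltaDNA-provided views are exposed to you.

```sql
-- Hypothetical names throughout; replace them with the objects on your own account.
CREATE ROLE IF NOT EXISTS ANALYST_READONLY;

-- Let the role run queries on a warehouse and read the views in one schema.
GRANT USAGE ON WAREHOUSE WH_DEFAULT TO ROLE ANALYST_READONLY;
GRANT USAGE ON DATABASE MY_DATABASE TO ROLE ANALYST_READONLY;
GRANT USAGE ON SCHEMA MY_DATABASE.MY_SCHEMA TO ROLE ANALYST_READONLY;
GRANT SELECT ON ALL VIEWS IN SCHEMA MY_DATABASE.MY_SCHEMA TO ROLE ANALYST_READONLY;

-- Create the user, force a password change at first login, and assign the role.
CREATE USER IF NOT EXISTS JSMITH
  PASSWORD = '<temporary-password>'
  DEFAULT_ROLE = ANALYST_READONLY
  DEFAULT_WAREHOUSE = WH_DEFAULT
  MUST_CHANGE_PASSWORD = TRUE;
GRANT ROLE ANALYST_READONLY TO USER JSMITH;
```

Granting privileges to a role rather than directly to each user keeps permissions manageable as the number of analysts grows.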
Q: Can I monitor cost?
A: Yes, the Account > Billing & Usage page in your Snowflake console provides a daily breakdown of the Snowflake credits used on each of your warehouses.
Snowflake Guide: “System Usage & Billing”
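The same daily breakdown can usually be obtained with a query, which is handy for feeding a dashboard. The sketch below assumes that your role is allowed to read Snowflake's SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY view; whether that access is available on a PDA account is not confirmed here.

```sql
-- Daily credits per warehouse over the last 30 days.
-- Assumes your role can read the SNOWFLAKE.ACCOUNT_USAGE views.
SELECT
    TO_DATE(START_TIME) AS USAGE_DATE,
    WAREHOUSE_NAME,
    SUM(CREDITS_USED)   AS CREDITS_USED
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE START_TIME >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1, 2
ORDER BY 1 DESC, 3 DESC;
```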
Q: Can I set up a threshold?
A: Yes, in Snowflake these are referred to as “Resource Monitors”. A virtual warehouse consumes Snowflake credits while it runs, so Snowflake provides resource monitors to help control costs and avoid unexpected credit usage.
Snowflake Guide: “Working With Resource Monitors”
Q: Can I set up alerts?
A: Yes. As with the previous answer, this is done with Resource Monitors. Their alerts (or Actions) can be set up to notify all account administrators who have notifications enabled and to suspend all assigned warehouses immediately, which cancels any statements the warehouses are executing at the time.
Snowflake Guide: “Working With Resource Monitors – Actions”
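The thresholds and actions described in the two answers above might be configured as in the following sketch. The monitor name MONTHLY_LIMIT, the quota of 100 credits and the percentage thresholds are illustrative assumptions, and WH_DEFAULT is assumed to be the warehouse you want to protect.

```sql
-- Hypothetical monitor name and quota; creating a monitor requires the ACCOUNTADMIN role.
CREATE OR REPLACE RESOURCE MONITOR MONTHLY_LIMIT
  WITH CREDIT_QUOTA = 100
       FREQUENCY = MONTHLY
       START_TIMESTAMP = IMMEDIATELY
  TRIGGERS
       ON 75 PERCENT DO NOTIFY              -- notify administrators only
       ON 100 PERCENT DO SUSPEND            -- suspend once running statements finish
       ON 110 PERCENT DO SUSPEND_IMMEDIATE; -- suspend now and cancel running statements

-- A monitor has no effect on a warehouse until it is assigned to it.
ALTER WAREHOUSE "WH_DEFAULT" SET RESOURCE_MONITOR = MONTHLY_LIMIT;
```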
Q: What warehouse size am I on?
A: The warehouse size can be viewed from the Warehouse tab in your Snowflake console. There is a column in the warehouse table that reads “Size”, which will display the sizes of all of your available warehouses.
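If you prefer SQL to the console, the same information appears in the size column of the SHOW WAREHOUSES output. In this sketch WH_DEFAULT is assumed to be the name of one of your warehouses.

```sql
-- List every warehouse visible to your current role, including its size and state.
SHOW WAREHOUSES;

-- Or restrict the listing to a single warehouse by name.
SHOW WAREHOUSES LIKE 'WH_DEFAULT';
```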
Q: I need support, can you access my account and help me?
A: Our support agents and client partners don’t have access to your PDA account. If you need us to access your account to provide assistance, you will need to create a user account, assign roles and permissions as required and send us the credentials. You will also need to revoke the access permissions when we have completed the task.
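One way to handle the temporary access described above is a short-lived user that is removed once the task is done. This is a sketch under assumptions: the user name DDNA_SUPPORT_TEMP and the role SUPPORT_READONLY are hypothetical, and the role is assumed to exist already with whatever permissions the task requires.

```sql
-- Hypothetical user and role names.
-- DAYS_TO_EXPIRY stops the user logging in after 7 days even if cleanup is forgotten.
CREATE USER IF NOT EXISTS DDNA_SUPPORT_TEMP
  PASSWORD = '<temporary-password>'
  DAYS_TO_EXPIRY = 7
  COMMENT = 'Temporary access for a support request';
GRANT ROLE SUPPORT_READONLY TO USER DDNA_SUPPORT_TEMP;

-- When the task is complete, revoke the access and remove the user.
REVOKE ROLE SUPPORT_READONLY FROM USER DDNA_SUPPORT_TEMP;
DROP USER IF EXISTS DDNA_SUPPORT_TEMP;
```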
Q: What type of knowledge is required to use PDA?
A: PDA provides you with your own managed data warehouse. At least one user on your account will need database administrator skills in order to create users, grant them access permissions and perform other more advanced tasks. However, the majority of users will just require SQL skills comparable to those used with the Direct Access and Data Mining tools.
Snowflake Documentation: “Snowflake Documentation”
Q: What is the data retention length?
A: There is a data retention limit of 13 months in PDA; only data within this range is accessible.
Q: How long after sending an event will my data be available in Premium Data Access?
A: Events will appear in PDA at the same cadence as they appear in the deltaDNA platform.
Q: Can we track events and parameters across multiple games?
A: Yes, you will see all the events and parameters from each of your games when running analysis on Snowflake PDA. There is a GAME_NAME and ENVIRONMENT_NAME field on each event.
Q: Can we track users across multiple games?
A: If you want to recognise the same player across multiple games you will need to ensure you populate a parameter on each of the games with a common identifier for the player. The ddnaCrossGameUserID parameter on the gameStarted event can be used for this purpose, or you can add something of your own. The deltaDNA SDKs generate an anonymous identity for the player that only persists for the duration of a single game install (the userID). The userID could be overwritten with this common identifier as well.
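As an illustration, once ddnaCrossGameUserID is populated on gameStarted, players seen in more than one game could be found with a query like the one below. It is a sketch that assumes the parameter is stored under exactly that key in EVENT_JSON and that ACCOUNT_EVENTS is reachable from your current database and schema.

```sql
-- Players whose common identifier appears in more than one game.
SELECT
    EVENT_JSON:ddnaCrossGameUserID::VARCHAR AS CROSS_GAME_USER_ID,
    COUNT(DISTINCT GAME_NAME)               AS GAMES_PLAYED,
    COUNT(DISTINCT USER_ID)                 AS INSTALL_USER_IDS
FROM ACCOUNT_EVENTS
WHERE EVENT_NAME = 'gameStarted'
  AND EVENT_JSON:ddnaCrossGameUserID::VARCHAR IS NOT NULL
GROUP BY 1
HAVING COUNT(DISTINCT GAME_NAME) > 1
ORDER BY 2 DESC;
```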
Q: Can we insert, update or delete data?
A: The standard tables provided by deltaDNA cannot be altered, but you can create your own tables and manipulate them.
Q: What data can we access in PDA?
A: You can access the events and user_metrics tables as well as some of the more commonly used Measure aggregate tables.
Q: How is the data structured in Snowflake PDA?
A: The table names and data structure are a little different in Snowflake PDA. The main differences are:
- Your table names are prefixed with ACCOUNT_ because these are account-level tables containing data from all your games, e.g. ACCOUNT_EVENTS, ACCOUNT_USERS, ACCOUNT_FACT_USER_SESSIONS_DAY.
- You can create your own views based on these to expose the data for just an individual game, e.g. MYGAME_EVENTS.
- To accommodate the different event parameter and user metric implementations across games, the event and metrics payloads are stored as JSON rather than being unpacked into individual columns. For example, the ACCOUNT_EVENTS table has the following columns:
- ACCOUNT_NAME VARCHAR
- GAME_NAME VARCHAR
- ENVIRONMENT_NAME VARCHAR
- LOAD_ID NUMBER
- EVENT_ID NUMBER
- EVENT_TIMESTAMP TIMESTAMP
- EVENT_NAME VARCHAR
- USER_ID VARCHAR
- EVENT_JSON VARIANT
Where EVENT_JSON might contain something like:
"collectInsertedTimestamp": "2021-02-11 06:35:57.299",
"eventDate": "2021-02-11 00:00:00.000",
"eventTimestamp": "2021-02-11 06:39:53.082",
"sdkVersion": "Fake SDK v1",
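And individual parameters can be accessed with a path expression such as EVENT_JSON:platform. Key names inside the JSON are case-sensitive, so they must match the parameter name exactly. For example:
SELECT EVENT_JSON:platform::VARCHAR AS PLATFORM, COUNT(*) AS EVENTS
FROM ACCOUNT_EVENTS
WHERE GAME_NAME = 'demo-game'
AND EVENT_TIMESTAMP > CURRENT_DATE - 1
GROUP BY 1
ORDER BY 2 DESC ;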
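A per-game view such as the MYGAME_EVENTS example mentioned above could be sketched as follows. It assumes you have a schema in which you are allowed to create views, reuses the 'demo-game' name from the query above, and unpacks two JSON keys (platform and sdkVersion) purely as examples.

```sql
-- A view limited to one game, with a couple of JSON parameters unpacked into columns.
CREATE OR REPLACE VIEW MYGAME_EVENTS AS
SELECT
    ENVIRONMENT_NAME,
    EVENT_ID,
    EVENT_TIMESTAMP,
    EVENT_NAME,
    USER_ID,
    EVENT_JSON:platform::VARCHAR   AS PLATFORM,
    EVENT_JSON:sdkVersion::VARCHAR AS SDK_VERSION,
    EVENT_JSON                     -- keep the full payload for ad hoc access
FROM ACCOUNT_EVENTS
WHERE GAME_NAME = 'demo-game';
```

Analysts can then query MYGAME_EVENTS without repeating the game filter or the JSON path syntax.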
Q: Do you have any usage or performance tips?
A: Yes, here are a couple of quick pointers.
The data we expose are views that sit on top of far larger data sets in order to enforce security. As such, there are joins going on under the hood, so the more you can do to help the query optimiser, the better the performance. For example, if you are looking at one game, filter on game_id rather than game_name, or on environment_id rather than game_name + environment_name, as this saves the compiler from considering access paths it doesn’t need. If you are looking across your games and just want to group on a human-readable field for presentation, the name can be used without a performance impact.
Most objects are append-only, so they have something like an inserted or loaded timestamp to track changes. The events themselves follow more of a streaming pattern, so they carry a load_id, which is essentially a sequence (it will have gaps in numbering, but it always increases). The only exceptions are account_games and account_users, which are updated in place; account_users has a last_updated_timestamp if you want to compute deltas on it.
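Because load_id only ever increases, it can serve as a high-water mark for incremental copies into a table of your own. The sketch below uses a hypothetical table name, MY_EVENTS_COPY, and only the ACCOUNT_EVENTS columns listed earlier. It also assumes that all rows sharing a load_id become visible at the same time, which is not confirmed here.

```sql
-- One-off: a table of your own to hold the copied events (hypothetical name).
CREATE TABLE IF NOT EXISTS MY_EVENTS_COPY (
    ACCOUNT_NAME     VARCHAR,
    GAME_NAME        VARCHAR,
    ENVIRONMENT_NAME VARCHAR,
    LOAD_ID          NUMBER,
    EVENT_ID         NUMBER,
    EVENT_TIMESTAMP  TIMESTAMP,
    EVENT_NAME       VARCHAR,
    USER_ID          VARCHAR,
    EVENT_JSON       VARIANT
);

-- Repeatable: append only the rows loaded since the previous run.
INSERT INTO MY_EVENTS_COPY
    (ACCOUNT_NAME, GAME_NAME, ENVIRONMENT_NAME, LOAD_ID, EVENT_ID,
     EVENT_TIMESTAMP, EVENT_NAME, USER_ID, EVENT_JSON)
SELECT ACCOUNT_NAME, GAME_NAME, ENVIRONMENT_NAME, LOAD_ID, EVENT_ID,
       EVENT_TIMESTAMP, EVENT_NAME, USER_ID, EVENT_JSON
FROM ACCOUNT_EVENTS
WHERE LOAD_ID > (SELECT COALESCE(MAX(LOAD_ID), 0) FROM MY_EVENTS_COPY);
```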
Q: What is “Time Travel” and how does it relate to data retention?
A: Time Travel enables the accessing of data that may have been changed or deleted in order to restore, duplicate or analyze it based on a previous state. Time Travel is not related to data retention.
Please note, if you take copies of user data to tables other than those provided by deltaDNA you will be responsible for maintaining GDPR compliance on copied data.
Snowflake Guide: “Understanding & Using Time Travel”
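For a table you have created yourself, Time Travel might be used as in the sketch below. The table names are hypothetical, and the statements only succeed if the requested point in time falls within the Time Travel window configured for that table.

```sql
-- Hypothetical table names; MY_TABLE is a table you created yourself.
-- Query the table as it was one hour ago (the offset is in seconds).
SELECT * FROM MY_TABLE AT(OFFSET => -60 * 60);

-- Recreate that earlier state as a separate table, leaving the current one untouched.
CREATE TABLE MY_TABLE_RESTORED CLONE MY_TABLE AT(OFFSET => -60 * 60);
```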
Q: What is the difference between a single and multi-cluster warehouse?
A: A multi-cluster warehouse scales to accommodate the demands of increased concurrency. For example, if an “analyst” warehouse is created and the type of queries run on it cause it to start queuing requests when it reaches 8 concurrent queries, users will need to wait longer for their queries to be processed. With the multi-cluster warehouse feature enabled, Snowflake will launch more warehouses of the same name to handle the level of concurrency required and reduce the waiting time. Without it, queries would remain queued until there are enough resources on the single warehouse to run them.
Snowflake Guide: “Multi-cluster Warehouses Improve Concurrency”
Q: What’s data masking?
A: Data masking is a security feature on Enterprise-level PDA accounts. It can be used to hide sensitive database columns from certain user groups.
Snowflake Guide: “Column-level Security”
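To give a feel for how masking works, here is a minimal sketch of a masking policy applied to a column in a table of your own. The policy, role, table and column names (MASK_EMAIL, PII_READER, MY_PLAYER_CONTACTS, EMAIL) are all hypothetical.

```sql
-- Hypothetical policy, role, table and column names.
-- Only sessions using the PII_READER role see the real value; everyone else sees a mask.
CREATE OR REPLACE MASKING POLICY MASK_EMAIL AS (VAL VARCHAR) RETURNS VARCHAR ->
  CASE
    WHEN CURRENT_ROLE() IN ('PII_READER') THEN VAL
    ELSE '*** MASKED ***'
  END;

-- Attach the policy to the sensitive column.
ALTER TABLE MY_PLAYER_CONTACTS MODIFY COLUMN EMAIL SET MASKING POLICY MASK_EMAIL;
```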
Q: What warehouse size should I be on?
A: It depends on the workload:
- Data loading: The warehouse size should match the number of files being loaded and the amount of data in each file. For more details, see Data Loading Considerations.
- Queries in small-scale testing environments: smaller warehouse sizes (X-Small, Small, Medium) may be sufficient.
- Queries in large-scale production environments: larger warehouse sizes (Large, X-Large, 2X-Large, etc.) may be more cost-effective.
Snowflake Guide: “Warehouse Considerations”
|Total Monthly Event Count across account||Estimated Required Snowflake Warehouse Size|
Q: Why does my first query take longer to run?
A: You have dedicated warehouses that spin up on demand to serve your queries and suspend when idle to save costs. This is why the first query of a session takes longer to run: the warehouse has to resume before it can execute anything. You can control the size of your warehouse and the Auto Suspend duration from the Warehouses page on your Snowflake Console.
If you are only using PDA intermittently, you can shorten the Auto Suspend duration from the Warehouses page of the Snowflake Console or with a query. (Note: to go below 5 minutes you will need to use a query.)
E.g. to set Auto Suspend to 59 seconds (note that this statement also sets the warehouse size and cluster counts, so adjust those values to match your own configuration):
ALTER WAREHOUSE "WH_DEFAULT"
SET WAREHOUSE_SIZE = 'XSMALL' AUTO_SUSPEND = 59 AUTO_RESUME = TRUE MIN_CLUSTER_COUNT = 1 MAX_CLUSTER_COUNT = 2 SCALING_POLICY = 'STANDARD' COMMENT = '';