The Data Mining tool lets you write your own SQL queries against your event data, user metrics or aggregated measure data, then visualize the results and create custom dashboards. This lets you build more complex queries than are possible in Slice & Dice, without needing to connect a third-party SQL tool through Direct Access. It is suited to running quick ad hoc queries against your data.
Event data is held in a single wide table that is optimised for extremely quick queries. Each event is allocated a single row in the table, but complex events such as transaction, which can contain arrays of items and currencies, are allocated multiple rows. The event table is updated with new events at regular intervals as they arrive. At the time of writing it is updated every 1-2 hours.
Columns in the database are uniquely named based on the event structure. As you extend or update your events, the event table gains additional columns for any new parameters added.
Each event has a unique eventID that identifies it. The mainEventID, parentEventID and eventLevel parameters can be useful when querying complex events that contain arrays of product items and currencies.
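As a rough sketch of how these parameters can be used, the example below totals the child rows of each transaction. It runs against an in-memory SQLite table so that it is self-contained; only the SQL string would be pasted into the Query panel. The table name events, the productAmount column, and the way child rows point at their top-level row through mainEventID are assumptions for illustration, so check the legend for your game's actual columns.

```python
import sqlite3

# SQLite stands in for the Vertica event table here. The table name `events`,
# the `productAmount` column and the linking semantics are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        eventID INTEGER, mainEventID INTEGER, parentEventID INTEGER,
        eventLevel INTEGER, eventName TEXT, productAmount INTEGER
    )
""")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)",
    [
        (100, 100, None, 0, "transaction", None),  # top-level row (assumed eventLevel 0)
        (101, 100, 100, 1, "transaction", 5),      # child row for one item
        (102, 100, 100, 1, "transaction", 3),      # child row for another item
        (200, 200, None, 0, "transaction", None),
        (201, 200, 200, 1, "transaction", 10),
    ],
)

# Group the child rows (eventLevel > 0) by the transaction they belong to.
query = """
    SELECT mainEventID,
           COUNT(*)           AS child_rows,
           SUM(productAmount) AS total_amount
    FROM events
    WHERE eventName = 'transaction'
      AND eventLevel > 0
    GROUP BY mainEventID
    ORDER BY mainEventID
"""
transaction_totals = conn.execute(query).fetchall()
print(transaction_totals)
```

Filtering on eventLevel keeps the top-level row out of the aggregate, so each transaction is counted only through its item rows.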
User Metrics data is also queryable in the Data Mining tool. The user metrics table contains a single row of data for each player in your game. The columns in the table represent each user metric field. There are three types of user metric that you can access with the Data Mining tool:
- Generic Metrics. These are user metrics that are calculated automatically for every game, e.g. totalSessionsPlayed
- Event Count Metrics. There will be an event count metric for each event type that your game sends, e.g. eventLevelUpCount
- Parameter Metrics. There can be multiple metrics for any of the parameters that you send in your events. Some of these will be enabled by default, but you will need to check the relevant metrics checkboxes in your Game Parameters management screen if you wish a specific parameter to be used as a metric, or if you add a new parameter to your game. The metric options available depend on the parameter type.
  - String-based parameters can record the First and Last value of a parameter. This is useful for recording the first item a player purchases or the last mission they play, e.g. fieldMissionNameLast
  - Numeric parameters can record the First and Last values, but they can also record Min, Max, Count and Sum aggregates, so you can track things like a player's maximum gold balance, e.g. fieldUserXPMax
The user metrics table is updated once per day. It contains a snapshot of the most recent player metrics for every player in your game, regardless of how long ago they last played.
Please note, the Engage platform does not use these user metrics snapshots when evaluating player segmentation criteria for in-app A/B tests and campaigns. It uses a version of user metrics that is accurate up to the second for all players currently playing the game.
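To illustrate how the three metric types appear as columns, here is a minimal sketch that queries one metric of each kind for players with at least ten sessions. The table name user_metrics and the userID column are assumptions, the metric columns are the examples named above, and SQLite is used only so the example runs on its own.

```python
import sqlite3

# SQLite stands in for the Vertica database. The table name `user_metrics`
# and the `userID` column are assumptions; the sample values are made up.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_metrics (
        userID TEXT,
        totalSessionsPlayed INTEGER,   -- generic metric
        eventLevelUpCount INTEGER,     -- event count metric
        fieldMissionNameLast TEXT,     -- string parameter metric (Last)
        fieldUserXPMax INTEGER         -- numeric parameter metric (Max)
    )
""")
conn.executemany(
    "INSERT INTO user_metrics VALUES (?, ?, ?, ?, ?)",
    [
        ("u1", 12, 4, "Mission 3", 900),
        ("u2", 2, 0, "Mission 1", 50),
        ("u3", 30, 9, "Mission 7", 4200),
    ],
)

# One row per player, so no aggregation is needed to inspect individual players.
query = """
    SELECT userID, totalSessionsPlayed, eventLevelUpCount,
           fieldMissionNameLast, fieldUserXPMax
    FROM user_metrics
    WHERE totalSessionsPlayed >= 10
    ORDER BY fieldUserXPMax DESC
"""
engaged_players = conn.execute(query).fetchall()
print(engaged_players)
```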
The Aggregated Measure Data that drives your Measure Charts is also queryable. The aggregated data is held in 6 fact tables containing daily, user-level summaries for your entire game history.
- fact_user_sessions_day: contains aggregates for each player session.
- fact_wau_users: contains weekly user counts.
- fact_mau_users: contains monthly user counts.
- fact_mission_users_day: contains details of missions played, completed and failed for each player, each day.
- fact_product_users_day: contains product purchase information for each player, each day.
- fact_event_type_users_day: contains counts of event types recorded for each player, each day.
Thankfully you don’t need to memorise all your event parameter or user metric field names. A legend at the right-hand side of the screen lists your event and user metric columns and their types. This legend can be minimised by clicking the hide icon.
The toolbar at the top of the Data Mining page lets you give your queries a descriptive name and save them. You can also load previously saved queries, run them and export their results to a CSV file.
Building a Query
Type or paste your SQL into the Query panel. The Vertica Analytics database that your data mining SQL queries run on supports some excellent analytics functions. Please refer to the Vertica SQL reference manual for further information.
Please note, the Data Mining tool is intended for running quick ad hoc queries. It should not be used to download large data exports. If you need to download large exports, please use your S3 data archive. Query timeouts and limits on the number of rows returned are in place to prevent the Data Mining tool being used for large exports.
Be careful with SELECT *
In a column store like Vertica, it is always far more efficient to specify the columns you are interested in than to use SELECT *. If you specify the column names, your queries will run much more quickly.
Conversely, if you are counting rows, you should always use COUNT(*), as this lets Vertica choose the best column to do the count.
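The two habits above can be sketched as follows. SQLite is a row store, so this example only shows the shape of the queries and not the speed difference you would see in Vertica. The table name events and the columns userID, eventName and eventDate are assumptions for illustration.

```python
import sqlite3

# SQLite stands in for Vertica; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (eventID INTEGER, userID TEXT, eventName TEXT, eventDate TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        (1, "u1", "gameStarted", "2024-01-01"),
        (2, "u1", "levelUp", "2024-01-01"),
        (3, "u2", "levelUp", "2024-01-02"),
    ],
)

# Name only the columns you need instead of SELECT *.
named_columns = conn.execute("""
    SELECT userID, eventName
    FROM events
    WHERE eventDate = '2024-01-01'
    ORDER BY eventID
""").fetchall()

# When counting rows, use COUNT(*) rather than COUNT(some_column).
row_count = conn.execute("""
    SELECT COUNT(*)
    FROM events
    WHERE eventName = 'levelUp'
""").fetchone()[0]

print(named_columns, row_count)
```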
Visualizing the results
By default your query will return a table of results, but you can also build visualizations of your queries and use them to populate custom dashboards.
You can toggle between Results table and Visualization views by clicking the Results or Visualization tabs.
- The Setup button inside the Visualization tab is used to choose the visualization type, define labels for the X and Y axes of your chart and select which data column is displayed on the X-Axis. The scale of the Y-Axis and whether it is stacked are also chosen here.
- The Add Plot button is used to define each series of data to be displayed on your chart. To display multiple stacked series, click the Add Plot button once per series, then define the values for each series, its type (Bar or Line) and other display options.
- Click the Draw button to re-draw your visualization using your query results and visualization settings.
The visualization below shows an area chart containing the number of unique players per day.
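A query producing data for a chart of this kind might look like the sketch below, with the date column on the X-Axis and the player count as the plotted series. The table name events and the columns eventDate and userID are assumptions, and SQLite is used only to make the example runnable on its own.

```python
import sqlite3

# SQLite stands in for Vertica; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (userID TEXT, eventDate TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("u1", "2024-01-01"),
        ("u1", "2024-01-01"),  # same player, second event on the same day
        ("u2", "2024-01-01"),
        ("u1", "2024-01-02"),
    ],
)

# COUNT(DISTINCT ...) counts each player once per day, however many events they sent.
query = """
    SELECT eventDate,
           COUNT(DISTINCT userID) AS unique_players
    FROM events
    GROUP BY eventDate
    ORDER BY eventDate
"""
players_per_day = conn.execute(query).fetchall()
print(players_per_day)
```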
Pivoting your data
You can even summarize your data using the Pivot Table feature on the query results page.
Click the Pivot Table button then select the Row, Column and Data Fields you wish to pivot on.
You can then build Stacked Chart visualizations by clicking the Stacked checkbox on the Chart Setup page, then adding additional plots for each of the stacked groups or choosing the ALL COLUMNS option in the first plot.
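To make the pivot step concrete, the sketch below runs a query that returns one row per date and event name, then reshapes it in Python in roughly the way a pivot with eventDate as the Row field, eventName as the Column field and the count as the Data field would. The Python reshaping is only an illustration of the idea, not how the Pivot Table feature is implemented, and the table and column names are assumptions.

```python
import sqlite3

# SQLite stands in for Vertica; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (eventDate TEXT, eventName TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("2024-01-01", "gameStarted"),
        ("2024-01-01", "gameStarted"),
        ("2024-01-01", "levelUp"),
        ("2024-01-02", "gameStarted"),
    ],
)

# Long-form result: one row per (date, event name) pair.
long_rows = conn.execute("""
    SELECT eventDate, eventName, COUNT(*) AS event_count
    FROM events
    GROUP BY eventDate, eventName
    ORDER BY eventDate, eventName
""").fetchall()

# Pivot: one entry per date, one key per event name, counts as values.
pivot = {}
for event_date, event_name, event_count in long_rows:
    pivot.setdefault(event_date, {})[event_name] = event_count

print(pivot)
```

Each key in the inner dictionaries corresponds to one stacked group in the resulting chart.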
After saving your Data Mining query you can finally add it to a Custom Dashboard by dragging your query into one of the panels on the custom dashboard editor. Then you can choose whether the query will be displayed as a table or chart by selecting the appropriate option.