Data Flow
Last Updated: 2021-06-14
The Grouparoo Application's main job is to collect data from Sources, store that data to build robust Records, build Groups of Records, and finally syndicate this data to Destinations, your other marketing tools.
This document is the central place to describe how all of this works.
Core Data Models
Records and Record Properties
The main object in Grouparoo is a Record. A Record by itself is simply an id and the time it was created and last updated. It is a "join table" to collect many Record Properties, which act as a key-value store for the data about Records. For example, a common Record Property would be key='email', value='evan@grouparoo.com', type='string'. We also store when each Record Property was created and updated.
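To make the "join table" idea concrete, here is a minimal sketch in Python. It is not Grouparoo's actual model code; the class names, field names, and the `set_property` helper are assumptions chosen to mirror the description above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict


def now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class RecordProperty:
    # One key-value pair about a Record (hypothetical shape).
    key: str
    value: str
    type: str
    created_at: datetime = field(default_factory=now)
    updated_at: datetime = field(default_factory=now)


@dataclass
class Record:
    # The Record itself only knows its id and timestamps;
    # everything else hangs off it as Record Properties.
    id: str
    created_at: datetime = field(default_factory=now)
    updated_at: datetime = field(default_factory=now)
    properties: Dict[str, RecordProperty] = field(default_factory=dict)

    def set_property(self, key: str, value: str, type: str) -> None:
        existing = self.properties.get(key)
        if existing is None:
            self.properties[key] = RecordProperty(key, value, type)
        else:
            existing.value = value
            existing.type = type
            existing.updated_at = now()
        self.updated_at = now()


record = Record(id="rec_1")
record.set_property("email", "evan@grouparoo.com", "string")
```

The Record carries no business data of its own, which is why adding a new kind of Record Property never requires changing the Record.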
Groups
Groups are a collection of Records.
There are 2 types of Groups: "Manual" and "Calculated".
- Manual Groups contain Records that a Grouparoo user has added to the Group. The membership of Manual Groups does not change.
- Calculated Groups are constantly recalculated against Record Properties. These dynamic lists of Records are how you build up cohorts that change over time. You may have a Group of "Abandoned Carts Last Week" that checks a Record Property of "last_abandoned_cart_date" and makes sure it was within the last 7 days. You might also choose to segment your customers by the language they speak with a "Locale" Record Property.
A Record can be in many Groups at once.
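As an illustration of how a Calculated Group could be recalculated, here is a small Python sketch. The rule functions, the property keys, and the assumption that dates are stored as ISO-8601 strings are all hypothetical; Grouparoo's real rule engine is not shown in this document.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List

Properties = Dict[str, str]
Rule = Callable[[Properties, datetime], bool]


def abandoned_cart_last_week(props: Properties, now: datetime) -> bool:
    # Assumes the date Record Property is stored as an ISO-8601 string.
    raw = props.get("last_abandoned_cart_date")
    if raw is None:
        return False
    age = now - datetime.fromisoformat(raw)
    return timedelta(0) <= age <= timedelta(days=7)


def speaks(locale: str) -> Rule:
    def rule(props: Properties, now: datetime) -> bool:
        return props.get("locale") == locale
    return rule


def calculate_members(records: Dict[str, Properties], rule: Rule, now: datetime) -> List[str]:
    # Re-running this whenever Record Properties change keeps the Group current.
    return sorted(rid for rid, props in records.items() if rule(props, now))


now = datetime(2021, 6, 14, 12, 0, 0)
records = {
    "rec_1": {"last_abandoned_cart_date": "2021-06-10T09:00:00", "locale": "en-US"},
    "rec_2": {"last_abandoned_cart_date": "2021-05-01T09:00:00", "locale": "en-US"},
    "rec_3": {"locale": "pt-BR"},
}
abandoned = calculate_members(records, abandoned_cart_last_week, now)
english = calculate_members(records, speaks("en-US"), now)
```

In this sample data, "rec_1" ends up in both Groups, while "rec_2" drops out of the abandoned-cart Group because its date is more than 7 days old.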
Apps
Apps are how you define connections to your tools like your databases, email vendors, CRMs, etc. With each App you tell Grouparoo how to connect so we can import and export your data.
Properties
Properties describe each Record Property and how it works. It's a bit like a "Schema" for the Records:
- What Record Properties exist? You cannot just add any Record Property to a Record. You need to define it first.
- Is this Record Property unique? If it is, Grouparoo will make sure that no other Record can have the same value for that Record Property. Common examples would be email or user_id.
- What is the type of this Record Property? A number, a string? We actually stringify all of our Record Property data for searching, but when we render it back out again, we convert it. This also helps us know how to build search queries for Calculated Groups.
- How is this Record Property to be retrieved? By which App? We talk more about this below.
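The questions above can be sketched as a tiny schema check in Python. This is an illustration under assumptions, not Grouparoo's implementation: the `Property` class, the three type names, and the `taken` lookup of values already used by other Records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class Property:
    key: str
    type: str     # "string", "number" or "boolean" in this sketch
    unique: bool
    app: str      # which App retrieves the value


def to_stored(value) -> str:
    # Everything is stringified for storage and searching.
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def from_stored(prop: Property, raw: str):
    # Convert back to the declared type when rendering.
    if prop.type == "number":
        return float(raw)
    if prop.type == "boolean":
        return raw == "true"
    return raw


def check_assignment(schema: Dict[str, Property], taken: Dict[str, Set[str]], key: str, value) -> str:
    prop = schema.get(key)
    if prop is None:
        raise ValueError(f"no Property is defined for {key!r}")
    stored = to_stored(value)
    if prop.unique and stored in taken.get(key, set()):
        raise ValueError(f"{key!r} must be unique, but {stored!r} is already in use")
    return stored


schema = {
    "email": Property("email", "string", True, "MySQL-Datawarehouse"),
    "ltv": Property("ltv", "number", False, "MySQL-Datawarehouse"),
}
taken = {"email": {"evan@grouparoo.com"}}  # values held by other Records

stored_ltv = check_assignment(schema, taken, "ltv", 12.5)
ltv = from_stored(schema["ltv"], stored_ltv)
```

Assigning an undefined key, or reusing "evan@grouparoo.com" as an email, raises a `ValueError` in this sketch.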
Every Record Property also includes a reference to an App (perhaps "MySQL-Datawarehouse") and a statement/query/option of how to get it for each Record. Grouparoo will always be keeping your data in sync so we always need to know how to retrieve a Record's Record Properties.
- For an App "MySQL-Datawarehouse" which is linked to the Property for "First Name (string)" you might provide a SQL statement like: select first_name from users where users.id={{ user_id }}
- You can also have more complex queries. For example, you could connect the App "MySQL-Datawarehouse" to the Property for "ltv (number)", which might run: select (sum(purchases.total) - sum(refunds.total)) as ltv from users join purchases on purchases.user_id = users.id join refunds on refunds.user_id = users.id where users.id={{ user_id }}
- For a CSV App responsible for "VIP (boolean)" you might choose "column" from a dropdown, indicating that any column labeled "VIP" should be allowed to set this Record Property.
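To show how a templated statement such as the first-name query could be filled in for one Record, here is a hedged Python sketch. It is an assumption about one reasonable approach, not a description of how Grouparoo's MySQL App works: the `to_parameterized` helper is hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import re
import sqlite3

# Matches placeholders such as {{ user_id }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def to_parameterized(template: str, record_properties: dict):
    # Replace each placeholder with "?" and collect the values in order,
    # so values are bound by the driver rather than pasted into the SQL.
    params = []

    def substitute(match):
        params.append(record_properties[match.group(1)])
        return "?"

    return PLACEHOLDER.sub(substitute, template), params


db = sqlite3.connect(":memory:")
db.execute("create table users (id integer primary key, first_name text)")
db.execute("insert into users (id, first_name) values (3, 'Evan')")

template = "select first_name from users where users.id={{ user_id }}"
sql, params = to_parameterized(template, {"user_id": 3})
row = db.execute(sql, params).fetchone()
```

Binding the value as a parameter instead of interpolating it into the string is a design choice of this sketch; it keeps a Record Property value from being interpreted as SQL.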
Data into Grouparoo
Sources, Schedules, and Runs
When you add an App to Grouparoo, you can also configure whether you want Grouparoo to periodically check that App for new data. This might be accomplished by checking a database table for new/updated rows or asking an API for recent changes. This is called a Schedule. An App can have many Sources, often corresponding to a logical collection in the App, e.g. tables in a database. You can create many Schedules from a single App, one against each Source.
Perhaps the App "MySQL-Datawarehouse" has 4 Schedules:
- "MySQL-Datawarehouse:users". This Schedule checks the "users" table every 10 minutes for new or updated rows.
- "MySQL-Datawarehouse:carts". This Schedule checks the "carts" table every hour for only new carts.
- "MySQL-Datawarehouse:purchase". This Schedule checks the "purchase" table every hour for only new purchases.
- "MySQL-Datawarehouse:refunds". This Schedule checks the "refunds" table every hour for only new refunds.
Each App works a little differently, but you will be asked what to check for and how often. Each instance of a periodic check is called a Run. When new data is found, the Run will produce Imports.
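One way to picture a Run is as a pass over the source rows that keeps only what changed since the previous Run. The Python sketch below assumes a "high-water mark" on an `updated_at` column; that mechanism, the `Run` class, and the `run_schedule` function are illustrative assumptions rather than Grouparoo's documented behavior.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Run:
    schedule: str
    imports: List[dict] = field(default_factory=list)
    high_water_mark: Optional[str] = None


def run_schedule(schedule: str, rows: List[dict], last_mark: Optional[str],
                 column: str = "updated_at") -> Run:
    # ISO-8601 timestamps in the same format sort correctly as plain strings.
    run = Run(schedule=schedule, high_water_mark=last_mark)
    for row in sorted(rows, key=lambda r: r[column]):
        if last_mark is None or row[column] > last_mark:
            run.imports.append(dict(row))      # each changed row becomes an Import
            run.high_water_mark = row[column]  # remember how far we got
    return run


rows = [
    {"user_id": 1, "first_name": "Ann", "updated_at": "2021-06-14T10:00:00"},
    {"user_id": 3, "first_name": "Evan", "updated_at": "2021-06-14T10:07:00"},
]
first = run_schedule("MySQL-Datawarehouse:users", rows, None)
second = run_schedule("MySQL-Datawarehouse:users", rows, first.high_water_mark)
```

Here the first Run produces two Imports and the second produces none, because nothing changed after the stored mark.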
Imports
Imports are created when a Schedule's Run finds new or updated data. In either case, an Import tells Grouparoo that something has changed that we should take note of.
Most Imports trigger the update of a Record. An Import can be as simple as {user_id: 3, firstName: 'Evan'}. This tells us that we should update the Record we have for user #3, and if we don't have a Record for user #3, we should create one.
We know that an Import's data should be saved to the related Record if:
- The Import's payload contains a unique Record Property as defined by the Properties.
- The other key-value pairs in the payload are also defined by the Properties.
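The two conditions above, plus the update-or-create behavior, can be sketched in a few lines of Python. The `PROPERTIES` table, the function names, and the generated Record ids are hypothetical and only meant to illustrate the rule.

```python
from typing import Dict, Optional

# Hypothetical Property definitions: key -> is this Property unique?
PROPERTIES = {"user_id": True, "email": True, "firstName": False}


def can_apply(payload: dict, properties: Dict[str, bool]) -> bool:
    has_unique = any(properties.get(key) is True for key in payload)
    all_defined = all(key in properties for key in payload)
    return has_unique and all_defined


def apply_import(records: Dict[str, dict], payload: dict,
                 properties: Dict[str, bool]) -> Optional[str]:
    if not can_apply(payload, properties):
        return None  # nothing to link this Import to
    unique_keys = [key for key in payload if properties[key]]
    for record_id, props in records.items():
        if any(props.get(key) == payload[key] for key in unique_keys):
            props.update(payload)  # existing Record: update it
            return record_id
    record_id = f"rec_{len(records) + 1}"
    records[record_id] = dict(payload)  # no match: create a new Record
    return record_id


records: Dict[str, dict] = {}
created = apply_import(records, {"user_id": 3, "firstName": "Evan"}, PROPERTIES)
updated = apply_import(records, {"user_id": 3, "firstName": "Evan T."}, PROPERTIES)
```

Both calls resolve to the same Record, since the second payload shares the unique "user_id" with the first.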
Imports are stored in Grouparoo for later use. You can choose to delete old Imports after a period of time.
Record Hydration
Any time an Import is recorded by Grouparoo, and that Import can be linked to a Record, Grouparoo will fully update that Record, checking all the Apps that are connected to it via Record Properties and asking for new data. You can see which Record Properties should be updated by checking if they are in the pending state or not. In this way, you can be sure that your Records are always up to date, and we can recover quickly from any new or missed data. We call this step "Record Hydration".
This means that if an Import comes in from the "MySQL-Datawarehouse:purchase" Schedule, Grouparoo will automatically check all Record Properties, including "ltv", "first_name" and everything else.
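Record Hydration can be pictured as two passes: mark every Record Property as pending, then ask the App behind each one for a fresh value. The Python sketch below is an assumption-laden illustration; the `hydrate` function, the "ready" state name, and the in-memory lookups standing in for Apps are hypothetical.

```python
from typing import Callable, Dict

# In-memory stand-ins for what the "MySQL-Datawarehouse" App would return.
USERS = {3: "Evan"}
PURCHASES = {3: [40.0, 60.0]}
REFUNDS = {3: [10.0]}


def hydrate(record: Dict[str, dict], fetchers: Dict[str, Callable[[dict], object]]) -> Dict[str, dict]:
    # Pass 1: everything we know how to fetch becomes "pending".
    for key in fetchers:
        entry = record.setdefault(key, {"value": None, "state": "pending"})
        entry["state"] = "pending"
    # Pass 2: fetch each value and mark it "ready" (an assumed state name).
    for key, fetch in fetchers.items():
        record[key]["value"] = fetch(record)
        record[key]["state"] = "ready"
    return record


def user_id(record: dict) -> int:
    return record["user_id"]["value"]


fetchers = {
    "first_name": lambda r: USERS[user_id(r)],
    "ltv": lambda r: sum(PURCHASES[user_id(r)]) - sum(REFUNDS[user_id(r)]),
}

# An Import from the purchase Schedule only told us about user #3 ...
record = {"user_id": {"value": 3, "state": "ready"}}
# ... but hydration refreshes every Record Property, not just purchase data.
hydrate(record, fetchers)
pending = [key for key, entry in record.items() if entry["state"] == "pending"]
```

After `hydrate` returns, no Record Property is left in the pending state, which is how you can tell the Record is up to date.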
Data out of Grouparoo
Destinations
When configuring your Destinations, you'll be asked which Groups and Record Properties you want to send (or all of them). You may want to sync "email" and "first_name" to your email tool for everyone in the Groups "lapsed customers" and "USA Customers" so you can send them a coupon as part of your re-engagement campaign. Oh, and the Group "lapsed customers" is Calculated, based on LTV>0 and last_purchase at least 2 months ago, while "USA Customers" is Calculated where locale=en-us... all of which Grouparoo is keeping up-to-date for you.
To accomplish this, you would create a Destination for the App "Mailchimp", toggle on the "lapsed customers" and "USA Customers" Groups, and finally select the requested Record Properties of "email" and "first_name".
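The Mailchimp example can be sketched as a filter over Records followed by a projection of the selected Record Properties. This Python sketch is hypothetical: the `Destination` class and `build_payloads` function are illustrative names, and it assumes a Record is sent if it belongs to at least one of the selected Groups, which the text above does not state explicitly.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Destination:
    app: str
    groups: Set[str]
    properties: List[str]


def build_payloads(destination: Destination, records: List[dict]) -> List[Dict[str, str]]:
    payloads = []
    for record in records:
        # Assumption: membership in any selected Group is enough.
        if destination.groups & record["groups"]:
            payloads.append(
                {key: record["properties"].get(key) for key in destination.properties}
            )
    return payloads


mailchimp = Destination(
    app="Mailchimp",
    groups={"lapsed customers", "USA Customers"},
    properties=["email", "first_name"],
)
records = [
    {"groups": {"lapsed customers"},
     "properties": {"email": "ann@example.com", "first_name": "Ann", "ltv": "120"}},
    {"groups": {"lapsed customers", "USA Customers"},
     "properties": {"email": "evan@grouparoo.com", "first_name": "Evan", "ltv": "90"}},
    {"groups": set(),
     "properties": {"email": "cam@example.com", "first_name": "Cam", "ltv": "0"}},
]
payloads = build_payloads(mailchimp, records)
```

Only the two selected Record Properties leave Grouparoo in this sketch; "ltv" stays behind even though it was used to calculate Group membership.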
When Grouparoo sends data via a Destination, Exports are created.
Exports
The Grouparoo App for Mailchimp will handle using the Mailchimp API to send the Export for you. Exports keep track of the exact data which was sent to the Destination. Grouparoo will try to send data in bulk when possible, handle retries, and skip any updates that won't result in any new information being sent to the Destination.
You can choose to delete old Exports after a period of time.
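To illustrate skipping updates that carry no new information and sending the rest in bulk, here is a small Python sketch. The `plan_exports` and `in_batches` helpers, and the idea of comparing against the last payload sent, are assumptions for illustration; retries are left out.

```python
from typing import Dict, List, Optional


def plan_exports(last_sent: Dict[str, dict], current: Dict[str, dict]) -> List[dict]:
    exports = []
    for record_id, new in current.items():
        old: Optional[dict] = last_sent.get(record_id)
        if old == new:
            continue  # the Destination already has exactly this data
        # Keep both sides so we know precisely what was sent and what it replaced.
        exports.append({"record_id": record_id, "old": old, "new": new})
    return exports


def in_batches(exports: List[dict], size: int) -> List[List[dict]]:
    return [exports[i:i + size] for i in range(0, len(exports), size)]


last_sent = {
    "rec_1": {"email": "evan@grouparoo.com", "first_name": "Evan"},
    "rec_2": {"email": "ann@example.com", "first_name": "Ann"},
}
current = {
    "rec_1": {"email": "evan@grouparoo.com", "first_name": "Evan"},   # unchanged
    "rec_2": {"email": "ann@example.com", "first_name": "Annie"},     # changed
    "rec_3": {"email": "cam@example.com", "first_name": "Cam"},       # new
}
exports = plan_exports(last_sent, current)
batches = in_batches(exports, size=2)
```

With this sample data, only "rec_2" and "rec_3" produce Exports, and both fit into a single batch of two.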
Plugins and the Grouparoo Marketplace
Plugins are how more Apps are added to Grouparoo, allowing you to import and export data from more and more sources and destinations.
Having Problems?
If you are having trouble, visit the list of common issues or open a GitHub issue to get support.