Data Cloud provides the ability to integrate multiple sources of data and create your own data lake to use with AI. However, having access to a myriad of potential sources means we also have an array of ways to authenticate against them. Authentication patterns have flourished over the years of web development to help keep the connections between data and users secure. This means that admins who are looking at the various connectors they can use to pull data into Data Cloud may run into a trail of potentially confusing authentication routines.
In this blog, we’ll break down the basics of authentication and the various forms it takes for Data Cloud. We’ll draw some broad strokes on security in general to give a baseline on how these methods work, and then point back to their usage in Data Cloud.
Who, what, where, when
Let’s break down authentication in really broad strokes. It relies on:
Proving who you are (biometrics)
Proving what you know (credentials)
Detecting things that fall out of pattern, like changes in location (data security)
The first one (biometrics) is basically proving who you are through something that’s inherent to you, like a fingerprint scan (now pretty common on most smartphones). A close cousin is proving what you have: a device only you should have access to, like a security fob. This does cross over into online security sometimes (as anyone who’s had to use a special USB key to access a VPN will know), but it’s not something that will factor into connecting Amazon Web Services (AWS) to Data Cloud for a group of users (who clearly cannot share said USB key). The third one (data security) is why you get fraud alerts when you use your credit card in a place or in a way that doesn’t fit your usual purchasing patterns.
For this blog, we’ll focus more on our second broad stroke, credentials.
What do you know?
The best circle of trust is the smallest
Many admins are familiar with the Principle of Least Privilege. It’s a basic concept of security that says if a user doesn’t need access, it’s a security risk for them to have it. Since we’re leaning into “what you know,” this concept is 10 times more important. If you wear a T-shirt with your Netflix username and password on it, don’t be shocked when your account is hacked. The circle of trust here should be only the user and whatever information the system requires to work.
Security through obscurity
It’s possible some connectors rely on an age-old concept: security through obscurity. Let’s say you have a file whose URL is randomly generated and so long that it would be impossible to guess through human means. This means that linking to the file is relatively safe if only humans attempt to find this URL.
Needless to say, this is only safe if 1) a human can’t guess it, 2) software can’t find it, and 3) a user with malicious intent doesn’t have access to the URL. This is the easiest solution to implement but easily the riskiest, and therefore is very rarely used, if ever, with sensitive solutions like Data Cloud.
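To make the idea concrete, here is a minimal Python sketch of how such a hard-to-guess link might be generated. The base URL and function name are hypothetical and only there for illustration; nothing here is specific to Data Cloud.

```python
import secrets

# Hypothetical base URL, used only for illustration.
BASE_URL = "https://files.example.com/share/"


def make_unguessable_url(num_bytes: int = 32) -> str:
    """Return a link whose path is a cryptographically random token."""
    # token_urlsafe draws from the operating system's secure random source.
    # 32 bytes is 256 bits of randomness, far beyond what anyone could guess.
    return BASE_URL + secrets.token_urlsafe(num_bytes)


url_a = make_unguessable_url()
url_b = make_unguessable_url()
print(url_a)
```

Notice what the sketch does not do: it never checks who is asking. Anyone who obtains the link, whether from an email, a log file, or browser history, gets the file, which is exactly the weakness described above.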
Just the facts: Basic Authentication
Let’s start with the simplest concept: Basic Authentication. This is what we all do all the time and relies on knowing a username and password. If you can supply the same username and password the system has on record, you get access. In most systems, the password is stored in a hashed or encrypted form on the system side so that the only person who knows the credentials is the user (keeping that circle of trust). The problem with Basic Authentication is that if it’s used repeatedly and automatically, those credentials need to be stored somewhere, even if they’re stored securely and encrypted. And so the circle of trust is only as good as how well those credentials are stored. It’s simple and easy to implement but, even with decent encryption, means you’re trusting a system to provide that security.
For a solution like Data Cloud, Basic Authentication has a specific flaw: session handling. Basic Authentication presumes that once you’re done using a site (or similar resource), it will close out your session and have you log back in again. This keeps someone with malicious intent from using previously logged-in clients and skipping the part where they need to know your credentials. However, Data Cloud relies on a constant flow of information, making this concept difficult if not untenable.
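One common concrete form of this pattern on the web is HTTP Basic authentication, where the username and password travel with every request in an Authorization header. The Python sketch below shows how that header is built; the function name is ours, and the sample credentials are the well-known example from the HTTP Basic specification rather than anything a connector uses.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build the Authorization header used by HTTP Basic authentication."""
    # The scheme joins the two values with a colon and Base64-encodes them.
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


header = basic_auth_header("Aladdin", "open sesame")
print(header["Authorization"])  # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==

# Base64 is an encoding, not encryption: anyone who sees the header can
# reverse it, which is why this scheme is only reasonable over HTTPS.
print(base64.b64decode("QWxhZGRpbjpvcGVuIHNlc2FtZQ==").decode("utf-8"))
```

Because the real credentials ride along on every call, whatever system makes those calls has to keep them on hand, which is the storage concern raised above.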
The secret handshake you know: OAuth
One of the most widely implemented security methods on the planet is Open Authorization, or OAuth. If you’ve ever logged in to an app, such as a mobile game, through an existing login like Facebook or Google, you’ve used OAuth. This allows you to centralize basic authentication (from above) with trusted sources and let them negotiate for you. Using an OAuth provider gives you two major advantages over Basic Authentication:
The third party only gets a token that proves you authorized access, not your username or password.
The OAuth provider can limit the privileges the third party has; the third party only gets the data you agree to share so that it can do its job.
To use a Data Cloud connector that uses OAuth, there will be a one-time exchange to make a handshake between the two systems, and then a token is saved to keep the connection. That token might have a preset expiry date to force a new agreement and can be revoked at any time. OAuth resolves the problem of long-standing sessions with a refresh token: when the access token expires, the connection can use the refresh token to request a new one without asking you to log in again. This creates a sustainable but easily policed way to maintain a handshake over long periods of time.
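The refresh behavior described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular connector is implemented: `TokenSet`, `Connection`, and `fake_token_endpoint` are hypothetical names, and the fake endpoint stands in for the real network call to an OAuth provider.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TokenSet:
    access_token: str
    refresh_token: str
    expires_at: float  # Unix time at which the access token stops working


def fake_token_endpoint(refresh_token: str) -> TokenSet:
    """Hypothetical stand-in for an OAuth provider's token endpoint.

    A real client would send the refresh token to the provider over HTTPS.
    """
    return TokenSet(
        access_token=secrets.token_urlsafe(16),
        refresh_token=refresh_token,
        expires_at=time.time() + 3600,  # assume new tokens last one hour
    )


class Connection:
    """Keeps a usable access token on hand without a new login."""

    def __init__(self, tokens: TokenSet, refresh: Callable[[str], TokenSet]):
        self.tokens = tokens
        self.refresh = refresh

    def get_access_token(self, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        # Refresh 60 seconds early so a request never leaves with a token
        # that is about to expire.
        if now >= self.tokens.expires_at - 60:
            self.tokens = self.refresh(self.tokens.refresh_token)
        return self.tokens.access_token


# Start with an access token that has already expired.
expired = TokenSet("old-access-token", "stored-refresh-token", time.time() - 1)
conn = Connection(expired, fake_token_endpoint)
first = conn.get_access_token()   # triggers a refresh
second = conn.get_access_token()  # reuses the fresh token
```

The username and password never appear anywhere in this loop. If the provider revokes the refresh token, the next refresh fails and the handshake has to be redone, which is what makes the connection easy to police.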
Latch Key apps: API keys
Many application programming interfaces (APIs) require a developer key or access token to be able to use the interface. This makes contacting the endpoint more secure than just having a URL in many ways. For one thing, the key often has to be passed in a header or the body of the request itself, meaning the endpoint can’t be accessed just by putting a URL into a web browser. Also, the key is computer generated, unique, and nearly impossible to guess. This allows the system running the endpoint to track what the application accessing it is doing. For instance, many APIs have throttle limits, and the key allows them to track usage.
For Data Cloud, you may only need to put the API key into the connector configuration. Other connectors may require you to generate a token based on the key and then use that token. See the specific connector documentation for steps.
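For a sense of what a connector does with that key behind the scenes, here is a small Python sketch that attaches a key to a request without sending anything. The endpoint URL, the key value, and the `X-API-Key` header name are all placeholders; each API documents its own header or body field.

```python
import urllib.request

# Placeholder values for illustration only; real keys come from the provider.
API_KEY = "example-key-123"
ENDPOINT = "https://api.example.com/v1/records"


def build_request(url: str, api_key: str) -> urllib.request.Request:
    """Attach the key as a request header instead of putting it in the URL."""
    return urllib.request.Request(
        url,
        headers={
            "X-API-Key": api_key,  # hypothetical header name
            "Accept": "application/json",
        },
    )


req = build_request(ENDPOINT, API_KEY)
print(req.full_url)  # the URL alone does not reveal the key
print(req.header_items())
```

Since the key travels with every call, the API owner can tie each request back to one application, which is how usage tracking and throttling work.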
The sci-fi token: JWT
A JSON Web Token (JWT) is a standard that uses JavaScript Object Notation (JSON) to encapsulate a token similar to the access token of OAuth, but makes it more informative and secure. JSON is a very common and easy-to-read data format widely used on the internet. The token is cryptographically signed so that tampering can be detected, can also be encrypted to keep third parties from reading it, and carries a payload of data that describes things like who the user of the token is and how the token should be used. If an OAuth handshake is a letter, a JWT is a dictionary. It also, however, generally requires a programmatic method, including the ability to sign the token, to generate. So, it gives more fidelity than an OAuth handshake but also requires more to generate.
Similar to API keys, JWTs require an exchange with the server to be generated. See the specific connector documentation for steps.
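To show what "a payload plus a signature" looks like in practice, here is a minimal Python sketch that builds and checks a JWT signed with a shared secret (the HS256 algorithm). The secret and the claim values are made up, and a connector may well require a different algorithm, such as one based on a private key and certificate, so treat this as an illustration of the format only.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-shared-secret"  # made-up value for illustration


def b64url(data: bytes) -> str:
    # JWTs use URL-safe Base64 with the trailing "=" padding removed.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # Restore the padding that b64url stripped before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims: dict, secret: bytes) -> str:
    """Return header.payload.signature, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        b64url(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        b64url(json.dumps(claims, separators=(",", ":")).encode("utf-8")),
    ]
    signing_input = ".".join(parts).encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return ".".join(parts + [b64url(signature)])


def verify_jwt(token: str, secret: bytes) -> dict:
    """Check the signature and return the claims. Expiry is not checked."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("signature mismatch")
    return json.loads(b64url_decode(payload_b64))


# Made-up claims: who issued the token, who it is for, and when it expires.
claims = {"iss": "example-client-id", "sub": "integration.user@example.com", "exp": 1900000000}
token = sign_jwt(claims, SECRET)
print(token)
print(verify_jwt(token, SECRET))
```

A token signed this way is readable by anyone who holds it; the signature only proves it has not been altered. For real integrations, a maintained JWT library that also validates expiry and the algorithm is the safer route.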
Authentication in a nutshell
To recap, authentication comes in many forms. Data Cloud connectors will use different kinds depending on the mechanisms being used by the third party. Using them isn’t a complicated task, but they are often buried in engineering talk that can make them seem daunting at first. When in doubt, refer to the specific documentation for the connector and, as always, keep an eye on admin.salesforce.com for more.
Resources
Trailhead: Connect Your Data Sources
Trailhead: Maximize Data Integration With Connectors
Salesforce Site: Data Cloud Connectors Directory
External Site: YouTube: Creating Data Streams Using Salesforce CRM Connector | Data Cloud Decoded
The post Authenticating Data Cloud Connectors: A Primer for Admins appeared first on Salesforce Admins.