Roughly 33%-50% of my assignments deal with data-focused ETL / ELT projects.
In the majority of these projects, a fantastic service called Azure Data Factory is used to orchestrate and control pipeline flows.
Scenarios
Every now and then, a secret or a long-lived token leaks into the pipeline input / output logs.

- A secret leaking into the pipeline logs can violate the storage policy for that particular value (secrets should be stored in Azure Key Vault, which enforces audit and access policies on the secret)
- Sometimes long-lived tokens that were fetched from another service as part of the pipeline end up stored in the run logs
Hunting with EAST
EAST is our (Nixu Microsoft Team & yours truly) upcoming Azure security scanning tool: https://github.com/jsa2/EAST
Example pipeline
- The output of an Azure Function exposes an access token (this is just a short-lived, one-hour token, but it serves as an example)
- A secret value is stored in a variable named secret

- Start the scan with EAST
node ./plugins/main.js --batch=5 --nativescope=true --roleAssignments=false --helperTexts=true --namespace=microsoft.datafactory --checkAad=false --scanAuditLogs
- EAST now highlights the likely candidates for leaked secrets
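
To give a feel for this kind of check, here is a minimal sketch in Node.js. It is not EAST's actual implementation. The function name findLikelySecrets, the key-name pattern, the JWT pattern, and the shape of sampleRun are all assumptions made up for illustration. The sketch walks the input / output JSON of a run and flags a field when its name suggests a secret or when its value looks like a JWT.

```javascript
// Illustrative sketch only; NOT how EAST works internally.
// Both patterns are assumptions and should be tuned for your environment.
const SUSPICIOUS_KEY = /(secret|password|passwd|token|apikey|api_key|connectionstring)/i;
// A JWT is made of base64url segments separated by dots; the header usually starts with "eyJ".
const JWT_VALUE = /\beyJ[A-Za-z0-9_-]{5,}\.[A-Za-z0-9_-]{5,}\.[A-Za-z0-9_-]*/;

// Recursively walk any JSON value and collect the paths that look like leaked secrets.
// Only the path and the reason are recorded, never the value itself.
function findLikelySecrets(node, path = '$', findings = []) {
  if (node === null || node === undefined) return findings;

  if (typeof node === 'string') {
    if (JWT_VALUE.test(node)) {
      findings.push({ path, reason: 'value looks like a JWT' });
    }
    return findings;
  }

  if (typeof node !== 'object') return findings; // numbers, booleans

  const isArray = Array.isArray(node);
  for (const [key, value] of Object.entries(node)) {
    const childPath = isArray ? `${path}[${key}]` : `${path}.${key}`;
    if (!isArray && SUSPICIOUS_KEY.test(key) && typeof value === 'string' && value.length > 0) {
      findings.push({ path: childPath, reason: 'key name suggests a secret' });
    }
    findLikelySecrets(value, childPath, findings);
  }
  return findings;
}

// Hypothetical run record; the real Data Factory payload has a different, richer shape.
// The token below is a fake placeholder, not a working credential.
const sampleRun = {
  activityName: 'GetTokenFromFunction',
  input: { functionName: 'getToken', method: 'GET' },
  output: { access_token: 'eyJhbGciOi.eyJzdWIiOi.c2lnbmF0dXJl', expires_in: 3599 },
  variables: { secret: 'not-a-real-secret' },
};

console.log(findLikelySecrets(sampleRun));
```

Reporting only the path and the reason is a deliberate choice in this sketch: a scanner that prints the matched value would leak the secret a second time, this time into its own output.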

End of blog
Stay tuned for more sneak peeks!