How AppClerk Scans Your Codebase for Compliance

When you connect a project to AppClerk, the scanner analyzes your codebase to determine what data your application collects and how that data is used. The analysis runs automatically in the background and forms the foundation for generating accurate, compliant privacy policies.

In this technical deep dive, we'll explore how AppClerk's scanning engine works under the hood.

The Scanning Architecture

AppClerk's scanner operates in multiple phases:

Repository Access - Securely clones or accesses your GitHub repository
Framework Detection - Identifies your project's technology stack
Dependency Parsing - Analyzes package manifests and lock files
Code Analysis - Scans source files for SDK usage and patterns
Configuration Reading - Examines platform-specific config files
Compliance Mapping - Matches findings to privacy requirements

Each phase builds upon the previous one, creating a complete picture of your app's data practices.
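
One way to picture this ordering is a pipeline in which every phase receives the findings accumulated so far. The sketch below is illustrative only: `runPhases`, the phase names, and the shape of the context object are assumptions made for this example, not AppClerk's internals.

```javascript
// Hypothetical sketch: every phase receives the context built so far and
// returns the fields it contributes. None of these names come from AppClerk.
function runPhases(phases, initialContext) {
  return phases.reduce(
    (context, phase) => ({
      ...context,
      ...phase.run(context),
      completed: [...context.completed, phase.name],
    }),
    { ...initialContext, completed: [] }
  );
}

// Two stand-in phases showing how a later phase relies on an earlier result.
const examplePhases = [
  {
    name: "framework-detection",
    run: (ctx) => ({ framework: ctx.files.includes("pubspec.yaml") ? "flutter" : "unknown" }),
  },
  {
    name: "dependency-parsing",
    run: (ctx) => ({ manifest: ctx.framework === "flutter" ? "pubspec.yaml" : "package.json" }),
  },
];

const scanContext = runPhases(examplePhases, { files: ["pubspec.yaml", "lib/main.dart"] });
```

Here the second phase picks its manifest based on what the first phase found, which is the dependency between phases described above.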

Framework Detection

The scanner first determines what type of project you're working with. This matters because each framework keeps its dependencies, configuration, and permissions in different files, so each one needs its own scanning strategy.

React Native Detection

For React Native projects, AppClerk looks for:

package.json with React Native dependencies
android/ and ios/ directories
react-native.config.js or metro.config.js

Expo Detection

Expo projects are identified by:

app.json or app.config.js with Expo configuration
expo package in dependencies
eas.json for Expo Application Services

Flutter Detection

Flutter projects are recognized through:

pubspec.yaml file
lib/ directory structure
Flutter-specific dependencies

Web Application Detection

For web projects, the scanner:

Analyzes HTML files for tracking scripts
Checks for analytics libraries in JavaScript
Examines form fields and data collection points

Once the framework is identified, the scanner applies framework-specific analysis rules.
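
The detection signals listed above could be combined roughly as follows. This is a minimal sketch, assuming the scanner already has a list of repository file paths and a parsed package.json; the function name `detectFramework` and the order of the checks are assumptions for illustration.

```javascript
// Hypothetical heuristic: decide the project type from repository file paths
// and an already-parsed package.json object.
function detectFramework(filePaths, packageJson = {}) {
  const files = new Set(filePaths);
  const deps = { ...packageJson.dependencies, ...packageJson.devDependencies };
  const hasDir = (dir) => filePaths.some((p) => p.startsWith(dir + "/"));

  if (files.has("pubspec.yaml") && hasDir("lib")) return "flutter";
  // Check Expo before bare React Native, because Expo apps also depend on react-native.
  if ("expo" in deps && (files.has("app.json") || files.has("app.config.js"))) return "expo";
  if ("react-native" in deps && hasDir("android") && hasDir("ios")) return "react-native";
  if (filePaths.some((p) => p.endsWith(".html"))) return "web";
  return "unknown";
}
```

Checking Expo first is a design choice in this sketch: an Expo project usually lists react-native as a dependency too, so the more specific match has to win.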

Dependency Analysis

The next step is parsing your dependency files to understand what third-party services you're using.

Package.json Analysis (Node.js/React Native)

For projects using npm/yarn, AppClerk reads package.json and identifies packages that indicate data collection:

{
  "dependencies": {
    "@react-native-firebase/analytics": "^18.0.0",
    "stripe": "^14.0.0",
    "@sentry/react-native": "^5.0.0"
  }
}

The scanner recognizes:

@react-native-firebase/* → Firebase services
stripe → Payment processing
@sentry/* → Error tracking and crash reporting
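
A rule table like the one above might be applied as in the following sketch. The three rules mirror the list above; the names `PACKAGE_RULES` and `classifyDependencies` are hypothetical, and a real rule set would cover far more packages.

```javascript
// Hypothetical rule table: package-name patterns mapped to service categories.
const PACKAGE_RULES = [
  { pattern: /^@react-native-firebase\//, service: "Firebase services" },
  { pattern: /^stripe$/, service: "Payment processing" },
  { pattern: /^@sentry\//, service: "Error tracking and crash reporting" },
];

// Return one finding per dependency that matches a known rule.
function classifyDependencies(packageJson) {
  const names = Object.keys({ ...packageJson.dependencies, ...packageJson.devDependencies });
  const findings = [];
  for (const name of names) {
    const rule = PACKAGE_RULES.find((r) => r.pattern.test(name));
    if (rule) findings.push({ package: name, service: rule.service });
  }
  return findings;
}
```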

Pubspec.yaml Analysis (Flutter)

For Flutter projects, the scanner reads pubspec.yaml:

dependencies:
  firebase_analytics: ^10.0.0
  flutter_stripe: ^10.0.0
  sentry_flutter: ^7.0.0

The same kinds of services (analytics, payments, error tracking) are detected and mapped to privacy policy requirements.
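
For illustration, the dependency names in a pubspec.yaml can be pulled out with a small line-based reader. This is a simplified sketch that only understands the top-level dependencies block with two-space indentation; a production scanner would more likely use a full YAML parser. The function name `pubspecDependencies` is hypothetical.

```javascript
// Minimal line-based reader for the top-level "dependencies:" block of a
// pubspec.yaml. It is not a YAML parser; nested entries are ignored.
function pubspecDependencies(yamlText) {
  const deps = {};
  let inDeps = false;
  for (const line of yamlText.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) continue;
    if (/^\S/.test(line)) {
      // A new top-level key starts; track whether it is the dependencies block.
      inDeps = /^dependencies:\s*$/.test(line);
      continue;
    }
    const match = inDeps && line.match(/^  ([A-Za-z0-9_]+):\s*(\S.*)?$/);
    if (match) deps[match[1]] = (match[2] || "").trim();
  }
  return deps;
}
```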

Lock File Verification

AppClerk also checks lock files (package-lock.json, yarn.lock, pubspec.lock) to verify the exact versions being used, which helps determine specific compliance requirements for each SDK version.
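
As a rough sketch of how an exact version can be read from a lock file, the function below looks a package up in a parsed package-lock.json. It relies on the documented npm layouts (a "packages" map keyed by install path in lockfileVersion 2 and 3, a "dependencies" map keyed by name in lockfileVersion 1); the function name `resolvedVersion` is an assumption, and yarn.lock and pubspec.lock would need their own readers.

```javascript
// Resolve the installed version of a top-level package from a parsed
// package-lock.json object. Returns null when the package is not present.
function resolvedVersion(lock, name) {
  const entry =
    (lock.packages && lock.packages["node_modules/" + name]) ||
    (lock.dependencies && lock.dependencies[name]);
  return entry ? entry.version : null;
}
```

A manifest range such as ^14.0.0 only says which versions are allowed, so the lock file is the place to learn which one is actually installed.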

SDK and Service Detection

Beyond dependency files, the scanner analyzes your source code for SDK initialization and usage patterns.

Pattern Matching

The scanner uses regex patterns to find SDK calls in your source files. Typical calls it looks for include:

// Firebase Analytics
firebase.analytics().logEvent('event_name', { ... })

// Stripe
stripe.paymentIntents.create({ ... })

// Sentry
Sentry.captureException(error)

Each pattern match is recorded with:

SDK name and type (analytics, payment, crash reporting, etc.)
File location where it's used
Context about how it's being used
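
A line-by-line regex scan that records those three fields might look like the sketch below. The patterns and the names `USAGE_PATTERNS` and `scanSource` are hypothetical, written to match the three example calls above rather than taken from AppClerk's rule set.

```javascript
// Hypothetical usage patterns matching the example calls shown above.
const USAGE_PATTERNS = [
  { sdk: "Firebase Analytics", type: "analytics", regex: /\banalytics\(\)\s*\.\s*logEvent\s*\(/ },
  { sdk: "Stripe", type: "payment", regex: /\bstripe\s*\.\s*paymentIntents\s*\.\s*create\s*\(/ },
  { sdk: "Sentry", type: "crash reporting", regex: /\bSentry\s*\.\s*captureException\s*\(/ },
];

// Scan one file's text and record the SDK, its type, the location, and the matching line.
function scanSource(filePath, source) {
  const matches = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const { sdk, type, regex } of USAGE_PATTERNS) {
      if (regex.test(text)) {
        matches.push({ sdk, type, file: filePath, line: index + 1, context: text.trim() });
      }
    }
  });
  return matches;
}
```

A per-line scan like this is cheap but will miss calls that span several lines and will also match commented-out code, which is one reason to combine it with the other signals described in this article.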

Import Statement Analysis

The scanner also tracks import statements:

import { initializeApp } from "firebase/app";
import { getAnalytics } from "firebase/analytics";
import Stripe from "stripe";

This helps identify SDKs even when they're not actively used in the scanned code paths.
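
Import tracking can be approximated with two regular expressions, one for ES import statements and one for CommonJS require calls. This is a sketch under that assumption; the function name `findImportedModules` is hypothetical, and a production scanner might parse the file into a syntax tree instead.

```javascript
// Collect module specifiers from ES import statements and CommonJS require calls.
function findImportedModules(source) {
  const modules = new Set();
  const importRegex = /\bimport\s+(?:[^'"]*?\s+from\s+)?["']([^"']+)["']/g;
  const requireRegex = /\brequire\s*\(\s*["']([^"']+)["']\s*\)/g;
  for (const match of source.matchAll(importRegex)) modules.add(match[1]);
  for (const match of source.matchAll(requireRegex)) modules.add(match[1]);
  return [...modules];
}
```

The returned specifiers (for example "firebase/analytics" or "stripe") can then be checked against the same kind of rule table used for dependency names.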

Permission Mapping

For mobile apps, permissions declared in platform-specific files are a direct signal of what data the app can access and may collect.

iOS Permissions (Info.plist)

AppClerk scans Info.plist (in a React Native project, typically ios/<AppName>/Info.plist) or app.json for iOS permissions:

<key>NSLocationWhenInUseUsageDescription</key>
<string>We need your location to show nearby restaurants</string>

<key>NSCameraUsageDescription</key>
<string>Camera access for profile photos</string>

Each permission maps to specific data types:

NSLocationWhenInUseUsageDescription → Location Data
NSCameraUsageDescription → Photos/Images
NSContactsUsageDescription → Contact Information
NSHealthShareUsageDescription → Health Information
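
The mapping above could be applied to an Info.plist as in this sketch. The table is copied from the list above; the key extraction is a deliberately naive regex over the XML, and the names `IOS_PERMISSION_DATA` and `iosDataTypes` are hypothetical. A real implementation would more likely use a proper plist parser.

```javascript
// Mapping copied from the list above.
const IOS_PERMISSION_DATA = new Map([
  ["NSLocationWhenInUseUsageDescription", "Location Data"],
  ["NSCameraUsageDescription", "Photos/Images"],
  ["NSContactsUsageDescription", "Contact Information"],
  ["NSHealthShareUsageDescription", "Health Information"],
]);

// Naive extraction of <key> names from plist XML, mapped to data types.
function iosDataTypes(plistXml) {
  const keys = [...plistXml.matchAll(/<key>\s*([^<]+?)\s*<\/key>/g)].map((m) => m[1]);
  return keys.filter((k) => IOS_PERMISSION_DATA.has(k)).map((k) => IOS_PERMISSION_DATA.get(k));
}
```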

Android Permissions (AndroidManifest.xml)

For Android, the scanner reads AndroidManifest.xml:

<uses-permission android:name="android.permission.ACCESS_FINE_LOCATION" />
<uses-permission android:name="android.permission.CAMERA" />
<uses-permission android:name="android.permission.READ_CONTACTS" />

These are mapped to the same data categories as iOS permissions.
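
A comparable sketch for Android reads the uses-permission entries and maps them to the same category names used for iOS. The table entries and the names `ANDROID_PERMISSION_DATA` and `androidDataTypes` are assumptions for illustration, and the regex is a stand-in for a real XML parser.

```javascript
// Hypothetical table mapping Android permissions to the iOS category names above.
const ANDROID_PERMISSION_DATA = new Map([
  ["android.permission.ACCESS_FINE_LOCATION", "Location Data"],
  ["android.permission.ACCESS_COARSE_LOCATION", "Location Data"],
  ["android.permission.CAMERA", "Photos/Images"],
  ["android.permission.READ_CONTACTS", "Contact Information"],
]);

// Extract declared permissions from manifest XML and return the distinct data types.
function androidDataTypes(manifestXml) {
  const regex = /<uses-permission\b[^>]*\bandroid:name\s*=\s*"([^"]+)"/g;
  const types = new Set();
  for (const [, name] of manifestXml.matchAll(regex)) {
    if (ANDROID_PERMISSION_DATA.has(name)) types.add(ANDROID_PERMISSION_DATA.get(name));
  }
  return [...types];
}
```

Using a set means fine and coarse location together still produce a single "Location Data" entry.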

Expo Permissions (app.json)

For Expo apps, permissions are declared in app.json:

{
  "expo": {
    "ios": {
      "infoPlist": {
        "NSLocationWhenInUseUsageDescription": "..."
      }
    },
    "android": {
      "permissions": ["ACCESS_FINE_LOCATION"]
    }
  }
}

The scanner normalizes all three formats into the same data categories.
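
To show what handling these formats uniformly could involve, the sketch below turns an Expo app.json object into the same key and permission names used by the native files. The function name `expoPermissions` is hypothetical, and the prefixing of short Android names with android.permission. is an assumption about how the values are normalized.

```javascript
// Normalize an Expo app.json object into native-style permission names.
function expoPermissions(appJson) {
  const expo = appJson.expo || {};
  const infoPlist = (expo.ios && expo.ios.infoPlist) || {};
  const androidList = (expo.android && expo.android.permissions) || [];
  return {
    // Keep only the usage-description keys, which are the ones that signal data access.
    ios: Object.keys(infoPlist).filter((key) => /UsageDescription$/.test(key)),
    // Short names such as "CAMERA" are expanded to their fully qualified form.
    android: androidList.map((p) => (p.includes(".") ? p : "android.permission." + p)),
  };
}
```

The two arrays can then go through the same iOS and Android lookup tables as permissions read from Info.plist and AndroidManifest.xml.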

Data Collection Inference

Once SDKs and permissions are identified, AppClerk infers what data is actually being collected.

SDK-to-Data Mapping

Each detected SDK has a known data collection profile:

Firebase Analytics collects:

Device identifiers (IDFA, Android ID)
Usage data (app interactions, screen views)
Device information (model, OS version)

Stripe collects:

Payment information (card details, processed server-side)
Transaction metadata
Billing addresses

Sentry collects:

Error logs and stack traces
Device information
User actions leading to errors
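
These profiles lend themselves to a lookup table. The following sketch uses the three profiles listed above and records, for each data type, which detected SDKs collect it. The names `SDK_DATA_PROFILES` and `inferCollectedData` are hypothetical.

```javascript
// Profiles copied from the lists above.
const SDK_DATA_PROFILES = new Map([
  ["Firebase Analytics", ["Device identifiers", "Usage data", "Device information"]],
  ["Stripe", ["Payment information", "Transaction metadata", "Billing addresses"]],
  ["Sentry", ["Error logs and stack traces", "Device information", "User actions leading to errors"]],
]);

// Build a map of data type -> list of detected SDKs that collect it.
function inferCollectedData(detectedSdks) {
  const collected = new Map();
  for (const sdk of detectedSdks) {
    for (const dataType of SDK_DATA_PROFILES.get(sdk) || []) {
      if (!collected.has(dataType)) collected.set(dataType, []);
      collected.get(dataType).push(sdk);
    }
  }
  return collected;
}
```

Keeping the source SDK next to each data type makes it possible to say not only what is collected but also which third party receives it.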

Permission-to-Data Mapping

Permissions directly indicate data access:

Location permissions → Precise or approximate location data
Camera permission → Photo/video capture
Contacts permission → Contact list access
Microphone permission → Audio recordings

Combining Signals

The scanner combines multiple signals to build a complete picture:

Firebase Analytics detected + Location permission →
"Location data collected for analytics purposes"

Stripe detected + No payment permissions →
"Payment information processed securely (not stored locally)"

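The two combinations above can be written as small rules over the detected signals. This sketch is an assumption about how such rules might be expressed: the names `COMBINATION_RULES` and `combineSignals` are hypothetical, and "no payment permissions" is modeled here as the absence of a "Payment Information" entry among the permission-derived data types.

```javascript
// Hypothetical rules: each one inspects the combined signals and yields a statement.
const COMBINATION_RULES = [
  {
    when: (s) => s.sdks.includes("Firebase Analytics") && s.permissionData.includes("Location Data"),
    statement: "Location data collected for analytics purposes",
  },
  {
    when: (s) => s.sdks.includes("Stripe") && !s.permissionData.includes("Payment Information"),
    statement: "Payment information processed securely (not stored locally)",
  },
];

// Return the statements of every rule whose condition holds.
function combineSignals(signals) {
  return COMBINATION_RULES.filter((rule) => rule.when(signals)).map((rule) => rule.statement);
}
```
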
Compliance Rule Matching

Finally, AppClerk matches your app's data collection profile against compliance requirements.

App Store Requirements

For iOS apps, the scanner checks against Apple's 14 data type categories:

Contact Info
Health & Fitness
Financial Info
Location
Sensitive Info
Contacts
User Content
Browsing History
Search History
Identifiers
Purchases
Usage Data
Diagnostics
Other Data

Each category requires specific disclosures in the App Privacy section of your App Store listing.
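
Mapping inferred data types onto these categories can be sketched as another lookup. The specific pairings below are assumptions chosen for illustration, not AppClerk's rule set, and the names `APP_STORE_CATEGORY` and `appStoreDisclosures` are hypothetical. Anything without a known category is returned separately so it can be reviewed by hand.

```javascript
// Hypothetical pairing of inferred data types with App Store categories.
const APP_STORE_CATEGORY = new Map([
  ["Location Data", "Location"],
  ["Contact Information", "Contacts"],
  ["Health Information", "Health & Fitness"],
  ["Photos/Images", "User Content"],
  ["Device identifiers", "Identifiers"],
  ["Usage data", "Usage Data"],
  ["Error logs and stack traces", "Diagnostics"],
  ["Payment information", "Financial Info"],
]);

// Return the distinct categories to disclose plus any data types with no mapping.
function appStoreDisclosures(dataTypes) {
  const categories = new Set();
  const unmapped = [];
  for (const dataType of dataTypes) {
    if (APP_STORE_CATEGORY.has(dataType)) categories.add(APP_STORE_CATEGORY.get(dataType));
    else unmapped.push(dataType);
  }
  return { categories: [...categories].sort(), unmapped };
}
```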

Play Store Requirements

For Android apps, Google Play's Data Safety section requires similar disclosures, which the scanner maps automatically.

GDPR Requirements

The scanner also flags data collection that may require GDPR compliance:

Personal data collection from EU users
Data sharing with third parties
User rights (access, deletion, portability)

Continuous Monitoring

AppClerk doesn't just scan once. The scanner re-runs automatically whenever you:

Push new code to your repository
Update dependencies
Add new SDKs
Request additional permissions

It then flags any changes that require policy updates. You'll receive notifications like:

"New SDK detected: @react-native-community/geolocation. Your privacy policy may need to include location data collection disclosure."

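Change detection of this kind comes down to comparing the current scan with the previous one. The sketch below assumes each scan result is an object with `sdks` and `permissions` arrays; that shape, the function name `diffScans`, and the message wording are assumptions for illustration.

```javascript
// Compare two scan results and produce a notification for every new SDK or permission.
function diffScans(previous, current) {
  const added = (before, after) => after.filter((item) => !before.includes(item));
  const notifications = [];
  for (const sdk of added(previous.sdks, current.sdks)) {
    notifications.push(`New SDK detected: ${sdk}. Your privacy policy may need an update.`);
  }
  for (const permission of added(previous.permissions, current.permissions)) {
    notifications.push(`New permission requested: ${permission}. Your privacy policy may need an update.`);
  }
  return notifications;
}
```
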
Conclusion

AppClerk's scanning engine combines several analysis techniques (dependency parsing, code pattern matching, permission detection, and compliance rule mapping) to build a picture of your app's data practices automatically. This removes most of the manual work of auditing your codebase and helps keep your privacy policy aligned with what your app actually does.

The result? A privacy policy that's not just compliant, but continuously maintained as your app evolves.